From seiraiyu-skills
Comprehensive end-to-end testing command. Launches parallel sub-agents to research the codebase (structure, database schema, potential bugs), then uses the Vercel Agent Browser CLI to test every user journey — taking screenshots, validating UI/UX, and querying the database to verify records. Run after implementation to validate everything before code review.
Slash command
/seiraiyu-skills:e2e-test-agent-browser
agent-browser requires **Linux, WSL, or macOS**. Check the platform:
uname -s
- Linux or Darwin → proceed
- Anything else (MINGW, CYGWIN, or native Windows) → stop with: "agent-browser only supports Linux, WSL, and macOS. It cannot run on native Windows. Please run this command from WSL or a Linux/macOS environment."
Stop execution if the platform is unsupported.
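The platform gate above can be sketched as a small shell function. This is a minimal sketch, not the skill's own implementation: the name `platform_supported` is hypothetical, and the kernel name is passed as an argument so the check can be exercised on any machine. WSL reports `Linux` from `uname -s`, so it falls into the supported branch.

```shell
# Hypothetical helper: succeed only for the platforms agent-browser supports.
# Takes the kernel name as an optional argument; defaults to `uname -s`.
platform_supported() {
  kernel="${1:-$(uname -s)}"
  case "$kernel" in
    Linux|Darwin) return 0 ;;   # Linux, WSL, and macOS
    *)            return 1 ;;   # MINGW*, CYGWIN*, and anything else
  esac
}
```

A caller could then stop early with something like `platform_supported || { echo "agent-browser only supports Linux, WSL, and macOS." >&2; exit 1; }`.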
Verify the application has a browser-accessible frontend. Check for:
- package.json with a dev/start script serving a UI
If no frontend is detected:
"This application doesn't appear to have a browser-accessible frontend. E2E browser testing requires a UI to visit. For backend-only or API testing, a different approach is needed."
Stop execution if no frontend is found.
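One way to approximate the package.json check is a text match on the script names. The sketch below is only a heuristic under stated assumptions: the function name `has_dev_or_start_script` is hypothetical, and finding a `dev` or `start` key does not prove that the script serves a UI, so a match is a hint to look closer rather than a verdict.

```shell
# Hypothetical heuristic: does <dir>/package.json declare a "dev" or
# "start" key? This only shows that such a key exists; it cannot confirm
# that the script actually serves a browser-accessible UI.
has_dev_or_start_script() {
  pkg="${1:-.}/package.json"
  [ -f "$pkg" ] || return 1
  grep -Eq '"(dev|start)"[[:space:]]*:' "$pkg"
}
```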
Check if agent-browser is installed:
agent-browser --version
If the command is not found, install it automatically:
npm install -g agent-browser
After installation (or if it was already installed), ensure the browser engine is set up:
agent-browser install --with-deps
The --with-deps flag installs system-level Chromium dependencies on Linux/WSL. On macOS it is harmless.
Verify installation succeeded:
agent-browser --version
If installation fails, stop with:
"Failed to install agent-browser. Please install it manually with npm install -g agent-browser && agent-browser install --with-deps, then re-run this command."
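The check, install, and verify steps above can be chained into one function. This is a sketch that uses only the commands already listed in this section; the wrapper name `ensure_agent_browser` is hypothetical, and it returns a non-zero status instead of printing the failure message so the caller decides how to stop.

```shell
# Hypothetical wrapper around the install steps described above.
ensure_agent_browser() {
  # Install the CLI only when it is not already on PATH.
  if ! command -v agent-browser >/dev/null 2>&1; then
    npm install -g agent-browser || return 1
  fi
  # Set up the browser engine (plus system dependencies on Linux/WSL).
  agent-browser install --with-deps || return 1
  # Final verification: the CLI must answer a version query.
  agent-browser --version >/dev/null 2>&1
}
```

If the function fails, the caller can print the manual-install message quoted above and stop.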
Launch three sub-agents simultaneously using the Task tool. All three run in parallel.
Research this codebase thoroughly. Return a structured summary covering:
- How to start the application — exact commands to install dependencies and run the dev server, including the URL and port it serves on
- Authentication/login — if the app has protected routes, how to create a test account or log in (credentials from .env.example, seed data, or sign-up flow)
- Every user-facing route/page — each URL path and what it renders
- Every user journey — complete flows a user can take (e.g., "sign up → create profile → view public page"). For each journey, list the specific steps, interactions (clicks, form fills, navigation), and expected outcomes
- Key UI components — forms, modals, dropdowns, pickers, toggles, and other interactive elements that need testing
Be exhaustive. Testing will only cover what you identify here.
Research this codebase's database layer. Read .env.example to understand environment variables for database connections. DO NOT read .env directly. Return a structured summary covering:
- Database type and connection — what database is used (Postgres, MySQL, SQLite, etc.) and the environment variable name for the connection string (from .env.example)
- Full schema — every table, its columns, types, and relationships
- Data flows per user action — for each user-facing action (form submit, button click, etc.), document exactly what records are created, updated, or deleted and in which tables
- Validation queries — for each data flow, provide the exact query to verify records are correct after the action
Analyze this codebase for potential bugs, issues, and code quality problems. Focus on:
- Logic errors — incorrect conditionals, off-by-one errors, missing null checks, race conditions
- UI/UX issues — missing error handling in forms, no loading states, broken responsive layouts, accessibility problems
- Data integrity risks — missing validation, potential orphaned records, incorrect cascade behavior
- Security concerns — SQL injection, XSS, missing auth checks, exposed secrets
Return a prioritized list with file paths and line numbers.
Wait for all three sub-agents to complete before proceeding.
Using Sub-agent 1's startup instructions:
- If the project has a .env.example, copy it to .env and fill in required values. Run database seeding or migrations if needed.
- Start the dev server in the background (e.g., npm run dev &)
- Wait for the server to be ready:
for i in $(seq 1 30); do
  curl -s -o /dev/null -w "%{http_code}" http://localhost:PORT | grep -Eq '^(200|304)$' && break
  sleep 1
done
Replace PORT with the actual port from Sub-agent 1's research.
- Open the app with agent-browser open <url> and confirm it loads
- Take an initial screenshot: agent-browser screenshot e2e-screenshots/00-initial-load.png
Using the user journeys from Sub-agent 1 and findings from Sub-agent 3, create a task (using TaskCreate) for each user journey. Each task should include:
Also create a final task: "Responsive testing across viewports."
For each task, mark it in_progress with TaskUpdate and execute the following.
Use the Vercel Agent Browser CLI for all browser interaction:
agent-browser open <url> # Navigate to a page
agent-browser snapshot -i # Get interactive elements with refs (@e1, @e2...)
agent-browser click @eN # Click element by ref
agent-browser fill @eN "text" # Clear field and type
agent-browser select @eN "option" # Select dropdown option
agent-browser press Enter # Press a key
agent-browser screenshot <path> # Save screenshot
agent-browser screenshot --annotate # Screenshot with numbered element labels
agent-browser set viewport W H # Set viewport (e.g., 375 812 for mobile)
agent-browser wait --load networkidle # Wait for page to settle
agent-browser console # Check for JS errors
agent-browser errors # Check for uncaught exceptions
agent-browser get text @eN # Get element text
agent-browser get url # Get current URL
agent-browser close # End session
Refs become invalid after navigation or DOM changes. Always re-snapshot after page navigation, form submissions, or dynamic content updates (modals, tabs, theme changes).
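The re-snapshot rule is easiest to see in a concrete sequence. The sketch below uses only the commands from the list above; the function name `submit_login_form`, the refs `@e1` to `@e3`, and the credentials are placeholders, since real refs have to be read from the output of `snapshot -i` on the actual page.

```shell
# Sketch of one form interaction. The refs (@e1, @e2, @e3) and the
# credentials are placeholders: real refs come from `snapshot -i` output.
submit_login_form() {
  url="$1"
  agent-browser open "$url"
  agent-browser snapshot -i                  # refs are valid from here
  agent-browser fill @e1 "test@example.com"
  agent-browser fill @e2 "example-password"
  agent-browser click @e3                    # submit: the page changes
  agent-browser wait --load networkidle
  agent-browser snapshot -i                  # old refs are stale; take new ones
}
```

The second snapshot is the point of the example: any ref used after the click should come from it, not from the first one.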
For each step in a user journey:
- Save screenshots to e2e-screenshots/ organized by journey (e.g., e2e-screenshots/profile-creation/03-form-submitted.png)
- Check agent-browser console and agent-browser errors periodically for JavaScript issues
Be thorough. Go through EVERY interaction, EVERY form field, EVERY button. The goal is that by the time this finishes, every part of the UI has been exercised and screenshotted.
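To keep screenshot names consistent across journeys, the path can be built by a small helper. This is a sketch: the name `shot_path` is hypothetical, and the layout simply follows the example path e2e-screenshots/profile-creation/03-form-submitted.png.

```shell
# Hypothetical helper: build a screenshot path of the form
#   e2e-screenshots/<journey>/<NN>-<label>.png
# Pass the step as a plain integer (3, not 03) so printf's %d accepts it.
shot_path() {
  journey="$1"; step="$2"; label="$3"
  printf 'e2e-screenshots/%s/%02d-%s.png\n' "$journey" "$step" "$label"
}
```

It could be used as `agent-browser screenshot "$(shot_path profile-creation 3 form-submitted)"`. Whether the CLI creates missing directories is not verified here, so running `mkdir -p` on the journey directory first is the safer assumption.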
After any interaction that should modify data (form submits, deletions, updates):
- Postgres: use psql directly, e.g., psql "$DATABASE_URL" -c "SELECT theme FROM profiles WHERE username = 'testuser'"
- SQLite: use sqlite3 directly, e.g., sqlite3 db.sqlite "SELECT theme FROM profiles WHERE username = 'testuser'"
When an issue is found (UI bug, database mismatch, JS error):
For the responsive testing task, revisit key pages at these viewports:
- Mobile: agent-browser set viewport 375 812
- Tablet: agent-browser set viewport 768 1024
- Desktop: agent-browser set viewport 1440 900
At each viewport, screenshot every major page. Analyze for layout issues, overflow, broken alignment, and touch target sizes on mobile.
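The viewport pass can be written as a loop over the three sizes. The sketch below uses only commands from the reference list in this section; the function name `responsive_sweep`, the `e2e-screenshots/responsive/` subdirectory, and the file naming are assumptions made for illustration.

```shell
# Hypothetical helper: screenshot one page at each of the three viewports.
# The responsive/ subdirectory and file naming are illustrative choices.
responsive_sweep() {
  url="$1"; name="$2"
  mkdir -p e2e-screenshots/responsive
  for size in "375 812" "768 1024" "1440 900"; do
    width="${size% *}"     # text before the space
    height="${size#* }"    # text after the space
    agent-browser set viewport "$width" "$height"
    agent-browser open "$url"
    agent-browser wait --load networkidle
    agent-browser screenshot "e2e-screenshots/responsive/${name}-${width}x${height}.png"
  done
}
```

Calling it once per major page, for example `responsive_sweep http://localhost:PORT/ home`, yields one screenshot per viewport to review for overflow and alignment problems.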
After completing each journey, mark its task as completed with TaskUpdate.
After all testing is complete:
agent-browser close
Present a concise summary:
## E2E Testing Complete
**Journeys Tested:** [count]
**Screenshots Captured:** [count]
**Issues Found:** [count] ([count] fixed, [count] remaining)
### Issues Fixed During Testing
- [Description] — [file:line]
### Remaining Issues
- [Description] — [severity: high/medium/low] — [file:line]
### Bug Hunt Findings (from code analysis)
- [Description] — [severity] — [file:line]
### Screenshots
All saved to: `e2e-screenshots/`
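The "Screenshots Captured" figure in the template can be taken from the filesystem rather than tallied by hand. A minimal sketch, assuming every screenshot was saved as a .png under e2e-screenshots/; the function name `count_screenshots` is hypothetical.

```shell
# Hypothetical helper: count .png files under the screenshot directory.
# tr strips the padding that some wc implementations add around the number.
count_screenshots() {
  dir="${1:-e2e-screenshots}"
  find "$dir" -type f -name '*.png' | wc -l | tr -d '[:space:]'
}
```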
After the text summary, ask the user:
"Would you like me to export the full testing report to a markdown file? It includes per-journey breakdowns, all screenshot references, database validation results, and detailed findings — useful as context for follow-up fixes or GitHub issues."
If yes, write a detailed report to e2e-test-report.md in the project root containing: