From just-ship
Guides web app testing strategy: test pyramid (unit/integration/component/E2E), framework selection (Vitest/Playwright/Jest), coverage rules, mocking boundaries, and execution.
Slash command: /just-ship:webapp-testing
⚡ Testing Engineer joined
Testing strategy, framework selection, and execution for web applications. Covers the full testing stack — from unit tests to visual verification.
Not all tests are equal. Choose the right level for what you're testing.

         ╱  E2E  ╲           Few: slow, brittle, high confidence
        ╱─────────╲
       ╱Integration╲         Some: test real boundaries
      ╱─────────────╲
     ╱  Unit Tests   ╲       Many: fast, isolated, focused
    ╰─────────────────╯

| Level | Test When... | Examples |
|---|---|---|
| Unit | Pure functions, business logic, data transformations, validation rules, utilities | formatDate(), calculateDiscount(), validateEmail(), schema parsing |
| Integration | Components interact with real boundaries (DB, API, auth, file system) | API endpoint returns correct data, RLS policy blocks unauthorized access, webhook processes payload |
| Component | UI components render correctly and respond to interaction | Button disables on click, form shows validation errors, modal opens/closes |
| E2E | Critical user journeys that span multiple pages/systems | Checkout flow, signup → onboarding, auth → dashboard redirect |
Choose the framework based on the project's stack. Read project.json for the stack.
Stack?
├── Next.js / React → Vitest + @testing-library/react
├── Remix / React Router → Vitest + @testing-library/react
├── Vue / Nuxt → Vitest + @vue/test-utils
├── Svelte / SvelteKit → Vitest + @testing-library/svelte
├── Node.js / Express / Hono → Vitest (or Jest if already configured)
├── TypeScript (no framework) → Vitest
├── Playwright already in project → Playwright for E2E
└── Jest already configured → Keep Jest (don't migrate mid-ticket)
Default choice: Vitest. Faster than Jest, native ESM/TypeScript support, compatible API.
Exception: If the project already uses Jest with significant test infrastructure, keep Jest. Don't migrate frameworks within a feature ticket.
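The decision tree above can be read as a small lookup. The following is a minimal sketch in Python, not part of the skill itself: the function name `choose_test_stack`, its parameters, and the lowercase stack labels are hypothetical, and project.json is not known to use these exact values.

```python
from typing import Dict, Optional

# Hypothetical lookup mirroring the decision tree above. The keys are
# illustrative labels, not values that project.json is known to contain.
COMPONENT_LIBRARIES: Dict[str, str] = {
    "next.js": "@testing-library/react",
    "react": "@testing-library/react",
    "remix": "@testing-library/react",
    "react router": "@testing-library/react",
    "vue": "@vue/test-utils",
    "nuxt": "@vue/test-utils",
    "svelte": "@testing-library/svelte",
    "sveltekit": "@testing-library/svelte",
}


def choose_test_stack(
    stack: str,
    jest_configured: bool = False,
    playwright_installed: bool = False,
) -> Dict[str, Optional[str]]:
    """Return the runner, component library, and E2E tool suggested by the tree."""
    return {
        # Existing Jest setups are kept; everything else defaults to Vitest.
        "runner": "Jest" if jest_configured else "Vitest",
        # Stacks without a UI layer (Node.js, plain TypeScript) get no library.
        "component_library": COMPONENT_LIBRARIES.get(stack.strip().lower()),
        # The tree only names Playwright when it is already in the project.
        "e2e": "Playwright" if playwright_installed else None,
    }
```

For example, `choose_test_stack("Vue")` suggests Vitest with @vue/test-utils, while passing `jest_configured=True` keeps Jest regardless of the stack.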
- Co-locate tests with their source: utils.ts → utils.test.ts
- Use a tests/ directory if project.json specifies paths.tests
- Name files {filename}.test.ts or {filename}.spec.ts, matching the existing convention in the project

Mocking is a tool, not a default. Every mock hides a real interaction.
Mock these boundaries:
| What | Why | How |
|---|---|---|
| External HTTP APIs | Slow, unreliable, costs money | msw (Mock Service Worker) or Vitest vi.mock |
| Database in unit tests | Slow, needs setup/teardown | Mock the repository/data layer, not the DB client directly |
| File system | Side effects, cleanup needed | memfs or mock the fs module |
| Timers / Dates | Non-deterministic | vi.useFakeTimers(), vi.setSystemTime() |
| Environment variables | Test isolation | vi.stubEnv() |
| Third-party SDKs (Stripe, SendGrid) | External dependency, costs money | Mock at the SDK boundary |
Do not mock these:
| What | Why |
|---|---|
| Your own utility functions | They're fast, deterministic — test them for real |
| Framework primitives (React hooks, Svelte stores) | Mocking them tests nothing real |
| Anything that runs in < 50ms | No performance reason to mock |
| The thing you're actually testing | Mocking the SUT = testing nothing |
| Database in integration tests | The whole point is testing the real query |
"If I remove this mock, does the test still make sense?"
- Yes → The mock is hiding a real dependency. Consider removing it.
- No → The mock is simulating an external boundary. Keep it.
Announce at start: "Starting visual verification with Playwright."
Playwright must be installed:
pip install playwright && playwright install chromium

    Task -> Static HTML?
        |-- Yes -> Read HTML file, identify selectors
        |          |-- Playwright script with file:// URL
        |
        |-- No (dynamic app) -> Server already running?
            |-- No -> Use with_server.py (see below)
            |-- Yes -> Reconnaissance-then-Action:
                1. Navigate + wait for networkidle
                2. Screenshot or inspect DOM
                3. Identify selectors from rendered state
                4. Execute actions with found selectors

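For the static-HTML branch, the file:// URL can be derived from a local path with the standard library. This is a minimal sketch; the helper name `to_file_url` and the example path in the comment are hypothetical.

```python
from pathlib import Path


def to_file_url(html_path: str) -> str:
    """Return an absolute file:// URL for a local HTML file."""
    # resolve() makes the path absolute, which as_uri() requires.
    return Path(html_path).resolve().as_uri()


# Possible use inside a Playwright script (path is illustrative):
# page.goto(to_file_url("dist/index.html"))
```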
The framework includes .claude/scripts/with_server.py — starts server, waits for port readiness, runs automation, cleans up.

    # Run --help first to see options
    python .claude/scripts/with_server.py --help

    # Single Server
    python .claude/scripts/with_server.py \
      --server "npm run dev" --port 5173 \
      -- python test_script.py

    # Multi-Server (Backend + Frontend)
    python .claude/scripts/with_server.py \
      --server "cd backend && python server.py" --port 3000 \
      --server "cd frontend && npm run dev" --port 5173 \
      -- python test_script.py

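The "waits for port readiness" step is described but not shown. The sketch below illustrates one common way to do such a wait with plain sockets. It is an assumption about the general technique, not the actual implementation of with_server.py, and the name `wait_for_port` and its defaults are hypothetical.

```python
import socket
import time


def wait_for_port(
    port: int,
    host: str = "127.0.0.1",
    timeout: float = 30.0,
    interval: float = 0.25,
) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not ready yet (refused or timed out); wait and retry.
            time.sleep(interval)
    return False
```

A caller would start the dev server, call `wait_for_port(5173)`, and only launch the automation script once it returns `True`.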
Automation scripts contain only Playwright logic — servers are managed by with_server.py:

    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto('http://localhost:5173')
        page.wait_for_load_state('networkidle')  # CRITICAL: Wait until JS is loaded
        # ... automation logic ...
        browser.close()

Reconnaissance: capture the rendered state before acting.

    # Take screenshot
    page.screenshot(path='/tmp/inspect.png', full_page=True)

    # Inspect DOM
    content = page.content()

    # Discover elements
    buttons = page.locator('button').all()
    links = page.locator('a[href]').all()
    inputs = page.locator('input, textarea, select').all()

Derive correct selectors from screenshot or DOM.

    page.click('text=Dashboard')
    page.fill('#email', 'user@example.com')  # placeholder address
    page.click('button[type="submit"]')

Capture console output so errors surface during the run:

    console_logs = []

    def handle_console(msg):
        console_logs.append(f"[{msg.type}] {msg.text}")

    page.on("console", handle_console)
    page.goto('http://localhost:5173')
    page.wait_for_load_state('networkidle')

    # Evaluate logs after interactions
    for log in console_logs:
        if log.startswith("[error]"):
            print(f"CONSOLE ERROR: {log}")

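The error check can also be pulled into a helper so that a script fails instead of only printing. This is a minimal sketch that assumes log entries use the "[type] text" format built in the console handler; the function names are hypothetical.

```python
from typing import List


def extract_console_errors(console_logs: List[str]) -> List[str]:
    """Return only the entries recorded with the "[error]" prefix."""
    return [entry for entry in console_logs if entry.startswith("[error]")]


def assert_no_console_errors(console_logs: List[str]) -> None:
    """Raise AssertionError listing every captured console error, if any."""
    errors = extract_console_errors(console_logs)
    if errors:
        raise AssertionError("Console errors found:\n" + "\n".join(errors))
```

Calling `assert_no_console_errors(console_logs)` after the interactions turns a silent console error into a failing run.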
- Run with headless=True; no GUI is needed.
- Call wait_for_load_state('networkidle') before DOM inspection on dynamic apps.
- Close the browser when done (browser.close()).
- Use descriptive selectors: text=, role=, CSS selectors, IDs.
- Save screenshots to /tmp/ and verify via Read tool.

Do not inspect the DOM before networkidle is reached: on dynamic apps the initial DOM is empty or incomplete.
When you finish a testing task, end your turn with a Test Matrix block — one row per test target (file, suite, or feature area). The Reporter (skills/reporter/SKILL.md) renders this into the per-role section of the develop-complete block; freeform prose at the end of your turn is off-voice.
This is the shared Test Matrix template — the same shape is used by skills/test-driven-development/SKILL.md. Both skills emit identical structure so the Reporter can merge them seamlessly.
Render the block verbatim. Fill every field. If a field genuinely does not apply, write — (em dash) — never omit the row.
### Test Matrix
| Test Type | Target | Coverage | Passed | Failed |
|---|---|---|---|---|
| {unit\|integration\|e2e\|smoke\|visual} | {file or suite, e.g. `lib/parser.ts`} | {percent or `—`} | {int} | {int} |
| … | … | … | … | … |
Total: {passed_total} / {total} passed · {failed_total} failed
Rules for the table:
- Test Type: unit, integration, e2e, smoke, visual. No other values.
- Coverage: a percentage (e.g. 87%), or — if coverage was not collected for this row.
- Skipped: if tests were skipped, add a Skipped column as the 6th column (after Failed); otherwise omit it entirely. When present, include it in every row and in the Total line: Total: {passed_total} / {total} passed · {failed_total} failed · {skipped_total} skipped.
- Total line: its counts correspond to the tests_passed / tests_total variables.

The Reporter consumes the table verbatim: column order is fixed (Test Type, Target, Coverage, Passed, Failed, and optionally Skipped as column 6), and header text is fixed. Do not add adjacent prose or commentary; structured data only.
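To make the rules concrete, here is a minimal sketch of a renderer that produces the block from row data. The function `render_test_matrix` and the row dictionary keys are hypothetical, and the sketch assumes {total} is the sum of passed, failed, and skipped tests, which the template does not state.

```python
from typing import Dict, List

ALLOWED_TYPES = {"unit", "integration", "e2e", "smoke", "visual"}


def render_test_matrix(rows: List[Dict]) -> str:
    """Render rows with keys: type, target, coverage, passed, failed, skipped (optional)."""
    for row in rows:
        if row["type"] not in ALLOWED_TYPES:
            raise ValueError(f"Unsupported test type: {row['type']}")

    # The Skipped column appears only if at least one row reports skips.
    with_skipped = any(row.get("skipped", 0) for row in rows)
    header = ["Test Type", "Target", "Coverage", "Passed", "Failed"]
    if with_skipped:
        header.append("Skipped")

    lines = [
        "### Test Matrix",
        "| " + " | ".join(header) + " |",
        "|" + "---|" * len(header),
    ]
    for row in rows:
        cells = [
            row["type"],
            row["target"],
            row.get("coverage") or "—",  # em dash when coverage was not collected
            str(row["passed"]),
            str(row["failed"]),
        ]
        if with_skipped:
            cells.append(str(row.get("skipped", 0)))
        lines.append("| " + " | ".join(cells) + " |")

    passed = sum(row["passed"] for row in rows)
    failed = sum(row["failed"] for row in rows)
    skipped = sum(row.get("skipped", 0) for row in rows)
    total = passed + failed + skipped  # assumption: total counts every test
    summary = f"Total: {passed} / {total} passed · {failed} failed"
    if with_skipped:
        summary += f" · {skipped} skipped"
    lines.append(summary)
    return "\n".join(lines)
```

Rejecting unknown test types up front keeps the output within the fixed vocabulary the rules require.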