Deep browser-based UI audit. Clicks every button, fills every form, opens every dropdown, reads every console log — like a scrutinizing QA engineer. Supports headed and headless. Optionally feeds fixes into /ralph. Used as foundry SIGHT stream.
Slash command: `/foundry:sight`
> **Foundry Integration:** This skill's browser audit flow is used as the SIGHT phase (Phase F2c) in `/foundry`. SIGHT runs in the main thread (Playwright MCP requirement). Output goes to `foundry/proofs/cycle-N-sight.json`. Skip with `--no-ui`.
Ralph-UI uses Playwright MCP to deeply exercise a web application. It doesn't just visit pages — it clicks every button, fills every form, opens every dropdown, expands every accordion, triggers every modal, and watches what happens. It reads console logs, tracks network requests, and documents everything that's wrong.
Think: a meticulous QA engineer or a frustrated user trying to break things.
Supports headed mode (GUI) and headless mode (CI/remote). Optionally
feeds findings into /ralph for fixes.
Playwright MCP configured in .mcp.json with devtools enabled:
Headed (default — GUI available):
    {
      "mcpServers": {
        "playwright": {
          "command": "npx",
          "args": ["@playwright/mcp@latest", "--caps", "vision,devtools"]
        }
      }
    }
Headless (no GUI / CI / remote server):
    {
      "mcpServers": {
        "playwright": {
          "command": "npx",
          "args": ["@playwright/mcp@latest", "--headless", "--caps", "vision,devtools"]
        }
      }
    }
The target page must be accessible (running locally or deployed)
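As a minimal sketch, the two `.mcp.json` variants above differ only in the `--headless` argument, so they could be generated from a single flag. The function name `buildPlaywrightMcpConfig` and the `McpConfig` type are hypothetical and not part of the skill.

```typescript
// Hypothetical helper: produces the .mcp.json content shown above.
type McpConfig = {
  mcpServers: { playwright: { command: string; args: string[] } };
};

function buildPlaywrightMcpConfig(headless: boolean): McpConfig {
  const args: string[] = ["@playwright/mcp@latest"];
  // Headless mode adds one flag before the capability list.
  if (headless) {
    args.push("--headless");
  }
  args.push("--caps", "vision,devtools");
  return { mcpServers: { playwright: { command: "npx", args: args } } };
}
```

Serializing the result with `JSON.stringify(config, null, 2)` gives a file body equivalent to the examples above.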
| Command | Behavior |
|---|---|
| `/ralph-ui <url>` | Deep audit, document everything |
| `/ralph-ui <url> --fix` | Deep audit + generate plan + run `/ralph` to fix |
| `/ralph-ui <url> --parity <reference>` | Compare against reference design/URL |
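The invocation forms in the table could be parsed along these lines. This is only an illustrative sketch: the `RalphUiArgs` type and `parseRalphUiArgs` function are assumptions, and flags not listed in the table are simply ignored here.

```typescript
// Hypothetical argument shape for the three invocation forms in the table.
type RalphUiArgs = {
  url: string;
  fix: boolean;
  parity?: string; // reference design or URL for --parity
};

function parseRalphUiArgs(input: string): RalphUiArgs {
  const tokens = input.trim().split(/\s+/);
  if (tokens[0] === "/ralph-ui") {
    tokens.shift();
  }
  const args: RalphUiArgs = { url: "", fix: false };
  for (let i = 0; i < tokens.length; i++) {
    const token = tokens[i];
    const isFlag = token.slice(0, 2) === "--";
    if (token === "--fix") {
      args.fix = true;
    } else if (token === "--parity") {
      const reference = tokens[i + 1];
      if (reference === undefined || reference.slice(0, 2) === "--") {
        throw new Error("--parity requires a reference design or URL");
      }
      args.parity = reference;
      i++; // consume the reference value
    } else if (!isFlag && args.url === "") {
      args.url = token; // first bare token is the target URL
    }
    // Any other flag is ignored in this sketch.
  }
  if (args.url === "") {
    throw new Error("a target URL is required");
  }
  return args;
}
```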
DO NOT just visit a page, glance at it, and move on. For every page:
CRITICAL: This skill requires the Playwright MCP browser tools (mcp__playwright__*).
All browser interaction MUST go through these tools — not code reading, not curl, not
fetch. You must actually navigate pages, click elements, and verify outcomes in a real
browser session.
Required Playwright MCP tools:
| Tool | Purpose |
|---|---|
| `mcp__playwright__browser_navigate` | Go to URLs |
| `mcp__playwright__browser_snapshot` | Read DOM / accessibility tree |
| `mcp__playwright__browser_click` | Click buttons, links, tabs, checkboxes |
| `mcp__playwright__browser_fill_form` | Type into text inputs, textareas |
| `mcp__playwright__browser_select_option` | Choose dropdown/select values |
| `mcp__playwright__browser_press_key` | Keyboard: Enter, Escape, Tab |
| `mcp__playwright__browser_take_screenshot` | Capture visual state |
| `mcp__playwright__browser_console_messages` | Read console errors/warnings |
| `mcp__playwright__browser_network_requests` | Check failed API calls |
| `mcp__playwright__browser_hover` | Hover for tooltips, menus |
| `mcp__playwright__browser_evaluate` | Run JS in the page context |
| `mcp__playwright__browser_wait_for` | Wait for elements, navigation |
| `mcp__playwright__browser_tabs` | Manage browser tabs |
| `mcp__playwright__browser_resize` | Test responsive layouts |
| `mcp__playwright__browser_handle_dialog` | Accept/dismiss alert/confirm |
If any mcp__playwright__* call returns "unknown tool", Playwright MCP is not
configured. Run .claude/scripts/ensure-playwright.sh to set it up.
BANNED ACTIONS — These are NOT UI auditing:
The ONLY way to audit a UI is to open it in a browser and interact with it. Reading the source code that generates the UI is a logical audit, not a UI audit.
Before navigating, verify that both the frontend AND backend are accessible.
Frontend: Try to reach the target URL. If it fails:
- `.claude/scripts/detect-devserver.sh`

Backend: Check if a separate backend/API server exists and is running:
- (`backend/` or `server/` or `api/` directory)
- `http://localhost:<port>`: if it responds, the backend is running
- If `--skip-start-backend` was NOT passed:
- If `--skip-start-backend` was passed: skip, assume the user manages the backend

Why: A UI audit without a running backend produces hundreds of false network errors. Every API call fails, every form submission fails, every data table is empty. These aren't UI bugs; they're a missing backend.
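The preflight decisions above can be sketched as a pure function over the probe results. This is a hedged illustration: the type and action names are hypothetical, and the `start-backend` action is an assumption about what happens when `--skip-start-backend` is not passed, since the text does not spell that branch out.

```typescript
type PreflightInput = {
  frontendReachable: boolean; // the target URL responded
  backendDirFound: boolean;   // a backend/, server/, or api/ directory exists
  backendResponds: boolean;   // http://localhost:<port> answered
  skipStartBackend: boolean;  // --skip-start-backend was passed
};

type PreflightAction =
  | "detect-devserver"             // run .claude/scripts/detect-devserver.sh
  | "start-backend"                // ASSUMPTION: default when the backend is down
  | "assume-user-managed-backend"; // skip, the user manages the backend

function planPreflight(p: PreflightInput): PreflightAction[] {
  const actions: PreflightAction[] = [];
  if (!p.frontendReachable) {
    actions.push("detect-devserver");
  }
  // Only act on the backend when one exists and is not answering.
  if (p.backendDirFound && !p.backendResponds) {
    actions.push(p.skipStartBackend ? "assume-user-managed-backend" : "start-backend");
  }
  return actions;
}
```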
Your VERY FIRST tool call in this step MUST be:
mcp__playwright__browser_navigate url: "<target_url>"
If this fails with "unknown tool", STOP and report that Playwright MCP is not available. Do NOT fall back to Bash/curl/code reading.
- `mcp__playwright__browser_navigate`
- `mcp__playwright__browser_take_screenshot`
- `mcp__playwright__browser_snapshot`
- `mcp__playwright__browser_fill_form` and submit with `mcp__playwright__browser_click`
- `mcp__playwright__browser_take_screenshot`
- `mcp__playwright__browser_snapshot`
- `mcp__playwright__browser_console_messages`
- `mcp__playwright__browser_network_requests`
- React Router (`createBrowserRouter`, `<Route`, `routes.ts`, `routes.tsx`, `router.ts`), Next.js (`app/` or `pages/` directory), Vue Router (`router/index.ts`), Angular (`app-routing.module.ts`, `*.routes.ts`), SvelteKit (`src/routes/`)

For each page reachable from navigation:
Walk through the DOM snapshot top-to-bottom. For each interactive element:
Buttons:
Form fields (text inputs, textareas, number inputs, spinbuttons):
Dropdowns/selects/comboboxes:
Tables:
Tabs:
Search/filter inputs:
Toggle/switches:
Navigation links:
Action menus (kebab/... buttons):
After every interaction, verify the expected outcome actually rendered. A button that silently does nothing, a panel that opens empty, a form that submits but shows no result — these are real bugs even if the console is silent and the network returned 200.
Protocol for every interaction:
What "meaningfully changed" means per interaction type:
| Action | Expected Outcome — Verify This |
|---|---|
| Click "View {X}" / "Show {X}" | Content panel/modal appeared AND contains data related to {X} — not empty, not a spinner stuck forever, not a blank panel |
| Click "Edit" / "Edit {X}" | Edit form appeared with pre-populated fields matching the item being edited |
| Click "Delete" / "Remove" | Confirmation dialog appeared, OR item was removed from the list/table |
| Click "Create" / "New" / "Add" | Creation form or wizard appeared with empty/default fields |
| Click "Deploy" / "Run" / "Execute" | Status indicator changed (progress bar, spinner, status badge update) |
| Submit a form | Success message appeared, OR page redirected, OR the created/updated item is visible in a list |
| Toggle a switch | Visual state flipped AND (if applicable) the API call confirmed the change |
| Open a dropdown | Options list appeared AND is populated (not empty) |
| Select a dropdown option | The selected value is displayed AND any dependent UI updated |
| Click a tab | Tab content area changed to show the tab's content — not empty, not the previous tab's content |
| Click a table sort header | Row order changed (or stayed same if already sorted that direction) |
| Click pagination | Different rows are shown, page indicator updated |
| Expand an accordion | Content section appeared with actual content inside |
| Click a navigation link | Page changed, URL updated, new content loaded |
| Search/filter | Results updated to match the query, OR empty state shown for no matches |
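One way to make the checks in the table above mechanical is a small classifier over what was observed after each interaction. This is a sketch under stated assumptions: the `Observation` fields and the verdict names are hypothetical, chosen to mirror the failure classifications this step defines, and the 5000 ms threshold follows the "5+ seconds" wording.

```typescript
type Observation = {
  domChanged: boolean;            // snapshot differs from the pre-action snapshot
  newConsoleMessages: number;     // console messages logged since the action
  newNetworkRequests: number;     // network requests issued since the action
  containerEmpty: boolean;        // a panel or modal opened but holds no content
  spinnerVisibleMs: number;       // how long a loading indicator stayed visible
  contentMatchesContext: boolean; // rendered content relates to the clicked item
  brokenFields: string[];         // fields rendered with missing or broken data
};

type Verdict =
  | "ok"
  | "no-response"
  | "stuck-loading"
  | "empty-container"
  | "wrong-content"
  | "broken-data";

function classifyOutcome(o: Observation): Verdict {
  // Nothing visible, nothing logged, nothing requested: the element is dead.
  if (!o.domChanged && o.newConsoleMessages === 0 && o.newNetworkRequests === 0) {
    return "no-response";
  }
  if (o.spinnerVisibleMs >= 5000) return "stuck-loading";
  if (o.containerEmpty) return "empty-container";
  if (!o.contentMatchesContext) return "wrong-content";
  if (o.brokenFields.length > 0) return "broken-data";
  return "ok";
}
```

The ordering of the checks is a design choice of this sketch: a stuck spinner is reported before an empty container, because an empty panel behind a spinner is usually a symptom of the load never finishing.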
Failure classifications:
- "{element}" does nothing when clicked: no visible response, no console, no network
- "Clicking {action} opens {container} but it is empty; expected {what should appear}"
- "Content rendered after {action} does not match context: expected {X}, got {Y}"
- "Loading state after {action} never resolved: spinner visible for 5+ seconds"
- "{element} rendered with missing/broken data: {field} shows {value}"

This step is NOT optional. Every interaction in Step 2b must have its outcome verified. Checking console + network is necessary but NOT sufficient: the DOM must prove the action worked.
Backend verification for mutations: After any CREATE, UPDATE, or DELETE action that shows success in the UI:
Use `mcp__playwright__browser_evaluate` to make a fetch call and verify the backend actually has the data:
await fetch('/api/items/' + id).then(r => r.json())
This catches the most insidious bug class: forms that show a success toast but don't actually call the API, or APIs that return 200 but don't write to the database.
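Once the record has been fetched from the backend, comparing it with what the form submitted can be a plain field-by-field check. The sketch below assumes flat JSON records; `JsonRecord` and `findUnpersistedFields` are hypothetical names, and the API path in the snippet above is itself only an example.

```typescript
type JsonRecord = { [field: string]: string | number | boolean | null };

// Returns the submitted fields whose stored value is missing or different.
// A null `stored` means the backend had no record at all.
function findUnpersistedFields(submitted: JsonRecord, stored: JsonRecord | null): string[] {
  const unpersisted: string[] = [];
  for (const field of Object.keys(submitted)) {
    if (stored === null || stored[field] !== submitted[field]) {
      unpersisted.push(field);
    }
  }
  return unpersisted;
}
```

An empty result means every submitted field round-tripped; any other result is a candidate finding even when the UI showed a success toast.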
For pages with data tables or complex layouts:
After exercising every element on the page, record:
After auditing all pages, write ALL captured console messages to a persistent file:
Create quality_reports/console-logs-iter{N}.md (where N = foundry cycle, or 0 for standalone):
# Console Logs — Iteration {N}
Date: {date}
URL: {base url}
Pages visited: {N}
Total console messages: {N}
Errors: {N} | Warnings: {N} | Info: {N}
## Errors (grouped by unique message)
### CE-1: {short error summary}
- **Page:** {url path where first seen}
- **Trigger:** {what interaction caused it — e.g., "clicking Deploy button"}
- **Message:** `{full console error message}`
- **Stack:** `{stack trace if available}`
- **Frequency:** {N} occurrences across {N} pages
- **User impact:** {what breaks — e.g., "Deploy button does nothing"}
- **Also seen on:** {other pages where same error appears}
### CE-2: ...
## Warnings (grouped by unique message)
### CW-1: {short warning summary}
- **Page:** {url path}
- **Message:** `{full warning}`
- **Frequency:** {N}
- **Type:** React warning | deprecation | other
## Info (only notable — skip routine framework logs)
### CI-1: {notable info message}
This file is persistent and referenced from the sub-spec. Console errors that break user-facing functionality become findings in the foundry defect list.
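Grouping errors by unique message, as the template's `CE-N` entries require, might look like the following sketch. The `ConsoleEntry` and `ErrorGroup` shapes are assumptions for illustration and do not reflect the actual output format of `mcp__playwright__browser_console_messages`.

```typescript
type ConsoleEntry = {
  level: "error" | "warning" | "info";
  message: string;
  page: string; // URL path where the message was captured
};

type ErrorGroup = {
  id: string;          // CE-1, CE-2, ...
  message: string;
  firstPage: string;   // "Page" in the template
  frequency: number;   // total occurrences
  alsoSeenOn: string[];
};

function groupConsoleErrors(entries: ConsoleEntry[]): ErrorGroup[] {
  const groups: ErrorGroup[] = [];
  for (const entry of entries) {
    if (entry.level !== "error") continue;
    let group: ErrorGroup | undefined;
    for (const candidate of groups) {
      if (candidate.message === entry.message) {
        group = candidate;
        break;
      }
    }
    if (group === undefined) {
      groups.push({
        id: "CE-" + (groups.length + 1),
        message: entry.message,
        firstPage: entry.page,
        frequency: 1,
        alsoSeenOn: [],
      });
    } else {
      group.frequency += 1;
      // Record other pages once each, excluding the page of first sighting.
      if (entry.page !== group.firstPage && group.alsoSeenOn.indexOf(entry.page) === -1) {
        group.alsoSeenOn.push(entry.page);
      }
    }
  }
  return groups;
}
```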
Create quality_reports/ui-audit-{timestamp}.md with:
# UI Audit: {url}
Date: {date}
Viewport: {width}x{height}
Pages audited: {N}
Elements exercised: {N}
Total interactions: {N}
## Summary
- Critical issues: {N}
- Major issues: {N}
- Minor issues: {N}
- Console errors: {N} (unique), {N} (total)
- Network failures: {N}
- Accessibility: {N}
- Pages broken: {N}/{total}
## Console & Runtime Errors
### E1: {error message summary}
- **Type**: error | warning | uncaught exception | unhandled rejection
- **Message**: {full message}
- **Stack**: {stack trace if available}
- **Trigger**: {what interaction caused this — e.g., "clicking Delete button on row 3"}
- **Page**: {which page}
- **Frequency**: {how many times it fired}
- **Impact**: {what user-facing behavior this causes}
## Network Failures
### N1: {failed request}
- **URL**: {request URL}
- **Method**: {GET/POST/etc}
- **Status**: {status code or error type}
- **Trigger**: {what interaction caused this}
- **Request body**: {summary of what was sent, if visible}
- **Impact**: {what breaks}
## Functional Issues
### F1: {title}
- **Page**: {page URL}
- **Element**: {what element, CSS selector or description}
- **Action**: {what you did — clicked, typed, submitted}
- **Expected**: {what should happen}
- **Actual**: {what actually happened}
- **Console**: {any console errors triggered}
- **Network**: {any network calls triggered}
## Visual Issues
### V1: {title}
- **Page**: {page URL}
- **What**: {description}
- **Where**: {CSS selector or visual location}
- **Viewports affected**: {which breakpoints}
## Data Issues
### D1: {title}
- **Page**: {page URL}
- **What**: {empty columns, wrong counts, missing labels, stale data}
- **Element**: {table/cell/badge description}
- **Expected**: {what should be displayed}
- **Actual**: {what is displayed}
## Accessibility Issues
### A1: {title}
- **Rule**: {WCAG rule violated}
- **Element**: {selector}
- **Fix**: {what to change}
## Route Coverage
- Routes from router config: {N}
- Routes discovered via navigation: {N}
- Routes discovered via direct URL only: {N}
- Routes unreachable (404 or auth-blocked): {N}
- **Coverage: {N}% of defined routes visited**
| Route | Source | Status |
|-------|--------|--------|
| /dashboard | nav | OK |
| /settings/advanced | router config (direct URL) | OK |
| /admin/debug | router config (direct URL) | 403 — auth required |
## Page-by-Page Detail
### {Page Name} — {url path}
**Status**: OK | FUNCTIONAL (has issues) | BROKEN | 404
**Discovery**: navigation | router config (direct URL)
**Console errors on load**: {N}
**Network failures on load**: {N}
**Elements exercised:**
- [ ] {button/link/form} — **Outcome**: {OK: expected content rendered | FAIL: issue ref}
- [ ] {button/link/form} — **Outcome**: {OK: expected content rendered | FAIL: issue ref}
...
### {Next Page} — {url path}
...
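The Route Coverage figures in the report template above could be derived as in this sketch. It assumes that a route counts as visited only when it loaded with a status below 400, which the template does not state explicitly; `RouteResult` and `summarizeCoverage` are hypothetical names.

```typescript
type RouteResult = {
  route: string;
  source: "nav" | "router config (direct URL)";
  status: number; // HTTP status observed when visiting the route
};

function summarizeCoverage(definedRoutes: string[], results: RouteResult[]) {
  let viaNavigation = 0;
  let directUrlOnly = 0;
  let unreachable = 0;
  let visitedDefined = 0;
  for (const result of results) {
    const reachable = result.status < 400; // ASSUMPTION: 404/403 count as unreachable
    if (!reachable) {
      unreachable++;
    } else if (result.source === "nav") {
      viaNavigation++;
    } else {
      directUrlOnly++;
    }
    if (reachable && definedRoutes.indexOf(result.route) !== -1) {
      visitedDefined++;
    }
  }
  const coveragePct =
    definedRoutes.length === 0 ? 0 : Math.round((visitedDefined / definedRoutes.length) * 100);
  return {
    fromRouterConfig: definedRoutes.length,
    viaNavigation: viaNavigation,
    directUrlOnly: directUrlOnly,
    unreachable: unreachable,
    coveragePct: coveragePct,
  };
}
```

Applied to the three example rows in the template's route table, this yields one route via navigation, one via direct URL, one unreachable, and 67% coverage.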
This step runs when ralph-ui is invoked as foundry SIGHT stream. In standalone
mode, this step is skipped unless --suggest is passed.
While auditing, actively identify opportunities to improve the user experience. Don't just document what's broken — propose what would make it better.
For each suggestion, verify it can actually be implemented:
Layer 1, Spec check: Read the LISA spec (path stored in foundry state or quality_reports/specs/).
- `spec_supported: yes | no | ambiguous`

Layer 2, Codebase check: Search the backend code for supporting infrastructure.
- `code_supported: yes | no | partial`

Layer 3, Runtime check: If the app is running, test the relevant endpoint.
- `runtime_supported: yes | no | error | untested`

Minor (auto-implement): Small UX improvements that:

Major (backlog): Significant feature additions that:
Minor suggestions → added to the audit report as s-N items (lowercase).
When invoked as foundry SIGHT, these are automatically converted into defects
for the next GRIND cycle.
Major suggestions → written to the suggestion backlog file at
quality_reports/suggestion-backlog.md. Format:
# Suggestion Backlog: {url}
Date: {date}
Total suggestions: {N}
Auto-implemented: {N} (minor)
Pending review: {N} (major)
## Pending Suggestions (Major — requires user approval)
### S-1: {title}
- **Page**: {url}
- **Suggestion**: {what to add/change}
- **Feasibility**:
- Spec: {yes/no/ambiguous}
- Codebase: {yes/no/partial}
- Runtime: {yes/no/error/untested}
- **Effort estimate**: {small/medium/large}
## Auto-Implemented Suggestions (Minor)
### s-1: {title}
- **Page**: {url}
- **What was added**: {description}
- **Cycle**: {which foundry cycle it was implemented in}
The backlog accumulates across INSPECT cycles and is presented in the final foundry report (F6: DONE).
If --fix was NOT passed (standalone mode):
If --fix WAS passed (standalone mode):
- Run `/ralph` with the generated plan

If invoked as foundry SIGHT stream (F2c, READ-ONLY):
R=$PWD; while [ -n "$R" ] && [ "$R" != / ] && [ ! -d "$R/.claude" ]; do R=${R%/*}; done
"$R/.claude/scripts/foundry.sh" save-console-errors <console-log-file>
- Output goes to `foundry/proofs/cycle-N-sight.json`
- Signal completion with `foundry_mark_stream("sight")`:
"$R/.claude/scripts/foundry.sh" mark-stream sight <items_checked>
Ralph-UI auto-detects headed vs headless Playwright mode using
.claude/scripts/detect-display.sh. No manual .mcp.json editing needed.
| Environment | Mode | Reason |
|---|---|---|
| macOS | Headed | GUI always available |
| Linux + DISPLAY/Wayland | Headed | Display server detected |
| Linux + SSH (no X11) | Headless | No display forwarding |
| WSL without WSLg | Headless | No native GUI |
| CI (GitHub Actions, GitLab, etc.) | Headless | CI environment |
Override with --headless or --headed flags.
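The decision table above can be approximated with a few environment checks. This is not the logic of `.claude/scripts/detect-display.sh`, which is not shown here; it is a sketch that assumes the conventional `CI`, `DISPLAY`, and `WAYLAND_DISPLAY` environment variables, with `detectMode` as a hypothetical name.

```typescript
type Env = { [name: string]: string | undefined };
type Mode = "headed" | "headless";

// `platform` follows Node's process.platform values ("darwin", "linux", ...).
function detectMode(platform: string, env: Env, override?: Mode): Mode {
  // --headed / --headless always win.
  if (override !== undefined) return override;
  // ASSUMPTION: CI is checked first, so a macOS CI runner is treated as headless.
  if (env["CI"]) return "headless";
  if (platform === "darwin") return "headed";
  // A display server is reachable (X11 or Wayland, including WSLg).
  if (env["DISPLAY"] || env["WAYLAND_DISPLAY"]) return "headed";
  // SSH without X11 forwarding, WSL without WSLg, or any other displayless host.
  return "headless";
}
```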
| Feeds into /ralph-ui | /ralph-ui feeds into |
|---|---|
| Live URL + credentials | Audit report (quality_reports/) |
| Reference design/URL (parity mode) | Fix plan for /ralph |
| Playwright MCP (browser + devtools) | Foundry SIGHT proof (foundry/proofs/cycle-N-sight.json) |
| LISA spec (for suggestion feasibility) | Suggestion backlog (quality_reports/) |
| Foundry state (cycle, target URL) | Foundry defect list (foundry/defects.json) |
| Decomposed spec (quality_reports/specs/) | Console log file (quality_reports/console-logs-iter{N}.md) |
| Prior console logs (check regressions) | foundry_mark_stream("sight") completion signal |
npx claudepluginhub alphabravocompany/codsworth-marketplace --plugin foundry