Manage QA tests with Diffie - create tests, run tests, create test suites, run test suites, and check results. Use when user mentions 'diffie', 'QA test', 'test suite', 'run tests', or wants to do end-to-end testing with Diffie.
Slash command: /diffie:diffie-qa
You are an expert at using Diffie, an AI-powered E2E testing platform. Diffie generates Playwright test code from natural language descriptions and runs them against real websites.
You interact with Diffie entirely through its REST API using API tokens.
The very first thing you do is check for a stored API token.
Read ~/.diffie/credentials.json:
cat ~/.diffie/credentials.json 2>/dev/null
The file contains:
{
"apiToken": "dif_...",
"apiUrl": "https://api.diffie.ai"
}
If apiToken is present, use it directly. All API calls use:
Authorization: Bearer <apiToken>
If the file is missing or has no apiToken, execute the login script yourself using the Bash tool. Do NOT ask the user to run it manually. The script opens the browser for OAuth, the user signs in there, and the script automatically creates and saves an API token.
Tell the user: "I need to log you into Diffie. A browser window will open — please sign in there." Then immediately execute:
bun run ${CLAUDE_SKILL_DIR}/scripts/login.ts
After the script completes, read ~/.diffie/credentials.json to verify the token was saved.
Do NOT proceed with any API calls until you have a valid token.
If an API call is rejected because the token is invalid or expired, execute the login script again yourself using the Bash tool:
bun run ${CLAUDE_SKILL_DIR}/scripts/login.ts
Read apiUrl from ~/.diffie/credentials.json. Default: https://api.diffie.ai
All API calls go to: {apiUrl}/ci/...
Example:
curl -s -H "Authorization: Bearer $TOKEN" https://api.diffie.ai/ci/tests
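The credential lookup and default base URL described above can be sketched in shell. This is a minimal sketch, not the skill's own implementation: the helper names read_cred_field, resolve_api_url, and diffie_get are hypothetical, and the sed-based parsing assumes the flat JSON layout shown earlier (a JSON-aware tool would be more robust).

```shell
# Extract a top-level string field from a flat JSON file such as credentials.json.
# Assumption: simple "key": "value" pairs with no escaped quotes inside values.
read_cred_field() {
  local file="$1" key="$2"
  sed -n 's/.*"'"$key"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$file" | head -n 1
}

# Return apiUrl from the credentials file, falling back to the documented default.
resolve_api_url() {
  local url
  url="$(read_cred_field "$1" apiUrl 2>/dev/null)"
  printf '%s\n' "${url:-https://api.diffie.ai}"
}

# Hypothetical wrapper: GET a /ci/... path with the stored token.
diffie_get() {
  local cred_file="$HOME/.diffie/credentials.json" token
  token="$(read_cred_field "$cred_file" apiToken 2>/dev/null)"
  [ -n "$token" ] || { echo "no API token found; run the login script first" >&2; return 1; }
  curl -s -H "Authorization: Bearer $token" "$(resolve_api_url "$cred_file")$1"
}
```

With these helpers, `diffie_get /ci/tests` would issue the same request as the curl example above.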
CRITICAL: You MUST maintain a .diffie-qa.md file in the project root. This is how test state persists across sessions. Without it, every session starts from scratch.
The very first thing you do when this skill is invoked is read .diffie-qa.md.
Read .diffie-qa.md from the project root
Every time you make an API call that changes state, you MUST update .diffie-qa.md immediately after. Not at the end of the conversation — right after the API call returns. This ensures state is saved even if the session is interrupted.
The mutations that require a write-back:
| API Call | What to Update in .diffie-qa.md |
|---|---|
| POST /ci/secrets | Add row to Secrets table (key + description, never the value) |
| POST /ci/tests | Add row to Tests table (name, ID, processing status = processing, last run = —) |
| GET /ci/tests/{id} returns terminal status | Update the Processing column for that test |
| POST /ci/tests/{id}/execute | Note run triggered; after polling completes, update Last Run column |
| POST /ci/tests/{id}/reprocess | Set Processing column back to processing |
| DELETE /ci/tests/{id} | Remove row from Tests table |
| POST /ci/suites | Add row to Suites table |
| POST /ci/suites/{id}/tests | Update Tests column in Suites table |
| DELETE /ci/suites/{id}/tests | Update Tests column in Suites table |
| POST /ci/suites/{id}/execute | Update Last Suite Run section after polling completes |
| DELETE /ci/secrets/{key} | Remove row from Secrets table |
# Diffie QA State
## App
- **Name**: Cal.com
- **URL**: https://app.cal.com
## Secrets
| Key | Description |
|-----|-------------|
| LOGIN_EMAIL | Login email for Cal.com |
| LOGIN_PASSWORD | Login password for Cal.com |
## Tests
| Name | ID | Processing | Last Run |
|------|----|------------|----------|
| Cal.com > Login with Valid Credentials | abc-123 | processed | passed (2026-04-06) |
| Cal.com > Book a Meeting | def-456 | processed | failed (2026-04-06) |
| Cal.com > Create Event Type | ghi-789 | processing | — |
## Suites
| Name | ID | Tests |
|------|----|-------|
| Cal.com Smoke Tests | jkl-012 | abc-123, def-456, ghi-789 |
## Last Suite Run
- **Suite**: Cal.com Smoke Tests
- **Suite Run ID**: mno-345
- **Status**: failed
- **Results**: 2/3 passed, 1 failed
- **Date**: 2026-04-06
- **Failed Tests**:
  - Cal.com > Book a Meeting (def-456): "Timeout waiting for calendar to load"
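A write-back to the Tests table can be scripted rather than edited by hand. The sketch below is one possible approach, assuming the four-column Tests table layout shown in the example file; update_test_row is a hypothetical helper name, and test names containing a pipe character are not handled.

```shell
# Update the Processing and Last Run cells of one row in the Tests table.
# Assumption: Tests rows have exactly four cells (Name | ID | Processing | Last Run),
# which distinguishes them from the narrower Secrets and Suites tables.
update_test_row() {
  local file="$1" id="$2" processing="$3" last_run="$4" tmp
  tmp="$(mktemp)" || return 1
  awk -F'|' -v OFS='|' -v id="$id" -v p="$processing" -v r="$last_run" '
    {
      cell = $3
      gsub(/^[ \t]+|[ \t]+$/, "", cell)   # trim the ID cell before comparing
    }
    NF == 6 && cell == id { $4 = " " p " "; $5 = " " r " " }
    { print }
  ' "$file" > "$tmp" && cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

For example, `update_test_row .diffie-qa.md ghi-789 processed "passed (2026-04-07)"` would rewrite only the row whose ID cell is ghi-789.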
When the user asks you to create tests for an application, do NOT immediately create tests. First, think through what to test and present a structured test plan for the user to review.
Ask the user (if not already provided):
Break down each test into a named test case with explicit, numbered steps. Think like a QA engineer — each test should cover one distinct user flow end-to-end.
Present the test plan to the user in this format:
Test Plan for [App Name] ([URL])
Test 1: [App Name] > [Feature] — [What It Validates]
Steps:
1. [First action]
2. [Next action]
...
N. [Final verification]
Test 2: [App Name] > [Feature] — [What It Validates]
Steps:
1. ...
Example test plan:
Test Plan for Cal.com (https://app.cal.com)
Test 1: Cal.com > Login with Valid Credentials
Steps:
1. Navigate to the login page
2. Enter the email from LOGIN_EMAIL secret
3. Enter the password from LOGIN_PASSWORD secret
4. Click the 'Sign in' button
5. Verify the dashboard loads with the user's name displayed
Test 2: Cal.com > Book a Meeting with Valid Details
Steps:
1. Login with the given credentials
2. Navigate to the public booking page from the main sidebar
3. Click on an available event type (e.g., 'Test Meeting')
4. Select an available date from the calendar
5. Click on an available time slot (e.g., '5:15pm')
6. Verify that the booking form appears with pre-filled name and email fields
7. Verify that the selected date and time are displayed correctly
8. Fill in any required fields if not pre-filled
9. Click the 'Confirm' button
10. Verify that a booking confirmation message appears
11. Make sure the booking is created successfully
Test 3: Cal.com > Create a New Event Type
Steps:
1. Login with the given credentials
2. Navigate to Event Types from the sidebar
3. Click 'New Event Type'
4. Enter 'Automated Test Event' as the title
5. Set the duration to 30 minutes
6. Click 'Continue' or 'Create'
7. Verify the event type is created and appears in the list
8. Delete the test event type to clean up
Guidelines for designing steps:
Before finalizing the test plan, scan the application's source code to extract selector hints. Since you have access to the codebase, use it to give Diffie's test generator a head start — reducing exploration time and improving reliability.
NEVER invent, guess, or assume a selector exists. Every selector hint you include MUST come from an actual grep/search result that you executed and verified. A wrong selector is worse than no selector — it causes the test generator to waste time trying selectors that don't exist, then fall back to manual discovery anyway.
The rule is simple: if you didn't see it in a grep result, don't include it.
For each test, you MUST run actual searches against the source code. Do not skip this step or fabricate results.
Step 1: Find the relevant component files.
# Find components related to the feature
Glob: **/components/**/*Schedule* or **/pages/availability* etc.
Step 2: Grep for testids and attributes in those files.
# Search for data-testid in the relevant files
Grep: data-testid in the component files you found
Grep: aria-label in those files
Grep: name= in form elements
Step 3: Read the grep output. Only use selectors you see verbatim in the results.
For dynamic testids like data-testid={`${weekday}-switch`}, you can infer the concrete values (e.g., Sunday-switch, Monday-switch) — but note in the hint that it's a dynamic pattern.
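The grep step above can be run as a small shell helper. This is a sketch under stated assumptions: list_testids and testids_to_selectors are hypothetical names, and the pattern only matches static attributes written as data-testid="..." with double quotes, so dynamic template-literal testids still need to be read by hand.

```shell
# Print the unique static data-testid attributes found under a directory.
# Assumption: attributes use the literal form data-testid="value".
list_testids() {
  grep -rhoE 'data-testid="[^"]+"' "$1" | sort -u
}

# Wrap each attribute in brackets so it reads as a CSS attribute selector.
testids_to_selectors() {
  sed 's/.*/[&]/'
}
```

Running `list_testids path/to/components | testids_to_selectors` prints one verified selector per line, ready to label and paste into a Selector Hints section.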
Selector types to look for:
- data-testid: most stable, purpose-built for testing
- aria-label / role: accessible and Playwright-friendly
- name attributes: especially on form inputs
- id attributes: if stable and not auto-generated
- Class names: only semantic ones (e.g., .booking-calendar, not .css-1a2b3c)
These go inside the description string you send to POST /ci/tests. There is no separate API field; the description is free-form text. Append a Selector Hints section at the end of each test description with only verified selectors:
Selector Hints:
- Schedule name input: [data-testid="availablity-title"]
- Sunday toggle: [data-testid="Sunday-switch"]
- Save button: [form="availability-form"][type="submit"]
- Delete option: [data-testid="delete-schedule"]
Do NOT include file names or line numbers in the description — those are implementation details that add noise. The hints should only contain the element label and the selector.
Rules:
- For dynamic patterns like `${weekday}-switch`, infer the concrete values (e.g., Sunday-switch)
- It is fine to have a Selector Hints section with only 2-3 entries, or even to omit it entirely if no testids were found in the source code
Wait for the user to review and approve the test plan (including selector hints). They may want to:
Do NOT proceed to create tests until the user approves the plan.
If the application requires login or any credentials, store them as secrets BEFORE creating tests. Secrets are encrypted and securely available during test execution.
NEVER put actual credentials in the test description. Always reference secret keys instead.
Create an account-level secret (available to all tests — upserts if key already exists):
POST /ci/secrets
Content-Type: application/json
{
  "key": "LOGIN_EMAIL",
  "value": "user@example.com",
  "description": "Login email for Cal.com"
}
List existing secrets (values are masked in the response):
GET /ci/secrets
In test descriptions, reference secrets by their key name. The test runner automatically injects them. Example:
"Navigate to the login page.\nEnter the email from LOGIN_EMAIL secret and the password from LOGIN_PASSWORD secret.\nClick 'Sign In'.\nVerify the dashboard loads."
STOP — Before creating any test, verify you have completed Step 3 (Extract Selector Hints). If you haven't scanned the source code for selectors yet, do it now. Every test description string MUST end with a \n\nSelector Hints:\n- ... block. There is NO separate API field for hints — they go inside the description value.
Convert each approved test case into an API call. The description field should have each step on its own line (separated by \n), followed by \n\nSelector Hints:\n with the selectors you extracted in Step 3. This makes the spec readable for both the AI generator and humans reviewing the test.
POST /ci/tests
Content-Type: application/json
{
"name": "Cal.com > Book a Meeting with Valid Details",
"description": "Login with the given credentials (use LOGIN_EMAIL and LOGIN_PASSWORD secrets).\nNavigate to the public booking page from the main sidebar.\nClick on an available event type (e.g., 'Test Meeting').\nSelect an available date from the calendar.\nClick on an available time slot (e.g., '5:15pm').\nVerify that the booking form appears with pre-filled name and email fields.\nVerify that the selected date and time are displayed correctly.\nFill in any required fields if not pre-filled.\nClick the 'Confirm' button.\nVerify that a booking confirmation message appears.\nMake sure the booking is created successfully.\n\nSelector Hints:\n- Event type link: [data-testid=\"event-type-link\"]\n- Confirm booking: [data-testid=\"confirm-book-button\"]",
"spec_url": "https://app.cal.com"
}
Response includes the test id and processingStatus: "processing".
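Building the description by hand is error-prone because every step must be joined with a literal \n and every double quote escaped. The sketch below shows one way to assemble the request body from plain lines on stdin. json_escape_lines and build_test_body are hypothetical helper names, and the escaping covers only backslashes and double quotes (tabs and other control characters are assumed absent).

```shell
# Read lines from stdin, escape backslashes and double quotes,
# and join the lines with a literal \n so the result fits inside a JSON string.
json_escape_lines() {
  sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' |
    awk 'NR > 1 { printf "%s", "\\n" } { printf "%s", $0 }'
}

# Print a POST /ci/tests body. Steps and the Selector Hints block come from stdin.
build_test_body() {
  local name spec_url="$2" description
  description="$(json_escape_lines)"
  name="$(printf '%s\n' "$1" | json_escape_lines)"
  printf '{"name":"%s","description":"%s","spec_url":"%s"}\n' "$name" "$description" "$spec_url"
}
```

The output can be passed to curl with `-d @-`, which keeps the steps readable in a heredoc instead of one long escaped string.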
You can also pass secrets inline during test creation:
{
"name": "...",
"description": "...",
"spec_url": "...",
"secrets": [
{ "key": "API_KEY", "value": "sk-test-123", "description": "Stripe test key" }
]
}
Description formatting rules:
- Separate steps with \n; do NOT write a single long paragraph
- Always add a \n\nSelector Hints:\n section at the end with selectors extracted from source code (see Step 3). Do NOT create a test without selector hints; go back and scan the source code first if you haven't already
After creating a test, poll until processing is complete:
GET /ci/tests/{testId}
Check processingStatus in the response:
- "processing": still generating code, poll again in 5 seconds
- "processed": ready to run
- "error": generation failed, check processingError
Once processed, execute the test:
POST /ci/tests/{testId}/execute
Then poll the latest run:
GET /ci/runs?testId={testId}&limit=1
Check the run's status:
- "pending" or "running": still executing, poll again in 5 seconds
- "passed": test passed
- "failed": test failed, check errorMessage
Get full run details:
GET /ci/runs/{runId}
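Both polling loops (processing status and run status) follow the same shape, which can be sketched as a generic shell helper. This is an illustrative sketch: is_terminal_status and poll_until_terminal are hypothetical names, the 60-attempt cap is an assumption rather than a documented limit, and the fetch command is supplied by the caller so the loop stays independent of how the status is extracted.

```shell
# Succeed when the status is one of the documented terminal values.
is_terminal_status() {
  case "$1" in
    processed|error|passed|failed|cancelled) return 0 ;;
    *) return 1 ;;
  esac
}

# Call a status-fetching command until it reports a terminal status.
# Arguments: fetch command name, seconds between polls (default 5),
# maximum attempts (default 60, an assumed cap).
# Returns 0 and prints the status on success, 1 on timeout, 2 if the fetch fails.
poll_until_terminal() {
  local fetch="$1" interval="${2:-5}" max_attempts="${3:-60}"
  local attempt=0 state
  while [ "$attempt" -lt "$max_attempts" ]; do
    state="$("$fetch")" || return 2
    if is_terminal_status "$state"; then
      printf '%s\n' "$state"
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$interval"
  done
  return 1
}
```

The fetch command would typically be a small function that calls GET /ci/tests/{testId} or GET /ci/runs/{runId} and prints only the processingStatus or status field.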
If a test fails, get the run details to understand the error:
GET /ci/runs/{runId}
Then reprocess with fix instructions:
POST /ci/tests/{testId}/reprocess
Content-Type: application/json
{
"fixPrompt": "The login button text is 'Log in' not 'Sign In'. Also the calendar uses next-month arrow button to navigate."
}
Poll GET /ci/tests/{testId} until processingStatus is "processed" again, then re-run.
This is the typical flow when a user says "create tests for my app":
- Check auth: read ~/.diffie/credentials.json, prompt for token if missing
- Read .diffie-qa.md: check if tests already exist for this app
- Extract selector hints: scan the source for data-testid, aria-label, name, id attributes and append to test descriptions
- Store secrets: POST /ci/secrets → write .diffie-qa.md
- Create tests: POST /ci/tests (descriptions include selector hints) → write .diffie-qa.md (add each test as you create it)
- Poll: GET /ci/tests/{id} until processed → write .diffie-qa.md (update processing column)
- Create suite: POST /ci/suites → write .diffie-qa.md
- Run suite: POST /ci/suites/{id}/execute, poll results → write .diffie-qa.md (update Last Suite Run)
- Fix failures: POST /ci/tests/{id}/reprocess → write .diffie-qa.md, re-run
When the user comes back in a new session and says "run my tests" or "check test status":
- Read ~/.diffie/credentials.json
- Read .diffie-qa.md: get test IDs, suite IDs directly, no API calls needed
- Write .diffie-qa.md: update with fresh results
PR mode runs in CI (GitHub Actions) when someone comments /diffie test on a PR. It is fully autonomous: no user approval, no interactive prompts.
A GitHub Action workflow triggers on /diffie test comments. Claude Code runs with a prompt that includes:
- The comment text, optionally with a URL (e.g., /diffie test https://staging.myapp.com)
You are in PR mode if ALL of these are true:
- You are running in CI (CI=true env var)
When in PR mode, never ask for user input. Act autonomously.
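A PR mode check can be expressed as a small shell predicate. Only the CI=true condition is stated in the text above; requiring a non-empty PR_NUMBER is an added assumption for this sketch (the variable is used later in PR mode), and is_pr_mode is a hypothetical name.

```shell
# Succeed only when running in CI with a PR number available.
# CI=true is the documented signal; the PR_NUMBER check is an assumption.
is_pr_mode() {
  [ "${CI:-}" = "true" ] && [ -n "${PR_NUMBER:-}" ]
}
```

A caller could branch on it with `if is_pr_mode; then ...; fi` before deciding whether any interactive prompt is allowed.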
Before any gh command, authenticate the CLI as the Diffie QA GitHub App so PR comments post under the Diffie identity instead of the default claude[bot]. The workflow writes the installation token to ~/.diffie-gh-token. Log in with it and unset any inherited GITHUB_TOKEN/GH_TOKEN so they don't override the auth:
unset GITHUB_TOKEN GH_TOKEN
gh auth login --with-token < "$HOME/.diffie-gh-token"
gh auth status # confirm it shows the Diffie QA app
Run this once at the start of every PR mode invocation. All subsequent gh calls inherit this auth.
# Get the PR diff
git diff origin/main...HEAD
# Get changed files list
git diff origin/main...HEAD --name-only
# Get PR number from environment
echo $PR_NUMBER
Read .diffie-qa.md for existing app config, secrets, and known tests.
Read the diff and changed files to understand what was built or changed. Focus on:
Ignore changes that are not user-facing:
If the PR changes do NOT affect any user-facing flow, post a comment and exit:
gh pr comment $PR_NUMBER --body "$(cat <<'EOF'
## Diffie QA
No E2E test generated — this PR does not appear to affect user-facing flows.
Changes detected: refactor / docs / backend-only / config
> [Diffie](https://diffie.ai) — AI-powered E2E testing
EOF
)"
Stop here. Do not create a test.
Follow the same selector extraction process as interactive mode (Step 3 of "How to Think About Test Cases"), but scoped to the changed files only.
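# Only grep for selectors in files that changed.
# --diff-filter=d skips deleted files; xargs -r does nothing when the list is empty.
git diff origin/main...HEAD --name-only --diff-filter=d | xargs -r grep -lE 'data-testid|aria-label'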
Read those files and extract data-testid, aria-label, name, id attributes relevant to the feature.
Determine the preview URL from one of these sources (in priority order):
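1. URL passed in the /diffie test <url> comment → available as $PREVIEW_URL env var
2. Preview deployment URL resolved by the workflow → $PREVIEW_URL env var
3. App URL recorded in .diffie-qa.md
If no URL is available from any source, post a comment and exit: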
gh pr comment $PR_NUMBER --body "$(cat <<'EOF'
## Diffie QA
❌ No preview URL available. Either:
- Configure a preview deployment (Vercel, Netlify, etc.) in the workflow
- Pass a URL directly: `/diffie test https://your-staging-url.com`
> [Diffie](https://diffie.ai) — AI-powered E2E testing
EOF
)"
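The priority order for the preview URL can be sketched as a shell function. This is a sketch under stated assumptions: resolve_preview_url is a hypothetical name, and the fallback assumes the state file records the app URL as a `- **URL**: ...` line, as in the example .diffie-qa.md shown earlier.

```shell
# Print the URL to test against, or fail if none is available.
# $PREVIEW_URL wins (comment URL or workflow-resolved preview URL);
# otherwise fall back to the App URL line in the state file.
resolve_preview_url() {
  local state_file="${1:-.diffie-qa.md}" url
  if [ -n "${PREVIEW_URL:-}" ]; then
    printf '%s\n' "$PREVIEW_URL"
    return 0
  fi
  if [ -f "$state_file" ]; then
    url="$(sed -n 's/^- \*\*URL\*\*: *//p' "$state_file" | head -n 1)"
    if [ -n "$url" ]; then
      printf '%s\n' "$url"
      return 0
    fi
  fi
  return 1
}
```

When the function fails, the caller would post the "No preview URL available" comment and exit.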
Before creating the test, post an in-progress comment so the PR author knows Diffie has started. Check for an existing Diffie QA comment first, so reruns update it instead of spamming:
EXISTING_COMMENT=$(gh api repos/$REPO/issues/$PR_NUMBER/comments --jq '.[] | select(.body | startswith("## Diffie QA")) | .id' | head -1)
BODY="$(cat <<'EOF'
## Diffie QA
⏳ Diffie is testing this PR...
> [Diffie](https://diffie.ai) — AI-powered E2E testing
EOF
)"
if [ -n "$EXISTING_COMMENT" ]; then
gh api repos/$REPO/issues/comments/$EXISTING_COMMENT -X PATCH -f body="$BODY"
else
gh pr comment $PR_NUMBER --body "$BODY"
fi
Then create the test:
curl -s -X POST "$API_URL/ci/tests" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"name": "PR #'$PR_NUMBER' — <short description of the feature>",
"description": "<steps derived from diff analysis>\n\nSelector Hints:\n- <hints from step 4>",
"spec_url": "'$PREVIEW_URL'"
}'
Poll GET /ci/tests/{id} until processingStatus is processed or error.
If processed, execute the test:
curl -s -X POST "$API_URL/ci/tests/$TEST_ID/execute" \
-H "Authorization: Bearer $TOKEN"
Poll GET /ci/runs?testId=$TEST_ID&limit=1 until the run reaches a terminal status.
Get full run details including recording:
curl -s "$API_URL/ci/runs/$RUN_ID" \
-H "Authorization: Bearer $TOKEN"
Check if a previous Diffie QA comment exists on the PR (to update instead of creating a new one):
# Find existing Diffie QA comment
EXISTING_COMMENT=$(gh api repos/$REPO/issues/$PR_NUMBER/comments --jq '.[] | select(.body | startswith("## Diffie QA")) | .id' | head -1)
On success (passed):
BODY="$(cat <<'EOF'
## Diffie QA
**Test:** <test name>
**Status:** ✅ Passed (<duration>s)
**Recording:** [Watch test run](https://app.diffie.ai/public/runs/$RUN_ID)
> Tested against `<preview URL>` — [Diffie](https://diffie.ai)
EOF
)"
# The quoted 'EOF' keeps the body literal, so substitute the run ID explicitly.
BODY="${BODY//\$RUN_ID/$RUN_ID}"
if [ -n "$EXISTING_COMMENT" ]; then
gh api repos/$REPO/issues/comments/$EXISTING_COMMENT -X PATCH -f body="$BODY"
else
gh pr comment $PR_NUMBER --body "$BODY"
fi
On failure (failed):
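BODY="$(cat <<'EOF'
## Diffie QA
**Test:** <test name>
**Status:** ❌ Failed (<duration>s)
**Error:** <error message from run>
**Recording:** [Watch test run](https://app.diffie.ai/public/runs/$RUN_ID)
> Tested against `<preview URL>` — [Diffie](https://diffie.ai)
EOF
)"
# The quoted 'EOF' keeps the body literal, so substitute the run ID explicitly.
BODY="${BODY//\$RUN_ID/$RUN_ID}"
# Post $BODY with the same update-or-create block used for the passed case.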
On processing error:
BODY="$(cat <<'EOF'
## Diffie QA
**Status:** ⚠️ Could not generate test
**Error:** <processing error>
> [Diffie](https://diffie.ai) — AI-powered E2E testing
EOF
)"
Do not delete the PR test after posting — the recording link in the PR comment depends on the test/run still existing. Leave it in place.
Do NOT update .diffie-qa.md for PR tests — they are ephemeral.
When a user says "set up Diffie PR testing" or "set up PR testing for this repo", follow these steps:
- Check auth (verify ~/.diffie/credentials.json exists with a valid token)
- Read .diffie-qa.md to check if the app URL and secrets are already configured
Ask the user: "How does your CI get the preview deployment URL?"
They should provide the GitHub Actions expression that resolves to their preview URL. Examples:
- ${{ steps.vercel.outputs.url }} (from patrickedqvist/wait-for-vercel-preview)
- ${{ steps.netlify.outputs.url }} (from jakepartusch/wait-for-netlify-action)
- ${{ steps.railway.outputs.url }}
- Any other steps.<id>.outputs.<field> expression from their existing workflow
If they don't have preview deployments, they can leave it blank. Users will then pass the URL directly in the /diffie test <url> comment.
Create .github/workflows/diffie-qa.yml using the template at ${CLAUDE_SKILL_DIR}/templates/diffie-qa.yml.
Replace __PREVIEW_URL_VARIABLE__ in the template with the user's provided expression, or remove the preview URL step entirely if they don't have one.
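The placeholder substitution can be done with sed. The sketch below assumes nothing about the template beyond the __PREVIEW_URL_VARIABLE__ marker named above; render_workflow is a hypothetical helper, and it does not cover the "remove the preview URL step entirely" case, which still needs a manual edit.

```shell
# Copy a workflow template to its destination, replacing the placeholder
# with the user's GitHub Actions expression.
render_workflow() {
  local template="$1" expr="$2" out="$3" esc
  # Escape characters that are special in a sed replacement (&, |, backslash).
  esc="$(printf '%s' "$expr" | sed 's/[&|\\]/\\&/g')"
  mkdir -p "$(dirname "$out")"
  sed "s|__PREVIEW_URL_VARIABLE__|${esc}|g" "$template" > "$out"
}
```

A typical call would be `render_workflow "${CLAUDE_SKILL_DIR}/templates/diffie-qa.yml" '${{ steps.vercel.outputs.url }}' .github/workflows/diffie-qa.yml`, with the expression single-quoted so the shell leaves it untouched.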
Tell the user to add these secrets to their GitHub repo (Settings → Secrets and variables → Actions):
| Secret | Value | Where to get it |
|---|---|---|
| ANTHROPIC_API_KEY (or CLAUDE_CODE_OAUTH_TOKEN) | Anthropic API key, or a Claude Max/Pro OAuth token (set exactly one) | API key: https://console.anthropic.com. OAuth token: run claude setup-token locally (requires Claude Max/Pro). |
| DIFFIE_API_TOKEN | Diffie API token | cat ~/.diffie/credentials.json or https://app.diffie.ai/settings/api-tokens |
Ask the user which Claude plan they're on so you recommend the right secret: API key (Anthropic Console billing) or CLAUDE_CODE_OAUTH_TOKEN (Claude Max/Pro subscription).
Also instruct the user to install the Diffie QA Bot GitHub App on their repo (https://github.com/apps/diffie-qa-bot) so PR comments post under the Diffie identity. The workflow exchanges a GitHub Actions OIDC token at https://api.diffie.ai/gh/token for a short-lived installation token — no app credentials live in the user's repo.
After generating the workflow file, tell the user:
PR testing is set up. Commit the workflow file and push. From now on, comment /diffie test on any PR to trigger an E2E test with a recording.
If you want to pass a specific URL:
/diffie test https://your-url.com
All routes are under /ci/ and use API token auth (Authorization: Bearer dif_...).
Tests:
| Method | Path | Description |
|---|---|---|
| GET | /ci/tests?limit=50&processing_status={status}&name={search} | List tests |
| POST | /ci/tests | Create test. Body: { name, description, spec_url, secrets? } |
| GET | /ci/tests/{id} | Get test details (includes processingStatus, generatedCode, recentRuns) |
| POST | /ci/tests/{id}/execute | Run test |
| POST | /ci/tests/execute-bulk | Run multiple. Body: { testIds: [...] } |
| POST | /ci/tests/{id}/reprocess | Reprocess. Body: { fixPrompt? } |
| DELETE | /ci/tests/{id} | Delete test |
Runs:
| Method | Path | Description |
|---|---|---|
| GET | /ci/runs?testId={id}&limit=20 | List runs, optionally filtered by test |
| GET | /ci/runs/{id} | Get run details (status, errorMessage, duration, executionLogs) |
Run statuses: pending, running, passed, failed, cancelled
Secrets:
| Method | Path | Description |
|---|---|---|
| GET | /ci/secrets | List account secrets (values masked) |
| POST | /ci/secrets | Upsert account secret. Body: { key, value, description? } |
| DELETE | /ci/secrets/{key} | Delete secret by key name |
Secret key format: must match ^[a-zA-Z_][a-zA-Z0-9_]*$ (e.g., LOGIN_EMAIL, API_KEY)
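The key format can be checked locally before calling POST /ci/secrets. This sketch mirrors the stated pattern ^[a-zA-Z_][a-zA-Z0-9_]*$ with a shell case statement; is_valid_secret_key is a hypothetical name, and the server remains the authority on what it accepts.

```shell
# Succeed when the key starts with a letter or underscore and contains
# only letters, digits, and underscores.
is_valid_secret_key() {
  case "$1" in
    '' | [!a-zA-Z_]* | *[!a-zA-Z0-9_]*) return 1 ;;
    *) return 0 ;;
  esac
}
```

For instance, LOGIN_EMAIL and API_KEY pass, while a key such as MY-KEY or one starting with a digit is rejected before any request is made.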
Suites:
| Method | Path | Description |
|---|---|---|
| GET | /ci/suites | List suites |
| POST | /ci/suites | Create suite. Body: { name, description?, testIds? } |
| GET | /ci/suites/{id} | Get suite with tests and run history |
| POST | /ci/suites/{id}/tests | Add tests. Body: { testIds: [...] } |
| DELETE | /ci/suites/{id}/tests | Remove tests. Body: { testIds: [...] } |
| POST | /ci/suites/{id}/execute | Run suite. Body: { baseUrl } (required). Returns { suiteRunId, url } |
| GET | /ci/suite-runs/{suiteRunId} | Get suite run status with per-test results |
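Once the per-test results of a suite run are known, the Results line of the Last Suite Run section (for example "2/3 passed, 1 failed") can be derived mechanically. The sketch below takes the per-test statuses as plain arguments, because the exact response shape of the suite-run endpoint is not shown here; summarize_results is a hypothetical helper, and it counts every non-passed status (including cancelled) as failed, which is a simplifying assumption.

```shell
# Print a "P/T passed, F failed" summary from a list of per-test statuses.
summarize_results() {
  local total=0 passed=0 state
  for state in "$@"; do
    total=$((total + 1))
    if [ "$state" = "passed" ]; then
      passed=$((passed + 1))
    fi
  done
  printf '%s/%s passed, %s failed\n' "$passed" "$total" "$((total - passed))"
}
```

Calling `summarize_results passed failed passed` yields the same wording as the Results line in the example state file.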
Polling and troubleshooting notes:
- To track test processing, poll GET /ci/tests/{id} and check processingStatus.
- To track a test run, poll GET /ci/runs/{id} and check status.
- To track a suite run, poll GET /ci/suite-runs/{id}.
- Terminal statuses are processed/error (for processing) and passed/failed/cancelled (for runs).
- API tokens are managed at https://app.diffie.ai/settings/api-tokens.
- If processing failed: GET /ci/tests/{id}, and if processingStatus is error, use POST /ci/tests/{id}/reprocess with a fixPrompt.
- If a run failed: GET /ci/runs/{runId}, check errorMessage, then POST /ci/tests/{testId}/reprocess if the generated code needs fixing.
npx claudepluginhub codebrahma/diffie-qa-skill --plugin diffie