From octoperf
Use when an OctoPerf Virtual User validation run has produced many failing actions and the user needs to diagnose them efficiently without reading every single failure serially. Triggers on "the validation is red", "lots of errors after import", "VU validation failed, what's wrong", "triage these failures", "why is my virtual user failing". Groups failures by category, drills into one representative per group, and proposes the matching MCP-tool fix. Requires the OctoPerf MCP server.
Slash command
/octoperf:octoperf-validation-triage
When a Virtual User validation finishes with many failing actions, reading each failure detail one by one wastes context window and time. This skill groups failures by root cause, drills into one representative per group, and maps each group to the correct fix.
A validate_virtual_user run finished with finished=true and at least one failing action.
mcp__octoperf__get_virtual_user_validation_index(virtualUserId)
The index returns one entry per failing action: action id, action name, HTTP method, URL, response status code, brief failure summary. Do not fetch failure details yet — the index alone is usually enough to classify.
OctoPerf's "KO" rule isn't just "non-2XX". A sample is KO when:
| Live response code | Recorded response code | Result |
|---|---|---|
| 2XX | none | ✅ OK |
| 2XX | 2XX | ✅ OK |
| 2XX | 3XX | ❌ KO (recording expected a redirect, got a body — usually wrong) |
| 3XX | 3XX | ✅ OK |
| 3XX | 2XX | ❌ KO (recording expected a body, got a redirect) |
| Any 4XX / 5XX | anything | ❌ KO |
| Unknown code | anything | ❌ KO |
| Any code | 4XX / 5XX / unknown | ❌ KO (recording was already broken; re-record) |
This matters when classifying: a "successful" 200 against a recorded 302 is a real bug, not a false positive — the VU is hitting a different code path than when it was recorded.
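The matrix above can be sketched as a small Python function. This is an illustration only, not OctoPerf's implementation: the function names are hypothetical, any code outside 200-599 is assumed to count as "unknown", and a 3XX with no recorded response is reported as UNSPECIFIED because the matrix has no row for it.

```python
def status_class(code):
    """Map an HTTP status code to '2XX'..'5XX', or 'unknown' (assumption: anything outside 200-599)."""
    if isinstance(code, int) and 200 <= code <= 599:
        return f"{code // 100}XX"
    return "unknown"


def classify_sample(live_code, recorded_code=None):
    """Return 'OK', 'KO', or 'UNSPECIFIED' following the live/recorded matrix."""
    live = status_class(live_code)
    if live not in ("2XX", "3XX"):
        # Any 4XX / 5XX / unknown live code is KO regardless of the recording.
        return "KO"
    if recorded_code is None:
        # The matrix only covers 2XX with no recorded response.
        return "OK" if live == "2XX" else "UNSPECIFIED"
    recorded = status_class(recorded_code)
    if recorded not in ("2XX", "3XX"):
        # The recording itself was already broken.
        return "KO"
    # 2XX vs 2XX and 3XX vs 3XX are OK; a 2XX/3XX mismatch is KO.
    return "OK" if live == recorded else "KO"
```

For example, `classify_sample(200, 302)` is KO even though the live response succeeded, which is the case the paragraph above warns about.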
Caveat — ResponseAssertion overrides the matrix. A ResponseAssertion
attached to the action can mark a sample KO even when the matrix above
says OK (e.g. status 200 against recorded 200, but the body contains
the assertion's pattern — typical for forms that re-render with an
error message on a 200). Check for assertions on the failing action
before assuming an HTTP-level mismatch. The assertion's type field
also controls scope:
- REQUEST_ONLY: only the parent sample is checked.
- REQUEST_AND_SUBREQUESTS: parent and embedded resources. This matters when downloadResources=true, because a 404 on an embedded CSS can fail the parent.
- SUBREQUESTS_ONLY: only embedded resources.

When the index entry shows success < total (e.g. {"success": 2, "total": 3, "failedTimestamps": [...]}), the action failed on some iterations but passed on others. Common patterns:
- … CSVVariable.
- … LoopContainerAction or shared variables.
- … success < total reflects it.

failedTimestamps lists every failed iteration's epoch-ms. Pass the exact timestamp to get_validation_failure_detail to pin the failing iteration instead of getting a random one. This is critical when debugging CSV-driven flakiness, so you see the bad row, not a good one.
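A minimal sketch of picking the iteration to inspect, assuming the index has already been parsed into a list of dicts. The `success`, `total` and `failedTimestamps` keys come from the example above; the `actionId` key and the function name are hypothetical.

```python
def flaky_actions(index_entries):
    """Return (action_id, earliest_failed_timestamp) for actions that failed only on some iterations."""
    result = []
    for entry in index_entries:
        timestamps = entry.get("failedTimestamps", [])
        # Partial failure: at least one success and at least one failure.
        if 0 < entry["success"] < entry["total"] and timestamps:
            result.append((entry["actionId"], min(timestamps)))
    return result
```

Each returned timestamp is a candidate to pass to get_validation_failure_detail so the detail corresponds to a failing iteration.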
Bucket the failing actions into a small number of categories. The common ones:
| Category | Index signal | Likely fix |
|---|---|---|
| Auth / state | 401, 403; "invalid token", "expired", "CSRF" | Auto-correlation (separate skill) |
| Variable / data | 400 with body validation errors; "field required", "invalid format" | Edit / create variables; check CSV upload |
| HTTP server config | Connection timeout, DNS failure, SSL handshake error, wrong port | update_http_server (baseUrl, timeouts, IP spoofing) |
| Server-side 5xx | 500, 502, 503, 504 | Not the VU's fault — surface to user; check target env |
| Body mismatch | 422; "schema mismatch", "unexpected field" | Re-import (recording out of date) or edit action body |
| Assertion failure | Status 2XX/3XX matching recorded, but failure message quotes the assertion's pattern (often on a ResponseAssertion node attached to the action) | Read validationResponse.body to see the matched substring/regex. Either the response is genuinely wrong (fix the upstream cause) or the assertion's pattern is too strict (loosen it) |
| Missing dependency | 404 on resources, signed URLs returning errors | Run a correlation rule, or the resource genuinely doesn't exist |
A useful heuristic: if 80% of failures fall in one category, fix that first and re-validate before investigating the rest. Most of the long tail clears once the dominant root cause is resolved.
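The bucketing and the 80% heuristic could look roughly like the sketch below. It is a simplification under stated assumptions: the `status` and `summary` keys are hypothetical names for the index fields, the keyword lists are taken from the table and are not exhaustive, and assertion failures are not detected because they need the failure detail.

```python
from collections import Counter


def categorize(entry):
    """Assign one failing index entry to a coarse category (hypothetical 'status' / 'summary' keys)."""
    status = entry.get("status")
    summary = (entry.get("summary") or "").lower()
    if status in (401, 403) or any(k in summary for k in ("invalid token", "expired", "csrf")):
        return "auth/state"
    if status in (500, 502, 503, 504):
        return "server-side 5xx"
    if status == 422 or any(k in summary for k in ("schema mismatch", "unexpected field")):
        return "body mismatch"
    if status == 400:
        return "variable/data"
    if status == 404:
        return "missing dependency"
    if any(k in summary for k in ("timeout", "dns", "ssl")):
        return "http server config"
    return "unclassified"


def dominant_category(entries, threshold=0.8):
    """Return the category holding at least `threshold` of the failures, else None."""
    counts = Counter(categorize(e) for e in entries)
    if not counts:
        return None
    category, count = counts.most_common(1)[0]
    return category if count / len(entries) >= threshold else None
```

When `dominant_category` returns a value, that category is the one to fix and re-validate first; when it returns None, no single root cause dominates.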
For each group, pick the first failing action and fetch its detail:
mcp__octoperf__get_validation_failure_detail(virtualUserId, actionId)
The detail returns the four HTTP entities: the recorded request and response, and the validation (replay) request and response. For very large bodies, you can pull a single one with mcp__octoperf__fetch_validation_http_body(...). Its kind parameter (RECORDED_REQUEST / RECORDED_RESPONSE / VALIDATION_REQUEST / VALIDATION_RESPONSE) lets you compare the recorded and replayed sides one at a time, which is useful when the request/response is several MB and a single side fits in your context but not both.
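If both bodies have been saved locally as text, a compact diff avoids loading either one in full. The sketch below uses Python's standard difflib; the function name is hypothetical, and the two labels simply reuse the kind values mentioned above.

```python
import difflib


def body_diff(recorded_body, validation_body, context=1):
    """Return unified-diff lines between the recorded and replayed response bodies."""
    diff = difflib.unified_diff(
        recorded_body.splitlines(),
        validation_body.splitlines(),
        fromfile="RECORDED_RESPONSE",
        tofile="VALIDATION_RESPONSE",
        lineterm="",
        n=context,  # lines of unchanged context kept around each change
    )
    return list(diff)
```

An empty list means the two bodies are identical line by line, which points the investigation at the request side instead.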
Read each detail with a specific question in mind based on the category:
- … ${variable} reference? → list_variables → create_*_variable → re-import or edit.
- … list_http_servers_by_project → update_http_server.
- … "Signon failed" substring) a symptom of an upstream problem (auth failed, validation error re-rendered in the page) or an overly strict assertion? Fix the upstream cause first; loosen the assertion's pattern only if the response is genuinely correct.
- … /api/orders/12345) one that the recording created earlier, but whose id is now stale? → correlation, or use a CSV variable.

Resist the urge to fix three things at once. Apply the fix that should clear the largest category, then:
mcp__octoperf__validate_virtual_user(projectId, virtualUserId, providerId, location, iterations=1)
mcp__octoperf__get_virtual_user_validation(projectId, virtualUserId)
Re-fetch the failure index. Verify:
- … update_* or delete_*) before trying another.

If get_virtual_user_validation_index comes back empty but the run is still marked failed/aborted, the validation engine itself crashed (JMeter OOM, missing Playwright dependency, bad locale, …), so there are no HTTP samples to read. The validation run produces a benchResultId (returned by validate_virtual_user) which backs the same log storage as a real bench run, so:
mcp__octoperf__list_bench_result_files(benchResultId)
mcp__octoperf__read_bench_result_file_lines(benchResultId, "jmeter.log")
surfaces the engine logs (and any Playwright trace / screenshot / HAR
the engine left behind). Read the tail of jmeter.log first — startup
errors are usually within the first 50 lines and fatal errors within
the last 50.
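A small helper for the "first 50 / last 50" reading pattern, assuming the log content is already available as a list of strings. The return shape of read_bench_result_file_lines is not specified here, so treat that as an assumption; the function name is hypothetical.

```python
def head_and_tail(lines, window=50):
    """Split log lines into (head, tail) windows; short logs come back whole in head."""
    if len(lines) <= 2 * window:
        # Short log: the windows would overlap, so return everything once.
        return list(lines), []
    return list(lines[:window]), list(lines[-window:])
```

Startup errors would then be searched for in the head and fatal errors in the tail, without pulling the middle of a long log into context.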
Log retention. Validation log files are erased 7 days after the run, or as soon as the user leaves the design screen. Old validation runs may no longer have logs — call out the freshness window before promising a re-read.
For binary artefacts (Playwright trace.zip, screenshots .png,
HAR archives) the line-based reader returns garbage. Use the
binary-aware tool instead:
mcp__octoperf__fetch_bench_result_file(benchResultId, filename)
# returns { filename, mimeType, sizeBytes, contentBase64 } — capped at 5 MB
Decode contentBase64 locally and inspect (e.g. base64 -d > trace.zip && unzip -p trace.zip trace.trace). This is especially valuable for
Playwright VU failures — the JMeter wrapper log only sees the spawn,
the actual selector miss / timeout / navigation abort lives in the
trace.
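The same decode-and-inspect step can be done in Python without touching the disk. This sketch assumes the artefact is a zip archive (such as a Playwright trace.zip) and that `contentBase64` has been copied out of the tool response; the function name is hypothetical, and the member name follows the shell example above.

```python
import base64
import io
import zipfile


def read_zip_member(content_base64, member="trace.trace"):
    """Decode a base64-encoded zip archive and return the bytes of one member."""
    raw = base64.b64decode(content_base64)
    with zipfile.ZipFile(io.BytesIO(raw)) as archive:
        # Raises KeyError if the member is absent; archive.namelist() shows what is there.
        return archive.read(member)
```

Because the tool caps responses at 5 MB, holding the decoded archive in memory is a reasonable trade-off here.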
Stop and surface to the user when:
- … run_scenario.

sanity_check_virtual_user runs before the first validation. Its output is a flat list of (level, message). ERROR entries block validation; WARNING / INFO entries don't. Mapping the canonical messages to fixes:
| Level | Message | What it means | Fix |
|---|---|---|---|
| ERROR | A file is missing for CSV variable | A CSVVariable points at a file not uploaded | upload_project_file, or patch_virtual_user to repoint the variable |
| ERROR | CSVVariable has conflicting column names | Two CSVVariables share a column name | Prefix one variable's columns; patch_virtual_user |
| ERROR | JSR223Action is empty | Empty script generates only noise logs | Delete the action or fill the script (patch_virtual_user) |
| ERROR | No Server Found | A request points at a deleted HTTP server | list_http_servers_by_project → recreate or repoint via update_http_server |
| ERROR | Cyclic Dependency Detected! | A fragment references itself (directly or indirectly) | Break the cycle with patch_virtual_user |
| WARNING | Clear Cookies before recording … | Recorded cookies may leak invalid session ids | Remove Cookie headers in the relevant requests |
| WARNING | Empty file for CSVVariable | The uploaded CSV parsed to zero rows | Re-upload a properly-encoded UTF-8 file |
| WARNING | End Of Value Policy is 'Stop VU' | Test will end abruptly when the CSV is exhausted | Confirm with user; otherwise switch policy to Recycle / Continue |
| WARNING | file is missing for POST request | A multipart POST references a file not in /resources | upload_project_file (no path prefix; OctoPerf adds /resources/ itself) |
| INFO | Host header and server host are differing | Some servers reject mismatched Host headers | Search-and-replace the Host header if the target rejects |
| INFO | Using a JMeter generic action | Imported a raw JMeter element; double-check its config | None unless behavior is unexpected — JAR plugins go under /lib/ext |
| INFO | XXX sec thinktime is high | A recorded pause is unusually long | Trim the thinktime if the duration would harm the test |
| INFO | xxxxx should have a name | Unnamed element renders as "Unnamed" in reports | Rename via patch_virtual_user |
| INFO | xxxxx is empty | Empty controller / logic action that won't execute | Delete it |
| INFO | HTTP Action has empty query parameter | Imported a stray query param with no name / value | Remove the parameter via patch_virtual_user |
Apply the ERROR fixes first — validation is blocked until those clear.
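Ordering the sanity-check output so blocking entries come first can be sketched as follows, assuming the output has been parsed into (level, message) tuples as described above. The function and constant names are hypothetical.

```python
LEVEL_ORDER = {"ERROR": 0, "WARNING": 1, "INFO": 2}


def triage_sanity_check(entries):
    """Return (blocked, ordered_entries) with ERROR entries first, then WARNING, then INFO."""
    # sorted() is stable, so entries keep their original order within each level.
    ordered = sorted(entries, key=lambda e: LEVEL_ORDER.get(e[0], len(LEVEL_ORDER)))
    blocked = any(level == "ERROR" for level, _ in entries)
    return blocked, ordered
```

When `blocked` is true, the ERROR entries at the head of the list are the ones to fix before starting a validation run.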
- Don't use run_scenario to debug. Validation is the right tool: it's cheap and captures full HTTP. A load test gives you metrics, not bodies.
- sanity_check_virtual_user is a static check; run it before the first validation run, not after. If it would have caught the issue, you wasted a validation cycle.
- … delete_* or destructive change.

- octoperf-auto-correlation: for the "auth / state" category.
- octoperf-scenario-diagnosis: for diagnosing problems that appear under load but not in validation.
npx claudepluginhub octoperf/octoperf-claude-plugins --plugin octoperf