Skill

octoperf-validation-triage

Use when an OctoPerf Virtual User validation run has produced many failing actions and the user needs to diagnose them efficiently without reading every single failure serially. Triggers on "the validation is red", "lots of errors after import", "VU validation failed, what's wrong", "triage these failures", "why is my virtual user failing". Groups failures by category, drills into one representative per group, and proposes the matching MCP-tool fix. Requires the OctoPerf MCP server.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/octoperf:octoperf-validation-triage

User invocable

Model invocable

Inline context

Default effort

SKILL.md

226 lines · ~3.9k tokens

Stats

Parent stars: 0

Maintenance: Good

Last commit: May 18, 2026

OctoPerf — Validation-failure triage

When a Virtual User validation finishes with many failing actions, reading each failure detail one by one wastes context window and time. This skill groups failures by root cause, drills into one representative per group, and maps each group to the correct fix.

When this applies

A validate_virtual_user run finished with finished=true and at least one failing action.
The user wants to know what's wrong, not just retry blindly.

Steps

1. Get the failure index — no bodies yet

mcp__octoperf__get_virtual_user_validation_index(virtualUserId)

The index returns one entry per failing action: action id, action name, HTTP method, URL, response status code, brief failure summary. Do not fetch failure details yet — the index alone is usually enough to classify.

OctoPerf's "KO" rule isn't just "non-2XX". A sample is classified by comparing the live response code with the recorded one:

| Live response code | Recorded response code | Result |
|---|---|---|
| 2XX | none | ✅ OK |
| 2XX | 2XX | ✅ OK |
| 2XX | 3XX | ❌ KO (recording expected a redirect, got a body; usually wrong) |
| 3XX | 3XX | ✅ OK |
| 3XX | 2XX | ❌ KO (recording expected a body, got a redirect) |
| Any 4XX / 5XX | anything | ❌ KO |
| Unknown code | anything | ❌ KO |
| Any code | 4XX / 5XX / unknown | ❌ KO (recording was already broken; re-record) |

This matters when classifying: a "successful" 200 against a recorded 302 is a real bug, not a false positive — the VU is hitting a different code path than when it was recorded.
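The matrix above can be sketched as a small lookup function. This is an illustrative reading of the table, not OctoPerf's implementation: the names `status_class` and `classify_sample` are hypothetical, a missing live code is assumed to count as "unknown", and any combination the table does not list (for example a live 3XX with no recorded code) is reported as `UNLISTED` rather than guessed.

```python
from typing import Optional


def status_class(code: Optional[int]) -> str:
    """Map a status code to '2XX', '3XX', 'ERROR' (4XX/5XX), 'UNKNOWN' or 'NONE'."""
    if code is None:
        return "NONE"
    if 200 <= code <= 299:
        return "2XX"
    if 300 <= code <= 399:
        return "3XX"
    if 400 <= code <= 599:
        return "ERROR"
    return "UNKNOWN"


def classify_sample(live: Optional[int], recorded: Optional[int]) -> str:
    """Return 'OK', 'KO' or 'UNLISTED' following the matrix rows."""
    live_cls = status_class(live)
    rec_cls = status_class(recorded)
    # Rows "Any 4XX / 5XX" and "Unknown code": the live side alone decides.
    # Assumption: a missing live code is treated like an unknown code.
    if live_cls in ("ERROR", "UNKNOWN", "NONE"):
        return "KO"
    # Row "Any code vs 4XX / 5XX / unknown": the recording was already broken.
    if rec_cls in ("ERROR", "UNKNOWN"):
        return "KO"
    # Row "2XX vs none" is the only listed case without a recorded code.
    if rec_cls == "NONE":
        return "OK" if live_cls == "2XX" else "UNLISTED"
    # Remaining rows: 2XX/3XX on both sides must match.
    return "OK" if live_cls == rec_cls else "KO"
```

For instance, `classify_sample(200, 302)` returns `"KO"`, which is the "successful 200 against a recorded 302" case described above.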

Caveat — ResponseAssertion overrides the matrix. A ResponseAssertion attached to the action can mark a sample KO even when the matrix above says OK (e.g. status 200 against recorded 200, but the body contains the assertion's pattern — typical for forms that re-render with an error message on a 200). Check for assertions on the failing action before assuming an HTTP-level mismatch. The assertion's type field also controls scope:

REQUEST_ONLY — only the parent sample is checked.
REQUEST_AND_SUBREQUESTS — parent and embedded resources. Matters when downloadResources=true: a 404 on an embedded CSS can fail the parent.
SUBREQUESTS_ONLY — only embedded resources.
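To make the three scopes concrete, here is a minimal sketch of which samples an assertion would look at under each one. The function name `samples_in_scope` and the plain-string sample labels are hypothetical; only the three scope names come from the text above.

```python
from typing import List, Sequence


def samples_in_scope(scope: str, parent: str, subrequests: Sequence[str]) -> List[str]:
    """Return the samples an assertion with the given scope would check."""
    if scope == "REQUEST_ONLY":
        return [parent]
    if scope == "REQUEST_AND_SUBREQUESTS":
        return [parent, *subrequests]
    if scope == "SUBREQUESTS_ONLY":
        return list(subrequests)
    raise ValueError(f"unknown assertion scope: {scope}")
```

With `REQUEST_AND_SUBREQUESTS`, an embedded `style.css` ends up in the checked list next to the parent page, which is how a broken embedded resource can fail the parent action.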

1b. Partial failures (`success < total`)

When the index entry shows success < total (e.g. {"success": 2, "total": 3, "failedTimestamps": [...]}), the action failed on some iterations but passed on others. Common patterns:

Bad row in a CSVVariable — one iteration picks a row that doesn't pass server-side validation (e.g. a wrong-password row, a malformed account id, a stale token). Cheapest possible fix: edit the CSV. Investigate this before correlation when the VU uses a CSVVariable.
Race condition — uncommon in 1-user validation but possible if the VU has internal LoopContainerAction or shared variables.
CSV exhausted with EOF = StopVU — iterations past the CSV length never run; not strictly a failure but success < total reflects it.

failedTimestamps lists every failed iteration's epoch-ms. Pass the exact timestamp to get_validation_failure_detail to pin the failing iteration instead of getting a random one — critical when debugging CSV-driven flakiness so you see the bad row, not a good one.
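A minimal sketch of spotting partial failures in the index and choosing a timestamp to pin. The `success`, `total` and `failedTimestamps` fields follow the example above; the `actionId` key and the choice of the earliest failed timestamp are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple


def partial_failures(index_entries: List[Dict]) -> List[Tuple[str, Optional[int]]]:
    """Return (action id, earliest failed timestamp) for partially failing actions."""
    picks = []
    for entry in index_entries:
        # Partial means some iterations passed and some failed.
        if 0 < entry["success"] < entry["total"]:
            stamps = entry.get("failedTimestamps") or []
            # "actionId" is a hypothetical key name for the action id.
            picks.append((entry["actionId"], min(stamps) if stamps else None))
    return picks
```

The returned timestamp is the value you would pass to get_validation_failure_detail so that the detail shows a failing iteration rather than a passing one.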

2. Group by category

Bucket the failing actions into a small number of categories. The common ones:

| Category | Index signal | Likely fix |
|---|---|---|
| Auth / state | 401, 403; "invalid token", "expired", "CSRF" | Auto-correlation (separate skill) |
| Variable / data | 400 with body validation errors; "field required", "invalid format" | Edit / create variables; check CSV upload |
| HTTP server config | Connection timeout, DNS failure, SSL handshake error, wrong port | `update_http_server` (baseUrl, timeouts, IP spoofing) |
| Server-side 5xx | 500, 502, 503, 504 | Not the VU's fault; surface to user and check target env |
| Body mismatch | 422; "schema mismatch", "unexpected field" | Re-import (recording out of date) or edit action body |
| Assertion failure | Status 2XX/3XX matching recorded, but failure message quotes the assertion's pattern (often on a `ResponseAssertion` node attached to the action) | Read `validationResponse.body` to see the matched substring/regex. Either the response is genuinely wrong (fix the upstream cause) or the assertion's pattern is too strict (loosen it) |
| Missing dependency | 404 on resources, signed URLs returning errors | Run a correlation rule, or the resource genuinely doesn't exist |

A useful heuristic: if 80% of failures fall in one category, fix that first and re-validate before investigating the rest. Most of the long tail clears once the dominant root cause is resolved.
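The bucketing and the 80% heuristic could be sketched as follows. This is a rough illustration only: the `status` and `summary` keys are assumed names for index fields, the keyword hints are taken from the table, and anything the simple rules do not cover (including assertion failures on 2XX/3XX responses) falls into `Unclassified` for a manual look.

```python
from collections import Counter
from typing import Dict, List, Optional

# Keyword fallbacks, checked only when the status code alone is not decisive.
KEYWORD_HINTS = [
    ("Auth / state", ("invalid token", "expired", "csrf")),
    ("Variable / data", ("field required", "invalid format")),
    ("Body mismatch", ("schema mismatch", "unexpected field")),
]


def categorize(entry: Dict) -> str:
    """Assign one triage category to an index entry."""
    status = entry.get("status")
    summary = (entry.get("summary") or "").lower()
    if status is None:
        # No response at all: timeout, DNS or SSL trouble.
        return "HTTP server config"
    if status in (401, 403):
        return "Auth / state"
    if status in (500, 502, 503, 504):
        return "Server-side 5xx"
    if status == 422:
        return "Body mismatch"
    if status == 400:
        return "Variable / data"
    if status == 404:
        return "Missing dependency"
    for category, hints in KEYWORD_HINTS:
        if any(hint in summary for hint in hints):
            return category
    return "Unclassified"


def dominant_category(entries: List[Dict], threshold: float = 0.8) -> Optional[str]:
    """Return the category holding at least `threshold` of the failures, if any."""
    counts = Counter(categorize(e) for e in entries)
    if not counts:
        return None
    category, n = counts.most_common(1)[0]
    return category if n / len(entries) >= threshold else None
```

When `dominant_category` returns a name, that category is the one to fix first; when it returns `None`, no single root cause dominates and the groups need to be handled in turn.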

3. Confirm each category against one representative

For each group, pick the first failing action and fetch its detail:

mcp__octoperf__get_validation_failure_detail(virtualUserId, actionId)

The detail returns the four HTTP entities: the recorded request and response, and the validation (replay) request and response. For very large bodies, you can pull a single one with mcp__octoperf__fetch_validation_http_body(...). Its kind parameter (RECORDED_REQUEST / RECORDED_RESPONSE / VALIDATION_REQUEST / VALIDATION_RESPONSE) selects which of the four to fetch, so you can compare the recorded and replayed sides one at a time. This is useful when the request or response is several MB and a single side fits in your context but not both.
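Picking one representative per group amounts to keeping the first action seen for each category. A minimal sketch, assuming the failures have already been paired with a category in index order (the `(category, action_id)` tuple shape is hypothetical):

```python
from typing import Dict, Iterable, Tuple


def representatives(categorized: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map each category to the first failing action id seen for it."""
    reps: Dict[str, str] = {}
    for category, action_id in categorized:
        # setdefault keeps the first action and ignores later ones.
        reps.setdefault(category, action_id)
    return reps
```

Only the action ids in the returned mapping need a detail fetch, which keeps the number of large HTTP bodies read to one per category.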

Read each detail with a specific question in mind based on the category:

Auth/state: is there a token in the previous response that should have been re-sent here? → Auto-correlation skill.
Variable/data: does the request body contain a literal value that should be a ${variable} reference? → list_variables → create_*_variable → re-import or edit.
HTTP server config: is the request reaching the server at all? Timing out? Hitting the wrong host? → list_http_servers_by_project → update_http_server.
Body mismatch: does the recorded request body match what the API actually expects today? If not, the recording is stale.
Assertion failure: is the matched pattern (e.g. the "Signon failed" substring) a symptom of an upstream problem (auth failed, validation error re-rendered in the page) or of an overly strict assertion? Fix the upstream cause first; loosen the assertion's pattern only if the response is genuinely correct.
Missing dependency: is the requested resource (/api/orders/12345) one that the recording created earlier, but whose id is now stale? → correlation, or use a CSV variable.

4. Apply ONE fix, then re-validate

Resist the urge to fix three things at once. Apply the fix that should clear the largest category, then:

mcp__octoperf__validate_virtual_user(projectId, virtualUserId, providerId, location, iterations=1)
mcp__octoperf__get_virtual_user_validation(projectId, virtualUserId)

Re-fetch the failure index. Verify:

Cleared: the target category is gone. Move on to the next.
Reduced: some failures of the same category cleared, others didn't. The fix was partial — refine it (e.g. add a more specific correlation rule).
Unchanged or worse: the fix was wrong; revert it (update_* or delete_*) before trying another.
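The three outcomes above reduce to a comparison of the target category's failure count before and after the fix. A small sketch, with a hypothetical function name and plain integer counts as input:

```python
def fix_outcome(before: int, after: int) -> str:
    """Compare a category's failure count before and after a fix."""
    if after == 0:
        return "cleared"
    if after < before:
        return "reduced"
    # Same count or more failures: the fix should be reverted.
    return "unchanged or worse"
```

Counting per category rather than overall matters here: a fix can clear its own category while unrelated failures keep the total above zero.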

5. Engine-level failure (no index, no HTTP entities)

If get_virtual_user_validation_index comes back empty but the run is still marked failed/aborted, the validation engine itself crashed (JMeter OOM, missing Playwright dependency, bad locale, …), so there are no HTTP samples to read. The validation run produces a benchResultId (returned by validate_virtual_user) that is backed by the same log storage as a real bench run, so:

mcp__octoperf__list_bench_result_files(benchResultId)
mcp__octoperf__read_bench_result_file_lines(benchResultId, "jmeter.log")

surfaces the engine logs (and any Playwright trace / screenshot / HAR the engine left behind). In jmeter.log, read the tail first and then the head: fatal errors are usually within the last 50 lines and startup errors within the first 50.
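Once the log lines are in hand, the two 50-line windows can be sliced out without reading the middle of the file. A minimal sketch, assuming the log has already been loaded as a list of strings; the function name and the non-overlap handling for short logs are illustrative choices:

```python
from typing import List, Tuple


def head_and_tail(lines: List[str], window: int = 50) -> Tuple[List[str], List[str]]:
    """Return the first and last `window` lines without overlapping them."""
    head = lines[:window]
    # Start the tail no earlier than the end of the head, so short logs
    # are not reported twice.
    tail = lines[max(window, len(lines) - window):]
    return head, tail
```

For a log shorter than one window, the whole content lands in the head and the tail is empty.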

Log retention. Validation log files are erased 7 days after the run, or as soon as the user leaves the design screen. Old validation runs may no longer have logs — call out the freshness window before promising a re-read.

For binary artefacts (Playwright trace.zip, screenshots .png, HAR archives) the line-based reader returns garbage. Use the binary-aware tool instead:

mcp__octoperf__fetch_bench_result_file(benchResultId, filename)
# returns { filename, mimeType, sizeBytes, contentBase64 } — capped at 5 MB

Decode contentBase64 locally and inspect (e.g. base64 -d > trace.zip && unzip -p trace.zip trace.trace). This is especially valuable for Playwright VU failures: the JMeter wrapper log only sees the spawn, while the actual selector miss / timeout / navigation abort lives in the trace.
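The same decode-and-unzip step can be done in memory with the Python standard library. This sketch assumes the tool result has already been parsed into a dict with the `contentBase64` field shown above; the helper name `read_zip_entry` is hypothetical, and the entry name is passed in by the caller.

```python
import base64
import io
import zipfile


def read_zip_entry(payload: dict, entry_name: str) -> bytes:
    """Decode a base64 zip payload and return one entry's raw bytes."""
    raw = base64.b64decode(payload["contentBase64"])
    # Open the archive from memory instead of writing trace.zip to disk.
    with zipfile.ZipFile(io.BytesIO(raw)) as archive:
        return archive.read(entry_name)
```

Calling `read_zip_entry(result, "trace.trace")` mirrors the shell pipeline above. `archive.read` raises `KeyError` if the entry is not in the archive, which is a quick way to notice that the file is not the artefact you expected.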

6. Stop conditions

Stop and surface to the user when:

Zero failures → VU is ready. Offer to run_scenario.
All remaining failures are 5xx → it's the target environment, not the VU. Hand back with the list.
After two rounds of fixes the failure count plateaus → there's a structural issue (wrong recording, target API changed, …). Hand back with the diagnosis.
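The stop conditions can be expressed as a single check run after each re-validation. This is one possible reading, not a fixed rule: the function name and inputs are hypothetical, and "plateau" is interpreted here as "at least two fix rounds have run and the latest round did not lower the failure count".

```python
from typing import List, Optional


def stop_reason(remaining_statuses: List[Optional[int]],
                failure_counts: List[int]) -> Optional[str]:
    """Return why triage should stop, or None to keep going.

    remaining_statuses: status codes of the failures still present.
    failure_counts: total failures per validation run, oldest first,
    starting with the run made before any fix.
    """
    if not remaining_statuses:
        return "ready"
    if all(s is not None and 500 <= s <= 599 for s in remaining_statuses):
        return "target environment"
    # Three runs means two rounds of fixes have been applied.
    if len(failure_counts) >= 3 and failure_counts[-1] >= failure_counts[-2]:
        return "plateau"
    return None
```

A `None` result means none of the conditions hold yet, so the next category can be tackled.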

Sanity-check output reference

sanity_check_virtual_user runs before the first validation. Its output is a flat list of (level, message). ERROR entries block validation, WARNING / INFO entries don't. Mapping the canonical messages to fixes:

| Level | Message | What it means | Fix |
|---|---|---|---|
| ERROR | A file is missing for CSV variable | A CSVVariable points at a file not uploaded | `upload_project_file`, or `patch_virtual_user` to repoint the variable |
| ERROR | CSVVariable has conflicting column names | Two CSVVariables share a column name | Prefix one variable's columns; `patch_virtual_user` |
| ERROR | JSR223Action is empty | Empty script generates only noise logs | Delete the action or fill the script (`patch_virtual_user`) |
| ERROR | No Server Found | A request points at a deleted HTTP server | `list_http_servers_by_project` → recreate or repoint via `update_http_server` |
| ERROR | Cyclic Dependency Detected! | A fragment references itself (directly or indirectly) | Break the cycle with `patch_virtual_user` |
| WARNING | Clear Cookies before recording … | Recorded cookies may leak invalid session ids | Remove `Cookie` headers in the relevant requests |
| WARNING | Empty file for CSVVariable | The uploaded CSV parsed to zero rows | Re-upload a properly-encoded UTF-8 file |
| WARNING | End Of Value Policy is 'Stop VU' | Test will end abruptly when the CSV is exhausted | Confirm with user; otherwise switch policy to Recycle / Continue |
| WARNING | file is missing for POST request | A multipart POST references a file not in `/resources` | `upload_project_file` (no path prefix; OctoPerf adds `/resources/` itself) |
| INFO | Host header and server host are differing | Some servers reject mismatched Host headers | Search-and-replace the Host header if the target rejects |
| INFO | Using a JMeter generic action | Imported a raw JMeter element; double-check its config | None unless behavior is unexpected; JAR plugins go under `/lib/ext` |
| INFO | XXX sec thinktime is high | A recorded pause is unusually long | Trim the thinktime if the duration would harm the test |
| INFO | xxxxx should have a name | Unnamed element renders as "Unnamed" in reports | Rename via `patch_virtual_user` |
| INFO | xxxxx is empty | Empty controller / logic action that won't execute | Delete it |
| INFO | HTTP Action has empty query parameter | Imported a stray query param with no name / value | Remove the parameter via `patch_virtual_user` |

Apply the ERROR fixes first — validation is blocked until those clear.
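Since the sanity-check output is described as a flat list of (level, message) pairs, separating the blocking entries from the rest is a one-pass filter. A minimal sketch with a hypothetical function name, assuming the output has been parsed into Python tuples:

```python
from typing import List, Tuple


def split_sanity_results(
    results: List[Tuple[str, str]],
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Split sanity-check entries into blocking errors and advisory notes."""
    # ERROR entries block validation and must be fixed first.
    blocking = [message for level, message in results if level == "ERROR"]
    # WARNING and INFO entries do not block validation.
    advisory = [(level, message) for level, message in results if level != "ERROR"]
    return blocking, advisory
```

An empty `blocking` list means validation can start; the advisory entries can then be reviewed with the user at lower priority.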

Anti-patterns

Don't fetch failure details for every action. The index is enough to classify; details are for confirmation, not bulk inspection.
Don't run_scenario to debug. Validation is the right tool — it's cheap, captures full HTTP. A load test gives you metrics, not bodies.
Don't sanity-check after validating. sanity_check_virtual_user is a static check; run it before the first validation run, not after. If it would have caught the issue, you wasted a validation cycle.
Don't edit the VU silently. Summarize what fix you're about to apply and confirm with the user before any delete_* or destructive change.
