From xray-test-suite
Full-cycle Xray test case workflow — generate test cases from requirements (Jira issue, Confluence page, file, or pasted text), produce import-ready CSV and/or create Jira issues directly via Atlassian API, and optionally upload the CSV via Playwright browser automation. Use this when you need to bulk-author Xray tests from a spec.
Slash command
/xray-test-suite:xray-test-suite [source: jira-key | confluence-url | file-path | --dry-run]
End-to-end workflow for authoring and importing Xray test cases:
output/, or Xray UI via Playwright
Regardless of which output mode was chosen, every test issue created during a run must end up with the following state. The skill MUST verify (or perform) each of these for every test before reporting success in Step 9:
- output/TestCases_<EPIC_KEY>_<TIMESTAMP>.csv (Step 6).
- customfield_10014): set during CSV import via column 8, or via editJiraIssue post-creation in API mode (Step 8.1).
- createIssueLink with inwardIssue=<TEST>, outwardIssue=<EPIC> (Step 8.1.a; applies to both paths). Read the link back and confirm the test's issuelinks entry shows outwardIssue=<EPIC> (verified-correct direction; matches the 8.1.a body).
- customfield_14374 set on every AI-generated test, in bulk after creation (Step 8.4; applies to both paths). Makes AI-generated tests filterable via JQL "Reported by AI" = Yes.
These six items are NOT optional steps; they are the completion contract. Any test that ends a run with any of these missing should be reported in Step 9 as incomplete and offered for retry.
Three files. Read them at the start of every run.
| File | Purpose | Committed? |
|---|---|---|
| ${CLAUDE_SKILL_DIR}/references/config.json | Routing (cloudId, project key, custom field IDs, Xray import URL, step template Jira key, reviewer settings) | NO (gitignored) |
| ~/.claude/.xray-credentials.json | Real secrets (API token, Xray client ID/secret) | NO (lives outside skill) |
| ${CLAUDE_SKILL_DIR}/references/importConfiguration.json | CSV column → Jira field mapping (authoritative for CSV schema) | YES |
New in this version: config.templates.testStepTemplateKey (Jira key for canonical Xray test) and config.reviewer.* (enabled, maxIterations, severityThreshold) drive the Step 4.5 automated reviewer loop. Defaults if unset: enabled = true, maxIterations = 3, severityThreshold = "High". If testStepTemplateKey is unset/placeholder, the reviewer skips only the Template review category.
Templates: config.sample.json and credentials.sample.json are committed. See references/README.md for first-time setup.
- config.reviewer.enabled = true, NEVER skip Step 4.5: automated review is the primary gate; manual APPROVE is only the fallback when reviewer is disabled OR when the user opts out via the escalation menu after iteration maxIterations
- a) xray-mcp (default), b) CSV import, c) playwright-mcp; do not assume; default to xray-mcp
- importConfiguration.json; never hardcode
- createTestWithSteps / addTestStep / CSV import / Playwright upload runs until the traceability xlsx is approved
- createTestWithSteps/addTestStep, or Playwright UI; never leave a created test step-less)
- [OPEN FOR SPEC OWNER INPUT] followed by the open question. This convention makes it greppable later and avoids the trap of inventing behavior that doesn't match what gets shipped. (Validated 2026-05-28 on 22 Map-family gap-fill tests where 12 of 22 SRS clauses required this treatment.)
- mcp__atlassian__* tools for Jira/Confluence access
- mcp__playwright__* tools (required for Step 7 upload and Xray UI step-entry fallback)
- ~/.claude/.xray-credentials.json (see setup README)
Read ${CLAUDE_SKILL_DIR}/references/config.json. If missing, instruct user to cp config.sample.json config.json and abort.
Read the credentials file at config.credentialsPath (default ~/.claude/.xray-credentials.json). If missing or contains placeholder <...> values, abort with setup instructions.
Determine xrayMethod (how tests + native steps get written):
- xray-mcp MCP server available (mcp__xray-mcp__* tools) → xrayMethod = "xray-mcp" (DEFAULT, preferred). Confirm with mcp__xray-mcp__test_simple.
- credentials.xrayCloud.clientId and clientSecret both non-empty → xrayMethod = "API"
- xrayMethod = "Playwright"
Note: on tenants where the Xray Cloud GraphQL key is read-blind (getTests → total:0), the PAT-based xray-mcp gateway is the only working step read/write path, so prefer it.
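The selection order above can be sketched as a small function. This is only an illustration: `resolve_xray_method` and its `mcp_available` flag are hypothetical names, and it assumes "Playwright" is the fallback when neither of the first two conditions holds.

```python
from typing import Any, Mapping


def resolve_xray_method(mcp_available: bool, credentials: Mapping[str, Any]) -> str:
    """Pick how tests and native steps get written, in priority order."""
    # 1. The xray-mcp gateway is preferred whenever its tools are present.
    if mcp_available:
        return "xray-mcp"
    # 2. Otherwise use the Xray Cloud API if both client credentials are set.
    cloud = credentials.get("xrayCloud", {})
    if cloud.get("clientId") and cloud.get("clientSecret"):
        return "API"
    # 3. Assumed fallback: drive the Xray UI through Playwright.
    return "Playwright"
```

In the skill itself, availability of the gateway would still need to be confirmed with mcp__xray-mcp__test_simple before trusting the first branch.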
If $ARGUMENTS == "--dry-run": read both configs, print resolved values (REDACT secrets), and exit.
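One possible way to redact secrets for the dry-run printout is to mask every non-empty leaf of the credentials structure, so the output shows which values are set without revealing them. The helper name `mask_secrets` and the mask string are assumptions made for this sketch.

```python
from typing import Any

MASK = "***REDACTED***"  # hypothetical mask string


def mask_secrets(value: Any) -> Any:
    """Return a copy of a credentials structure with every set leaf masked."""
    if isinstance(value, dict):
        return {key: mask_secrets(item) for key, item in value.items()}
    if isinstance(value, list):
        return [mask_secrets(item) for item in value]
    # Keep empty values visible so missing settings are easy to spot.
    if value is None or value == "":
        return value
    return MASK
```

Masking wholesale, rather than by key name, avoids leaking a secret stored under an unexpected key.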
Supported input types:
| Input | Example | How to detect |
|---|---|---|
| Jira Issue Key | FIFAGEN-2872 | /^[A-Z]+-\d+$/ |
| Confluence URL | https://...atlassian.net/wiki/... | contains /wiki/ |
| Confluence Page ID | 123456789 | /^\d{6,}$/ |
| Local file | ./reqs.md, C:/docs/spec.pdf | file extension recognized |
| Plain text | multi-line spec | nothing else matches |
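A minimal sketch of the detection rules in the table, applied top to bottom. The function name and the returned labels are hypothetical; the extension set is taken from the supported file formats listed in this document.

```python
import re
from pathlib import PurePath

# Extensions from the "Supported file formats" table.
RECOGNIZED_EXTENSIONS = {
    ".md", ".txt", ".pdf", ".pptx", ".docx",
    ".png", ".jpg", ".jpeg", ".drawio", ".xml",
}


def detect_input_type(arg: str) -> str:
    """Classify a skill argument using the detection rules, in table order."""
    text = arg.strip()
    if re.fullmatch(r"[A-Z]+-\d+", text):
        return "jira"
    if "/wiki/" in text:
        return "confluence_url"
    if re.fullmatch(r"\d{6,}", text):
        return "confluence_page_id"
    # A single-line argument ending in a recognized extension is a file path.
    if "\n" not in text and PurePath(text).suffix.lower() in RECOGNIZED_EXTENSIONS:
        return "file"
    # Nothing else matched: treat the argument as pasted requirement text.
    return "text"
```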
Supported file formats:
| Extension | How to read |
|---|---|
| .md, .txt | Read tool, direct text |
| .pdf | Read tool with pages parameter for >10 pages |
| .pptx, .docx | Read tool |
| .png, .jpg, .jpeg | Read tool (vision/OCR for diagrams) |
| .drawio, .xml | Read tool, parse mxCell elements |
If no argument provided, prompt the user with the above table.
Route based on detected input type:
| Type | Tool / method |
|---|---|
| Jira Issue | mcp__atlassian__getJiraIssue (+ fetch linked children if it's an Epic) |
| Confluence | mcp__atlassian__getConfluencePage (extract page ID from URL if needed) |
| Local file | Read tool |
| Plain text | use directly |
Epic resolution: Test cases must link to an Epic.
customfield_10014. If unset, ASK USER.
Parse the requirements: use cases (UC1, UC2…), requirements (R1, R2…), state transitions, error scenarios.
Categorize into: Positive Flow / Negative Flow / Edge Case / State Machine / Integration / Error Recovery.
Assign priority (P1 Critical → P4 Low) based on business impact and safety-criticality.
Optimize by merging related scenarios (since tests run serially, fewer well-designed tests beat many narrow ones):
One verification per step (atomic steps) — REQUIRED: author every step with exactly ONE observable verification in expected_result. Never bundle assertions in one step:
When the target tool only appends (xray-mcp addTestStep), prefix each action with a stable [NN] index so intended order survives.
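The [NN] prefixing can be sketched as follows. The helper name is hypothetical, and stripping an existing prefix before re-indexing is an assumption made so the function can be re-run safely on already-indexed steps.

```python
import re

_INDEX_PREFIX = re.compile(r"^\[\d+\]\s*")


def index_actions(steps):
    """Return copies of the steps with a stable [NN] prefix on each action."""
    # Use at least two digits; widen if there are 100 or more steps.
    width = max(2, len(str(len(steps))))
    indexed = []
    for number, step in enumerate(steps, start=1):
        # Drop any earlier prefix so re-indexing does not stack prefixes.
        bare = _INDEX_PREFIX.sub("", step["action"])
        indexed.append({**step, "action": f"[{number:0{width}d}] {bare}"})
    return indexed
```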
Internal test case schema (drives all output modes):
{
"id": "TC-001",
"requirement_ids": ["R1", "R3"],
"summary": "Concise title (max 100 chars)",
"description": {
"objective": "What this validates and why",
"preconditions": ["System state", "Data setup"]
},
"steps": [
{"action": "Click Unload", "data": "slot=1", "expected_result": "Wafer moves to slot 1"}
],
"priority": "Critical|High|Medium|Low",
"tags": ["Positive|Negative|Boundary|ErrorHandling|Safety"]
}
If config.reviewer.enabled = true (default), pass the draft matrix to Step 4.5 — Review & Refine Loop for automated coverage / mapping / template / state-machine / merge / diagram validation. Otherwise (or after the loop escalates and the user opts out), present the matrix to the user and wait for explicit APPROVE / APPROVE ALL / APPROVE: <TC-IDs> before proceeding to Step 5.
Runs when: config.reviewer.enabled = true (default true if the field is missing). When false, skip directly to Step 5 after manual APPROVE.
Purpose: An automated test-case-reviewer subagent — running in a fresh context with no prior commitment to the draft — validates the matrix against the source SRS / images / state diagrams / step template, and loops with the generator (this skill) until either the reviewer is satisfied or the user accepts the remaining gaps.
Algorithm:
iter = 1
prev_feedback = null
loop:
    result = dispatch_reviewer_agent(
        source_type, source_content, image_paths,
        template_issue_key = config.templates.testStepTemplateKey,
        draft_matrix = current_matrix,
        iteration = iter,
        previous_feedback = prev_feedback
    )
    print "Reviewer iter <iter>: verdict=<result.verdict>, <#crit>/<#high>/<#med>/<#low> issues, <#gaps> gaps, <#merges> merges"
    if result.verdict == "PASS":
        break  # → proceed to Step 5
    if iter >= (config.reviewer.maxIterations ?? 3):
        escalate_to_user(result)  # see "Escalation menu" below; user decides
        break
    current_matrix = refine_matrix(current_matrix, result)  # see "Refinement strategy" below
    prev_feedback = result.issues
    iter += 1
Reviewer dispatch: Spawn one general-purpose agent (same agent-type as Step 8.2's parallel creators), foreground (Step 5 blocks on its result). Pass the inputs documented in the Reviewer Agent Contract section below. The agent's response is parsed as JSON conforming to that contract — if the response is not parseable JSON, retry the dispatch once with a stricter "return JSON only — no prose" instruction; on second failure, escalate to the user as if iter == maxIterations.
Template gracefully optional: If config.templates.testStepTemplateKey is unset, empty, or still a <...> placeholder, dispatch the reviewer with template_issue_key = null and instruct it to skip the Template review category (enforce only the other five: Coverage, Mapping, StateMachine, MergeOpportunity, DiagramCoverage). Note this in the iteration summary so the user knows step-shape was not checked.
Refinement strategy (refine_matrix is implemented inline by this skill — NOT a subagent — since the matrix lives in the current conversation):
| Reviewer issue category | Skill response (only if severity ≥ config.reviewer.severityThreshold) |
|---|---|
| Coverage | Generate a new TC covering each missing requirement ID from coverage_gaps[] |
| StateMachine | Generate a new TC for each missing transition |
| Mapping | Fix the named TC's requirement_ids array to match what its steps actually exercise |
| MergeOpportunity | Merge tests per merge_suggestions[]: keep merge_into's TC ID; absorb steps + requirement_ids from the named TCs in absorb[]; remove the absorbed TCs. Compare INTENT (objective + requirement_ids), not only step text: TCs covering the same SRS Open Issue or scenario are merge candidates even with non-overlapping steps. |
| Template | Rewrite the offending step's action / data / expected_result per the reviewer's suggested_fix |
| DiagramCoverage | Extend or add a TC referencing the missed visual element |
| PriorityLadder | Generate per-rule isolated-collision TCs for each missing rule; add one full-ladder top→down release TC per affected signal. For safety-critical signals (alarm-bearing, buzzer, emergency-stop equivalents) lift severity to Critical regardless of severityThreshold |
| APIContract | Generate a direct-call TC for each named method; if oneof / Result / Either pattern is referenced, add explicit success AND error wire-shape assertions; for documented parameter ranges, add boundary-value tests for -1, max+1, and one oversized value |
| SpecCompletenessGap | Generate a new characterization TC for each sub-check (a-j) that has no covering TC. When the SRS is silent on recovery / failure / decision (sub-checks a, c, d, g, j most commonly), set the expected_result to the literal token [OPEN FOR SPEC OWNER INPUT] followed by the specific open question; never invent behavior. For (a) cross-spec contradictions, the new TC must explicitly call out which existing test(s) it contradicts, so the spec owner can reconcile. For (g) Comment: / Need to add annotations, the new TC's Description must quote the SRS annotation verbatim. |
| Issues below severityThreshold | Note in the iteration summary; do NOT trigger refinement |
After each refinement, re-number TC IDs to remain dense (TC-001, TC-002…) and preserve requirement_ids traceability.
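The threshold filter and the dense renumbering can be sketched like this. The `severity` key on each issue and both function names are assumptions for illustration; the reviewer's actual JSON schema may name things differently.

```python
# Severity levels from lowest to highest, per the severity rubric.
SEVERITY_ORDER = ["Low", "Medium", "High", "Critical"]


def actionable_issues(issues, threshold="High"):
    """Keep only issues at or above the configured severity threshold."""
    floor = SEVERITY_ORDER.index(threshold)
    return [i for i in issues if SEVERITY_ORDER.index(i["severity"]) >= floor]


def renumber(matrix):
    """Reassign dense TC IDs (TC-001, TC-002, ...) in current matrix order."""
    # requirement_ids and all other fields are carried over untouched.
    return [{**tc, "id": f"TC-{n:03d}"} for n, tc in enumerate(matrix, start=1)]
```

Issues filtered out here would still be listed in the iteration summary; they simply do not trigger refinement.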
Escalation menu (at iter ≥ maxIterations OR if reviewer JSON is unparseable twice):
Reviewer did not converge after <N> iterations.
Remaining issues (severity ≥ <threshold>):
[grouped by category, each with TC ID + description]
Coverage gaps still open: <R-IDs>
Merge suggestions not applied: <list>
Options:
1. Accept all gaps — proceed to Step 5 with the current matrix
2. Accept specific — e.g. "accept: TC-003, R7" (only those issues skipped; others still loop)
3. Force iterate — re-run reviewer + refinement once more
4. Abort — exit workflow without creating any tests
Reply: 1, 2 (with list), 3, or 4
Options 1 and 2 proceed to Step 5. Option 3 sets iter = iter (no increment) and re-enters the loop body once. Option 4 cleanly exits with no Jira changes.
After the matrix is approved (Step 4.5 / manual APPROVE) and BEFORE any output mode runs, generate a traceability matrix as an .xlsx and get explicit user sign-off. No createTestWithSteps / addTestStep / CSV import / Playwright upload may run until the user approves this file.
output/traceability_matrix_<EPIC_KEY>_<YYYYMMDD>.xlsx (openpyxl) with:
Review the traceability matrix: approve to proceed to creation/import? (yes / changes).
Rationale: step creation/import is effectively irreversible on append-only / import paths, so catch coverage gaps in the xlsx, not in Jira.
After the traceability matrix is approved (Step 4.7), ask:
## How would you like to generate these test cases?
a) xray-mcp — create tests + native steps directly via the xray-mcp gateway [DEFAULT]
b) CSV import — import-ready CSV in output/, then the Xray Test Case Importer
c) playwright-mcp — create / enter tests via Xray UI browser automation
Reply: a, b, or c (default a)
Map the answer to OUTPUT_MODE ∈ {XRAY_MCP, CSV_IMPORT, PLAYWRIGHT}. Default to XRAY_MCP when the user says "go" / "default". Do not guess between b and c — re-ask if ambiguous.
Routing:
| Mode | Steps that run |
|---|---|
| XRAY_MCP (default) | 8 (xray-mcp pilot + serial creation) → 8.4 → 9 |
| CSV_IMPORT | 6 (generate CSV) → 7 (Playwright importer upload + 7.9 wiring) → 9 |
| PLAYWRIGHT | 8 (Method B: Xray UI step entry) → 8.4 → 9 |
The CSV is consumed by Xray's Test Case Importer. The column schema is driven by importConfiguration.json — read it now and rebuild the order from config.field.mappings[]. Do NOT hardcode.
Current schema (sorted by column.index):
| Index | Field ID | Header | Row scope |
|---|---|---|---|
| 0 | __xray_testId | Test ID | Every row (group key) |
| 1 | summary | Summary | First row of each test |
| 2 | description | Description | First row of each test |
| 3 | xray_testtype | Test Type | First row of each test |
| 4 | __xray_step_data | Step Data | Every step row |
| 5 | __xray_step_action | Step Action | Every step row |
| 6 | __xray_step_result | Step Result | Every step row |
| 7 | __xray_step_number | Step Number | Every step row (1-indexed within test) |
| 8 | customfield_10014 | Epic Link | First row of each test (Jira Epic key, e.g. FIFAGEN-10400) |
⚠️ Tenant-portability note:
customfield_10014 is the Epic Link field ID on the default reference tenant. If your Jira tenant uses a different ID (check config.json → customFields.epicLink), update column.index 8's jira.field.id in importConfiguration.json accordingly.
Format rules (from importConfiguration.json):
, • Encoding UTF-8 • Quote " • Escape " as "",, ", \n, or ;; (used for requirement_ids, tags)
Row layout: row-per-step, grouped by Test ID:
A test with N steps emits N rows sharing the same Test ID. The first row carries case-level fields (Summary / Description / Test Type); subsequent rows leave those blank. This matches how the Xray importer groups rows by __xray_testId.
Example (1 test, 2 steps):
Test ID,Summary,Description,Test Type,Step Data,Step Action,Step Result,Step Number,Epic Link
TC-001,"Basic Unload from Chuck","Objective: Verify wafer unload...
Preconditions:
- Chuck loaded
- Carrier slot empty
Requirements: R1;R3
Priority: High
Tags: Positive",Manual,"slot=1; wafer_id=W001","Click Unload button","Wafer moves to carrier slot 1",1,FIFAGEN-10400
TC-001,,,,"slot=1","Verify carrier light is green","Light is green and no alarms",2,
Field construction:
| Column | Source |
|---|---|
| Test ID | test_case.id |
| Summary | test_case.summary |
| Description | Objective: <objective>\n\nPreconditions:\n- <p>\n\nRequirements: <ids joined ";">\nPriority: <p>\nTags: <tags joined ";"> |
| Test Type | config.defaults.testType |
| Step Data | step.data (use literal "no data" if step has no meaningful data; never use a Unicode em-dash, which turns into cp1252 mojibake) |
| Step Action | step.action |
| Step Result | step.expected_result |
| Step Number | step_index + 1 (1-indexed within test; Xray uses this for ordering and step-level display in Jira test issues) |
| Epic Link | EPIC_KEY on first row of each test, empty on subsequent step rows. Maps to customfield_10014 so Xray auto-links the created Test under the Epic, with no manual Jira UI click needed post-import. |
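The row-per-step layout and field construction above can be sketched with Python's csv module. Note the assumption: the header list is hardcoded here purely for illustration, whereas the skill requires deriving column order from importConfiguration.json. The function names and the default test type are likewise hypothetical.

```python
import csv
import io

# Illustration only: the real order must come from importConfiguration.json.
HEADERS = ["Test ID", "Summary", "Description", "Test Type", "Step Data",
           "Step Action", "Step Result", "Step Number", "Epic Link"]


def build_description(tc):
    """Assemble the Description cell from the internal test case schema."""
    preconditions = "\n".join(f"- {p}" for p in tc["description"]["preconditions"])
    return (f"Objective: {tc['description']['objective']}\n\n"
            f"Preconditions:\n{preconditions}\n\n"
            f"Requirements: {';'.join(tc['requirement_ids'])}\n"
            f"Priority: {tc['priority']}\n"
            f"Tags: {';'.join(tc['tags'])}")


def rows_for(tc, epic_key, test_type="Manual"):
    """Emit one row per step; case-level fields appear on the first row only."""
    rows = []
    for index, step in enumerate(tc["steps"]):
        first = index == 0
        rows.append([
            tc["id"],
            tc["summary"] if first else "",
            build_description(tc) if first else "",
            test_type if first else "",
            step.get("data") or "no data",
            step["action"],
            step["expected_result"],
            index + 1,  # Step Number is 1-indexed within the test
            epic_key if first else "",
        ])
    return rows


def to_csv(test_cases, epic_key):
    """Render all test cases as CSV text; csv.writer doubles embedded quotes."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(HEADERS)
    for tc in test_cases:
        writer.writerows(rows_for(tc, epic_key))
    return buffer.getvalue()
```

When writing the result to disk, opening the file with encoding="utf-8" and newline="" keeps the encoding and line endings under the writer's control.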
Write to: ${CLAUDE_SKILL_DIR}/output/TestCases_<EPIC_KEY>_<YYYYMMDD-HHMMSS>.csv (or per config.output.filenamePattern).
After writing, present the file path, row count, and instructions to either continue to Step 7 (Playwright upload) or stop.
After CSV is written, ask:
Upload this CSV to Xray now via browser automation? (yes / no)
- yes: I'll navigate to the Xray Test Case Importer, handle SSO, upload the CSV + importConfiguration.json, and run the import wizard.
- no: skip — you can import manually later via Jira → Xray → Test Case Importer
If yes, run the upload sub-workflow:
- mcp__playwright__browser_navigate → config.xrayImport.url
- config.atlassian.username
- Continue → wait for SSO redirect
- output/
- references/importConfiguration.json
On success: list created Jira keys (read from the importer's results page; do NOT hardcode). On failure: report error + screenshot, suggest checking CSV format or field mappings.
After the importer reports success and the new Jira keys are harvested, run these finalization steps so the CSV-import path produces tests in the same fully-wired state as the API path:
- browser_navigate + browser_snapshot, or Xray Cloud API GET /api/v2/test/<KEY>/steps) and assert ≥1 step is present. If any test has zero steps, the CSV's __xray_step_* columns were either empty or mismapped; flag for retry.
These steps are NOT optional: skipping any of them produces a half-wired test that violates the Per-Test Wiring Checklist at the top of this skill.
Selector strategy (in order of preference): text → role → data-testid → CSS → XPath. Take fresh snapshot after each major action. Iframes: Xray Test Case Importer runs in an iframe — use iframe refs from snapshots.
Primary path = xray-mcp gateway (mode XRAY_MCP, default). Mode PLAYWRIGHT runs the same flow but enters tests/steps via the Xray UI (Method B in 8.3).
xray-mcp mechanics (learned constraints — obey them):
- mcp__xray-mcp__createTestWithSteps(projectKey, summary, steps[]) creates the test WITH all steps ordered in one call (no reordering needed). It does NOT set description/priority/fields; set those afterwards via Atlassian MCP. Multiple new-test createTestWithSteps calls MAY run in parallel.
- mcp__xray-mcp__addTestStep only: it APPENDS to the end, one step per call, and the gateway rejects concurrency with HTTP 503. Call it strictly serially (await each before the next); order is then preserved. Prefix each action with [NN] so order is self-evident/recoverable.
- [NN] atomic steps; the user deletes the old block in the UI.
- getTests → total:0); the PAT-based xray-mcp gateway is the working path. Verify with mcp__xray-mcp__test_simple.
Create ONE test via mcp__xray-mcp__createTestWithSteps (atomic steps from the approved matrix) to validate formatting before bulk creation. Then:
- mcp__atlassian__editJiraIssue: description (Objective/Preconditions/Requirements/Priority/Tags), priority {name}, Epic Link customfield_10014 = <EPIC_KEY> (and customfield_14374 = {value:"Yes"} here or in bulk at Step 8.4).
- issuelinks entry must show outwardIssue = <EPIC_KEY> (verified-correct direction).
Present pilot URL to user and WAIT for explicit approval before proceeding to 8.2. In 8.2, create remaining NEW tests with createTestWithSteps (may batch in parallel); use serial addTestStep only when appending to pre-existing tests.
The Epic Link custom field (customfield_10014) establishes hierarchy but does NOT populate the Epic's Issue Links panel. To make the Epic's Issue Links section display is tested by <TEST_KEY>, create a standard Jira issue link AFTER the Epic Link is set.
⚠ Idempotency requirement (critical): Jira's POST /rest/api/3/issueLink is NOT idempotent — calling it twice for the same (type, inwardIssue, outwardIssue) tuple creates TWO duplicate rows in the Issue Links panel. ALWAYS pre-check existing links before posting. The skip condition: a link with type.name == linkTypeName AND inwardIssue.key == <TEST_KEY> already exists on the Epic.
Pre-check (once per Epic, cache the result for that run):
existing = mcp__atlassian__getJiraIssue(cloudId, issueIdOrKey=<EPIC_KEY>, fields=["issuelinks"])
existingTestKeys = set of l.inwardIssue.key for l in existing.fields.issuelinks where l.type.name == config.linkTypes.testLinkName AND l.inwardIssue
Then per test, only POST if not already present:
if <TEST_KEY> not in existingTestKeys:
    mcp__atlassian__createIssueLink(
        cloudId,
        type: config.linkTypes.testLinkName,  // default "Test"; see Issue Link Types Reference
        inwardIssue: <TEST_KEY>,              // active subject (named by outward label "tests")
        outwardIssue: <EPIC_KEY>              // passive object (named by inward label "is tested by")
    )
    status: "created"
else:
    status: "skipped-already-linked"
Report both counts in Step 9 summary: created: N, skipped (already linked): M, failed: K.
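The pre-check and per-test decision can be sketched as two pure functions operating on the issue JSON returned for the Epic. Both function names are hypothetical, and the sketch only plans the work; the actual createIssueLink call is left to the caller.

```python
def existing_test_keys(epic_issue, link_type_name="Test"):
    """Collect test keys already linked to the Epic with the given link type."""
    links = epic_issue.get("fields", {}).get("issuelinks", [])
    return {
        link["inwardIssue"]["key"]
        for link in links
        if link.get("type", {}).get("name") == link_type_name
        and link.get("inwardIssue")
    }


def plan_links(test_keys, already_linked):
    """Decide per test whether to create the link or skip it as a duplicate."""
    plan = {}
    for key in test_keys:
        if key in plan:
            continue  # same key listed twice in one batch: plan it only once
        plan[key] = "skipped-already-linked" if key in already_linked else "create"
    return plan
```

Counting the values of the returned plan gives the created and skipped totals for the Step 9 summary.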
Directionality (Jira convention): inwardIssue is the issue whose role matches the link type's outward label; outwardIssue matches the inward label. For Test link type (outward="tests", inward="is tested by"):
- Test (<TEST_KEY>) → inwardIssue
- Epic (<EPIC_KEY>) → outwardIssue
Result on Epic page: "is tested by <TEST_KEY>" appears in the Issue Links panel.
Result on Test page: "tests <EPIC_KEY>" appears.
API fallback when Atlassian MCP cannot see the issue (some tenants/permissions block recent issues from the MCP — see xray-cloud-api-access memory pattern): use Playwright with in-browser fetch() (see corporate-tls-workaround memory for the validated pattern). The pre-check + create pattern is the SAME — just run both fetches inside mcp__plugin_playwright_playwright__browser_evaluate. The idempotency rule still applies: GET issuelinks first, build the existing-keys set, then POST only for missing pairs.
Validated 2026-05-22 on FIFAGEN tenant: pre-check via GET /rest/api/3/issue/<EPIC>?fields=issuelinks (session-cookie auth), filter on type.name == "Test" && inwardIssue, build a Set<TEST_KEY>, then POST /rest/api/3/issueLink only for tests not in the set. Catches both same-batch retries and cross-batch reruns.
After the link is created (whether via API or Playwright), confirm it appears in the Epic's Issue Links section. Two verification paths:
Path A — API (when MCP visibility allows):
mcp__atlassian__getJiraIssue(cloudId, issueIdOrKey=<EPIC_KEY>, fields=["issuelinks"])
Assert: at least one entry in fields.issuelinks[] where type.inward == "is tested by" AND inwardIssue.key == <TEST_KEY>.
Path B — Playwright (when API can't see the Epic):
- mcp__plugin_playwright_playwright__browser_navigate → https://<site>/browse/<EPIC_KEY>
- mcp__plugin_playwright_playwright__browser_snapshot (capture full page)
- is tested by followed by <TEST_KEY>
- mcp__plugin_playwright_playwright__browser_take_screenshot for evidence (filename pattern: verify_link_<EPIC_KEY>_<TEST_KEY>.png)
Record the verification outcome (linked: true/false) on the test's result record so Step 9's summary can display it as a new column.
For each remaining test case, spawn a general-purpose agent with run_in_background: true. All agents in a single message → true parallel execution.
Batching: Default 10 agents per batch. Wait for each batch to complete before spawning the next.
Per-agent prompt template must include:
- OUTPUT_MODE, cloudId, EPIC_KEY, loginEmail
- xrayMethod ("API" or "Playwright") so the agent knows how to add native steps
- linkTypeName from config.linkTypes.testLinkName (default "Test") for the "is tested by" link creation in Step 8.1.a (parallel agents perform 8.1.a per test, but defer 8.1.b verification to the orchestrator in Step 9)
- {status, testCaseId, jiraKey, url, xrayStepsAdded, isTestedByLinkCreated, error}
Method A: Xray Cloud API (when xrayMethod == "API"):
TOKEN=$(curl -s -X POST -H "Content-Type: application/json" \
-d '{"client_id":"<CLIENT_ID>","client_secret":"<CLIENT_SECRET>"}' \
https://xray.cloud.getxray.app/api/v2/authenticate | tr -d '"')
curl -X PUT -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-d '[{"action":"...","data":"...","result":"..."}]' \
https://xray.cloud.getxray.app/api/v2/test/<TEST_KEY>/steps
Token TTL is ~15 minutes — each agent authenticates independently.
Method B — Playwright (when xrayMethod == "Playwright"): Navigate to test issue → Test Details tab → loop "Add Step / New Step" for each step, filling Action / Data / Expected Result fields. UI runs in an iframe — refresh snapshot between actions.
Purpose: Tag every created test with the AI-generation marker so they're filterable in Jira ("Reported by AI" = Yes) for audit and traceability. Required by the FIFAGEN tenant convention; configurable per tenant via config.fields.reportedByAi.
When this runs:
Field details (FIFAGEN tenant default):
| Property | Value | Override |
|---|---|---|
| ID | customfield_14374 | config.fields.reportedByAi.id |
| Name | Reported by AI | config.fields.reportedByAi.name |
| Schema | option | — |
| Payload | {value: "Yes"} | config.fields.reportedByAi.value |
Implementation — use Playwright browser_evaluate with one batched fetch loop (validated 2026-05-28 on 85 tests across 4 epics, 100% success):
async () => {
  const keys = [/* every Jira test key created in this run */];
  const batchSize = 15;
  let ok = 0, failed = [];
  // Process keys in batches so at most batchSize PUTs are in flight at once.
  for (let i = 0; i < keys.length; i += batchSize) {
    const batch = keys.slice(i, i + batchSize);
    const results = await Promise.all(batch.map(async k => {
      const r = await fetch('/rest/api/3/issue/' + k, {
        method: 'PUT',
        headers: {'Content-Type':'application/json', 'X-Atlassian-Token':'no-check'},
        credentials: 'include',
        body: JSON.stringify({ fields: { customfield_14374: { value: 'Yes' } } })
      });
      return { k, ok: r.ok, status: r.status };
    }));
    ok += results.filter(r => r.ok).length;
    failed.push(...results.filter(r => !r.ok));
  }
  return { ok, failed_count: failed.length, failed_keys: failed.map(f => f.k) };
}
⚠ Editmeta gotcha: on the FIFAGEN tenant,
customfield_14374 is NOT exposed via GET /rest/api/3/issue/<KEY>/editmeta for the Test issue type (it's not on the Test edit screen scheme). The direct PUT works anyway because the field IS in the contextual scope. Don't use editmeta as a gatekeeper: try the PUT and check the response. A 204 response means it took.
Verification (REQUIRED — double-check via two independent paths):
- customfield_14374 from 3-5 keys spanning all epics; assert value: "Yes".
- issue in (<every key>) AND "Reported by AI" = Yes: must return exactly keys.length issues. If lower, list which keys are missing and retry.
Both checks must pass before declaring the bulk-tag complete. Either alone has a blind spot (sample misses non-sampled failures; JQL sweep doesn't tell you WHICH key failed).
On failure: If any tests didn't take the tag, retry those keys once (network blips are the common cause); if still failing, list them in Step 9 with the HTTP status returned. Don't silently drop the failures.
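The JQL sweep and the missing-key reconciliation can be sketched as below. Both helper names are hypothetical; the JQL text follows the pattern quoted in the verification step, and running the search itself is left to whichever Jira access path is in use.

```python
def build_sweep_jql(keys):
    """Build the verification JQL covering every key created in the run."""
    return f'issue in ({", ".join(keys)}) AND "Reported by AI" = Yes'


def missing_tags(expected_keys, jql_result_keys):
    """List keys the sweep did not return, i.e. tests that still lack the tag."""
    return sorted(set(expected_keys) - set(jql_result_keys))
```

Diffing the two key sets recovers which tests failed, which the bare result count from the sweep cannot show on its own.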
See feedback-reported-by-ai-field-tagging memory for the validated pattern and reusable JQL filter.
Agent failures are scoped to single tests. Other agents continue. Failed tests are reported in Step 9 and can be retried.
Mode-aware report:
CSV_ONLY:
CSV Generation Complete
File: <path>
Tests: <N>, Step rows: <TOTAL_STEPS>
Schema: derived from references/importConfiguration.json
No Jira issues created.
To import: see Step 7 instructions (re-run /xray-test-suite:xray-test-suite and pick option b → say yes to Playwright upload),
or manually upload via Jira → Xray → Test Case Importer.
API_ONLY:
Test Case Creation Summary
Total: <N>
Pilot: <PILOT_KEY>
Parallel agents: <M>
Success rate: <X>/<N>
"is tested by" link rate: <Y>/<N> (counted after Step 8.1.b verification on Epic Issue Links table)
"Reported by AI" = Yes tag rate: <Z>/<N> (counted after Step 8.4 JQL verification sweep)
| # | TC ID | Jira Key | Xray Steps | is tested by | Reported by AI | Status | Link |
| - | ----- | -------- | ---------- | ------------ | -------------- | ------ | ---- |
...
Failed cases (if any):
| TC ID | Error | Suggested action |
Failed link creations (if any):
| TC ID | Test Key | Reason | Suggested action |
Failed "Reported by AI" tag patches (if any):
| Test Key | HTTP status | Suggested action |
BOTH: combine — lead with CSV path, then the API table. Note that CSV is the rollback artifact if API had failures.
If any API failures, offer retry options:
Atlassian Document Format examples for description and custom-field bodies.
Paragraph:
{"type":"doc","version":1,"content":[
{"type":"paragraph","content":[{"type":"text","text":"Your text"}]}
]}
Ordered list:
{"type":"orderedList","content":[
{"type":"listItem","content":[
{"type":"paragraph","content":[{"type":"text","text":"Step 1"}]}
]}
]}
Bold: {"type":"text","text":"bold","marks":[{"type":"strong"}]}
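If these bodies are assembled programmatically, small builder functions keep the nesting consistent. The helper names below are hypothetical; the node shapes mirror the three examples above.

```python
def adf_text(text, bold=False):
    """Build a text node, optionally carrying the strong (bold) mark."""
    node = {"type": "text", "text": text}
    if bold:
        node["marks"] = [{"type": "strong"}]
    return node


def adf_paragraph(*nodes):
    """Wrap inline nodes in a paragraph block."""
    return {"type": "paragraph", "content": list(nodes)}


def adf_ordered_list(items):
    """Build an ordered list with one single-paragraph list item per string."""
    return {
        "type": "orderedList",
        "content": [
            {"type": "listItem", "content": [adf_paragraph(adf_text(item))]}
            for item in items
        ],
    }


def adf_doc(*blocks):
    """Wrap block nodes in the top-level version 1 document."""
    return {"type": "doc", "version": 1, "content": list(blocks)}
```

For example, `adf_doc(adf_paragraph(adf_text("Your text")))` reproduces the paragraph example.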
Detailed I/O specification for the Step 4.5 reviewer subagent.
| Section | Content | Notes |
|---|---|---|
| Role | "You are a test-case reviewer running in a fresh context with no prior commitment to the draft. Re-extract requirements independently from the source materials below — do NOT trust the draft matrix's interpretation." | Pinning fresh-context discipline is critical |
| source_type | jira / confluence / file / text | |
| source_content | Inline verbatim if < 8 KB; otherwise pass the path/key and instruct the agent to re-fetch via Read / mcp__atlassian__getJiraIssue / mcp__atlassian__getConfluencePage | Token budget |
| image_paths | Absolute paths to .png / .jpg / .drawio referenced in source; agent uses Read with vision | Raw, no preprocessing |
| template_issue_key | config.templates.testStepTemplateKey or null | If null, agent skips the Template category |
| draft_matrix | Full internal test-case schema array as JSON | |
| iteration | 1, 2, or 3 | |
| previous_feedback | (iter > 1) prior issues[] array | Agent verifies prior issues were addressed |
| Nine review jobs | Bulleted checklist (see below) | |
| Severity rubric | Critical / High / Medium / Low with examples (see below) | |
| Output instruction | "Return JSON only, with no prose preamble or postamble. Conform to the schema below." + the schema | |
Before running the 9 review jobs, the reviewer must scan the source materials and build a catalog of:
- ~~...~~ in markdown, <s> in HTML, struck-through font in Word/PDF). Treat as obsolete; any TC that tests struck-through behavior is a candidate for retirement.
- Comment:, Note:, Need to add, Should we, Define, TBD, Open Question, 🔴 Missing, or similar self-flagged-gap markers. Each annotation is a SELF-IDENTIFIED gap by the spec author.
- unknown slot tokens in a printed map example). These are usually valid runtime states that real tests miss.
This catalog feeds the new SpecCompletenessGap review job (#9) below.
The reviewer must check each. Paraphrase/abbreviation tolerance applies to all jobs: when checking whether a TC covers a documented identifier (PLC tag, enum value, requirement ID, API method, rule), accept abbreviated/full-form/synonymous variations as equivalent unless the verification is explicitly about verbatim string matching. Example: WATCH_DOG_ALARM and BRV_TOWER_WATCH_DOG_ALARM refer to the same tag — do NOT flag the abbreviated form as missing coverage.
Coverage — Re-extract identifiers from source independently. The reviewer must enumerate ALL of:
- requirement_ids array
- watchdog.timeout default = 9999 = disabled), the default-config behavior must have its OWN dedicated TC (testing only non-default configurations is a category-error; the default ships to every customer)
- OTHER = No change): each documented tag must have an assertion in at least one TC
Severity: Critical for shipped-default-config misses and items the SRS explicitly flags as missing; High for enum values, Open Issues, NFRs without coverage; Medium for tag-level documentation assertions.
Mapping — For each TC, the steps must actually exercise the requirements listed in requirement_ids. Flag: requirements claimed but not tested, or steps testing un-listed requirements. Apply paraphrase tolerance: a step that asserts WATCH_DOG_ALARM=0 covers a requirement that references BRV_TOWER_WATCH_DOG_ALARM.
StateMachine — If source contains a state machine (text or diagram), every transition must have ≥1 covering TC. Report missing transitions as category: "StateMachine" issues.
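The transition check reduces to a set difference once each TC declares which transitions it exercises. The sketch below assumes a hypothetical `covers_transitions` field on each TC (not part of the schema in this document) and leaves severity to the rubric.

```python
def uncovered_transitions(transitions, test_cases):
    """Return one StateMachine issue per transition that no TC claims to cover."""
    covered = {
        tuple(pair)
        for tc in test_cases
        for pair in tc.get("covers_transitions", [])
    }
    return [
        {
            "category": "StateMachine",
            "test_id": None,  # the gap belongs to no existing TC
            "description": f"No TC covers transition {src}→{dst}",
        }
        for src, dst in transitions
        if (src, dst) not in covered
    ]
```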
PriorityLadder (NEW) — For each multi-rule priority ladder in the SRS (e.g., Red has 5 rules; Buzzer has 3; Yellow has 7; Green has 5):
Severity: High by default, Critical for safety-critical signals (alarm-bearing, buzzer, emergency-stop equivalents).
APIContract (NEW) — When the SRS lists explicit API methods (gRPC, REST, RPC, message bus, etc.):
- For methods with oneof / Result / Either / discriminated-union error contracts, both success AND error wire shapes must be asserted
- For parameters with a documented range (e.g. signal_id 0-4), boundary values (-1, max+1, 100, oversized) must be in a TC

Severity: Critical for any public method without direct-call coverage; High for missing error-path coverage and missing parameter-boundary tests.
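For the parameter-boundary rule, the expected probes can be derived mechanically from the documented range. This is a sketch with hypothetical helper names; it models only integer ranges and does not cover the "oversized" case, which depends on the wire type.

```python
def boundary_values(lo, hi, far_out=100):
    """Out-of-range probes for an inclusive integer range [lo, hi]."""
    return [lo - 1, hi + 1, far_out]


def missing_boundaries(lo, hi, tested_values, far_out=100):
    """Boundary probes that no TC exercises yet."""
    tested = set(tested_values)
    return [v for v in boundary_values(lo, hi, far_out) if v not in tested]
```

For signal_id 0-4 the probes are -1, 5, and 100; any probe returned by `missing_boundaries` would become a High-severity APIContract issue.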
MergeOpportunity (STRENGTHENED) — Two TCs are merge candidates when ANY of:
- Their intent overlaps, judged on requirement_ids, not only step text. Example: TC-A "watchdog lifecycle including timeout<reset_interval validation" and TC-B "Open Issue #5: reset_interval > timeout misconfig" overlap at intent level despite different step phrasings.

Emit merge_suggestions[] entries.
Template (skip if template_issue_key == null) — Fetch the template Xray test via mcp__atlassian__getJiraIssue + Xray Cloud GET /api/v2/test/<KEY>/steps. Derive shape rules (verb-first actions; measurable expected results — no "works correctly"; data field semantics). Flag steps that violate them.
DiagramCoverage — For each visual element / state / transition / box in attached diagrams, verify some TC references it. Flag uncited elements.
SpecCompletenessGap (NEW — validated 2026-05-28) — Validates the matrix against common production scenarios that SRSes routinely omit. The 8 prior jobs check completeness against what's WRITTEN; this job checks completeness against what production tools NEED but SRSes don't enumerate. For each of these 10 sub-checks, flag a missing TC if no existing TC addresses it AND the source does not explicitly mark it out of scope:
a. Cross-spec contradiction (Critical) — If the SRS contains struck-through text (from the pre-pass catalog) and any TC tests that struck-through behavior, flag for spec-owner reconciliation. If multiple linked SRSes contradict each other, the matrix must contain at least one characterization TC that exposes the contradiction.
b. Multi-instance behavior (High) — When SRS describes a singular resource ("the device", "the load port", "the pipe", "the lane") but the production system has N instances, the matrix must include at least one multi-instance isolation TC asserting an operation on instance A doesn't mutate instance B's state. (Example from this session: 9 Powerup tests all assumed single Load Port; real tools have 2-4.)
c. Hardware-failure path (High) — For each spec step that calls hardware (sensor, motor, robot, mapper, valve, network), the matrix must contain ≥1 characterization TC for "hardware returns failure / non-response / sensor unknown". When SRS is silent on recovery, mark expected result [OPEN FOR SPEC OWNER INPUT].
d. Intermediate / partial-state recovery (High) — For multi-step sequences (e.g., load = Clamp → Dock → Open → Map → Create), the matrix must include ≥1 TC for the partial-state case (shutdown after Clamp but before Dock). Power loss can leave hardware in these intermediate states. When SRS is silent, mark [OPEN FOR SPEC OWNER INPUT].
e. Default-config persistence (High) — Job 1's "shipped default configuration values" rule already requires a default-value TC; this sub-check additionally requires verifying the default persists across clean install AND clean reset AND power cycle.
f. Cross-session / cross-feature interaction (Medium) — When two features share state (e.g., Diagnostics ↔ Production, EU ↔ Power-up, Multiple commands targeting the same resource), the matrix must include ≥1 TC asserting one feature's operations don't mutate the other's persisted state when transitioning between sessions.
g. Spec-author "Comment:" annotations (Critical) — From the pre-pass catalog: every Comment: / Need to add / Should we / Define / TBD / Open Question annotation is a SELF-IDENTIFIED gap by the spec author. Every such annotation must have either a TC pinning current behavior with [OPEN FOR SPEC OWNER INPUT] markers, OR an explicit out-of-scope traceability note. This is the highest-leverage check: the spec author already told you it's missing. (Example: MoU SRS Comment: Need to add a case when Cassette is removed during power down... Part of MAP During power up SRS — but Powerup SRS Rev 3 didn't address it. Now covered by FIFAGEN-16207 and FIFAGEN-16208.)
h. Non-binary sensor/data states (Medium) — From the pre-pass catalog of "Example shows but doesn't require" sentinel values: if SRS examples or error messages mention values like unknown, unspecified, not applicable, <null>, the matrix must include a TC exercising those values (not just binary True/False / present/absent cases). (Example: R003.5a printed-map example showed unknown tokens — but no test exercised mismatch with unknown values.)
i. Same-instance vs new-instance ambiguity (Medium) — When SRS says "load a NEW cassette" / "send another request" / "open another session", check whether the matrix also covers "re-load the SAME instance". Usually implicitly the same behavior, but SRS rarely states so — ambiguity worth flagging.
j. Operator-interrupt scenarios (Low) — Power-down during power-up, cancel during a pending operation, second command while first in flight, browser-close during a multi-step UI flow. SRS rarely covers; mark [OPEN FOR SPEC OWNER INPUT] if recovery is undefined.
Output: feed back into the regular issues[] array with category: "SpecCompletenessGap". Severity defaults: Critical for (a) and (g); High for (b), (c), (d), (e); Medium for (f), (h), (i); Low for (j). Override per safety/business impact judgment.
AtomicSteps (NEW) — Every step must contain exactly one verification point. Flag any step whose expected_result bundles multiple independent assertions (e.g. a gRPC state read AND a PLC-tag value; a multi-signal snapshot like "Red BLINK, Green OFF, White ON"; "consumed AND no error logged"). suggested_fix = "split step into one step per verification". Severity: High — a bundled step defeats one-verification-per-step traceability and per-step pass/fail.
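A crude first-pass filter for bundled steps can split each expected_result on the separators used in the examples above (an uppercase AND, commas, semicolons). This is only a heuristic sketch with hypothetical function names; it will miss bundles phrased differently and can flag legitimate single assertions that contain a comma, so a reviewer still decides.

```python
import re

# Separators taken from the examples: "X AND Y", "Red BLINK, Green OFF".
_SPLIT = re.compile(r"\s+AND\s+|[,;]")


def bundled_assertions(expected_result):
    """Split an expected_result into its apparent independent assertions."""
    return [part.strip() for part in _SPLIT.split(expected_result) if part.strip()]


def atomic_step_issues(test_id, steps):
    """Emit one AtomicSteps issue per step that appears to bundle assertions."""
    issues = []
    for index, step in enumerate(steps, start=1):
        parts = bundled_assertions(step["expected_result"])
        if len(parts) > 1:
            issues.append({
                "category": "AtomicSteps",
                "severity": "High",
                "test_id": test_id,
                "description": f"Step {index} bundles {len(parts)} assertions",
                "suggested_fix": "split step into one step per verification",
            })
    return issues
```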
ConfigMatrix (NEW) — When the SRS defines a configuration parameter that changes the execution path (not merely a numeric/threshold value), every behavioral TC that traverses that path must exist for each value of the parameter — including the basic positive / happy-path cases, not just edge cases.
Example: Mapping by Robot | Mapping by Load Port — Robot issues an explicit Map after door-open; Load-Port auto-maps during the door-open event (and must close-then-reopen if the door is already open). These are genuinely different step sequences, so one config tested ≠ both covered.

Severity: High for a missing config-variant of any Positive/critical scenario; Medium for missing config-variant of a low-priority negative case. Emit one issue per (scenario, missing-config) pair; suggested_fix = "duplicate for with <device/mode>-specific step deltas".
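The "one issue per (scenario, missing-config) pair" rule can be sketched as a cross product minus the pairs already covered. The `scenario` and `config` fields on each TC are hypothetical, and the sketch simplifies the severity rule by treating every Negative scenario as low-priority.

```python
def config_matrix_issues(scenarios, config_values, test_cases):
    """scenarios maps scenario name -> "Positive" or "Negative"."""
    covered = {(tc["scenario"], tc["config"]) for tc in test_cases}
    issues = []
    for scenario, polarity in scenarios.items():
        for config in config_values:
            if (scenario, config) in covered:
                continue
            issues.append({
                "category": "ConfigMatrix",
                # Simplification: Positive -> High, anything else -> Medium.
                "severity": "High" if polarity == "Positive" else "Medium",
                "test_id": None,
                "description": f"{scenario} has no TC for config '{config}'",
            })
    return issues
```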
StateRealism: every TC's setup state must be reachable under the SRS-defined lifecycle order (e.g. load proceeds Present → Placed → Clamped → Docked → Open; unload is the strict reverse Closed → UnDocked → UnClamped → Placed=false → Present=false). A TC whose setup asserts a state that violates this order — e.g. "Docked but NOT Clamped" when Clamp precedes Dock — is unreal and must be flagged for removal, unless the SRS explicitly defines that partial/transitional state as reachable.

Severity: High for a TC asserting an unreachable state or impossible config (it misleads reviewers and wastes execution); Medium for a missing return-to-initial-state step. suggested_fix = "remove : state/config unreachable per SRS lifecycle <…>" OR "append restoring step to : <…>".
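For a strictly ordered lifecycle, a setup state is reachable only if its active flags form a prefix of the order. The sketch below encodes the load order quoted above; `is_reachable` is a hypothetical helper and it does not account for partial states that the SRS explicitly declares reachable.

```python
LOAD_ORDER = ["Present", "Placed", "Clamped", "Docked", "Open"]


def is_reachable(active_states, order=LOAD_ORDER):
    """True when no later lifecycle state is active while an earlier one is not."""
    flags = [state in active_states for state in order]
    # Reachable iff the flags never go from False back to True.
    return all(earlier or not later for earlier, later in zip(flags, flags[1:]))
```

"Docked but NOT Clamped" corresponds to `{"Present", "Placed", "Docked"}`, which this check rejects because Clamped precedes Docked.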
Presentation conventions (apply when emitting the matrix, not a blocking job): keep Positive and Negative tests separable (tag each TC Positive/Negative by intent — valid-condition behavior vs. error/exclusion/failure/anomaly/boundary-rejection — so the deliverable can be split into per-polarity tabs), and keep each Test Summary concise (short, scannable name; move full prose into the objective). Report only as severity: "Low" category: "StateRealism" notes if violated — do not block the loop.
| Severity | Examples |
|---|---|
| Critical | Missing requirement coverage; safety/compliance miss; broken state transition with no covering test; missing TC for any shipped default configuration value (the default ships to every customer); missing direct-call coverage of any documented public API method; any item explicitly listed as "🔴 Missing" / "Missing Critical Tests" / "Required" / "Recommended Test Enhancements" in the SRS itself (the SRS author already told you it's missing — failing to enumerate this is the reviewer's most damning miss); SpecCompletenessGap sub-checks (a) cross-spec contradiction / struck-through-text TC and (g) any Comment: / Need to add / Should we / Define / TBD annotation in the SRS without a covering TC — the spec author already self-identified the gap |
| High | Vague expected results ("works correctly", "is correct"); mapping error; missing edge case explicitly listed in source; missing rule in an N-rule priority ladder (every rule needs an isolated-collision TC); missing enum value reachability test (including no-rule enum values that should default OFF); missing direct-call test for an API method's error / oneof path; SRS Open Issue not addressed (no TC pinning current behavior AND no explicit out-of-scope flag); NFR section without any covering TC; SpecCompletenessGap sub-checks (b) multi-instance behavior, (c) hardware-failure path, (d) intermediate / partial-state recovery, (e) default-config persistence — these are production-required even when SRS is silent |
| Medium | Merge opportunity (especially intent-level overlap, not just step overlap); minor template-shape deviation; redundant test; documented PLC / device / hardware tag default value without an assertion in any TC; minor enum-value verbatim / typo string-match deviation (when verbatim match was a stated requirement); SpecCompletenessGap sub-checks (f) cross-session / cross-feature interaction, (h) non-binary sensor/data states, (i) same-instance vs new-instance ambiguity |
| Low | Stylistic phrasing; non-essential ordering; SpecCompletenessGap sub-check (j) operator-interrupt scenarios |
{
"verdict": "PASS" | "REVISE",
"iteration": 1,
"summary": "<one-line human summary>",
"issues": [
{
"category": "Coverage" | "Mapping" | "StateMachine" | "PriorityLadder" | "APIContract" | "MergeOpportunity" | "Template" | "DiagramCoverage" | "SpecCompletenessGap" | "AtomicSteps" | "ConfigMatrix" | "StateRealism",
"severity": "Critical" | "High" | "Medium" | "Low",
"test_id": "TC-001" | null,
"requirement_ids": ["R1", "R5"],
"description": "<what's wrong>",
"suggested_fix": "<concrete edit>"
}
],
"coverage_gaps": ["R7", "R9"],
"merge_suggestions": [
{"merge_into": "TC-002", "absorb": ["TC-005"], "reason": "<why>"}
]
}
Verdict logic for the reviewer: Return PASS only if NO issue has severity ≥ config.reviewer.severityThreshold (default High) AND coverage_gaps[] is empty. Otherwise REVISE.
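The verdict rule can be expressed as a small function over the reviewer's JSON. This is a sketch: `compute_verdict` and the numeric ranking are illustrative, with the severity order taken from the rubric (Low < Medium < High < Critical).

```python
SEVERITY_RANK = {"Low": 0, "Medium": 1, "High": 2, "Critical": 3}


def compute_verdict(issues, coverage_gaps, severity_threshold="High"):
    """PASS only if no issue reaches the threshold and there are no coverage gaps."""
    threshold = SEVERITY_RANK[severity_threshold]
    blocking = any(SEVERITY_RANK[issue["severity"]] >= threshold for issue in issues)
    return "REVISE" if blocking or coverage_gaps else "PASS"
```

A run with only Medium and Low issues and an empty coverage_gaps[] passes at the default threshold; a single uncovered requirement ID forces REVISE even when issues[] is empty.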
Default IDs (override in config.json per tenant):
| Field ID | Name | Purpose |
|---|---|---|
| customfield_10014 | Epic Link | Parent epic |
| customfield_11985 | Manual Test Steps | Legacy ADF steps field |
| customfield_12591 | Rovo Manual Steps | Rovo agent field |
Configurable via config.linkTypes.testLinkName (default: "Test"). Discover available types per tenant via mcp__atlassian__getIssueLinkTypes.
| Link Type Name | Inward Label | Outward Label | Use For |
|---|---|---|---|
| Test (default) | is tested by | tests | Standard Jira test linking — Epic page shows "is tested by " |
| Epic-Test Link | Epic Tested By | Test for Epic | Tenant-specific variant (some legacy projects) |
API directionality reminder (mcp__atlassian__createIssueLink):
- inwardIssue = the issue whose role is the outward label of the type (the active subject)
- outwardIssue = the issue whose role is the inward label of the type (the passive object)

So for Test type, to make Epic display "is tested by Test":
inwardIssue = <TEST_KEY> # Test "tests" the Epic
outwardIssue = <EPIC_KEY> # Epic "is tested by" the Test
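Because the direction is easy to invert, it can help to build the arguments in one place and verify the result by reading the link back. In this sketch the `type` argument name is an assumption about the tool's parameters, both helper names are hypothetical, and the read-back assumes the standard Jira issue shape where each `fields.issuelinks` entry carries an `outwardIssue` object with a `key`.

```python
def build_test_link_args(test_key, epic_key, link_type="Test"):
    """Arguments for createIssueLink so the Epic shows 'is tested by <test>'."""
    return {
        "type": link_type,         # assumed parameter name
        "inwardIssue": test_key,   # the Test "tests" the Epic
        "outwardIssue": epic_key,  # the Epic "is tested by" the Test
    }


def link_points_at_epic(test_issue, epic_key):
    """Read-back check on the fetched test issue JSON."""
    links = test_issue.get("fields", {}).get("issuelinks", [])
    return any(
        link.get("outwardIssue", {}).get("key") == epic_key for link in links
    )
```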
| Action | Method + Endpoint |
|---|---|
| Authenticate | POST https://xray.cloud.getxray.app/api/v2/authenticate body: {"client_id":"...","client_secret":"..."} |
| Get steps | GET https://xray.cloud.getxray.app/api/v2/test/<KEY>/steps |
| Set steps | PUT https://xray.cloud.getxray.app/api/v2/test/<KEY>/steps body: array of {action, data, result} |
Get API credentials: Jira → Settings → Apps → Xray → API Keys → Create new key.
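The endpoints in the table above can be exercised with the standard library alone. The sketch below only builds the requests and sends nothing; the function names are hypothetical, and passing the token as an `Authorization: Bearer` header is an assumption not stated in this document.

```python
import json
import urllib.request

XRAY_BASE = "https://xray.cloud.getxray.app/api/v2"


def build_auth_request(client_id, client_secret):
    """POST /authenticate with the client credentials as a JSON body."""
    body = json.dumps({"client_id": client_id, "client_secret": client_secret})
    return urllib.request.Request(
        f"{XRAY_BASE}/authenticate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_set_steps_request(token, test_key, steps):
    """PUT /test/<KEY>/steps with an array of {action, data, result} objects."""
    return urllib.request.Request(
        f"{XRAY_BASE}/test/{test_key}/steps",
        data=json.dumps(steps).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
        method="PUT",
    )
```

Each request would be sent with `urllib.request.urlopen(request)`; since tokens expire after roughly 15 minutes, a long batch should re-authenticate on a 401 rather than reuse a cached token.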
| Scenario | Action |
|---|---|
| Config or credentials file missing | Abort with setup instructions pointing to references/README.md |
| Credentials contain placeholder <...> values | Abort with "fill in credentials.json" message |
| File not found / unsupported extension | Report and re-prompt |
| PDF >20 pages without page range | Ask user for page range |
| Jira fetch fails | Report HTTP error, check API token & cloudId |
| Xray import URL 404 | Verify project.id in URL matches the project key |
| Playwright file upload selector not found | Try alternative selectors (input[type="file"], [data-testid*="file"]), screenshot if still missing |
| "Begin Import" button disabled | Read validation errors on screen, report to user |
| Xray API auth 401 | Re-check clientId/clientSecret; tokens expire ~15min, re-authenticate |
| Agent batch failure | Continue remaining agents; offer retry in Step 9 |
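The placeholder check in the table above can be sketched as a scan for values still wrapped in angle brackets. The helper name is hypothetical and the sketch assumes a flat credentials object; nested structures would need a recursive walk.

```python
import re

PLACEHOLDER = re.compile(r"^<.*>$")


def placeholder_fields(credentials):
    """Names of credential fields whose value still looks like <...>."""
    return sorted(
        key
        for key, value in credentials.items()
        if isinstance(value, str) and PLACEHOLDER.match(value.strip())
    )
```

A non-empty result would trigger the abort with the "fill in credentials.json" message before any network call is attempted.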
- Do not modify importConfiguration.json (it's the contract with the Xray importer)
- Never commit config.json or ~/.claude/.xray-credentials.json to git

/xray-tests FIFAGEN-2872
→ skill fetches epic + linked stories
→ presents 8-test matrix, user APPROVES
→ user picks mode "1" (CSV only)
→ writes output/TestCases_FIFAGEN-2872_20260511-143022.csv
→ user picks "no" to Playwright upload
→ summary shows file path
/xray-tests ./examples/sample-requirements.md
→ user provides Epic FIFAGEN-3001 when asked
→ matrix approved
→ user picks mode "3" (Both)
→ CSV written → user picks "no" to Playwright (will use API path instead)
→ pilot test created in Jira → user approves
→ 7 parallel agents create remaining tests
→ summary shows CSV path + Jira key table
/xray-tests https://...atlassian.net/wiki/spaces/PROJ/pages/12345
→ matrix approved
→ mode "1" (CSV only)
→ CSV written
→ user picks "yes" to Playwright upload
→ skill navigates to Xray, handles SSO, uploads files, runs import wizard
→ summary lists Jira keys read from import results page
/xray-tests --dry-run
→ skill reads config + credentials, redacts secrets, prints resolved values
→ verifies importConfiguration.json is valid
→ exits without any creation
/xray-tests FIFAGEN-2872
→ matrix drafted at Step 4 (8 tests)
→ Step 4.5 iter 1: reviewer fetches template (FIFAGEN-99999), finds 2 coverage gaps
(R5, R7), 1 mapping issue on TC-003, 1 merge opportunity (TC-006↔TC-008)
verdict=REVISE
→ generator refines: adds TC-009 for R5+R7, fixes TC-003 requirement_ids,
merges TC-008 into TC-006
→ Step 4.5 iter 2: reviewer PASS — no issues at severity ≥ High, no coverage gaps
→ Step 5: user picks mode "1" (CSV only)
→ CSV written; 8 tests (1 added, 1 merged in)
/xray-tests ./reqs.md
→ matrix drafted (5 tests)
→ Step 4.5 iter 1: REVISE — StateMachine issue: "error→idle transition missing"
→ generator adds TC-006 for error→idle
→ Step 4.5 iter 2: REVISE — same StateMachine issue persists; reviewer says
TC-006's expected_result doesn't actually verify the transition fires
→ generator rewrites TC-006 step 3 expected_result
→ Step 4.5 iter 3: REVISE — same issue; reviewer claims source diagram shows
a SECOND error→idle transition under a different precondition
→ Escalation menu shown. User replies "accept: state-machine-transition-2"
(acknowledges the gap is intentional — second transition is documented
elsewhere as out-of-scope for this epic)
→ Step 5: user picks mode "3" (Both)
npx claudepluginhub swadhatripathi/xray-test-suite-skills --plugin xray-test-suite