From claudecode-research-harness-workflow
Read-only review of research outputs checking identification, numerical accuracy, causal claims, and reproducibility. Produces a structured report with APPROVE/REQUEST_CHANGES/BLOCK verdict.
How this skill is triggered — by the user, by Claude, or both
Slash command
/claudecode-research-harness-workflow:research-harness-review [--quick] [--task TASK-ID]
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Perform an independent, read-only review of research outputs before release.
This skill reads existing scripts, logs, and outputs. It does not run code. It does not edit scripts or data. It produces a structured review report with a verdict.
This skill runs after /research-harness-work and before /research-harness-release.
| Input | Action |
|---|---|
| /research-harness-review | Full review of all cc:done tasks in analysis_plan.md |
| /research-harness-review --quick | Abbreviated review: identification + numerical accuracy only |
| /research-harness-review --task 2.1 | Review a single task |
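The three invocation forms in the table amount to a small argument grammar. The sketch below models it with `argparse` purely for illustration; `parse_review_args` is a hypothetical helper, and the skill itself is driven by the slash command rather than by a script like this.

```python
import argparse

def parse_review_args(argv):
    """Parse the flags from the table above (hypothetical helper, not part of the skill)."""
    parser = argparse.ArgumentParser(prog="research-harness-review")
    parser.add_argument("--quick", action="store_true",
                        help="abbreviated review: identification + numerical accuracy only")
    parser.add_argument("--task", metavar="TASK-ID", default=None,
                        help="review a single task, e.g. 2.1")
    return parser.parse_args(argv)

args = parse_review_args(["--task", "2.1"])
print(args.quick, args.task)  # prints: False 2.1
```

With no flags, `quick` is `False` and `task` is `None`, which corresponds to the full review in the first table row.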
- Read analysis_plan.md. If it does not exist, stop.
- Find the tasks marked cc:done. If none, report that no completed tasks exist to review.
- List the cc:done tasks. These are the review scope.
- Read study_spec.md, reports/data_audit_report.md, reports/data_cleaning_report.md, and reports/merge_report.md (if it exists).

This skill is read-only throughout. No Bash commands, no script execution, no file writes except the review report.
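Selecting the review scope comes down to filtering the plan for tasks marked cc:done. The following is a minimal sketch under stated assumptions: the layout of analysis_plan.md is not shown here, so it assumes each task sits on one line containing a dotted task ID such as 2.1 and its status marker. The `done_task_ids` helper, the example rows, and the `cc:todo` marker are all hypothetical.

```python
import re

# Assumption: task IDs are dotted numbers like "2.1" and the status marker
# "cc:done" appears on the same line as the ID.
TASK_ID = re.compile(r"\b(\d+(?:\.\d+)+)\b")

def done_task_ids(plan_text):
    """Return the IDs of all tasks whose line carries the cc:done marker."""
    ids = []
    for line in plan_text.splitlines():
        if "cc:done" not in line:
            continue
        match = TASK_ID.search(line)
        if match:
            ids.append(match.group(1))
    return ids

# Hypothetical plan excerpt, for illustration only.
example_plan = """\
| 1.1 | Clean raw survey data | cc:done |
| 2.1 | Main regression | cc:done |
| 2.2 | Robustness checks | cc:todo |
"""
print(done_task_ids(example_plan))  # prints: ['1.1', '2.1']
```

An empty result corresponds to the "no completed tasks exist to review" case.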
Read study_spec.md §2 (identification strategy).
For each main model task in analysis_plan.md, check:
Assign one of: strong / moderate / weak / insufficient
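An insufficient rating immediately rules out APPROVE: the verdict is at least REQUEST_CHANGES, and BLOCK under the verdict rules when the identification is insufficient for the causal claims made. Do not classify findings as minor if the identification is insufficient for the causal claim being made.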
For each cc:done analysis task:
- Compare the implemented analysis against study_spec.md §4 and §5.
- Classify each deviation as minor (changes that do not affect the main result), major (changes that affect the result), or critical (changes that contradict the approved study design).

For each cc:done task with an output file:
- Mark any number that cannot be traced to an existing log or output as unverified.

An unverified number is a critical finding if it will appear in the final reported results.
Do not verify numbers by re-running scripts. Only read existing logs and outputs.
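Tracing a reported number to a log is a read-only text lookup. Here is a minimal sketch, assuming reported values are compared as strings against the raw log text; `appears_in_log`, `unverified_numbers`, and the log excerpt are hypothetical illustrations, not part of the skill.

```python
import re

def appears_in_log(value, log_text):
    """True if `value` occurs in the log as a whole number token."""
    # The lookarounds stop "0.14" from matching inside "0.142"
    # and "812" from matching inside "4812".
    pattern = r"(?<![\d.])" + re.escape(value) + r"(?!\d)"
    return re.search(pattern, log_text) is not None

def unverified_numbers(reported, log_text):
    """Return the reported values (as strings) that the log does not contain."""
    return [value for value in reported if not appears_in_log(value, log_text)]

# Hypothetical log excerpt and reported values, for illustration only.
log_excerpt = "coef = 0.142 (se 0.031), N = 4812"
print(unverified_numbers(["0.142", "0.031", "4812", "0.15"], log_excerpt))
# prints: ['0.15']
```

Exact matching is deliberately strict: a value rounded differently in the table than in the log shows up as unverified and needs a human check, not a silent pass.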
- Classify each discrepancy in N as minor (small difference with a plausible explanation), major (large unexplained difference), or critical (N is clearly wrong).

Read any interim outputs, table notes, or text summaries (if present in output/ or reports/).
For every causal claim found:
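- Identify which evidence label applies to the claim ([descriptive], [correlational], [quasi-experimental: ...], [experimental]).
- Record claims that overstate their evidence as major findings.

Do not rewrite weak evidence as strong causal evidence. If the evidence is correlational, the claim must be correlational.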
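The four evidence labels can be recognised with a small pattern. This is a sketch only: it assumes a label appears inline in the claim text in exactly the bracketed form listed above, and `evidence_label` and the example claims are hypothetical.

```python
import re

# Matches [descriptive], [correlational], [experimental], and
# [quasi-experimental: <design>] (the design text after the colon is free-form).
LABEL = re.compile(
    r"\[(descriptive|correlational|experimental|quasi-experimental)(?::[^\]]*)?\]"
)

def evidence_label(claim):
    """Return the evidence label found in a claim, or None if it is unlabeled."""
    match = LABEL.search(claim)
    return match.group(1) if match else None

# Hypothetical claims, for illustration only.
print(evidence_label("[correlational] Tenure is associated with higher wages."))
# prints: correlational
print(evidence_label("Training caused higher wages."))
# prints: None
```

A claim that uses causal language but returns `None` or `correlational` is a candidate for the overstated count in the review summary.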
- reports/data_cleaning_report.md exists and its verification section is PASS
- reports/merge_report.md exists (if merges were performed) and all entries have pre/post row counts
- data/raw/ was not modified (if git status data/raw/ or equivalent is available from prior logs, read it)

Copy templates/review_report.md to reports/review_report.md and fill in all sections from Steps 1–7.
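The existence checks on the cleaning and merge reports can be sketched as a read-only lookup. The `missing_reports` helper is hypothetical, and it covers only whether the files exist; the PASS status and the pre/post row counts still have to be read from the reports themselves, whose internal format is not specified here.

```python
import tempfile
from pathlib import Path

def missing_reports(project_root, merges_performed):
    """Return the report paths the review expects but cannot find (read-only)."""
    root = Path(project_root)
    required = ["reports/data_cleaning_report.md"]
    if merges_performed:
        # The merge report is only required when merges were performed.
        required.append("reports/merge_report.md")
    return [rel for rel in required if not (root / rel).is_file()]

# Demo in a throwaway directory with placeholder content, for illustration only.
with tempfile.TemporaryDirectory() as tmp:
    reports_dir = Path(tmp) / "reports"
    reports_dir.mkdir()
    (reports_dir / "data_cleaning_report.md").write_text("placeholder\n")
    print(missing_reports(tmp, merges_performed=True))
    # prints: ['reports/merge_report.md']
```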
Verdict rules:
| Condition | Verdict |
|---|---|
| No critical or major findings | APPROVE |
| One or more major findings (but no critical) | REQUEST_CHANGES |
| One or more critical findings | BLOCK |
| Identification is insufficient for the causal claims made | BLOCK |
| Any result cannot be traced to a log | BLOCK |
BLOCK is a stronger form of REQUEST_CHANGES. It means the research cannot be released in any form until the finding is resolved.
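The verdict rules form a strict precedence: any BLOCK condition wins, then REQUEST_CHANGES, then APPROVE. Here is a minimal sketch of that precedence; the `decide_verdict` function and its parameter names are hypothetical and only restate the table.

```python
def decide_verdict(critical, major, identification_insufficient=False,
                   untraced_results=0):
    """Map finding counts to a verdict, following the verdict table.

    critical, major: counts of findings at each severity.
    identification_insufficient: True if identification cannot support
        the causal claims made.
    untraced_results: number of results that cannot be traced to a log.
    """
    if critical > 0 or identification_insufficient or untraced_results > 0:
        return "BLOCK"
    if major > 0:
        return "REQUEST_CHANGES"
    # Minor findings alone do not prevent approval.
    return "APPROVE"

print(decide_verdict(critical=0, major=0))                    # prints: APPROVE
print(decide_verdict(critical=0, major=2))                    # prints: REQUEST_CHANGES
print(decide_verdict(critical=1, major=2))                    # prints: BLOCK
print(decide_verdict(critical=0, major=0, untraced_results=1))  # prints: BLOCK
```

Checking the BLOCK conditions first means a mix of major and critical findings can never be downgraded to REQUEST_CHANGES.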
Print a review summary:
Research Harness Review — Complete
Tasks reviewed: N
Scope: [task IDs]
Identification credibility: strong / moderate / weak / insufficient
Numerical accuracy: all verified / N unverified
Causal claims: all appropriate / N overstated
Critical findings: N
Major findings: N
Minor findings: N
Verdict: APPROVE / REQUEST_CHANGES / BLOCK
Review report: reports/review_report.md
If verdict is REQUEST_CHANGES or BLOCK:
Required actions before /research-harness-release:
1. [Finding 1 — required action]
2. [Finding 2 — required action]
Return to /research-harness-work to re-execute affected tasks, then re-run /research-harness-review.
- Never issue APPROVE when any critical finding exists
- reports/review_report.md exists with all sections populated
- The verdict is one of APPROVE, REQUEST_CHANGES, BLOCK
- All in-scope cc:done tasks reviewed
- reports/review_report.md written

If verdict is APPROVE:
Review passed. Run /research-harness-release to package the replication archive.
If verdict is REQUEST_CHANGES or BLOCK:
Return to /research-harness-work to resolve the findings listed in reports/review_report.md. After re-executing affected tasks, run /research-harness-review again. Do not run /research-harness-release until the verdict is APPROVE.
npx claudepluginhub maxwell2732/claudecode-research-harness-workflow --plugin claudecode-research-harness-workflow