From rn-launch-harness
Evaluates React Native projects built with the RN Launch Harness pipeline against Anthropic's harness principles: reviews artifacts and git history, and optionally verifies that the app runs via Expo on iOS/Android simulators.
Slash command
/rn-launch-harness:rn-harness-retro
Retrospective analysis of a project built with the RN Launch Harness pipeline. Evaluates the app against Anthropic's harness design principles and optionally verifies the app runs correctly.
"Every component in a harness encodes an assumption about what the model can't do on its own, and those assumptions are worth stress testing." — Anthropic
/rn-harness-retro
Parse $ARGUMENTS for:
[project-dir] - Target project path (default: current directory)
--run-app - Attempt to build and run the app and verify it works on a simulator
--deep - Deep analysis including source code review
Read every file in docs/harness/ and docs/specs/:
docs/harness/state.md → Pipeline state
docs/harness/config.md → Configuration
docs/harness/contract.md → Contract (Generator-Evaluator agreement)
docs/harness/build-log.md → Build log (rounds, scores)
docs/harness/pipeline-log.md → Pipeline event log
docs/harness/plans/ → PRD and planning documents
docs/harness/feedback/ → QA feedback (all rounds)
docs/harness/handoff/ → Handoff documents (all rounds)
docs/harness/screenshots/ → Screenshots taken during evaluation
docs/specs/ → Spec files and progress dashboard
Read all files and reconstruct the full pipeline history.
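Before reconstructing the history, it can help to know which artifacts actually exist. The following is a minimal sketch, assuming the layout listed above; the function name `inventory_artifacts` and its return shape are illustrative and not part of the harness.

```python
from pathlib import Path

# Paths taken from the list above; entries ending in "/" are directories.
EXPECTED_ARTIFACTS = [
    "docs/harness/state.md",
    "docs/harness/config.md",
    "docs/harness/contract.md",
    "docs/harness/build-log.md",
    "docs/harness/pipeline-log.md",
    "docs/harness/plans/",
    "docs/harness/feedback/",
    "docs/harness/handoff/",
    "docs/harness/screenshots/",
    "docs/specs/",
]


def inventory_artifacts(project_dir: str) -> dict[str, list[str]]:
    """Split the expected artifacts into 'present' and 'missing' lists."""
    root = Path(project_dir)
    result: dict[str, list[str]] = {"present": [], "missing": []}
    for rel in EXPECTED_ARTIFACTS:
        path = root / rel
        # Directories must exist as directories, files as files.
        ok = path.is_dir() if rel.endswith("/") else path.is_file()
        result["present" if ok else "missing"].append(rel)
    return result
```

Missing entries are worth noting in the report, since a missing artifact limits what the corresponding principle can be rated on.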
git log --oneline --all
git log --format="%h %s" --since="$(grep -m1 created_at docs/harness/state.md | cut -d' ' -f2)"
Trace the build/QA cycle from commit messages. Identify:
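One way to get a first overview of the cycle is to group commits by their subject prefix. This sketch assumes subjects follow a conventional-commit style such as `feat: ...` or `fix(scope): ...`; that convention is an assumption, and the function name `group_commits_by_prefix` is hypothetical. Input lines are expected in the `%h %s` format used above.

```python
from collections import defaultdict


def group_commits_by_prefix(log_lines: list[str]) -> dict[str, list[str]]:
    """Group `git log --format="%h %s"` lines by commit-subject prefix.

    Assumption: subjects look like "feat: ..." or "fix(scope): ...".
    Subjects without such a prefix are collected under "other".
    """
    groups: dict[str, list[str]] = defaultdict(list)
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        # "%h %s" puts the abbreviated hash first, then the subject.
        commit_hash, _, subject = line.partition(" ")
        prefix, sep, _ = subject.partition(":")
        # Drop an optional "(scope)" suffix from the prefix.
        kind = prefix.split("(")[0].strip().lower()
        if not sep or not kind.isalpha():
            kind = "other"
        groups[kind].append(commit_hash)
    return dict(groups)
```

A large "other" group suggests the project does not use this convention, in which case the commit subjects need to be read directly.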
If --run-app is specified:
npm install
npx tsc --noEmit - must pass with 0 errors
npm run lint - must pass with 0 errors
npx expo start
npx expo run:ios - verify app launches without crash
npx expo run:android - verify app launches without crash (if available)
Record all results for the retro report.
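The non-interactive checks in this sequence can be recorded mechanically. The sketch below is one possible way to do that, not the harness's own implementation. It assumes a non-zero exit code means the check failed, and it leaves out `npx expo start` and the simulator runs because they are long-lived and need manual or tool-assisted observation. The names `CHECKS` and `run_checks` are illustrative.

```python
import subprocess

# Non-interactive checks from the sequence above.
CHECKS = [
    ("npm install", ["npm", "install"]),
    ("TypeScript check", ["npx", "tsc", "--noEmit"]),
    ("Lint check", ["npm", "run", "lint"]),
]


def run_checks(checks, cwd="."):
    """Run each command in order and record PASS/FAIL plus a short note."""
    results = []
    for name, argv in checks:
        try:
            proc = subprocess.run(argv, cwd=cwd, capture_output=True, text=True)
            status = "PASS" if proc.returncode == 0 else "FAIL"
            # Keep only the last output line as a short note.
            lines = (proc.stdout + proc.stderr).strip().splitlines()
            notes = lines[-1] if lines else ""
        except FileNotFoundError as exc:
            # The executable itself (npm, npx) is not installed.
            status, notes = "FAIL", str(exc)
        results.append({"item": name, "result": status, "notes": notes})
    return results
```

Calling `run_checks(CHECKS, cwd=project_dir)` returns one record per check, which can be copied into the retro report.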
Reference: Harness design for long-running application development
Evaluate the pipeline's adherence to each of the following 9 principles. For each principle, check the evidence, apply the checklist, and assign a star rating.
"Separating the agent doing the work from the agent judging it proves to be a strong lever."
Checklist:
docs/harness/ files)
Evidence sources: pipeline-log.md for generator/evaluator alternation, feedback/ files for FAIL verdicts
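As a rough aid for the alternation check, the sketch below flags positions where the same role appears twice in a row. It assumes the roles have already been extracted from pipeline-log.md into a list; the log format is not specified here, so parsing is left out, and the function name is hypothetical.

```python
def alternation_violations(roles: list[str]) -> list[int]:
    """Return indices where a role repeats the previous entry's role.

    `roles` is a hypothetical, already-extracted sequence such as
    ["generator", "evaluator", ...].
    """
    return [i for i in range(1, len(roles)) if roles[i] == roles[i - 1]]
```

A flagged index is only a prompt for manual review, since one agent may legitimately log several consecutive events.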
"Out of the box, Claude is a poor QA agent... would identify legitimate issues, then talk itself into deciding they weren't a big deal."
Checklist:
Evidence sources: feedback/ files score trends, specificity of FAIL reasons
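To look at score trends, a small summary of per-round scores may be enough. This sketch assumes the scores have been read from the feedback/ files into a list of numbers; the scale and the function name `summarize_scores` are assumptions.

```python
def summarize_scores(scores: list[float]) -> dict:
    """Summarize per-round QA scores (hypothetical numeric scale)."""
    deltas = [b - a for a, b in zip(scores, scores[1:])]
    return {
        "rounds": len(scores),
        "deltas": deltas,
        # True when no round scored lower than the one before it.
        "monotonic_non_decreasing": all(d >= 0 for d in deltas),
        "first": scores[0] if scores else None,
        "last": scores[-1] if scores else None,
    }
```

A high first-round score with no later drops may deserve a closer look at how specific the recorded FAIL reasons are.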
"The generator proposed what it would build and how success would be verified, and the evaluator reviewed that proposal."
Checklist:
Evidence sources: contract.md criteria count and coverage mapping
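Coverage mapping can be sketched as a lookup from contract criteria to the features they verify. The criterion IDs, feature names, and the function name below are hypothetical; contract.md may organize this information differently.

```python
def uncovered_features(features: list[str], criteria: dict[str, str]) -> list[str]:
    """List features that no contract criterion maps to.

    `criteria` maps a criterion ID to the feature it verifies.
    """
    covered = set(criteria.values())
    return [f for f in features if f not in covered]
```

For example, with features `["login", "feed", "settings"]` and criteria that only mention `login` and `feed`, the function returns `["settings"]`.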
"Communication was handled via files: one agent would write a file, another agent would read it."
Checklist:
Evidence sources: handoff/ file existence, state.md change history
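Checking handoff file existence per round could look like the following sketch. It assumes each handoff filename contains its round number (for instance a hypothetical `round-2.md`); the real naming scheme is not given here, so adjust the pattern to whatever the handoff/ directory actually contains.

```python
import re


def rounds_missing_handoff(total_rounds: int, filenames: list[str]) -> list[int]:
    """Return round numbers (1-based) that have no matching handoff file.

    Assumption: only the first integer in each filename is the round number.
    """
    seen = set()
    for name in filenames:
        match = re.search(r"\d+", name)
        if match:
            seen.add(int(match.group()))
    return [n for n in range(1, total_rounds + 1) if n not in seen]
```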
"I started by removing the sprint construct entirely." "The model handled 2+ hours of coherent building without sprint decomposition."
Checklist:
Evidence sources: build-log.md round structure
"The evaluator would navigate the page on its own, screenshotting and carefully studying the implementation."
Checklist:
Evidence sources: screenshot files, feedback references to visual verification
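One simple cross-check is whether each screenshot is mentioned anywhere in the feedback. The sketch below assumes feedback documents refer to screenshots by filename, which may not hold for every project; the function name is hypothetical.

```python
def unreferenced_screenshots(
    screenshot_names: list[str], feedback_texts: list[str]
) -> list[str]:
    """Return screenshot filenames that no feedback document mentions."""
    combined = "\n".join(feedback_texts)
    return [name for name in screenshot_names if name not in combined]
```

Screenshots that are never referenced suggest they were captured but possibly not studied.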
"Find the simplest solution possible, and only increase complexity when needed."
Checklist:
Evidence sources: pipeline-log.md phase durations, each phase's actual contribution
A 20x cost increase should be justified by a comparable (20x) improvement in output quality.
Checklist:
Evidence sources: build-log.md score trends, total duration
Comprehensive testing should avoid redundant test cases across evaluation passes.
Checklist:
Evidence sources: feedback/ files test case comparison across rounds
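Overlap between the test cases of two rounds can be quantified with Jaccard similarity. This is a sketch under the assumption that test cases can be reduced to comparable strings (for example normalized titles); the function name `round_overlap` is illustrative.

```python
def round_overlap(cases_a: set[str], cases_b: set[str]) -> float:
    """Jaccard similarity between the test cases of two rounds (0.0 to 1.0)."""
    union = cases_a | cases_b
    if not union:
        return 0.0
    return len(cases_a & cases_b) / len(union)
```

Some overlap is expected when earlier failures are retested, so a high value is a signal to inspect the rounds, not a verdict on its own.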
If --run-app was specified, compile the results from Phase 1.3:
| Item | Result | Notes |
|---|---|---|
| npm install | PASS/FAIL | [error details] |
| TypeScript check | PASS/FAIL | [error count] |
| Lint check | PASS/FAIL | [error count] |
| iOS Simulator launch | PASS/FAIL | [crash details] |
| Android Emulator launch | PASS/FAIL | [crash details] |
| Contract spot-check | N/M PASS | [failed items] |
| Console errors | N found | [severity] |
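If the results are held as records, the table above can be generated instead of written by hand. The sketch assumes each record is a dict with `item`, `result`, and optional `notes` keys; that shape and the function name are assumptions for illustration.

```python
def render_results_table(rows: list[dict]) -> str:
    """Render verification rows as a Markdown table like the one above."""
    lines = ["| Item | Result | Notes |", "|---|---|---|"]
    for row in rows:
        # Escape pipes so free-form notes cannot break the table layout.
        notes = str(row.get("notes", "")).replace("|", "\\|")
        lines.append(f"| {row['item']} | {row['result']} | {notes} |")
    return "\n".join(lines)
```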
Generate docs/harness/retro.md:
# Harness Retrospective — [Project Name]
> Generated: [date]
> Pipeline duration: [start ~ end]
> Total rounds: [N]
> Final QA score: [score]
## App Verification Results
| Item | Result | Notes |
|------|--------|-------|
| Build | PASS/FAIL | [error details] |
| TypeScript | PASS/FAIL | [error count] |
| Lint | PASS/FAIL | [error count] |
| iOS Launch | PASS/FAIL | [details] |
| Android Launch | PASS/FAIL | [details] |
| Contract Spot-check | N/M PASS | [failed items] |
| Console Errors | N found | [severity] |
## Anthropic Principle Adherence
| Principle | Rating | Summary |
|-----------|--------|---------|
| P1. Generator-Evaluator Separation | ★★★☆☆ | [one-line assessment] |
| P2. Evaluator Skepticism | ★★★☆☆ | [one-line assessment] |
| P3. Contract Negotiation | ★★★☆☆ | [one-line assessment] |
| P4. File-Based Handoff | ★★★☆☆ | [one-line assessment] |
| P5. No Sprints (V2) | ★★★☆☆ | [one-line assessment] |
| P6. Screenshot-and-Study | ★★★☆☆ | [one-line assessment] |
| P7. Simplicity | ★★★☆☆ | [one-line assessment] |
| P8. Cost-Quality Tradeoff | ★★★☆☆ | [one-line assessment] |
| P9. Test Deduplication | ★★★☆☆ | [one-line assessment] |
### Detailed Analysis per Principle
#### P1. Generator-Evaluator Separation
**Rating: ★★★☆☆**
[Evidence-based detailed analysis. Include checklist results.]
... (repeat for P2 through P9)
## Keep (What Went Well)
1. [Specific example with evidence]
2. ...
## Improve (What Needs Improvement)
1. [Specific example with improvement direction]
2. ...
## Try (Experiments for Next Project)
1. [Ideas to experiment with in the next pipeline run]
2. ...
## Harness Improvement Suggestions
[Based on this project's experience, improvement points for rn-launch-harness itself]
| Rating | Meaning |
|---|---|
| ★★★★★ | Principle perfectly implemented. On par with Anthropic's reference examples. |
| ★★★★☆ | Principle well followed. Minor room for improvement. |
| ★★★☆☆ | Basics followed but some gaps or superficial adherence. |
| ★★☆☆☆ | Principle only partially followed. Substantive improvement needed. |
| ★☆☆☆☆ | Principle barely implemented. |
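The star strings used in the report can be produced from a numeric rating. A minimal sketch, assuming ratings are integers from 1 to 5 as in the table above; the function name `stars` is illustrative.

```python
def stars(rating: int) -> str:
    """Render a 1-5 rating as filled and empty stars."""
    if not 1 <= rating <= 5:
        raise ValueError("rating must be between 1 and 5")
    return "★" * rating + "☆" * (5 - rating)
```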
After writing the retro report:
docs: harness retrospective report
If --deep was used, include specific code-level observations in the report