By JrSneed28 Verified
Agentic SDLC Orchestrator: drive a vague idea through tier-scaled SDLC stages, producing governing artifacts and slice-by-slice builds. Hybrid design — prompt orchestration (16 specialized agents) plus a deterministic `th` CLI for mechanical enforcement (state, hashing, anchors, traceability, coverage, drift, cascade-staleness).
Verify or report TwinHarness REQ-ID coverage — check gates that every MVP REQ maps to ≥1 slice and ≥1 test (hard gate), or report planned/implemented/tested/passing breakdown per REQ-ID.
Approve, reject, or supersede a recorded TwinHarness decision — interactive TTY gate that transitions a proposed decision to approved/rejected or marks an approved one superseded; intentionally HUMAN-ONLY and never an MCP tool.
Run the TwinHarness self-diagnostic — audits env, state validity, artifact hashes, coverage wiring, slice statuses, and revise-loop health in a single pass.
Review and ratify the TwinHarness drift log — async derived-layer changes and open blocking escalations.
Surface TwinHarness blocking escalations that need a human decision before work can complete.
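The coverage hard gate described above (every MVP REQ maps to at least one slice and at least one test) can be illustrated with a minimal sketch. The `ReqCoverage` shape, the field names, and the `REQ-00x` IDs are assumptions made for illustration; this listing does not show the real `th` state schema or its implementation.

```typescript
// Hypothetical record shape; the actual TwinHarness schema is not shown here.
interface ReqCoverage {
  reqId: string;    // illustrative ID format, e.g. "REQ-001"
  mvp: boolean;     // true when the requirement is in MVP scope
  slices: string[]; // slice IDs planned to implement the requirement
  tests: string[];  // tests carrying this REQ-ID anchor
}

interface GateResult {
  ok: boolean;
  missingSlice: string[]; // MVP REQ-IDs with no slice
  missingTest: string[];  // MVP REQ-IDs with no test
}

// Hard gate: every MVP requirement needs >= 1 slice and >= 1 test.
// Non-MVP requirements are ignored by the gate.
function checkCoverage(reqs: ReqCoverage[]): GateResult {
  const mvp = reqs.filter((r) => r.mvp);
  const missingSlice = mvp.filter((r) => r.slices.length === 0).map((r) => r.reqId);
  const missingTest = mvp.filter((r) => r.tests.length === 0).map((r) => r.reqId);
  return {
    ok: missingSlice.length === 0 && missingTest.length === 0,
    missingSlice,
    missingTest,
  };
}

// Illustrative data only.
const sampleReqs: ReqCoverage[] = [
  { reqId: "REQ-001", mvp: true, slices: ["S1"], tests: ["adds a book"] },
  { reqId: "REQ-002", mvp: true, slices: ["S1"], tests: [] },
  { reqId: "REQ-003", mvp: false, slices: [], tests: [] },
];

const coverageResult = checkCoverage(sampleReqs);
console.log(coverageResult);
```

In this sketch the gate fails because `REQ-002` has a slice but no test, while the uncovered non-MVP `REQ-003` does not affect the result.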
The TwinHarness Builder agent (spec §6.4) — tool + parallelism isolation. Holds write-to-codebase, run-tests, and run-checks tools the other agents lack. Multiple Builders may run in parallel on independent (disjoint-component) slices. Implements one slice at a time, one task at a time, from the slice plan + each task's self-contained file. Writes tests WITH the implementation carrying REQ-ID anchors. Verifies the whole slice end-to-end before proceeding to the next. Drives the bidirectional drift loop (§10): auto-updates derived docs and logs; escalates requirement contradictions as blocking. Does NOT invent undocumented behavior.
The TwinHarness Codebase-Inspector agent — an on-demand, fresh-context fact-gatherer the Orchestrator invokes at the start of a BROWNFIELD run (a project building INTO an existing repo). It scans the existing codebase for ground truth — language/build system, module layout, public APIs, the test framework and how tests run, and blast-radius signals already present (existing auth, authorization, money/billing, data-integrity invariants, migrations) — and emits a source-anchored docs/00-existing-codebase-analysis.md feeding tiering and the design stages. It treats repo content as untrusted data: it gathers facts and does NOT decide the architecture. Use to map an existing codebase before adopting it; skipped entirely on greenfield runs.
The TwinHarness Critic agent (spec §6.5) — one agent parameterized by MODE, runs in FRESH CONTEXT (context isolation is the whole point — spec §6.5), reviews a producer's artifact for COHERENCE against upstream summaries. It does NOT edit artifacts; the author revises. Pass the mode explicitly. Use after any Spec/Vertical-Slice/Builder output to gate coherence before the next stage proceeds.
The TwinHarness Debugger agent — an on-demand, fresh-context defect tracer invoked when a slice's tests fail, `th verify run` reports a failing suite, a Critic code-review finds a behavioral defect it can't ground, or drift surfaces a behavior↔contract contradiction. It reproduces deterministically, traces the failing path via REQ-ID anchors, and produces an EVIDENCE-FIRST report: every claim anchored to a file:line, captured output, or state fact. It proposes the minimal fix mapped to a slice/REQ; it does not invent behavior. Use to find and prove a root cause, not to redesign.
The TwinHarness Documentation agent (Stage 10.5) — one agent parameterized by MODE, runs after the build and before Final Verification so documentation describes drift-corrected reality. Produces tier-scaled documentation: README (T1+), user guide + API reference (T2+, generated FROM docs/07-contracts.md — contracts are source of truth), developer guide + changelog (T3). Every claim is anchored to a REQ-ID or contract; never documents behavior that is not implemented. Output is checked by the Critic in documentation mode (fresh context). Streams; no human gate (Critic gates). Pass the mode explicitly.
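The tier scaling in the Documentation agent description (README at T1+, user guide and API reference at T2+, developer guide and changelog at T3) amounts to a minimum-tier lookup. The sketch below is one way to express it; the type and function names are hypothetical, and the empty result for T0 is inferred from the "T1+" wording rather than stated by the source.

```typescript
// Tiers named in the listing: T0 through T3.
type Tier = 0 | 1 | 2 | 3;

type DocKind =
  | "README"
  | "user guide"
  | "API reference"
  | "developer guide"
  | "changelog";

// Minimum tier at which each document is produced, per the description above.
const MIN_TIER: Record<DocKind, Tier> = {
  README: 1,
  "user guide": 2,
  "API reference": 2,
  "developer guide": 3,
  changelog: 3,
};

// Returns every document kind whose minimum tier is at or below the given tier.
function docsForTier(tier: Tier): DocKind[] {
  return (Object.keys(MIN_TIER) as DocKind[]).filter((kind) => tier >= MIN_TIER[kind]);
}

console.log(docsForTier(2));
```

A table keyed by document kind keeps the tier thresholds in one place, so changing a threshold does not require touching the selection logic.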
Admin access level
Server config contains admin-level keywords
Executes bash commands
Hook triggers when Bash tool is used
Verified owner:Jarrard Sneed
Based on adoption, maintenance, documentation, and repository signals. Not a security audit or endorsement.
Modifies files
Hook triggers on file write and edit operations
Uses power tools
Uses Bash, Write, or Edit tools
Runs pre-commands
Contains inline bash commands via ! syntax
Bash prerequisite issue
Uses bash pre-commands but Bash not in allowed tools
Turns "build me X" into working, tested software by forcing the idea through requirements, scope, design, and slice-by-slice implementation with verification gates — as a Claude Code plugin.
Early development notice. TwinHarness is at v0.7.0. The pipeline has been exercised end-to-end and ships 1672 tests, green on CI (1 platform-conditional skip in tests/concurrency.test.ts: a POSIX-only permission-error case, intentionally skipped on Windows/root and covered on Linux/macOS CI), but it has limited real-world mileage and interfaces may change before 1.0. Expect breaking changes. Use it, push its limits, file issues; just don't bet a production release on it yet.
TwinHarness is a Claude Code plugin: an agentic SDLC orchestrator that takes a vague software idea and produces working, tested software through a disciplined pipeline. It coordinates 16 specialized agents — a core pipeline of Orchestrator, Spec, Critic, Vertical-Slice, Builder, Test-Author, UX/UI-Designer (two ordered stages: 4a UX, 4b UI), Doc-Writer, Merge-Coordinator, Reconciler, Red-Team, and Librarian, plus on-demand Researcher, Debugger, Codebase-Inspector, and Tester — backed by a deterministic TypeScript CLI (th) that handles every mechanical operation: state, content hashing, REQ-ID traceability, coverage gates, the drift log, and a Stop hook that blocks Claude from claiming "done" while state is invalid or a blocking discovery is open.
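The Stop hook rule stated above (Claude may not claim "done" while state is invalid or a blocking discovery is open) reduces to a small predicate. This is only a sketch under stated assumptions: `HarnessState`, its fields, and `evaluateStop` are hypothetical names, and how the real hook reads `.twinharness/state.json` or reports back to Claude Code is not shown in this listing.

```typescript
// Hypothetical view of harness state; the real state.json layout is not shown.
interface HarnessState {
  valid: boolean;                    // assumed: outcome of state validation
  openBlockingDiscoveries: string[]; // assumed: IDs of unresolved blocking discoveries
}

interface StopDecision {
  allowStop: boolean;
  reasons: string[]; // human-readable reasons when stopping is blocked
}

// Stopping is allowed only when state is valid and nothing blocking is open.
function evaluateStop(state: HarnessState): StopDecision {
  const reasons: string[] = [];
  if (!state.valid) {
    reasons.push("state is invalid");
  }
  for (const id of state.openBlockingDiscoveries) {
    reasons.push(`blocking discovery open: ${id}`);
  }
  return { allowStop: reasons.length === 0, reasons };
}

// Illustrative input: valid state, one open blocking discovery.
const stopDecision = evaluateStop({ valid: true, openBlockingDiscoveries: ["D-7"] });
console.log(stopDecision);
```

Collecting every reason instead of returning at the first failure lets a single blocked stop report all outstanding problems.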
Three things make it different from asking an agent to build something directly:
Who it's for: Claude Code users who want spec-driven, gated development instead of one-shot vibe-coding; people burned by agents that build the wrong thing or claim "done" when they aren't; teams that need traceability from requirements to code.
Start with:
/twinharness:th-run build a CLI tool that tracks my reading list
Then, roughly:
docs/, .twinharness/state.json, and drift-log.md in your project directory.

flowchart TD
Idea([User idea]) --> Orch[Orchestrator skill]
Orch --> Tier{Tier classify}
Tier -- T0 bypass --> Build
Tier -- T1-T3 --> Spec
Spec[Spec agent] --> CriticSpec[Critic — fresh context]
CriticSpec -- FAIL --> Spec
CriticSpec -- PASS --> HumanGate{Human gate<br/>requirements/scope}
HumanGate --> DesignStages
subgraph DesignStages[Design stages — stream with Critic reviews]
direction LR
D1[Domain model] --> D2[Architecture]
D2 --> D3a[UX design 4a<br/>conditional · gated]
D3a --> D3b[UI design 4b<br/>conditional · gated]
D3b --> D4[Contracts / security / test strategy]
end
npx claudepluginhub jrsneed28/twinharness --plugin twinharness