claude-harness-forge

Name: claude-harness-forge
Author: rlpatrao

GAN-inspired autonomous SDLC scaffold (BRD v3.0): initializer/coding-agent split, feature_list.json contract, mandatory browser-automation E2E gate, Ralph Loop exit interception, per-workflow LLM routing, Plan Mode subagent, Extended ReAct, adaptive context compaction, tree-structured sessions, spec-gap backprop, monotonic-improvement guards, instinct extraction, YAML recipes. Vendor-first reuse mandate per brd/v3.0.md §10.

npx claudepluginhub rlpatrao/claude_harness_forge

Popularity

Stars: Above avg (Med: 0 · Avg: 285)
Installs: Med: 0 · Avg: 1

What's Inside

Slash Commands (23)

/architect

Interactive stack interrogation, design artifact generation, decision verification, and learnings persistence. Runs after BRD approval, before spec decomposition.

/auto

Autonomous build loop with Karpathy ratcheting, GAN evaluator, browser console capture, UI standards review, 8-gate ratchet, session chaining, and cross-project learnings.

/branch

Label the current path in the session tree (BRD §4.5) so it can be retrieved later via /tree <label>.

/brd

Create a Business Requirements Document through Socratic five-dimension interview. First step of the SDLC pipeline, before spec/design/build.

/build

Full 12-phase SDLC pipeline. BRD → Architect → Spec → Design → Observe → Comply → Initialize → Auto (11 gates) → Post-build. Human gates on phases 1-4.

Agents (19)

architect

Interactive technical design partner. Conducts stack interrogation informed by BRD context, challenges weak decisions, generates machine-readable design artifacts, verifies completeness, and persists decisions for cross-project reuse.

brd-creator

Collaborates with the human to create Business Requirements Documents through Socratic dialogue with 5-dimension exploration, alternatives analysis, and engineer self-audit.

code-reviewer

Reviews code for quality, architecture compliance, test coverage, and story traceability.

coding-agent

Per-session feature worker. Runs every session after the Initializer has set up the project. Follows the fixed 8-step startup sequence enforced by hooks/session-start.js. Works exactly one feature_list.json entry per session.

compactor

Summarizes session transcripts for BRD §4.3 compaction stages 3-5. Uses Haiku to keep cost down. Read-and-summarize only. Spawned by hooks/compaction-stage.js when budget thresholds are crossed.

Skills (47)

agentic-ux

UX patterns for agentic AI applications: intent preview, autonomy dial, confidence signals, audit trails, escalation, streaming, multi-agent dashboards, and error recovery.

architect-patterns

Design the system architecture including layered dependencies, API contracts (endpoints, schemas, errors), data models, folder structure, and deployment topology. Output a detailed design document to `specs/design/` with all decisions justified.

architect

Interactive stack interrogation, design artifact generation, decision verification, and learnings persistence. Runs after BRD approval, before spec decomposition.

auto

Autonomous build loop with Karpathy ratcheting, GAN evaluator, browser console capture, UI standards review, 8-gate ratchet, session chaining, and cross-project learnings. Iterates story groups until all features pass or stopping criteria are met.

brd

Create a Business Requirements Document through Socratic five-dimension dialogue with the human. First step of the SDLC pipeline, before spec/design/build. Supports greenfield projects or single-feature additions.

MCP Servers (1)

playwright

Stats

Version: 3.0.0-alpha
Released: Mar 28, 2026
Language: JavaScript
Stars: 4
Forks: 1
Maintenance: Excellent
Last Commit: May 19, 2026
Added: Mar 28, 2026

Safety Signals

Caution

Uses power tools

Uses Bash, Write, or Edit tools

README

Claude Harness Forge

A Claude Code plugin that builds software the way a well-run engineering team would -- from requirements to production, with independent verification at every step.

v3.0 (May 2026) is the current spec. It retrofits the v2.0 pipeline with: initializer/coding-agent split, feature_list.json as the project-completion contract, mandatory browser-automation E2E gate, Ralph Loop exit interception, per-workflow LLM routing, Plan Mode subagent, Extended ReAct, 5-stage adaptive compaction, spec-gap backpropagation, monotonic-improvement guards, instinct extraction, tree-structured sessions, and YAML recipes. Full spec: brd/v3.0.md. Operational plan: brd/v3.0-implementation-plan.md. Live punch list: feature_list.json. 19 agents · 27 hooks · 47 skills · 23 commands.

You describe what you want to build. The forge runs 19 specialized agents through a 9-phase pipeline (v3.0 §7): gathering requirements through Socratic interview, challenging your architecture decisions, decomposing work into stories, generating code with parallel agent teams, and verifying everything by actually running the application. Not by reading the code and saying "looks good."

One command starts it. Human approval gates the creative decisions (BRD, architecture, design). Everything after that -- implementation, testing, verification, self-healing -- runs autonomously, bounded by the feature_list.json contract.

```
# Load as plugin and scaffold
claude --plugin-dir ~/claude-harness-forge
> /scaffold

# Run the full pipeline
> /build
```

What the Forge Does

Builds and Verifies Autonomously

The forge doesn't just generate code -- it runs your app, hits your API endpoints, drives a browser through Playwright, and checks for console errors. A 200 response with "Failed to connect" in the body is a failure. An empty list when data should exist is a failure. If something breaks, it diagnoses the issue, fixes it, and re-verifies -- up to 3 attempts per gate before escalating.
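The failure rules above can be sketched as a small response classifier. This is an illustrative sketch only, not the forge's implementation: `checkResponse`, the `expectData` flag, and the `ERROR_MARKERS` list are hypothetical names introduced here.

```javascript
// Hypothetical sketch: names and the marker list are assumptions for illustration.
const ERROR_MARKERS = ["Failed to connect"];

// Classify a response-like object. A 2xx status alone is not enough to pass.
function checkResponse({ status, body, expectData = false }) {
  if (status < 200 || status >= 300) {
    return { ok: false, reason: `unexpected status ${status}` };
  }
  // Search the serialized body for known error text.
  const text = typeof body === "string" ? body : JSON.stringify(body ?? null);
  const marker = ERROR_MARKERS.find((m) => text.includes(m));
  if (marker) {
    return { ok: false, reason: `error marker in body: ${marker}` };
  }
  // An empty list is a failure when the caller knows data should exist.
  if (expectData && Array.isArray(body) && body.length === 0) {
    return { ok: false, reason: "empty list where data should exist" };
  }
  return { ok: true, reason: "passed" };
}

console.log(checkResponse({ status: 200, body: "Failed to connect to database" }));
console.log(checkResponse({ status: 200, body: [], expectData: true }));
```

Both calls report a failure even though the status code is 200, which is the distinction between liveness and behavior that the paragraph describes.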

Catches What Tests Miss

Tests pass. The app crashes. This is the most common failure mode in AI-generated code, and the forge addresses it structurally:

- Three-level verification -- liveness (does it respond?), behavior (does it work correctly?), integration (do features work together?)
- Smoke launch with real data -- every build group must start the app with actual production data, not test fixtures. This gate cannot be disabled.
- Spec gaming detection -- catches agents deleting tests to make suites pass, writing tautological assertions (expect(true).toBe(true)), inflating coverage with dead code. Also cannot be disabled.
- Mutation testing -- injects small bugs and verifies your tests actually catch them. The mutation score ratchets: once it reaches 72%, it can never drop below 72%.
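Two of the checks above are easy to illustrate in a few lines. The sketch below reads the mutation ratchet as a high-water mark (an assumption; the README only gives the 72% example) and detects only the simplest tautology, a literal compared with itself. The function names are hypothetical and do not come from the forge.

```javascript
// Hypothetical sketch: the ratchet is modeled as a high-water mark (assumption).
function ratchetMutationScore(floor, score) {
  if (score < floor) {
    // A regression below the recorded floor is rejected; the floor is unchanged.
    return { accepted: false, floor };
  }
  // Any score at or above the floor is accepted and becomes the new floor.
  return { accepted: true, floor: score };
}

// Matches assertions that compare a literal with the identical literal,
// e.g. expect(true).toBe(true). Real spec-gaming detection would need more.
const TAUTOLOGY =
  /expect\(\s*(true|false|null|\d+|(["'])[^"']*\2)\s*\)\s*\.(?:toBe|toEqual|toStrictEqual)\(\s*\1\s*\)/;

function isTautologicalAssertion(line) {
  return TAUTOLOGY.test(line);
}

let state = ratchetMutationScore(0, 72); // floor rises to 72
state = ratchetMutationScore(state.floor, 70); // rejected, floor stays 72
console.log(state);
console.log(isTautologicalAssertion("expect(true).toBe(true)"));
console.log(isTautologicalAssertion("expect(result).toBe(true)"));
```

A regex is a deliberately crude choice here; an AST-based check would catch more variants, at the cost of a parser dependency.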

Learns and Improves

The forge has multiple feedback loops that compound over time:

- Self-healing loop -- when a gate fails, the system diagnoses the failure (14 categories), spawns a targeted fix with the structured failure context and prior attempt history (so it doesn't retry the same fix), re-runs only the failed gate, and extracts a learned rule on success.
- Cross-project learning -- agents read prior stack decisions, failure patterns, and integration notes before making recommendations. Mistakes from project A inform project B.
- Findings reporter -- opt-in, anonymized feedback to the forge itself. A passive hook collects build findings; you review everything before submitting. The forge improves from real-world usage.
- Change management -- mid-build requirement changes don't get lost. /change logs them with version tracking, runs impact analysis, and cascades updates through only the affected stories, design, and code.
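The self-healing loop described in the first item can be sketched as a bounded retry that carries its attempt history forward. This is a minimal illustration under stated assumptions: `selfHeal`, `diagnose`, `applyFix`, and the shape of the gate object are hypothetical, and the limit of 3 fix attempts is taken from the "up to 3 attempts per gate" statement earlier in the README.

```javascript
// Hypothetical sketch: function names and object shapes are assumptions.
async function selfHeal(gate, { diagnose, applyFix, maxAttempts = 3 }) {
  const history = []; // prior attempts, passed to each fix so it is not repeated
  let result = await gate.run();

  while (!result.ok && history.length < maxAttempts) {
    const diagnosis = diagnose(result); // e.g. { category: "config" }
    const fix = await applyFix(diagnosis, [...history]);
    history.push({ category: diagnosis.category, fix });
    result = await gate.run(); // re-run only the gate that failed
  }

  if (!result.ok) {
    return { status: "escalated", history, learnedRule: null };
  }
  // On success after at least one fix, record what worked as a learned rule.
  const last = history[history.length - 1];
  const learnedRule = last ? `${last.category}: ${last.fix}` : null;
  return { status: "passed", history, learnedRule };
}

// Usage: a gate that fails twice, then passes after the second fix.
let calls = 0;
const flakyGate = { run: async () => ({ ok: ++calls > 2 }) };
selfHeal(flakyGate, {
  diagnose: () => ({ category: "config" }),
  applyFix: async (_diagnosis, prior) => `fix-${prior.length + 1}`,
}).then((outcome) => console.log(outcome));
```

Passing a copy of the history to `applyFix` is the part that corresponds to "so it doesn't retry the same fix"; how the real fixer uses that context is not specified in the README.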

Adapts to Your Project

The architect analyzes your requirements and activates only what's relevant:

| Project Type | What Activates |
| --- | --- |
| CRUD | Standard architecture review, gates 1-8 |
| ML | + ML pipeline design, compliance gate, model cards, bias/fairness audits |
| Agentic | + Agentic architecture round, OWASP Agentic Top 10, agentic UX patterns |
| RAG | + RAG scaffolding, vector DB selection, chunking/embedding guidance |

Projects can match multiple types. The forge also supports 4 execution modes -- from Full (all 12 gates, production-grade) to Solo (3 gates, weekend projects) -- so you control cost and rigor.
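Since a project can match several types, activation can be thought of as a set union. The sketch below assumes the "+" rows add to the CRUD baseline, which is one reading of the table rather than documented behavior; `ACTIVATIONS` and `activationsFor` are hypothetical names.

```javascript
// Hypothetical sketch: entries are copied from the table above; the additive
// reading of "+" (everything builds on the CRUD baseline) is an assumption.
const ACTIVATIONS = new Map([
  ["CRUD", ["standard architecture review", "gates 1-8"]],
  ["ML", ["ML pipeline design", "compliance gate", "model cards", "bias/fairness audits"]],
  ["Agentic", ["agentic architecture round", "OWASP Agentic Top 10", "agentic UX patterns"]],
  ["RAG", ["RAG scaffolding", "vector DB selection", "chunking/embedding guidance"]],
]);

// Return the de-duplicated union of activations for all matched types.
function activationsFor(types) {
  const active = new Set(ACTIVATIONS.get("CRUD"));
  for (const type of types) {
    if (!ACTIVATIONS.has(type)) {
      throw new Error(`unknown project type: ${type}`);
    }
    for (const item of ACTIVATIONS.get(type)) active.add(item);
  }
  return [...active];
}

console.log(activationsFor(["ML", "RAG"]));
```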

Scales with Agent Teams

For large story groups, the generator spawns parallel sub-agents that each own a slice of work. A dependency handshake identifies shared files before work begins, preventing merge conflicts. Each teammate gets the full context: learned rules, architecture constraints, and test requirements.
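One way to picture the dependency handshake is as a check for files claimed by more than one sub-agent before any work starts. The sketch below is an assumption about the idea, not the forge's actual mechanism; `findSharedFiles`, the agent names, and the file paths are hypothetical.

```javascript
// Hypothetical sketch: input maps each sub-agent to the files it plans to touch.
function findSharedFiles(slices) {
  const owners = new Map(); // file path -> list of agents that claim it
  for (const [agent, files] of Object.entries(slices)) {
    for (const file of new Set(files)) {
      if (!owners.has(file)) owners.set(file, []);
      owners.get(file).push(agent);
    }
  }
  // Keep only files with more than one claimant: these need coordination.
  const shared = {};
  for (const [file, agents] of owners) {
    if (agents.length > 1) shared[file] = agents;
  }
  return shared;
}

const plan = {
  "agent-auth": ["src/auth.js", "src/db.js"],
  "agent-orders": ["src/orders.js", "src/db.js"],
};
console.log(findSharedFiles(plan));
```

Here `src/db.js` is flagged because both agents claim it; the remaining files each have a single owner and can be edited in parallel without conflict.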

The Pipeline
