dev-harness

Name: dev-harness
Author: brothelmdzz

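Dev Harness: a self-driving development pipeline | Commands: /dev (full pipeline), /fix (quick fix), /test (run tests), /audit (code audit), /review (code review), /ask (conversational Q&A) | Features: three-layer Skill resolution + multiple modes (pipeline/single/conversation) + real-time Web HUD dashboard + 12 specialized Agents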

npx claudepluginhub brothelmdzz/dev-harness --plugin dev-harness

Popularity

Stars

Above avg

Med: 0·Avg: 285

Installs

Med: 0·Avg: 1

What's Inside

Agents (12)

architect

/architect

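System architecture review. Evaluates architecture proposals, interface design, module boundaries, and the soundness of technology choices. Used to review proposals in the plan stage.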

auto-loop

/auto-loop

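AutoLoop iterator. Autonomously loops through the whole Pipeline, fixing problems automatically until it completes or hits a stop condition. Inspired by Karpathy's autoresearch.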

code-reviewer

/code-reviewer

Code review. Reviews code for logic defects, best practices, and maintainability. One of the three review channels in the review stage.

debugger

/debugger

Bug diagnosis. Analyzes the root cause of compile errors, test failures, and runtime exceptions. Used when the implement gate fails.

executor

/executor

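Code implementation. Writes standard business code, CRUD, DTOs, configuration, and other code that does not need deep reasoning. Used for simple Phases in the implement stage.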

Skills (23)

ask

/ask

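Pure conversational Q&A mode: creates no harness state and does not trigger the stop-hook. Use when: the user says "问一下" (a quick question), "ask", or "聊聊" (let's chat).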

audit-skill

/audit-skill

Runs a code audit on its own. Use when: the user says "审计" (audit), "audit", or "检查代码质量" (check code quality).

dev

/dev

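Development pipeline orchestrator: automatic state detection, three-layer Skill resolution, and Hook-driven resumption. Works with any project. Use when: the user says "dev", "开发" (develop), "继续开发" (continue development), or "下一步" (next step), or a new session needs to pick up where the last one left off.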

fix

/fix

Quick fix: single-Skill mode that runs only implement + test. Use when: the user says "修一下" (fix this), "fix", or "快速修复" (quick fix).

generic-audit

/generic-audit

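Generic code audit: automatically compares the plan/PRD against the actual code and checks quality, security, and consistency. Works with any language or framework.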

Hooks (1)

Event Hooks

Bash

File writes

4 hooks across 3 events

Stats

Version: 3.4.4

Language: Python

Stars: 2

Maintenance: Excellent

License: MIT

Last Commit: May 12, 2026

Added: Apr 4, 2026

Available In

dev-harness-marketplace (2)

Safety Signals

Caution

Executes bash commands

Hook triggers when Bash tool is used

Modifies files

Hook triggers on file write and edit operations

README

Dev Harness

Self-driving development pipeline for Claude Code & Cursor.

Why dev-harness

AI coders write code fast, but the process still falls back to humans — code review, test execution, audit, documentation. The result is "fast typing, slow shipping."

dev-harness applies Harness Engineering — agent scaffolding as load-bearing infrastructure that compensates for what models can't do alone (Anthropic, Effective harnesses for long-running agents). One /dev command takes a task through research → plan → implement → audit + docs + test → review → remember, with stop-hook defenses that resume the agent when it tries to halt mid-pipeline.

What you get out of the box:

One command, full pipeline: /dev resolves the right skill at each stage automatically.
Multi-model code review (v4): three Claude reviewers (sonnet × 2 + opus) plus optional cross-vendor adversarial review via Codex CLI.
Six stop-hook defenses: these cover rate limits, context overflow, timeouts, and retry storms, among others, so the pipeline never dies silently.

Quick start

# Claude Code (enter each slash command on its own)
/plugin marketplace add brothelmdzz/dev-harness
/plugin install dev-harness
/reload-plugins
bash "${CLAUDE_PLUGIN_ROOT}/scripts/setup.sh"

# Then in any project
/dev

How it works

/dev → skills/dev/SKILL.md (orchestrator)
  → detect-stack.sh + skill-resolver.py + harness.py init
  → Pipeline loop (defaults/pipeline.yml):
        research → prd → plan → implement → [audit | docs | test] → review → remember
                                                  parallel group              three-way
        Each stage completion → harness.py update
        Claude tries to stop → stop-hook.py defenses → blocks if pipeline incomplete
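
The last step of the flow above, blocking a stop while stages remain, can be sketched as follows. This is a minimal illustration and not the plugin's stop-hook.py: the `PIPELINE` list mirrors the stage order shown above, while the `stage_status` mapping and the `"done"` status value are assumptions about what harness-state.json might hold. The returned shape follows the Claude Code Stop-hook convention of emitting a JSON object with `"decision": "block"` and a `"reason"`.

```python
from typing import Dict, Optional

# Stage order taken from the pipeline diagram above.
PIPELINE = ["research", "prd", "plan", "implement",
            "audit", "docs", "test", "review", "remember"]


def stop_decision(stage_status: Dict[str, str]) -> Optional[dict]:
    """Return a blocking decision while any stage is unfinished, else None.

    `stage_status` maps stage name -> status string (hypothetical schema).
    """
    pending = [stage for stage in PIPELINE if stage_status.get(stage) != "done"]
    if not pending:
        return None  # pipeline complete: let the agent stop
    return {
        "decision": "block",
        "reason": f"Pipeline incomplete; next stage: {pending[0]}",
    }
```

A hook script built on this would print `json.dumps(decision)` to stdout whenever the function returns a dict, and print nothing otherwise.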

Three-layer skill resolution — same /dev, different projects pick different skills automatically:

L1: .claude/skills/{name}/SKILL.md      → project-specific (highest priority)
L2: ~/.claude/skills/{name}/SKILL.md    → user-level overrides
L3: skills/generic-{name}/              → built-in fallback
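
The lookup order above amounts to "first existing path wins". A minimal sketch, assuming the three roots are passed in explicitly (the function name `resolve_skill` and its parameters are illustrative and not taken from skill-resolver.py):

```python
from pathlib import Path
from typing import Optional


def resolve_skill(name: str, project_root: Path, home: Path,
                  plugin_root: Path) -> Optional[Path]:
    """Return the highest-priority location that exists for skill `name`."""
    candidates = [
        project_root / ".claude" / "skills" / name / "SKILL.md",  # L1: project-specific
        home / ".claude" / "skills" / name / "SKILL.md",          # L2: user-level override
        plugin_root / "skills" / f"generic-{name}",               # L3: built-in fallback
    ]
    for candidate in candidates:
        if candidate.exists():
            return candidate
    return None  # no layer provides this skill
```

For example, `resolve_skill("audit", Path.cwd(), Path.home(), plugin_root)` would return the project's own audit skill if one exists and fall back to the built-in `generic-audit` otherwise.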

Core concepts

| Term | One-liner |
| --- | --- |
| Skill | A markdown playbook that drives one pipeline stage. Resolved via L1/L2/L3. |
| Agent | A specialized sub-agent (12 total) with its own model and tool access. |
| Hook | A shell hook into Claude Code's lifecycle (Stop / PostToolUse / SessionStart). |
| Pipeline | YAML-defined ordered stages with optional `parallel_group` and `depends_on` DAG. |
| Route | Pre-defined pipeline subsets (B = full, C = skip research+prd, C-lite = quick fix). |

Multi-agent parallel

dev-harness coordinates agents on two levels:

Layer 1 — Stage-level (parallel_group): after implement completes, three background agents run audit, docs, and test simultaneously. Each reports to harness-state.json independently. Declared in pipeline.yml:

- name: audit
  parallel_group: post-implement
- name: docs
  parallel_group: post-implement
- name: test
  parallel_group: post-implement
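
One way to read the declaration above: consecutive stages that share a `parallel_group` collapse into a single step that runs concurrently. The sketch below illustrates that idea on already-parsed stage dictionaries; the function name `execution_steps` is hypothetical, and the plugin's real scheduler may work differently (for instance by also honoring `depends_on`).

```python
from itertools import groupby
from typing import Dict, List


def execution_steps(stages: List[Dict[str, str]]) -> List[List[str]]:
    """Turn an ordered stage list into steps; each step is a list of stage
    names that may run at the same time."""
    steps: List[List[str]] = []
    # groupby only merges *adjacent* stages with the same key.
    for group, members in groupby(stages, key=lambda s: s.get("parallel_group")):
        names = [stage["name"] for stage in members]
        if group is None:
            steps.extend([name] for name in names)  # sequential stages
        else:
            steps.append(names)                     # one parallel step
    return steps


stages = [
    {"name": "implement"},
    {"name": "audit", "parallel_group": "post-implement"},
    {"name": "docs", "parallel_group": "post-implement"},
    {"name": "test", "parallel_group": "post-implement"},
    {"name": "review"},
]
print(execution_steps(stages))
# [['implement'], ['audit', 'docs', 'test'], ['review']]
```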

Layer 2 — Phase-level (Orchestrator): when a plan has > 3 phases, dev-harness automatically switches to orchestrator mode:

harness.py analyze-deps reads file-level dependencies and groups independent phases into batches
Each worker runs in an isolated git worktree (scripts/worktree.sh) — no merge races
Workers report through .claude/workers/worker-*.json; the main agent waits for all before advancing
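
Grouping independent phases into batches can be illustrated with a layered topological sort: every phase whose dependencies are already finished joins the next batch. This is a sketch of the general technique, not the logic of `harness.py analyze-deps`; the input format (a mapping from phase name to the phases it depends on) is an assumption.

```python
from typing import Dict, List, Set


def batch_phases(deps: Dict[str, Set[str]]) -> List[List[str]]:
    """Group phases into batches; phases in one batch do not depend on each
    other and could each run in their own worktree."""
    remaining = {phase: set(needs) for phase, needs in deps.items()}
    done: Set[str] = set()
    batches: List[List[str]] = []
    while remaining:
        # A phase is ready once everything it depends on has completed.
        ready = sorted(p for p, needs in remaining.items() if needs <= done)
        if not ready:
            raise ValueError(
                "unsatisfiable dependencies (cycle or unknown phase): "
                + ", ".join(sorted(remaining))
            )
        batches.append(ready)
        done.update(ready)
        for phase in ready:
            del remaining[phase]
    return batches


print(batch_phases({"p1": set(), "p2": set(), "p3": {"p1"}, "p4": {"p2", "p3"}}))
# [['p1', 'p2'], ['p3'], ['p4']]
```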

Concurrency safety uses filelock on harness-state.json. The orchestrator refuses to start if filelock is missing (no silent fallback to broken state).
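
A locked read-modify-write of the state file might look like the sketch below. It uses the `FileLock` class from the third-party `filelock` package and raises instead of proceeding when the package is absent, mirroring the "refuses to start" behavior described above. The function name `update_stage`, the `.lock` sidecar file, the 10-second timeout, and the `stages` key are all assumptions made for illustration.

```python
import json
from pathlib import Path

try:
    from filelock import FileLock  # third-party: pip install filelock
except ImportError:
    FileLock = None


def update_stage(state_path: Path, stage: str, status: str) -> dict:
    """Record `status` for `stage` in the state file while holding a lock."""
    if FileLock is None:
        # No silent fallback: unlocked writes could corrupt shared state.
        raise RuntimeError("filelock is required; refusing to update state without it")
    lock = FileLock(str(state_path) + ".lock", timeout=10)
    with lock:  # other workers block here until the lock is released
        state = json.loads(state_path.read_text()) if state_path.exists() else {}
        state.setdefault("stages", {})[stage] = status
        state_path.write_text(json.dumps(state, indent=2))
    return state
```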

Multi-model code review (v4)

Layer 1 — Claude internal heterogeneous (always runs)
  ├─ code-reviewer       (sonnet)   ← code quality, bugs, edge cases
  ├─ security-reviewer   (sonnet)   ← OWASP top 10, secrets, auth
  └─ architect           (opus)     ← architecture, module boundaries

Layer 2 — Cross-vendor adversarial (optional, auto-detect)
  └─ codex exec          (your config)  ← GPT / GLM / o3 / etc. via Codex CLI

Aggregation:
  ≥ 2 channels hit same issue  → high confidence, must fix
  cross-vendor only            → "heterogeneous perspective" section
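
The two aggregation rules above can be sketched as a small bucketing function. The input format (pairs of reviewer channel and a normalized issue key), the channel name `"codex"`, and the third `other` bucket for findings from a single Claude reviewer are assumptions; the source only specifies the first two outcomes.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def aggregate(findings: Iterable[Tuple[str, str]],
              cross_vendor: str = "codex") -> Dict[str, List[str]]:
    """Bucket issues by how many review channels reported them."""
    channels_by_issue = defaultdict(set)
    for channel, issue in findings:
        channels_by_issue[issue].add(channel)

    report: Dict[str, List[str]] = {"must_fix": [], "heterogeneous": [], "other": []}
    for issue in sorted(channels_by_issue):
        channels = channels_by_issue[issue]
        if len(channels) >= 2:
            report["must_fix"].append(issue)       # >= 2 channels agree
        elif channels == {cross_vendor}:
            report["heterogeneous"].append(issue)  # cross-vendor only
        else:
            report["other"].append(issue)          # assumption: single Claude reviewer
    return report
```

In practice the hard part is deciding when two reviewers mean the "same issue"; this sketch sidesteps that by assuming each finding already carries a comparable key.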

Layer 2 enables itself when bash scripts/detect-codex.sh exits 0 and dev-config.yml has review.cross_vendor.enabled: auto (default). No codex installed? It silently skips.

Pitfall journal & continuous learning (v4)

When build / test / audit / review fails, the stop-hook captures the root cause to .claude/pitfall-journal.jsonl. On the next resume, recent failures are injected into the prompt as ground truth — the agent sees what just broke and avoids repeating it.
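
The capture-then-inject loop described above can be sketched with an append-only JSONL file. The record fields (`stage`, `root_cause`), the function names, and the wording of the injected preamble are illustrative assumptions, not the plugin's actual journal schema.

```python
import json
from pathlib import Path
from typing import List


def record_pitfall(journal: Path, stage: str, root_cause: str) -> None:
    """Append one failure record as a single JSON line."""
    journal.parent.mkdir(parents=True, exist_ok=True)
    with journal.open("a", encoding="utf-8") as fh:
        entry = {"stage": stage, "root_cause": root_cause}
        fh.write(json.dumps(entry, ensure_ascii=False) + "\n")


def recent_pitfalls(journal: Path, limit: int = 5) -> List[dict]:
    """Return up to `limit` of the most recent records, oldest first."""
    if not journal.exists():
        return []
    lines = [ln for ln in journal.read_text(encoding="utf-8").splitlines() if ln.strip()]
    return [json.loads(ln) for ln in lines[-limit:]]


def resume_preamble(entries: List[dict]) -> str:
    """Format recent failures as text to prepend to the resume prompt."""
    if not entries:
        return ""
    bullets = "\n".join(f"- [{e['stage']}] {e['root_cause']}" for e in entries)
    return "Recent failures (treat as ground truth):\n" + bullets
```

Appending one JSON object per line keeps writes cheap and lets the reader take only the tail of the file on resume.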

Similar Plugins

sd0x-dev-flow

aidd-dev

harness-session

harness-flow

harness-engineering

essentials
