By Geun-Oh
Harness Engineering toolkit for Claude Code — 7 tools based on AI-DLC patterns
Context Budget Monitor — Analyzes context window budget ratios (CLAUDE.md ≤5%, rules ≤3%, MEMORY.md ≤2%, actual work ≥80%) and warns about signs of context bloat. Triggers on 'context budget', 'token ratio'.
Features.json Validator — Validates that feature definitions in features.json satisfy AI-DLC Phase 1 criteria (specificity, independence, verifiability, testability). Triggers on 'features validate', 'feature validation', 'feature definition check'.
Feedback Encoding Ladder Tracker — Detects repeated feedback and suggests promoting it to a stronger encoding level (Review Comment → Documentation → Tool Design → Linter/Test). Triggers on 'feedback ladder', 'feedback tracking', 'repeated feedback'.
Gardener Agent — Scans for doc-code mismatches, unused exports, and architecture violations, then suggests cleanup PRs. This skill invokes the gardener agent to perform the scan. Triggers on 'gardener', 'entropy scan', 'entropy', 'cleanup'.
Harness Maturity Assessor — Diagnoses a project's AI-DLC maturity level (L0-L4) and provides gap analysis and action items for the next level. Triggers on 'harness assess', 'maturity assessment', 'maturity'.
Executes bash commands
Hook triggers when Bash tool is used
Modifies files
Hook triggers on file write and edit operations
Uses power tools
Uses Bash, Write, or Edit tools
"The bottleneck has moved from writing code to reviewing code, and now to architecture and harness design." — Harness Engineering & AI-DLC
# Register marketplace (one-time)
/plugin marketplace add Geun-Oh/personal-harness
# Install
/plugin install personal-harness@personal-harness
For local development:
claude --plugin-dir ./personal-harness
Turn your idea into a structured features.json. This replaces vague specs with a machine-verifiable checklist.
# Create project
mkdir my-project && cd my-project && git init
# Create features.json (write it yourself or ask the agent to draft it)
cat > features.json << 'EOF'
{
"features": [
{
"id": "auth-login",
"name": "Email/Password Login",
"description": "Login with email and password to receive a JWT",
"steps": [
"POST /api/auth/login accepts email and password",
"Returns JWT token for valid credentials",
"Sets session token in response cookie",
"Returns 401 for invalid credentials"
],
"tests": ["tests/auth/login.test.ts"],
"dependencies": [],
"passes": false
}
]
}
EOF
# Validate feature definitions against 4 quality criteria
/personal-harness:features-validate
Each feature must be specific (concrete steps), independent (no circular deps), verifiable (pass/fail conditions), and testable (test file paths). The agent can only modify the passes field — everything else is human-defined.
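The four criteria above can be approximated with a few structural checks. The following is a minimal sketch, not the plugin's actual validator: the function names (`find_cycle`, `validate_features`) are hypothetical, and mapping each criterion to a single field check is an assumption made for illustration.

```python
# Hypothetical sketch of structural checks for features.json.
# The real /personal-harness:features-validate skill may apply different rules.
REQUIRED_KEYS = {"id", "name", "description", "steps", "tests", "dependencies", "passes"}


def find_cycle(graph):
    """Return one dependency cycle as a list of ids, or None if there is none."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}
    stack = []

    def visit(node):
        color[node] = GRAY
        stack.append(node)
        for dep in graph.get(node, []):
            if dep not in color:
                continue  # unknown ids are reported separately
            if color[dep] == GRAY:
                return stack[stack.index(dep):] + [dep]
            if color[dep] == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        stack.pop()
        color[node] = BLACK
        return None

    for node in graph:
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


def validate_features(doc):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    features = doc.get("features", [])
    ids = {f.get("id") for f in features}
    for f in features:
        fid = f.get("id", "<missing id>")
        missing = REQUIRED_KEYS - f.keys()
        if missing:
            problems.append(f"{fid}: missing keys {sorted(missing)}")
        if not f.get("steps"):
            problems.append(f"{fid}: no concrete steps (specificity)")
        if not f.get("tests"):
            problems.append(f"{fid}: no test file paths (testability)")
        if not isinstance(f.get("passes"), bool):
            problems.append(f"{fid}: 'passes' must be true or false (verifiability)")
        for dep in f.get("dependencies", []):
            if dep not in ids:
                problems.append(f"{fid}: unknown dependency {dep!r}")
    graph = {f.get("id"): f.get("dependencies", []) for f in features}
    cycle = find_cycle(graph)
    if cycle:
        problems.append("circular dependency: " + " -> ".join(cycle) + " (independence)")
    return problems
```

Calling `validate_features(json.load(open("features.json")))` (after `import json`) on the example file above should return an empty list. A structural check like this cannot judge whether the steps are genuinely concrete; it only catches missing or malformed fields.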
Build the environment so agents can work reliably.
# Create CLAUDE.md (~100 lines, acts as an index)
# Include: build/test commands, key rules, pointers to deeper docs
# Verify document hierarchy (L0-L3)
/personal-harness:pyramid-lint
# Diagnose current maturity level and get action items
/personal-harness:harness-assess
Minimum goal for L1 (Basic Harness):
- CLAUDE.md exists with build/test commands
- .gitignore properly configured

Implement features from features.json one by one. Hooks run automatically, so no manual intervention is needed.
# Ask the agent to implement a feature
"Implement the auth-login feature from features.json"
# What happens automatically during coding:
# L1 Hook → syntax/secrets check on every file edit (blocks on CRITICAL)
# L2 Hook → linter/structure check at turn end (blocks on lint failure)
# L3 Hook → test execution at turn end (blocks on test failure)
# Budget Hook → context window obesity warning at turn end
# When done, the agent sets passes: true in features.json
# When you run `gh pr create`, L4 automatically suggests a review
# The gate-reviewer agent checks security and architecture
# Confirm all features pass
/personal-harness:features-validate
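To make the L1 hook's "blocks on CRITICAL" behaviour concrete, here is a minimal sketch of a secrets check that assigns a severity to each finding. It is not the plugin's implementation: the patterns, severity labels other than CRITICAL, and the function names `scan_text` and `should_block` are all assumptions for illustration.

```python
import re

# Hypothetical patterns; the plugin's actual rules are not shown in this README.
SECRET_PATTERNS = [
    ("CRITICAL", "private key block", re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----")),
    ("WARNING", "hardcoded password assignment",
     re.compile(r"(?i)\bpassword\s*=\s*['\"][^'\"]+['\"]")),
]


def scan_text(text):
    """Return (severity, label, line_number) for every pattern match in the text."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for severity, label, pattern in SECRET_PATTERNS:
            if pattern.search(line):
                findings.append((severity, label, lineno))
    return findings


def should_block(findings):
    """Block the edit only when at least one finding is CRITICAL."""
    return any(severity == "CRITICAL" for severity, _, _ in findings)
```

In this sketch a WARNING is reported but does not stop the edit, which mirrors the idea that only CRITICAL findings are blocking.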
Agents generate entropy 5-10x faster than humans. Run these periodically.
# Scan for doc-code drift, unused exports, architecture violations
/personal-harness:gardener
# Check if repeated feedback should be promoted to stronger enforcement
/personal-harness:feedback-ladder
# Verify context budget isn't bloated
/personal-harness:context-budget
# Re-assess maturity — plan next level
/personal-harness:harness-assess
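The feedback ladder check can be pictured as counting repeated feedback and proposing the next rung (Review Comment → Documentation → Tool Design → Linter/Test). The sketch below is a hedged illustration only: the `repeat_threshold` of 3, the assumption that feedback strings are already normalized, and the function name `suggest_promotions` are not taken from the plugin.

```python
from collections import Counter

# Encoding levels in order of increasing strength, as listed in this README.
LADDER = ["Review Comment", "Documentation", "Tool Design", "Linter/Test"]


def suggest_promotions(feedback_log, current_levels, repeat_threshold=3):
    """Map each repeated feedback item to a (current_level, suggested_level) pair.

    feedback_log: list of normalized feedback strings (assumed input format).
    current_levels: dict from feedback string to its current ladder level;
    items not listed are assumed to start at "Review Comment".
    """
    counts = Counter(feedback_log)
    suggestions = {}
    for item, count in counts.items():
        if count < repeat_threshold:
            continue
        level = current_levels.get(item, LADDER[0])
        index = LADDER.index(level)
        if index + 1 < len(LADDER):  # already at the top rung: nothing to promote
            suggestions[item] = (level, LADDER[index + 1])
    return suggestions
```

For example, a comment such as "use named exports" that appears three times at the Review Comment level would be suggested for promotion to Documentation, while feedback already enforced by a linter or test yields no suggestion.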
| Skill | Command | Purpose |
|---|---|---|
| Features Validator | /personal-harness:features-validate | Validate feature definitions (specificity, independence, verifiability, testability) |
| Knowledge Pyramid Linter | /personal-harness:pyramid-lint | Check CLAUDE.md and doc hierarchy (L0-L3) structure |
| Context Budget Monitor | /personal-harness:context-budget | Analyze token budget allocation (CLAUDE.md ≤5%, rules ≤3%, MEMORY.md ≤2%, actual work ≥80%) |
| Harness Maturity Assessor | /personal-harness:harness-assess | Diagnose AI-DLC maturity level (L0-L4) with gap analysis |
| Feedback Encoding Ladder | /personal-harness:feedback-ladder | Track repeated feedback and suggest stronger encoding levels |
| Gardener Agent | /personal-harness:gardener | Scan for entropy (doc drift, unused code, architecture violations) |
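The context budget limits in the table reduce to simple share arithmetic. The following sketch assumes token counts per source have already been estimated elsewhere; the function name `check_budget`, the input shape, and the treatment of everything outside the listed sources as room for actual work are assumptions, not the plugin's documented behaviour.

```python
# Limits stated in this README: CLAUDE.md <= 5%, rules <= 3%, MEMORY.md <= 2%,
# and at least 80% of the window left for actual work.
MAX_SHARE = {"CLAUDE.md": 0.05, "rules": 0.03, "MEMORY.md": 0.02}
MIN_WORK_SHARE = 0.80


def check_budget(token_counts, window_size):
    """Return warnings for sources over their limit or for too little work room.

    token_counts: dict from source name to estimated tokens (assumed input).
    window_size: total context window in tokens.
    """
    warnings = []
    for name, limit in MAX_SHARE.items():
        share = token_counts.get(name, 0) / window_size
        if share > limit:
            warnings.append(f"{name} uses {share:.1%} of the window (limit {limit:.0%})")
    overhead = sum(token_counts.values())
    work_share = (window_size - overhead) / window_size
    if work_share < MIN_WORK_SHARE:
        warnings.append(
            f"only {work_share:.1%} of the window is left for actual work "
            f"(minimum {MIN_WORK_SHARE:.0%})"
        )
    return warnings
```

With a hypothetical 200,000-token window, a 12,000-token CLAUDE.md takes 6% and triggers a warning, while 4,000 tokens of rules (2%) and 2,000 tokens of MEMORY.md (1%) stay within their limits.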
npx claudepluginhub geun-oh/personal-harness --plugin personal-harness