By woditschka
Enforces a structured Java/Spring Boot development pipeline with automated code reviews, security audits, TDD cycles, doc synchronization, and architectural validation — all orchestrated via specialist agents.
Grade a passing change for how much human attention it deserves before merge. Terminal, advisory node dispatched after the four reviewers approve. Reads the diff and the deterministic feature row, emits five facets (clear/concern/unknown) with notes, a rationale, and a clear/concern verdict. Never a merge or correctness gate.
Review code for readability and maintainability following Java/Spring Boot conventions. Checks naming, function design, package structure, error handling, and record design.
Review documentation for coherence, structural correctness, and writing quality. Validates PRD, system-design, and ADRs against the checklist in the document-writing skill.
Implement features following Test-Driven Development (TDD). Reads current feature scope, creates implementation plan, writes tests first, then implements code to pass those tests.
Orchestrates the feature delivery pipeline. Use for new features or when unsure which agent to invoke.
Architecture Decision Record format, naming conventions, and when to create ADRs. Load when making or documenting architectural decisions.
Audit the agentic configuration for consistency, coherence, and conciseness. Load when modifying agent definitions, skills, or pipeline structure, or to verify cross-tool parity.
Audit the project's docs/ briefs against the high bar — each document on its own (principle form, enforceability) and in combination (cross-document coherence, contradictions, brief-vs-data agreement). Runs the doctor (deterministic structural gate) first, then the advisory judgment review, and reports both. Load when you want to check the docs hold up — at onboarding, after a harness upgrade, or on request. The judgment half is advisory: it judges form, never philosophical direction.
Grade a passing change for how much human attention it deserves before merge. Load when running the change-grader after the four reviewers approve. Holds the full grading protocol: the five facets, worst-facet aggregation, the facets-rationale-verdict order, persistence, and scope/non-goals.
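A minimal sketch of worst-facet aggregation, for illustration only. The source names the three facet values and the clear/concern verdict but not the facet names or how `unknown` is weighed. The `Facet` type, the `SEVERITY` ordering, the facet names, and the rule that anything short of `clear` yields `concern` are therefore all assumptions, not the plugin's actual protocol.

```python
from dataclasses import dataclass

# Assumed ordering: the source lists the three values but does not rank them.
SEVERITY = {"clear": 0, "unknown": 1, "concern": 2}


@dataclass(frozen=True)
class Facet:
    name: str   # hypothetical facet name
    value: str  # "clear", "concern", or "unknown"
    note: str


def aggregate(facets):
    """Return the verdict implied by the most severe facet."""
    if not facets:
        raise ValueError("at least one facet is required")
    worst = max(facets, key=lambda f: SEVERITY[f.value])
    # Assumption: the verdict has only two values, so a non-clear worst
    # facet (concern or unknown) maps to "concern".
    return "clear" if worst.value == "clear" else "concern"


facets = [
    Facet("blast-radius", "clear", "touches one package"),
    Facet("test-coverage", "unknown", "no feature row for this path"),
]
verdict = aggregate(facets)
```

Under these assumptions a single `unknown` facet is enough to ask for human attention, which matches the grader's advisory role: it flags, it does not gate.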
Build, test, format, and lint requirements that must pass before code review. Load when checking implementation completeness or running the quality gate.
Uses power tools
Uses Bash, Write, or Edit tools
Agentic coding that amplifies an engineer's judgment instead of replacing it.
Ship in days what would otherwise die in triage: work worth trying but not worth weeks, built and tested against real users instead of shelved. The machinery that makes that repeatable — durable specs and nested feedback loops that keep every agent, session, and person pointed the same way — is the substance underneath.
The shape, in one minute. A file-based pipeline of nine one-job specialist agents builds one vertical slice at a time. Each appends a schema-validated record to a shared log, a coordinator routes from it, and four reviewers plus a change-grader gate every change — nothing auto-merges. The work runs through four nested feedback loops, from the inner TDD cycle out to whole-codebase review, so drift is caught before it compounds. Durable specs — PRD, system design, ADRs, ubiquitous language — are the shared memory every agent, session, and person reads and writes. One CLAUDE.md carries it across four agent tools; /materialize and /harvest adopt it in your project and feed improvements back.
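The log-and-route step can be sketched roughly as follows, assuming a JSON Lines file as the shared log. The field names, the `ROUTES` table, the agent names, and the starting agent are hypothetical placeholders rather than the harness's real schema or routing rules.

```python
import json
from pathlib import Path

# Hypothetical minimal schema: field name -> required type.
REQUIRED_FIELDS = {"agent": str, "status": str, "feature": str}

# Hypothetical routing table: (agent, status) of the last record -> next agent.
ROUTES = {
    ("implementer", "done"): "reviewers",
    ("reviewers", "approved"): "change-grader",
    ("reviewers", "rejected"): "implementer",
}


def validate(record):
    """Reject records that miss a required field or carry the wrong type."""
    for field, kind in REQUIRED_FIELDS.items():
        if not isinstance(record.get(field), kind):
            raise ValueError(f"invalid or missing field: {field}")


def append_record(log_path, record):
    """Validate first, then append one JSON object per line."""
    validate(record)
    with Path(log_path).open("a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")


def next_agent(log_path, start="implementer"):
    """Route from the last record in the log; None means nothing left to route."""
    path = Path(log_path)
    if not path.exists():
        return start
    lines = path.read_text(encoding="utf-8").splitlines()
    if not lines:
        return start
    last = json.loads(lines[-1])
    return ROUTES.get((last["agent"], last["status"]))
```

Because the coordinator reads only the log, any session or tool that can read the file can resume the pipeline from the same state.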
AI coding agents face the same two challenges human engineers always have: keeping long-term memory across sessions, and running multi-scale feedback loops that catch drift before it compounds. The difference is degree, not kind — a human forgets between Friday and Monday; an agent forgets between one message and the next. Within days, not years, an agentic project that skips the disciplines that compensate starts drifting: terms picked inconsistently session-to-session, settled decisions re-litigated, this week's architecture contradicting last week's.
The fix is to treat the disciplines human teams already built for these problems as the memory and feedback substrate — documentation standards, DDD, TDD, ADRs, ubiquitous language, and XP-style nested loops. Every agent, every session, and every person on the codebase reads and writes the same durable specs, so all stay pointed the same direction. A file-based specialist pipeline of nine one-job agents operates it, building one vertical slice at a time, and a single rules file (CLAUDE.md) carries it across Claude Code, Copilot CLI, OpenCode, and Junie CLI.
Two working reference implementations (Go, Spring Boot), portable skills, and enforceable documentation standards demonstrate the pattern; a bidirectional /materialize + /harvest loop adopts it in your own project and feeds improvements back.
It is for anyone running an agentic coding workflow over more than a few sessions: a solo developer driving an agent team past what fits in one conversation, a team where each developer drives their own agent team on a shared codebase, or a human-only team that wants the same discipline against the slower drift humans face. The failure modes are the same; only the speed differs.
The architecture, principles docs, and reference implementations are stable and in active use. The specialist pipeline machinery (JSONL contract, four-reviewer fan-out, capability progression) is operational, though its cost-effectiveness is still being measured (with Harness Stats) and will be revised as evidence accumulates. Treat the disciplines as the validated core and the pipeline machinery as one reference implementation of the shape the harness can take.
→ Deep dive: agentic-harness.md covers the loop model and handoff contract. specialist-agent-workflow.md covers the full architecture and migration playbook. The document-writing skill covers the writing rules that keep agents from guessing.
The sections move from how it works to trying it to reference — read top-down, or jump to what you need.
An agent multiplies whatever it is pointed at. Judgment is the scarce input — you supply the design and the standards; the harness supplies the memory, discipline, and execution that amplify them. The disciplines that keep the multiplier raising quality rather than noise — TDD, DDD, owned specs, ADRs — matter more at this speed, not less.
How the work divides:
npx claudepluginhub woditschka/agentic-coding-reference --plugin spring-boot-junie
Java Spring Boot agent harness for Claude Code: pipeline agents, skills, continuation hook, plus a one-time engine setup.
Go agent harness for Copilot CLI — pipeline agents, skills, plus a one-time engine setup.
Go agent harness for Claude Code — pipeline agents, skills, continuation hook, plus a one-time engine setup.
Go agent harness for Junie CLI — pipeline agents, skills, plus a one-time engine setup.
Java Spring Boot agent harness for Copilot CLI — pipeline agents, skills, plus a one-time engine setup.