Adversarial agent-pair harness: one model writes code (plan → implement), a second model reviews it adversarially, gated at human checkpoints (a single-model test-first TDD mode is also available).
Vendor the redteam agent-pair harness into the current project (runs the bundled installer). Use when setting up redteam in a repo for the first time.
Run the redteam adversarial reviewer (a different model than the one that wrote the code) over the current branch diff, read-only. Use to get a cross-model security/quality review of your local changes without driving the full pipeline.
Customize the per-role models for the redteam harness (plan / execute / review / rescue) and write them to .redteam/config.toml. Use to choose which model plans, which implements, and which independent model reviews.
Show the redteam pipeline status for a batch — per-task phase, next step, gates, and any deferrals — without running anything. Use to check where tasks stand.
Scaffold a new redteam task — create the next task-NNN directory and seed an input.md brief from the template. Use to add a task to a batch without hand-reproducing the brief structure the planner expects.
Independent reviewer of the task branch diff (against the base branch named in the phase prompt) versus outcome.md and the project security checklist. HIGH findings from the project security scanner force CHANGES_REQUESTED. Outputs code_review.md ending with REVIEW_DECISION on the final line. No code modification. Run after the implementer completes.
Implement the minimum code to turn red-phase tests green, scoped strictly to outcome.md's Affected files. Saves git diff to impl_diff.patch and self-verifies via the project verify command before completing. Run after test_review.md is APPROVED.
Translate a raw task brief into a verifiable outcome.md with Goal, Done-when checklist, Out-of-scope, Affected files, Verification hooks, and Risks. Use as the first phase of the redteam pipeline, after the user supplies input.md for a task.
After security review APPROVED, push the existing per-task branch (named in the phase prompt), write pr.md, create a draft GitHub PR, and save the PR URL to pr_url.txt. Always uses --draft. Never force-pushes. Never commits to the base branch.
TDD mode only — skipped in the default agent-pair mode (where the worker writes tests inside implement). Write tests (in the project's test framework) that fail in TDD red phase against an approved outcome.md. Each test must trace 1:1 to a Done-when item via docstring quotation. Use after the human approves outcome.md (sentinel outcome.approved is touched).
Uses power tools
Uses Bash, Write, or Edit tools
An adversarial agent-pair harness for shipping code with AI. One model
drives a task through a pipeline (plan → implement → review); a
different model reviews the work adversarially; humans gate the
irreversible steps. The collision of two independent model perspectives is the
point — automatic self-agreement is what it exists to prevent. (A single-model
TDD mode that front-loads write_test → verify_test is also available — see
Phases by mode.)
Status: early. redteam was built as one project's internal harness and then extracted into this standalone repo, which owns it going forward. It has driven real, merged pull requests. (Its early git history reflects that origin, including cross-repo coordination from the parent project.) APIs and layout may still move.
Quick install (Claude Code) — two commands:
/plugin marketplace add https://github.com/AscendyProject/redteam
/plugin install redteam@ascendy-redteam
Not on Claude Code? Vendor it into any repo — see Install.
Given a batch of tasks (each a short input.md brief), the orchestrator walks
every task through a fixed pipeline. It persists state.json after each phase so
a run is fully resumable, and it retries on CHANGES_REQUESTED:
flowchart TD
    PO[plan_outcome]:::worker --> PRV[plan_review]:::rev
    PRV --> IMPL[implement]:::worker
    IMPL --> RC[review_code]:::rev
    RC -->|APPROVED| CPR[create_pr → draft PR]:::worker
    RC -->|CHANGES_REQUESTED| IMPL
    RC -. blocker persists .-> RES[rescue]:::rev
    RES --> CPR
    CPR --> DONE([done]):::done
    classDef worker fill:#e3f2fd,stroke:#1976d2,color:#0d47a1;
    classDef rev fill:#fce4ec,stroke:#c2185b,color:#880e4f;
    classDef done fill:#e8f5e9,stroke:#388e3c,color:#1b5e20;
Blue = worker model (writes) · pink = reviewer model (adversarial, fresh).
This is the default agent-pair flow. By design it runs with no human gates in the common path — the adversarial pair plus verification is the trust, and the output is a draft PR (your existing human checkpoint before merge), not an auto-merge. Human gates are something you add back for risky changes, not the default tax on every change — see When to use it.
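The resumable loop described above can be sketched roughly as follows. This is a minimal illustration, not the harness's actual orchestrator: the `state.json` fields (`next_phase`, `retries`, `history`), the `MAX_REVIEW_RETRIES` limit, and the `run_phase` callback are all assumptions made for the example; only the phase names and the `APPROVED` / `CHANGES_REQUESTED` verdicts come from the text.

```python
import json
from pathlib import Path
from typing import Callable

# Phase names from the diagram above; the list itself is a hypothetical stand-in.
PHASES = ["plan_outcome", "plan_review", "implement", "review_code", "create_pr"]
MAX_REVIEW_RETRIES = 2  # assumed limit; the real retry policy may differ


def load_state(path: Path) -> dict:
    """Return the saved state, or a fresh one if no state file exists yet."""
    if path.exists():
        return json.loads(path.read_text())
    return {"next_phase": PHASES[0], "retries": 0, "history": []}


def save_state(path: Path, state: dict) -> None:
    """Persist the state so an interrupted run can pick up where it stopped."""
    path.write_text(json.dumps(state, indent=2))


def run_task(state_path: Path, run_phase: Callable[[str], str]) -> dict:
    """Walk the phases, saving state after each one."""
    state = load_state(state_path)
    while state["next_phase"] != "done":
        phase = state["next_phase"]
        verdict = run_phase(phase)  # e.g. "APPROVED" or "CHANGES_REQUESTED"
        state["history"].append([phase, verdict])
        if phase == "review_code" and verdict == "CHANGES_REQUESTED":
            if state["retries"] < MAX_REVIEW_RETRIES:
                state["retries"] += 1
                state["next_phase"] = "implement"  # loop back to the worker
            else:
                state["next_phase"] = "rescue"  # blocker persists
        elif phase == "rescue":
            state["next_phase"] = "create_pr"
        else:
            idx = PHASES.index(phase)
            state["next_phase"] = PHASES[idx + 1] if idx + 1 < len(PHASES) else "done"
        save_state(state_path, state)
    return state
```

Because the state is written after every phase, calling `run_task` again with the same path continues from `next_phase` rather than starting over.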
mode (agent-pair by default, or tdd) decides which phases run. The
authority is PHASE_ORDER in orchestrator.py; anyone driving the pipeline
manually must follow the row for the declared mode, not the prose:
| Mode | Core phases |
|---|---|
| agent-pair (default) | plan_outcome → plan_review → implement → review_code → create_pr |
| tdd | plan_outcome → write_test → verify_test → implement → review_code → create_pr |
The agent-pair worker writes its tests inside implement — there is no
separate test-authoring phase; the second perspective is the adversarial
reviewer (review_code), and the plan is independently checked by
plan_review. The TDD mode instead drops plan_review and front-loads a
write_test → verify_test pair before implement. So write_test /
verify_test (the test-author / test-verifier sub-agents) run in TDD mode
only — inserting them into an agent-pair task runs a phase the mode excludes.
(The conditional rescue escalation slot and any human gates are added per the
tier profile; the table shows the common path.)
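As an illustration of "follow the row for the declared mode", the table can be mirrored as a lookup with a guard that rejects phases the mode excludes. This is a sketch only: the dictionary shape and the helper names `assert_phase_allowed` and `next_phase` are hypothetical, and the real `PHASE_ORDER` in orchestrator.py may be structured differently. It also covers only the common path, leaving out the rescue slot and human gates.

```python
from typing import Optional

# Hypothetical mirror of the table above (common path only).
PHASE_ORDER = {
    "agent-pair": ["plan_outcome", "plan_review", "implement",
                   "review_code", "create_pr"],
    "tdd": ["plan_outcome", "write_test", "verify_test", "implement",
            "review_code", "create_pr"],
}


def assert_phase_allowed(mode: str, phase: str) -> None:
    """Raise ValueError if the mode is unknown or excludes the phase."""
    if mode not in PHASE_ORDER:
        raise ValueError(f"unknown mode: {mode!r}")
    if phase not in PHASE_ORDER[mode]:
        raise ValueError(f"phase {phase!r} is excluded by mode {mode!r}")


def next_phase(mode: str, last_completed: Optional[str]) -> Optional[str]:
    """Return the phase after last_completed, or None once the row is finished."""
    if mode not in PHASE_ORDER:
        raise ValueError(f"unknown mode: {mode!r}")
    order = PHASE_ORDER[mode]
    if last_completed is None:
        return order[0]
    assert_phase_allowed(mode, last_completed)
    idx = order.index(last_completed)
    return order[idx + 1] if idx + 1 < len(order) else None
```

With this shape, `next_phase("tdd", "plan_outcome")` yields `write_test`, while `assert_phase_allowed("agent-pair", "write_test")` raises, which matches the rule that the test-author pair runs in TDD mode only.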
Each phase is run by a focused sub-agent with its own prompt and tool scope
(.claude/agents/*.md): an outcome-planner, implementer, code-security-reviewer,
and pr-author — plus a test-author / test-verifier pair used only in TDD mode.
The reviewer is a fresh agent that only sees the diff and the project's
security checklist — it never sees the implementer's reasoning.
A plain "two-model" setup stops at "a second model takes a second look." redteam makes that separation structural and then acts on it:
npx claudepluginhub ascendyproject/redteam --plugin redteam