Run autonomous agents that review pull requests and diffs for correctness, security, and performance, then iteratively improve code or other artifacts through mutation-evaluate-gate loops with checkpointing and recovery.
Review code changes with a max-grade, recall-oriented pipeline. Use when the user wants to: - Review a pull request, branch diff, or local working-tree diff - Find correctness, security, contract, concurrency, or performance bugs - Surface reuse, simplification, efficiency, altitude, or convention issues introduced by a change - Get a structured JSON summary of actionable findings
This skill should be used for multi-session autonomous agent work requiring progress checkpointing, failure recovery, and task dependency management. Triggers on '/harness' command, or when a task involves many subtasks needing progress persistence, sleep/resume cycles across context windows, recovery from mid-task failures with partial state, or distributed work across multiple agent sessions. Synthesized from Anthropic and OpenAI engineering practices for long-running agents.
Iteratively evolve any measurable artifact (prompt, skill, code, idea, configuration, document, benchmarked experiment) through autonomous mutation-evaluate-gate loops. Supports both GT case suites and autoresearch-style scalar metric loops where a fixed command prints one score. Uses 8-phase iteration, 3-layer evaluation, deterministic keep/discard gates, trace-driven diagnosis, and layered mutation. Trigger on "evolve this", "optimize iteratively", "self-improve", "train this prompt", "iterate on this code", "make this better through iteration", "run evolution loop", "self-evolution", "evolve this skill", "train this skill", "optimize skill quality", "autonomous research", "autoresearch", or whenever the user has a measurable artifact that needs data-driven improvement with automatic keep/revert decisions.
Own this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).
Sign in to claimOwn this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).
Sign in to claimBased on adoption, maintenance, documentation, and repository signals. Not a security audit or endorsement.
Agent skills for work that needs more control than a single prompt: long-running execution, high-recall code review, and measurable self-improvement loops.
These are not vibe-coding macros. They are small operating protocols for agents: explicit state, hard gates, deterministic rollback, and proofs before claims. Use them directly, fork them, or steal the patterns.
Install the collection with the skills.sh installer:
npx skills@latest add stellarlinkco/skills
Then call the skill you need from your agent:
/harness implement prd.md; loop verify, fix, retest
For manual installs, copy the skill directory you need from skills/ into your agent's skills directory. harness also needs hook registration; see skills/harness/README.md.
AI agents fail in boring, repeatable ways. They stop too early. They review too narrowly. They make a prompt or skill better once, then lose the path that made it better. This repo turns those failure modes into protocols.
The problem: long tasks die at session boundaries. Context windows reset, partial state disappears, and the agent starts summarizing instead of finishing.
The fix: harness gives the agent a durable task ledger, append-only progress log, hook-driven stop blocking, dependency checks, leases, and recovery rules. Progress files become the context.
Use it when a task has many subtasks, must survive sleep/resume cycles, or needs automatic recovery after a failed attempt.
The problem: most agent reviews optimize for precision too early. They produce a tidy list, but miss the dangerous bug hiding in a changed contract, deleted branch, or wrapper boundary.
The fix: code-review uses a max-recall pipeline: gather the real diff, generate candidates from independent angles, verify them, run a final gap sweep, then return a capped JSON findings list.
Use it for PRs, branch diffs, local working-tree diffs, security-sensitive changes, or any review where a missed P1 costs more than an extra candidate.
The problem: “make this better” is not a loop. Without a measurable oracle, each mutation is just taste with confidence.
The fix: self-evolution turns prompts, skills, documents, configs, code, and experiments into evaluate-gate loops. It supports GT case suites and scalar scoreboard metrics, keeps a ledger, and reverts mutations that do not pass.
Use it when the artifact can be measured and you want repeated improvement without guessing.
SKILL.md is the agent-facing protocol.npx claudepluginhub stellarlinkco/skillsFull BMAD agile workflow with role-based agents (PO, Architect, SM, Dev, QA) and interactive approval gates
Essential development commands for coding, debugging, testing, optimization, and documentation
Minimal SPARV workflow (Specify→Plan→Act→Review→Vault) with 10-point spec gate, unified journal, 2-action saves, 3-failure protocol, and EHRB risk detection.
Requirements-driven development workflow with quality gates for practical feature implementation
AI agent skill pipeline: plan-interview, intent-framed-agent, context-surfing, simplify-and-harden, self-improvement, and agent-teams-simplify-and-harden. Prevents scope drift, context degradation, rough code, and repeated mistakes.
Harness for Claude Code — skills, /harness:* slash commands, persona subagents, lifecycle hooks, and MCP tools without per-repo `harness setup`. Sibling plugins exist for Cursor, Gemini CLI, and Codex.
Multi-agent collaboration plugin for Claude Code. Spawn N parallel subagents that compete on code optimization, content drafts, research approaches, or any problem that benefits from diverse solutions. Evaluate by metric or LLM judge, merge the winner. 7 slash commands, agent templates, git DAG orchestration, message board coordination.
Agent Skills for improving SKILL.md files: mine repeated workflows from history, personalize and audit existing skills, or generalize personal skills for publication.
Production-grade engineering skills for AI coding agents — covering the full software development lifecycle from spec to ship.
Self-evolving skill engine for Claude Code. Creates, scores, repairs, and hardens skills autonomously through recursive improvement cycles.