By gbasin
Adversarially stress-test technical plans by verifying claims against real documentation, running proof-of-concept code in .poc-stress-test/, and iteratively updating the plan to catch issues before building.
An agent skill that stress-tests technical plans before you build them.
Models are lazy about verification. They'll write a plan that says "use SQLite for concurrent writes" or "Y.js supports persistence out of the box" and move on without checking. These unchecked assumptions become mid-build surprises that force architectural pivots, messy workarounds, and wasted context.
This skill forces the model to actually verify its claims: searching real docs, ranking evidence quality, running proof-of-concept code when search is not enough, and fixing the plan before implementation starts. Each verification runs in a fresh sub-agent context, which reduces confirmation bias carried over from the planning conversation. The result is fewer hidden assumptions, less mid-build churn, and a clearer line between what's confirmed and what's still a risk.
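As a rough illustration of the kind of proof-of-concept check described above, the sketch below probes the "SQLite for concurrent writes" claim with Python's standard `sqlite3` module. It is not the skill's own code: the function name `second_writer_is_blocked`, the table `t`, and the file name `poc.db` are hypothetical. It assumes SQLite's default rollback-journal mode and a zero busy timeout, so a blocked writer fails immediately instead of waiting.

```python
import os
import sqlite3
import tempfile


def second_writer_is_blocked() -> bool:
    """Return True if a second connection cannot start a write
    transaction while the first connection holds the write lock."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "poc.db")  # hypothetical scratch database
        # isolation_level=None: autocommit mode, so transactions are explicit.
        # timeout=0: do not wait for locks, fail right away.
        writer_a = sqlite3.connect(path, timeout=0, isolation_level=None)
        writer_b = sqlite3.connect(path, timeout=0, isolation_level=None)
        try:
            writer_a.execute("CREATE TABLE t (x INTEGER)")
            writer_a.execute("BEGIN IMMEDIATE")  # first writer takes the write lock
            writer_a.execute("INSERT INTO t VALUES (1)")
            try:
                writer_b.execute("BEGIN IMMEDIATE")  # second writer tries to do the same
            except sqlite3.OperationalError:
                return True  # "database is locked": writes are serialized
            return False
        finally:
            # Closing discards any open transaction and releases the file.
            writer_a.close()
            writer_b.close()


if __name__ == "__main__":
    print("second writer blocked:", second_writer_is_blocked())
```

If the function returns `True`, the plan's claim needs qualifying: a second writer has to wait or retry while another write transaction is open.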
A plan claimed bash + sqlite3 would be fast enough for git hooks. The skill spun up parallel agents to research alternatives and run an actual latency POC.

The POC disproved the assumption: bash was 4-5x slower than estimated. It also surfaced the real tradeoffs across runtimes.
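The original POC output is not preserved here, so the following is only a minimal sketch of how a latency check of this kind could be structured, not the code the skill actually ran. The helper `median_latency_ms` is hypothetical, and the stand-in command (the Python interpreter running `pass`) is an assumption chosen so the example runs anywhere. A real check would pass the hook command under test instead.

```python
import statistics
import subprocess
import sys
import time


def median_latency_ms(cmd: list[str], runs: int = 5) -> float:
    """Run `cmd` several times and return the median wall-clock time in ms."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the command fails, so failures are not timed as successes.
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


if __name__ == "__main__":
    # Stand-in command (assumption): replace with the actual hook invocation.
    stand_in = [sys.executable, "-c", "pass"]
    print(f"median: {median_latency_ms(stand_in):.1f} ms")
```

Using the median rather than the mean keeps a single slow cold start from dominating the figure, which matters when the question is whether a hook feels fast on every commit.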

npx skills add gbasin/stress-test-skill --all -g
Works with Claude Code, Codex, Cursor, Gemini CLI, GitHub Copilot, Windsurf, and other supported agents.
Six phases, each building on the last:
.poc-stress-test/ directory using the smallest representative setup in the most production-like environment available.
npx claudepluginhub gbasin/stress-test-skill --plugin stress-test