code-hardening
A Claude Code skill that responds to code audits by scoring findings, planning parallel work streams, and coordinating fix implementation across isolated git worktrees. It produces summary-response.md and review-response.md as its deliverables.
How It Works
%%{init: {"flowchart": {"wrappingWidth": 300}}}%%
flowchart TD
A([Repo with audit branch]) --> P1
subgraph P1 [Phase 1: Initial Review]
direction TB
P1a[Pass 1 — Parallel subagents: confirm each finding in code, estimate effort Low / Medium / High]
P1b[Pass 2 — Deep dive on High effort issues only: full scope, dependencies, risks]
P1c[Write hardening-initial-review.md and commit to audit branch]
P1a --> P1b --> P1c
end
P1 --> U1{User confirms Phase 1 summary}
U1 --> P2
subgraph P2 [Phase 2: Plan and Review]
direction TB
P2a[2a — Group issues into work streams by domain and subsystem, user confirms count]
P2b[2b — Interactive review per cluster: approve, skip, or adjust each item]
P2c[2c — Write hardening-plan.md with all agreed decisions, commit to audit branch]
P2a --> P2b --> P2c
end
P2 --> U2{User confirms plan}
U2 --> P3
subgraph P3 [Phase 3: Setup]
direction TB
P3a[Create named git worktrees as sibling directories outside the repo]
P3b[Pre-populate summary-response.md and review-response.md in each worktree — all rows Pending]
P3c[Discover test command from repo, ask user whether to run tests after each work stream]
P3d[Write filled HARDENING-TASK.md into each worktree root]
P3e[Detect claude binary, launch each worktree in tmux with claude --permission-mode auto]
P3a --> P3b --> P3c --> P3d --> P3e
end
P3 --> P4
subgraph P4 [Phase 4: Fix — runs in parallel]
direction TB
P4a[Read HARDENING-TASK.md, call EnterPlanMode, present approach for all issues]
P4b{User confirms per session}
P4c[ExitPlanMode, auto mode: fix each issue, update response docs, commit and push]
P4d[Run tests after all issues complete, surface failures to user and resolve]
P4a --> P4b --> P4c --> P4d
end
P4 --> U3{All sessions complete?}
U3 -->|No| P4
U3 -->|Yes| P5
subgraph P5 [Phase 5: Merge]
direction TB
P5a[Merge all hardening branches into hardening/merged]
P5b[Resolve any remaining Pending rows with user]
P5c[Run full test suite and code-review on hardening/merged]
P5d[Commit and push final hardening/merged]
P5a --> P5b --> P5c --> P5d
end
P5 --> Z([summary-response.md and review-response.md])
Phases
Phase 1 — Initial Review
Discovers all audit files on the audit branch without assuming a fixed location. Runs two passes: a lightweight parallel pass across all findings to confirm that each issue exists in the code and to estimate fix effort (Low / Medium / High), then a deeper dive into High-effort issues only, to understand their full scope, dependencies, and risks. Results are written to hardening-initial-review.md and committed to the audit branch.
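The two-pass triage can be pictured roughly as follows. This is a minimal sketch, not the skill's implementation: the `Finding` class, the `assess` and `investigate` callbacks, and the use of a thread pool to stand in for parallel subagents are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    # Hypothetical record for one audit finding.
    id: str
    confirmed: bool = False
    effort: Optional[str] = None      # "Low", "Medium", or "High"
    deep_dive: Optional[str] = None   # filled in pass 2 only


def pass_one(findings, assess):
    """Lightweight pass over every finding, run in parallel."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(assess, findings))
    for finding, (confirmed, effort) in zip(findings, results):
        finding.confirmed = confirmed
        finding.effort = effort


def pass_two(findings, investigate):
    """Deeper dive, restricted to High-effort findings."""
    for finding in findings:
        if finding.effort == "High":
            finding.deep_dive = investigate(finding)


# Stub callbacks stand in for the real code inspection.
findings = [Finding("F1"), Finding("F2"), Finding("F3")]
stub_efforts = {"F1": "Low", "F2": "High", "F3": "Medium"}
pass_one(findings, lambda f: (True, stub_efforts[f.id]))
pass_two(findings, lambda f: f"scope, dependencies, risks for {f.id}")

deep = [f.id for f in findings if f.deep_dive is not None]
print(deep)  # ['F2']
```

Keeping the first pass cheap means the expensive investigation is spent only where the effort estimate justifies it.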
Phase 2 — Plan and Review
Groups findings into work streams by domain or subsystem, not by audit category. A single work stream may include security, performance, and architectural issues that all relate to the same part of the system. The user confirms the number of work streams, and the skill then walks through each cluster interactively, presenting the recommended fix approach and flagging any risk of behavioral change. The user approves, skips, or adjusts each item. Final decisions are written to hardening-plan.md and committed to the audit branch.
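To make the grouping rule concrete, here is a small sketch that clusters findings by subsystem and ignores the audit category. The dictionary keys (`id`, `category`, `subsystem`) and the sample data are hypothetical; the skill's actual data model is not described here.

```python
from collections import defaultdict

# Hypothetical findings: category comes from the audit, subsystem from the code.
findings = [
    {"id": "F1", "category": "security", "subsystem": "auth"},
    {"id": "F2", "category": "performance", "subsystem": "auth"},
    {"id": "F3", "category": "architecture", "subsystem": "billing"},
]


def group_into_work_streams(findings):
    """Cluster finding ids by subsystem; the audit category plays no role."""
    streams = defaultdict(list)
    for finding in findings:
        streams[finding["subsystem"]].append(finding["id"])
    return dict(streams)


streams = group_into_work_streams(findings)
print(streams)  # {'auth': ['F1', 'F2'], 'billing': ['F3']}
```

In this example the security and performance findings land in the same `auth` stream, so one worktree session can fix both without conflicting with another session's edits.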
Phase 3 — Setup
Creates one git worktree per approved work stream, named after the work stream, as a sibling directory outside the repo. Pre-populates summary-response.md and review-response.md in each worktree with all issue rows at ⬜ Pending so nothing can be missed. Discovers the test command from the repo and asks the user whether to run tests after each work stream completes. Generates a HARDENING-TASK.md per worktree with full context. Detects the correct claude binary (e.g. claude or claude-work) and launches each worktree in a tmux window running claude --permission-mode auto.
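The setup steps map onto a handful of git and tmux commands. The sketch below only builds the command lists and does not run them. The directory naming (`<repo>-<stream>`), the branch naming (`hardening/<stream>`), and the two-column row format are assumptions for illustration, and `tmux new-window` assumes a tmux session is already running.

```python
from pathlib import PurePosixPath


def setup_commands(repo, stream, claude_bin="claude"):
    """Build (not run) the commands for one work stream."""
    # Sibling directory next to the repo; the naming scheme is an assumption.
    worktree = repo.parent / f"{repo.name}-{stream}"
    # Branch naming is an assumption as well.
    branch = f"hardening/{stream}"
    add_worktree = [
        "git", "-C", str(repo), "worktree", "add", "-b", branch, str(worktree),
    ]
    launch = [
        "tmux", "new-window", "-n", stream, "-c", str(worktree),
        f"{claude_bin} --permission-mode auto",
    ]
    return add_worktree, launch


def pending_rows(issue_ids):
    """Pre-populated table rows, one per issue (hypothetical column layout)."""
    return [f"| {issue_id} | ⬜ Pending |" for issue_id in issue_ids]


add_worktree, launch = setup_commands(PurePosixPath("/work/myrepo"), "auth")
print(" ".join(add_worktree))
# git -C /work/myrepo worktree add -b hardening/auth /work/myrepo-auth
print(pending_rows(["F1", "F2"]))
# ['| F1 | ⬜ Pending |', '| F2 | ⬜ Pending |']
```

Passing a different `claude_bin` (for example `claude-work`) covers the case where the detected binary is not the default.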
Phase 4 — Fix
Each worktree session runs independently. It reads its HARDENING-TASK.md, enters plan mode to present its approach, and after user confirmation switches to auto mode. For each issue it applies the fix, updates its assigned rows in the response docs, commits, and pushes. Once all issues are done, it runs the tests if a test command was provided, surfacing any failures to the user for resolution before the final commit.
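The ordering inside one session can be summarized in a short sketch. The callbacks (`apply_fix`, `mark_done`, `commit_and_push`, `run_tests`) are hypothetical placeholders for work the session does itself; only the sequence is taken from the description above.

```python
def run_work_stream(issues, apply_fix, mark_done, commit_and_push, run_tests=None):
    """Fix issues one at a time, then run tests once at the end if configured."""
    for issue in issues:
        apply_fix(issue)        # change the code
        mark_done(issue)        # update this issue's rows in the response docs
        commit_and_push(issue)  # one commit and push per issue
    if run_tests is None:
        return []               # no test command was provided
    return run_tests()          # list of failures to surface to the user


# Record the order of operations with stub callbacks.
log = []
failures = run_work_stream(
    ["F1", "F2"],
    apply_fix=lambda issue: log.append(f"fix {issue}"),
    mark_done=lambda issue: log.append(f"doc {issue}"),
    commit_and_push=lambda issue: log.append(f"push {issue}"),
    run_tests=lambda: log.append("tests") or [],
)
print(log)
# ['fix F1', 'doc F1', 'push F1', 'fix F2', 'doc F2', 'push F2', 'tests']
```

Committing per issue keeps each fix individually reviewable and revertible, while running tests once at the end avoids repeating a slow suite for every small change.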