By akshat4165
Per-project model policy for Claude Code: analyzes your repo, writes a routing policy for which Claude model plans, codes, and reviews, then proves the token savings.
Implement a planned change — write or edit code following an approved plan. Before invoking, read .claude/model-policy.json and pass the policy's phases.code.model as this invocation's model parameter; fall back to sonnet if no policy exists.
Read-only codebase exploration — find files, trace call paths, summarize how a subsystem works. Before invoking, read .claude/model-policy.json and pass the policy's phases.explore.model as this invocation's model parameter; fall back to haiku if no policy exists.
Architecture decisions, implementation planning, and breaking down complex features. Before invoking, read .claude/model-policy.json and pass the policy's phases.plan.model as this invocation's model parameter; fall back to opus if no policy exists.
Review a diff or recent changes for bugs, security issues, and correctness. Before invoking, read .claude/model-policy.json and pass the policy's phases.review.model as this invocation's model parameter — but if the diff touches paths matched by a policy escalation rule (e.g. auth/payment/infra globs), pass the escalated model instead.
Analyze this repository and write a per-project model routing policy (.claude/model-policy.json) deciding which Claude model tier handles planning, coding, review, and exploration. Use when the user asks to calibrate the project, optimize token spend, or pick models for this repo.
Report token spend for this project and estimate savings achieved by the model policy versus an all-top-tier baseline. Use when the user asks how many tokens the policy saved, for a spend report, or to verify calibration is working.
Uses power tools
Uses Bash, Write, or Edit tools
Your Claude Code learns each project's cheapest safe model policy — and proves the savings.
Heavy Claude users burn tokens guessing which model to use: Opus to plan? Sonnet to code? Haiku to review? The right answer depends on the project — its size, risk surface, and test coverage. calibrate has Claude analyze the repo once, write an auditable per-project routing policy, route phase work through it, and report measured savings versus running everything on the top tier.
Calibrate two different repos and you get two different policies — because the analysis is grounded in each repo's actual risk and safety net. Real output from two projects:
| Phase | A security-sensitive Next.js app | A standard Python FastAPI + RAG backend |
|---|---|---|
| plan | opus — subtle partner-sync + Firestore-rules security, no tests to catch a bad design | sonnet — standard patterns, no architectural complexity worth Opus |
| code | sonnet — strong types + next build catch errors | sonnet — zero tests, so not cheaper than this |
| review | sonnet, escalates to opus on auth/**, partner/**, firestore.rules | sonnet, escalates to opus on main.py, rag.py, ingest/**, config.py |
| explore | haiku | haiku |
The planning tier dropped from Opus to Sonnet on the simpler project — that's the whole idea: Claude already knows how hard a project is, so let it decide.
On a verified end-to-end run in the Next.js app — routing checked against the session logs, not the model's self-report — the policy cost $1.48 vs $2.78 all-Opus: 46.8% saved. That run deliberately touched security-sensitive code, so 3 of 4 agents escalated to Opus; a task that doesn't hit flagged paths keeps code+review on Sonnet and saves more. So ~47% is roughly the floor, on a worst-case task. (Dollar figures are subscription quota-value, not cash.)
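The percentage follows directly from the two dollar figures quoted above. As a minimal sketch of that arithmetic (the function name `percent_saved` is illustrative and not part of the plugin):

```python
def percent_saved(policy_cost: float, baseline_cost: float) -> float:
    """Savings of the routed run relative to the all-top-tier baseline, in percent."""
    if baseline_cost <= 0:
        raise ValueError("baseline cost must be positive")
    return 100.0 * (1.0 - policy_cost / baseline_cost)


# Figures from the verified Next.js run: $1.48 with the policy, $2.78 all-Opus.
print(f"{percent_saved(1.48, 2.78):.1f}% saved")  # prints "46.8% saved"
```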
Honest caveat: that's a single verified data point so far. The second project's savings are estimated (50–65%) pending its own measured run. Calibrate reports theoretical numbers as theoretical and measured numbers as measured — never the two confused.
/calibrate:project - Claude profiles the repo (scale, stack, risk paths, test/CI safety net; interviews you instead if the repo is empty) and writes .claude/model-policy.json: a model tier per phase, each with a one-line evidence-based rationale, plus escalation rules for risky paths and repeated failures. Tier aliases (haiku/sonnet/opus/fable) auto-upgrade as Anthropic ships new model versions.
planner, coder, reviewer, explorer subagents run on the policy's model for their phase, escalating when a diff touches a flagged path. Switching happens at subagent and session boundaries, never mid-conversation, so the prompt cache survives.
/calibrate:savings - parses your local usage logs, compares actual spend against an all-top-tier baseline, and reports the delta alongside quality incidents (escalations, retries, test failures) so savings claims stay honest.
/plugin marketplace add akshat4165/calibrate
/plugin install calibrate@calibrate
For development: claude --plugin-dir ./calibrate. Then, inside any project: /calibrate:project.
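To make the routing step concrete, here is a minimal sketch of how a caller might resolve the model for a phase from .claude/model-policy.json. Only the `phases.<phase>.model` lookup and the plan/code/explore fallbacks come from the agent descriptions. The `escalation` key, its `paths`/`model` fields, the set of phases that escalate, and the review fallback are all assumptions made for illustration, and `fnmatchcase` only approximates glob semantics.

```python
import json
from fnmatch import fnmatchcase
from pathlib import Path

# Fallbacks stated in the agent descriptions. "review" has no stated
# fallback, so "sonnet" here is an assumption.
FALLBACK = {"plan": "opus", "code": "sonnet", "explore": "haiku", "review": "sonnet"}

# Assumption: only these phases escalate on flagged paths.
ESCALATING_PHASES = {"code", "review"}


def load_policy(repo_root="."):
    """Return the parsed policy, or None if the repo has not been calibrated."""
    path = Path(repo_root) / ".claude" / "model-policy.json"
    if not path.is_file():
        return None
    return json.loads(path.read_text())


def resolve_model(policy, phase, changed_paths=()):
    """Pick the model tier for a phase, escalating when a flagged path is touched."""
    if policy is None:
        return FALLBACK[phase]
    if phase in ESCALATING_PHASES:
        # Hypothetical schema: "escalation": [{"paths": [globs], "model": "opus"}]
        for rule in policy.get("escalation", []):
            for pattern in rule["paths"]:
                if any(fnmatchcase(p, pattern) for p in changed_paths):
                    return rule["model"]
    return policy["phases"][phase]["model"]
```

With a policy shaped like the Next.js column of the table, `resolve_model(policy, "review", ["auth/session.ts"])` would return the escalated tier, while a diff that touches only unflagged paths stays on the phase's own model.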
Honest status — this is early:
claude plugin validate passes.
.claude/settings.json cleanly sets what each new session starts on (see anthropics/claude-code#43326).
Known rough edges: /calibrate:savings needs read access to ~/.claude/projects/ (or ccusage) to compute actual dollars; a restricted session falls back to theoretical-only. Creating .claude/ in a fresh repo prompts for permission once.
npx claudepluginhub akshat4165/calibrate --plugin calibrate