From mina
Use this skill whenever errors occur related to rate limits (429), overload (529), 5-hour window exhaustion, "rate limited" or "waiting for capacity" messages, or whenever context exceeds 80%, cost on a single change exceeds $5, or the user explicitly asks about switching models or fallback. Defines the project's model routing tiers and the decision rules for when to switch.
How this skill is triggered — by the user, by Claude, or both
Slash command
/mina:model-fallbackThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
This project pre-declares a fallback chain so model switching is deterministic, not ad-hoc. Read `model-routing.json` at the project root (or `templates/model-routing.json.example` for defaults).
This project pre-declares a fallback chain so model switching is deterministic, not ad-hoc. Read model-routing.json at the project root (or templates/model-routing.json.example for defaults).
| Tier | Model | When |
|---|---|---|
| primary | claude-opus-4-7 | Default. Brainstorm, design, complex reasoning, architecture |
| cheap | claude-sonnet-4-7 | Code execution from clear spec, GSD subagents, refactors |
| emergency | claude-haiku-4-5 | Title generation, simple lookups, when primary exhausted |
| backup_provider | API key (BYOK) or alt provider | When subscription 5-hour window exhausted |
| Trigger | Action |
|---|---|
| 429 rate_limit_error | Subscription users: wait or invoke /model-switch emergency. API users: check retry-after header, switch to next tier if wait > 5 min |
| 529 overloaded_error | Exponential backoff (1s, 2s, 4s, 8s); after 4 retries, switch to backup_provider |
| 5-hour window exhausted | /model-switch backup_provider if BYOK configured, else wait for reset |
| Cost on change > config.cost_cap_usd (default $5) | Suggest /model-switch cheap for remaining tasks on this change |
| Context > 80% | Run /clear or /compact BEFORE switching model (model swap doesn't help context) |
| Subagent spawn (>3 parallel) | Force cheap tier per subagent — Opus × N subagents = catastrophic cost |
| Trigger | Action |
|---|---|
Spec-driven phase = execute and spec has clear AC | Recommend cheap tier (Opus quality not needed for mechanical implementation) |
| Task type = title, summary, status update | Recommend emergency tier |
| Task type = security, payments, data migration | Stay on primary even if cost is high. Quality > cost here |
✅ Knows the rules, recommends, surfaces tradeoffs
✅ Reads model-routing.json to honor team config
✅ Updates .mina/state.json with recommended_model field
✅ Flags in statusline when active model ≠ recommended
❌ Cannot programmatically force Claude Code to switch models mid-session — user runs /model or /model-switch
❌ Cannot intercept API errors before they surface (no PreLLMCall hook in Claude Code yet)
❌ Cannot guarantee fallback succeeds — provider outages are real
Each GSD subagent runs Opus by default = each consumes its own 200k context at $15/M input + $75/M output. A 5-subagent wave on a non-trivial phase: easily $20-50. Reports from the community: 49-subagent run = $8k-15k, 23-subagent multi-day = $47k.
Always cap subagent model to cheap tier in model-routing.json unless the task genuinely requires Opus reasoning (architecture review, novel algorithm design).
/model-route — show current model, recommended model, why/model-switch <tier> — switch to a named tier with confirmation/cost — see if cost threshold is approached on current session/mina:token-report <change> — see total spend on current changeNo native fallback chain. model field in settings.json sets default; /model overrides for session; ANTHROPIC_MODEL env for launch-time override. When 429/529 hits, you must manually switch.
Has provider abstraction. model field supports a single model; small_model for cheap tasks. Multi-model fallback is an open feature request (#7602, #8687) — for now, use a gateway (LiteLLM, OpenRouter) as the provider, configured to handle fallback upstream.
Both runtimes support pointing at API key instead of subscription. This bypasses 5-hour windows but bills per-token. Useful as backup_provider tier. Set up via ANTHROPIC_API_KEY env var or in runtime config.
Guides creation, editing, and verification of skills for AI coding agents using test-driven development with subagent scenarios. Use when authoring or debugging skills.
npx claudepluginhub astronaut1712/astronaut-ai --plugin mina