From ax
Routes expensive model work to cheaper subagents for codebase-heavy or token-heavy tasks, with cost measurement and verification via ax.
How this skill is triggered — by the user, by Claude, or both
Slash command
/ax:efficient-dispatchThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
The main model is the orchestrator and Q&A reviewer. Mechanical work runs on
The main model is the orchestrator and Q&A reviewer. Mechanical work runs on cheaper models - and unlike guidance-only approaches, every claim here is checkable against your own ax graph.
Two axes. First, main model vs subagent: the main model orchestrates and reviews; mechanical work goes to subagents. Second, and the one that actually controls spend - the tier of each subagent dispatch:
model: sonnet (or haiku for pure
search/locate, per the table).model: opus/fable explicitly. Review is the
catch-rate gate; a cheap reviewer misses real bugs.Get this backwards and you pay twice: in one ax session implementers ran on the
expensive inherited model while reviewers were sent to a cheap one - ~$130 over,
weaker catch rate, three fix rounds (memory feedback-review-gets-strong-model).
The default-inherit trap is implementers, not reviewers: a forgotten model: on
an implement … dispatch silently runs expensive. Set it.
Main model keeps (never dispatched at all): decomposition, architecture and product tradeoffs, plan synthesis, judging conflicting subagent reports, final integration, taste-heavy design/copy.
Cost-tier is one reason to dispatch. The other is context isolation - and it applies even when the work needs the strong model. A large input read into the main thread does not cost once: it sits in the context window and is re-sent as input on every later turn. A 0.5 MB screenshot Read on turn 5 of a 40-turn session is re-billed ~35 times and crowds out earlier reasoning.
The biggest offender is images. Reading screenshots for visual judgment (does this match the spec? rate this design, find the visual bug) floods the main context with vision tokens that persist for the rest of the session. Route it:
Same logic applies to any bulky tool output you only need a conclusion from: giant logs, large query dumps, full-file reads for one fact. If you need the answer, not the bytes, dispatch for it.
Source of truth: ~/.ax/hooks/routing-table.json (regenerate with
ax dispatches compile-routing). Consult it when present; these built-ins
mirror it:
| class | description pattern | model |
|---|---|---|
| spec-review | ^spec review | sonnet |
| search-locate | ^(pattern-find|locate|find|map|sweep|grep) | haiku |
| research | ^(research|investigate docs|study) | sonnet |
| well-specified-impl | ^implement | sonnet |
| bulk-mechanical | ^(write announcements|regenerate|standardize|merge main) | sonnet |
| task-N-impl | ^Task \d+: | sonnet |
| bug-fix | ^Fix\s | sonnet |
| feature-add | ^Add\s | sonnet |
| agent types | Explore, codebase-locator, codebase-pattern-finder → haiku; codebase-analyzer → sonnet |
Anything unmatched: leave the model unset only if the work genuinely needs main-model judgment - otherwise pick sonnet.
model: explicitly on every mechanical dispatch. The route-dispatch
hook is quota-aware and ADVISORY (Claude Code hooks cannot enforce model on
subagent dispatches - they can only inject context via additionalContext):
in conserve mode it advises re-dispatching a forgotten mechanical dispatch
with model:<cheaper>; near a 7d quota reset (splurge) it stays quiet so
work runs on the strong inherited model; it advises when judgment work
(review/design/audit) is sent on a cheap model. Real enforcement is your
discipline + setting model: explicitly on every dispatch.
Treat the advisory as a re-dispatch signal, not noise..claude/workflows/*.js) run sandboxed and cannot
import ax code. Set model: on every agent(...) call by hand, per
ax routing show: mechanical stages → model: 'sonnet'; judgment/review
stages → keep the strong model. routing-tune.workflow.js is the reference.
In-tree Effect/axctl code that dispatches should call resolveDispatchModel
(from @ax/hooks-sdk) instead of hardcoding.ax dispatches --days=7 - your inherit rate (target: explicit model on all
mechanical classes)ax dispatches --candidates - missed routings + est savings, repriced from
real token bucketsax cost split --days=7 - main vs subagent spend by model; the dominant
cost is usually main-loop cache reads, so move tool-heavy loops (build/test
cycles, browser QA) into subagents entirelyax cost images --days=7 - image-read context per session, main vs subagent.
High main-thread MB = screenshots persisting in the main window; route that
visual judgment to a subagent (see "Isolate heavy context" above)ax improve recommend - surfaces a routing proposal automatically when
missed savings accumulateAfter adopting this skill, compare windows: ax cost split + inherit rate
before vs after. If the inherit rate doesn't drop, the routing isn't
happening - check ax hooks backtest ~/.ax/hooks/route-dispatch.ts --days=7
and whether dispatches are bypassing the table.
npx claudepluginhub necmttn/axRoutes Claude Code tasks to optimal models (Haiku, Sonnet, Opus) using decision matrices, cost tables, complexity signals, and subagent assignments for cost/quality tradeoffs.
Routes tasks to Haiku, Sonnet, or Opus based on complexity to optimize cost and quality. Use for intelligent model selection and tiered routing.
Smart Routing policy that reduces cost by spending strong reasoning only where it changes the outcome. Batches, bounds, and delegates tasks based on risk signals.