From chuzom
Routes tasks to the cheapest capable model via llm-router MCP tools (Ollama, Codex, paid APIs in free-first order). Maps task types to cost-optimized calls.
Slash command
/chuzom:routing
Route tasks to the cheapest capable model automatically using llm-router MCP tools.
Before answering a research, code, writing, or analysis task, call the appropriate llm-router tool instead of answering directly. The router picks the cheapest model that can handle the task, in free-first order: Ollama → Codex → paid APIs.
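The free-first selection can be pictured roughly as follows. This is only a sketch of the idea, not llm-router's actual implementation: the `Backend` class, the `pick_backend` function, the relative costs, and the capability checks are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Backend:
    name: str
    cost_per_call: float               # hypothetical relative cost; 0.0 means free
    can_handle: Callable[[str], bool]  # hypothetical capability check


def pick_backend(task_type: str, backends: List[Backend]) -> Backend:
    """Return the cheapest backend that reports it can handle the task."""
    # sorted() is stable, so backends with equal cost keep their listed order.
    for backend in sorted(backends, key=lambda b: b.cost_per_call):
        if backend.can_handle(task_type):
            return backend
    raise LookupError(f"no backend can handle {task_type!r}")


# Illustrative chain in the free-first order from the text.
# The capability sets below are made up for the example.
BACKENDS = [
    Backend("ollama", 0.0, lambda t: t in {"query", "code"}),
    Backend("codex", 0.0, lambda t: t == "code"),
    Backend("paid-api", 1.0, lambda t: True),
]

print(pick_backend("code", BACKENDS).name)      # ollama
print(pick_backend("research", BACKENDS).name)  # paid-api
```

Free backends are tried first, and a paid API is reached only when nothing cheaper claims the task.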
| Task type | Tool to call | Why |
|---|---|---|
| Simple factual question | llm_query | Gemini Flash / Groq — 50× cheaper than o3 |
| Research / current events | llm_research | Perplexity (web-grounded, not stale) |
| Writing / summaries / brainstorm | llm_generate | Gemini Flash / Haiku |
| Deep analysis / debugging | llm_analyze | GPT-4o / Gemini Pro |
| Code generation / refactoring | llm_code | Ollama → Codex built-in → o3 |
| Don't know which type | llm_auto | Auto-classifies + routes, tracks savings |
llm_query(prompt="What is the capital of France?")
llm_code(prompt="Refactor this function to use async/await", complexity="moderate")
llm_research(prompt="What changed in Python 3.13?")
llm_auto(prompt="<the full user request>") # safest default
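The table and the example calls amount to a lookup from task type to tool name, with llm_auto as the fallback. A minimal sketch, assuming the caller has already labeled the task: the tool names come from the table, while the task-type keys and the `choose_tool` helper are hypothetical.

```python
from typing import Optional

# Tool names are taken from the table; the dictionary keys are hypothetical labels.
TOOL_FOR_TASK = {
    "factual": "llm_query",
    "research": "llm_research",
    "writing": "llm_generate",
    "analysis": "llm_analyze",
    "code": "llm_code",
}


def choose_tool(task_type: Optional[str]) -> str:
    """Return the tool name for a task type, defaulting to llm_auto when unsure."""
    return TOOL_FOR_TASK.get(task_type, "llm_auto")


print(choose_tool("research"))  # llm_research
print(choose_tool(None))        # llm_auto
```

Falling back to llm_auto for unknown or missing task types mirrors the "safest default" note on the last example call.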
Routing simple tasks to Gemini Flash instead of o3 cuts cost by roughly 50–100×.
llm_auto shows cumulative savings every 5 calls automatically.
Run llm_savings anytime to see your totals.
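To make the savings arithmetic concrete, here is a rough sketch of a tracker that reports a running total every 5 calls. It does not reflect how llm_auto or llm_savings are implemented; the `SavingsTracker` class and the per-call prices (50 cents baseline, 1 cent routed, a 50× ratio) are hypothetical.

```python
from typing import Optional


class SavingsTracker:
    """Accumulates per-call savings in integer cents and reports periodically."""

    def __init__(self, report_every: int = 5) -> None:
        self.report_every = report_every
        self.calls = 0
        self.saved_cents = 0

    def record(self, baseline_cents: int, actual_cents: int) -> Optional[str]:
        """Record one call; return a summary line on every Nth call, else None."""
        self.calls += 1
        self.saved_cents += baseline_cents - actual_cents
        if self.calls % self.report_every == 0:
            return f"{self.calls} calls, ${self.saved_cents / 100:.2f} saved so far"
        return None


tracker = SavingsTracker()
# Hypothetical prices: 50 cents on the expensive model vs 1 cent on the cheap one.
reports = [tracker.record(baseline_cents=50, actual_cents=1) for _ in range(5)]
print(reports[-1])  # 5 calls, $2.45 saved so far
```

Each call saves 49 cents under these assumed prices, so the fifth call reports 5 × 49 = 245 cents.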
npx claudepluginhub chuzom/chuzom --plugin chuzom