From multillm
Route work through the local MultiLLM gateway and decide when to ask other LLMs or helper agents for support. Use when Codex should leverage Claude, OCA, GPT, local models, or MultiLLM specialist agents for second opinions, architecture review, security review, context handoff, dashboard checks, or multi-device session consolidation.
Slash command: /multillm:llm-orchestrator
Use MultiLLM as the control plane for cross-model work instead of treating other models as ad hoc side conversations.
- The gateway is at http://localhost:8080 unless the environment says otherwise.
- MULTILLM_HOME is the intended consolidation mechanism.

Invoke the orchestrator proactively; don't wait for the user to ask. Detect the task phase and route automatically.
Use the narrowest tool that matches the task:
| Need | Tool | Agent |
|---|---|---|
| Direct question to another model | llm_ask | — |
| Moderate-risk implementation | llm_second_opinion | work-orchestrator |
| Architecture, migration, tradeoffs | llm_council | arch-council |
| Code quality review | llm_second_opinion | code-reviewer |
| Security-sensitive changes | llm_second_opinion | security-reviewer |
| Complex task decomposition | llm_council | task-planner |
| Large file comprehension | llm_summarize_cheap | local-summarizer |
| Cross-session handoff | llm_share_context | work-orchestrator |
| Usage, costs, dashboard | llm_usage | — |
| Settings changes | llm_settings_get/set | — |
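
The routing table above can be read as a simple lookup. The following is a minimal sketch, not MultiLLM's own code: the dictionary keys and the `pick_route` helper are hypothetical names chosen for illustration, while the tool and agent names are copied from the table. The settings row is left out because it covers two tools, and falling back to `llm_ask` for an unknown need is an assumption.

```python
from typing import NamedTuple, Optional


class Route(NamedTuple):
    tool: str
    agent: Optional[str]  # None where the table shows no agent


# Keys are hypothetical labels; tool and agent names come from the table.
ROUTES = {
    "direct_question": Route("llm_ask", None),
    "moderate_risk_implementation": Route("llm_second_opinion", "work-orchestrator"),
    "architecture": Route("llm_council", "arch-council"),
    "code_review": Route("llm_second_opinion", "code-reviewer"),
    "security": Route("llm_second_opinion", "security-reviewer"),
    "task_decomposition": Route("llm_council", "task-planner"),
    "large_file": Route("llm_summarize_cheap", "local-summarizer"),
    "handoff": Route("llm_share_context", "work-orchestrator"),
    "usage": Route("llm_usage", None),
}


def pick_route(need: str) -> Route:
    """Return the table entry for a need; unknown needs fall back to llm_ask (assumption)."""
    return ROUTES.get(need, ROUTES["direct_question"])
```

For example, `pick_route("security")` yields the `llm_second_opinion` tool paired with the `security-reviewer` agent.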

Architecture, migration, tradeoffs (arch-council):

1. State the question precisely
2. Search shared memory for prior decisions on this topic
3. Call llm_council with 3-4 models
4. Synthesize consensus and diverging views
5. Store the decision to shared memory
6. Present recommendation with confidence level
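
Steps 4 and 6 above (synthesizing consensus and reporting a confidence level) might look like the sketch below. It assumes each model's answer has already been reduced to a short comparable label, which real free-text council output would not be. The `synthesize` function and the model names in the usage note are hypothetical, and the return format of `llm_council` is not specified here.

```python
from collections import Counter


def synthesize(answers: dict[str, str]) -> dict:
    """Summarize council answers: majority view, dissenters, and a confidence score."""
    if not answers:
        raise ValueError("no council answers to synthesize")
    counts = Counter(answers.values())
    # most_common(1) returns the single most frequent view and its vote count.
    consensus, votes = counts.most_common(1)[0]
    dissent = {model: view for model, view in answers.items() if view != consensus}
    return {
        "consensus": consensus,
        "dissent": dissent,
        # Confidence here is simply the share of models that agree (an assumption).
        "confidence": votes / len(answers),
    }
```

Calling `synthesize({"model-a": "event sourcing", "model-b": "event sourcing", "model-c": "CRUD"})` reports "event sourcing" as the consensus and lists `model-c` under `dissent`.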

Security-sensitive changes (security-reviewer):

1. Read the changed files
2. Identify security-relevant patterns (auth, crypto, input handling, secrets)
3. Call llm_second_opinion with security focus using GPT-4o
4. Merge both analyses
5. Store findings to shared memory
6. Present PASS/WARN/FAIL verdict
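
As a rough illustration of steps 2, 4, and 6 above, the sketch below flags security-relevant categories with keyword patterns and merges two verdicts by taking the more severe one. The patterns are deliberately simplistic examples, the "worst verdict wins" rule is an assumption, and `flag_categories` and `merge_verdict` are hypothetical helpers rather than part of MultiLLM.

```python
import re

# Illustrative patterns only; a real review needs far more than keyword matching.
PATTERNS = {
    "auth": re.compile(r"\b(login|password|token|session)\b", re.I),
    "crypto": re.compile(r"\b(md5|sha1|aes|rsa|hmac)\b", re.I),
    "input handling": re.compile(r"\b(eval|exec|subprocess|pickle)\b", re.I),
    "secrets": re.compile(r"(api[_-]?key|secret)\s*=", re.I),
}

SEVERITY = {"PASS": 0, "WARN": 1, "FAIL": 2}


def flag_categories(source: str) -> set[str]:
    """Return the categories whose pattern appears anywhere in the source text."""
    return {name for name, pattern in PATTERNS.items() if pattern.search(source)}


def merge_verdict(own: str, second: str) -> str:
    """Merge two verdicts conservatively: the more severe one wins (assumption)."""
    return max(own, second, key=SEVERITY.__getitem__)
```

The flagged categories could then be used to focus the prompt passed to `llm_second_opinion`.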

Code quality review (code-reviewer):

1. Read the code under review
2. Analyze correctness, design, performance, error handling
3. Call llm_second_opinion for cross-family perspective
4. Compare findings — flag agreements and disagreements
5. Store significant findings to shared memory
6. Present structured review with Accept/Request Changes verdict
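
Step 4 above (flagging agreements and disagreements) reduces to set operations once each reviewer's findings are expressed as comparable labels. The sketch below assumes that normalization has already happened. `compare_findings` and `review_verdict` are hypothetical helpers, and treating any blocking finding as grounds for "Request Changes" is an assumption.

```python
def compare_findings(own: set[str], other: set[str]) -> dict[str, set[str]]:
    """Split findings into those both reviewers raised and those only one raised."""
    return {
        "agreed": own & other,
        "only_own": own - other,
        "only_other": other - own,
    }


def review_verdict(findings: set[str], blocking: set[str]) -> str:
    """Request changes if any finding is in the blocking set; otherwise accept."""
    return "Request Changes" if findings & blocking else "Accept"
```

Findings that land in `only_own` or `only_other` are the disagreements worth calling out in the structured review.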

Complex task decomposition (task-planner):

1. Parse the objective and constraints
2. Search memory for related prior work
3. Decompose into 3-7 subtasks with model assignments
4. Call llm_council to validate the plan
5. Store the plan to shared memory
6. Present with execution order and dependencies
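
The execution order in step 6 above follows from the dependencies via a topological sort. This sketch uses Python's standard `graphlib` module and covers only ordering, not model assignment. The example plan and the `execution_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical plan: each subtask maps to the set of subtasks it depends on.
plan = {
    "design schema": set(),
    "write migration": {"design schema"},
    "update API": {"design schema"},
    "integration tests": {"write migration", "update API"},
}


def execution_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return subtasks ordered so that every dependency comes first.

    Raises graphlib.CycleError if the dependencies are circular.
    """
    if not 3 <= len(dependencies) <= 7:
        raise ValueError("expected 3-7 subtasks")
    return list(TopologicalSorter(dependencies).static_order())
```

With the plan above, "design schema" is always first and "integration tests" always last. The relative order of the two middle subtasks is not guaranteed, since neither depends on the other.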

Cross-session handoff (work-orchestrator):

1. Summarize current working context (what was done, what's next, decisions made)
2. Search memory for any related prior context
3. Call llm_share_context with structured summary
4. Confirm the context is retrievable
5. Tell the user how to resume in the other session
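
One way to build the structured summary from step 1 above is a small record with a field for each part: what was done, what's next, and decisions made. The `Handoff` class and its JSON layout are assumptions for illustration, since the payload format that `llm_share_context` expects is not specified here.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Handoff:
    """Hypothetical handoff record mirroring the three parts of the summary."""
    done: list[str] = field(default_factory=list)
    next_steps: list[str] = field(default_factory=list)
    decisions: list[str] = field(default_factory=list)

    def to_payload(self) -> str:
        """Serialize the record to JSON text suitable for passing as context."""
        return json.dumps(asdict(self), indent=2)
```

Parsing the payload back with `json.loads` is a cheap local check that the summary survives a round trip before confirming it is retrievable from the other session.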
After every significant orchestration action, store a memory:

    llm_memory_store(
        title="[decision|finding|plan]: short description",
        content="Detailed content with model consensus...",
        category="decision",  # or: finding, context, todo
        project="auto-detect from cwd",
        source_llm="claude"
    )

This ensures continuity across sessions, models, and devices.
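
The arguments for the memory call above can be assembled by a small helper. This is a sketch under assumptions: `memory_record` is a hypothetical function, and deriving the project from the working directory's name is only one plausible reading of "auto-detect from cwd". The title prefixes and categories are the ones listed in the template.

```python
from pathlib import Path
from typing import Optional

TITLE_KINDS = {"decision", "finding", "plan"}
CATEGORIES = {"decision", "finding", "context", "todo"}


def memory_record(kind: str, summary: str, content: str, category: str,
                  source_llm: str = "claude", cwd: Optional[str] = None) -> dict:
    """Build keyword arguments matching the llm_memory_store template."""
    if kind not in TITLE_KINDS:
        raise ValueError(f"unknown title kind: {kind}")
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    # Assumption: the project is the name of the current working directory.
    project = Path(cwd if cwd is not None else Path.cwd()).name
    return {
        "title": f"{kind}: {summary}",
        "content": content,
        "category": category,
        "project": project,
        "source_llm": source_llm,
    }
```

The resulting dictionary has the same five fields as the template: `title`, `content`, `category`, `project`, and `source_llm`.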
When the user wants one dashboard across machines:
MULTILLM_HOME.

- agents/work-orchestrator.md: Auto-routing with phase detection and checkpoint discipline
- agents/task-planner.md: Task decomposition with model assignment
- agents/code-reviewer.md: Multi-perspective code quality review
- agents/security-reviewer.md: Security-focused review with GPT-4o second opinion
- agents/arch-council.md: 3-4 model council for architecture decisions
- agents/local-summarizer.md: Token-efficient summarization via local models
- commands/llm-usage.md and commands/llm-usage-hourly.md for dashboard-oriented usage summaries
- CLAUDE.md for the runtime architecture, API, and gateway behavior

npx claudepluginhub adibirzu/multillm