From amux
Use when user explicitly requests deep research or comprehensive analysis across many authoritative sources. Creates an agent team for parallel research, gated on answer sufficiency (not a source count), with confidence tracking and structured synthesis. NOT for simple questions answerable with a single search.
How this skill is triggered — by the user, by Claude, or both
Slash command
/amux:researchThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
<objective>
Core principle: Decompose questions, research in parallel with an agent team, evaluate confidence, iterate until sufficient, synthesize with source attribution.
<success_criteria>
Done when: state.json phase is "DONE", every question is answered at medium+ confidence (or remaining gaps are documented as limitations), and report.md synthesizes findings with source attribution, conflicts, gaps, and limitations. Sufficiency — not a source count — is the bar.
</success_criteria>
<when_to_use> Use when the user explicitly asks for deep research / comprehensive analysis needing multiple authoritative sources, confidence tracking, and source attribution.
Don't use for simple factual questions a single search answers, or topics too narrow for an 8-question decomposition. </when_to_use>
<required_tools>
| Tool / Feature | Purpose | Required |
|---|---|---|
WebSearch | Search queries (built-in) | Yes |
| Agent teams | Spawn parallel researcher teammates | Yes |
firecrawl-mcp:firecrawl_scrape | Scrape full page content (preferred) | No |
WebFetch | Fetch page content (built-in fallback) | Fallback |
Prerequisite: Agent teams must be enabled (CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 in settings or environment).
Tool Selection: In INIT phase, check if firecrawl-mcp:firecrawl_scrape is available. If not, use WebFetch (built-in). Record choice in state.json as "scraper": "firecrawl" or "scraper": "webfetch".
Tradeoffs:
firecrawl-mcp:firecrawl_scrape: Better content extraction, handles JS-rendered pagesWebFetch: Always available, sufficient for static pages
</required_tools><state_machine>
INIT → DECOMPOSE → RESEARCH → EVALUATE → [RESEARCH or SYNTHESIZE] → DONE
State File: research/{slug}/state.json
{
"topic": "string",
"phase": "INIT|DECOMPOSE|RESEARCH|EVALUATE|SYNTHESIZE|DONE",
"iteration": 0,
"maxIterations": 5,
"targetSources": 30,
"sourcesGathered": 0,
"totalSearches": 0,
"teammateCompletions": 0,
"codexCompletions": 0,
"findingsCount": 0,
"startTime": "ISO-8601 timestamp",
"scraper": "firecrawl|webfetch",
"questions": [{"id": 1, "text": "...", "status": "pending|done", "confidence": null}]
}
Rule: Read state.json before acting. Write state.json after acting.
</state_machine>
<task_loop>
A generic task loop hook prevents the session from ending while task-loop.json has complete: false, so research isn't abandoned mid-flight.
How it works:
research/{slug}/task-loop.jsoncomplete is false, exit is blocked and continuationPrompt is re-injectedcomplete: true, exit is allowed and completionMessage is displayedYou manage task-loop.json alongside state.json. Update statusMessage and continuationPrompt as progress changes. Set complete: true when EVALUATE's sufficiency gate passes or the iteration ceiling is hit — both are legitimate exits, so the loop always terminates.
</task_loop>
<state_recovery> On skill invocation, first check for existing state:
If research/{slug}/state.json exists:
phaseVerify state consistency before resuming:
findings.json has dataIf inconsistent, offer user choice:
Detect available scraper:
firecrawl-mcp:firecrawl_scrape tool exists"scraper": "firecrawl""scraper": "webfetch" (uses built-in WebFetch)Create working directory:
mkdir -p research/{slug}
Set targetSources as a rough expectation for the topic's breadth (narrow ~20, standard ~30, broad ~40) and maxIterations (default 5). These are signals/backstops, not the completion bar — sufficiency is.
Initialize state files:
state.json: use the schema from <state_machine> with phase: "DECOMPOSE", counters at 0, startTime set, questions: [].
task-loop.json (activates the generic task loop hook):
{
"active": true,
"complete": false,
"continuationPrompt": "Continue researching: {topic}. Check research/{slug}/state.json and continue the RESEARCH phase.",
"statusMessage": "Research in progress: {topic}",
"completionMessage": "Research complete."
}
findings.json:
[]
Common angles to draw from (a menu, not a checklist): definition/background, current state, key entities, core mechanisms, evidence and data, criticisms and limitations, comparisons to alternatives, future developments. Add others the specific topic demands.
Add questions to state.json with status="pending". Set phase="RESEARCH".
Claude teammates: One per pending question (up to 8 at a time). Each works independently with its own context window. Each teammate also calls the codex MCP tool to get Codex's perspective on the same question, providing genuine cross-validation — two engines may surface different sources and perspectives.
Read scraper from state.json. Spawn each Claude teammate with these instructions, substituting {SCRAPER} = firecrawl-mcp:firecrawl_scrape or WebFetch accordingly:
<teammate_instructions>
You are a researcher teammate with WebSearch and {SCRAPER}.
TASK: {QUESTION}
Run enough searches to answer the question well, then scrape the best sources with {SCRAPER} ("Extract main content and key facts") and extract specific facts with their sources. Continue if a scrape fails. Use your judgment on how many searches and sources are enough.
Quality guide (favour higher tiers): Tier 1 — .gov, .edu, journals, official docs · Tier 2 — Reuters, AP, BBC, industry pubs · Tier 3 — company blogs, Wikipedia · Skip — forums, social media, SEO spam. </teammate_instructions>
<teammate_codex_crossvalidation>
After web research, call the codex MCP tool (model: gpt-5-codex, sandbox: read-only) with prompt: "Research this question: {QUESTION}. Return JSON findings with fields: fact, sourceNote, confidence (high/medium/low). Focus on facts confirmable from training data."
Treat codex as unavailable if the call throws/times out, or returns empty/non-JSON/MCP-error text (e.g. "Codex CLI Not Found") — then return Claude-only findings. If valid JSON, merge per question:
"status": "hypothesis" (no web citation).<teammate_return_format> RETURN ONLY THIS JSON:
{
"questionId": {ID},
"questionText": "{QUESTION}",
"searchQueries": ["query1", "query2", "query3", "query4"],
"searchesRun": 4,
"urlsScraped": 4,
"scrapeFailures": [],
"findings": [
{
"fact": "...",
"sourceUrl": "...",
"tier": 1,
"crossValidated": false,
"engines": ["claude"],
"status": "confirmed|hypothesis|disputed"
}
],
"gaps": ["what you couldn't find"],
"contradictions": ["X says A, Y says B"],
"confidence": "high|medium|low",
"confidenceReason": "...",
"codexAvailable": true
}
</teammate_return_format>
After each Claude teammate completes:
findings.jsonstate.json:
totalSearches by searchesRun from responseteammateCompletions by 1sourcesGathered by urlsScraped from responsefindingsCount by length of findings array from responsecodexAvailable is true, increment codexCompletions by 1"Sources: {sourcesGathered}/{targetSources}"After all teammates complete:
phase="EVALUATE"
Sufficiency gate → SYNTHESIZE when both hold:
medium confidence or better (high=3, medium=2, low=1).sourcesGathered vs targetSources is a sanity signal, not a gate: if you'd synthesize with far fewer sources than expected, double-check you haven't stopped short; if you're well past target but still thin, keep going. Don't pad to hit a number.
Otherwise → RESEARCH another round, unless the iteration ceiling (maxIterations, default 5) is reached — then SYNTHESIZE with the remaining gaps documented as limitations (futility exit; never loop forever).
If continuing to RESEARCH:
status="pending"; increment iteration; set phase="RESEARCH"task-loop.json: set statusMessage and continuationPrompt to the specific open questions"Continuing research: {N} questions still below confidence / {M} open gaps"Set complete: true in task-loop.json only when the sufficiency gate passes or the iteration ceiling is hit.
# {Topic}
## Executive Summary
[300-400 words. Most important finding first. State confidence. Note caveats.]
## Background
[200 words. Key terms. Context.]
## Key Findings
### [Theme 1]
[Grouped findings. Inline citations. Note source strength.]
### [Theme 2]
[3-5 themes total]
## Conflicting Information
[Both sides. Which has better sourcing.]
## Gaps & Limitations
[What's unknown. What needs more research.]
## Source Assessment
- **High confidence:** [claims with 3+ quality sources]
- **Medium confidence:** [claims with 1-2 sources]
- **Low confidence:** [single source or Tier 3 only]
## Sources
### Primary
[Tier 1 sources with URLs]
### Secondary
[Tier 2-3 sources with URLs]
---
*Sources: {sourcesGathered} | Searches: {totalSearches} | Teammates: {teammateCompletions} | Iterations: {iteration} | Duration: {duration} | Date: {date}*
Set phase="DONE".
Update task-loop.json:
{
"active": true,
"complete": true,
"completionMessage": "Research complete: \"{topic}\"\n\nResources used:\n Searches: {totalSearches}\n Sources: {sourcesGathered}/{targetSources}\n Teammates: {teammateCompletions}\n Iterations: {iteration}\n\nReport: research/{slug}/report.md"
}
The task loop hook will display this message when the session exits.
<error_handling>
| Error | Action |
|---|---|
| Malformed JSON | Retry once, then mark low confidence |
| Scrape fails | Continue with other URLs |
| Rate limit | Wait 60s, reduce batch to 2 |
| No results | Mark low confidence, rephrase as follow-up |
| Tool not found | Fall back to WebFetch, update state.json |
codex MCP unavailable, empty, or error-text response | Teammate returns Claude-only findings, research continues |
| </error_handling> |
Searches per teammate, URLs scraped, and follow-ups per iteration are the agent's discretion. Completion is governed by the sufficiency gate (EVALUATE), bounded by maxIterations.
<red_flags>
Stop on sufficiency, not a number: synthesize once every question is answered at medium+ confidence with gaps documented — don't pad sources to hit a target, and don't quit while questions are still thin (unless maxIterations is reached). Weight Tier 1 sources higher. If stuck, narrow scope or generate sharper follow-ups.
</red_flags>
Creates, edits, and optimizes skills for Claude Code, including drafting, evaluating with test prompts, iterating on performance, and improving skill descriptions for better triggering accuracy.
npx claudepluginhub guyathomas/workflows --plugin amux