From deep-researcher
Perform multi-dimensional deep research using parallel subagents. Combines web and codebase analysis into synthesized reports.
Slash command: /deep-researcher:deep-research
This skill uses file-system checkpoints under `${CLAUDE_PLUGIN_DATA}/checkpoints/`. The orchestrator creates a JSON checkpoint file at skill start and updates it after each research phase (dimensions confirmed, research complete, synthesis complete). On completion, set `status: "completed"`. On error, set `status: "error"`.
Supporting files:
- pipeline-steps/appendix.md
- pipeline-steps/coherence.md
- pipeline-steps/enrichment.md
- pipeline-steps/executive-summary.md
- pipeline-steps/grouping.md
- pipeline-steps/peer-review.md
- pipeline-steps/resume.md
- pipeline-steps/shared-citations-algorithm.md
- scripts/citations-resolve.ts
- scripts/tests/citations-resolve-cli.test.ts
- scripts/tests/citations-resolve.test.ts
- scripts/tests/fixtures/sample-topic/dim-01-bootstrap/findings.md
- scripts/tests/fixtures/sample-topic/dim-02-api/findings.md
- scripts/tests/fixtures/sample-topic/dim-03-data/findings.md
- scripts/tests/fixtures/sample-topic/groupings.json
The checkpoint file is ${CLAUDE_PLUGIN_DATA}/checkpoints/deep-researcher-<topic-slug>.json and follows this schema:
{
  "version": 1,
  "topic": "W3C annotation tools",
  "topic_slug": "w3c-annotation-tools",
  "status": "in_progress",
  "phase": 4,
  "phase_name": "Parallel Deep Dive",
  "depth": "standard",
  "format": "md",
  "dimensions": ["dim-01-tools-landscape", "dim-02-technical-architecture", "..."],
  "completed_dimensions": ["dim-01-tools-landscape"],
  "groupings_path": null,
  "synthesis_path": null,
  "created_at": "2026-05-06T16:00:00Z",
  "updated_at": "2026-05-06T16:30:00Z",
  "error": null
}
Status values: in_progress, paused, error, completed
- At skill start: create deep-researcher-<topic-slug>.json with status: in_progress, phase: 1.
- When paused: set status: paused, phase: 3.
- On error: set status: error and the error string.
- On completion: set status: completed.
- If an existing checkpoint is in_progress or paused, offer resume.
Researcher subagents write rolling checkpoints when cumulative read volume crosses 2000-line multiples. Path: ${CLAUDE_PLUGIN_DATA}/checkpoints/deep-researcher-<session-id>-checkpoint-<NNN>.md
Where <session-id> is the ISO timestamp from the subagent's scratch filename and <NNN> is a zero-padded increment (001, 002, ...).
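The rolling-checkpoint naming can be sketched as a small shell helper. This is a minimal sketch under assumptions: the function name rolling_checkpoint_path is hypothetical, and it assumes <NNN> equals the number of 2000-line multiples crossed so far, which the text implies but does not state.

```shell
# Hypothetical helper (not part of the skill): print the rolling checkpoint
# path for a cumulative line count, or nothing if no 2000-line multiple
# has been crossed yet.
# Args: <data-dir> <session-id> <cumulative-lines-read>
rolling_checkpoint_path() {
  data_dir=$1
  session_id=$2
  lines_read=$3
  n=$(( lines_read / 2000 ))   # assumed: NNN = multiples of 2000 crossed
  [ "$n" -ge 1 ] || return 0
  printf '%s/checkpoints/deep-researcher-%s-checkpoint-%03d.md\n' \
    "$data_dir" "$session_id" "$n"
}

# 4100 lines read means two multiples crossed, so the suffix is 002.
rolling_checkpoint_path /tmp/plugin-data 2026-05-06T16-00-00Z 4100
```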
Invoked with /deep-research followed by a topic and optional flags.
/deep-research <topic> [--depth quick|standard|deep] [--visualize]
- <topic> -- The research subject. Can be a technology, market, product category, architectural pattern, or problem space.
- --depth -- Research intensity tier. Default: standard.
  - quick: 3-5 dimensions, ~5 searches each, 1-2 page report
  - standard: 6-10 dimensions, ~10 searches each, 3-5 page report
  - deep: 10-15 dimensions, ~20 searches each, 8-12 page report
- --visualize -- After research completes, generate HTML report and open in browser. Chains to the visualize skill with the research output directory.
Capture ${ARGUMENTS} so the topic and flags are available throughout the workflow.
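Splitting the captured arguments into topic and flags might look like the sketch below. The function name parse_research_args and the variables TOPIC_TEXT, DEPTH, and VISUALIZE are hypothetical; the sketch assumes the arguments arrive already split on whitespace and handles only the two flags in the usage line.

```shell
# Hypothetical parser for: <topic> [--depth quick|standard|deep] [--visualize]
# Sets TOPIC_TEXT, DEPTH, VISUALIZE; returns 1 on a missing or invalid --depth.
parse_research_args() {
  TOPIC_TEXT=""
  DEPTH="standard"   # default tier per the usage line
  VISUALIZE=0
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --depth)
        case "${2:-}" in
          quick|standard|deep) DEPTH=$2 ;;
          *) echo "invalid --depth: ${2:-<missing>}" >&2; return 1 ;;
        esac
        shift 2 ;;
      --visualize) VISUALIZE=1; shift ;;
      *) TOPIC_TEXT="${TOPIC_TEXT:+$TOPIC_TEXT }$1"; shift ;;   # topic words
    esac
  done
}

parse_research_args W3C annotation tools --depth deep --visualize
echo "$TOPIC_TEXT | $DEPTH | $VISUALIZE"
```

Rejecting an unknown depth early keeps a typo from silently running the default tier.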
Gather pre-build intelligence across multiple analytical dimensions using parallel subagents. Each dimension is researched independently, then synthesized into a unified report with confidence tiers, conflict zones, and cross-dimensional insights.
Typical use cases:
When output_dir is configured in plugin settings, use <output_dir>/research/<topic-slug>/. Otherwise use ${CLAUDE_PLUGIN_DATA}/research/<topic-slug>/.
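The directory rule above reduces to a one-line fallback. A minimal sketch, assuming the configured output_dir is passed as an optional third argument; the function name research_dir is hypothetical.

```shell
# Hypothetical helper: resolve the research output directory.
# Args: <topic-slug> <plugin-data-dir> [configured-output-dir]
research_dir() {
  # ${3:-$2} falls back to the plugin data dir when output_dir is unset or empty
  printf '%s/research/%s/\n' "${3:-$2}" "$1"
}

research_dir w3c-annotation-tools /tmp/plugin-data /home/me/reports
research_dir w3c-annotation-tools /tmp/plugin-data
```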
All durable artifacts live under the research output directory:
- synthesis.md -- cross-verification and insight extraction
- groupings.json -- page groupings produced by Phase 6.5 Grouping step
- page-N.md -- coherence-written report pages (one per page group)
- report.html / report-offline.html -- HTML output (only if --visualize)
- dim-{NN}-{kebab-title}/ -- per-dimension directory
  - findings.md -- structured findings with citations
  - sources.md -- raw URLs, excerpts, quality tiers
CRITICAL: Pipeline Execution Order. The following flow is the ONLY valid execution sequence. You MUST follow it exactly. Do not skip steps, reorder steps, or proceed to the next step until the current step's checkpoint artifact exists on disk. Each step's procedure doc is the authoritative reference for that step.
flowchart TD
A["Phase 1-5: Orient → Dimensions → Deep Dive → Synthesis"] --> B["Enrichment"]
B --> C["Grouping"]
C --> D["Executive Summary"]
D --> E["Coherence Writing"]
E --> F{"Citations exist?"}
F -- yes --> G["Appendix"]
F -- no --> J["Phase 7: Delivery"]
G --> I["Peer Review"]
I --> J
J --> K{"--visualize?"}
K -- yes --> L["HTML Emission"]
K -- no --> M["Done"]
L --> M
Each Report Pipeline step has a procedure doc:
- pipeline-steps/enrichment.md
- pipeline-steps/grouping.md
- pipeline-steps/executive-summary.md
- pipeline-steps/coherence.md
- pipeline-steps/appendix.md
- pipeline-steps/peer-review.md
Read the procedure doc before executing each step. The procedure doc contains the exact prompts, shell commands, validation checks, and retry logic for that step. Do not improvise; follow the procedure.
Resume: if the session was interrupted, follow the checkpoint-resume procedure in ${CLAUDE_PLUGIN_ROOT}/skills/deep-research/pipeline-steps/resume.md to detect the last completed phase and resume from there.
Subagent output protocol: every subagent spawn follows the protocol defined below. The orchestrator prepends the Spawn Preamble to each spawn prompt.
CRITICAL — subagent_type: The only valid subagent_type for all spawns in this skill is deep-researcher:deep-researcher. Never use the bare name deep-researcher — it will fail with "agent type not found". Never use any other agent type. This applies to every role: researcher, synthesis, enrichment, grouping, executive-summary, coherence, appendix, peer-review.
Scratch file path convention: ${CLAUDE_PLUGIN_DATA}/scratch/deep-researcher-<ISO-timestamp>-<random-12>.md
Where:
- <ISO-timestamp> = output of date -u +%Y-%m-%dT%H-%M-%SZ
- <random-12> = 12 hex chars generated via bash -c 'printf "%08x%04x" $RANDOM $RANDOM'
The orchestrator prepends this block to EVERY subagent spawn:
## Output Protocol
You are a subagent. Write your full output to exactly this path:
<WRITE-PATH>
Begin the file with this YAML frontmatter:
---
schema_version: 1
verdict: complete | blocked | defer | timeout
summary: "<=140 chars, single line, decision-grade"
agent: deep-researcher:deep-researcher
produced_at: <ISO-8601 UTC>
followups: <integer>
next: orchestrator-decides | none
time_budget_seconds: <integer>
elapsed_seconds: <integer>
---
Then the sentinel line: <!-- end-frontmatter -->
Write the body below. Return ONLY a one-line confirmation with the absolute path -- do not echo the file body in your reply.
Time budget enforcement (CRITICAL): Every spawn prompt includes time_budget_seconds and started_at (Unix epoch, set by orchestrator via date +%s before spawning). The subagent MUST check elapsed time after every round boundary (after completing a search batch, after completing fetches, after each enrichment or synthesis pass). Check with:
elapsed=$(( $(date +%s) - <started_at> ))
If elapsed >= time_budget_seconds, stop immediately — do not start the next round or pass. Write whatever findings exist so far to the output file with verdict: timeout and a summary noting how far you got. The orchestrator handles timeout verdicts as partial results.
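The round-boundary check can be wrapped in a predicate so every round uses the same comparison. A minimal sketch: the function name budget_exhausted and the optional third argument (an explicit "now", used here only to make the example deterministic) are assumptions.

```shell
# Hypothetical predicate: succeeds (exit 0) when the time budget is used up.
# Args: <started_at epoch> <time_budget_seconds> [now epoch; default: date +%s]
budget_exhausted() {
  started_at=$1
  budget=$2
  now=${3:-$(date +%s)}
  elapsed=$(( now - started_at ))
  [ "$elapsed" -ge "$budget" ]   # elapsed >= budget means stop
}

# At a round boundary: stop before starting the next round.
if budget_exhausted 1000 600 1600; then
  echo "verdict: timeout"
fi
```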
Time budgets per depth tier:
- quick: 300 seconds (5 min)
- standard: 600 seconds (10 min)
- deep: 1200 seconds (20 min)
After a subagent returns, read only the frontmatter of its output file: Read(<path>, limit: 30). Act on verdict, summary, followups, next.
Background subagents may return content in their task result text but fail to persist the file to disk. After EVERY subagent completes:
- Run test -s <WRITE-PATH> (exists AND non-empty).
- If the check fails, write the returned content to <WRITE-PATH> yourself.
This verification applies to ALL subagent spawns: researchers, synthesis, enrichment, coherence, peer-review. Never assume a file was written just because the subagent reported success.
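The post-spawn verification can be sketched as follows. The function name ensure_output_file is hypothetical, and the fallback content is assumed to be the text the subagent returned in its task result.

```shell
# Hypothetical helper: make sure a subagent's output file really exists.
# Args: <write-path> <task-result-text>
# Returns 0 if the file was already present and non-empty,
# 2 if it was missing or empty and had to be written here.
ensure_output_file() {
  write_path=$1
  result_text=$2
  if [ -s "$write_path" ]; then   # exists AND non-empty
    return 0
  fi
  mkdir -p "$(dirname "$write_path")"
  printf '%s\n' "$result_text" > "$write_path"
  return 2
}
```

The distinct return code lets the caller note in the delivery summary which files it had to persist itself.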
Confer with the user to establish:
- --visualize)
Do not proceed to Phase 2 without user confirmation.
Compute a topic slug from the topic string: lowercase, replace non-alphanumeric with hyphens, collapse multiple hyphens.
Store the topic slug as $TOPIC for use in all subsequent phases.
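The slug rule can be sketched with tr and sed. This is a minimal sketch: the function name slugify is hypothetical, and trimming a leading or trailing hyphen is an added assumption the text does not state.

```shell
# Hypothetical helper: lowercase, replace runs of non-alphanumerics with a
# single hyphen, then trim a leading/trailing hyphen (the trim is an assumption).
slugify() {
  printf '%s\n' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed 's/[^a-z0-9][^a-z0-9]*/-/g; s/^-//; s/-$//'
}

TOPIC=$(slugify "W3C annotation tools")
echo "$TOPIC"   # prints: w3c-annotation-tools
```

Replacing each run of non-alphanumeric characters in one pass does the replace and collapse steps together.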
At the start of Phase 2, create the shared failed-domains scratch file:
SESSION_START=$(date -u +%Y-%m-%dT%H-%M-%SZ)
FAILED_DOMAINS_FILE="${CLAUDE_PLUGIN_DATA}/scratch/${TOPIC}-${SESSION_START}-failed-domains.txt"
touch "$FAILED_DOMAINS_FILE"
Store $FAILED_DOMAINS_FILE — it is passed to every researcher subagent in its spawn prompt and used by all parallel researchers to share domain ban state.
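How researchers might share ban state through that file is sketched below. The helper names and the one-domain-per-line format are assumptions; the skill only specifies that domains are appended and checked before every fetch.

```shell
# Hypothetical helpers around the shared failed-domains file.
# Assumed format: one domain per line.

# Args: <file> <domain>. Appends the domain unless it is already listed.
mark_domain_failed() {
  grep -Fxq -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Args: <file> <domain>. Succeeds if the domain is listed (skip the fetch).
domain_is_failed() {
  grep -Fxq -- "$2" "$1" 2>/dev/null
}
```

grep -Fx matches whole lines literally, so a banned example.com does not also ban ample.com. Parallel researchers may still append the same domain twice; duplicates are harmless for the lookup.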
The orchestrator performs Phase 2 directly — no subagent is spawned here. Call WebSearch yourself (never Bash curl/wget). Do 3-5 quick orientation searches:
After each search, record in the orchestrator's internal state:
If the landscape scan reveals the topic is too narrow or already well-understood, surface this to the user and offer to downgrade depth or abort.
Based on landscape findings, define the research dimensions: 6-10 at the default standard depth (3-5 for quick, 10-15 for deep). Each dimension:
Dimension types to consider:
For each dimension, produce:
- Mode: web or codebase (auto-detected; override with --web or --codebase)
Present the dimension list to the user:
Wait for user approval, edits, or scope adjustments before spawning researchers.
Spawn researcher subagents in parallel. Cap: 5 concurrent researchers. If dimensions exceed 5, batch in groups of 5.
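The batching rule can be sketched without arrays. The function name batch_dimensions is hypothetical, and the sketch assumes dimension names contain no spaces, which holds for kebab-case slugs.

```shell
# Hypothetical helper: print dimension names in batches of at most 5,
# one batch per output line.
batch_dimensions() {
  count=0
  line=""
  for dim in "$@"; do
    line="${line:+$line }$dim"
    count=$(( count + 1 ))
    if [ "$count" -eq 5 ]; then   # batch is full: emit it and start a new one
      printf '%s\n' "$line"
      count=0
      line=""
    fi
  done
  [ -z "$line" ] || printf '%s\n' "$line"   # emit the final partial batch
}

# Seven dimensions yield two batches: five, then two.
batch_dimensions d1 d2 d3 d4 d5 d6 d7
```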
Agent: deep-researcher:deep-researcher with role: researcher
Write path: dimension findings go to scratch file per output protocol; durable artifacts go to the research output directory for that dimension.
Before spawning each researcher, capture the start time:
RESEARCHER_START=$(date +%s)
Per-dimension spawn prompt includes:
- Mode: web or codebase
- failed_domains_file: <FAILED_DOMAINS_FILE> -- absolute path to the shared failed-domains file; append 5XX/down domains, check before every fetch
- time_budget_seconds: <300|600|1200> -- 300 for quick, 600 for standard, 1200 for deep
- started_at: <RESEARCHER_START> -- Unix epoch captured just before this spawn
Note: The researcher subagent writes findings to its scratch file only. The orchestrator reads the scratch file and copies/structures the durable artifacts into the research output directory after the subagent returns.
For web mode: researcher uses WebSearch and WebFetch tools.
For codebase mode: researcher uses targeted mode against the current project.
Failure handling (per researcher):
Timeout handling (per researcher):
- verdict: timeout means the subagent hit the time budget mid-work and wrote partial findings. Treat partial findings as usable; do not discard them.
- To extend, re-spawn with a new started_at and the additional seconds added to time_budget_seconds. Pass the existing partial findings path so the subagent can build on them rather than starting over.
- Otherwise, mark the dimension partial in the delivery summary.
Satisfaction-failure handling (per researcher):
- verdict: defer means the subagent exhausted all 3 rounds (2 retries) and still failed one or more satisfaction conditions. It has partial findings and the summary names the unsatisfied conditions.
- If the user accepts the partial findings, mark the dimension partial and continue the pipeline; do not re-run.
- To re-run, re-spawn with time_budget_seconds reset to the depth budget, a new started_at, and the path to existing partial findings so the subagent appends rather than restarts.
- If the user skips the dimension, mark it skipped in the delivery summary.
Blocked handling (per researcher):
- verdict: blocked means all fetch attempts across all retry rounds failed (no usable content retrieved at all: domains down, all URLs 4XX/5XX, etc.).
- Mark the dimension blocked in the delivery summary.
Track completion status per dimension: complete, partial (timeout or accepted defer), skipped (user skipped), or blocked (no content). Do not proceed to Phase 5 until all dimensions have a final status or the user has made a decision on each outstanding one.
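The mapping from verdict to final dimension status can be sketched as a case statement. The function name dimension_status and the accept/skip decision values are hypothetical. The sketch assumes a timeout that is not extended is recorded as partial, and that a defer has no final status until the user decides.

```shell
# Hypothetical helper: map a subagent verdict to a final dimension status.
# Args: <verdict> [user-decision: accept|skip] (decision is used for defer only)
# Returns 1 when no final status can be assigned yet.
dimension_status() {
  case "$1" in
    complete) echo complete ;;
    timeout)  echo partial ;;    # assumed: timeout was not extended
    blocked)  echo blocked ;;
    defer)
      case "${2:-}" in
        accept) echo partial ;;  # accepted defer
        skip)   echo skipped ;;  # user skipped
        *)      return 1 ;;      # still waiting on a user decision
      esac ;;
    *) return 1 ;;
  esac
}

dimension_status defer accept
```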
Spawn synthesis agent (agent: deep-researcher:deep-researcher, role: synthesis) with:
- findings.md files
Synthesis agent tasks:
Output: <output-dir>/research/<topic-slug>/synthesis.md
This phase always runs. It is the only report generation path.
Spawn parallel enrichment subagents (max 5 concurrent), one per dimension file.
Each subagent:
- Reads its dimension's findings.md
- Adds a <dimension-synthesis> block to that findings.md
Procedure: ${CLAUDE_PLUGIN_ROOT}/skills/deep-research/pipeline-steps/enrichment.md
Batch dimensions in groups of 5 if dimension count exceeds 5.
Agent: deep-researcher:deep-researcher with role: enrichment
Spawn one grouping subagent that:
- Reads the <dimension-synthesis> blocks from the enriched findings.md files
- Writes <output-dir>/research/<topic-slug>/groupings.json
Procedure: ${CLAUDE_PLUGIN_ROOT}/skills/deep-research/pipeline-steps/grouping.md
Agent: deep-researcher:deep-researcher with role: grouping
Spawn one executive-summary subagent that:
- Reads synthesis.md
- Reads the <dimension-synthesis> blocks from every dim-NN-*.md file
- Reads groupings.json to determine N (highest page number)
- Writes page-0.md (executive summary, 400-700 words)
- Writes page-{N+1}.md (conclusion, 400-700 words)
- Updates groupings.json with framing entries (page 0 and page N+1)
Both pages are marked with <!-- summary-page: skip-conflict-check --> at the top.
This step runs AFTER Grouping and BEFORE Coherence Writing.
Procedure: ${CLAUDE_PLUGIN_ROOT}/skills/deep-research/pipeline-steps/executive-summary.md
Agent: deep-researcher:deep-researcher with role: executive-summary
Spawn one coherence subagent per page group (parallel, max 5 concurrent). Skip any page entry whose dimensions array is empty -- these are framing pages already written by Executive Summary.
Citations handoff — CRITICAL. Before spawning any coherence agent, the orchestrator MUST pre-compute the global citations table from all <citations> blocks across dimension files. The extraction and deduplication algorithm is defined in coherence.md Section 1. This table (referenced as {CITATIONS_TABLE}) is passed to every coherence subagent so it can emit inline citation markers (e.g. <a href="#appendix:cite-d1-c1" class="citation">[D1C1]</a>) throughout each page. Without this step, report pages will have no inline citation links pointing to the appendix.
Each subagent receives:
- The citations table ({CITATIONS_TABLE}) for this page group's dimensions
- groupings.json
Each subagent outputs one page-{N}.md file per page group.
Procedure: ${CLAUDE_PLUGIN_ROOT}/skills/deep-research/pipeline-steps/coherence.md
Agent: deep-researcher:deep-researcher with role: coherence
Run only if at least one dimension file contains a <citations> block. Detection:
found=0
for f in <output-dir>/research/${TOPIC}/dim-*/findings.md; do
  perl -0777 -ne 'exit 0 if /<citations>.*?<\/citations>/s; exit 1' "$f" 2>/dev/null && found=1 && break
done
When citations exist, spawn one appendix subagent that formats citations into appendix.md.
Procedure: ${CLAUDE_PLUGIN_ROOT}/skills/deep-research/pipeline-steps/appendix.md
Agent: deep-researcher:deep-researcher with role: appendix
Run if Appendix produced any citations. Audits ALL citation types: URL reachability, codebase file:line accuracy, and metric formula validity.
Split all citations into batches of <=50, spawn one peer-review subagent per batch concurrently (max 5 concurrent).
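The batch arithmetic is a ceiling division. A minimal sketch with a hypothetical ceil_div helper; the notion of "waves" (groups of up to 5 concurrent batches) is an illustrative reading of the concurrency cap, not a term from the skill.

```shell
# Hypothetical helper: integer ceiling of $1 / $2.
ceil_div() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

total_citations=120
batches=$(ceil_div "$total_citations" 50)   # 120 citations -> 3 batches of <=50
waves=$(ceil_div "$batches" 5)              # 3 batches fit in 1 wave of <=5
echo "batches=$batches waves=$waves"
```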
Procedure: ${CLAUDE_PLUGIN_ROOT}/skills/deep-research/pipeline-steps/peer-review.md
Agent: deep-researcher:deep-researcher with role: peer-review
Present to user:
1. Report & Directory Paths -- always present full absolute paths:
Report pages:
<ABSOLUTE_PATH>/research/<slug>/page-1.md
<ABSOLUTE_PATH>/research/<slug>/page-2.md
...
Research directory (all dimensions + synthesis):
<ABSOLUTE_PATH>/research/<slug>/
If --visualize was used, the HTML paths are printed in the visualize chaining section below.
2. Dimension Summary -- table of dimensions with status (complete / partial / skipped / blocked) and file sizes.
3. Search Statistics -- total queries, T1/T2/Rejected source counts, subagents spawned.
4. Confidence Summary -- High / Medium / Low / Conflict counts.
5. Key Findings at a Glance -- 3-5 bullet-pointed top-level takeaways.
6. Known Gaps -- unresolved questions, source volatility, pre-GA risks.
7. Artifact Inventory -- tree listing of all files in the research directory.
8. Human Review Invitation -- explicit call for user review on 2-3 high-stakes decisions.
--visualize Chaining
When --visualize was passed, after Phase 7 delivery, proceed to generate HTML output from the research report. Follow the visualize skill workflow:
1. Use the research output directory <output-dir>/research/<topic-slug>/.
2. Confirm groupings.json exists (it always will after Phase 6.5).
3. Run the emitter:
npx tsx ${CLAUDE_PLUGIN_ROOT}/skills/deep-research-visualize/scripts/reportPipelineEmitter.ts \
  --groupings <output-dir>/research/<topic-slug>/groupings.json \
  --theme editorial \
  --plugin-dir ${CLAUDE_PLUGIN_ROOT}/skills/deep-research-visualize/
4. Print the file:// paths for report.html and report-offline.html.
5. Open report-offline.html in the system browser (open on macOS, xdg-open on Linux).
The default theme is editorial. The --visualize flag on the research command does not accept --theme; to use a different theme, run /deep-research-visualize separately.
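Choosing between open and xdg-open can be sketched as a lookup on the uname -s value. The function name opener_for is hypothetical. It only prints the command name, so it can be exercised without launching a browser.

```shell
# Hypothetical helper: print the browser-opening command for a `uname -s` value.
# Returns 1 for platforms the text does not cover.
opener_for() {
  case "$1" in
    Darwin) echo open ;;       # macOS
    Linux)  echo xdg-open ;;
    *)      return 1 ;;
  esac
}

# Possible usage (left commented out so the sketch has no side effects):
#   "$(opener_for "$(uname -s)")" report-offline.html
opener_for Linux
```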
The orchestrator uses the Task tool to track pipeline progress. Task state is
derived from filesystem checkpoints (see pipeline-steps/resume.md), so tasks
always reflect reality — they are never the single source of truth.
On pipeline start, create one parent task and child tasks for each phase:
TaskCreate: "Research: {TOPIC}" → parent_id (root task)
TaskCreate: "Orient & Dimensions" → child of root
TaskCreate: "Deep Dive (0/{N} dims)" → child of root
TaskCreate: "Synthesis" → child of root
TaskCreate: "Report Pipeline" → child of root
TaskCreate: "Delivery" → child of root
Report Pipeline gets sub-tasks created just-in-time as each sub-step begins:
TaskCreate: "Enrichment (0/{N})" → child of Report Pipeline
TaskCreate: "Grouping" → child of Report Pipeline
TaskCreate: "Executive Summary" → child of Report Pipeline
TaskCreate: "Coherence (0/{P} pages)" → child of Report Pipeline
TaskCreate: "Appendix" → child of Report Pipeline (only if citations exist)
TaskCreate: "Peer Review" → child of Report Pipeline (only if citations exist)
- Set a task to in_progress when the phase begins execution.
- Update the progress counter in the task title as work completes:
  - "Deep Dive (3/8 dims)" -- after each researcher returns
  - "Coherence (2/3 pages)" -- after each page is written
  - "Enrichment (5/8)" -- after each enrichment agent returns
- Set a task to completed when the filesystem checkpoint is confirmed:
  - dim-*/findings.md exist and are non-empty
  - dim-*/findings.md contain <dimension-synthesis>
  - groupings.json passes --validate-groupings
  - appendix.md exists (or marked completed with "skipped: no citations")
- Mark skipped steps completed with a note:
  - "Appendix — skipped: no citations" if conditionality check fails
  - "Peer Review — skipped: no citations"
On resume, the orchestrator:
- Determines RESUME_FROM following resume.md Section 2.
For each phase up to RESUME_FROM:
  TaskUpdate: status = completed
For partial phases (Deep Dive, Enrichment, Coherence):
  TaskUpdate: status = in_progress, description includes progress counter
  (e.g., "Deep Dive (5/8 dims)" based on which findings.md files exist)
| Filesystem checkpoint | Task action |
|---|---|
| dim-*/findings.md exists for all dims | Deep Dive → completed |
| synthesis.md exists | Synthesis → completed |
| All findings.md have <dimension-synthesis> | Enrichment → completed |
| groupings.json passes validation | Grouping → completed |
| page-0.md exists | Executive Summary → completed |
| All content page-N.md exist | Coherence → completed |
| appendix.md exists | Appendix → completed |
| Peer review entry in groupings.json | Peer Review → completed |
The orchestrator never relies solely on task state to determine what to run. It always checks the filesystem. Tasks are a user-facing progress indicator, not a control plane.
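Deriving a progress counter from the filesystem rather than from task state can be sketched as below. The function name deep_dive_label is hypothetical, and the sketch assumes the total dimension count is known from the checkpoint file.

```shell
# Hypothetical helper: build the Deep Dive task title from what is on disk.
# Args: <research-dir> <total-dimensions>
deep_dive_label() {
  dir=$1
  total=$2
  done_count=0
  for f in "$dir"/dim-*/findings.md; do
    # -s: file exists and is non-empty; an unmatched glob simply fails the test
    [ -s "$f" ] && done_count=$(( done_count + 1 ))
  done
  printf 'Deep Dive (%d/%d dims)\n' "$done_count" "$total"
}
```

Counting only non-empty findings.md files keeps an empty placeholder from being reported as a finished dimension.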
.gov, .mil, .eu domains; SEC filings; patent databases
npx claudepluginhub sdkks/deep-researcher-visualized --plugin deep-researcher