From autonomous-research-pipeline
Architecture pattern for building autonomous research agents using Claude Code CLI. Each agent runs N parallel research tracks, cross-pollinates findings, synthesizes insights, and produces a structured report.
Slash command
/autonomous-research-pipeline:research-pipeline
Single-track research produces narrow, confirmation-biased results. Multi-track parallel research produces broader, more structurally coherent intelligence because:
Phase 1: Parallel Research (N tracks, each is a separate Claude Code CLI invocation)
Track A ──→ track_a.json
Track B ──→ track_b.json
Track C ──→ track_c.json
Track D ──→ track_d.json
(all run simultaneously via bash & + wait)
Phase 2: Cross-Pollination + Synthesis (sequential, reads all track outputs)
Read all track JSONs ──→ Find connections ──→ synthesis.json
Phase 3: Report Generation (sequential, reads synthesis)
Read synthesis.json ──→ Write structured report ──→ report.md
Phase 4: Notification + Distribution (optional)
Send summary via Telegram / email / webhook
Each track is a single Claude Code CLI invocation with:
Each track's prompt should follow this pattern:
claude -p "Read {project}/.claude/CLAUDE.md and {project}/.claude/rules/source_registry.md. \
Then execute the research skill at {project}/.claude/skills/research-{track-name}/SKILL.md \
for date $DATE. Save output to {project}/.tmp/$DATE/{track_name}.json" \
--max-turns 15 \
--allowedTools "WebSearch,WebFetch,Read,Write,Glob,Grep,Bash"
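The placeholders in this template can be filled by a small shell helper. The sketch below is illustrative only: `build_track_prompt` is a hypothetical function name, and the hyphen-to-underscore mapping for the output filename is an assumption inferred from the `research-{track-name}` and `{track_name}.json` spellings above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: renders the per-track prompt from the template above.
# Arguments: project root, track name (hyphenated), date (YYYYMMDD).
build_track_prompt() {
  local project="$1" track="$2" date="$3"
  # Assumption: the output file uses underscores where the skill directory uses hyphens.
  local track_file="${track//-/_}"
  printf '%s ' \
    "Read ${project}/.claude/CLAUDE.md and ${project}/.claude/rules/source_registry.md." \
    "Then execute the research skill at ${project}/.claude/skills/research-${track}/SKILL.md" \
    "for date ${date}."
  printf '%s\n' "Save output to ${project}/.tmp/${date}/${track_file}.json"
}

# Usage (not executed here, since it needs the Claude Code CLI):
#   claude -p "$(build_track_prompt "$PROJECT" community-pain-points "$DATE")" \
#     --max-turns 15 \
#     --allowedTools "WebSearch,WebFetch,Read,Write,Glob,Grep,Bash"
```

Keeping the prompt in one function means every track differs only by its name, which makes it harder for the skill path and the output path to drift apart.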
{
"track": "community-pain-points",
"date": "20260312",
"query_count": 8,
"findings": [
{
"title": "Finding title",
"summary": "2-3 sentence summary of the finding",
"source_url": "https://...",
"source_type": "reddit|arxiv|news|government|social",
"relevance_score": 8,
"raw_quotes": ["Direct quote from source"],
"tags": ["tag1", "tag2"]
}
],
"meta": {
"search_queries_used": ["query1", "query2"],
"sources_consulted": 12,
"execution_time_minutes": 4
}
}
Use bash background processes with PID tracking and wait:
# Track A
claude -p "..." --max-turns 15 --allowedTools "..." \
2>&1 | tee -a "$LOG_DIR/track_a.log" &
PID_A=$!
# Track B
claude -p "..." --max-turns 15 --allowedTools "..." \
2>&1 | tee -a "$LOG_DIR/track_b.log" &
PID_B=$!
# Track C
claude -p "..." --max-turns 15 --allowedTools "..." \
2>&1 | tee -a "$LOG_DIR/track_c.log" &
PID_C=$!
# Track D
claude -p "..." --max-turns 15 --allowedTools "..." \
2>&1 | tee -a "$LOG_DIR/track_d.log" &
PID_D=$!
# Wait for all tracks (the || log fallback keeps one failed track from killing the pipeline)
wait $PID_A || log "WARNING: Track A failed"
wait $PID_B || log "WARNING: Track B failed"
wait $PID_C || log "WARNING: Track C failed"
wait $PID_D || log "WARNING: Track D failed"
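The same launch-and-wait logic can be written as a loop so the number of tracks is not hardcoded. This is a minimal sketch, not the pipeline's actual script: `log`, `run_track`, and `run_tracks_parallel` are hypothetical names, and `run_track` is a stub standing in for the `claude -p ...` invocation. The sketch redirects output to the log file instead of piping through `tee`, so `$!` is the PID of the track command itself and `wait` returns that command's exit status.

```bash
#!/usr/bin/env bash
# Hypothetical logger; the snippet above calls log without showing its definition.
log() { printf '[%s] %s\n' "$(date +%H:%M:%S)" "$*" >&2; }

# Stub standing in for: claude -p "..." --max-turns 15 --allowedTools "..."
# Replace the body with the real invocation for the given track.
run_track() {
  local track="$1"
  echo "researching ${track}"
  [ "$track" != "track_fail" ]   # simulates a failure for the track named "track_fail"
}

# Launch every track in the background, then wait on each PID individually.
# Sets the global array FAILED_TRACKS to the names of tracks that exited non-zero.
run_tracks_parallel() {
  local log_dir="$1"; shift
  local -a names=() pids=()
  local track i
  FAILED_TRACKS=()
  for track in "$@"; do
    run_track "$track" >>"$log_dir/${track}.log" 2>&1 &
    pids+=("$!")
    names+=("$track")
  done
  for i in "${!pids[@]}"; do
    if ! wait "${pids[$i]}"; then
      log "WARNING: ${names[$i]} failed"
      FAILED_TRACKS+=("${names[$i]}")
    fi
  done
}
```

After the call, `FAILED_TRACKS` can be used to decide whether synthesis should still run, for example skipping it when every track failed.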
Key points:
- `&` backgrounds each invocation
- `$!` captures the PID of the last backgrounded process
- `wait $PID` blocks until that specific process completes
- `|| log "WARNING: ..."` handles failures gracefully without killing the pipeline
- `tee -a` logs output while still streaming to stdout

The synthesis step is the most valuable phase. It reads ALL track outputs and finds connections between them.
claude -p "Read {project}/.claude/CLAUDE.md and relevant rules. \
Read all research track outputs in {project}/.tmp/$DATE/: \
track_a.json, track_b.json, track_c.json, track_d.json. \
Cross-pollinate findings: identify connections between tracks, \
contradictions worth investigating, and emergent patterns that \
no single track reveals alone. \
Check dedup registry at {project}/.tmp/registry/history.json. \
Write synthesis.json and report.md to {project}/.tmp/$DATE/." \
--max-turns 20 \
--allowedTools "WebSearch,WebFetch,Read,Write,Edit,Glob,Grep,Bash"
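The prompt above names four files explicitly, but a failed track may leave no output behind. One option is to build the file list from what actually exists in the date directory. A sketch under assumptions: `list_track_outputs` is a hypothetical helper, and it treats every non-empty `.json` file other than `synthesis.json` as a track output.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints a comma-separated list of track output files
# found in the given run directory, skipping synthesis.json and empty files.
# Returns non-zero when no usable track output exists.
list_track_outputs() {
  local run_dir="$1" f name
  local -a found=()
  for f in "$run_dir"/*.json; do
    [ -s "$f" ] || continue            # skips empty files and an unmatched glob
    name="$(basename "$f")"
    [ "$name" = "synthesis.json" ] && continue
    found+=("$name")
  done
  [ "${#found[@]}" -gt 0 ] || return 1  # nothing to synthesize
  local IFS=,
  printf '%s\n' "${found[*]}"
}

# Usage (the claude call is not executed here):
#   FILES="$(list_track_outputs "$PROJECT/.tmp/$DATE")" || exit 1
#   claude -p "... Read all research track outputs in $PROJECT/.tmp/$DATE/: $FILES. ..."
```

This only checks that a file exists and is non-empty; it does not validate the JSON inside it.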
{
"date": "20260312",
"cross_pollinations": [
{
"type": "convergence|contradiction|gap_bridge|pattern",
"tracks_involved": ["track_a", "track_c"],
"insight": "Description of the cross-track connection",
"supporting_findings": ["finding_id_1", "finding_id_2"],
"confidence": "high|medium|low"
}
],
"key_insights": [
{
"title": "Insight title",
"analysis": "Detailed structural analysis",
"implications": "What this means going forward"
}
]
}
Prevent repeating the same findings across days. Maintain a JSON registry with TTL (time-to-live).
{
"seen_topics": {
"topic-fingerprint-hash": {
"title": "Topic title",
"first_seen": "20260301",
"last_seen": "20260312",
"count": 4
}
},
"ttl_days": 14
}
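How TTL expiry might be evaluated for a single registry entry, as a sketch. The source does not say how fingerprints are computed or which field the TTL is measured from; this example assumes a SHA-256 of the normalized title and measures age from `last_seen`. Both helper names are hypothetical, and the date arithmetic relies on GNU `date -d` (BSD/macOS `date` uses different flags).

```bash
#!/usr/bin/env bash
# Hypothetical: fingerprint = SHA-256 of the lowercased title with runs of
# non-alphanumeric characters collapsed to a single hyphen. Needs sha256sum.
topic_fingerprint() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -cs '[:alnum:]' '-' \
    | sha256sum | cut -d' ' -f1
}

# Hypothetical: succeeds (exit 0) when an entry last seen on $1 is more than
# $3 days old as of $2. Dates are YYYYMMDD. Needs GNU date.
is_expired() {
  local last_seen="$1" today="$2" ttl_days="$3"
  local seen_s today_s
  seen_s="$(date -u -d "$last_seen" +%s)"
  today_s="$(date -u -d "$today" +%s)"
  (( (today_s - seen_s) / 86400 > ttl_days ))
}

# Example with the registry values above (last_seen 20260312, ttl_days 14):
#   is_expired 20260312 20260320 14   # 8 days old, not expired (exit 1)
#   is_expired 20260312 20260401 14   # 20 days old, expired (exit 0)
```

Measuring from `last_seen` rather than `first_seen` means a topic that keeps resurfacing stays suppressed; measuring from `first_seen` would let it reappear after a fixed window. Which behavior is intended is not stated in the source.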
.tmp/ cleanup).
{project}/.tmp/registry/topic_history.json or similar.
The research pipeline runs as a shell script triggered by cron. See the daily-cron-agent plugin for the complete cron template.
Basic cron entry:
# Industry research pipeline, 4 AM JST daily
0 4 * * * /home/user/project/automation/run_nightly.sh >> /home/user/project/.tmp/cron.log 2>&1
#!/bin/bash
set -euo pipefail
# 1. Resolve paths
# 2. Source environment
# 3. Create date-based log directory
# 4. Phase 1: Run N tracks in parallel
# 5. Phases 2-3: Run synthesis and report generation (sequential)
# 6. Phase 4: Send notification (optional)
See daily-cron-agent/reference/cron_template.sh for a complete, production-ready template.
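Steps 1 and 3 of the skeleton might look like the following. This is a sketch only, not the referenced template: `resolve_project_root` and `init_run_dir` are hypothetical helpers, the script is assumed to live one directory below the project root (as in the cron entry's `automation/run_nightly.sh`), and the `logs` subdirectory is an assumed location for `$LOG_DIR`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: given the path of the running script, print the project
# root, assuming the script sits one directory below it (e.g. automation/).
resolve_project_root() {
  local script_path="$1"
  (cd "$(dirname "$script_path")/.." && pwd)
}

# Hypothetical helper: create the per-date output and log directories and
# print the run directory path.
init_run_dir() {
  local project="$1" date="$2"
  local run_dir="$project/.tmp/$date"
  mkdir -p "$run_dir/logs"   # assumed log location; the source only shows $LOG_DIR
  printf '%s\n' "$run_dir"
}

# Usage inside the cron script:
#   PROJECT="$(resolve_project_root "${BASH_SOURCE[0]}")"
#   DATE="$(date +%Y%m%d)"                 # matches the YYYYMMDD dates in the JSON examples
#   RUN_DIR="$(init_run_dir "$PROJECT" "$DATE")"
#   LOG_DIR="$RUN_DIR/logs"
```

Resolving paths from the script location matters under cron, which does not start jobs in the project directory.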
| Tracks | Max Turns per Track | Typical Wall Clock | API Cost Estimate |
|---|---|---|---|
| 2 | 10 | 3-5 min | Low |
| 4 | 15 | 5-10 min | Medium |
| 6 | 15 | 8-15 min | Medium-high |
| 8+ | 10 | 10-20 min | High |
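The turn caps give a simple upper bound on how many agent turns one run can consume: number of tracks times the per-track cap, plus the synthesis cap. The helper below is a hypothetical illustration of that arithmetic. It bounds turns, not cost, since the table gives cost only qualitatively.

```bash
#!/usr/bin/env bash
# Hypothetical helper: worst-case number of agent turns for one pipeline run.
# Arguments: number of tracks, --max-turns per track, --max-turns for synthesis.
max_turn_budget() {
  local tracks="$1" per_track="$2" synthesis="$3"
  echo $(( tracks * per_track + synthesis ))
}

# The 4-track row combined with the synthesis cap of 20 used earlier:
max_turn_budget 4 15 20   # prints 80
```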
Keep --max-turns conservative to avoid runaway costs: 10-15 per track, 20-35 for synthesis.
Architecture pattern based on production pipelines running daily across 10+ autonomous research agents.
npx claudepluginhub mtkana/claude-code-plugins --plugin autonomous-research-pipeline