From autoresearch
Set up a new autoresearch project. Use when the user wants to research any topic, improve anything, run iterative experiments, or says /autoresearch:design. Works for code, documents, analysis, research questions, arguments — anything.
Slash command: /autoresearch:design
# Autoresearch Design
You are setting up an autonomous research project. The human has a goal: something they want to understand, analyze, or improve. Your job is to understand that goal, produce the configuration files the orchestrator needs, and then run the iterative experiment loop.
## MANDATORY: You MUST follow Phases 1-6 in order
Do NOT skip phases. Do NOT "just do it yourself."
The phase count is 6: (1) understand the goal, (2) do bounded domain research, (3) propose the agenda, (4) write the config files, (5) run the loop, (6) present results.
The entire point of autoresearch is the iterative loop — multiple rounds of parallel supportive/adversarial workers, each producing evidence. If you bypass the orchestrator and do the work directly, you have defeated the purpose.
Two modes:
Determine which fits the user's goal before proceeding.
Start with the goal. If the user's request is clear, proceed directly. If ambiguous, ask one concise clarifying question — not a checklist.
Any goal is valid: code optimization, document writing, market research, argument development, due diligence, product rebuilds, personal decisions. Do not redirect the user based on goal type.
If the goal involves code: read the codebase — structure, imports, existing tests, benchmarks.
For everything else: determine what the output should contain and what "better" means.
Before you can propose directions, you need to understand the domain. Do the research yourself, without a separate confirmation gate — this is fast scaffolding, not the run.
For code goals: read the target files, imports, call graphs. For everything else: 2-5 web searches or source reads to understand the domain enough to propose directions.
Keep it bounded — 5 minutes of work max. This is setup, not execution. The loop does the deep work. If the goal is already clear and you can name 3-6 good directions without research, skip straight to Phase 3.
Present a list of broad initial directions to investigate. Do NOT pre-decompose into specific sub-directions — the workers will discover specific angles during research and propose them via the roadmap.
Derive directions from the goal itself. If the goal is a thesis with claims, each claim is a direction. If the goal is analysis, each major area of concern is a direction. If the user provides proprietary context or prior analysis, incorporate those as directions too.
**<name>** — <one-line goal>
Directions:
- <broad direction 1>
- <broad direction 2>
- <broad direction 3>
- <broad direction 4>
<N> workers (N/2 supportive + N/2 adversarial), <M> rounds, ~$<X>. Ready?
Aim for 3-6 broad directions. Workers will discover sub-directions during research and the judge curates them into the roadmap each round.
Wait for the human to edit and confirm. They may strike, add, or rearrange.
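The proposal format above can be sketched as a small formatter. This is an illustrative helper, not part of autoresearch: the function name `format_agenda` and the `est_cost_usd` parameter are assumptions, and the cost figure is supplied by the caller because the document does not say how it is estimated. The sketch also treats the 3-6 direction guideline as a hard limit, which is a choice made here for simplicity.

```python
def format_agenda(name, goal, directions, workers, rounds, est_cost_usd):
    """Render the Phase 3 proposal text (hypothetical helper)."""
    # The document aims for 3-6 broad directions; this sketch enforces it.
    if not 3 <= len(directions) <= 6:
        raise ValueError("expected 3-6 broad directions")
    # Workers are split evenly into supportive and adversarial halves.
    if workers < 2 or workers % 2 != 0:
        raise ValueError("workers must be a positive even number")
    half = workers // 2
    # \u2014 is the dash character used in the document's own template line.
    lines = [f"**{name}** \u2014 {goal}", "Directions:"]
    lines += [f"- {d}" for d in directions]
    lines.append(
        f"{workers} workers ({half} supportive + {half} adversarial), "
        f"{rounds} rounds, ~${est_cost_usd:.2f}. Ready?"
    )
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_agenda(
        "cache-tuning",
        "Reduce p95 latency of the cache layer",
        ["eviction policy", "serialization cost", "connection pooling"],
        workers=4, rounds=5, est_cost_usd=12.5,
    ))
```

Rejecting an odd worker count up front keeps the supportive/adversarial split equal before anything is shown to the human.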
Things you figure out yourself (do NOT ask the human):
- For a qualitative research output, declare two editable files: evidence.md (citation catalog) and synthesis.md (decision-grade output). The synthesizer judge produces both with appropriate shapes each round. Use a single file only when the output is genuinely a single artifact (a press release, a code change, a configuration file). Never ask the user which they want; pick based on the goal.
- Only the hard gates correctness and evidence are allowed.
- Write a ## Audience section in program.md. The orchestrator reads this to produce a final audience-targeted brief.md after the run. If the user did not specify an audience, infer one from the goal type and state it explicitly so the user can correct it. Do not ask the user to specify it; capture it from context.

Each initiative gets its own directory under autoresearch/. Create autoresearch/<name>/ with:
# Research Program
## Target
{what we're investigating, in plain language}
## Metric
{what "better" means and how we measure it}
## Strategy
collaborative
## Measurement
{quantitative or qualitative}
## Direction
maximize
## Audience
{one-sentence description of who will read the output and what they'll do with it — used by the post-run brief judge to target the brief.md output. Required for qualitative initiatives. Example: "Senior Industry Advisor preparing for a management session with the company's CTO/CIO. Needs concrete technical questions to put to engineering leadership."}
## Editable files
- {file1}
- {file2}
## Directions
- {broad direction 1}
- {broad direction 2}
- {broad direction 3}
For qualitative initiatives that produce a research output: declare TWO editable files — evidence.md (citation catalog, exhaustive, no inferences) and synthesis.md (decision-grade, under ~2500 words, observation/inference structure). The synthesizer judge will produce both each round, with different shapes appropriate to each. Use a single editable file only when the initiative produces a single artifact type (e.g., a press release draft, a code change, a single document).
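As a rough illustration of the program.md layout above, the sketch below renders the sections from a small data structure. The `Program` class and `render_program` function are hypothetical names introduced here. The section titles and the `collaborative` and `maximize` defaults come from the template, but whether the real parser accepts exactly this output is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Program:
    """Fields mirror the program.md template sections (hypothetical type)."""
    target: str
    metric: str
    measurement: str            # "quantitative" or "qualitative"
    audience: str
    editable_files: List[str]
    directions: List[str]
    strategy: str = "collaborative"   # value shown in the template
    direction: str = "maximize"       # value shown in the template


def _bullets(items):
    return "\n".join(f"- {item}" for item in items)


def render_program(p: Program) -> str:
    if p.measurement not in ("quantitative", "qualitative"):
        raise ValueError("measurement must be quantitative or qualitative")
    # The template marks Audience as required for qualitative initiatives.
    if p.measurement == "qualitative" and not p.audience.strip():
        raise ValueError("qualitative initiatives need an Audience")
    sections = [
        ("Target", p.target),
        ("Metric", p.metric),
        ("Strategy", p.strategy),
        ("Measurement", p.measurement),
        ("Direction", p.direction),
        ("Audience", p.audience),
        ("Editable files", _bullets(p.editable_files)),
        ("Directions", _bullets(p.directions)),
    ]
    out = ["# Research Program"]
    for title, body in sections:
        out.append(f"## {title}")
        out.append(body)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(render_program(Program(
        target="How resilient is the vendor's sync protocol?",
        metric="Rubric score from the qualitative judge",
        measurement="qualitative",
        audience="Engineering lead deciding whether to adopt the vendor.",
        editable_files=["evidence.md", "synthesis.md"],
        directions=["conflict handling", "offline behaviour", "failure modes"],
    )))
```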
For qualitative measurement, add a ## Rubric section with hard and soft gates:
## Rubric
Hard gates (fail any = score 0):
- correctness: no factual errors — every specific claim backed by a named, plausible, verifiable source
- evidence: every non-trivial claim has a specific, named, non-marketing source
Soft gates (each pass = +1 point):
- technical_specificity: concrete details (numbers, versions, measurements), not generalizations
- analytical_reasoning: connects facts into arguments with stated conclusions, named alternative readings considered
- causal_implications: traces cause -> effect -> consequence with evidence; downstream claims labelled as inferences not facts
- investigative_effort: evidence of real digging (source code, commits, APIs, configs) not just summarizing docs pages
- neutral_synthesis: distinguishes observations from inferences; load-bearing words ("fragile," "exposed," "collapses," "structurally weak," "doesn't survive") only used where specific cited evidence supports them; language calibrated to the evidence, not the other way around
{add domain-specific soft gates here based on the initiative's goal}
Score: 0 (hard gate fail) or 0-N (soft gate count).
The five universal soft gates (technical_specificity, analytical_reasoning, causal_implications, investigative_effort, neutral_synthesis) are validator-enforced. The run will not start without all five. Add domain-specific gates on top — as soft gates only. The two hard gates (correctness, evidence) are also validator-enforced; custom hard gates are rejected.
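The gate rules and scoring described above can be sketched as follows. This only mirrors the rules as stated in this document; it is not the actual validator or judge, and the function names `validate_rubric` and `score` are assumptions.

```python
HARD_GATES = ("correctness", "evidence")
UNIVERSAL_SOFT_GATES = (
    "technical_specificity",
    "analytical_reasoning",
    "causal_implications",
    "investigative_effort",
    "neutral_synthesis",
)


def validate_rubric(hard_gates, soft_gates):
    """Return a list of problems; an empty list means the rubric passes."""
    problems = []
    custom_hard = [g for g in hard_gates if g not in HARD_GATES]
    if custom_hard:
        problems.append(f"custom hard gates are rejected: {custom_hard}")
    missing_hard = [g for g in HARD_GATES if g not in hard_gates]
    if missing_hard:
        problems.append(f"missing hard gates: {missing_hard}")
    missing_soft = [g for g in UNIVERSAL_SOFT_GATES if g not in soft_gates]
    if missing_soft:
        problems.append(f"missing universal soft gates: {missing_soft}")
    return problems


def score(results, soft_gates):
    """results maps gate name to True (pass) or False (fail).

    Failing any hard gate gives 0; otherwise each passed soft gate is +1.
    """
    if not all(results.get(g, False) for g in HARD_GATES):
        return 0
    return sum(1 for g in soft_gates if results.get(g, False))


if __name__ == "__main__":
    soft = list(UNIVERSAL_SOFT_GATES) + ["domain_depth"]  # hypothetical extra gate
    print(validate_rubric(["correctness", "evidence"], soft))
    results = {g: True for g in HARD_GATES + tuple(soft)}
    results["investigative_effort"] = False
    print(score(results, soft))
```

Domain-specific gates appear only in the soft list, matching the rule that custom hard gates are rejected.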
For quantitative: an executable bash script that accepts a directory argument ($1) and prints one number to stdout. Make it executable.
For qualitative: a bash script that calls the LLM-as-judge evaluator:
#!/usr/bin/env bash
set -euo pipefail
WORKER_DIR="$1"
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
/opt/homebrew/bin/python3.13 "$SCRIPT_DIR/../../bin/eval_qualitative.py" "$WORKER_DIR" "$SCRIPT_DIR"
Make it executable.
Files the workers must not edit, one per line.
If the editable file is a document, write an initial version with a solid outline. This becomes the baseline the judge iteratively improves. Don't leave it empty.
After writing the files, ask: "Ready to start? How many rounds? (default: 5)"
When they confirm, find the orchestrator script:
- ~/Desktop/Projects/autoresearch-skills/bin/orchestrator.py
- find ~ -path "*/autoresearch-skills/bin/orchestrator.py" -maxdepth 4 2>/dev/null | head -1

Run it directly using the Bash tool (NOT in background). The orchestrator uses from bin.program_parser import ..., so you must invoke it as a module from the repo root; running the script path directly will crash with ModuleNotFoundError: No module named 'bin':
cd <repo_root_containing_bin/> && /opt/homebrew/bin/python3.13 -m bin.orchestrator <rounds> <project_dir> <name> --workers <N>
Where <repo_root_containing_bin/> is the directory holding bin/orchestrator.py (typically ~/Desktop/Projects/autoresearch-skills), and <project_dir> is the project whose autoresearch/<name>/ you're running (use . if the project and the repo root are the same).
Optional flags: --workers <N> (must be even, default 2), --max-cost <USD>, --max-writeup-words <N>, --max-proposals <N>.
Set the Bash timeout to 600000 ms (10 minutes).
When the orchestrator finishes, invoke /autoresearch:review to present the results.
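For anyone reproducing the launch step outside the Bash tool, the invocation above might be wrapped like this. It is a sketch under stated assumptions: `build_command` and `run_orchestrator` are hypothetical helpers, the interpreter path is the one hard-coded in this document, and the 600-second subprocess timeout stands in for the Bash tool's 10-minute setting.

```python
import subprocess
from pathlib import Path

# Interpreter path as written in this document; adjust for other machines.
PYTHON = "/opt/homebrew/bin/python3.13"


def build_command(rounds, project_dir, name, workers=2, max_cost=None):
    """Build the argv for `python -m bin.orchestrator` (hypothetical helper)."""
    if workers < 2 or workers % 2 != 0:
        raise ValueError("--workers must be even")
    cmd = [
        PYTHON, "-m", "bin.orchestrator",
        str(rounds), str(project_dir), name,
        "--workers", str(workers),
    ]
    if max_cost is not None:
        cmd += ["--max-cost", str(max_cost)]
    return cmd


def run_orchestrator(repo_root, cmd, timeout_s=600):
    """Run from the repo root so `from bin...` imports resolve."""
    repo_root = Path(repo_root).expanduser()
    if not (repo_root / "bin" / "orchestrator.py").is_file():
        raise FileNotFoundError(f"no bin/orchestrator.py under {repo_root}")
    # Foreground run; raises on a non-zero exit status or on timeout.
    return subprocess.run(cmd, cwd=repo_root, timeout=timeout_s, check=True)


if __name__ == "__main__":
    print(build_command(5, ".", "cache-tuning", workers=4, max_cost=20))
```

Passing the repo root as `cwd` has the same effect as the `cd` in the shell command: the module is resolved relative to the directory that contains bin/.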