From dspy-api-skills
Helps choose between DSPy modules (Predict, ChainOfThought, ReAct, etc.) and decide between single module vs pipeline for AI features. Walks through input/output, tool needs, and reasoning complexity.
Slash command: `/dspy-api-skills:ai-choosing-architecture`
- Already know what module to use — go to the matching `/dspy-*` skill
Before recommending anything, get answers to these three questions from the user (or infer them from context): what the inputs and outputs are, whether the task needs tools, and how complex the reasoning is.
Walk the decision tree:

    Does it need tools?
    ├── Yes: Does it need to write and run code?
    │   ├── Yes → CodeAct
    │   └── No → ReAct
    └── No: How complex is the reasoning?
        ├── Simple (direct mapping) → Predict
        ├── Moderate (needs explanation) → ChainOfThought
        ├── Complex (math/computation) → ProgramOfThought
        └── Very complex (compare approaches) → MultiChainComparison

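
The tree above can also be written as a small function, which may be handy when the answers come from a form or a config file. This is a minimal sketch: `choose_module` and its parameter names are hypothetical helpers, not part of DSPy, and the returned strings only name the modules that appear in the tree.

```python
def choose_module(needs_tools: bool,
                  needs_code_execution: bool = False,
                  reasoning: str = "moderate") -> str:
    """Return the module name suggested by the decision tree.

    Hypothetical helper for illustration; it is not part of DSPy.
    `reasoning` is one of: "simple", "moderate", "complex", "very_complex".
    """
    if needs_tools:
        # Tool-using tasks: CodeAct if the agent must write and run code.
        return "CodeAct" if needs_code_execution else "ReAct"

    # No tools: pick by reasoning complexity.
    by_complexity = {
        "simple": "Predict",
        "moderate": "ChainOfThought",
        "complex": "ProgramOfThought",
        "very_complex": "MultiChainComparison",
    }
    if reasoning not in by_complexity:
        raise ValueError(f"unknown reasoning level: {reasoning!r}")
    return by_complexity[reasoning]


print(choose_module(needs_tools=False, reasoning="simple"))  # Predict
print(choose_module(needs_tools=True))                       # ReAct
```

Defaulting `reasoning` to `"moderate"` mirrors the guidance that ChainOfThought is the default choice when unsure.
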
Module tradeoff summary:
| Module | Accuracy | Latency | Cost | Best for |
|---|---|---|---|---|
| Predict | Baseline | 1x | 1x | Simple classification, extraction, formatting |
| ChainOfThought | +10-30% | 1.5-2x | 1.5-2x | Most tasks — default choice when unsure |
| ProgramOfThought | +20-40% on math | 2-3x | 2-3x | Math, computation, data manipulation |
| ReAct | Varies | 3-10x | 3-10x | Tasks requiring external information or actions |
| CodeAct | Varies | 3-10x | 3-10x | Tasks requiring code generation and execution |
| MultiChainComparison | +5-15% | 3-5x | 3-5x | When you need the best possible single answer |
| BestOfN | +5-10% | Nx | Nx | When you have a reward function and acceptance threshold |
For the full module list including Refine, RLM, and Parallel, see reference.md.
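
To turn the relative figures in the tradeoff table into a rough budget, the multipliers can be applied to the measured cost of one Predict call. The sketch below assumes the table's ranges as given; `COST_MULTIPLIER` and `estimate_cost` are hypothetical names, not DSPy APIs, and real costs depend on your prompts, model, and number of tool calls.

```python
# Relative cost multipliers copied from the tradeoff table, as (low, high)
# ranges relative to a single Predict call. Illustrative only.
COST_MULTIPLIER = {
    "Predict": (1.0, 1.0),
    "ChainOfThought": (1.5, 2.0),
    "ProgramOfThought": (2.0, 3.0),
    "ReAct": (3.0, 10.0),
    "CodeAct": (3.0, 10.0),
    "MultiChainComparison": (3.0, 5.0),
}


def estimate_cost(module: str, predict_cost: float, n: int = 1) -> tuple[float, float]:
    """Estimate the (low, high) cost of one call from the cost of one Predict call.

    Hypothetical helper. For BestOfN the table lists the cost as Nx,
    so `n` is the number of attempts.
    """
    if module == "BestOfN":
        return (predict_cost * n, predict_cost * n)
    low, high = COST_MULTIPLIER[module]
    return (predict_cost * low, predict_cost * high)


# If one Predict call costs 2 units, ChainOfThought costs about 3 to 4 units.
print(estimate_cost("ChainOfThought", predict_cost=2.0))  # (3.0, 4.0)
```
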
Use this table to decide whether one module is enough or a pipeline is warranted:
| Signal | Single module | Pipeline |
|---|---|---|
| Input maps directly to output | Yes | -- |
| Task has distinct phases (classify then generate) | -- | Yes |
| Different parts need different LM capabilities | -- | Yes |
| Need to validate intermediate results | -- | Yes |
| Simple input-output with clear signature | Yes | -- |
| Need to combine retrieval + generation | -- | Yes |
Rule of thumb: start with a single module. Add pipeline stages only when you have measured a quality gap that a single module cannot close.
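
The signal table can be read as a simple rule: any pipeline signal tips the choice toward a pipeline, otherwise stay with a single module. A minimal sketch of that reading follows; the signal keys and `recommend_structure` are hypothetical names chosen for this example.

```python
# Signals from the table that point toward a pipeline (hypothetical keys).
PIPELINE_SIGNALS = {
    "distinct_phases",            # e.g. classify then generate
    "different_lm_capabilities",  # parts need different LM capabilities
    "validate_intermediate",      # intermediate results must be checked
    "retrieval_plus_generation",  # retrieval combined with generation
}

# Signals that point toward a single module.
SINGLE_SIGNALS = {"direct_mapping", "clear_signature"}


def recommend_structure(signals: set[str]) -> str:
    """Return "pipeline" if any pipeline signal is present, else "single module"."""
    unknown = signals - PIPELINE_SIGNALS - SINGLE_SIGNALS
    if unknown:
        raise ValueError(f"unknown signals: {sorted(unknown)}")
    return "pipeline" if signals & PIPELINE_SIGNALS else "single module"


print(recommend_structure({"direct_mapping"}))             # single module
print(recommend_structure({"retrieval_plus_generation"}))  # pipeline
```

This only captures the table. The rule of thumb still applies: confirm a measured quality gap before acting on a "pipeline" result.
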
Verification: After implementing the chosen architecture, build an evaluator with `dspy.Evaluate(devset=devset, metric=your_metric)` and call it on your program, using 20-50 examples, to confirm the module choice was correct before optimizing.
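
To make the shape of that check concrete, here is a dependency-free stand-in for an evaluation loop. In DSPy itself `dspy.Evaluate` plays this role. The `evaluate` function, the toy `program`, and the `exact_match` metric below are hypothetical and exist only for this illustration.

```python
from statistics import mean
from typing import Any, Callable


def evaluate(program: Callable[[dict], Any],
             devset: list[dict],
             metric: Callable[[dict, Any], float]) -> float:
    """Average metric(example, prediction) over the dev set, as a percentage."""
    scores = [metric(example, program(example)) for example in devset]
    return 100.0 * mean(scores)


def program(example: dict) -> str:
    """Toy stand-in for a module: upper-case the input text."""
    return example["text"].upper()


def exact_match(example: dict, prediction: str) -> float:
    """1.0 if the prediction equals the gold label, else 0.0."""
    return float(prediction == example["label"])


devset = [
    {"text": "a", "label": "A"},
    {"text": "b", "label": "B"},
    {"text": "c", "label": "x"},  # deliberately mismatched
    {"text": "d", "label": "D"},
]

score = evaluate(program, devset, exact_match)
print(f"{score:.1f}%")  # 75.0%
```
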
| Architecture | First optimizer | Best optimizer | Why |
|---|---|---|---|
| Single Predict | BootstrapFewShot | MIPROv2 | Simple, fast to optimize |
| Single ChainOfThought | BootstrapFewShot | MIPROv2 | Reasoning benefits from good demos |
| ReAct agent | BootstrapFewShot | BootstrapFewShot | Agents are hard to optimize, start simple |
| Multi-module pipeline | BootstrapFewShot | MIPROv2 | End-to-end optimization tunes all stages |
| Pipeline with fine-tuning | BootstrapFinetune | BetterTogether | Weight tuning for max quality |
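
The optimizer table can be kept as a lookup so that a recommendation always cites the same path. This is a sketch under the assumption that the table above is the source of truth; the dictionary keys and `optimizer_steps` are hypothetical names, and the values are only the optimizer names from the table.

```python
# (first optimizer, best optimizer) per architecture, copied from the table.
OPTIMIZER_PATH = {
    "single_predict": ("BootstrapFewShot", "MIPROv2"),
    "single_chain_of_thought": ("BootstrapFewShot", "MIPROv2"),
    "react_agent": ("BootstrapFewShot", "BootstrapFewShot"),
    "multi_module_pipeline": ("BootstrapFewShot", "MIPROv2"),
    "pipeline_with_finetuning": ("BootstrapFinetune", "BetterTogether"),
}


def optimizer_steps(architecture: str) -> list[str]:
    """Return the ordered optimizer path for an architecture."""
    first, best = OPTIMIZER_PATH[architecture]
    # Collapse to one step when the first and best optimizer are the same.
    return [first] if first == best else [first, best]


print(optimizer_steps("multi_module_pipeline"))  # ['BootstrapFewShot', 'MIPROv2']
print(optimizer_steps("react_agent"))            # ['BootstrapFewShot']
```
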
Output the recommendation in this format:
## Architecture Recommendation
**Module:** dspy.ChainOfThought (or whatever was chosen)
**Why:** [1-2 sentences tying the module to the task]
**Skeleton:**
[minimal code showing the module or pipeline structure]
**Optimizer path:**
1. Start with BootstrapFewShot (quick baseline)
2. Move to MIPROv2 if accuracy needs to improve
**Alternative considered:** [what else was considered and why it was not chosen]
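
If the recommendation is produced programmatically, the template above can be filled by a small formatter. The following is a sketch only; `render_recommendation` and its parameters are hypothetical and simply reproduce the field order of the template.

```python
def render_recommendation(module: str, why: str, skeleton: str,
                          optimizer_steps: list[str], alternative: str) -> str:
    """Fill the recommendation template with concrete values."""
    # Number the optimizer steps as "1. ...", "2. ...", one per line.
    steps = "\n".join(f"{i}. {step}" for i, step in enumerate(optimizer_steps, start=1))
    return (
        "## Architecture Recommendation\n"
        f"**Module:** {module}\n"
        f"**Why:** {why}\n"
        "**Skeleton:**\n"
        f"{skeleton}\n"
        "**Optimizer path:**\n"
        f"{steps}\n"
        f"**Alternative considered:** {alternative}"
    )


text = render_recommendation(
    module="dspy.ChainOfThought",
    why="The task needs a short explanation before the answer.",
    skeleton="cot = dspy.ChainOfThought(MyTask)",
    optimizer_steps=[
        "Start with BootstrapFewShot (quick baseline)",
        "Move to MIPROv2 if accuracy needs to improve",
    ],
    alternative="Predict, rejected because the task is not a direct mapping.",
)
print(text)
```
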
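
Single `dspy.Predict` module:

    # Assumes an LM is already configured via dspy.configure(lm=...).
    import dspy

    class MyTask(dspy.Signature):
        """One sentence describing the task."""
        input_text: str = dspy.InputField()
        output_label: str = dspy.OutputField()

    # Predict maps inputs directly to outputs with no intermediate reasoning.
    predictor = dspy.Predict(MyTask)
    result = predictor(input_text="...")
    print(result.output_label)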

Single `dspy.ChainOfThought` module:

    import dspy

    class MyTask(dspy.Signature):
        """One sentence describing the task."""
        question: str = dspy.InputField()
        answer: str = dspy.OutputField()

    # ChainOfThought adds a reasoning step before producing the answer.
    cot = dspy.ChainOfThought(MyTask)
    result = cot(question="...")
    print(result.answer)

`dspy.ReAct` agent with tools:

    import dspy

    # Tools are plain Python functions; the bodies are left as placeholders.
    def search(query: str) -> str:
        """Search external knowledge base."""
        ...

    def lookup(term: str) -> str:
        """Look up a term in a database."""
        ...

    class MyAgentTask(dspy.Signature):
        """Answer questions using search and lookup tools."""
        question: str = dspy.InputField()
        answer: str = dspy.OutputField()

    # ReAct decides at runtime which tool to call and when to stop.
    agent = dspy.ReAct(MyAgentTask, tools=[search, lookup])
    result = agent(question="...")
    print(result.answer)

Two-stage pipeline (classify, then generate):

    import dspy

    class Classify(dspy.Signature):
        """Classify the input into a category."""
        text: str = dspy.InputField()
        category: str = dspy.OutputField()

    class Generate(dspy.Signature):
        """Generate a response given the category and original text."""
        text: str = dspy.InputField()
        category: str = dspy.InputField()
        response: str = dspy.OutputField()

    class ClassifyThenGenerate(dspy.Module):
        def __init__(self):
            super().__init__()
            # Stage 1 is a direct mapping; stage 2 benefits from reasoning.
            self.classify = dspy.Predict(Classify)
            self.generate = dspy.ChainOfThought(Generate)

        def forward(self, text: str) -> dspy.Prediction:
            # The category from stage 1 becomes an input to stage 2.
            category = self.classify(text=text).category
            response = self.generate(text=text, category=category).response
            return dspy.Prediction(category=category, response=response)

Retrieval pipeline (retrieve, reason, answer):

    import dspy

    # dspy.Retrieve assumes a retrieval model is already configured.
    retriever = dspy.Retrieve(k=3)

    class Reason(dspy.Signature):
        """Given context passages, identify the key facts relevant to the question."""
        question: str = dspy.InputField()
        context: list[str] = dspy.InputField()
        key_facts: str = dspy.OutputField()

    class Answer(dspy.Signature):
        """Answer the question using the identified key facts."""
        question: str = dspy.InputField()
        key_facts: str = dspy.InputField()
        answer: str = dspy.OutputField()

    class RAGPipeline(dspy.Module):
        def __init__(self):
            super().__init__()
            self.retrieve = retriever
            self.reason = dspy.ChainOfThought(Reason)
            self.answer = dspy.ChainOfThought(Answer)

        def forward(self, question: str) -> dspy.Prediction:
            # Retrieve passages, distill key facts, then answer from those facts.
            passages = self.retrieve(question).passages
            key_facts = self.reason(question=question, context=passages).key_facts
            answer = self.answer(question=question, key_facts=key_facts).answer
            return dspy.Prediction(answer=answer, passages=passages)

- **Defaulting to ChainOfThought for everything.** Predict is better for simple classification or extraction where reasoning adds noise, not signal. If the correct output is a fixed label from a known set, CoT can hallucinate reasoning that leads it astray.
- **Using ReAct when a pipeline suffices.** ReAct is for tasks that need dynamic tool selection at runtime. If you know the steps upfront (e.g., always retrieve then answer), use a pipeline: it is cheaper, faster, and easier to optimize.
- **Over-engineering with MultiChainComparison.** MCC runs 3-5x the cost of a single pass. Only reach for it after measuring that single-pass accuracy is insufficient for your use case.
- **Building a pipeline before proving a single module works.** Always start with the simplest module that could work. Measure it on your eval set. Add pipeline stages only when you have a specific, measured quality gap.
- **Ignoring cost implications early.** A ReAct agent with 10 tool calls costs roughly 10x a single Predict call. Factor cost and latency into architecture decisions before you build, not after.

Install any skill:

    npx skills add lebsral/DSPy-Programming-not-prompting-LMs-skills --skill <name>

See also:

- `/dspy-*` skill for your chosen module
- `/ai-building-pipelines`
- `/ai-planning`
- `/ai-auditing-code`
- `/ai-do` if you do not have it. It routes any AI problem to the right skill and is the fastest way to work: `npx skills add lebsral/DSPy-Programming-not-prompting-LMs-skills --skill ai-do`