From skillry-ai-and-agent-systems
Use when you need to design agent roles, handoffs, tool boundaries, workflow routing, and coordination rules.
Slash command
/skillry-ai-and-agent-systems:40-agent-workflow-design
Design multi-agent workflows with explicit role boundaries, handoff contracts, tool ownership, recursion limits, and state-passing schemas. Produce a concrete workflow specification — not a vague diagram — that a developer can implement directly in code. Every agent has one job, every handoff is typed, every loop is bounded, and every write/destructive action passes a code-level approval gate.
- [...] 41-agent-governance-review).
- [...] 42-prompt-systems-review).
- [...] retriever, reasoner, executor, validator, formatter), not agent_1.
- [...] max_depth for any agent that spawns sub-agents and max_iterations for any retry/reflection loop, as hard-coded constants. Defaults: max_depth = 3, max_iterations = 5 unless the use case requires more.
- [...] task_id, origin_agent, step_history, current_context, remaining_budget_tokens, and error_state. Stateless raw-text handoffs are a red flag: the receiver cannot tell what was already tried.
- [...] write/destructive action requires an explicit code-level approval step before the tool call. Name the gate agent, the approval signal format, and the timeout behavior.
- [...] max_depth and max_iterations are explicit integers, not "unlimited" or "default".
- [...] task_id, step_history, and remaining_budget_tokens.
- [...] {ok: bool, error_code: str, result: any}.
Handoff table
| Caller | Callee | Trigger condition | Input schema | Output schema |
|--------------|-----------|----------------------|-------------------------------|----------------------------------------|
| orchestrator | retriever | user query received | {query: str, filters: dict} | {chunks: list[Chunk], scores: list[f]} |
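A handoff row becomes enforceable once its input schema is checked in code at the boundary. The following is a minimal sketch for the orchestrator-to-retriever row only; `RetrieverInput` and `validate_retriever_input` are hypothetical names introduced for illustration, not part of the skill.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RetrieverInput:
    """Input schema for the orchestrator -> retriever handoff (hypothetical type)."""
    query: str
    filters: dict = field(default_factory=dict)


def validate_retriever_input(payload: dict) -> RetrieverInput:
    """Reject payloads that do not match the declared input schema."""
    unknown = set(payload) - {"query", "filters"}
    if unknown:
        raise ValueError(f"unexpected fields: {sorted(unknown)}")
    query = payload.get("query")
    if not isinstance(query, str) or not query:
        raise TypeError("query must be a non-empty str")
    filters = payload.get("filters", {})
    if not isinstance(filters, dict):
        raise TypeError("filters must be a dict")
    return RetrieverInput(query=query, filters=filters)
```

Validating at the callee's entry point means a malformed handoff fails loudly at the boundary instead of surfacing later as a confusing tool error.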
Tool ownership table
| Tool | Owner agent | Max calls/turn | Side-effect level |
|-------------|-------------|----------------|-------------------|
| web_search | retriever | 5 | read |
| file_write | executor | 1 | destructive |
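The ownership table can be enforced with a single check that runs before every tool call. This sketch mirrors the two rows above; `ToolPolicy`, `ToolAccessError`, and `authorize` are hypothetical names, and the `approved` flag stands in for whatever approval signal the design specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolPolicy:
    owner: str
    max_calls_per_turn: int
    side_effect: str  # "read", "write", or "destructive"


# Mirrors the two rows of the tool ownership table.
TOOL_POLICIES = {
    "web_search": ToolPolicy("retriever", 5, "read"),
    "file_write": ToolPolicy("executor", 1, "destructive"),
}


class ToolAccessError(Exception):
    """Raised when a tool call violates ownership, quota, or approval rules."""


def authorize(agent: str, tool: str, calls_this_turn: int, approved: bool = False) -> None:
    """Return None if the call may proceed; raise ToolAccessError otherwise."""
    policy = TOOL_POLICIES.get(tool)
    if policy is None:
        raise ToolAccessError(f"unknown tool: {tool}")
    if agent != policy.owner:
        raise ToolAccessError(f"{agent} does not own {tool}")
    if calls_this_turn >= policy.max_calls_per_turn:
        raise ToolAccessError(f"{tool} call limit reached for this turn")
    # Write and destructive tools need an explicit approval before the call.
    if policy.side_effect in ("write", "destructive") and not approved:
        raise ToolAccessError(f"{tool} requires approval")
```

Keeping the gate in code rather than in a prompt is the point: an agent cannot talk its way past a raised exception.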
```python
# State object passed between agents (TypedDict)
from typing import Optional, TypedDict


class StepRecord(TypedDict):
    agent: str    # agent that made the call
    tool: str     # tool that was called
    ok: bool      # whether the call succeeded
    summary: str  # short description of the outcome


class WorkflowState(TypedDict):
    task_id: str                    # unique top-level task id
    origin_agent: str               # which agent created this state
    step_history: list[StepRecord]  # every tool call + outcome
    current_context: str            # working context for the next agent
    remaining_budget_tokens: int    # estimated session budget left
    error_state: Optional[dict]     # set on a recovery path


# Hard limits: constants, never "unlimited"
MAX_DEPTH = 3
MAX_ITERATIONS = 5


# Standard worker return envelope
def envelope(ok: bool, result=None, error_code: str = "") -> dict:
    return {"ok": ok, "error_code": error_code, "result": result}
```
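The constants only bound the workflow if something checks them. The sketch below shows one way to apply them, assuming a depth counter passed alongside the task; `spawn_sub_agent`, `reflect_until`, `LimitExceeded`, and the `"max_iterations"` error code are hypothetical names chosen for this example.

```python
MAX_DEPTH = 3
MAX_ITERATIONS = 5


class LimitExceeded(RuntimeError):
    """Raised when a hard recursion limit is hit."""


def spawn_sub_agent(task, depth, run):
    """Run `run(task, child_depth)` one level deeper, refusing to pass MAX_DEPTH."""
    child_depth = depth + 1
    if child_depth > MAX_DEPTH:
        raise LimitExceeded(f"depth {child_depth} exceeds MAX_DEPTH={MAX_DEPTH}")
    return run(task, child_depth)


def reflect_until(attempt, accept):
    """Call attempt(i) for i = 1..MAX_ITERATIONS; stop at the first accepted result."""
    last = None
    for i in range(1, MAX_ITERATIONS + 1):
        last = attempt(i)
        if accept(last):
            return {"ok": True, "error_code": "", "result": last}
    # Loop bound reached: report it in the envelope instead of looping forever.
    return {"ok": False, "error_code": "max_iterations", "result": last}
```

Returning the standard envelope when the loop bound is hit lets the orchestrator route the failure like any other, rather than treating exhaustion as a special case.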
Traces are how you prove the design terminates and stays in-role before writing code. For a research-and-act workflow (orchestrator + retriever + reasoner + executor + validator):
Happy path: "summarize the latest pricing doc and update the cache"
1. orchestrator receives task → routes to retriever (depth 1)
   handoff: {query:"pricing doc", filters:{recency:"30d"}}
2. retriever calls web_search (1/5 calls) → returns {chunks:[...], ok:true}
3. orchestrator → reasoner: summarize chunks → {summary:"...", ok:true}
4. the cache update is a write side-effect → APPROVAL GATE
   executor requests file_write; gate returns approved
5. executor calls file_write (1/1) → {ok:true}
6. orchestrator: all steps ok, no further work → TERMINATE. depth never exceeded 1.
Failure path: web_search times out
2'. retriever web_search times out → returns envelope {ok:false, error_code:"timeout"}
3'. orchestrator sees ok:false → retry with backoff (attempt 1 of 3)
4'. all 3 retry attempts fail → escalate per failure routing: return partial
    result {summary:null, error_state:{code:"retrieval_failed"}} and STOP.
    The executor is never reached, so no cache write happens on bad data.
The failure trace is the important one: it confirms the approval gate and the error envelope prevent a write from happening when upstream retrieval failed. A design whose failure trace ends in "executor writes anyway" or "loops forever" is not ready.
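The failure trace can be turned into a small executable check. This is a sketch under stated assumptions: `search` and `gated_write` are hypothetical callables standing in for the retriever's tool call and the approval-gated executor step, the retry count of 3 is read from "attempt 1 of 3", and the exponential backoff schedule is an illustrative choice. The summarization step is omitted.

```python
import time


def retrieve_with_retry(search, query, max_retries=3, base_delay=0.5, sleep=time.sleep):
    """Call search(query); on ok=False retry up to max_retries times with backoff."""
    reply = search(query)
    retries = 0
    while not reply["ok"] and retries < max_retries:
        sleep(base_delay * 2 ** retries)  # 0.5s, 1s, 2s with the defaults
        retries += 1
        reply = search(query)
    return reply


def orchestrate(search, gated_write, query, sleep=time.sleep):
    """Retrieve, then write only if retrieval succeeded."""
    reply = retrieve_with_retry(search, query, sleep=sleep)
    if not reply["ok"]:
        # Failure routing: return a partial result and stop before the executor.
        return {"summary": None, "error_state": {"code": "retrieval_failed"}}
    gated_write(reply["result"])
    return {"summary": reply["result"], "error_state": None}
```

Injecting `sleep` keeps the trace testable without real delays: a test can pass a recording function and assert both the backoff schedule and that the write was never attempted.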
- [...] step_history.
- [...] max_depth it loops until the budget is exhausted.
- [...] file_write directly, breaking isolation so you cannot scope the tool. Enforce ownership; add a broker if cross-role use is genuinely needed.
- [...] max_iterations plus a concrete exit criterion.
- [...] null on error; the orchestrator cannot distinguish "no result" from "tool failed". Require the standard envelope.
Cost and latency are part of the design, not an afterthought. Match each role to the smallest model that does its job reliably.
Record the tier in the agent roster so reviewers can see where budget goes. A common waste is running the orchestrator's routing decision on a frontier model when a small model with a typed output schema routes just as accurately at a fraction of the cost and latency.
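A typed output schema is what makes a small routing model safe to use: anything that does not parse into the schema is rejected before it can steer the workflow. The sketch below assumes the router returns JSON with `next_agent` and `reason` fields; that format, along with `AgentSpec`, `RouteDecision`, and `parse_route`, is hypothetical and introduced only for illustration.

```python
import json
from dataclasses import dataclass

# Worker roles the orchestrator may route to (role names from this skill).
ROUTES = ("retriever", "reasoner", "executor", "validator", "formatter")


@dataclass(frozen=True)
class AgentSpec:
    name: str
    role: str
    model_tier: str  # recorded so reviewers can see where budget goes


# Illustrative roster entry: routing runs on a small model.
ORCHESTRATOR = AgentSpec("orchestrator", "routes each task to one worker", "small")


@dataclass(frozen=True)
class RouteDecision:
    next_agent: str
    reason: str


def parse_route(raw: str) -> RouteDecision:
    """Parse the router's raw output; raise ValueError on anything off-schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"router output is not JSON: {exc}") from exc
    if not isinstance(data, dict) or set(data) != {"next_agent", "reason"}:
        raise ValueError("router output must have exactly next_agent and reason")
    if data["next_agent"] not in ROUTES:
        raise ValueError(f"unknown route: {data['next_agent']!r}")
    if not isinstance(data["reason"], str):
        raise ValueError("reason must be a str")
    return RouteDecision(data["next_agent"], data["reason"])
```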
Produce a workflow specification document containing: (1) agent roster — name, one-sentence role, model tier, max turns; (2) handoff table; (3) tool ownership table; (4) state schema with field types and descriptions; (5) recursion limits as explicit integers with rationale; (6) failure routing table; (7) approval gates list; (8) 2 happy-path traces and 1 failure-path trace, step by step.
Done means each agent has one role, all handoffs and tools are typed and owned, recursion/iteration limits are explicit integers, the state schema and error envelope are defined, failure routing and approval gates are specified, and the design is verified with at least two happy-path traces and one failure-path trace.
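Part of the "done" definition can be checked mechanically. The following sketch assumes the specification is held as a dict; the section keys and the `check_spec` helper are hypothetical and cover only the structural criteria (sections present, integer limits, trace counts), not whether each agent really has a single role.

```python
REQUIRED_SECTIONS = (
    "agent_roster", "handoff_table", "tool_ownership", "state_schema",
    "recursion_limits", "failure_routing", "approval_gates", "traces",
)


def check_spec(spec: dict) -> list:
    """Return a list of problems; an empty list means the structural checks pass."""
    problems = [f"missing section: {name}" for name in REQUIRED_SECTIONS if name not in spec]

    limits = spec.get("recursion_limits", {})
    for key in ("max_depth", "max_iterations"):
        value = limits.get(key)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 1:
            problems.append(f"{key} must be a positive integer, got {value!r}")

    traces = spec.get("traces", {})
    if len(traces.get("happy", [])) < 2:
        problems.append("need at least 2 happy-path traces")
    if len(traces.get("failure", [])) < 1:
        problems.append("need at least 1 failure-path trace")
    return problems
```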
npx claudepluginhub fluxonlab/skillry --plugin skillry-ai-and-agent-systems