Skill

ds-delegate

Subagent delegation for data analysis. Dispatches fresh Task agents per step with output-first verification. Enforced via hooks to prevent analysis code in main chat.

Python

data-engineering

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/workflows:ds-delegate

Not user invocable

Model invocation disabled

Inline context

Default effort

Hooks

PreToolUse

Matcher: Agent
Hooks:
command: uv run python3 ${CLAUDE_PLUGIN_ROOT}/hooks/ds-pre-subagent-clear.py

Matcher: Read
Hooks:
command: uv run python3 ${CLAUDE_PLUGIN_ROOT}/hooks/ds-read-after-subagent-guard.py

Matcher: Grep
Hooks:
command: uv run python3 ${CLAUDE_PLUGIN_ROOT}/hooks/ds-read-after-subagent-guard.py

Matcher: Glob
Hooks:
command: uv run python3 ${CLAUDE_PLUGIN_ROOT}/hooks/ds-read-after-subagent-guard.py

Matcher: Write
Hooks:
command: uv run python3 ${CLAUDE_PLUGIN_ROOT}/hooks/ds-no-main-chat-code-guard.py

Matcher: Edit
Hooks:
command: uv run python3 ${CLAUDE_PLUGIN_ROOT}/hooks/ds-no-main-chat-code-guard.py

Matcher: Bash
Hooks:
command: uv run python3 ${CLAUDE_PLUGIN_ROOT}/hooks/ds-no-main-chat-code-guard.py

PostToolUse

Matcher: Agent
Hooks:
command: uv run python3 ${CLAUDE_PLUGIN_ROOT}/hooks/ds-post-subagent-guard.py

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

- [The Iron Law of Delegation](#the-iron-law-of-delegation)

Supporting Files

references/sql-patterns.md

SKILL.md

527 lines · ~4.8k tokens

Stats

Language: Jupyter Notebook

Stars: 16

Forks: 5

Maintenance: Excellent

Last Commit: Jun 17, 2026

## Core Principle

Fresh subagent per task + output-first verification = reliable analysis

- Analyst subagent does the work
- Must produce visible output at each step
- Methodology reviewer checks approach
- Loop until output verified

## When to Use

Called by ds-implement for each task in PLAN.md. Don't invoke directly.

## The Process

For each task:
    1. Dispatch analyst subagent
       - If questions → answer, re-dispatch
       - Implements with output-first protocol
    2. Verify outputs are present and reasonable
    3. Dispatch methodology reviewer (if complex)
    4. Mark task complete, log to LEARNINGS.md

## Task Type Detection

Each task in PLAN.md should have a type field. Detect and route accordingly:

| Task Type | Agent | Constraints | Example Tasks |
|---|---|---|---|
| `engineering` | `workflows:ds-engineer` | ds-engineering-constraints.md index + atomic E1-E5 files | ETL, merge, clean, transform, pipeline, schema, join |
| `analysis` | `workflows:ds-analyst` | ds-analysis-constraints.md index + atomic A1-A7 files | regression, test, model, visualize, estimate, summarize |

Detection heuristic (when type field is missing):

| Task contains these keywords | Type |
|---|---|
| merge, join, clean, ETL, transform, pipeline, ingest, schema, deduplicate, normalize | engineering |
| regression, estimate, test, model, plot, chart, visualize, summarize, correlate, panel | analysis |
| ambiguous | Default to `analysis` (safer, because analysis constraints are stricter) |
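
The fallback heuristic above could be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the function name `detect_task_type` and the choice to match whole words only (so "merging" would not match "merge") are assumptions. Only the keyword lists and the default-to-analysis rule come from the table.

```python
import re

# Keyword lists copied from the heuristic table above.
ENGINEERING_KEYWORDS = {
    "merge", "join", "clean", "etl", "transform", "pipeline",
    "ingest", "schema", "deduplicate", "normalize",
}
ANALYSIS_KEYWORDS = {
    "regression", "estimate", "test", "model", "plot", "chart",
    "visualize", "summarize", "correlate", "panel",
}


def detect_task_type(task_text, declared_type=None):
    """Return 'engineering' or 'analysis' for a PLAN.md task (hypothetical helper)."""
    # An explicit type field always wins over the keyword heuristic.
    if declared_type in ("engineering", "analysis"):
        return declared_type
    words = set(re.findall(r"[a-z]+", task_text.lower()))
    has_engineering = bool(words & ENGINEERING_KEYWORDS)
    has_analysis = bool(words & ANALYSIS_KEYWORDS)
    # Only an unambiguous engineering match routes to the engineer;
    # mixed or empty matches fall back to the stricter analysis route.
    if has_engineering and not has_analysis:
        return "engineering"
    return "analysis"


print(detect_task_type("Merge customer and transaction tables"))
print(detect_task_type("Clean the data and plot revenue"))
```

The first call matches only engineering keywords, while the second matches both lists and therefore falls back to `analysis`.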

## Step 1: Dispatch Analyst/Engineer

Pattern: Use structured delegation template from references/delegation-template.md

Every delegation MUST include:

- TASK - What to analyze
- EXPECTED OUTCOME - Success criteria
- REQUIRED SKILLS - Statistical/ML methods needed
- REQUIRED TOOLS - Data access and analysis tools
- MUST DO - Output-first verification
- MUST NOT DO - Methodology violations
- CONTEXT - Data sources and previous work
- VERIFICATION - Output requirements

Use this Task invocation (fill in brackets). Route based on task type detected above:

All paths below are relative to this skill's base directory.

**For `analysis` tasks:**

Task(subagent_type="workflows:ds-analyst", prompt="""
# TASK

Analyze: [TASK NAME]

## EXPECTED OUTCOME

You will have successfully completed this task when:
- [ ] [Specific analysis output 1]
- [ ] [Specific analysis output 2]
- [ ] Output-first verification at each step
- [ ] Results documented with evidence

## REQUIRED SKILLS

This task requires:
- [Statistical method]: [Why needed]
- [Programming language]: Data manipulation
- Output-first verification (mandatory)
- SQL reference: Read `${CLAUDE_SKILL_DIR}/../../skills/ds-delegate/references/sql-patterns.md` for dialect-specific patterns
- Data quality checks: Read `${CLAUDE_SKILL_DIR}/../../skills/ds-implement/references/ds-checks.md` for DQ1-DQ6 verification patterns (mandatory)
- Analysis constraints: Read `${CLAUDE_SKILL_DIR}/../../references/constraints/ds-analysis-constraints.md` for the constraint index, then load:
  Read `${CLAUDE_SKILL_DIR}/../../references/constraints/ds-robustness-checks.md`
  Read `${CLAUDE_SKILL_DIR}/../../references/constraints/ds-standard-error-spec.md`
  Read `${CLAUDE_SKILL_DIR}/../../references/constraints/ds-visualization-integrity.md`
  Read `${CLAUDE_SKILL_DIR}/../../references/constraints/ds-table-figure-pairing.md`
- Analysis conventions: Read `${CLAUDE_SKILL_DIR}/../../references/constraints/ds-common-conventions.md` for the convention index, then load:
  Read `${CLAUDE_SKILL_DIR}/../../references/constraints/ds-statistical-validity.md`
  Read `${CLAUDE_SKILL_DIR}/../../references/constraints/ds-p-hacking-prevention.md`
  Read `${CLAUDE_SKILL_DIR}/../../references/constraints/ds-sample-selection.md`
  Read `${CLAUDE_SKILL_DIR}/../../references/constraints/ds-deviation-rules-analysis.md`

## REQUIRED TOOLS

You will need:
- Read: Load datasets and existing code
- Write: Create analysis scripts/notebooks
- Bash: Run analysis and verify outputs

**Tools denied:** None (full analysis access)

## MUST DO

- [ ] Print state BEFORE each operation (shape, head)
- [ ] Print state AFTER each operation (nulls, sample)
- [ ] Verify outputs are reasonable at each step
- [ ] Document methodology decisions

## MUST NOT DO

- ❌ Skip verification outputs
- ❌ Proceed with questionable data without flagging
- ❌ Guess on methodology (ask if unclear)
- ❌ Claim completion without visible outputs

## CONTEXT

### Task Description
[PASTE FULL TASK TEXT FROM PLAN.md]

### Analysis Context
- Analysis objective: [from SPEC.md]
- Data sources: [list with paths]
- Previous steps: [summary from LEARNINGS.md]

## Output-First Protocol (MANDATORY)
For EVERY operation:
1. Print state BEFORE (shape, head)
2. Execute operation
3. Print state AFTER (shape, nulls, sample)
4. Verify output is reasonable

Example:
```python
# df and other are DataFrames loaded earlier in the analysis
print(f"Before: {df.shape}")
df = df.merge(other, on='key')
print(f"After: {df.shape}")
print(f"Total nulls after merge: {df.isnull().sum().sum()}")
print(df.head())  # print explicitly so the sample is visible when run as a script
```

## Required Outputs by Operation

| Operation | Required Output |
|---|---|
| Load data | shape, dtypes, head() |
| Filter | shape before/after, % removed |
| Merge/Join | shape, null check, sample |
| Groupby | result shape, sample groups |
| Model fit | metrics, convergence |

## If Unclear

Ask questions BEFORE implementing. Don't guess on methodology.

## Output

Report: what you did, key outputs observed, any data quality issues found.
""")


**For `engineering` tasks:**

Task(subagent_type="workflows:ds-engineer", prompt="""

# TASK

Engineer: [TASK NAME]

## EXPECTED OUTCOME

You will have successfully completed this task when:
- [ ] [Specific engineering output 1]
- [ ] [Specific engineering output 2]
- [ ] Output-first verification at each step
- [ ] Results documented with evidence

## REQUIRED SKILLS

This task requires:
- [Engineering method]: [Why needed]
- [Programming language]: Data manipulation
- Output-first verification (mandatory)
- SQL reference: Read `${CLAUDE_SKILL_DIR}/../../skills/ds-delegate/references/sql-patterns.md` for dialect-specific patterns
- Data quality checks: Read `${CLAUDE_SKILL_DIR}/../../skills/ds-implement/references/ds-checks.md` for DQ1-DQ6 verification patterns (mandatory)
- Engineering constraints: Read `${CLAUDE_SKILL_DIR}/../../references/constraints/ds-engineering-constraints.md` for the constraint index, then load:
  Read `${CLAUDE_SKILL_DIR}/../../references/constraints/ds-determinism.md`
  Read `${CLAUDE_SKILL_DIR}/../../references/constraints/ds-schema-contracts.md`
  Read `${CLAUDE_SKILL_DIR}/../../references/constraints/ds-join-audits.md`
  Read `${CLAUDE_SKILL_DIR}/../../references/constraints/ds-idempotency.md`
  Read `${CLAUDE_SKILL_DIR}/../../references/constraints/ds-error-handling.md`

## REQUIRED TOOLS

You will need:
- Read: Load datasets and existing code
- Write: Create ETL scripts/pipelines
- Bash: Run transformations and verify outputs

**Tools denied:** None (full engineering access)

## MUST DO

- [ ] Print state BEFORE each operation (shape, head)
- [ ] Print state AFTER each operation (nulls, sample)
- [ ] Verify schema contracts at each step
- [ ] Validate determinism (same input → same output)
- [ ] Check join key uniqueness before merging
- [ ] Document pipeline decisions

## MUST NOT DO

- ❌ Skip verification outputs
- ❌ Proceed with non-deterministic transforms without flagging
- ❌ Introduce silent data loss (row drops without logging)
- ❌ Claim completion without visible outputs

## CONTEXT

### Task Description
[PASTE FULL TASK TEXT FROM PLAN.md]

### Engineering Context
- Pipeline objective: [from SPEC.md]
- Data sources: [list with paths]
- Previous steps: [summary from LEARNINGS.md]

## Output-First Protocol (MANDATORY)
For EVERY operation:
1. Print state BEFORE (shape, head)
2. Execute operation
3. Print state AFTER (shape, nulls, sample)
4. Verify output is reasonable
Example:
```python
# df and other are DataFrames loaded earlier in the pipeline
print(f"Before: {df.shape}")
df = df.merge(other, on='key')
print(f"After: {df.shape}")
print(f"Total nulls after merge: {df.isnull().sum().sum()}")
print(df.head())  # print explicitly so the sample is visible when run as a script
```

## Required Outputs by Operation

| Operation | Required Output |
|---|---|
| Load data | shape, dtypes, head() |
| Filter | shape before/after, % removed |
| Merge/Join | shape, null check, key uniqueness |
| Transform | before/after sample, determinism check |
| Pipeline step | input shape → output shape, schema validation |

## If Unclear

Ask questions BEFORE implementing. Don't guess on architecture.

## Output

Report: what you did, key outputs observed, any data quality or schema issues found.
""")
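
The engineering prompt's "Check join key uniqueness before merging" item could be satisfied with a check along these lines. This is a dependency-free sketch over rows held as a list of dicts. The helper name `duplicate_keys` and the sample data are hypothetical. With pandas, `DataFrame.merge(..., validate="one_to_one")` raises `MergeError` on duplicate keys and may be the more natural choice.

```python
from collections import Counter


def duplicate_keys(rows, key):
    """Return {key_value: count} for values of `key` that occur more than once."""
    counts = Counter(row[key] for row in rows)
    return {value: n for value, n in counts.items() if n > 1}


# Hypothetical sample data: customer id 2 appears twice.
customers = [
    {"id": 1, "name": "A"},
    {"id": 2, "name": "B"},
    {"id": 2, "name": "C"},
]

dupes = duplicate_keys(customers, "id")
print(f"Rows: {len(customers)}, duplicated keys: {dupes}")
if dupes:
    print("Join key is not unique: flag this before merging")
```

Printing the duplicate report before the merge keeps the check visible in the agent's output, in line with the output-first protocol.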


**If agent asks questions:** Answer clearly, especially about methodology choices (analysis) or architecture decisions (engineering).

**If agent completes task:** Verify outputs, then proceed or review.

## Step 2: Verify Outputs (Post-Subagent Boundary)

<EXTREMELY-IMPORTANT>
**After analyst returns, you are at the post-subagent boundary. Constraint C2 (Post-Subagent Boundary) from ds-common-constraints.md applies.**

**ALLOWED (Verification):**
- [ ] Read the analyst's returned report/summary
- [ ] Check LEARNINGS.md for output documentation
- [ ] Confirm output files exist (`ls -la`)
- [ ] Compare task counts (expected vs actual)

**FORBIDDEN (Investigation):**
- ❌ Read project source code, notebooks, or data files
- ❌ Run analysis code to "confirm" results
- ❌ Query databases or inspect intermediate files
- ❌ Grep/Glob project files

**If the analyst's report shows problems, re-dispatch a Task agent. Do NOT investigate yourself.**
</EXTREMELY-IMPORTANT>

Upon verification failure, re-dispatch analyst with specific fix instructions. **Bound this loop: at most 3 fix-and-re-dispatch cycles per task.** If the reviewer still returns ISSUES after 3 cycles, STOP and escalate to the user (the task is harder than the plan assumed — a 4th identical re-dispatch rarely converges). This mirrors ds-review's max-3 cycle cap; a per-task loop with no limit can spin or be silently abandoned.
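
The bounded loop described above might look like the sketch below. The callables `dispatch` and `review` are hypothetical stand-ins for the Task dispatch and the verification step; only the cap of 3 cycles and the escalate-on-exhaustion rule come from the text.

```python
MAX_FIX_CYCLES = 3  # cap stated in the paragraph above


def run_with_fix_cycles(dispatch, review, max_cycles=MAX_FIX_CYCLES):
    """Dispatch once, then allow at most `max_cycles` fix-and-re-dispatch cycles.

    dispatch(fix_instructions) -> report   (hypothetical; None on the first call)
    review(report) -> list of issues       (hypothetical; empty list means verified)
    """
    report = dispatch(None)
    issues = review(report)
    cycles = 0
    while issues and cycles < max_cycles:
        cycles += 1
        # Re-dispatch with the specific issues as fix instructions.
        report = dispatch(issues)
        issues = review(report)
    # Issues remaining after the cap means: stop and escalate to the user.
    status = "approved" if not issues else "escalate"
    return status, cycles, issues
```

Counting cycles separately from the initial dispatch means a task that never passes results in four dispatches in total (one initial, three fixes) before escalation.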

## Step 3: Dispatch Methodology Reviewer (Complex Tasks)

For statistical analysis, modeling, or methodology-sensitive tasks, dispatch a methodology reviewer. **Tailor the review checklist to the task type:**

Task(subagent_type="general-purpose", allowed_tools=["Read", "Glob", "Grep", "Bash(read-only)"], prompt="""
Review methodology for: [TASK NAME]
Task type: [engineering | analysis]

## What Was Done

[SUMMARY FROM ANALYST/ENGINEER OUTPUT]

## Original Requirements

[FROM SPEC.md - especially any replication requirements]

**Tool Restrictions:** The methodology reviewer is READ-ONLY. It reads code, verifies outputs, and returns a verdict. It MUST NOT use Write or Edit.

## CRITICAL: Do Not Trust the Report

The agent may have:
- Reported success without actually running the code
- Cherry-picked output that looks correct
- Glossed over data quality issues
- Made methodology choices without justification

**DO:**
- Read the actual code or notebook cells
- Verify outputs exist and match claims
- Check for silent failures (empty DataFrames, all nulls)
- Confirm assumptions were checked

## Review Checklist: Engineering Tasks

Use this checklist when task type is `engineering`:
- [ ] Are schema contracts validated at each pipeline stage?
- [ ] Is the pipeline deterministic (same input → same output)?
- [ ] Is the transform idempotent (safe to re-run)?
- [ ] Are error handling and edge cases covered (empty inputs, missing keys)?
- [ ] Are join keys validated for uniqueness before merge?
- [ ] Is data loss accounted for (row counts before/after, logged drops)?

## Review Checklist: Analysis Tasks

Use this checklist when task type is `analysis`:
- [ ] Is the statistical method appropriate for the data type?
- [ ] Are assumptions documented and checked?
- [ ] Is sample size adequate for conclusions?
- [ ] Is the specification justified (why these controls, why this functional form)?
- [ ] Are robustness checks included (alternative specs, subsamples)?
- [ ] Is the standard error specification appropriate (clustered, HC, bootstrap)?
- [ ] Are there data leakage or p-hacking concerns?
- [ ] Is the approach reproducible (seeds, versions)?

## Confidence Scoring

Rate each issue 0-100. Only report issues >= 80 confidence.

## Output Format

- APPROVED: Methodology sound (after verifying code/outputs yourself)
- ISSUES: List concerns with confidence scores and file:line references
""")
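
The confidence rule in the reviewer prompt above (rate 0-100, report only issues at 80 or higher) can be illustrated with a small filter. The function names and the `(description, confidence)` tuple shape are assumptions made for this sketch; the threshold and the APPROVED/ISSUES verdicts come from the prompt.

```python
CONFIDENCE_THRESHOLD = 80  # from "Only report issues >= 80 confidence"


def reportable_issues(issues, threshold=CONFIDENCE_THRESHOLD):
    """Keep (description, confidence) pairs at or above the threshold, highest first."""
    kept = [issue for issue in issues if issue[1] >= threshold]
    return sorted(kept, key=lambda issue: issue[1], reverse=True)


def verdict(issues):
    """Return ('APPROVED', []) when nothing clears the threshold, else ('ISSUES', kept)."""
    kept = reportable_issues(issues)
    return ("ISSUES", kept) if kept else ("APPROVED", [])


# Hypothetical reviewer findings.
findings = [
    ("SE not clustered at firm level", 90),
    ("Variable naming inconsistent", 40),
]
print(verdict(findings))
```

Low-confidence findings are dropped rather than reported with a caveat, which keeps the re-dispatch loop focused on problems the reviewer is fairly sure about.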


## Step 4: Log to LEARNINGS.md

Append to `.planning/LEARNINGS.md` after each task:

```markdown
## Task N: [Name] - COMPLETE

**Input:** [describe input state]

**Operation:** [what was done]

**Output:**
- Shape: [final shape]
- Key findings: [observations]

**Verification:**
- [how you confirmed it worked]

**Next:** [what comes next]
```

## Gate: Exit Delegation (Per-Task)

Checkpoint type: human-verify (task completion is machine-verifiable)

Before marking any task as complete, execute this gate:

1. IDENTIFY → What proves this task is done?
   - Task agent returned output (not just "done")
   - Output matches PLAN.md expected output for this task
2. RUN      → Read the agent's actual output (not just the summary)
3. READ     → Verify: shapes reasonable? No unexpected nulls? Sample looks correct?
4. VERIFY   → If statistical task: methodology reviewer approved
5. CLAIM    → Only log "Task N: COMPLETE" in LEARNINGS.md if ALL checks pass

If agent returned no visible output, this gate FAILS. Re-dispatch with explicit output requirements.

Skipping output verification is NOT HELPFUL — unverified results lead the user to act on wrong analysis.
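
One way to make the gate above explicit is to collect every failed check before claiming completion. The sketch below is illustrative only: the function name and boolean parameters are hypothetical, and in practice each flag would be set by reading the agent's actual output.

```python
def gate_failures(has_visible_output, matches_plan, output_looks_sane,
                  is_statistical=False, reviewer_approved=False):
    """Return failed gate checks; an empty list means the task may be logged COMPLETE."""
    failures = []
    if not has_visible_output:
        failures.append("no visible output: re-dispatch with explicit output requirements")
    if not matches_plan:
        failures.append("output does not match the PLAN.md expected output")
    if not output_looks_sane:
        failures.append("shapes, nulls, or sample look wrong")
    # Reviewer approval is only required for statistical tasks.
    if is_statistical and not reviewer_approved:
        failures.append("methodology reviewer has not approved")
    return failures


failures = gate_failures(True, True, True, is_statistical=True)
print("COMPLETE" if not failures else f"BLOCKED: {failures}")
```

Returning the full list instead of stopping at the first failure gives the re-dispatch prompt every specific fix instruction at once.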

## Delegation Facts

"Step complete" asserts four things at once: a Task agent ran the analysis, output was visible, you personally verified it (not just trusted the agent's word), and the methodology reviewer approved (for statistical tasks). If any of these didn't happen, the claim is unverified — it gives the user false confidence in work no one checked.

A Task agent dispatched without SPEC/PLAN context guesses — and executes its guess literally. A summarized PLAN.md strips details the analyst needs; provide the full task text.
An agent's "completion" is a claim, not a result — accepting it without reading the actual output is an unverified claim passed to the user as fact.

## Delete & Restart

If you wrote analysis code in the main chat instead of delegating to a task agent, DELETE it immediately and dispatch a Task agent.

Code written in main chat is contaminated by orchestrator context, skips the output-first protocol, and bypasses methodology review. It cannot be salvaged — it must be replaced.

## Failure Handling

When analyst produces no visible output:

- You must re-dispatch with explicit output requirements
- Treat this as a hard failure, not something to work around

When analyst fails a task:

- You must dispatch a fix subagent with specific instructions
- Don't fix it yourself in main chat: you'll pollute context and hide the real issue

## Example Flow

Me: Implementing Task 1: Load and clean transaction data

[Dispatch analyst with full task text]

Analyst:
- Loaded transactions.csv: (50000, 12)
- Found 5% nulls in amount column
- "Should I drop or impute nulls?"

Me: "Impute with median, flag imputed rows"

[Re-dispatch with answer]

Analyst:
- Imputed 2,500 rows with median ($45.50)
- Added is_imputed flag column
- Final shape: (50000, 13)
- Sample output: [shows head with flag]

[Verify: shapes match, flag exists, no unexpected changes]

[Log to LEARNINGS.md]

[Mark Task 1 complete, move to Task 2]

## Model Tier Hints

When dispatching subagents, match model capability to task complexity via the Agent tool's model parameter (omit it to inherit the session model -- the right default for judgment-heavy work).

| Task Complexity | Model Tier | Signals | Example |
|---|---|---|---|
| Mechanical | Cheapest capable | Data loading, simple filtering, descriptive stats, file format conversion | "Load CSV and compute summary statistics" |
| Integration | Standard | Merges/joins across sources, aggregations, visualization, data reshaping | "Merge transaction and customer tables, create pivot summary" |
| Architecture/Review | Most capable | Feature engineering strategy, model selection, statistical assumption validation, methodology review | "Select appropriate model family and validate distributional assumptions" |

Complexity signals:

- Reads/writes 1 file with clear spec -> mechanical
- Joins/reshapes across sources or produces visualizations -> integration
- Requires statistical judgment or methodology design -> architecture

When in doubt, use the standard tier. Over-allocating is wasteful; under-allocating produces poor results.
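
The complexity signals above could be encoded as a small decision function. This is a sketch under assumptions: the function name, its parameters, and the returned tier labels are hypothetical, and the labels are not values for the Agent tool's model parameter. Mapping a tier to a concrete model is left to the caller.

```python
def pick_model_tier(files_touched, crosses_sources=False,
                    makes_plots=False, needs_judgment=False):
    """Map the complexity signals to a tier label (hypothetical helper)."""
    # Statistical judgment or methodology design dominates every other signal.
    if needs_judgment:
        return "most capable"
    # Joins/reshapes across sources, or visualizations, are integration work.
    if crosses_sources or makes_plots:
        return "standard"
    # A single file with a clear spec is mechanical.
    if files_touched <= 1:
        return "cheapest capable"
    # When in doubt, use the standard tier.
    return "standard"


print(pick_model_tier(1))
print(pick_model_tier(2, crosses_sources=True))
print(pick_model_tier(1, needs_judgment=True))
```

Checking the judgment signal first reflects the note that under-allocating produces poor results: a one-file task that still needs methodology design should not land on the cheapest tier.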

## Integration

This skill is invoked by ds-implement during the output-first implementation phase. After all tasks complete, ds-implement proceeds to ds-review.
