From toolkit-pipeline
Author or validate a DiscoveryConfig dataspec — the YAML describing a target table (columns, grain, business rules, load strategy) consumed by toolkit agent discovery. Use when the user wants to define a new pipeline target, has a dataspec to check, or asks what a dataspec needs to contain.
Slash command
/toolkit-pipeline:spec
A dataspec is one YAML file per target table. Full field reference: the plugin's
references/spec-schema.md (two levels up from this skill's directory); complete examples per
tooling under references/examples/. Read the schema reference before writing anything.
Read it and validate against the schema reference:
- content present and substantive (target table + columns + grain + rules)?
- tooling/materialization combination legal (per-tooling enums)?
- loadStrategy keys spelled right (type, change_detection, watermark_column, ...)?
- targetPlatform consistent with tooling (pyspark → databricks unless they know better)?
- targetRequirement/resolvedTransformation (producer-only fields)?
Report problems with concrete fixes; small gaps (missing grain, vague rules) matter more than formatting, because discovery quality tracks spec quality.
Gather, in order (skip what the user already said):
- sql, dbt, or pyspark: what should pipeline-build emit?
- snowflake or databricks (pyspark defaults to databricks).
- format: ddl.
- format: natural_language: columns (name, source or "derived from X", nullability), the grain ("one row per ..."), and business rules (derivations, filters, SCD expectations). Push for the grain and rules; they drive the generated tests.
- materialization: merge + loadStrategy.type: incremental_merge with change_detection: watermark; append-only facts: incremental_append; small/reference tables: full_refresh/ctas.
- filters block in toolkit.conf, that bounds what discovery can see at all.
Write the file as specs/<target_table>.yaml in the working directory (one spec per target table), show it to the user, and point at /toolkit-pipeline:discover as the next step.
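For orientation, a file such as specs/dim_customer.yaml might look roughly like the sketch below. This is an illustration only, not the schema: the key names (tooling, targetPlatform, format, content, materialization, loadStrategy and its keys) come from the text above, but their nesting, the table name, the columns, and the rules are all assumptions made up for the example. Whether merge is a legal materialization for the chosen tooling is also not confirmed here. Check everything against references/spec-schema.md and the per-tooling examples before relying on it.

```yaml
# Hypothetical dataspec sketch. Nesting and all values are assumed for
# illustration; the authoritative layout is in references/spec-schema.md.
tooling: pyspark
targetPlatform: databricks        # pyspark defaults to databricks
format: natural_language          # assumed to sit beside content; may be nested differently
content: |
  Target table: analytics.dim_customer   (made-up name)
  Columns:
    - customer_id: from crm.customers.id, not null
    - full_name: derived from first_name and last_name, nullable
    - updated_at: from crm.customers.modified_ts, not null
  Grain: one row per customer_id
  Rules:
    - exclude rows where the source record is flagged as a test account
    - keep only the latest version of each customer (no history retained)
materialization: merge            # legality per tooling is not confirmed here
loadStrategy:
  type: incremental_merge
  change_detection: watermark
  watermark_column: updated_at
```

The grain line and the rules list are the parts worth the most attention, since the text above notes that they drive the generated tests.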
npx claudepluginhub phdata/agent-marketplace --plugin toolkit-pipeline