Skill

validate-data

Validates schemas and data loaded by a dlt pipeline: exports mermaid diagrams, inspects data through the dashboard or MCP queries, and fixes data types (Decimal for money), nested structures, and missing columns.

Python

data-engineering

database

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/rest-api-pipeline:validate-data

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

After a successful pipeline load, verify the schema and data make sense. Fix data types, nested structures, and missing columns as needed.

SKILL.md

73 lines · ~725 tokens

Stats

Language: Python

Parent stars: 39

Parent forks: 3

Maintenance: Excellent

Last commit: Mar 31, 2026


Validate loaded data

After a successful pipeline load, verify the schema and data make sense. Fix data types, nested structures, and missing columns as needed.

Parse $ARGUMENTS:

pipeline-name (optional): the dlt pipeline name. If omitted, infer from session context. If ambiguous, ask the user and stop.
hints (optional, after --): specific validation concerns
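The argument convention above can be sketched as a small parser. This is only an illustration: the function name `parse_arguments`, the `ParsedArgs` type, and the whitespace-based tokenising are assumptions, not part of the skill itself.

```python
from typing import NamedTuple, Optional


class ParsedArgs(NamedTuple):
    pipeline_name: Optional[str]  # None means "infer from session context"
    hints: str                    # free-text validation concerns, may be empty


def parse_arguments(raw: str) -> ParsedArgs:
    """Split '<pipeline-name> -- <hints>' into its two optional parts."""
    tokens = raw.split()
    if "--" in tokens:
        idx = tokens.index("--")
        before, after = tokens[:idx], tokens[idx + 1:]
    else:
        before, after = tokens, []
    # Only the first token before '--' is treated as the pipeline name.
    pipeline_name = before[0] if before else None
    return ParsedArgs(pipeline_name, " ".join(after))
```

For example, `parse_arguments("github_issues -- check amount types")` gives a pipeline name of `github_issues` and the hint text `check amount types`; an empty string leaves the name as `None`, which is the case where the name has to be inferred or asked for.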

1. Inspect schema

Export schema as mermaid

dlt pipeline <pipeline_name> schema --format mermaid

Show the mermaid diagram to the user. This gives a quick overview of tables, columns, types, and relationships (parent/child).
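If the export needs to be scripted rather than typed, a thin wrapper around the same CLI call might look like the sketch below. The helper names are hypothetical, and it assumes the `dlt` executable is on the PATH and that the command prints the diagram to standard output.

```python
import subprocess
from typing import List


def schema_command(pipeline_name: str, fmt: str = "mermaid") -> List[str]:
    """Build the argv for the schema export command shown above."""
    return ["dlt", "pipeline", pipeline_name, "schema", "--format", fmt]


def export_schema(pipeline_name: str) -> str:
    """Run the CLI and return whatever it wrote to stdout.

    Assumption: stdout carries the mermaid text; check=True raises if the
    command exits with a non-zero status.
    """
    result = subprocess.run(
        schema_command(pipeline_name),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```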

2. View the data

For the human: Workspace Dashboard

Tell the user to open the Workspace Dashboard:

dlt pipeline <pipeline_name> show

This opens a browser with table schemas, row counts, and sample data.

For the agent: set up pipeline MCP server to query the data

You have an MCP server with the right set of tools available.

3. Review with user

Ask the user if the schema and data look right. Common issues to address:

Data type fixes

Use processing_steps in the resource config to transform data before loading. Available steps: map, filter, yield_map.

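# Requires: from decimal import Decimal
"processing_steps": [
    # Go through str() so a float input does not carry binary rounding error into the Decimal
    {"map": lambda item: {**item, "amount": Decimal(str(item["amount"]))}},
]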

IMPORTANT: NEVER convert monetary amounts or precision-sensitive values to float. Always use Decimal.
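A short, self-contained illustration of why the rule above matters. The `to_decimal` helper is a hypothetical name introduced for this sketch, and the `None` pass-through is an assumption about how missing amounts should be handled.

```python
from decimal import Decimal

# Binary floats cannot represent most decimal fractions exactly.
float_total = 0.1 + 0.2
decimal_total = Decimal("0.1") + Decimal("0.2")

print(float_total == 0.3)               # False
print(decimal_total == Decimal("0.3"))  # True


def to_decimal(value):
    """Convert an API value to Decimal without doing float arithmetic on it."""
    if value is None:
        return None  # leave missing amounts as NULL
    # str() yields the shortest round-tripping text of a float (e.g. "19.99"),
    # which Decimal then parses exactly.
    return Decimal(str(value))
```

A helper like this could stand in for the inline lambda in a `map` step, for example `lambda item: {**item, "amount": to_decimal(item["amount"])}`.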

Nested structures

dlt auto-unnests nested arrays into child tables (e.g., results inside a response becomes <resource>__results). This is often fine for analytics. If the user wants a flat structure, use yield_map to flatten, or adjust data_selector to point deeper into the response.
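As a rough sketch of the flattening option, the generator below turns one parent item into one flat row per nested entry. The field name `results`, the `result_` prefix, and the function name are illustrative assumptions; passing it as `{"yield_map": flatten_results}` in `processing_steps` follows the step names listed earlier, but verify the exact wiring against your dlt version.

```python
from typing import Any, Dict, Iterator


def flatten_results(item: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one flat row per entry in item["results"].

    Parent fields are copied onto every row; nested fields get a
    "result_" prefix so they cannot collide with parent field names.
    """
    parent = {key: value for key, value in item.items() if key != "results"}
    # A missing or null "results" field yields no rows at all.
    for result in item.get("results") or []:
        yield {**parent, **{f"result_{key}": value for key, value in result.items()}}
```

Note the trade-off: a parent with an empty `results` list produces no rows, so the parent record disappears from the flat table. If those parents matter, keeping the default child tables may be the safer choice.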

Missing columns

Columns that are all-null on first load won't have inferred types. Options:

Add columns hints to the resource config: "columns": {"field": {"data_type": "text"}}
Add group_by or other API params to populate the columns
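To decide which fields need a hint, it can help to scan a sample of raw items for fields that are null in every row. The helpers below are a hypothetical sketch in plain Python; only the `{"field": {"data_type": ...}}` shape comes from the text above, and the default type of `text` is a placeholder to be replaced with the real type of each column.

```python
from typing import Any, Dict, Iterable, List


def all_null_fields(rows: Iterable[Dict[str, Any]]) -> List[str]:
    """Return the names of fields that never hold a non-null value in rows."""
    seen = set()
    non_null = set()
    for row in rows:
        for key, value in row.items():
            seen.add(key)
            if value is not None:
                non_null.add(key)
    return sorted(seen - non_null)


def column_hints(fields: Iterable[str], data_type: str = "text") -> Dict[str, Dict[str, str]]:
    """Build a 'columns' mapping in the shape shown in the first option."""
    return {field: {"data_type": data_type} for field in fields}
```

For a sample where `closed_at` is null in every item, `column_hints(all_null_fields(sample), "timestamp")` produces `{"closed_at": {"data_type": "timestamp"}}`, which can be placed under `"columns"` in the resource config.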

4. Iterate

Re-run the pipeline after each change (dev_mode gives a fresh dataset each time). Use debug-pipeline to inspect traces and load packages after each run, then inspect the result again with MCP or dlt pipeline <pipeline_name> schema --format mermaid. Repeat until the user is happy with the schema.

Next steps

User is happy with data → suggest new-endpoint for more resources, view-data for querying, or the data-exploration toolkit for interactive notebooks and reports
Need to fix pipeline code → edit and re-run with debug-pipeline
User wants to see the data → open the Workspace Dashboard with dlt pipeline <pipeline_name> show
