From astronomer-data
Traces downstream data lineage for tables and DAGs to identify dependents, build impact trees, categorize criticality, and assess change risks before modifications.
Slash command
/astronomer-data:tracing-downstream-lineage
Answer the critical question: "What breaks if I change this?"
Use this BEFORE making changes to understand the blast radius.
Find everything that reads from this target:
For Tables:
Search DAG source code: Look for DAGs that SELECT from this table
af dags list to get all DAGs
af dags source <dag_id> to search for table references
Search for FROM target_table, JOIN target_table
Check for dependent views:
-- Snowflake (verify that this view exists in your account; otherwise use the SHOW VIEWS approach below)
SELECT * FROM information_schema.view_table_usage
WHERE table_name = '<target_table>'
-- Or check SHOW VIEWS and search definitions
Look for BI tool connections:
If you're running on Astro, the Lineage tab in the Astro UI provides visual dependency graphs across DAGs and datasets, making downstream impact analysis faster. It shows which DAGs consume a given dataset and their current status, reducing the need for manual source code searches.
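The manual source search for tables can be scripted once the DAG sources have been collected. The following is a minimal Python sketch, assuming the sources were already saved into a dict (for example by capturing the output of af dags source <dag_id> for each DAG). The function name, the dict, and the sample SQL are hypothetical and only illustrate the FROM/JOIN pattern match.

```python
import re


def find_table_references(dag_sources, table):
    """Return {dag_id: [matched snippets]} for DAGs whose source reads from `table`."""
    # Match "FROM <table>" or "JOIN <table>", case-insensitively, without
    # matching longer names such as "<table>_archive".
    pattern = re.compile(
        r"\b(?:FROM|JOIN)\s+" + re.escape(table) + r"\b(?!\.)",
        re.IGNORECASE,
    )
    hits = {}
    for dag_id, source in dag_sources.items():
        matches = [m.group(0) for m in pattern.finditer(source)]
        if matches:
            hits[dag_id] = matches
    return hits


# Hypothetical DAG sources keyed by dag_id (illustrative only).
dag_sources = {
    "transform_daily_sales": (
        "INSERT INTO agg.daily_sales "
        "SELECT * FROM fct.orders o JOIN dim.customers c ON o.cid = c.id"
    ),
    "load_customers": "SELECT * FROM raw.customers",
    "archive_orders": "SELECT * FROM fct.orders_archive",
}

readers = find_table_references(dag_sources, "fct.orders")
print(readers)
```

A plain regex misses table names built dynamically (templated SQL, f-strings, variables), so treat the result as a starting list to confirm by hand, not a complete inventory.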
For DAGs:
af dags source <dag_id> to find output tables
Map the full downstream impact:
SOURCE: fct.orders
    |
    +-- TABLE: agg.daily_sales --> Dashboard: Executive KPIs
    |       |
    |       +-- TABLE: rpt.monthly_summary --> Email: Monthly Report
    |
    +-- TABLE: ml.order_features --> Model: Demand Forecasting
    |
    +-- DIRECT: Looker Dashboard "Sales Overview"
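Once direct consumers are known, the full tree is a transitive walk over those edges. The sketch below is one possible way to do it in Python, assuming the direct dependencies have been recorded by hand in a dict. The edge map mirrors the example tree above, and the names DOWNSTREAM and downstream_impact are hypothetical.

```python
from collections import deque

# Hypothetical edge map mirroring the example tree: asset -> direct consumers.
DOWNSTREAM = {
    "fct.orders": ["agg.daily_sales", "ml.order_features", "Looker: Sales Overview"],
    "agg.daily_sales": ["Dashboard: Executive KPIs", "rpt.monthly_summary"],
    "rpt.monthly_summary": ["Email: Monthly Report"],
    "ml.order_features": ["Model: Demand Forecasting"],
}


def downstream_impact(source, edges):
    """Breadth-first walk returning (asset, depth) for every transitive dependent."""
    seen = {source}
    queue = deque([(source, 0)])
    impacted = []
    while queue:
        node, depth = queue.popleft()
        for child in edges.get(node, []):
            if child not in seen:  # guards against cycles and diamond dependencies
                seen.add(child)
                impacted.append((child, depth + 1))
                queue.append((child, depth + 1))
    return impacted


impact = downstream_impact("fct.orders", DOWNSTREAM)
for asset, depth in impact:
    print("  " * depth + asset)
```

The depth value gives a rough sense of distance from the change: depth 1 assets read the target directly, while deeper assets break only if an intermediate table does.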
Critical (breaks production):
High (causes significant issues):
Medium (inconvenient):
Low (minimal impact):
For the proposed change, evaluate:
Schema Changes (adding/removing/renaming columns):
Data Changes (values, volumes, timing):
Deletion/Deprecation:
Identify who owns downstream assets:
owners field in DAG definitions
"Changing fct.orders will impact X tables, Y DAGs, and Z dashboards"
                    +--> [agg.daily_sales] --> [Executive Dashboard]
                    |
[fct.orders] -------+--> [rpt.order_details] --> [Ops Team Email]
                    |
                    +--> [ml.features] --> [Demand Model]
| Downstream | Type | Criticality | Owner | Notes |
|---|---|---|---|---|
| agg.daily_sales | Table | Critical | data-eng | Updated hourly |
| Executive Dashboard | Dashboard | Critical | analytics | CEO views daily |
| ml.order_features | Table | High | ml-team | Retraining weekly |
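The one-line impact summary can be derived directly from rows like those in the table above. This is a small Python sketch under that assumption. The rows are copied from the table, while the function name summarize_impact and the tuple layout are hypothetical.

```python
from collections import Counter

# Rows copied from the dependency table: (asset, type, criticality).
dependents = [
    ("agg.daily_sales", "Table", "Critical"),
    ("Executive Dashboard", "Dashboard", "Critical"),
    ("ml.order_features", "Table", "High"),
]


def summarize_impact(target, rows):
    """Count dependents by type and render the summary sentence."""
    counts = Counter(asset_type for _, asset_type, _ in rows)
    # Counter returns 0 for types that do not appear (no DAG rows here).
    return (
        f"Changing {target} will impact {counts['Table']} tables, "
        f"{counts['DAG']} DAGs, and {counts['Dashboard']} dashboards"
    )


summary = summarize_impact("fct.orders", dependents)
critical = [name for name, _, level in dependents if level == "Critical"]
print(summary)
print("Critical dependents:", critical)
```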
| Change Type | Risk Level | Mitigation |
|---|---|---|
| Add column | Low | No action needed |
| Rename column | High | Update 3 DAGs, 2 dashboards |
| Delete column | Critical | Full migration plan required |
| Change data type | Medium | Test downstream aggregations |
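When a proposed change bundles several modifications, the overall risk is reasonably taken as the most severe individual one. The Python sketch below assumes that rule. The risk levels are copied from the table above, and the severity ordering and the helper name highest_risk are assumptions made for illustration.

```python
# Risk levels copied from the risk table; keys are the change types listed there.
RISK_BY_CHANGE = {
    "Add column": "Low",
    "Rename column": "High",
    "Delete column": "Critical",
    "Change data type": "Medium",
}

# Assumed ordering from least to most severe, used only for comparison.
SEVERITY = ["Low", "Medium", "High", "Critical"]


def highest_risk(changes):
    """Return the most severe risk level among the proposed change types."""
    levels = [RISK_BY_CHANGE[change] for change in changes]
    return max(levels, key=SEVERITY.index)


overall = highest_risk(["Add column", "Rename column"])
print(overall)
```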
Before making changes:
transform_daily_sales
npx claudepluginhub astronomer/agents --plugin astronomer-data