From grimoire
Designs data pipelines using functional principles: idempotency, immutability, declarative transformations. Guides on ELT, partitioning, dbt layers, data quality tests, and DAG orchestration.
Slash command
/grimoire:design-data-pipeline
Design data pipelines using functional principles — idempotency, immutability, and declarative transformations — for reliability and maintainability.
Adopted by: Airbnb (Airflow originator), Fishtown Analytics (dbt originator, now dbt Labs), Lyft, GitLab
Impact: Beauchemin's functional data engineering principles, adopted by thousands of data teams, eliminate an entire class of pipeline bugs (non-idempotent transforms, mutable state) that cause silent data corruption. dbt's adoption grew from 0 to 30,000+ companies in 5 years due to its application of software engineering practices to data transformation.
Why best: Traditional ETL pipelines are stateful, brittle, and difficult to test. Functional data engineering applies software engineering principles: pure transformations (same input → same output), immutable historical data, idempotent operations (safe to re-run), and declarative SQL-based transforms that are version-controlled and testable.
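The three properties named above can be illustrated with a minimal Python sketch. Everything in it (the `Order` record, `daily_revenue`, `load_partition`, and the in-memory dict standing in for a warehouse) is a hypothetical stand-in chosen for illustration, not part of the skill itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: instances cannot be mutated after creation
class Order:
    order_id: int
    amount_cents: int
    created_date: str  # ISO date, e.g. "2026-03-01"


def daily_revenue(orders: tuple[Order, ...], day: str) -> int:
    """Pure transform: the result depends only on the arguments."""
    return sum(o.amount_cents for o in orders if o.created_date == day)


def load_partition(warehouse: dict[str, int], day: str, value: int) -> dict[str, int]:
    """Idempotent load: return a new mapping with the day's partition replaced."""
    return {**warehouse, day: value}


# Hypothetical sample data.
orders = (
    Order(1, 1200, "2026-03-01"),
    Order(2, 800, "2026-03-01"),
    Order(3, 500, "2026-03-02"),
)

day = "2026-03-01"
once = load_partition({}, day, daily_revenue(orders, day))
twice = load_partition(once, day, daily_revenue(orders, day))  # simulated re-run
```

Because the load replaces the partition instead of adding to it, `once` and `twice` are equal: a retry or backfill cannot double-count revenue.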
INSERT OVERWRITE or CREATE OR REPLACE semantics; never INSERT APPEND without deduplication guards.
processed_at timestamp.
ingested_at timestamps.
dbt model structure:
models/
  staging/
    stg_orders.sql       -- raw → typed, renamed
    stg_payments.sql
  intermediate/
    int_order_items.sql  -- join orders + line items
  marts/
    fct_orders.sql       -- fact table for analytics
    dim_customers.sql    -- dimension table
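One way to picture how such a layered layout becomes a DAG is the sketch below. The dependency map is an assumption made for illustration, since the listing does not say which model selects from which. The `stg`/`int`/`fct`/`dim` prefix ranks are likewise a hypothetical convention for checking that no model depends on a later layer.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each model -> the models it selects from.
deps = {
    "stg_orders": set(),
    "stg_payments": set(),
    "int_order_items": {"stg_orders"},
    "fct_orders": {"int_order_items", "stg_payments"},
    "dim_customers": {"stg_orders"},
}

# Assumed layer ranks: staging < intermediate < marts.
LAYER_RANK = {"stg": 0, "int": 1, "fct": 2, "dim": 2}


def layer(model: str) -> int:
    """Rank a model by its name prefix (the text before the first underscore)."""
    return LAYER_RANK[model.split("_", 1)[0]]


# A valid execution order: every model appears after all of its upstream models.
run_order = list(TopologicalSorter(deps).static_order())

# Dependencies that point from an earlier layer to a later one.
violations = [
    (model, upstream)
    for model, upstreams in deps.items()
    for upstream in upstreams
    if layer(upstream) > layer(model)
]
```

With this map, `violations` is empty and `run_order` places each staging model before anything that reads from it. An orchestrator or dbt itself would normally derive the graph from the models' references rather than from a hand-written dict.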
Idempotent load (BigQuery):
-- Rebuild the 2026-03-01 table from raw data; a re-run replaces it instead of appending.
CREATE OR REPLACE TABLE `project.dataset.orders_2026_03_01` AS
SELECT *
FROM `project.raw.orders`
WHERE DATE(created_at) = '2026-03-01';
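The same re-run safety can be tried locally without a warehouse. The sketch below uses Python's built-in `sqlite3` module and swaps in a delete-then-insert inside one transaction, since SQLite has no `CREATE OR REPLACE TABLE`. The table names, columns, and sample rows are hypothetical.

```python
import sqlite3


def load_day(conn: sqlite3.Connection, day: str) -> None:
    """Rebuild one day's partition: delete it, then re-insert it from raw data."""
    with conn:  # one transaction: commits on success, rolls back on error
        conn.execute("DELETE FROM orders WHERE created_date = ?", (day,))
        conn.execute(
            "INSERT INTO orders (order_id, created_date, amount_cents) "
            "SELECT order_id, DATE(created_at), amount_cents "
            "FROM raw_orders WHERE DATE(created_at) = ?",
            (day,),
        )


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_orders (order_id INTEGER, created_at TEXT, amount_cents INTEGER);
    CREATE TABLE orders (order_id INTEGER, created_date TEXT, amount_cents INTEGER);
    INSERT INTO raw_orders VALUES
        (1, '2026-03-01 09:15:00', 1200),
        (2, '2026-03-01 17:40:00', 800),
        (3, '2026-03-02 08:00:00', 500);
""")

load_day(conn, "2026-03-01")
load_day(conn, "2026-03-01")  # simulated re-run, e.g. a retry or backfill
row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

After both calls `row_count` is 2, not 4: the second run replaces the partition rather than appending to it. A plain `INSERT` without the preceding `DELETE` would have duplicated both rows.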
INSERT OVERWRITE or MERGE semantics.
npx claudepluginhub jeffreytse/grimoire --plugin grimoire