From bootstrap-domain
Analytics engineering practice patterns for dbt, SQL modeling, data transformation, dimensional modeling, metrics layers, and semantic models. Covers dbt Core, dbt Cloud, Jinja, YAML configuration, testing, documentation, and data modeling best practices. Use when reviewing or building dbt projects, SQL transformations, or data models. Do not use for BI dashboards or visualization (use analytics), data pipeline infrastructure (use data-engineering), or ML model development (use data-science).
How this skill is triggered — by the user, by Claude, or both
Slash command
/bootstrap-domain:analytics-engineeringThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Domain-specific patterns and checklists for analytics engineering work.
Domain-specific patterns and checklists for analytics engineering work.
models/
staging/
source_name/
_source_name__sources.yml
_source_name__models.yml
stg_source_name__table.sql
intermediate/
domain/
int_domain__purpose.sql
marts/
domain/
_domain__models.yml
dim_entity.sql
fct_events.sql
tests/
generic/
test_custom.sql
singular/
assert_specific_condition.sql
macros/
generate_schema_name.sql
custom_macros.sql
| Prefix | Layer | Purpose |
|---|---|---|
| stg_ | Staging | 1:1 with source, clean and rename |
| int_ | Intermediate | Business logic, joins, aggregations |
| dim_ | Marts | Dimension tables (entities) |
| fct_ | Marts | Fact tables (events, transactions) |
| rpt_ | Reports | Pre-aggregated for BI tools |
with source as (
select * from {{ source('source_name', 'table_name') }}
),
renamed as (
select
-- Primary key
id as entity_id,
-- Foreign keys
related_id as related_entity_id,
-- Dimensions
name as entity_name,
type as entity_type,
-- Dates
created_at,
updated_at
from source
)
select * from renamed
{{
config(
materialized='incremental',
unique_key='event_id',
incremental_strategy='merge'
)
}}
with source_data as (
select *
from {{ ref('stg_events') }}
{% if is_incremental() %}
where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
)
select * from source_data
models:
- name: fct_orders
columns:
- name: order_id
tests:
- unique
- not_null
- name: customer_id
tests:
- not_null
- relationships:
to: ref('dim_customers')
field: customer_id
- name: status
tests:
- accepted_values:
values: ['pending', 'completed', 'cancelled']
-- tests/assert_total_revenue_positive.sql
select *
from {{ ref('fct_orders') }}
where total_revenue < 0
models:
- name: fct_orders
description: >
Order transactions at the order level. One row per order.
Includes order totals, status, and timestamps.
Grain: one row per order_id.
columns:
- name: order_id
description: Primary key. Unique identifier for the order.
- name: total_revenue
description: >
Total revenue for the order in USD.
Excludes tax and shipping.
Snowflake:
BigQuery:
Redshift:
Architecture decisions and dimensional modeling patterns in the context of analytics engineering tooling.
| Pattern | When to use | Trade-off |
|---|---|---|
| Star schema (fact + dim tables) | BI tools, standard analytics, join performance matters | More joins; simpler business logic per table |
| Wide / OBT (one big table) | Embedded analytics, columnar warehouse, denormalized BI | Faster queries; harder to maintain, schema drift |
| Normalized (3NF-adjacent) | Source-aligned staging, data contracts, high write volume | Fewer redundant columns; more joins downstream |
Decision rule: Start with star schema in the mart layer. Denormalize (OBT) only when you have empirical evidence of join performance issues at scale.
Bronze (raw) → Silver (cleaned/conformed) → Gold (business-level)
dbt mapping: Staging models ≈ Silver boundary. Mart models = Gold layer. Bronze is source-system tables.
A data contract is a schema + SLA agreement between a producer and a consumer. Minimum contract fields:
schema: column names, types, nullabilitygrain: what one row representssla: freshness expectation (e.g., updated within 6 hours of period close)owner: team or person responsible for the producing modelIn dbt: document contracts in schema.yml with constraints and data_tests. Use dbt source freshness for SLA monitoring.
| Workload | Snowflake | BigQuery | Databricks |
|---|---|---|---|
| Standard SQL analytics, BI | ✓ Strong | ✓ Strong | Adequate |
| Large-scale ML + analytics | Adequate | Adequate | ✓ Strong |
| Streaming + batch unified | Limited | ✓ (via Dataflow) | ✓ Strong |
| Cost predictability | ✓ (credit model) | Variable (on-demand) | Variable |
| dbt compatibility | ✓ Native | ✓ Native | ✓ Native |
Decision rule: Default to the warehouse your data team already operates. Migration costs exceed optimization gains in most cases. Adopt Databricks if you have significant ML workloads that need the same data as your analytics.
npx claudepluginhub davekim917/bootstrap --plugin bootstrap-domainProvides dbt patterns for building staging, intermediate, and marts data models, incremental materializations, schema tests, and transformation pipelines in analytics engineering.
Provides production-ready patterns for dbt including model organization, testing, documentation, and incremental processing for analytics engineering.
Builds and modifies dbt models with SQL transformations using ref() and source(), creates tests, validates results with dbt show. For dbt projects: modeling, debugging errors, data exploration, testing, change evaluation.