From vibedata-dbt-skills
Define semantic models and MetricFlow metrics in dbt on Microsoft Fabric. Use when creating semantic model YAML, defining entities, dimensions, and measures, or configuring metric types. Also use when choosing between denormalized marts and semantic layer queries.
How this skill is triggered — by the user, by Claude, or both
Slash command
/vibedata-dbt-skills:dbt-semantic-layerThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Practitioner patterns for defining semantic models and MetricFlow metrics in dbt on Microsoft Fabric. Focuses on what LLMs consistently get wrong: entity type selection, metric type decisions, join path design, and the semantic layer vs denormalized mart trade-off.
Practitioner patterns for defining semantic models and MetricFlow metrics in dbt on Microsoft Fabric. Focuses on what LLMs consistently get wrong: entity type selection, metric type decisions, join path design, and the semantic layer vs denormalized mart trade-off.
This is not "it depends." Use these concrete criteria:
| Signal | Use semantic layer | Use denormalized mart |
|---|---|---|
| Metric consumed by 3+ tools (BI, notebooks, APIs) | Yes -- single definition, multiple consumers | No |
| Metric needs flexible dimensionality (different group-bys per consumer) | Yes -- MetricFlow composes joins dynamically | No |
| One dashboard with fixed dimensions | No -- overkill | Yes -- simpler, faster |
| Metric requires cross-model joins (orders + customers + products) | Yes -- MetricFlow handles join paths | Possible but creates wide, fragile tables |
| Team has no dbt Cloud (SL API requires Cloud) | Partial -- dbt sl query works locally, no API | Yes -- only option for ad-hoc BI |
| Source models change schema frequently | Yes -- metric definitions absorb changes | No -- every mart must be updated |
| Need percentile/median aggregations on Fabric | No -- see Fabric limitations below | Yes -- compute in the mart SQL directly |
Default rule: If the metric is consumed in more than one place, define it in the semantic layer. If it's a one-off dashboard column, a mart is fine.
Keep marts normalized (separate fct_ and dim_ tables). Let MetricFlow denormalize dynamically at query time. This:
dim_customers instead of customer columns copied into every fact)Anti-pattern: Building fct_orders_with_customers_and_products as a pre-joined wide mart when you also have semantic models. This creates two sources of truth.
| Entity type | Guarantees | Use when |
|---|---|---|
primary | Exactly one row per value, no nulls | The column is the grain of the model (e.g., order_id in fct_orders) |
unique | At most one row per value, nulls allowed | The column uniquely identifies rows but may have nulls (e.g., email in dim_customers) |
foreign | Multiple rows per value allowed | The column references a primary/unique entity in another model (e.g., customer_id in fct_orders) |
natural | No uniqueness guarantee | Shared business concept without enforced cardinality (rarely used -- prefer foreign + primary pair) |
Common mistake: Using primary for a foreign key column. If customer_id appears in fct_orders (many orders per customer), it must be foreign, not primary. MetricFlow uses entity types to determine join cardinality -- wrong types produce incorrect joins or fan-out errors.
Every semantic model needs these four sections:
semantic_models:
- name: orders # Must match a dbt model
description: "Order fact table. Grain: one row per order."
model: ref('fct_orders') # Points to the dbt model
defaults:
agg_time_dimension: order_date # Required -- MetricFlow needs a time spine
entities:
- name: order_id
type: primary # Grain of this model
- name: customer_id
type: foreign # FK to dim_customers
expr: customer_id # Optional if name matches column
dimensions:
- name: order_date
type: time
type_params:
time_granularity: day # Controls minimum query grain
- name: order_status
type: categorical
measures:
- name: order_total
description: "Sum of order amounts"
agg: sum
expr: amount # SQL expression or column name
- name: order_count
agg: sum
expr: "1" # Count pattern: sum of literal 1
Key rules:
defaults.agg_time_dimension is required. Without it, MetricFlow cannot place metrics on a time axis.primary entity.expr on measures accepts any valid SQL expression, including CASE WHEN for conditional aggregations.entity__dimension syntax to disambiguate.| Aggregation | T-SQL compatible | Notes |
|---|---|---|
sum | Yes | Most common |
average / avg | Yes | |
count_distinct | Yes | |
min / max | Yes | |
sum_boolean | Yes | Sums a boolean/bit column |
count | Yes | Counts non-null values |
percentile | No on Fabric | Requires PERCENTILE_CONT -- see Fabric limitations |
median | No on Fabric | Syntactic sugar for 50th percentile |
The dbt Semantic Layer (via dbt Cloud API) does not officially support Fabric as a query platform. However, you can still define semantic models and metrics in a dbt-fabric project and query them locally via dbt sl query. Limitations:
dbt sl query (CLI) or by materializing via saved query exports.PERCENTILE_CONT SQL, which Fabric's T-SQL does not support. Use a measure with expr containing a manual workaround or compute percentiles in the mart model instead.QUALIFY: MetricFlow may generate QUALIFY clauses for certain filter operations. Fabric does not support QUALIFY -- if you hit this, file a bug or restructure the metric to avoid the pattern.sum_boolean measures should reference BIT columns (0/1), not TRUE/FALSE literals.expr: Use T-SQL syntax (LEN(), CHARINDEX(), ISNULL()) not Postgres/Snowflake syntax (LENGTH(), POSITION(), COALESCE() for two-arg).When the Semantic Layer API is not available (Fabric), use saved query exports to materialize metrics as tables:
saved_queries:
- name: monthly_revenue_by_customer
description: "Revenue metrics grouped by customer and month"
query_params:
metrics:
- revenue
- order_count
group_by:
- TimeDimension('metric_time', 'month')
- Dimension('customer__customer_segment')
where:
- "{{ TimeDimension('metric_time', 'month') }} >= '2024-01-01'"
exports:
- name: monthly_revenue_export
config:
export_as: table # or 'view'
schema: gold # Target schema on Fabric
Run with dbt build --select saved_queries to materialize. This bridges the gap when the full Semantic Layer API is unavailable.
What LLMs consistently get wrong with dbt semantic models and metrics:
fct_/dim_), not stg_ models. Staging models are 1:1 with sources and lack business logic.primary on a column that has many rows per value. MetricFlow will reject the join or produce incorrect results.create_metric: true AND a separate metric block. Pick one.agg_time_dimension -- Every semantic model needs a default time dimension. Without it, time-series queries fail.natural entity type by default -- Natural entities provide no join guarantees. Use foreign + primary pairs for predictable joins.expr -- Using || for concatenation, INTERVAL for date math, or :: for casting in measure/dimension expressions. These fail on Fabric.entity in a conversion metric must exist in both the base and conversion semantic models. Mismatched entities produce empty results.For deeper guidance on specific patterns:
npx claudepluginhub accelerate-data/vibedata-official --plugin vibedata-dbt-skillsSearches MemPalace before answering questions about past work, people, projects, or prior decisions. Returns verbatim stored content instead of guessing from model memory.
Guides Payload CMS config (payload.config.ts), collections, fields, hooks, access control, APIs. Debugs validation errors, security, relationships, queries, transactions, hook behavior.
Implements vector databases with Pinecone, Weaviate, Qdrant, Milvus, pgvector for semantic search, RAG, recommendations, and similarity systems. Optimizes embeddings, indexing, and hybrid search.