Create first-class Cube.dev cube definitions from existing RevOS dbt gold models. Use when asked to: build a semantic layer, create cubes, generate Cube definitions from dbt, define cube files, or create a semantic model from gold models.
How this skill is triggered — by the user, by Claude, or both
Slash command
/revos-data-engineering:create-cubesThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
This skill delegates dbt-related knowledge to `create-dbt-transformations`:
This skill delegates dbt-related knowledge to create-dbt-transformations:
project layout, finding gold models, resolving dbt model names to BigQuery table references,
creating bridge models from JSON arrays, and dbt validation commands.
If create-dbt-transformations is not installed: discover gold models directly via
find dbt/models/gold -name "*.sql" and skip bridge model delegation.
Warn the user: "The create-dbt-transformations skill is not installed — bridge model creation and dbt validation are limited."
If BigQuery exploration is needed (listing tables, inspecting schemas, previewing rows),
load the explore-lakehouse skill on demand.
Expose existing dbt gold models as queryable Cube.dev semantic models without manually writing YAML boilerplate. Gold models may be tables or views.
Each cube is a complete, standalone definition stored in cubes/. There is no patching or merging — what is in the file is what gets deployed.
This skill does not build gold models. If a needed gold model is missing, hand off to create-dbt-transformations.
Strip the gold_ prefix for cube names and file names. Keep gold_ in sql_table (physical table).
Cube names must be valid SQL/JS identifiers — snake_case (letters, digits, underscores), never hyphens. The cube identifier is metadata.name: the IaC loader mirrors it into definition.name, and the Cube.dev compiler parses that as the identifier referenced by ${CUBE} and joins. Hyphens in metadata.name surface as a generic 500 from the compiler with no field-level message. Never put name inside spec — the loader strips it. (Connection resources follow the opposite convention — metadata.name is hyphenated there; this rule is cube-specific.)
gold SQL file: dbt/models/gold/gold_hubspot_companies.sql
BigQuery table: gold_hubspot_companies
cube file: cubes/hubspot_companies.yml
cube name: hubspot_companies
join reference: ${hubspot_companies}
sql_table: "`<dataset>.gold_hubspot_companies`"
Same rule for bridge cubes: gold_deals_companies -> cube name deals_companies, file cubes/deals_companies.yml.
sql_table ReferenceReference the physical warehouse table directly. Cube does not understand Jinja — never use dbt ref().
Use the literal dataset value resolved in Phase 2. Wrap in BigQuery backticks:
sql_table: "`<dataset>.gold_hubspot_companies`"
Apply the same fully qualified format in refresh_key.sql.
After discovering gold models, show available models and ask which should participate. Do not proceed until the user selects. Exception: if the user named one specific model, treat it as selected.
When selected models are not directly connected, search remaining gold models for connectors. Present the connector path and ask approval before adding.
When a many-to-many relationship is detected and no suitable bridge model exists, ask whether to create one. If approved, delegate to create-dbt-transformations.
Present validated relationships with join directions, cardinality, and match rates. Ask the user to confirm before generating cube files. Do not present unvalidated joins as confirmed — mark them as validation pending.
Generate default count measure, suggest additional measures, and ask the user to confirm or define custom measures.
Follow these phases in order. Do not skip ahead.
find dbt/models/gold -name "*.sql".create-dbt-transformations.cubes/ to detect conventions (extends:, public:, refresh_key style). Apply detected conventions to new cubes. Always use flat single-cube YAML (never cubes: or views: root).Before any SQL or YAML generation, resolve $REVOS_BQ_DATASET to a literal:
echo "DATASET=$REVOS_BQ_DATASET"
If empty, stop and ask the user to set it. Use this literal everywhere downstream.
For each selected gold model, inspect columns and types. Use explore-lakehouse if needed.
Check whether _airbyte_extracted_at exists (needed for refresh_key and as the only Airbyte dimension to expose).
Detect primary keys, secondary keys, foreign keys, and JSON/array keys. See references/key-patterns.md for common patterns and detection rules.
Validate primary key uniqueness with SQL. See references/validation-queries.md, section 1.
Output a schema summary for each model (format in key-patterns.md).
Build a candidate relationship graph. These are candidates only — they must be validated in Phase 4.
If only one model was selected and it has no JSON/array relationship columns, skip to Phase 6. If it has JSON/array columns, check for bridge/junction needs.
Use Cube types: one_to_one, one_to_many, many_to_one, many_to_many. Direction is always from the perspective of the current cube.
many_to_one / reverse one_to_many.one_to_one.many_to_many.Look for FK-to-PK matches, secondary key matches, existing bridge models, and JSON/array relationship fields.
When selected models are disconnected:
If an existing bridge model is found in dbt/models/gold/:
If no bridge exists and user approves creating one (Checkpoint 2b), delegate to create-dbt-transformations. Once built, generate a bridge cube with public: false.
Verify with SQL against BigQuery that candidate joins work and direction is correct.
For each candidate relationship, validate:
See references/validation-queries.md for all SQL templates.
Validate both directions of every candidate relationship where possible.
Do not present unvalidated joins as confirmed. If validation cannot run, mark as validation pending.
Present all validated relationships to the user: selected models, approved connectors, keys, joins with cardinality, match rates, and validation evidence.
Ask user to confirm or modify (Checkpoint 3). Do not generate files until confirmed.
If a relationship could not be validated but user proceeds, tag it with # UNVALIDATED: <reason> in the generated YAML.
Expose all business columns from each selected gold model as Cube dimensions.
Key rules:
primary_key: true. For composite PKs, use a synthetic CONCAT dimension — see references/cube-examples.md, Composite Primary Key section._airbyte_* columns, include only _airbyte_extracted_at as dimension airbyte_extracted_at (reference as ${CUBE}._airbyte_extracted_at). Exclude all other _airbyte_* columns.See references/cube-examples.md for type mapping and dimension examples.
count measure.count_distinct on FK columns: define inside the cube that owns the FK, not the parent cube. Joins produce fan-out that distorts distinct counts.Create Cube.dev YAML files in cubes/. Follow the existing style detected in Phase 1.
Key rules:
name: at the root level. Never wrap with cubes: or views: at the root.name (no gold_ prefix) + .yml.sql_table uses fully qualified BigQuery reference with gold_ prefix.public: false.refresh_key. Use SELECT MAX(<timestamp_col>) with columns in this priority: _airbyte_extracted_at (present on all Airbyte sources), updated_at/modified_at (CDC streams), created_at (insert-only facts). Only use every: <interval> as absolute last resort when no timestamp column exists in the table — add a YAML comment explaining why (e.g. # no timestamp column available).refresh_key.sql references the same table as sql_table.# UNVALIDATED: <reason>.revos apply on a Connection), set meta.abConnectionId: <connection-id>. This groups cubes by their originating connection in the UI. Resolve the connection id with revos connections list --json — match by spec.prefix against the bronze table prefix the gold model reads from. Bridge / junction cubes built on top of connection-sourced models inherit the same abConnectionId. Cubes derived from purely local data (e.g. hand-written silver/gold models with no upstream connection) omit abConnectionId.meta.nameDimension: <short-dimension-name>. The value is the short dimension key (no ${CUBE}. prefix and no cube-name prefix) — e.g. properties_name, displayName, dealname. The frontend uses this to pick the column shown as the entity's name when the cube is added as a table (see useCubeFromMeta in the frontend). Pick the dimension that a user would recognize as "the name of this thing"; if no such single column exists (pure join / bridge cubes, fact tables, event logs), omit nameDimension. The dimension must exist on this cube under dimensions:.meta.icon only when needed — see the decision rule in "Choose an icon for each cube" below.meta is closed: only abConnectionId, nameDimension, and icon are allowed. Omit the whole meta block if none apply.See references/cube-examples.md for canonical standard cube, bridge cube, join direction examples, and refresh key variants.
Bridge/junction cubes (public: false) always omit meta.icon.
For all other cubes, decide whether meta.icon is needed:
Cube has meta.abConnectionId — check whether the source behind this connection has a branded icon.
The abConnectionId value is the RevOS connection ID (same one you resolved in rule 9).
Run revos connections get <abConnectionId> --json | jq '.spec.source.id' to get the source id,
then revos sources get <source-id> --json | jq '.icon'.
icon is a non-empty string → the UI will automatically show the integration logo. Skip meta.icon.icon is null, empty, or the field is absent → set meta.icon using the catalog below.Cube has no meta.abConnectionId → always set meta.icon using the catalog below.
| Key | Meaning | Typical cube names |
|---|---|---|
users | End-users, customers, account holders | *users*, *customers*, *accounts* |
orders | Purchase orders, transactions | *orders*, *purchases*, *transactions* |
events | Behavioral events, activity logs | *events*, *activity*, *logs* |
products | Products, SKUs, items | *products*, *items*, *skus* |
sessions | Web/app sessions | *sessions*, *visits* |
revenue | Revenue, payments, invoices | *revenue*, *payments*, *invoices* |
companies | B2B companies, organizations | *companies*, *organizations*, *accounts* |
contacts | Individual people / CRM contacts | *contacts*, *people*, *persons* |
deals | Sales deals, opportunities | *deals*, *opportunities* |
tickets | Support tickets, issues | *tickets*, *issues*, *cases* |
campaigns | Marketing campaigns | *campaigns*, *ads* |
emails | Email messages | *emails*, *messages* |
leads | Sales leads | *leads* |
tasks | Tasks, to-dos | *tasks*, *todos* |
meta:
icon: "users" # catalog key (preferred)
# — or —
icon: "url:https://cdn.example.com/icon.svg" # explicit external URL
# — or —
icon: "data:image/svg+xml;base64,…" # inline SVG data URI
Use a catalog key whenever possible. Fall back to url: or data: only for
custom brand icons the catalog does not cover. Malformed values cause
revos apply to reject the file with a clear error.
create-dbt-transformations was invoked (bridge model), it already validated dbt models. Otherwise run dbt parse.bq show <dataset>.<table_name>. If missing, document as pending.sql_table, all dimensions present, refresh_key included, joins in both directions.Created cube definitions.
Selected gold models:
- dbt/models/gold/<gold_model_1>.sql
Approved connector models:
- dbt/models/gold/<connector_model>.sql
Bridge/support models created (via create-dbt-transformations):
- dbt/models/gold/<bridge_model>.sql
Cube files:
- cubes/<entity_1>.yml (cube name: <entity_1>)
- cubes/<bridge_entity>.yml (cube name: <bridge_entity>, public: false)
Validated relationships:
- <entity_a>.<key> -> <entity_b>.<key> (<relationship_type>)
Measures:
- count
- <approved_measure_1>
Validation:
- dbt: <passed / pending / not run>
- physical tables: <passed / pending>
- join validation: <passed / pending>
- semantic validation: <passed / pending>
Unvalidated joins (tagged with # UNVALIDATED):
- <cube>.<join_target>: <reason>
Assumptions:
- <assumption>
Pending items:
- <pending_item>
Next step:
revos apply
If validation is incomplete, say exactly what remains pending.
npx claudepluginhub revosai/skills --plugin data-engineeringGuides creation, editing, and verification of skills for AI coding agents using test-driven development with subagent scenarios. Use when authoring or debugging skills.