Inspect the RevOS BigQuery lakehouse: list datasets and tables, introspect table schemas and column types, preview sample rows, assess data layers (bronze/silver/gold), and check data completeness and null rates. Required companion skill for create-dbt-transformations and create-cubes — load before generating dbt models or cube definitions to introspect warehouse columns and types. Use when asked to: explore the lakehouse, list BigQuery tables, inspect a table schema, preview data, check raw source tables, assess data quality, check null rates, understand available data, or perform BigQuery schema introspection.
How this skill is triggered — by the user, by Claude, or both
Slash command
/revos-data-engineering:explore-lakehouseThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Use `bq` CLI to explore the BigQuery lakehouse for this project.
Use bq CLI to explore the BigQuery lakehouse for this project.
Resolve connection details from env vars before running any command:
echo "Project: $GOOGLE_CLOUD_PROJECT"
echo "Dataset: $REVOS_BQ_DATASET"
$GOOGLE_CLOUD_PROJECT — BQ project ID$REVOS_BQ_DATASET — default dataset (may be overridden by user)INFORMATION_SCHEMA queries: omit --location flag — use plain bq query --nouse_legacy_sqlList tables in the org's dataset:
bq ls $REVOS_BQ_DATASET
List all datasets in the project (only if the user explicitly asks):
bq ls --project_id=$GOOGLE_CLOUD_PROJECT
Inspect a table schema (filter out internal columns):
bq show --schema --format=prettyjson $REVOS_BQ_DATASET.<table> | python3 -c "
import json, sys
cols = json.load(sys.stdin)
names = [c['name'] for c in cols if not c['name'].startswith('_airbyte')]
print('\n'.join(names))
"
Preview sample rows:
bq head -n 5 $REVOS_BQ_DATASET.<table>
Get row counts for a list of tables:
for table in table1 table2 table3; do
echo -n "$table: "
bq query --nouse_legacy_sql --format=csv \
"SELECT COUNT(*) FROM \`$GOOGLE_CLOUD_PROJECT.$REVOS_BQ_DATASET.$table\`" 2>/dev/null | tail -1
done
Check null rates on a set of columns:
bq query --nouse_legacy_sql "
SELECT
COUNTIF(col1 IS NULL) AS col1_null,
COUNTIF(col2 IS NULL) AS col2_null,
COUNT(*) AS total
FROM \`$GOOGLE_CLOUD_PROJECT.$REVOS_BQ_DATASET.<table>\`
"
bq ls $REVOS_BQ_DATASETrevos sources create to open the RevOS UIrevos sources listsalesforce_*, stripe_*, hubspot_*)find dbt/models -type f | sort.gitkeep → that layer hasn't been built yetproperties_*, fields_*), no aggregations, no joins visible in schema_id foreign keysarr, churn_rate, ltv)bq ls <dataset>id, *_id, email, namecreated_at, *_date, *_atCOUNTIF(col IS NULL) query per table covering those columnsarchived / is_deleted flag, filter it out: WHERE archived = false OR archived IS NULL| Field | Nulls | % Missing | Status |
|---|---|---|---|
| ... | ... | ...% | ✅ / ⚠️ / ❌ |
Status thresholds: ✅ < 5% · ⚠️ 5–50% · ❌ > 50%
bq show --schema — get full column list (omit _airbyte_* columns)SELECT COUNT(*) — row countbq head -n 5 — sample rows_airbyte_* columns by name or explain their origin; omit them from schema summaries_airbyte_extracted_at"npx claudepluginhub revosai/skills --plugin data-engineeringGuides creation, editing, and verification of skills for AI coding agents using test-driven development with subagent scenarios. Use when authoring or debugging skills.