From delta-skills
Inspect a Delta Lake table — version, schema, file count, row stats, partitioning, and last commit info. Use when the user wants to understand the current state of a Delta table. Triggers on phrases like "inspect table", "what's in this delta table", "describe delta table", "check my delta table".
Slash command: /delta-skills:inspect-table
Inspect a Delta Lake table by reading its `_delta_log` transaction log directly. No Spark, no `deltalake` library — just filesystem access.
The user provides a path to a Delta table directory:
/delta-skills:inspect-table path/to/table
Check that _delta_log exists inside the given path:
TABLE_PATH="$1"
# A Delta table is identified by its _delta_log directory.
if [ ! -d "$TABLE_PATH/_delta_log" ]; then
  echo "ERROR: No _delta_log directory found at $TABLE_PATH" >&2
  echo "This does not appear to be a Delta Lake table." >&2
  exit 1
fi
echo "✓ Valid Delta table: $TABLE_PATH"
List all JSON commit files in _delta_log and find the highest-numbered one:
ls "$TABLE_PATH/_delta_log/"*.json 2>/dev/null | sort -V | tail -1
The filename without .json and leading zeros is the current version number.
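As a minimal sketch of that step, the function below turns the newest commit filename into a plain version number. The name `latest_version` is hypothetical (it is not part of the skill), and the sketch assumes commit files follow the usual zero-padded `<version>.json` naming.

```shell
# Hypothetical helper: print the current table version as a plain integer.
latest_version() {
  table_path="$1"
  latest=""
  # Zero padding makes glob (lexicographic) order match numeric order,
  # so the last purely numeric name seen is the newest commit.
  for f in "$table_path"/_delta_log/*.json; do
    [ -e "$f" ] || continue
    name=$(basename "$f" .json)
    # Skip names that are not all digits (assumption: those are not commits).
    case "$name" in
      ''|*[!0-9]*) continue ;;
    esac
    latest="$name"
  done
  [ -n "$latest" ] || return 1
  # Strip leading zeros; version 0 collapses to an empty string, so restore it.
  version=$(printf '%s' "$latest" | sed 's/^0*//')
  printf '%s\n' "${version:-0}"
}
```

Calling `latest_version path/to/table` prints the version, or returns a non-zero status when no commit files are found.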
Read the latest _delta_log/XXXXXXXXXX.json file. Each line is a JSON action. Look for:
- commitInfo: operation name, timestamp, user, operation metrics
- add: files added, stats (numRecords, minValues, maxValues), partition values
- remove: files removed
- metaData: schema definition, partition columns, table configuration

The metaData action contains schemaString, a JSON-encoded schema. Parse it to extract column names and types. metaData may not be in the latest commit; scan backwards through log files until you find it, or check the latest checkpoint.
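The backwards scan for metaData could look roughly like the sketch below. The function name `find_metadata_commit` is hypothetical, and the match assumes the compact one-action-per-line JSON that common writers produce, with the action name as the first key on the line; a pretty-printed log would need a real JSON parser.

```shell
# Hypothetical helper: print the path of the newest commit file that
# contains a metaData action, scanning from the latest commit backwards.
find_metadata_commit() {
  table_path="$1"
  # Zero-padded names sort correctly as plain text; -r puts the newest first.
  ls "$table_path"/_delta_log/*.json 2>/dev/null | sort -r |
    while IFS= read -r f; do
      # Assumption: compact JSON, so a metaData line starts with this prefix.
      if grep -q '^{"metaData":' "$f"; then
        printf '%s\n' "$f"
        break
      fi
    done
}
```

If nothing is printed, no JSON commit in the directory carries metaData, and the latest checkpoint is the next place to look.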
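Replay add and remove actions across the log (or from the latest checkpoint) to determine:
- which files are currently active (added and not later removed), and how many there are
- their total size
- the estimated row count (sum of numRecords from add stats where available)

For a quick inspection, reading the last 10–20 commit files is sufficient for recent context, though file and row totals from a partial replay are only approximate.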
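One way to sketch the replay with only standard tools is shown below. The function name `active_file_summary` is hypothetical. The sketch assumes compact one-action-per-line JSON, file paths without commas, and a log that still contains every commit since version 0; if older commits have been cleaned up, the state has to start from a checkpoint instead, which this sketch does not read.

```shell
# Hypothetical helper: replay add/remove actions and print
# "<active file count> <total bytes> <estimated rows>".
active_file_summary() {
  table_path="$1"
  # The glob lists commits oldest to newest, so a later remove
  # cancels an earlier add of the same path.
  cat "$table_path"/_delta_log/*.json 2>/dev/null | awk '
    # Return the value that follows "key": up to the next comma or brace.
    function field(line, key,    i, rest) {
      i = index(line, "\"" key "\":")
      if (i == 0) return ""
      rest = substr(line, i + length(key) + 3)
      sub(/[,}].*/, "", rest)
      gsub(/"/, "", rest)
      return rest
    }
    index($0, "{\"add\":") == 1 {
      p = field($0, "path")
      size[p] = field($0, "size")
      # stats is a JSON string nested in the action, so its quotes are escaped.
      i = index($0, "numRecords\\\":")
      rows[p] = (i > 0) ? substr($0, i + 13) + 0 : 0
    }
    index($0, "{\"remove\":") == 1 {
      p = field($0, "path")
      delete size[p]
      delete rows[p]
    }
    END {
      for (p in size) { files++; bytes += size[p]; total += rows[p] }
      printf "%d %.0f %.0f\n", files, bytes, total
    }'
}
```

Files whose add action carries no stats contribute zero rows, so the third number is a lower bound rather than an exact count.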
Look for _delta_log/_last_checkpoint — if it exists, read it to find the latest checkpoint version. Checkpoints (.parquet files) consolidate the log and can be used for faster state reconstruction.
if [ -f "$TABLE_PATH/_delta_log/_last_checkpoint" ]; then
  cat "$TABLE_PATH/_delta_log/_last_checkpoint"
fi
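To use the checkpoint version in the summary, it has to be pulled out of that JSON. The sketch below does so with sed; the function name `last_checkpoint_version` is hypothetical, and it assumes `_last_checkpoint` is a small JSON object with a numeric `version` key, for example `{"version":40,"size":131}`.

```shell
# Hypothetical helper: print the version recorded in _last_checkpoint.
last_checkpoint_version() {
  cp_file="$1/_delta_log/_last_checkpoint"
  [ -f "$cp_file" ] || return 1
  # Capture the digits that follow the "version" key.
  sed -n 's/.*"version"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$cp_file" |
    head -n 1
}
```

A non-zero return status means the table has no `_last_checkpoint` file, in which case the summary can simply omit the checkpoint line.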
Present a clean summary:
✓ Delta table: tables/customers
Version: 42
Last commit: 2025-03-18 09:12 UTC
Operation: MERGE
Files: 127 parquet files (~2.4 GB)
Estimated rows: ~4,200,000
Partitioned by: country
Checkpoint: version 40
Schema:
  id long (not null)
  name string
  email string
  country string
  updated_at timestamp
Last operation metrics:
  numTargetRowsInserted: 1,204
  numTargetRowsUpdated: 11,196
  numOutputRows: 12,400
- _delta_log is missing → tell the user it's not a Delta table
- metaData is not found in the commits scanned → use show-history to find when it was last set
- commitInfo is typically the first action in a commit file, but it is optional in the protocol, so match on the action name rather than on position
- metaData only appears when the schema or configuration changes; it is not in every commit
- Partition values are stored in add.partitionValues as a map of column name to string value

npx claudepluginhub chanukyapekala/delta-skills --plugin delta-skills