From databricks-pack
Collects a Databricks diagnostic bundle with environment info, cluster state and events, job details, Spark logs, and Delta history for support tickets and troubleshooting.
Slash command: `/databricks-pack:databricks-debug-bundle`
!`databricks --version 2>/dev/null || echo 'CLI not installed'`
!`python3 -c "from databricks.sdk.version import __version__; print(f'SDK {__version__}')" 2>/dev/null || echo 'SDK not installed'`
Collect the diagnostic information needed for Databricks support tickets: environment info, cluster state, cluster events, job run details, Spark driver logs, and Delta table history. The script produces a tar.gz bundle with CLI credentials redacted; review logs and error traces for PII or embedded connection strings before sharing it with support.
#!/bin/bash
set -euo pipefail
# databricks-debug-bundle.sh [cluster_id] [run_id] [table_name]
BUNDLE_DIR="databricks-debug-$(date +%Y%m%d-%H%M%S)"
mkdir -p "$BUNDLE_DIR"
CLUSTER_ID="${1:-}"
RUN_ID="${2:-}"
TABLE_NAME="${3:-}"
echo "=== Databricks Debug Bundle ===" | tee "$BUNDLE_DIR/summary.txt"
echo "Generated: $(date -u +%Y-%m-%dT%H:%M:%SZ)" >> "$BUNDLE_DIR/summary.txt"
echo "Workspace: ${DATABRICKS_HOST:-unset}" >> "$BUNDLE_DIR/summary.txt"
{
echo ""
echo "--- Environment ---"
echo "CLI: $(databricks --version 2>&1)"
echo "SDK: $(pip show databricks-sdk 2>/dev/null | grep Version || echo 'not installed')"
echo "Python: $(python3 --version 2>&1)"
echo "OS: $(uname -srm)"
echo ""
echo "--- Current User ---"
databricks current-user me --output json 2>&1 | jq '{userName, active}' || echo "Auth failed"
} >> "$BUNDLE_DIR/summary.txt"
if [ -n "$CLUSTER_ID" ]; then
  echo "" >> "$BUNDLE_DIR/summary.txt"
  echo "--- Cluster: $CLUSTER_ID ---" >> "$BUNDLE_DIR/summary.txt"
  # Assumes the current Databricks CLI (v0.200+), which takes the cluster ID as a
  # positional argument; "|| true" keeps one failed call from aborting under set -e.
  # Full cluster config
  databricks clusters get "$CLUSTER_ID" --output json \
    > "$BUNDLE_DIR/cluster_config.json" 2>&1 || true
  # Key fields summary
  jq '{state, spark_version, node_type_id, num_workers,
       autotermination_minutes, termination_reason}' \
    "$BUNDLE_DIR/cluster_config.json" >> "$BUNDLE_DIR/summary.txt" 2>/dev/null || true
  # Recent cluster events (state changes, errors, resizing)
  databricks clusters events "$CLUSTER_ID" --limit 30 --output json \
    > "$BUNDLE_DIR/cluster_events.json" 2>&1 || true
  # Extract event timeline; accepts either a bare array or an {"events": [...]} wrapper
  jq -r '(if type == "array" then . else (.events // []) end)[]
         | "\(.timestamp): \(.type) - \(.details // "no details")"' \
    "$BUNDLE_DIR/cluster_events.json" >> "$BUNDLE_DIR/summary.txt" 2>/dev/null || true
fi
if [ -n "$RUN_ID" ]; then
  echo "" >> "$BUNDLE_DIR/summary.txt"
  echo "--- Run: $RUN_ID ---" >> "$BUNDLE_DIR/summary.txt"
  # Assumes the current Databricks CLI (v0.200+), where run commands live under
  # "jobs"; "|| true" keeps one failed call from aborting the bundle under set -e.
  # Full run details
  databricks jobs get-run "$RUN_ID" --output json \
    > "$BUNDLE_DIR/run_details.json" 2>&1 || true
  # Run state summary
  jq '{state: .state, start_time, end_time, run_duration}' \
    "$BUNDLE_DIR/run_details.json" >> "$BUNDLE_DIR/summary.txt" 2>/dev/null || true
  # Task-level breakdown
  jq -r '.tasks[]? | "  Task \(.task_key): \(.state.result_state // "RUNNING") - \(.state.state_message // "ok")"' \
    "$BUNDLE_DIR/run_details.json" >> "$BUNDLE_DIR/summary.txt" 2>/dev/null || true
  # Run output (error messages, stdout); multi-task runs expose output per task
  # run, so this call may fail when RUN_ID is the parent run
  databricks jobs get-run-output "$RUN_ID" --output json \
    > "$BUNDLE_DIR/run_output.json" 2>&1 || true
  jq '{error, error_trace: ((.error_trace // "") | .[0:2000])}' \
    "$BUNDLE_DIR/run_output.json" >> "$BUNDLE_DIR/summary.txt" 2>/dev/null || true
fi
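if [ -n "$CLUSTER_ID" ]; then
  echo "" >> "$BUNDLE_DIR/summary.txt"
  echo "--- Spark Driver Logs (last 500 lines) ---" >> "$BUNDLE_DIR/summary.txt"
  # The heredoc is quoted, so the shell does not expand variables inside it;
  # the cluster ID is passed through the environment instead.
  CLUSTER_ID="$CLUSTER_ID" python3 << 'PYEOF' > "$BUNDLE_DIR/driver_logs.txt" 2>&1 || true
import base64
import os

try:
    from databricks.sdk import WorkspaceClient
    w = WorkspaceClient()
    path = f"/cluster-logs/{os.environ['CLUSTER_ID']}/driver/log4j-active.log"
    # dbfs.read returns base64-encoded data and only a bounded chunk per call,
    # so for large logs this is the tail of the first chunk
    resp = w.dbfs.read(path)
    lines = base64.b64decode(resp.data or "").decode(errors="replace").splitlines()[-500:]
    print("\n".join(lines))
except Exception as e:
    print(f"Could not fetch driver logs: {e}")
    print("Tip: Enable cluster log delivery in cluster config for persistent logs")
PYEOF
fi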
if [ -n "$TABLE_NAME" ]; then
echo "" >> "$BUNDLE_DIR/summary.txt"
echo "--- Delta Table: $TABLE_NAME ---" >> "$BUNDLE_DIR/summary.txt"
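# "|| true" keeps a missing databricks-connect install from aborting the bundle under set -e
python3 << PYEOF > "$BUNDLE_DIR/delta_diagnostics.txt" 2>&1 || true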
from databricks.connect import DatabricksSession
spark = DatabricksSession.builder.getOrCreate()
print("=== Table Details ===")
spark.sql("DESCRIBE DETAIL ${TABLE_NAME}").show(truncate=False)
print("\n=== Recent History (last 20 operations) ===")
spark.sql("DESCRIBE HISTORY ${TABLE_NAME} LIMIT 20").show(truncate=False)
print("\n=== Schema ===")
spark.sql("DESCRIBE ${TABLE_NAME}").show(truncate=False)
print("\n=== File Stats ===")
detail = spark.sql("DESCRIBE DETAIL ${TABLE_NAME}").first()
print(f"Files: {detail.numFiles}, Size: {detail.sizeInBytes / 1024 / 1024:.1f} MB")
PYEOF
fi
# Redact sensitive data from config snapshot
echo "" >> "$BUNDLE_DIR/summary.txt"
echo "--- Config (redacted) ---" >> "$BUNDLE_DIR/summary.txt"
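if [ -f ~/.databrickscfg ]; then
  # Single portable sed pass ("sed -i" behaves differently on GNU and BSD/macOS);
  # tolerates any spacing around "=" and masks every key ending in
  # token, secret, or password.
  sed -E 's/^([[:space:]]*[A-Za-z_]*(token|secret|password)[[:space:]]*=[[:space:]]*).*/\1***REDACTED***/' \
    ~/.databrickscfg > "$BUNDLE_DIR/config-redacted.txt"
fi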
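# Network connectivity test
echo "--- Network ---" >> "$BUNDLE_DIR/summary.txt"
echo -n "API reachable: " >> "$BUNDLE_DIR/summary.txt"
if [ -n "${DATABRICKS_HOST:-}" ]; then
  # Default expansions avoid "unbound variable" aborts under set -u;
  # "|| true" keeps a failed connection from aborting under set -e.
  curl -s -o /dev/null -w "%{http_code}" --max-time 15 \
    "${DATABRICKS_HOST}/api/2.0/clusters/list" \
    -H "Authorization: Bearer ${DATABRICKS_TOKEN:-}" >> "$BUNDLE_DIR/summary.txt" || true
else
  echo -n "skipped (DATABRICKS_HOST unset)" >> "$BUNDLE_DIR/summary.txt"
fi
echo "" >> "$BUNDLE_DIR/summary.txt"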
# Create archive
tar -czf "$BUNDLE_DIR.tar.gz" "$BUNDLE_DIR"
rm -rf "$BUNDLE_DIR"
echo ""
echo "Bundle created: $BUNDLE_DIR.tar.gz"
echo "Contents: summary.txt, cluster_config.json, cluster_events.json,"
echo " run_details.json, run_output.json, driver_logs.txt,"
echo " delta_diagnostics.txt, config-redacted.txt"
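The redaction step can be tried in isolation before trusting it with a real config. The following is a minimal sketch, not the script's exact code: `redact_cfg` is a hypothetical helper name, and the rule that any key ending in `token`, `secret`, or `password` holds a secret is an assumption that may need extending for other credential fields.

```shell
#!/bin/bash
# Sketch: mask secret-bearing keys in a Databricks CLI config read from stdin.
# Assumption: keys ending in token, secret, or password hold credentials.
redact_cfg() {
  sed -E 's/^([[:space:]]*[A-Za-z_]*(token|secret|password)[[:space:]]*=[[:space:]]*).*/\1***REDACTED***/'
}

# Example: the host line passes through unchanged, the token value is masked.
printf '[DEFAULT]\nhost = https://example.cloud.databricks.com\ntoken = dapi0000\n' | redact_cfg
```

Piping a copy of `~/.databrickscfg` through the helper and checking the result by eye is a cheap way to confirm that no credential field is missed.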
databricks-debug-YYYYMMDD-HHMMSS.tar.gz containing:
- `summary.txt`: Human-readable diagnostic summary
- `cluster_config.json`: Full cluster configuration
- `cluster_events.json`: State changes, errors, resizing events
- `run_details.json`: Job run with task-level breakdown
- `run_output.json`: Stdout/stderr and error traces
- `driver_logs.txt`: Last 500 lines of Spark driver log
- `delta_diagnostics.txt`: Table details, history, schema
- `config-redacted.txt`: CLI config with secrets removed

| Item | Included | Notes |
|---|---|---|
| Tokens/secrets | Never | Redacted as `***REDACTED***` |
| PII in logs | Review before sharing | Scan driver_logs.txt manually |
| Cluster IDs | Yes | Safe to share with support |
| Error traces | Yes | Check for embedded connection strings |
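The manual review called for in the table can be started with a rough pattern scan over an extracted bundle directory. This is a sketch under stated assumptions: `scan_bundle` is a hypothetical helper, and the patterns (strings starting with `dapi`, the prefix used by Databricks personal access tokens, JDBC URLs, and `password`/`secret`/`accountkey` assignments) are illustrative rather than exhaustive. A clean result does not replace reading the logs.

```shell
#!/bin/bash
# Sketch: flag lines in an extracted bundle directory that look like credentials.
# The patterns are illustrative assumptions, not a complete secret scanner.
scan_bundle() {
  local dir="$1"
  # -r recurse, -n line numbers, -I skip binary files, -E extended regex, -i ignore case.
  # The second grep drops lines that the bundle script already redacted.
  grep -rnIEi \
    -e 'dapi[0-9a-f]{16,}' \
    -e 'jdbc:[a-z0-9]+:' \
    -e '(password|secret|accountkey)[[:space:]]*[=:]' \
    "$dir" | grep -v 'REDACTED'
}
```

The function prints `path:line:content` for each hit and returns non-zero when nothing suspicious is found, so `scan_bundle ./databricks-debug-YYYYMMDD-HHMMSS || echo "no obvious secrets"` works as a quick gate.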
# Environment only
bash databricks-debug-bundle.sh
# With cluster diagnostics
bash databricks-debug-bundle.sh 0123-456789-abcde
# With cluster + job run
bash databricks-debug-bundle.sh 0123-456789-abcde 12345
# Full diagnostics including Delta table
bash databricks-debug-bundle.sh 0123-456789-abcde 12345 catalog.schema.table
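Before running any of the invocations above, it can help to confirm that the tools the script calls are on `PATH`. The sketch below is one way to do that; `missing_tools` is a hypothetical helper, and the tool list is inferred from the commands the script invokes.

```shell
#!/bin/bash
# Sketch: report which tools are missing from PATH.
# Prints one missing tool per line; returns non-zero if any are missing.
missing_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      missing=1
    fi
  done
  return "$missing"
}

# Tool list inferred from the commands the bundle script runs.
if missing_tools databricks jq python3 curl tar; then
  echo "All required tools found"
fi
```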
1. Run `bash databricks-debug-bundle.sh [args]`
2. Review `summary.txt` for sensitive data
3. Attach the `.tar.gz` bundle to the support ticket
4. Include the workspace ID (`adb-<workspace-id>`)

For rate limit issues, see databricks-rate-limits.
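Because the script deletes the working directory after archiving, the summary has to be read back out of the archive for review. A minimal sketch, assuming the archive contains a single top-level directory with `summary.txt` inside it (which is how the script builds it); `inspect_bundle` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: list a bundle's files and print its summary without unpacking to disk.
# Assumes one top-level directory in the archive that contains summary.txt.
inspect_bundle() {
  local archive="$1"
  local top
  # The first path component of the first entry is the bundle directory name.
  top="$(tar -tzf "$archive" | head -n 1 | cut -d/ -f1)"
  tar -tzf "$archive"
  # -O writes the extracted member to stdout instead of the filesystem.
  tar -xzOf "$archive" "$top/summary.txt"
}
```

Calling `inspect_bundle databricks-debug-YYYYMMDD-HHMMSS.tar.gz | less` shows the file listing followed by the summary text.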
npx claudepluginhub jeremylongshore/claude-code-plugins-plus-skills --plugin databricks-pack