By wedneyyuri
Claude Code skills for executing Python on Databricks clusters via a stateful REPL session, with session consolidation into clean scripts.
Consolidate a Databricks REPL session into a single, clean Python file. Use this skill when the user wants to finalize, export, or consolidate a REPL session into a committable script. Triggers on requests to consolidate session output, produce a final script from REPL commands, export session to Python, clean up REPL artifacts into production code, or finalize a Databricks workflow.
Execute Python code on a Databricks cluster via a stateful REPL session. Use this skill when the user wants to run Python on Databricks, perform data analysis with Spark, train models on a cluster, query Unity Catalog tables, use sub_llm() for recursive LM calls, or any task requiring a persistent Databricks execution context. Always use a dedicated --project-dir (e.g., examples/my-task/) to isolate session.json and repl_outputs per task. Covers session lifecycle (create, exec, await, cancel, destroy), output file management, and eviction recovery.
Genie gives you AI inside Databricks. This gives you Databricks inside AI.
Run code on Databricks clusters while your agent orchestrates everything else — other skills, subagents, MCPs, local files, and parallel hypothesis validation. One session, no boundaries.
Databricks is powerful. But Databricks inside an AI agent that can parallelize work, compose tools, and cross every boundary? That's something else.
Works with Claude Code, Cursor, GitHub Copilot, and 40+ other agents.
Genie works inside one notebook, one workspace. When the real work crosses boundaries, it stops. Your AI agent doesn't.
| What you need | Genie | Your agent + this skill |
|---|---|---|
| Analyze a repo and cross-reference with Databricks logs | Workspace only | Reads repo + queries cluster in one session |
| Validate 3 hypotheses in parallel on different datasets | One notebook at a time | Spawns subagents, each running its own cluster query |
| Train on cluster, compare with local baselines, commit results | Can't access local files or git | Cluster compute + local files + git — same session |
| Use an MCP to enrich data before running Spark | No MCP support | Calls MCPs, APIs, other skills, then sends to cluster |
| Explore Python + Scala + SQL across multiple repos | Single-language notebooks | Subagents explore each language, agent synthesizes |
| Resume after cluster eviction | Start over | Append-only session log with replay |
The difference isn't features. It's architecture. Genie is an assistant scoped to Databricks. This makes Databricks one resource inside an orchestrator that can do anything — use GSD, superpowers, compose skills, spawn subagents, interact with MCPs, and parallelize work across tools.
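The "append-only session log with replay" row in the table above can be pictured with a small sketch. This is not the skill's actual implementation: the JSON Lines format, the `append_command` and `replay` names, and the use of a local `exec` as a stand-in for the remote Databricks execution context are all assumptions made for illustration.

```python
import json
from pathlib import Path


def append_command(log_path: Path, code: str) -> None:
    """Append one executed command to the log; earlier lines are never rewritten."""
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps({"code": code}) + "\n")


def replay(log_path: Path, namespace: dict) -> int:
    """Re-run every logged command, in order, against a fresh namespace.

    Hypothetical: a local exec() stands in for sending the code to a new
    execution context after the old one was evicted.
    """
    count = 0
    with log_path.open(encoding="utf-8") as f:
        for line in f:
            exec(json.loads(line)["code"], namespace)
            count += 1
    return count
```

Because the log only grows, replaying it from the top rebuilds the same variables in a new context, which is the idea behind resuming after eviction instead of starting over.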
/plugin marketplace add wedneyyuri/databricks-repl
/plugin install databricks-repl@wedneyyuri-databricks-repl
npx skills add wedneyyuri/databricks-repl
The CLI detects which agents you have and installs to each one automatically.
You: "Load the customers table, train a classifier,
compare with last quarter's local baseline,
and open a PR with the results"
Claude:
→ creates a REPL session on your Databricks cluster
→ runs the training code, captures outputs as files
→ reads your local baseline for comparison
→ consolidates everything into a clean .py file
→ commits and opens the PR
Five tools, one session. No switching between terminal, notebooks, and browser.
Context stays clean. Sessions stay productive for 50+ interactions.
| Example | What It Shows |
|---|---|
| primes | Basic Python execution on a Databricks cluster |
| monte-carlo-pi | Distributed Spark — estimate π scaling from 100M to 10B samples |
| iris-classification | Full ML pipeline — load, train, evaluate, persist model to Volumes |
| Skill | What It Does |
|---|---|
| databricks-repl | Execute Python on Databricks via a stateful REPL session |
| databricks-repl-consolidate | Turn a REPL session into a single committable .py file |
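To make the consolidation step more concrete, here is a minimal sketch of turning a list of REPL commands into one script. It is an illustration under stated assumptions, not the databricks-repl-consolidate skill itself: the record shape (`code` and `status` keys), the `"ok"` status value, and the `consolidate` function are hypothetical.

```python
def consolidate(commands: list[dict]) -> str:
    """Merge successful REPL commands into a single script.

    Hypothetical record shape: {"code": str, "status": "ok" | "error"}.
    Failed commands are dropped, and single-line top-level imports are
    hoisted to the top and de-duplicated. Multi-line imports are not handled.
    """
    imports: list[str] = []
    body: list[str] = []
    for cmd in commands:
        if cmd["status"] != "ok":
            continue  # skip failed attempts left over from exploration
        for line in cmd["code"].splitlines():
            if line.startswith(("import ", "from ")):
                if line not in imports:
                    imports.append(line)
            else:
                body.append(line)
    return "\n".join(imports + [""] + body) + "\n"
```

For example, two successful commands that both start with `import math`, plus one failed command in between, would produce a script with a single `import math` line followed by the code of the two successful commands.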
Requires a Databricks configuration file at ~/.databrickscfg and the Databricks SDK (pip install databricks-sdk).
These skills follow the Agent Skills Specification. If you prefer not to use the marketplace or npx skills, copy the skills manually:
git clone https://github.com/wedneyyuri/databricks-repl.git /tmp/databricks-repl
# Cursor
cp -r /tmp/databricks-repl/skills/databricks-repl .cursor/skills/
cp -r /tmp/databricks-repl/skills/databricks-repl-consolidate .cursor/skills/
# GitHub Copilot
mkdir -p .github/skills
cp -r /tmp/databricks-repl/skills/databricks-repl .github/skills/
cp -r /tmp/databricks-repl/skills/databricks-repl-consolidate .github/skills/
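Since the skills rely on `~/.databrickscfg`, a quick way to check which profiles that file defines is sketched below. It uses only the standard library and assumes the usual INI layout with a `host` key per profile; the `list_profiles` helper is hypothetical and is not part of the plugin or the Databricks SDK.

```python
import configparser
from pathlib import Path


def list_profiles(cfg_path: Path) -> dict[str, str]:
    """Return {profile_name: host} for an INI-style Databricks config file.

    A missing file yields an empty dict, because ConfigParser.read()
    silently skips paths it cannot open.
    """
    parser = configparser.ConfigParser()
    parser.read(cfg_path)
    profiles: dict[str, str] = {}
    # [DEFAULT] is not reported by sections(), so handle it separately.
    default_host = parser.defaults().get("host")
    if default_host:
        profiles["DEFAULT"] = default_host
    for name in parser.sections():
        # Note: configparser lets sections inherit keys from [DEFAULT].
        profiles[name] = parser.get(name, "host", fallback="")
    return profiles
```

Calling `list_profiles(Path.home() / ".databrickscfg")` before starting a session shows at a glance whether the expected workspace is configured.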