dataproduct-builder-databricks

Name: dataproduct-builder-databricks
Author: entropy-data


Build data products on Databricks with Declarative Automation Bundles and Lakeflow Spark Declarative Pipelines, integrated with Entropy Data.

npx claudepluginhub entropy-data/dataproduct-builder-databricks --plugin dataproduct-builder-databricks

Popularity

Stars: Med: 0 · Avg: 285

Installs: Med: 0 · Avg: 1

What's Inside

Skills (9)

datacontract-edit

/datacontract-edit

Edit an output-port ODCS file under src/output_ports/v<N>/, run the contract test against the live data, and classify any failures as breaking or non-breaking changes — with suggested fixes. Only edits output-port contracts (the spec this data product commits to); input-port contracts under src/input_ports/ are upstream's responsibility and refreshed by dataproduct-implement. Trigger when the user asks to "add/remove/change a column in the data contract", "update the data contract", or "test contract changes".
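The breaking versus non-breaking classification can be illustrated with a minimal sketch. The rule used here is an assumption, not the skill's documented logic: removing a column or changing its type is treated as breaking, and adding a column as non-breaking. The `classify_changes` helper and the sample schemas are hypothetical.

```python
from typing import Dict, List, Tuple


def classify_changes(old: Dict[str, str], new: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Compare two column -> type mappings and label each difference.

    Assumed rule (not taken from the plugin): removals and type changes
    are breaking, additions are non-breaking.
    """
    changes = []
    for column in sorted(old.keys() | new.keys()):
        if column not in new:
            changes.append((column, "removed", "breaking"))
        elif column not in old:
            changes.append((column, "added", "non-breaking"))
        elif old[column] != new[column]:
            changes.append((column, "type changed", "breaking"))
    return changes


# Hypothetical before/after schemas of an output-port contract.
old_schema = {"order_id": "string", "amount": "decimal", "email": "string"}
new_schema = {"order_id": "string", "amount": "double", "country": "string"}

for column, change, impact in classify_changes(old_schema, new_schema):
    print(f"{column}: {change} ({impact})")
```

A real contract carries more than column types (quality rules, required flags, descriptions), so an actual classifier would need additional cases.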

datacontract-test

/datacontract-test

Run the Data Contract CLI (`datacontract test`) against ODCS contracts in the project to verify the live data still conforms — schema, quality rules, and freshness. Handles two kinds of contracts with different semantics: output-port contracts under `src/output_ports/**/*.odcs.yaml` (tested against this project's Databricks warehouse — "am I still producing what I promised?") and input-port contracts under `src/input_ports/*.odcs.yaml` (tested against the upstream warehouse — "is upstream still producing what I trusted?"). Trigger when the user asks to "test the data contracts", "verify the data product matches its contract", "are we still contract-conformant", "check upstream drift", or "run the contract tests".
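As a rough sketch of the discovery step, the two glob patterns above can be expanded into one `datacontract test` invocation per contract file. The helper name `contract_test_commands` is hypothetical, and the `uv run` prefix assumes the per-project venv described in the README; how the skill actually selects the warehouse for each kind of contract is not shown here.

```python
from pathlib import Path
from typing import List

# Glob patterns quoted from the skill description.
OUTPUT_GLOB = "src/output_ports/**/*.odcs.yaml"
INPUT_GLOB = "src/input_ports/*.odcs.yaml"


def contract_test_commands(project_root: Path) -> List[List[str]]:
    """Return one `datacontract test <file>` command per ODCS contract found.

    Output-port contracts are listed first, then input-port contracts.
    """
    commands = []
    for pattern in (OUTPUT_GLOB, INPUT_GLOB):
        for contract in sorted(project_root.glob(pattern)):
            rel = contract.relative_to(project_root).as_posix()
            commands.append(["uv", "run", "datacontract", "test", rel])
    return commands


if __name__ == "__main__":
    # Print the commands for the current directory without running them.
    for cmd in contract_test_commands(Path(".")):
        print(" ".join(cmd))
```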

dataproduct-deploy

/dataproduct-deploy

Validate, deploy, and run the Declarative Automation Bundle's Lakeflow pipeline against a chosen Databricks target. Wraps `databricks bundle validate`, `databricks bundle deploy`, and `databricks bundle run`, then polls `databricks pipelines get` for completion and surfaces any failed expectations or pipeline-level errors. Trigger when the user asks to "deploy the data product", "run the Lakeflow pipeline", "deploy and run the bundle", or "ship this to dev".
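The validate, deploy, run sequence can be sketched as an ordered list of Databricks CLI calls. This is an illustration under assumptions, not the skill's implementation: the function names are hypothetical, `orders_pipeline` is a made-up resource key, and the polling step via `databricks pipelines get` is left out.

```python
import subprocess
from typing import List


def deploy_commands(target: str, pipeline_key: str) -> List[List[str]]:
    """Build the three bundle commands for one target, in execution order."""
    return [
        ["databricks", "bundle", "validate", "--target", target],
        ["databricks", "bundle", "deploy", "--target", target],
        ["databricks", "bundle", "run", "--target", target, pipeline_key],
    ]


def run_in_order(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False.

    check=True raises CalledProcessError on a non-zero exit code, so a
    failed validate stops the sequence before anything is deployed.
    """
    for cmd in commands:
        print("$ " + " ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


# Hypothetical target and pipeline resource key; dry run only.
run_in_order(deploy_commands("dev", "orders_pipeline"))
```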

dataproduct-exampledata

/dataproduct-exampledata

Extract a small sample of rows from a Databricks output port via a non-production SQL warehouse, scrub anything classified as PII or sensitive in the data contract, and upload the scrubbed sample to Entropy Data via the entropy-data CLI. Trigger when the user asks to "upload example data", "publish sample rows for the data product", or "give consumers a preview of the data".
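The scrubbing step might look like the following minimal sketch, which drops every column whose classification marks it as PII or sensitive. The classification labels, the flat column-to-label mapping, and the sample rows are assumptions made for illustration; the contract's real structure and the upload call to Entropy Data are not modeled.

```python
from typing import Any, Dict, List, Set

# Assumed classification labels; the actual contract may use other values.
SENSITIVE_CLASSIFICATIONS = {"pii", "sensitive"}


def sensitive_columns(classifications: Dict[str, str]) -> Set[str]:
    """Return the column names whose classification label is sensitive."""
    return {
        column
        for column, label in classifications.items()
        if label.lower() in SENSITIVE_CLASSIFICATIONS
    }


def scrub_rows(rows: List[Dict[str, Any]], classifications: Dict[str, str]) -> List[Dict[str, Any]]:
    """Drop sensitive columns from every sample row."""
    drop = sensitive_columns(classifications)
    return [{k: v for k, v in row.items() if k not in drop} for row in rows]


# Hypothetical sample rows and column classifications.
sample = [
    {"order_id": "o-1", "email": "a@example.com", "amount": 12.5},
    {"order_id": "o-2", "email": "b@example.com", "amount": 7.0},
]
labels = {"order_id": "public", "email": "PII", "amount": "internal"}

print(scrub_rows(sample, labels))
```

Dropping whole columns is the conservative choice; masking or hashing values would keep the column shape but needs more care to stay non-reversible.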

dataproduct-implement

/dataproduct-implement

Given an Entropy Data data product URL or id, fetch its data contracts (output port ODCS files written next to the Python under src/output_ports/v<N>/, input port ODCS files cached next to their Spark reader under src/input_ports/), translate the schema into Lakeflow @dp.materialized_view / @dp.table Python pipelines, and ensure the project has the publishing layer (ODPS, GitHub Actions). Trigger when the user asks to "implement the data product <url>", "build the Lakeflow pipeline for this data product", or "scaffold output-port tables from a data contract".
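To make the schema translation concrete, here is a small sketch that renders a Lakeflow pipeline file from a column-to-type mapping. The template, the `render_output_port` helper, and the table and source names are hypothetical; the files the skill generates may look quite different. The columns are assumed to already carry Spark SQL type names, so no ODCS-to-Spark type mapping is attempted.

```python
from typing import Dict

# Template for a generated pipeline file. `spark` is supplied by the
# pipeline runtime, so the rendered file is not meant to run standalone.
TEMPLATE = '''from pyspark import pipelines as dp
from pyspark.sql.functions import col


@dp.materialized_view(name="{table}")
def {table}():
    return spark.read.table("{source}").select(
{columns}
    )
'''


def render_output_port(table: str, source: str, columns: Dict[str, str]) -> str:
    """Render Python source that casts each column to its contract type."""
    lines = [f'        col("{name}").cast("{dtype}"),' for name, dtype in columns.items()]
    return TEMPLATE.format(table=table, source=source, columns="\n".join(lines))


# Hypothetical output-port table built from an upstream dataset.
code = render_output_port(
    "orders",
    "orders_cleaned",
    {"order_id": "string", "amount": "decimal(10,2)"},
)
compile(code, "orders.py", "exec")  # syntax check only; pyspark is not imported
print(code)
```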

Hooks (1)

Event Hooks

File writes: 1 hook across 1 event

Stats

Version: 0.3.4

Language: Shell

Stars: 0

Maintenance: Excellent

License: MIT

Last Commit: May 29, 2026

Added: May 22, 2026


Safety Signals

Caution

Modifies files

Hook triggers on file write and edit operations

README

dataproduct-builder-databricks

Skills for your favorite coding agent that build data products on Databricks with Declarative Automation Bundles and Lakeflow Spark Declarative Pipelines, integrated with Entropy Data.

Sibling plugin to dataproduct-builder-dbt. Pick the one that matches your stack.

It also supports a contract-driven approach: specify your requirements as a data contract, and the builder implements the data product in minutes.

Skills

The plugin ships nine skills:

dataproduct-init scaffolds a new Databricks bundle from scratch: databricks.yml, a serverless Lakeflow pipeline resource, a Lakeflow Job that schedules it, the src/{input_ports,transformations,output_ports/v1}/ layout, pyproject.toml, README. Shells out to databricks bundle init lakeflow-pipelines and overlays the Entropy Data pieces on top.
dataproduct-implement analyzes the input and output data contracts, generates @dp.table Python files under src/output_ports/v1/, resolves access agreements into @dp.view input ports under src/input_ports/, and runs databricks bundle validate to verify the result.
dataproduct-deploy wraps databricks bundle validate → deploy → run with target selection and polls the pipeline run for completion.
dataproduct-share reads entropy-data access list --provider-dataproduct and applies Unity Catalog GRANT SELECT for internal consumers or creates a Delta Share for external ones, per active access agreement.
entropy-data-publish audits an existing bundle against the Entropy Data reference layout, adds missing ODPS/ODCS files, generates the GitHub Actions publish workflow, and registers git connections.
datacontract-edit edits an output-port src/output_ports/v<N>/*.odcs.yaml using natural language and classifies the change as breaking or additive.
datacontract-test runs datacontract test to verify the live data still matches the schema and quality rules.
dataproduct-exampledata extracts sample rows, drops PII columns flagged in the contract, and uploads the scrubbed sample to Entropy Data.
entropy-data-teams lists the teams configured in Entropy Data so the user can pick an owner.
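For the internal-consumer path of dataproduct-share, the grant step can be sketched as building one Unity Catalog `GRANT SELECT` statement per consumer. The helper name, the table name, and the principals are hypothetical, and the output format of `entropy-data access list` is not modeled; the Delta Share path for external consumers is out of scope here.

```python
from typing import Iterable, List


def grant_select_statements(table: str, principals: Iterable[str]) -> List[str]:
    """Build one GRANT SELECT statement per principal.

    Principals are wrapped in backticks because user and group names can
    contain characters such as '@' or spaces; embedded backticks are doubled.
    """
    statements = []
    for principal in principals:
        quoted = "`" + principal.replace("`", "``") + "`"
        statements.append(f"GRANT SELECT ON TABLE {table} TO {quoted};")
    return statements


# Hypothetical three-level table name and consumers.
for sql in grant_select_statements("main.sales.orders", ["analysts", "jane@example.com"]):
    print(sql)
```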

Install

The skills are plain markdown, so any coding agent that can read instruction files can run them.

For the major coding agents, the skills can be installed as a plugin:

Claude Code

In your terminal:

claude plugin marketplace add https://github.com/entropy-data/dataproduct-builder-databricks
claude plugin install dataproduct-builder-databricks@dataproduct-builder-databricks -s project

OpenAI Codex

In your terminal:

codex plugin marketplace add https://github.com/entropy-data/dataproduct-builder-databricks
codex plugin add dataproduct-builder-databricks@dataproduct-builder-databricks

GitHub Copilot CLI

In your terminal:

copilot plugin marketplace add https://github.com/entropy-data/dataproduct-builder-databricks
copilot plugin install dataproduct-builder-databricks@dataproduct-builder-databricks

Other agents (Cursor, Aider, etc.)

Any agent that reads AGENTS.md picks up the routing manifest. Alternatively, copy the skills to the directory that your coding agent expects.

Connect

The skills authenticate against three systems. Configure each once.

Databricks — workspace auth via the Databricks CLI:

brew install databricks/tap/databricks
databricks auth login --host https://<your-workspace>.cloud.databricks.com

Entropy Data — API key registered with the entropy-data CLI (requires uv).

The skills use a per-project venv for both entropy-data and datacontract. After dataproduct-init scaffolds a project, run uv sync from the project root to install both CLIs at the pinned versions, then invoke them as uv run entropy-data … / uv run datacontract ….

The one exception is the first call to dataproduct-init itself, which runs against an empty directory (no pyproject.toml, no venv yet) and needs entropy-data available globally for its lookup step. Install once per machine:

uv tool install --upgrade entropy-data
entropy-data connection add default --api-key <your-api-key> --host <your-entropy-data-host>

Create a user-scoped key in the Entropy Data web UI (Organization Settings → API Keys → Create new API key, scope User (personal token)). For CI workflows, add a connection with a team-scoped or organization-scoped key.
