By entropy-data
Build data products on Databricks with Declarative Automation Bundles and Lakeflow Spark Declarative Pipelines, integrated with Entropy Data.
Edit an output-port ODCS file under src/output_ports/v<N>/, run the contract test against the live data, and classify any failures as breaking or non-breaking changes — with suggested fixes. Only edits output-port contracts (the spec this data product commits to); input-port contracts under src/input_ports/ are upstream's responsibility and refreshed by dataproduct-implement. Trigger when the user asks to "add/remove/change a column in the data contract", "update the data contract", or "test contract changes".
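As a rough illustration of the breaking versus non-breaking split, the sketch below compares two column lists. The `Column` type, the `classify_changes` helper, and the rule set (removed or retyped columns are breaking, added columns are not) are assumptions made for this example, not the skill's actual logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical, simplified view of one column in an output-port contract."""
    name: str
    type: str


def classify_changes(old: list[Column], new: list[Column]) -> dict[str, list[str]]:
    """Sort column-level differences into breaking and non-breaking buckets."""
    old_by_name = {c.name: c for c in old}
    new_by_name = {c.name: c for c in new}
    result: dict[str, list[str]] = {"breaking": [], "non_breaking": []}

    for name, col in old_by_name.items():
        if name not in new_by_name:
            # Consumers reading this column would fail.
            result["breaking"].append(f"removed column {name}")
        elif new_by_name[name].type != col.type:
            # Assumed rule: any type change is treated as breaking.
            result["breaking"].append(
                f"changed type of {name}: {col.type} -> {new_by_name[name].type}"
            )

    for name in new_by_name:
        if name not in old_by_name:
            # Existing consumers can ignore a column they never selected.
            result["non_breaking"].append(f"added column {name}")

    return result
```

A real classifier would also need to look at nullability, quality rules, and freshness guarantees, which this sketch leaves out.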
Run the Data Contract CLI (`datacontract test`) against ODCS contracts in the project to verify the live data still conforms — schema, quality rules, and freshness. Handles two kinds of contracts with different semantics: output-port contracts under `src/output_ports/**/*.odcs.yaml` (tested against this project's Databricks warehouse — "am I still producing what I promised?") and input-port contracts under `src/input_ports/*.odcs.yaml` (tested against the upstream warehouse — "is upstream still producing what I trusted?"). Trigger when the user asks to "test the data contracts", "verify the data product matches its contract", "are we still contract-conformant", "check upstream drift", or "run the contract tests".
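The two glob patterns above can be turned into a list of CLI invocations along these lines. This is a minimal sketch: the `contract_test_commands` helper is hypothetical, it only builds the argument lists rather than running them, and it assumes the `uv run datacontract …` invocation style from the project venv.

```python
from pathlib import Path


def contract_test_commands(project_root: Path) -> list[list[str]]:
    """Build one `datacontract test` command per ODCS contract in the project."""
    root = Path(project_root)
    # Output ports: any depth under src/output_ports/ (e.g. v1/, v2/).
    output_ports = sorted(root.glob("src/output_ports/**/*.odcs.yaml"))
    # Input ports: directly under src/input_ports/.
    input_ports = sorted(root.glob("src/input_ports/*.odcs.yaml"))
    return [
        ["uv", "run", "datacontract", "test", path.relative_to(root).as_posix()]
        for path in output_ports + input_ports
    ]
```

Each returned list could be passed to `subprocess.run(cmd, cwd=project_root)`; output-port and input-port failures are worth reporting separately because they mean different things.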
Validate, deploy, and run the Declarative Automation Bundle's Lakeflow pipeline against a chosen Databricks target. Wraps `databricks bundle validate`, `databricks bundle deploy`, and `databricks bundle run`, then polls `databricks pipelines get` for completion and surfaces any failed expectations or pipeline-level errors. Trigger when the user asks to "deploy the data product", "run the Lakeflow pipeline", "deploy and run the bundle", or "ship this to dev".
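The validate, deploy, run, then poll sequence might look roughly like the sketch below. The helper names, the `pipeline_key` argument, and the set of terminal state names are assumptions for illustration; the state lookup is injected as a callable so the loop does not depend on the exact shape of the `databricks pipelines get` output.

```python
import time
from typing import Callable

# Assumed terminal states; check the values your workspace actually reports.
TERMINAL_STATES = {"COMPLETED", "FAILED", "CANCELED"}


def bundle_commands(target: str, pipeline_key: str) -> list[list[str]]:
    """Return the three bundle commands in the order they should run."""
    return [
        ["databricks", "bundle", "validate", "--target", target],
        ["databricks", "bundle", "deploy", "--target", target],
        ["databricks", "bundle", "run", "--target", target, pipeline_key],
    ]


def wait_for_pipeline(
    get_state: Callable[[], str],
    poll_seconds: float = 10.0,
    max_polls: int = 360,
) -> str:
    """Poll `get_state` until it returns a terminal state or the budget runs out."""
    for _ in range(max_polls):
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        time.sleep(poll_seconds)
    raise TimeoutError(f"pipeline did not finish after {max_polls} polls")
```

In practice `get_state` would wrap a call to `databricks pipelines get` and extract the state from its output; stopping at the first failing command keeps a broken bundle from being deployed.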
Extract a small sample of rows from a Databricks output port via a non-production SQL warehouse, scrub anything classified as PII or sensitive in the data contract, and upload the scrubbed sample to Entropy Data via the entropy-data CLI. Trigger when the user asks to "upload example data", "publish sample rows for the data product", or "give consumers a preview of the data".
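The scrubbing step can be pictured as masking every column whose contract classification marks it as sensitive. The sketch below assumes the classifications have already been read from the contract into a plain dict; the `scrub_rows` helper, the label names, and the mask value are all hypothetical.

```python
from typing import Any, Optional


def scrub_rows(
    rows: list[dict[str, Any]],
    classifications: dict[str, Optional[str]],
    sensitive_labels: frozenset = frozenset({"pii", "sensitive"}),
    mask: str = "***",
) -> list[dict[str, Any]]:
    """Return copies of `rows` with sensitive columns replaced by `mask`."""
    masked_columns = {
        column
        for column, label in classifications.items()
        if label is not None and label.lower() in sensitive_labels
    }
    return [
        {key: (mask if key in masked_columns else value) for key, value in row.items()}
        for row in rows
    ]
```

Masking by column name keeps the sample's shape intact for consumers while ensuring no classified value leaves the warehouse; the input rows are not modified.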
Given an Entropy Data data product URL or id, fetch its data contracts (output port ODCS files written next to the Python under src/output_ports/v<N>/, input port ODCS files cached next to their Spark reader under src/input_ports/), translate the schema into Lakeflow @dp.materialized_view / @dp.table Python pipelines, and ensure the project has the publishing layer (ODPS, GitHub Actions). Trigger when the user asks to "implement the data product <url>", "build the Lakeflow pipeline for this data product", or "scaffold output-port tables from a data contract".
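To give a feel for the schema-to-pipeline translation, the sketch below renders the source of a single output-port file from a table name and a column list. The `render_output_port` helper and the shape of the generated file are assumptions, not the skill's real template; the generated code also assumes the pipeline runtime provides the `spark` session.

```python
def render_output_port(table: str, source: str, columns: list[str]) -> str:
    """Render Python source for one materialized view selecting `columns`."""
    selected = ", ".join(repr(column) for column in columns)
    return (
        "from pyspark import pipelines as dp\n"
        "\n"
        "\n"
        f"@dp.materialized_view(name={table!r})\n"
        f"def {table}():\n"
        "    # `spark` is assumed to be injected by the pipeline runtime.\n"
        f"    return spark.read.table({source!r}).select({selected})\n"
    )
```

Calling `render_output_port("orders", "orders_clean", ["order_id", "amount"])` yields a file body that could be written under src/output_ports/v<N>/; both table names here are placeholders.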
Modifies files
Hook triggers on file write and edit operations
Skills for your favorite coding agent that build data products on Databricks with Declarative Automation Bundles and Lakeflow Spark Declarative Pipelines, integrated with Entropy Data.
Sibling plugin to dataproduct-builder-dbt. Pick the one that matches your stack.
It also supports a contract-driven approach: specify your requirements as a data contract, and the builder implements the data product in minutes.
The plugin ships nine skills:
Scaffolds databricks.yml, a serverless Lakeflow pipeline resource, a Lakeflow Job that schedules it, the src/{input_ports,transformations,output_ports/v1}/ layout, pyproject.toml, and a README. Shells out to databricks bundle init lakeflow-pipelines and overlays the Entropy Data pieces on top.
Generates @dp.table Python files under src/output_ports/v1/, resolves access agreements into @dp.view input ports under src/input_ports/, and runs databricks bundle validate to verify the result.
Runs databricks bundle validate → deploy → run with target selection and polls the pipeline run for completion.
Runs entropy-data access list --provider-dataproduct and applies Unity Catalog GRANT SELECT for internal consumers or creates a Delta Share for external ones, per active access agreement.
Edits src/output_ports/v<N>/*.odcs.yaml using natural language and classifies the change as breaking or additive.
Runs datacontract test to verify the live data still matches the schema and quality rules.

The skills are plain markdown; any coding agent that can read instruction files can run them.
For major coding agents, the skills can be installed as a plugin:
For Claude Code, in your terminal:
claude plugin marketplace add https://github.com/entropy-data/dataproduct-builder-databricks
claude plugin install dataproduct-builder-databricks@dataproduct-builder-databricks -s project
For Codex, in your terminal:
codex plugin marketplace add https://github.com/entropy-data/dataproduct-builder-databricks
codex plugin add dataproduct-builder-databricks@dataproduct-builder-databricks
For Copilot, in your terminal:
copilot plugin marketplace add https://github.com/entropy-data/dataproduct-builder-databricks
copilot plugin install dataproduct-builder-databricks@dataproduct-builder-databricks
Any agent that reads AGENTS.md picks up the routing manifest. Alternatively, copy the skills to the directory that your coding agent expects.
The skills authenticate against three systems. Configure each once.
Databricks — workspace auth via the Databricks CLI:
brew install databricks/tap/databricks
databricks auth login --host https://<your-workspace>.cloud.databricks.com
Entropy Data — API key registered with the entropy-data CLI (requires uv).
The skills use a per-project venv for both entropy-data and datacontract. After dataproduct-init scaffolds a project, run uv sync from the project root to install both CLIs at the pinned versions, then invoke them as uv run entropy-data … / uv run datacontract ….
The one exception is the first call to dataproduct-init itself, which runs against an empty directory (no pyproject.toml, no venv yet) and needs entropy-data available globally for its lookup step. Install once per machine:
uv tool install --upgrade entropy-data
entropy-data connection add default --api-key <your-api-key> --host <your-entropy-data-host>
Create a user-scoped key in the Entropy Data web UI (Organization Settings → API Keys → Create new API key, scope User (personal token)). For CI workflows, add a connection with a team-scoped or organization-scoped key.
npx claudepluginhub entropy-data/dataproduct-builder-databricks --plugin dataproduct-builder-databricks