By entropy-data
Build data products with dbt and integrate them with Entropy Data.
Edit an output-port ODCS file under models/output_ports/v<N>/, run the contract test against the live data, and classify any failures as breaking or non-breaking changes — with suggested fixes. Only edits output-port contracts (the spec this data product commits to); input-port contracts under models/input_ports/ are upstream's responsibility and refreshed by dataproduct-implement. Trigger when the user asks to "add/remove/change a column in the data contract", "update the data contract", or "test contract changes".
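The breaking versus non-breaking distinction can be illustrated with a minimal sketch. The rules below are an assumption made for illustration (a removed column or a changed type is breaking, an added column is non-breaking); the skill's actual rule set is not documented here, and `classify_changes` is a hypothetical helper, not part of any CLI.

```python
from typing import Dict, List, Tuple

# Hypothetical, simplified schema representation: column name -> logical type.
Schema = Dict[str, str]


def classify_changes(old: Schema, new: Schema) -> List[Tuple[str, str]]:
    """Compare two schema versions and label each difference.

    Assumed rules (illustrative only):
      - a column that disappears is breaking,
      - a column whose type changes is breaking,
      - a column that is newly added is non-breaking.
    """
    changes: List[Tuple[str, str]] = []
    for name, old_type in old.items():
        if name not in new:
            changes.append(("breaking", f"removed column {name}"))
        elif new[name] != old_type:
            changes.append(
                ("breaking", f"changed type of {name}: {old_type} -> {new[name]}")
            )
    for name in new:
        if name not in old:
            changes.append(("non-breaking", f"added column {name}"))
    return changes


if __name__ == "__main__":
    before = {"order_id": "string", "amount": "decimal", "legacy_code": "string"}
    after = {"order_id": "string", "amount": "float", "discount": "decimal"}
    for severity, message in classify_changes(before, after):
        print(f"{severity}: {message}")
```

A breaking change would typically go into a new `v<N>` directory rather than being applied in place, which is presumably why output ports are versioned by folder.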
Run the Data Contract CLI (`datacontract test`) against ODCS contracts in the project to verify the live data still conforms — schema, quality rules, and freshness. Handles two kinds of contracts with different semantics: output-port contracts under `models/output_ports/**/*.odcs.yaml` (tested against this project's warehouse — "am I still producing what I promised?") and input-port contracts under `models/input_ports/*.odcs.yaml` (tested against the upstream warehouse — "is upstream still producing what I trusted?"). Trigger when the user asks to "test the data contracts", "verify the data product matches its contract", "are we still contract-conformant", "check upstream drift", or "run the contract tests".
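As a rough sketch of how the two contract kinds could be told apart, the helper below collects contract files using the two glob patterns named above and pairs each with a `datacontract test` command line. The function name `plan_contract_tests` is hypothetical, and passing the contract path as the only argument is an assumption; server selection and warehouse credentials are left out. Nothing is executed, the commands are only assembled.

```python
from pathlib import Path
from typing import List, Tuple


def plan_contract_tests(project_root: Path) -> List[Tuple[str, List[str]]]:
    """Return (port kind, command) pairs for every ODCS contract in the project.

    Output-port contracts may sit in versioned subfolders (v1, v2, ...),
    so they are matched recursively; input-port contracts are matched
    only directly under models/input_ports/.
    """
    models = project_root / "models"
    plans: List[Tuple[str, List[str]]] = []

    for contract in sorted((models / "output_ports").glob("**/*.odcs.yaml")):
        plans.append(("output", ["uv", "run", "datacontract", "test", str(contract)]))

    for contract in sorted((models / "input_ports").glob("*.odcs.yaml")):
        plans.append(("input", ["uv", "run", "datacontract", "test", str(contract)]))

    return plans


if __name__ == "__main__":
    for kind, command in plan_contract_tests(Path(".")):
        print(f"[{kind}] {' '.join(command)}")
```

Keeping the port kind next to each command makes it possible to report a failure as "we broke our promise" (output) or "upstream drifted" (input).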
Bootstrap a brand-new dbt data product from scratch — create dbt_project.yml, the Entropy Data model layout (input_ports, staging, intermediate, output_ports/v1), README with uv install instructions, .gitignore, and a profiles.yml.example for the chosen warehouse. After scaffolding, hands off to the entropy-data-sync skill to add the publishing layer (ODPS, ODCS, OpenLineage, GitHub Actions). Trigger when the user asks to start a new data product, scaffold a new dbt project, or "create a data product from scratch."
Build out the transformation layers (`staging/`, `intermediate/`) of a data product and run dbt against them, following project-wide conventions adapted from dbt's best practices (v1.12). Trigger when the user asks to "add a staging model", "build out the staging layer", "create an intermediate model", "refactor this output port into staging + intermediate", "make this model incremental", or "run dbt for this data product".
Extract a small sample of rows from a dbt output port using a non-production profile, scrub anything classified as PII or sensitive in the data contract, and upload the scrubbed sample to Entropy Data via the entropy-data CLI. Trigger when the user asks to "upload example data", "publish sample rows for the data product", or "give consumers a preview of the data".
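The scrubbing step can be pictured with a small sketch. It assumes the contract's per-column classifications have already been read into a plain dictionary, and that the labels `pii` and `sensitive` mark columns to mask; both the labels and the `scrub_sample` helper are hypothetical. Extracting the rows and uploading them with the entropy-data CLI are not shown.

```python
from itertools import islice
from typing import Any, Dict, Iterable, List

# Assumed labels that mark a column as needing a mask (illustrative only).
SENSITIVE_LABELS = frozenset({"pii", "sensitive"})


def scrub_sample(
    rows: Iterable[Dict[str, Any]],
    classifications: Dict[str, str],
    limit: int = 10,
    mask: str = "***",
) -> List[Dict[str, Any]]:
    """Take at most `limit` rows and mask every value in a sensitive column.

    Columns without a classification are passed through unchanged.
    The input rows are not modified; new dictionaries are returned.
    """
    scrubbed: List[Dict[str, Any]] = []
    for row in islice(rows, limit):
        scrubbed.append(
            {
                column: mask
                if classifications.get(column, "").lower() in SENSITIVE_LABELS
                else value
                for column, value in row.items()
            }
        )
    return scrubbed


if __name__ == "__main__":
    sample = [
        {"customer_id": 1, "email": "a@example.com", "country": "DE"},
        {"customer_id": 2, "email": "b@example.com", "country": "FR"},
    ]
    labels = {"email": "PII", "country": "public"}
    print(scrub_sample(sample, labels, limit=1))
```

Passing unclassified columns through is a design choice of this sketch; a stricter variant could mask anything that is not explicitly marked as safe.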
Modifies files
Hook triggers on file write and edit operations
Skills for your favorite coding agent that help you implement data products with dbt, compliant with your organization's conventions and fully integrated with Entropy Data.
It also supports a contract-driven approach: specify your requirements as a data contract, and the builder implements the data product in minutes.
The plugin ships seven skills:
dbt_project.yml, model layout, README, profiles.yml.example, and more.
models/output_ports/v<N>/*.odcs.yaml using natural language.
datacontract test to verify the live data still matches the schema and quality rules.

The skills are plain markdown; any coding agent that can read instruction files can run them.
For the major coding agents, the skills can be installed as a plugin:
For Claude Code, in your terminal:
claude plugin marketplace add https://github.com/entropy-data/dataproduct-builder-dbt
claude plugin install dataproduct-builder-dbt@dataproduct-builder-dbt -s project
For Codex, in your terminal:
codex plugin marketplace add https://github.com/entropy-data/dataproduct-builder-dbt
codex plugin add dataproduct-builder-dbt@dataproduct-builder-dbt
For Copilot, in your terminal:
copilot plugin marketplace add https://github.com/entropy-data/dataproduct-builder-dbt
copilot plugin install dataproduct-builder-dbt@dataproduct-builder-dbt
Any agent that reads AGENTS.md picks up the routing manifest.
Alternatively, copy the skills to the directory that your coding agent expects.
The skills authenticate against Entropy Data through a connection registered with the entropy-data CLI (requires uv).
The skills use a per-project venv for both entropy-data and datacontract. After dataproduct-bootstrap scaffolds a project, run uv sync from the project root to install both CLIs at the pinned versions, then invoke them as uv run entropy-data … / uv run datacontract ….
The one exception is the first call to dataproduct-bootstrap itself, which runs against an empty directory (no pyproject.toml, no venv yet) and needs entropy-data available globally for its lookup step. Install once per machine:
uv tool install --upgrade entropy-data
entropy-data connection add default --api-key <your-api-key> --host <your-entropy-data-host>
Create a user-scoped key in the Entropy Data web UI (Organization Settings → API Keys → Create new API key, scope User (personal token)). For CI workflows, add a connection with a team-scoped or organization-scoped API key.
Ask the agent:
Implement the data product <url or id or new name>.
Or:
Add the property <name> to the data contract and find data products that we could use as input ports. Request access and implement the dbt pipeline.
Organizations with their own data-product stack and naming conventions are encouraged to fork or copy this repository and adapt it to their environment.
Common extension points:
To install with claudepluginhub:
npx claudepluginhub entropy-data/dataproduct-builder-dbt --plugin dataproduct-builder-dbt