Accelerate and improve DataHub connector development and metadata management with skills for planning connectors, reviewing code against golden standards, enriching metadata, tracing lineage, managing data quality, searching the catalog, setting up environments, and scaffolding micro frontends.
Add or update metadata in DataHub: descriptions, tags, glossary terms, and ownership
Explore lineage, trace data dependencies, and perform impact analysis
Manage data quality — assertions, incidents, and subscriptions
Search the DataHub catalog and answer questions about your data
Set up a DataHub connection: install the CLI, then configure authentication and default scopes
Use this agent when you need to verify whether a PR author has genuinely addressed previous review comments before re-review. This agent fetches review comments, classifies them by type (code change request vs. discussion vs. question), and checks whether each was substantively addressed — not just marked as resolved. <example> Context: A PR has been updated after review and the author is requesting re-review. user: "Check if the author addressed all review comments on PR #1234" assistant: "I'll use the comment-resolution-checker agent to verify whether all review comments on PR #1234 have been substantively addressed." <commentary> PR re-review readiness check triggers this agent. </commentary> </example> <example> Context: User wants to know what's still outstanding on a PR before approving. user: "What review comments are still unaddressed on this PR?" assistant: "I'll use the comment-resolution-checker agent to analyze the PR's review comments and identify any that haven't been addressed." <commentary> Checking for unaddressed comments triggers this agent. </commentary> </example>
Research source systems for DataHub connector development. Gathers documentation, finds similar connectors, identifies entity mappings, and assesses implementation complexity. Returns structured findings for planning phase. <example> Context: User wants to build a new DataHub connector for a source system. user: "Research Snowplow for a new DataHub connector" assistant: "I'll use the connector-researcher agent to gather comprehensive research on Snowplow including API documentation, similar connectors, and entity mappings." <commentary> New connector research request triggers this agent. </commentary> </example> <example> Context: User is starting connector development and needs background information. user: "I need to build a connector for DuckDB, what do I need to know?" assistant: "I'll use the connector-researcher agent to research DuckDB's metadata APIs, find similar DataHub connectors, and assess implementation complexity." <commentary> Connector development information request triggers this agent. </commentary> </example>
Run provided validation scripts, analyze their output, and report results for DataHub connector verification steps. Handles extraction verification, capability checks, code quality gates, source connectivity, ingestion runs, and CLI verification. <example> Context: Workflow needs to verify that extraction output contains expected entities. user: "Run the verify-extraction script on the output file" assistant: "I'll use the connector-validator agent to run the verification script and analyze the results." <commentary> Extraction verification is a procedural script-running task that triggers this agent. </commentary> </example> <example> Context: Workflow needs to check that declared capabilities produce actual output. user: "Run the capability check on the connector" assistant: "I'll use the connector-validator agent to run the capability check script and report coverage." <commentary> Capability validation is a script-based check that triggers this agent. </commentary> </example>
Execute DataHub search, browse, and lineage operations, retrieve entity metadata, and return structured results. Used by the datahub-search and datahub-lineage skills to delegate catalog queries. <example> Context: User wants to find all Snowflake datasets with PII tags. user: "Search DataHub for Snowflake datasets tagged with PII" assistant: "I'll use the metadata-searcher agent to query DataHub for Snowflake datasets with PII tags." <commentary> The search skill delegates the actual search execution to this agent, which runs the queries and returns structured results. </commentary> </example> <example> Context: User asks who owns the revenue pipeline and needs metadata gathered. user: "Who owns the revenue pipeline?" assistant: "I'll use the metadata-searcher agent to find revenue-related pipelines and retrieve their ownership metadata." <commentary> The search skill delegates multi-step metadata retrieval to this agent, which searches, fetches aspects, and returns evidence for answering the question. </commentary> </example>
Plans new DataHub connectors by classifying the source system, researching it using a dedicated agent or inline research, and generating a _PLANNING.md blueprint with entity mapping and architecture decisions. Use when building a new connector, researching a source system for DataHub, or designing connector architecture. Triggers on: "plan a connector", "new connector for X", "research X for DataHub", "design connector for X", "create planning doc", or any request to plan/research/design a DataHub ingestion source.
Reviews DataHub connector implementations against 22 golden standards for compliance, code quality, silent failures, test coverage, type design, and merge readiness. Use when reviewing connector code, checking a PR, auditing a connector implementation, or verifying connector standards compliance.
Use this skill when the user wants to add or update metadata in DataHub: descriptions, tags, glossary terms, ownership, deprecation, domains, data products, structured properties, documents, or field-level metadata. Triggers on: "add tag to X", "update description for X", "set owner of X", "add glossary term", "deprecate X", "create a domain", "create a glossary term", "add a document", or any request to modify DataHub metadata.
Use this skill when the user wants to explore lineage, trace data dependencies, perform impact analysis, find root causes, map data pipelines, or understand how data flows between systems. Triggers on: "what feeds into X", "what depends on X", "show lineage for X", "impact analysis", "trace the pipeline", "root cause", "upstream of X", "downstream of X", or any request involving data lineage and dependency tracking.
Configure a DataHub instance to load and display a Micro Frontend (MFE) app. Use when the user wants to register an MFE with DataHub, add an MFE to the nav sidebar, set up MFE config for local dev or production/k8s, or troubleshoot MFE loading issues.
Uses power tools
Uses Bash, Write, or Edit tools
Agent skills for working with DataHub — plan and review connectors, search the catalog, enrich metadata, trace lineage, manage data quality, and set up connections. Works with Claude Code, Cortex Code, Cursor, Codex, Copilot, Gemini CLI, Windsurf, and other Agent Skills-compatible tools.
Search the DataHub catalog, discover entities, and answer ad-hoc questions about your data. Supports keyword search, filtered browse, column-name search, structured property queries, and multi-step question answering.
> Find revenue tables in Snowflake
> Who owns the customer pipeline?
> /datahub-search datasets tagged PII
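The filtered search described above can be pictured as a few optional filters applied to catalog entries. This is a minimal sketch in Python over a hypothetical in-memory list; `CATALOG` and `search` are illustrative names and are not part of the skill or of any DataHub API.

```python
# Hypothetical catalog entries; a real catalog would be queried through DataHub.
CATALOG = [
    {"name": "revenue_daily", "platform": "snowflake", "tags": ["finance"]},
    {"name": "customers", "platform": "snowflake", "tags": ["PII"]},
    {"name": "revenue_events", "platform": "kafka", "tags": []},
]


def search(catalog, keyword=None, platform=None, tag=None):
    """Return the names of entries matching every filter that was supplied."""
    hits = []
    for entry in catalog:
        if keyword and keyword.lower() not in entry["name"].lower():
            continue
        if platform and entry["platform"] != platform:
            continue
        if tag and tag not in entry["tags"]:
            continue
        hits.append(entry["name"])
    return hits


print(search(CATALOG, keyword="revenue", platform="snowflake"))
print(search(CATALOG, tag="PII"))
```

Filters left as `None` are ignored, which is how a single function can cover keyword search, platform-scoped browse, and tag lookups.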
Add or update metadata in DataHub — descriptions, tags, glossary terms, ownership, and deprecation. Shows a before/after plan and asks for approval before making changes.
> Add a description to the orders table
> Tag these columns as PII
> /datahub-enrich set owner of revenue_daily to @jdoe
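The before/after plan mentioned above amounts to a diff between current and proposed metadata, applied only after approval. The sketch below assumes a plain dict as the metadata record; `build_plan` and `apply_plan` are hypothetical helpers for illustration, not the skill's implementation.

```python
from typing import Any


def build_plan(current: dict[str, Any], proposed: dict[str, Any]) -> list[tuple[str, Any, Any]]:
    """Return (field, before, after) for every field the proposal would change."""
    plan = []
    for field, after in proposed.items():
        before = current.get(field)
        if before != after:
            plan.append((field, before, after))
    return plan


def apply_plan(current: dict[str, Any], plan, approved: bool) -> dict[str, Any]:
    """Return a new record with the plan applied, or an unchanged copy if not approved."""
    updated = dict(current)
    if approved:
        for field, _before, after in plan:
            updated[field] = after
    return updated


# Hypothetical metadata for an "orders" table.
orders = {"description": None, "tags": ["finance"], "owner": "jdoe"}
proposal = {"description": "One row per customer order.", "owner": "jdoe"}

plan = build_plan(orders, proposal)
for field, before, after in plan:
    print(f"{field}: {before!r} -> {after!r}")
```

Fields whose proposed value already matches (here `owner`) drop out of the plan, so the user only reviews real changes.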
Explore data lineage, trace upstream sources and downstream consumers, perform impact analysis, and map cross-platform data flows.
> What feeds into the revenue dashboard?
> Impact analysis for changing the orders table
> /datahub-lineage trace the customer pipeline
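Impact analysis and upstream tracing are both graph walks over lineage edges. A minimal sketch, assuming a hypothetical edge map (`DOWNSTREAM`) held in memory; the entity names and helper functions are invented for illustration and do not reflect how the skill queries DataHub.

```python
from collections import deque

# Hypothetical lineage edges: entity -> its direct downstream consumers.
DOWNSTREAM = {
    "raw.orders": ["staging.orders"],
    "staging.orders": ["mart.revenue_daily", "mart.customer_ltv"],
    "mart.revenue_daily": ["dashboard.revenue"],
    "mart.customer_ltv": [],
    "dashboard.revenue": [],
}


def reachable_from(start, edges):
    """Breadth-first walk returning every entity reachable from `start`, nearest first."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in edges.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                order.append(neighbor)
                queue.append(neighbor)
    return order


def invert(edges):
    """Flip edge direction so the same walk answers 'what feeds into X?'."""
    upstream = {node: [] for node in edges}
    for parent, children in edges.items():
        for child in children:
            upstream.setdefault(child, []).append(parent)
    return upstream


print(reachable_from("staging.orders", DOWNSTREAM))           # downstream impact
print(reachable_from("dashboard.revenue", invert(DOWNSTREAM)))  # upstream sources
```

The `seen` set keeps the walk from revisiting entities, which matters once real lineage graphs contain diamonds or cycles.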
Manage data quality — create and run assertions (freshness, volume, SQL, field, schema), set up smart AI-inferred assertions, raise and resolve incidents, and configure notification subscriptions. Separates Open Source (diagnostic) from Cloud (full management) capabilities.
> Find datasets with failing assertions
> Create a freshness assertion on the orders table
> /datahub-quality raise an incident on the customer pipeline
> Subscribe me to assertion failures via Slack
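To make the assertion types concrete, here is a rough sketch of what a freshness check and a volume check evaluate. The thresholds, timestamps, and function names are assumptions chosen for illustration; the skill itself manages assertions through DataHub rather than running code like this.

```python
from datetime import datetime, timedelta, timezone


def check_freshness(last_updated, max_age, now):
    """Pass when the dataset was updated within `max_age` of `now`."""
    return (now - last_updated) <= max_age


def check_volume(row_count, min_rows, max_rows):
    """Pass when the row count falls inside the expected inclusive range."""
    return min_rows <= row_count <= max_rows


# Hypothetical observations for an "orders" table.
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
last_updated = datetime(2024, 1, 2, 3, 0, tzinfo=timezone.utc)

results = {
    "freshness": check_freshness(last_updated, timedelta(hours=6), now),
    "volume": check_volume(row_count=1200, min_rows=1000, max_rows=5000),
}
failing = [name for name, passed in results.items() if not passed]
print(failing)
```

In this example the table was last updated nine hours before `now`, so the six-hour freshness check fails while the volume check passes.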
Install the DataHub CLI, configure authentication, verify connectivity, and set up default scopes and profiles for the other interaction skills.
> Set up my DataHub connection
> /datahub-setup focus on Snowflake in the Finance domain
> Create a profile for the data-eng team
Walks you through building a new DataHub connector in four steps: classify the source system type, research it (using a dedicated agent or inline), generate a _PLANNING.md with entity mapping and architecture, and get your sign-off before anyone writes code.
> Plan a connector for ClickHouse
> /connector-planning duckdb
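The four planning steps above (classify, research, generate the planning document, sign-off) can be sketched as a small pipeline. Everything here is hypothetical: the `Plan` fields, the `KNOWN_TYPES` lookup, the entity mapping, and the rendered layout are assumptions for illustration and do not describe the real _PLANNING.md format.

```python
from dataclasses import dataclass, field


@dataclass
class Plan:
    source: str
    source_type: str = "unknown"
    findings: list = field(default_factory=list)
    entity_mapping: dict = field(default_factory=dict)
    approved: bool = False


# Hypothetical lookup; the skill's real classification is richer than this.
KNOWN_TYPES = {"clickhouse": "sql database", "duckdb": "sql database"}


def classify(plan):
    """Step 1: decide what kind of source system this is."""
    plan.source_type = KNOWN_TYPES.get(plan.source.lower(), "unknown")
    return plan


def research(plan):
    """Step 2: placeholder for research done by a dedicated agent or inline."""
    plan.findings.append(f"Reviewed public docs for {plan.source}")
    plan.entity_mapping = {"table": "dataset", "database": "container"}
    return plan


def render_planning_doc(plan):
    """Step 3: render an illustrative planning document body."""
    lines = [
        f"# {plan.source} connector plan",
        f"Source type: {plan.source_type}",
        "",
        "## Entity mapping",
    ]
    lines += [f"- {src} -> {dst}" for src, dst in plan.entity_mapping.items()]
    return "\n".join(lines)


def sign_off(plan, approved):
    """Step 4: record the user's decision before any code is written."""
    plan.approved = approved
    return plan


plan = research(classify(Plan("DuckDB")))
doc = render_planning_doc(plan)
plan = sign_off(plan, approved=True)
print(doc)
```

Keeping sign-off as its own step mirrors the workflow: the document exists and can be reviewed before anything is marked approved.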
Checks connector code against the 22 standards (see below). On Claude Code it runs five agents in parallel — silent failures, test coverage, type design, simplification, comment resolution. On other platforms it does the same checks one at a time.
> Review my connector
> /connector-review postgres
> Review PR #1234
If you're on Claude Code and want the parallel review, also install pr-review-toolkit:
claude plugin install pr-review-toolkit@claude-plugins-official
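The difference between the parallel review and the one-at-a-time fallback can be illustrated with a thread pool. This is only a sketch: `run_check` is a hypothetical stand-in for a review agent, and the real dispatch happens inside the agent platform, not in Python.

```python
from concurrent.futures import ThreadPoolExecutor

# The five review areas named in the description above.
CHECKS = [
    "silent failures",
    "test coverage",
    "type design",
    "simplification",
    "comment resolution",
]


def run_check(name):
    """Stand-in for one review agent; returns a (check, findings) pair."""
    return name, []  # empty list: no findings in this toy example


def review_parallel(checks):
    """Run every check concurrently; pool.map keeps results in input order."""
    with ThreadPoolExecutor(max_workers=len(checks)) as pool:
        return dict(pool.map(run_check, checks))


def review_sequential(checks):
    """Run the same checks one at a time."""
    return dict(run_check(name) for name in checks)


print(review_parallel(CHECKS) == review_sequential(CHECKS))
```

Both paths produce the same report; the parallel one only changes how long the review takes.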
Loads all 22 connector standards into context. Run this before starting connector work so the agent actually knows what it's checking against.
> Load the DataHub standards
> What are the connector standards?
The Skills CLI detects your installed agents and sets things up:
npx skills add datahub-project/datahub-skills
Works with most agents including Claude Code, Cursor, Codex, Copilot, Gemini CLI, Windsurf, Cline, and Roo Code.
# Option A: Plugin install (gets you hooks, slash commands, multi-agent dispatch)
claude plugin install datahub-skills
# Also install pr-review-toolkit for multi-agent reviews:
claude plugin install pr-review-toolkit@claude-plugins-official
# Option B: Skills CLI (project-level, installs to .claude/skills/)
npx skills add datahub-project/datahub-skills -a claude-code
Then:
> Search for revenue tables in Snowflake
> /datahub-search who owns the customer pipeline?
> /datahub-enrich add description to orders table
> /datahub-lineage what feeds into the revenue dashboard?
> /datahub-quality find datasets with failing assertions
> /datahub-setup verify my connection
> /connector-review snowflake
> /connector-planning duckdb
npx skills add datahub-project/datahub-skills -a cursor
# Installs to .agents/skills/
Cursor picks up skills from .agents/skills/ automatically:
> Search DataHub for customer tables
> Review my DataHub connector
> Plan a connector for ClickHouse
npx claudepluginhub datahub-project/datahub-skills --plugin datahub-skills