Skill

Design Data Mesh Architecture

Distributes data ownership to domain teams treating data as a product, with a self-serve platform and federated governance. Useful when centralized data platforms become bottlenecks.

data-engineering

Popularity

Stars

Forks

Shared by

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/grimoire:design-data-mesh-architecture

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Distribute data ownership to domain teams treating data as a product, connected by a self-serve platform and governed by federated standards.

SKILL.md

62 lines · ~1.7k tokens

Stats

LanguageShell

Stars12

Forks1

MaintenanceExcellent

Last CommitJun 17, 2026

Actions

View Source View Plugin View on GitHub View README

Design Data Mesh Architecture

Distribute data ownership to domain teams treating data as a product, connected by a self-serve platform and governed by federated standards.

Why This Is Best Practice

Adopted by: Zalando (data mesh pioneer), JPMorgan Chase, Saxo Bank, Netflix (elements of data mesh), HelloFresh — enterprises with 10+ business domains adopting this pattern Impact: Zalando: reduced time-to-insight from weeks to days after domain team ownership; centralized data teams at scale are a bottleneck — average ticket wait time of 3-6 weeks for data access in large enterprises (Dehghani 2019) Why best: Centralized data lakes fail at scale because domain expertise cannot be centralized; Conway's Law means data architecture must mirror organizational structure

Sources: Dehghani "Data Mesh" O'Reilly (2022); Dehghani martinfowler.com (2019); Thoughtworks Technology Radar (data mesh as Adopt since 2021)

Steps

Identify domain boundaries — Map your organization's business domains (Sales, Inventory, Logistics, Finance). Each domain that generates data and consumes data from others is a candidate data owner. Boundaries should align with Conway's Law: how your teams are organized determines where domain boundaries are.
Define data products for each domain — A data product is the unit of data ownership. Each domain exposes its data as a product: analytical datasets (Parquet/Delta Lake files), APIs (REST/GraphQL), events (Kafka topics), or semantic models (dbt models). A data product has: a clear owner, an SLA, a schema, and a quality contract.
Apply data product thinking — Data products must satisfy: discoverability (searchable in a catalog), addressability (stable URI/identifier), trustworthiness (SLAs on freshness and quality), self-describing (schema, lineage, documentation), interoperable (standard formats), and secure (access-controlled). These are Dehghani's six characteristics of data products.
Build a self-serve data platform — Domain teams cannot own data products if they need a central team to provision storage, compute, and pipelines. The platform team provides: infrastructure-as-code templates for common patterns (dbt + Snowflake, Spark + Delta Lake), CI/CD templates for data pipelines, monitoring templates, and a data catalog. Platform is a product for internal domain teams.
Establish a data catalog — All data products must be discoverable. Implement a data catalog (Apache Atlas, DataHub, Amundsen, AWS Glue Data Catalog) where every data product is registered with: owner, schema, lineage, SLA, access policy, and sample data. Without discoverability, data products are invisible and unused.
Design federated data governance — Governance is not centralized; it is federated. The central data governance team defines: global standards (naming conventions, data classification, privacy requirements), interoperability contracts (common schemas for shared concepts like Customer), and quality thresholds (minimum SLA requirements for published data products). Domain teams implement governance standards in their products.
Implement data contracts — A data contract is a formal agreement between a data producer and consumer: schema, SLA (freshness, availability, quality metrics), change management process (deprecation notice period), and owner contact. Tools: DataContract CLI, Soda Core, or dbt contracts. Data contracts enforce the same discipline as API contracts for analytical data.
Design for data lineage — Every data product must expose its upstream dependencies and downstream consumers. Implement lineage tracking at the platform level (OpenLineage, Marquez). Lineage enables impact analysis: "Which data products are affected if the Orders table schema changes?" Without lineage, cross-product dependencies are invisible.
Define access management per data product — Each data product has its own access policy: public (any internal user), restricted (authenticated domain teams), classified (explicit approval required). Implement as attribute-based access control at the platform layer. Data governance team audits access policies quarterly.
Measure data product health — Each data product owner tracks: freshness SLA compliance (% of time data is updated within the SLA), schema stability (breaking changes per quarter), access request response time, and consumer satisfaction (quarterly survey). Publish metrics in the data catalog. Unhealthy data products lose consumer trust.

Rules

Domain teams own their data products end-to-end: ingestion, transformation, quality, and serving; central teams build the platform, not the products.
No data product may access another domain's raw source tables; all cross-domain data access goes through the published data product interface.
Global governance standards are non-negotiable for regulated data (PII, financial); federated autonomy does not mean ungoverned.
A data product with no SLA is not a data product; it is a data swamp.

Common Mistakes

Renaming the data lake team to "data mesh" — data mesh requires organizational change (domain ownership) not just tooling change; a renamed central team is still a centralized bottleneck.
No self-serve platform — domain teams without platform abstractions spend 80% of their time on infrastructure, not data products; the platform is a prerequisite for domain ownership.
Skipping data contracts — data products without contracts are used unpredictably; consumers build on undocumented assumptions and break when producers change schemas.
Federated governance without global standards — without interoperability standards, data mesh becomes a collection of isolated silos that cannot be joined across domains.

When NOT to Use

Organizations with fewer than 5-6 distinct business domains where a well-managed data warehouse is simpler
Early-stage companies where all data flows through a single product and a single team can own it
Teams without platform engineering capability to build the self-serve infrastructure layer — data mesh without the platform devolves to ungoverned chaos

Design Data Mesh Architecture

Popularity

Invocation

Context Preview

SKILL.md

Design Data Mesh Architecture

Popularity

Invocation

Context Preview

SKILL.md

Design Data Mesh Architecture

Why This Is Best Practice

Steps

Rules

Common Mistakes

When NOT to Use

Similar Skills

Design Data Mesh Architecture

Why This Is Best Practice

Steps

Rules

Common Mistakes

When NOT to Use

Similar Skills