From software-architect-assistant
Design data architectures with modeling, pipelines, and governance
How this skill is triggered — by the user, by Claude, or both
Slash command
/software-architect-assistant:skills/data-architectureThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Design data architectures including data models, pipeline designs, governance frameworks, and quality management for operational and analytical systems.
Design data architectures including data models, pipeline designs, governance frameworks, and quality management for operational and analytical systems.
| Parameter | Type | Required | Validation | Default |
|---|---|---|---|---|
data_domain | string | ✅ | min: 20 chars | - |
design_type | enum | ⚪ | model|pipeline|governance|quality | model |
data_type | enum | ⚪ | operational|analytical|streaming | operational |
volume_tier | enum | ⚪ | small|medium|large|massive | medium |
output_format | enum | ⚪ | erd|yaml|json | erd |
┌──────────────────────────────────────────────────────────┐
│ 1. VALIDATE: Check data domain and requirements │
│ 2. DISCOVER: Identify data sources and entities │
│ 3. MODEL: Create conceptual/logical/physical model │
│ 4. DESIGN: Pipeline or governance framework │
│ 5. QUALITY: Define data quality rules │
│ 6. VALIDATE: Check model consistency │
│ 7. DOCUMENT: Return data architecture │
└──────────────────────────────────────────────────────────┘
| Error | Retry | Backoff | Max Attempts |
|---|---|---|---|
VALIDATION_ERROR | No | - | 1 |
MODEL_GENERATION_ERROR | Yes | 1s | 2 |
FORMAT_ERROR | Yes | 500ms | 3 |
log_points:
- event: design_started
level: info
data: [design_type, data_type]
- event: entities_identified
level: info
data: [entity_count, relationship_count]
- event: quality_rules_defined
level: info
data: [rule_count, dimensions_covered]
metrics:
- name: models_created
type: counter
labels: [design_type]
- name: design_time_ms
type: histogram
- name: entity_count
type: gauge
| Error Code | Description | Recovery |
|---|---|---|
E401 | Missing data domain | Request domain description |
E402 | Invalid relationships | Highlight circular/missing refs |
E403 | Schema validation failed | Show validation errors |
E404 | Unsupported volume tier | Suggest architectural changes |
test_cases:
- name: "E-commerce data model"
input:
data_domain: "E-commerce order management"
design_type: "model"
output_format: "erd"
expected:
has_entities: true
entities_include: ["Customer", "Order", "Product"]
has_relationships: true
valid_erd: true
- name: "Analytics pipeline"
input:
data_domain: "Customer analytics"
design_type: "pipeline"
data_type: "analytical"
expected:
has_ingestion: true
has_transformation: true
has_serving: true
- name: "Data quality rules"
input:
data_domain: "User profiles"
design_type: "quality"
expected:
has_dimensions: true
dimensions_include: ["completeness", "accuracy"]
has_rules: true
| Symptom | Root Cause | Resolution |
|---|---|---|
| Missing relationships | Incomplete domain | Add missing entities |
| Invalid ERD syntax | Format error | Validate Mermaid ERD |
| Missing quality rules | Dimensions not specified | Add quality dimensions |
□ Is data domain clearly defined?
□ Are all entities identified?
□ Are relationships correctly typed?
□ Is output format valid?
□ Are quality dimensions covered?
| Dimension | Example Rule |
|---|---|
| Completeness | NOT NULL checks |
| Accuracy | Regex validation |
| Consistency | Referential integrity |
| Timeliness | SLA monitoring |
| Uniqueness | Primary key constraints |
| Component | Trigger | Data Flow |
|---|---|---|
| Agent 06 | Design request | Receives domain, returns model |
| Agent 04 | Cloud data services | Cloud data platform |
| Version | Date | Changes |
|---|---|---|
| 2.0.0 | 2025-01 | Production-grade: ERD, pipelines, DQ framework |
| 1.0.0 | 2024-12 | Initial release |
npx claudepluginhub pluginagentmarketplace/custom-plugin-software-architect --plugin software-architect-assistantDesigns data pipeline architectures for batch ETL, streaming, or hybrid scenarios including tech stacks, ASCII diagrams, data quality strategies, and cost analysis. Useful for real-time processing, BI reporting, or migrations.
Design batch and streaming data pipelines. Plan ingestion, transformation, quality checks, and failure recovery. Use when building ETL/ELT systems or data infrastructure.
Defines a systematic data quality framework across six dimensions (completeness, accuracy, consistency, timeliness, validity, uniqueness) for data pipelines and warehouses. Use when standardizing data quality measurement, automation, and accountability.