fabric-dataflow-migration-toolkit

Name: fabric-dataflow-migration-toolkit
Author: kavasimihaly

By KavasiMihaly

End-to-end Power BI Dataflow Gen1 to Microsoft Fabric medallion notebook migration. Export, analyze, and generate bronze/silver notebooks on either engine — distributed PySpark (synapse_pyspark) or single-node Python (polars/duckdb/delta-rs) — then deploy and validate in Fabric.

npx claudepluginhub kavasimihaly/ai-plugins --plugin fabric-dataflow-migration-toolkit

Popularity

Stars

Med: 0·Avg: 285

Installs

Med: 0·Avg: 1

What's Inside

Agents (6)

fabric-bronze-builder

/agent

Build bronze layer PySpark **or** Python notebooks that ingest raw data into Fabric lakehouses with Delta Lake format. Handle OData, Excel-via-SharePoint, CSV, Parquet, JSON, and API sources. Add ingestion metadata columns, enable schema evolution, and implement append-only audit trails. The `engine` input (`pyspark` default | `python`) selects the codegen idiom: PySpark/Spark-session or single-node polars + delta-rs. Output is always `.ipynb` (Jupyter JSON) for Fabric deployment, never `.py`. MUST BE USED when creating the first ingestion layer in a medallion architecture.
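The output contract described above (always `.ipynb`, with the kernel chosen by `engine`) can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual template: the helper name `build_notebook`, the kernel display names, and the sample bronze cell are assumptions. Only the kernel names `synapse_pyspark` and `jupyter` and the idea of ingestion metadata columns come from the descriptions on this page.

```python
import json

# Kernel names as stated in the plugin description; display names are assumed.
KERNELS = {
    "pyspark": {"name": "synapse_pyspark", "display_name": "Synapse PySpark"},
    "python": {"name": "jupyter", "display_name": "Python"},
}


def build_notebook(cell_sources, engine="pyspark"):
    """Return an nbformat-4 notebook dict with one code cell per source string."""
    if engine not in KERNELS:
        raise ValueError(f"engine must be one of {sorted(KERNELS)}, got {engine!r}")
    cells = [
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            # nbformat stores source as a list of lines with line endings kept
            "source": src.splitlines(keepends=True),
        }
        for src in cell_sources
    ]
    return {
        "cells": cells,
        "metadata": {
            "kernelspec": {**KERNELS[engine], "language": "python"},
            "language_info": {"name": "python"},
        },
        "nbformat": 4,
        "nbformat_minor": 4,
    }


# Hypothetical bronze cell: add ingestion metadata columns before an append.
bronze_cell = (
    "from pyspark.sql import functions as F\n"
    "df = df.withColumn('_ingested_at', F.current_timestamp())\n"
    "df = df.withColumn('_source_file', F.input_file_name())\n"
)
notebook = build_notebook([bronze_cell], engine="pyspark")
ipynb_text = json.dumps(notebook, indent=1)  # this text is what a .ipynb file holds
```

Writing the notebook as plain JSON keeps the sketch free of third-party dependencies; a real generator would probably also attach the lakehouse binding to the notebook metadata.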

fabric-migration-orchestrator

/agent

End-to-end Power BI Dataflow Gen1 → Microsoft Fabric medallion notebook migration orchestrator. Drives the full workflow: extract Gen1 dataflows, analyze M code (m-query-analyst), gather refactor decisions (migration-analyst), scaffold the Fabric medallion project, generate bronze + silver PySpark notebooks via builders, deploy via the Fabric CLI, validate (fabric-pipeline-validator), and maintain a single migration-design.md document that every stage reads and updates. MUST BE USED as the top-level agent for end-to-end Gen1 migration. Requires THREE user touch points: config Q&A, refactor Q&A, and a plan-mode approval. Run via `claude --agent fabric-dataflow-migration-toolkit:fabric-migration-orchestrator:fabric-migration-orchestrator`.

fabric-pipeline-validator

/agent

End-to-end Fabric migration validator. Invoked by the orchestrator at Stage 12. Runs static checks on every generated .ipynb notebook (valid JSON, lakehouse binding, read_bronze() contract for silver) and runtime checks against deployed lakehouses (row counts, schema match) when not in dry-run mode. Writes Section 10 (Validation Results) of `1 - Documentation/migration-design.md`. Does NOT write any other file.
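The three static checks named above (valid JSON, lakehouse binding, `read_bronze()` contract for silver) might look something like the sketch below. The function name `static_check` is hypothetical, and the metadata path `metadata.dependencies.lakehouse` is an assumption about where the binding lives; the plugin's own validator may look elsewhere.

```python
import json


def static_check(ipynb_text, layer):
    """Return a list of problems found in one notebook; empty means it passed."""
    try:
        nb = json.loads(ipynb_text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(nb, dict):
        return ["top-level JSON value is not an object"]

    problems = []
    # Assumed location of the lakehouse binding in the notebook metadata.
    lakehouse = nb.get("metadata", {}).get("dependencies", {}).get("lakehouse")
    if not lakehouse:
        problems.append("no lakehouse binding in metadata.dependencies.lakehouse")

    # Concatenate the source of every code cell (source may be a list or a string).
    code = "".join(
        "".join(cell.get("source", []))
        for cell in nb.get("cells", [])
        if cell.get("cell_type") == "code"
    )
    if layer == "silver" and "read_bronze(" not in code:
        problems.append("silver notebook never calls read_bronze()")
    return problems


good = json.dumps({
    "metadata": {"dependencies": {"lakehouse": {"default_lakehouse_name": "lh_silver"}}},
    "cells": [{"cell_type": "code", "source": ["df = read_bronze('orders')\n"]}],
})
print(static_check(good, "silver"))  # []
print(static_check("{not json", "bronze"))
```

Returning a list of problems rather than raising on the first one lets a validator report every issue for a notebook in a single pass.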

fabric-silver-builder

/agent

Build silver layer notebooks (PySpark or Python engine) that clean, conform, and transform bronze Delta tables into analysis-ready datasets. Silver notebooks read exclusively from bronze tables via read_bronze() — never from external storage. Handle type casting, renaming, null handling, deduplication, and unpivot/pivot transforms. Use when creating the second transformation layer in a medallion architecture.
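As an engine-agnostic illustration of the silver steps listed above (rename, cast, null handling, deduplication, all behind `read_bronze()`), here is a plain-Python sketch. Everything in it is an assumption for illustration: the in-memory `BRONZE` dictionary stands in for bronze Delta tables, the column names are invented, and the real notebooks would express the same steps in PySpark or polars.

```python
import re

# In-memory stand-in for bronze Delta tables (hypothetical sample data).
BRONZE = {
    "orders": [
        {"OrderID": "1", "Amt": "10.50", "Region": None},
        {"OrderID": "1", "Amt": "10.50", "Region": None},  # duplicate row
        {"OrderID": "2", "Amt": "7", "Region": "EMEA"},
    ]
}


def read_bronze(table):
    """Accept only bare table names, so paths and URLs are rejected."""
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table):
        raise ValueError(f"silver may only read bronze tables, got {table!r}")
    return BRONZE[table]


def to_silver(rows):
    """Rename, cast, fill nulls, then drop duplicate keys (first row wins)."""
    seen, out = set(), []
    for row in rows:
        clean = {
            "order_id": int(row["OrderID"]),
            "amount": float(row["Amt"]),
            "region": row["Region"] or "UNKNOWN",
        }
        if clean["order_id"] not in seen:
            seen.add(clean["order_id"])
            out.append(clean)
    return out


silver_orders = to_silver(read_bronze("orders"))
```

Routing every read through one function is what makes the "bronze only" rule checkable: a validator only has to confirm that the notebook calls `read_bronze()` and nothing else that reads data.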

m-query-analyst

/agent

Mechanical Power Query M analysis. Parses .pq files exported from Power BI Dataflow Gen1 dataflows, classifies each query (output_entity / staging / transformation / parameter / helper), detects source type (CSV, Excel, AzureStorage, SQL Server, etc.), builds a dependency map, and scans for known M-conversion risk patterns. Produces JSON envelopes for the orchestrator to merge into the migration design document. NO USER INTERACTION — pure mechanical analysis. Spawned in two passes: inventory (Stage 3) and risk scan (Stage 4).
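A rough sketch of two of the mechanical steps above, source-type detection and dependency mapping, is shown here. The M function names (`Csv.Document`, `Excel.Workbook`, `AzureStorage.Blobs`, `Sql.Database`, `OData.Feed`) are real Power Query functions, but the helper names, the label mapping, and the sample queries are assumptions. The matching is deliberately naive: it does not parse M, so a query name that appears inside a string literal would be counted as a reference.

```python
import re

# Real Power Query M source functions mapped to illustrative source-type labels.
SOURCE_FUNCTIONS = {
    "Csv.Document": "CSV",
    "Excel.Workbook": "Excel",
    "AzureStorage.Blobs": "AzureStorage",
    "Sql.Database": "SQL Server",
    "OData.Feed": "OData",
}


def detect_sources(m_code):
    """Return the sorted source types whose M function is called in the query text."""
    return sorted({label for fn, label in SOURCE_FUNCTIONS.items() if fn + "(" in m_code})


def dependency_map(queries):
    """Map each query name to the other query names its body references."""
    deps = {}
    for name, body in queries.items():
        deps[name] = sorted(
            other
            for other in queries
            # Whole-identifier match: not preceded or followed by a word char or dot.
            if other != name
            and re.search(rf"(?<![\w.]){re.escape(other)}(?![\w.])", body)
        )
    return deps


# Hypothetical queries, keyed by query name.
queries = {
    "RawOrders": 'let Source = Csv.Document(Web.Contents("https://example.com/o.csv")) in Source',
    "Orders": 'let Source = RawOrders, Typed = Table.TransformColumnTypes(Source, {{"Amt", type number}}) in Typed',
}
sources = detect_sources(queries["RawOrders"])
deps = dependency_map(queries)
```

A query with no detected source but at least one dependency is a natural candidate for the "transformation" class, while one with a source and no dependents feeding an output looks like "staging"; the actual classification rules are the plugin's own.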

Skills (8)

dataflow-gen1-extractor

/dataflow-gen1-extractor

Extract Dataflow Gen1 (Power BI Dataflows) definitions from a Fabric/Power BI workspace and parse them into individual M query (.pq) files. Use when the user wants to: (1) export dataflow definitions from a Power BI workspace, (2) extract M code / Power Query code from Dataflow Gen1, (3) migrate Dataflow Gen1 to notebooks or Dataflow Gen2, (4) inventory all queries in a workspace's dataflows, (5) parse exported dataflow JSON files into .pq files. Triggers on mentions of "dataflow gen1", "export dataflow", "extract M code from dataflow", "dataflow migration", "Power BI dataflow export".
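Step (5), parsing an exported dataflow JSON into per-query `.pq` files, can be sketched as below. The sketch assumes the export stores the whole M section document as one string under `pbi:mashup` → `document`, with each query introduced by `shared Name =`; the helper name `split_queries` and the sample payload are hypothetical. The split is regex-based rather than a real M parser, so it ignores edge cases such as annotations placed before `shared`.

```python
import json
import re

# Start of each shared query: `shared Name =` or `shared #"Quoted Name" =`.
SHARED = re.compile(r'^shared\s+(#"(?:[^"]|"")+"|[A-Za-z_][\w.]*)\s*=', re.MULTILINE)


def split_queries(section_document):
    """Split an M section document into {query_name: query_body}."""
    matches = list(SHARED.finditer(section_document))
    queries = {}
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(section_document)
        name = m.group(1)
        if name.startswith('#"'):
            name = name[2:-1].replace('""', '"')  # unquote #"..." identifiers
        # Drop surrounding whitespace and the trailing statement terminator.
        queries[name] = section_document[m.end():end].strip().rstrip(";").strip()
    return queries


# Hypothetical export payload, trimmed to the one field this sketch reads.
model_json = json.dumps({
    "name": "Sales",
    "pbi:mashup": {
        "document": (
            "section Section1;\r\n"
            'shared Orders = let\r\n    Source = Csv.Document(File.Contents("orders.csv"))\r\nin\r\n    Source;\r\n'
            'shared #"Order Lines" = let\r\n    Source = Orders\r\nin\r\n    Source;\r\n'
        )
    },
})
document = json.loads(model_json)["pbi:mashup"]["document"]
pq_files = {f"{name}.pq": body for name, body in split_queries(document).items()}
```

Each entry in `pq_files` could then be written to disk with `pathlib.Path(filename).write_text(body)`, after sanitizing query names that contain characters not allowed in file names.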

fabric-cli-runner

/fabric-cli-runner

Execute Fabric CLI (fab) commands for notebook deployment, execution, and workspace management. Use when deploying notebooks to Fabric, running notebooks, listing workspace items, or managing Fabric resources. Supports authentication, import/export, synchronous job execution, and JSON output.

fabric-lakehouse-reader

/fabric-lakehouse-reader

Query Fabric lakehouse SQL analytics endpoints with read-only access. Validate data after notebook execution, inspect table schemas, run row count checks, and export results to CSV. Use when validating pipeline output, debugging data issues, or exporting query results from Fabric lakehouses. Requires ODBC Driver 18 and Azure authentication.
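The ODBC Driver 18 requirement and the read-only stance described above suggest something like the following sketch. The connection-string keywords are standard for the Microsoft ODBC driver, but the helper names, the endpoint host, and the coarse SELECT-only guard are assumptions and not the skill's actual implementation. The block stops short of opening a connection so that it runs without the driver installed; with `pyodbc` available, `pyodbc.connect(conn_str)` would be the next step.

```python
import re


def build_connection_string(sql_endpoint, lakehouse, auth="ActiveDirectoryInteractive"):
    """Assemble an ODBC Driver 18 connection string for a SQL analytics endpoint."""
    return (
        "Driver={ODBC Driver 18 for SQL Server};"
        f"Server={sql_endpoint},1433;"
        f"Database={lakehouse};"
        "Encrypt=yes;TrustServerCertificate=no;"
        f"Authentication={auth};"
    )


def assert_read_only(sql):
    """Coarse guard: allow a single statement that starts with SELECT."""
    stripped = sql.strip().rstrip(";").strip()
    if ";" in stripped or not re.match(r"(?i)select\b", stripped):
        raise ValueError("only single SELECT statements are allowed")
    return stripped


def row_count_query(table):
    """Build a row-count query for a bare table name."""
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table):
        raise ValueError(f"unexpected table name: {table!r}")
    return assert_read_only(f"SELECT COUNT(*) AS row_count FROM [{table}]")


# Hypothetical endpoint host; the real value comes from the lakehouse settings.
conn_str = build_connection_string("example.datawarehouse.fabric.microsoft.com", "lh_silver")
query = row_count_query("orders")
```

A text-level guard like this is only a convenience; the dependable protection is that the identity used to connect holds read-only permissions on the endpoint.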

fabric-notebook-deployer

/fabric-notebook-deployer

Deploy multiple .ipynb notebook files to a Microsoft Fabric workspace in batch. Wraps the Fabric REST API (via `fab api`) for createOrUpdate notebook operations with optional folder placement. Supports dry-run, glob patterns, retry on rate-limit, and structured JSON output. Use when migrating multiple notebooks at once or as part of an orchestrated deployment pipeline.
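The "retry on rate-limit" behavior mentioned above is sketched here in isolation. It assumes rate limiting shows up as HTTP 429 with an optional `Retry-After` header given in seconds; the function name `send_with_retry`, its parameters, and the shape of the result dictionary are hypothetical. The actual request is injected as a callable, so the sketch makes no claims about how the skill invokes `fab api`.

```python
import time


def send_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    send() must return a (status_code, headers, body) tuple.
    """
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts:
            return {"status": status, "attempts": attempt, "body": body}
        # Prefer the server's Retry-After hint; otherwise back off exponentially.
        retry_after = headers.get("Retry-After")
        delay = float(retry_after) if retry_after else base_delay * 2 ** (attempt - 1)
        sleep(delay)


# Simulated responses: two rate-limit replies, then success.
responses = iter([(429, {"Retry-After": "2"}, ""), (429, {}, ""), (200, {}, "ok")])
waits = []
result = send_with_retry(lambda: next(responses), sleep=waits.append)
```

Injecting `sleep` keeps the example fast to run and makes the chosen delays easy to inspect: here `waits` ends up recording the two pauses taken before the successful third attempt.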

fabric-preflight-check

/fabric-preflight-check

Pre-flight validation for the fabric-dataflow-migration-toolkit. Verifies that the Fabric CLI (`fab`) is installed and authenticated, that Azure auth is current, and (optionally) that a target workspace + lakehouses exist and are accessible. Use BEFORE running long migrations to fail fast on auth/config issues. Outputs human-readable status or a JSON envelope for orchestrator integration.
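The "is the CLI installed" part of this check, together with a JSON envelope for the orchestrator, could be sketched as follows. The function name `preflight` and the envelope fields are assumptions; authentication and workspace checks are left out because they depend on CLI commands this page does not spell out.

```python
import json
import shutil


def preflight(required_tools=("fab",), which=shutil.which):
    """Report which required executables are on PATH as a JSON-ready envelope."""
    checks = [
        {"check": f"{tool} on PATH", "ok": which(tool) is not None}
        for tool in required_tools
    ]
    return {"ok": all(c["ok"] for c in checks), "checks": checks}


print(json.dumps(preflight(), indent=2))
```

Failing fast on a missing executable costs milliseconds, which is the point of running it before a long multi-stage migration.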

Hooks (1)

Event Hooks

Bash

File writes

3 hooks across 2 events

Stats

Version: 0.6.0

Language: Jupyter Notebook

Stars: 0

Maintenance: Excellent

License: MIT

Last Commit: Jun 11, 2026

Added: May 14, 2026


Available In

OneDayBI-Marketplace

Safety Signals

Caution

Executes bash commands

Hook triggers when Bash tool is used

Modifies files

Hook triggers on file write and edit operations

Setup

Configuration

This plugin requires configuration values that are prompted when the plugin is enabled. Sensitive values are stored in your system keychain.

gold_lakehouse

Name of the gold layer lakehouse (business-ready). Default: lh_gold

${user_config.gold_lakehouse}

azure_client_id

Azure Entra ID client/application ID. Required for service principal auth.

${user_config.azure_client_id}

azure_tenant_id

Azure Entra ID tenant ID. Required for service principal auth; leave empty for interactive auth.

${user_config.azure_tenant_id}

notebook_engine

Compute engine for generated notebooks: 'pyspark' (default) emits Spark-cluster notebooks (synapse_pyspark kernel) for larger/distributed workloads; 'python' emits single-node notebooks (jupyter kernel, polars/duckdb/delta-rs) for low-volume workloads. One engine per migration. See the decision matrix in the README.

${user_config.notebook_engine}

bronze_lakehouse

Name of the bronze layer lakehouse (raw ingestion). Default: lh_bronze

${user_config.bronze_lakehouse}

silver_lakehouse

Name of the silver layer lakehouse (cleaned/conformed). Default: lh_silver

${user_config.silver_lakehouse}

azure_client_secret

sensitive

Azure Entra ID client secret. Stored in system keychain. Required for service principal auth.

${user_config.azure_client_secret}

fabric_workspace_id

GUID of the target Fabric workspace where notebooks will be deployed. Find it in the Fabric portal URL.

${user_config.fabric_workspace_id}

source_workspace_id

GUID of the Power BI workspace containing the Dataflow Gen1 dataflows to migrate. Leave empty to specify at runtime.

${user_config.source_workspace_id}

fabric_workspace_name

Display name of the target Fabric workspace (e.g., 'Analytics Dev').

${user_config.fabric_workspace_name}

report_unknown_patterns

Controls whether the plugin files a GitHub issue against the plugin repo when the m-query-analyst detects an M pattern that is not in the plugin's risk catalog, so that the pattern can be added to a future release. Filing always requires an explicit per-pattern preview and approval. Values: 'never' (default; patterns are recorded only in your local _Documentation/conversion-backlog.md), 'ask' (the migration-analyst asks on each run), 'always' (the report-unknown-patterns skill is prompted automatically at Stage 13 with a sanitization preview). Connection strings, GUIDs, file paths, and sheet names are auto-redacted before any issue is filed; you still review and approve each pattern individually.

${user_config.report_unknown_patterns}
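The non-secret settings above, with their documented defaults and allowed values, can be summarized in a small validation sketch. The class name `MigrationConfig` and the validation rules are assumptions about how one might check these values; the plugin itself prompts for them and keeps sensitive ones in the system keychain, so the Azure credentials are deliberately left out here.

```python
import uuid
from dataclasses import dataclass

ENGINES = {"pyspark", "python"}
REPORT_MODES = {"never", "ask", "always"}


@dataclass
class MigrationConfig:
    fabric_workspace_id: str
    fabric_workspace_name: str = ""
    source_workspace_id: str = ""  # empty means "specify at runtime"
    bronze_lakehouse: str = "lh_bronze"
    silver_lakehouse: str = "lh_silver"
    gold_lakehouse: str = "lh_gold"
    notebook_engine: str = "pyspark"
    report_unknown_patterns: str = "never"

    def __post_init__(self):
        uuid.UUID(self.fabric_workspace_id)  # raises ValueError if not a GUID
        if self.source_workspace_id:
            uuid.UUID(self.source_workspace_id)
        if self.notebook_engine not in ENGINES:
            raise ValueError(f"notebook_engine must be one of {sorted(ENGINES)}")
        if self.report_unknown_patterns not in REPORT_MODES:
            raise ValueError(
                f"report_unknown_patterns must be one of {sorted(REPORT_MODES)}"
            )


# Placeholder GUID for illustration only.
cfg = MigrationConfig(fabric_workspace_id="00000000-0000-0000-0000-000000000000")
```

Validating at construction time means a mistyped engine name or a malformed workspace GUID surfaces before any extraction or deployment stage runs.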

README

Fabric Dataflow Migration Toolkit

End-to-end Power BI Dataflow Gen1 → Microsoft Fabric medallion notebook migration. Export Gen1 dataflows, analyze M code, generate bronze + silver notebooks — distributed PySpark or single-node Python (polars/delta-rs) — deploy to Fabric, and validate, in one Claude Code session.

A Claude Code plugin that automates the full Gen1-to-Fabric migration workflow: extract dataflow definitions from a Power BI workspace, analyze every M query for conversion risks, scaffold a medallion lakehouse project, generate .ipynb notebooks for the bronze and silver layers on your chosen compute engine, deploy via the Fabric CLI, and produce a validation report. Pick PySpark (synapse_pyspark kernel) for distributed workloads or Python (jupyter kernel, polars/duckdb/delta-rs) for low-volume single-node workloads — one engine per migration. Ships with bundled sample dataflows and --dry-run support so you can try the full pipeline without a Fabric workspace.

⚠️ Status: Technology Demonstration & Teaching Tool

This plugin is primarily a technology demonstration and teaching tool — built to show how Claude Code plugins can orchestrate a multi-stage data engineering migration end-to-end. It packages a working reference workflow into an installable plugin and applies hard-won lessons from the companion dbt-pipeline-toolkit plugin.

Do NOT use this plugin in production without thorough validation. Specifically:

Review every generated notebook before deployment. The m-to-pyspark-converter produces best-effort drafts; risky patterns (Excel.Workbook, AzureStorage.Blobs, custom M functions, synthetic IDs) are wrapped in `# === HIGH RISK / HUMAN REVIEW REQUIRED ===` isolation cells precisely because they need human judgment.
Run against --sample --dry-run first, then a non-production workspace, then a staging workspace, before any production migration.
Validate row counts, schemas, and business logic against the original Dataflow Gen1 outputs. The plugin's validator checks structural shape and basic non-zero rows; it does not verify business correctness.
Test refresh schedules and downstream dependencies (semantic models, reports) before decommissioning the source dataflows.
Treat the bundled risk catalog (30 patterns) as a starting point, not a guarantee. Unknown M patterns are auto-tracked in _Documentation/conversion-backlog.md for follow-up.
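To support the review step in the list above, a reviewer could list the cells that carry the human-review marker before deploying anything. The sketch below is one way to do that; the marker text is quoted from this README, while the function name `cells_needing_review` and the sample notebook are hypothetical.

```python
import json

REVIEW_MARKER = "# === HIGH RISK / HUMAN REVIEW REQUIRED ==="


def cells_needing_review(ipynb_text):
    """Return the zero-based indexes of code cells that carry the review marker."""
    nb = json.loads(ipynb_text)
    flagged = []
    for index, cell in enumerate(nb.get("cells", [])):
        source = "".join(cell.get("source", []))
        if cell.get("cell_type") == "code" and REVIEW_MARKER in source:
            flagged.append(index)
    return flagged


# Hypothetical generated notebook with one isolated high-risk cell.
sample = json.dumps({
    "cells": [
        {"cell_type": "code", "source": ["df = read_bronze('orders')\n"]},
        {"cell_type": "code", "source": [REVIEW_MARKER + "\n", "# Excel.Workbook conversion\n"]},
    ]
})
print(cells_needing_review(sample))  # [1]
```

Running a scan like this over every generated notebook gives a checklist of cells to sign off on, which fits the human-review loop the README asks for.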

The plugin is suitable for: learning the migration pattern, demonstrating agentic data engineering, prototyping Fabric medallion projects, accelerating manual migrations with a strong human-review loop. It is not suitable for unattended production cutover.

Why this plugin

Microsoft marked Dataflow Gen1 as Legacy in April 2026. The official migration paths target Dataflow Gen2 — but if your target architecture is Fabric medallion lakehouses + notebooks (the recommended pattern for new Fabric workloads), there is no first-party migration path. This plugin fills that gap, and lets you choose the compute engine per migration: PySpark for distributed workloads or single-node Python (polars/duckdb/delta-rs) for low-volume ones where spinning up a Spark cluster is overkill.

Features

Dual compute engine — generate bronze/silver notebooks as distributed PySpark (synapse_pyspark kernel) or single-node Python (jupyter kernel, polars/duckdb/delta-rs); chosen once per migration via the notebook_engine config or --engine flag, with an engine-aware validator and structure hook enforcing the right idioms on each path
6 Agents — orchestrator, mechanical M analyst, interactive migration analyst, engine-aware bronze + silver builders, end-to-end validator
9 Skills — Dataflow Gen1 extractor, M converter (PySpark or Python target), Fabric CLI runner, lakehouse reader, project initializer, data profiler, notebook deployer, pre-flight check, opt-in pattern-sharing reporter
3 Hooks — Bash auto-approval for plugin scripts, engine-aware structural validation, session-start config check
Orchestrator-as-main-agent launch — single claude --agent ... invocation drives the full 13-stage pipeline
Dry-run mode — full pipeline without Fabric access, using bundled sample dataflows
30 documented M-conversion risk patterns — best-effort notebook output with explicit human-review markers
Reference materials — per-engine style guides, notebook templates, Delta Lake patterns, M-to-PySpark/polars mappings, risk catalog

Requirements

Before installing, make sure you have:

View full README on GitHub
