By KavasiMihaly
End-to-end Power BI Dataflow Gen1 to Microsoft Fabric medallion notebook migration. Export, analyze, and generate bronze/silver notebooks on either engine — distributed PySpark (synapse_pyspark) or single-node Python (polars/duckdb/delta-rs) — then deploy and validate in Fabric.
Build bronze layer PySpark **or** Python notebooks that ingest raw data into Fabric lakehouses with Delta Lake format. Handle OData, Excel-via-SharePoint, CSV, Parquet, JSON, and API sources. Add ingestion metadata columns, enable schema evolution, and implement append-only audit trails. The `engine` input (`pyspark` default | `python`) selects the codegen idiom: PySpark/Spark-session or single-node polars + delta-rs. Output is always `.ipynb` (Jupyter JSON) for Fabric deployment, never `.py`. MUST BE USED when creating the first ingestion layer in a medallion architecture.
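As a rough illustration of how the `engine` input could select a kernel while the output stays `.ipynb` JSON, here is a minimal sketch. The kernel names come from the plugin description; `build_notebook` and the exact notebook layout are hypothetical and only follow the generic nbformat 4 structure, not the builders' actual output.

```python
import json

# Kernel names are taken from the plugin description. Everything else is
# a generic nbformat-4 layout and an assumption about what builders emit.
KERNELS = {"pyspark": "synapse_pyspark", "python": "jupyter"}


def build_notebook(engine, cell_sources):
    """Return a minimal nbformat-4 notebook dict for the chosen engine."""
    if engine not in KERNELS:
        raise ValueError(f"unknown engine: {engine!r}")
    cells = [
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            # nbformat stores cell source as a list of lines
            "source": source.splitlines(keepends=True),
        }
        for source in cell_sources
    ]
    return {
        "cells": cells,
        "metadata": {
            "kernelspec": {
                "name": KERNELS[engine],
                "display_name": KERNELS[engine],
            },
            "language_info": {"name": "python"},
        },
        "nbformat": 4,
        "nbformat_minor": 4,
    }


notebook = build_notebook("pyspark", ["# bronze ingestion cell\nprint('ingest')\n"])
ipynb_text = json.dumps(notebook, indent=1)  # this text is what a .ipynb file holds
```

Passing `"python"` instead of `"pyspark"` changes only the kernelspec in this sketch; the real builders also switch the code idiom inside the cells.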
End-to-end Power BI Dataflow Gen1 → Microsoft Fabric medallion notebook migration orchestrator. Drives the full workflow: extract Gen1 dataflows, analyze M code (m-query-analyst), gather refactor decisions (migration-analyst), scaffold the Fabric medallion project, generate bronze + silver PySpark notebooks via builders, deploy via the Fabric CLI, validate (fabric-pipeline-validator), and maintain a single migration-design.md document that every stage reads and updates. MUST BE USED as the top-level agent for end-to-end Gen1 migration. Requires THREE user touch points: config Q&A, refactor Q&A, and a plan-mode approval. Run via `claude --agent fabric-dataflow-migration-toolkit:fabric-migration-orchestrator:fabric-migration-orchestrator`.
End-to-end Fabric migration validator. Invoked by the orchestrator at Stage 12. Runs static checks on every generated .ipynb notebook (valid JSON, lakehouse binding, read_bronze() contract for silver) and runtime checks against deployed lakehouses (row counts, schema match) when not in dry-run mode. Writes Section 10 (Validation Results) of `1 - Documentation/migration-design.md`. Does NOT write any other file.
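The static checks named above (valid JSON, lakehouse binding, read_bronze() contract for silver) might look roughly like the sketch below. It assumes the lakehouse binding is stored under `metadata.dependencies.lakehouse` in the notebook JSON, and `static_check` is a hypothetical name, not the validator's actual interface.

```python
import json


def static_check(ipynb_text, layer):
    """Return a list of problems for one notebook; an empty list means it passed."""
    try:
        nb = json.loads(ipynb_text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(nb, dict):
        return ["notebook root is not a JSON object"]

    problems = []

    # Assumption: the lakehouse binding lives at metadata.dependencies.lakehouse.
    lakehouse = nb.get("metadata", {}).get("dependencies", {}).get("lakehouse")
    if not lakehouse:
        problems.append("no lakehouse binding")

    if layer == "silver":
        # Concatenate all code cells; source may be a string or a list of lines.
        code = "".join(
            "".join(cell.get("source", []))
            for cell in nb.get("cells", [])
            if cell.get("cell_type") == "code"
        )
        if "read_bronze(" not in code:
            problems.append("silver notebook never calls read_bronze()")
    return problems
```

A plain substring test for `read_bronze(` is deliberately crude; it shows the shape of the contract check without claiming to prove that silver reads only from bronze.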
Build silver layer notebooks (PySpark or Python engine) that clean, conform, and transform bronze Delta tables into analysis-ready datasets. Silver notebooks read exclusively from bronze tables via read_bronze() — never from external storage. Handle type casting, renaming, null handling, deduplication, and unpivot/pivot transforms. Use when creating the second transformation layer in a medallion architecture.
Mechanical Power Query M analysis. Parses .pq files exported from Power BI Dataflow Gen1 dataflows, classifies each query (output_entity / staging / transformation / parameter / helper), detects source type (CSV, Excel, AzureStorage, SQL Server, etc.), builds a dependency map, and scans for known M-conversion risk patterns. Produces JSON envelopes for the orchestrator to merge into the migration design document. NO USER INTERACTION — pure mechanical analysis. Spawned in two passes: inventory (Stage 3) and risk scan (Stage 4).
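To make the source-type detection and dependency map concrete, the following is a naive sketch. The M function names (`Csv.Document`, `Excel.Workbook`, `AzureStorage.Blobs`, `Sql.Database`, `OData.Feed`) are real connectors; `detect_source`, `build_dependency_map`, and the text-matching approach are assumptions for illustration, not the analyst's actual implementation.

```python
import re

# Real M connector functions mapped to illustrative source labels.
SOURCE_FUNCTIONS = {
    "Csv.Document": "CSV",
    "Excel.Workbook": "Excel",
    "AzureStorage.Blobs": "AzureStorage",
    "Sql.Database": "SQL Server",
    "OData.Feed": "OData",
}


def detect_source(m_code):
    """Return the first matching source label, or None for pure transformations."""
    for func, label in SOURCE_FUNCTIONS.items():
        if func + "(" in m_code:
            return label
    return None


def build_dependency_map(queries):
    """queries maps name -> M code; returns name -> sorted names it references.

    Naive: matches identifiers textually, so a query name that appears
    inside a string literal would be counted as a reference.
    """
    deps = {}
    for name, code in queries.items():
        refs = []
        for other in queries:
            if other == name:
                continue
            pattern = r"(?<![\w.])" + re.escape(other) + r"(?![\w.])"
            if re.search(pattern, code):
                refs.append(other)
        deps[name] = sorted(refs)
    return deps


queries = {
    "RawSales": 'let Source = Csv.Document(Web.Contents("https://example.com/data.csv")) in Source',
    "Sales": "let Source = RawSales, Typed = Table.TransformColumnTypes(Source, {}) in Typed",
}
dependency_map = build_dependency_map(queries)
```

In this example `RawSales` would be a staging query with a CSV source and `Sales` a downstream query that depends on it.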
Extract Dataflow Gen1 (Power BI Dataflows) definitions from a Fabric/Power BI workspace and parse them into individual M query (.pq) files. Use when the user wants to: (1) export dataflow definitions from a Power BI workspace, (2) extract M code / Power Query code from Dataflow Gen1, (3) migrate Dataflow Gen1 to notebooks or Dataflow Gen2, (4) inventory all queries in a workspace's dataflows, (5) parse exported dataflow JSON files into .pq files. Triggers on mentions of "dataflow gen1", "export dataflow", "extract M code from dataflow", "dataflow migration", "Power BI dataflow export".
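Parsing an exported dataflow into individual queries could be sketched as follows. This assumes the export JSON keeps the combined M section document at `["pbi:mashup"]["document"]` and that each query begins with `shared <name> =` at the start of a line. Both the function names and the splitting rule are illustrative and would miss edge cases such as annotated queries.

```python
import json
import re

# Matches the start of each query: shared Name = ...  or  shared #"Quoted Name" = ...
QUERY_START = re.compile(r'^shared\s+(#"[^"]+"|\w+)\s*=\s*', re.MULTILINE)


def split_section_document(document):
    """Split an M section document into {query name: query body}."""
    matches = list(QUERY_START.finditer(document))
    queries = {}
    for i, match in enumerate(matches):
        # A query body runs until the next "shared" line or the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(document)
        body = document[match.end():end].strip()
        if body.endswith(";"):
            body = body[:-1].rstrip()
        name = match.group(1)
        if name.startswith('#"'):
            name = name[2:-1]  # drop the #"..." quoting
        queries[name] = body
    return queries


def queries_from_export(export_json):
    # Assumption: the mashup document sits at ["pbi:mashup"]["document"].
    model = json.loads(export_json)
    return split_section_document(model["pbi:mashup"]["document"])


document = (
    "section Section1;\r\n"
    "shared Orders = let\r\n    Source = 1\r\nin\r\n    Source;\r\n"
    'shared #"Order Lines" = let Source = Orders in Source;\r\n'
)
parsed = queries_from_export(json.dumps({"pbi:mashup": {"document": document}}))
```

Each entry in `parsed` could then be written to its own `.pq` file, one per query name.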
Execute Fabric CLI (fab) commands for notebook deployment, execution, and workspace management. Use when deploying notebooks to Fabric, running notebooks, listing workspace items, or managing Fabric resources. Supports authentication, import/export, synchronous job execution, and JSON output.
Query Fabric lakehouse SQL analytics endpoints with read-only access. Validate data after notebook execution, inspect table schemas, run row count checks, and export results to CSV. Use when validating pipeline output, debugging data issues, or exporting query results from Fabric lakehouses. Requires ODBC Driver 18 and Azure authentication.
Deploy multiple .ipynb notebook files to a Microsoft Fabric workspace in batch. Wraps the Fabric REST API (via `fab api`) for createOrUpdate notebook operations with optional folder placement. Supports dry-run, glob patterns, retry on rate-limit, and structured JSON output. Use when migrating multiple notebooks at once or as part of an orchestrated deployment pipeline.
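The retry-on-rate-limit and dry-run behavior described above can be pictured with a small exponential-backoff loop. This is a sketch under assumptions: `RateLimited`, `deploy_with_retry`, and the `deploy` callable are hypothetical stand-ins, and no real `fab api` call is made here.

```python
import time


class RateLimited(Exception):
    """Hypothetical error a deploy callable raises on an HTTP 429 response."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after


def deploy_with_retry(deploy, paths, max_attempts=4, base_delay=1.0,
                      sleep=time.sleep, dry_run=False):
    """Call deploy(path) for each path and return one result dict per path."""
    results = []
    for path in paths:
        if dry_run:
            results.append({"path": path, "status": "dry-run", "attempts": 0})
            continue
        for attempt in range(1, max_attempts + 1):
            try:
                deploy(path)
                results.append({"path": path, "status": "ok", "attempts": attempt})
                break
            except RateLimited as exc:
                if attempt == max_attempts:
                    results.append({"path": path, "status": "failed", "attempts": attempt})
                    break
                # Prefer a server-provided delay, otherwise back off 1s, 2s, 4s, ...
                if exc.retry_after is not None:
                    delay = exc.retry_after
                else:
                    delay = base_delay * 2 ** (attempt - 1)
                sleep(delay)
    return results
```

Injecting `sleep` keeps the loop testable without real waiting, and the list of result dicts serializes directly to structured JSON output.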
Pre-flight validation for the fabric-dataflow-migration-toolkit. Verifies that the Fabric CLI (`fab`) is installed and authenticated, that Azure auth is current, and (optionally) that a target workspace + lakehouses exist and are accessible. Use BEFORE running long migrations to fail fast on auth/config issues. Outputs human-readable status or a JSON envelope for orchestrator integration.
Executes bash commands
Hook triggers when Bash tool is used
Modifies files
Hook triggers on file write and edit operations
This plugin requires configuration values that are prompted when the plugin is enabled. Sensitive values are stored in your system keychain.
gold_lakehouse (${user_config.gold_lakehouse}): Name of the gold layer lakehouse (business-ready). Default: lh_gold
azure_client_id (${user_config.azure_client_id}): Azure Entra ID client/application ID. Required for service principal auth.
azure_tenant_id (${user_config.azure_tenant_id}): Azure Entra ID tenant ID. Required for service principal auth; leave empty for interactive auth.
notebook_engine (${user_config.notebook_engine}): Compute engine for generated notebooks: 'pyspark' (default) emits Spark-cluster notebooks (synapse_pyspark kernel) for larger/distributed workloads; 'python' emits single-node notebooks (jupyter kernel, polars/duckdb/delta-rs) for low-volume workloads. One engine per migration. See the decision matrix in the README.
bronze_lakehouse (${user_config.bronze_lakehouse}): Name of the bronze layer lakehouse (raw ingestion). Default: lh_bronze
silver_lakehouse (${user_config.silver_lakehouse}): Name of the silver layer lakehouse (cleaned/conformed). Default: lh_silver
azure_client_secret (${user_config.azure_client_secret}): Azure Entra ID client secret. Stored in system keychain. Required for service principal auth.
fabric_workspace_id (${user_config.fabric_workspace_id}): GUID of the target Fabric workspace where notebooks will be deployed. Find it in the Fabric portal URL.
source_workspace_id (${user_config.source_workspace_id}): GUID of the Power BI workspace containing the Dataflow Gen1 dataflows to migrate. Leave empty to specify at runtime.
fabric_workspace_name (${user_config.fabric_workspace_name}): Display name of the target Fabric workspace (e.g., 'Analytics Dev').
report_unknown_patterns (${user_config.report_unknown_patterns}): When the m-query-analyst detects an M pattern not in the plugin's risk catalog, should the plugin (with explicit per-pattern preview + approval) file a GitHub issue against the plugin repo so the pattern can be added to a future release? Values: 'never' (default; patterns are recorded only in your local _Documentation/conversion-backlog.md), 'ask' (migration-analyst asks each run), 'always' (auto-prompt the report-unknown-patterns skill at Stage 13 with sanitization preview). Connection strings, GUIDs, file paths, and sheet names are auto-redacted before any issue is filed; you still review and approve each pattern individually.
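The auto-redaction mentioned for report_unknown_patterns could work along the lines of the sketch below. It covers only URLs, Windows-style file paths, and GUIDs; the regular expressions, the placeholder tokens, and the `redact` function are assumptions for illustration and do not reflect the plugin's actual sanitization rules.

```python
import re

# Illustrative patterns only; connection strings and sheet names
# would need additional rules.
URL = re.compile(r"https?://[^\s\"')]+")
WINDOWS_PATH = re.compile(r"[A-Za-z]:\\[^\s\"']+")
GUID = re.compile(r"\b[0-9a-fA-F]{8}-(?:[0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}\b")


def redact(snippet):
    """Replace URLs, Windows paths, and GUIDs in an M snippet with placeholders."""
    snippet = URL.sub("<URL>", snippet)          # URLs first: they may embed GUIDs
    snippet = WINDOWS_PATH.sub("<PATH>", snippet)
    snippet = GUID.sub("<GUID>", snippet)
    return snippet


preview = redact('Web.Contents("https://contoso.sharepoint.com/sites/x/file.xlsx")')
```

A preview like this is what a user would review before approving an individual pattern report.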
Uses power tools
Uses Bash, Write, or Edit tools
End-to-end Power BI Dataflow Gen1 → Microsoft Fabric medallion notebook migration. Export Gen1 dataflows, analyze M code, generate bronze + silver notebooks — distributed PySpark or single-node Python (polars/delta-rs) — deploy to Fabric, and validate, in one Claude Code session.
A Claude Code plugin that automates the full Gen1-to-Fabric migration workflow: extract dataflow definitions from a Power BI workspace, analyze every M query for conversion risks, scaffold a medallion lakehouse project, generate .ipynb notebooks for the bronze and silver layers on your chosen compute engine, deploy via the Fabric CLI, and produce a validation report. Pick PySpark (synapse_pyspark kernel) for distributed workloads or Python (jupyter kernel, polars/duckdb/delta-rs) for low-volume single-node workloads — one engine per migration. Ships with bundled sample dataflows and --dry-run support so you can try the full pipeline without a Fabric workspace.
This plugin is primarily a technology demonstration and teaching tool — built to show how Claude Code plugins can orchestrate a multi-stage data engineering migration end-to-end. It packages a working reference workflow into an installable plugin and applies hard-won lessons from the companion dbt-pipeline-toolkit plugin.
Do NOT use this plugin in production without thorough validation. Specifically:
m-to-pyspark-converter produces best-effort drafts; risky patterns (Excel.Workbook, AzureStorage.Blobs, custom M functions, synthetic IDs) are wrapped in # === HIGH RISK / HUMAN REVIEW REQUIRED === isolation cells precisely because they need human judgment.
Run with --sample --dry-run first, then a non-production workspace, then a staging workspace, before any production migration.
Unrecognized M patterns are recorded in _Documentation/conversion-backlog.md for follow-up.
The plugin is suitable for: learning the migration pattern, demonstrating agentic data engineering, prototyping Fabric medallion projects, and accelerating manual migrations with a strong human-review loop. It is not suitable for unattended production cutover.
Microsoft marked Dataflow Gen1 as Legacy in April 2026. The official migration paths target Dataflow Gen2 — but if your target architecture is Fabric medallion lakehouses + notebooks (the recommended pattern for new Fabric workloads), there is no first-party migration path. This plugin fills that gap, and lets you choose the compute engine per migration: PySpark for distributed workloads or single-node Python (polars/duckdb/delta-rs) for low-volume ones where spinning up a Spark cluster is overkill.
PySpark (synapse_pyspark kernel) or single-node Python (jupyter kernel, polars/duckdb/delta-rs); chosen once per migration via the notebook_engine config or --engine flag, with an engine-aware validator and structure hook enforcing the right idioms on each path
A claude --agent ... invocation drives the full 13-stage pipeline
Before installing, make sure you have:
npx claudepluginhub kavasimihaly/ai-plugins --plugin fabric-dataflow-migration-toolkit