Initialize a new data engineering project with the complete folder structure, dbt configuration, Python virtual environment, and CLAUDE.md for agentic development. Use when starting a new analytics/BI project, creating a data pipeline repository, or scaffolding a dbt + Power BI project from scratch. Sets up everything needed for personal agents and skills to start implementing pipelines.
Slash command: /dbt-pipeline-toolkit:dbt-project-initializer
Initialize a complete data engineering project with the standard folder structure, environment setup, and agentic development configuration.
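This skill creates a fully configured project structure for dbt + SQL Server + Power BI development workflows. It sets up the folder structure, dbt configuration, a Python virtual environment, and a CLAUDE.md for agentic development.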
Run the initialization script interactively:
python "${CLAUDE_PLUGIN_ROOT}/skills/dbt-project-initializer/scripts/initialize_project.py" --target "C:\path\to\new\project"
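The script will prompt for values not given on the command line, such as the project name and description (marked "Prompted" in the parameter table).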
Provide all parameters via command line:
python "${CLAUDE_PLUGIN_ROOT}/skills/dbt-project-initializer/scripts/initialize_project.py" \
--target "C:\path\to\new\project" \
--name "sales_analytics" \
--database "SalesDB" \
--schema "raw" \
--description "Sales performance analytics pipeline"
| Parameter | Required | Default | Description |
|---|---|---|---|
| --target | Yes | - | Target directory for the new project |
| --name | No | Prompted | Project name (lowercase, underscores) |
| --database | Yes | - | SQL Server database name |
| --schema | No | "raw" | Schema for source data loaded via sql-executor |
| --dbt-schema | No | "dbo" | dbt default schema prefix (see Schema Convention below) |
| --description | No | Prompted | Project description |
| --skip-venv | No | False | Skip virtual environment creation |
| --skip-deps | No | False | Skip dependency installation |
| --force, -f | No | False | Force initialization even if target directory is not empty |
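To make the parameter table concrete, here is a minimal sketch of an argparse parser that accepts the same flags with the same defaults. It is an illustration only: the source of initialize_project.py is not shown here, so `build_parser`, `project_name`, and `NAME_PATTERN` are hypothetical names, and the real script may validate or prompt differently. The sketch follows the table literally, so `--database` is marked required.

```python
import argparse
import re

# Hypothetical reading of "lowercase, underscores"; the real script's rule may differ.
NAME_PATTERN = re.compile(r"^[a-z][a-z0-9_]*$")


def project_name(value: str) -> str:
    """Reject names that are not lowercase letters, digits, and underscores."""
    if not NAME_PATTERN.fullmatch(value):
        raise argparse.ArgumentTypeError(
            f"invalid project name {value!r}: use lowercase letters, digits, underscores"
        )
    return value


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the flags and defaults in the parameter table."""
    parser = argparse.ArgumentParser(prog="initialize_project.py")
    parser.add_argument("--target", required=True, help="Target directory for the new project")
    # None stands for "prompt the user later", matching the "Prompted" default.
    parser.add_argument("--name", type=project_name, default=None)
    parser.add_argument("--database", required=True, help="SQL Server database name")
    parser.add_argument("--schema", default="raw")
    parser.add_argument("--dbt-schema", default="dbo")
    parser.add_argument("--description", default=None)
    parser.add_argument("--skip-venv", action="store_true")
    parser.add_argument("--skip-deps", action="store_true")
    parser.add_argument("--force", "-f", action="store_true")
    return parser


args = build_parser().parse_args(["--target", "demo", "--database", "SalesDB"])
print(args.schema, args.dbt_schema, args.force)  # prints: raw dbo False
```

Leaving `--name` and `--description` as `None` when omitted is one way to tell "not supplied" apart from an empty string, so a later step can decide whether to prompt.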
The project uses two schema parameters:
- --schema (default: raw): Where sql-executor loads source CSV data
  - Source tables land in the raw schema (e.g., raw.carbon_intensity)
- --dbt-schema (default: dbo): dbt's default profile target schema
  - The project's schema-name macro (generate_schema_name) uses each folder's +schema value verbatim (it does NOT prefix it), so final schemas are:
    - staging
    - intermediate
    - analytics
  - (The --dbt-schema value only applies to models that have no +schema of their own)

Final Schema Structure:
raw # Source data (sql-executor)
staging # Staging views (stg_*)
intermediate # Intermediate models (int_*)
analytics # Marts - facts and dimensions (fct_*, dim_*)
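The schema rule described above can be sketched in a few lines of Python. This is not the project's macro (that is Jinja inside dbt and is not shown here); `resolve_schema` is a hypothetical helper that only models the stated behavior: a folder's +schema is used as-is, and the --dbt-schema value is the fallback. For contrast, dbt's built-in default would concatenate the two, giving names like dbo_staging.

```python
from typing import Optional


def resolve_schema(default_schema: str, custom_schema: Optional[str]) -> str:
    """Return the schema a model is built in under the convention described above."""
    if custom_schema is None or not custom_schema.strip():
        # No +schema configured: fall back to the profile target schema (--dbt-schema).
        return default_schema
    # +schema configured: use it verbatim, with no "<default>_" prefix.
    return custom_schema.strip()


# Folder-to-schema pairs taken from the final schema structure listed above.
examples = [
    ("staging", "staging"),
    ("intermediate", "intermediate"),
    ("marts", "analytics"),
    ("no +schema set", None),
]
for folder, custom in examples:
    print(f"{folder} -> {resolve_schema('dbo', custom)}")
```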
ProjectName/
├── .claude/
│ └── settings.local.json # Auto-allows skills and safe bash commands
├── 0 - Architecture Setup/
│ ├── setup_environment.ps1 # Python environment setup
│ ├── project-config.yml # Project configuration
│ └── README.md # Setup documentation
├── 1 - Documentation/
│ └── data-profiles/ # Data profiler JSON outputs (profile_*.json)
├── 2 - Source Files/ # CSV source data (empty)
├── 3 - Data Pipeline/
│ ├── dbt_project.yml # dbt project config
│ ├── packages.yml # dbt packages (dbt_utils)
│ ├── profiles.yml # Generated connection profile (gitignored)
│ ├── profiles.yml.example # Profile template (committed)
│ ├── models/
│ │ ├── staging/ # stg_* models
│ │ ├── intermediate/ # int_* models
│ │ └── marts/ # dim_* and fct_* models
│ ├── tests/ # Custom / singular SQL tests
│ ├── macros/ # Custom macros
│ ├── seeds/ # Seed data
│ ├── snapshots/ # SCD Type 2 snapshots
│ └── analyses/ # Ad-hoc analyses (non-materialized)
├── 4 - Semantic Layer/ # Power BI TMDL + PBIP output (pbip-from-dbt lands here)
├── 5 - Report Building/ # Power BI reports (empty)
├── 6 - Data Exports/ # Query results (empty)
├── .venv/ # Python virtual environment
├── .gitignore # Git ignore file
└── CLAUDE.md # Project-specific agent config
All 3 - Data Pipeline/ subfolders are created even if empty because dbt_project.yml declares a path for each (model-paths, test-paths, macro-paths, seed-paths, snapshot-paths, analysis-paths). Missing any of them causes dbt parse to fail.
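If the parse step does fail on a missing path, a quick check like the following can show which folder is absent. This is a minimal sketch that assumes the layout in the tree above; `EXPECTED_SUBFOLDERS` and `missing_pipeline_dirs` are hypothetical names, not part of the skill.

```python
from pathlib import Path

# Subfolders listed under "3 - Data Pipeline/" in the tree above.
EXPECTED_SUBFOLDERS = ("models", "tests", "macros", "seeds", "snapshots", "analyses")


def missing_pipeline_dirs(project_root: Path) -> list[str]:
    """Return the expected dbt subfolders that are absent from the pipeline directory."""
    pipeline_dir = project_root / "3 - Data Pipeline"
    return [name for name in EXPECTED_SUBFOLDERS if not (pipeline_dir / name).is_dir()]


if __name__ == "__main__":
    # Run from the project root (the folder that contains "3 - Data Pipeline").
    missing = missing_pipeline_dirs(Path.cwd())
    if missing:
        print("Missing dbt folders:", ", ".join(missing))
    else:
        print("All dbt path folders are present.")
```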
The skill automatically creates a Python 3.12 virtual environment and installs:
dbt Dependencies:
Data Processing:
After initialization, complete these steps:
Configure dbt profile (run each command separately from the project root, since the paths below are relative to it):
cp "3 - Data Pipeline/profiles.yml.example" "3 - Data Pipeline/profiles.yml"
Edit profiles.yml with your connection details, then verify:
dbt debug --project-dir "3 - Data Pipeline"
Install dbt packages:
dbt deps --project-dir "3 - Data Pipeline"
Load source data:
Use the /sql-executor skill to load into database
Start development:
The generated CLAUDE.md configures the project for these agents:
| Agent | Purpose |
|---|---|
| dbt-staging-builder | Create stg_* models from sources |
| dbt-dimension-builder | Create dim_* tables |
| dbt-fact-builder | Create fct_* tables with incremental materialization |
| dbt-test-writer | Add comprehensive dbt tests |
| dbt-pipeline-validator | End-to-end validation |
| business-analyst | Requirements gathering |
Template files are located in the skill's templates/ directory:
- CLAUDE.md.template - Project CLAUDE.md template
- dbt_project.yml.template - dbt project template
- setup_environment.ps1 - Environment setup script
- gitignore.template - .gitignore template

Edit these to customize default configurations.
The skill copies reference materials from ${CLAUDE_PLUGIN_ROOT}/reference/ in this plugin. To add new reference materials:
${CLAUDE_PLUGIN_ROOT}/reference/
# Install Python 3.12
winget install Python.Python.3.12
# Verify installation
py -3.12 --version
# Force recreate (the path contains spaces, so quote it and use the call operator)
& ".\0 - Architecture Setup\setup_environment.ps1" -Force
# Skip venv, just update packages
& ".\0 - Architecture Setup\setup_environment.ps1" -SkipVenvCreation
dbt debug --project-dir "3 - Data Pipeline"
Check the installed ODBC drivers separately:
python -c "import pyodbc; print(pyodbc.drivers())"
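The one-liner above prints every installed ODBC driver. To narrow that to SQL Server drivers, a small filter can help. This is a sketch only: `sql_server_drivers` is a hypothetical helper, and it assumes the driver names contain "SQL Server" (as in "ODBC Driver 18 for SQL Server").

```python
def sql_server_drivers(installed: list[str]) -> list[str]:
    """Keep only driver names that look like Microsoft SQL Server ODBC drivers."""
    return [name for name in installed if "sql server" in name.lower()]


if __name__ == "__main__":
    try:
        import pyodbc  # assumed importable, as in the command above
    except ImportError:
        print("pyodbc is not installed in this environment.")
    else:
        matches = sql_server_drivers(pyodbc.drivers())
        print(matches or "No SQL Server ODBC driver found.")
```

An empty result suggests the driver itself is missing, which is a different problem from a wrong server name or credentials in profiles.yml.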
npx claudepluginhub kavasimihaly/ai-plugins --plugin dbt-pipeline-toolkit