Skill

dbt-project-initializer

Initialize a new data engineering project with the complete folder structure, dbt configuration, Python virtual environment, and CLAUDE.md for agentic development. Use when starting a new analytics/BI project, creating a data pipeline repository, or scaffolding a dbt + Power BI project from scratch. Sets up everything needed for personal agents and skills to start implementing pipelines.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/dbt-pipeline-toolkit:dbt-project-initializer

User invocable

Model invocable

Inline context

Default effort

Tool Access

This skill is limited to the following tools:

Bash, Read, Write, Edit, Glob

Supporting Files

scripts/initialize_project.py
scripts/reset_project.py
templates/data-project-CLAUDE.md
templates/macros/date_spine.sql

SKILL.md

234 lines · ~2.1k tokens

Stats

Language: Python

Stars: 1

Forks: 1

Maintenance: Excellent

Last Commit: Jun 1, 2026

Project Initializer

Initialize a complete data engineering project with the standard folder structure, environment setup, and agentic development configuration.

Overview

This skill creates a fully configured project structure for dbt + SQL Server + Power BI development workflows. It sets up:

Numbered folder structure (0-6) for organized development
Python virtual environment with all required packages
dbt project configuration templates
CLAUDE.md customized for the project
Reference materials for agents

Usage

Interactive Mode (Recommended)

Run the initialization script interactively:

python "${CLAUDE_PLUGIN_ROOT}/skills/dbt-project-initializer/scripts/initialize_project.py" --target "C:\path\to\new\project"

The script will prompt for:

Project name: Used for folder name and dbt project (e.g., "Sales Analytics")
Database name: SQL Server database name (required — no default)
Database schema: Default schema for raw data (default: "raw")
Description: Brief project description for documentation

Non-Interactive Mode

Provide all parameters via command line:

python "${CLAUDE_PLUGIN_ROOT}/skills/dbt-project-initializer/scripts/initialize_project.py" \
  --target "C:\path\to\new\project" \
  --name "sales_analytics" \
  --database "SalesDB" \
  --schema "raw" \
  --description "Sales performance analytics pipeline"
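
When the initializer is launched from another Python script instead of a shell, the non-interactive call above can be assembled as an argument list. This is a minimal sketch: `build_init_command` is a hypothetical helper, not part of the skill, and it uses only the script path and flags documented in this section.

```python
from pathlib import Path


def build_init_command(plugin_root, target, name, database,
                       schema="raw", description=None):
    """Assemble the argument list for a non-interactive initializer run.

    Hypothetical helper: only the script path and the flags come from
    this document; the function itself is illustrative.
    """
    script = (Path(plugin_root) / "skills" / "dbt-project-initializer"
              / "scripts" / "initialize_project.py")
    command = [
        "python", str(script),
        "--target", str(target),
        "--name", name,
        "--database", database,
        "--schema", schema,
    ]
    # --description is optional; the parameter table lists its default
    # as "Prompted", so it is only added when a value is given.
    if description is not None:
        command += ["--description", description]
    return command
```

The resulting list can be passed to `subprocess.run(command, check=True)`, which avoids shell quoting problems with paths that contain spaces.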

Parameters

| Parameter | Required | Default | Description |
|---|---|---|---|
| `--target` | Yes | - | Target directory for the new project |
| `--name` | No | Prompted | Project name (lowercase, underscores) |
| `--database` | Yes | - | SQL Server database name |
| `--schema` | No | "raw" | Schema for source data loaded via sql-executor |
| `--dbt-schema` | No | "dbo" | dbt default target schema, used only by models without their own +schema (see Schema Convention below) |
| `--description` | No | Prompted | Project description |
| `--skip-venv` | No | False | Skip virtual environment creation |
| `--skip-deps` | No | False | Skip dependency installation |
| `--force`, `-f` | No | False | Force initialization even if the target directory is not empty |
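
For readers who want to see how these flags and defaults fit together, here is a sketch of an `argparse` parser that mirrors the table. It is not the parser used by initialize_project.py, whose source is not shown here. It assumes that a value of `None` means "collect this value by prompting", which is why `--database` has no default but is not enforced at parse time.

```python
import argparse


def build_parser():
    """Argument parser mirroring the parameter table (illustrative only)."""
    parser = argparse.ArgumentParser(prog="initialize_project.py")
    parser.add_argument("--target", required=True,
                        help="Target directory for the new project")
    # Assumption: None means the value is collected interactively.
    parser.add_argument("--name", default=None,
                        help="Project name (lowercase, underscores)")
    parser.add_argument("--database", default=None,
                        help="SQL Server database name (no default)")
    parser.add_argument("--schema", default="raw",
                        help="Schema for source data")
    parser.add_argument("--dbt-schema", default="dbo",
                        help="dbt default target schema")
    parser.add_argument("--description", default=None,
                        help="Project description")
    parser.add_argument("--skip-venv", action="store_true",
                        help="Skip virtual environment creation")
    parser.add_argument("--skip-deps", action="store_true",
                        help="Skip dependency installation")
    parser.add_argument("--force", "-f", action="store_true",
                        help="Initialize even if the target is not empty")
    return parser
```

Note that `argparse` exposes `--dbt-schema` as the attribute `dbt_schema`, and the three boolean switches default to `False` as the table states.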

Schema Convention

The project uses two schema parameters:

- --schema (default: raw): Where sql-executor loads source CSV data
  - Source tables are created in the raw schema (e.g., raw.carbon_intensity)
- --dbt-schema (default: dbo): dbt's default profile target schema
  - dbt-sqlserver (legacy generate_schema_name) uses each folder's +schema value verbatim and does NOT prefix it, so the final schemas are:
    - Staging models → staging
    - Intermediate models → intermediate
    - Marts (facts/dims) → analytics
  - The --dbt-schema value only applies to models that have no +schema of their own

Final Schema Structure:

raw                    # Source data (sql-executor)
staging                # Staging views (stg_*)
intermediate           # Intermediate models (int_*)
analytics              # Marts - facts and dimensions (fct_*, dim_*)
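
The naming rule can be modeled in a few lines of Python. This is a sketch of the logic only, not the Jinja macro that dbt actually runs. `verbatim_schema` follows the behavior described in this section, and `prefixed_schema` shows dbt-core's built-in default (`<target schema>_<custom schema>`) for contrast. Both function names are hypothetical.

```python
def verbatim_schema(custom_schema, target_schema="dbo"):
    """Behavior described in this section: +schema is used as-is."""
    if custom_schema is None:
        # No +schema on the model: fall back to the --dbt-schema value.
        return target_schema
    return custom_schema.strip()


def prefixed_schema(custom_schema, target_schema="dbo"):
    """dbt-core's built-in default, for contrast: <target>_<custom>."""
    if custom_schema is None:
        return target_schema
    return f"{target_schema}_{custom_schema.strip()}"


for folder_schema in ("staging", "intermediate", "analytics", None):
    print(folder_schema, verbatim_schema(folder_schema),
          prefixed_schema(folder_schema))
```

With the verbatim rule a staging model lands in `staging`. With the prefixed rule the same model would land in `dbo_staging`, which is why the Final Schema Structure above contains no `dbo_` names.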

What Gets Created

ProjectName/
├── .claude/
│   └── settings.local.json          # Auto-allows skills and safe bash commands
├── 0 - Architecture Setup/
│   ├── setup_environment.ps1        # Python environment setup
│   ├── project-config.yml           # Project configuration
│   └── README.md                    # Setup documentation
├── 1 - Documentation/
│   └── data-profiles/               # Data profiler JSON outputs (profile_*.json)
├── 2 - Source Files/                # CSV source data (empty)
├── 3 - Data Pipeline/
│   ├── dbt_project.yml              # dbt project config
│   ├── packages.yml                 # dbt packages (dbt_utils)
│   ├── profiles.yml                 # Generated connection profile (gitignored)
│   ├── profiles.yml.example         # Profile template (committed)
│   ├── models/
│   │   ├── staging/                 # stg_* models
│   │   ├── intermediate/            # int_* models
│   │   └── marts/                   # dim_* and fct_* models
│   ├── tests/                       # Custom / singular SQL tests
│   ├── macros/                      # Custom macros
│   ├── seeds/                       # Seed data
│   ├── snapshots/                   # SCD Type 2 snapshots
│   └── analyses/                    # Ad-hoc analyses (non-materialized)
├── 4 - Semantic Layer/              # Power BI TMDL + PBIP output (pbip-from-dbt lands here)
├── 5 - Report Building/             # Power BI reports (empty)
├── 6 - Data Exports/                # Query results (empty)
├── .venv/                           # Python virtual environment
├── .gitignore                       # Git ignore file
└── CLAUDE.md                        # Project-specific agent config

All 3 - Data Pipeline/ subfolders are created even if empty because dbt_project.yml declares a path for each (model-paths, test-paths, macro-paths, seed-paths, snapshot-paths, analysis-paths). Missing any of them causes dbt parse to fail.
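
If a subfolder is deleted later (for example, because Git does not track empty directories), it can be recreated with a small script. The sketch below takes the folder names from the tree above. `ensure_pipeline_folders` is a hypothetical helper, not something the skill ships.

```python
from pathlib import Path

# Subfolders taken from the tree above, relative to "3 - Data Pipeline/".
PIPELINE_SUBFOLDERS = [
    "models/staging",
    "models/intermediate",
    "models/marts",
    "tests",
    "macros",
    "seeds",
    "snapshots",
    "analyses",
]


def ensure_pipeline_folders(project_root):
    """Create any missing pipeline subfolders and return the ones created."""
    pipeline_dir = Path(project_root) / "3 - Data Pipeline"
    created = []
    for relative in PIPELINE_SUBFOLDERS:
        folder = pipeline_dir / relative
        if not folder.is_dir():
            folder.mkdir(parents=True, exist_ok=True)
            created.append(relative)
    return created
```

Calling the function a second time returns an empty list, so it is safe to run before every `dbt parse`.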

Virtual Environment Setup

The skill automatically creates a Python 3.12 virtual environment and installs:

dbt Dependencies:

dbt-core
dbt-sqlserver
dbt-fabric

Data Processing:

pandas
sqlalchemy
pyodbc
python-dotenv
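
To reinstall the same packages by hand, the pip call can be built against the interpreter inside `.venv`. This is a sketch under assumptions: the package list is copied from this section without version pins, because none are given here, and `venv_python` and `pip_install_command` are hypothetical helpers rather than part of setup_environment.ps1.

```python
import os
from pathlib import Path

# Package list copied from this section; no version pins are documented.
PACKAGES = [
    "dbt-core", "dbt-sqlserver", "dbt-fabric",
    "pandas", "sqlalchemy", "pyodbc", "python-dotenv",
]


def venv_python(venv_dir, windows=None):
    """Return the interpreter path inside a virtual environment."""
    if windows is None:
        windows = os.name == "nt"
    # venv puts executables in Scripts\ on Windows and bin/ elsewhere.
    if windows:
        return Path(venv_dir) / "Scripts" / "python.exe"
    return Path(venv_dir) / "bin" / "python"


def pip_install_command(venv_dir, packages=PACKAGES, windows=None):
    """Build the pip command that installs packages into the venv."""
    python = str(venv_python(venv_dir, windows))
    return [python, "-m", "pip", "install", *packages]
```

Running pip through the environment's own interpreter (`python -m pip`) installs into `.venv` even when the environment has not been activated in the current shell.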

Post-Initialization Steps

After initialization, complete these steps:

Configure dbt profile (run each command separately from the project root; the paths below are relative to it):
```
cp "3 - Data Pipeline/profiles.yml.example" "3 - Data Pipeline/profiles.yml"
```
Edit profiles.yml with your connection details, then verify:
```
dbt debug --project-dir "3 - Data Pipeline"
```

Install dbt packages:

dbt deps --project-dir "3 - Data Pipeline"

Load source data:
- Place CSV files in "2 - Source Files/"
- Use the /sql-executor skill to load them into the database

Start development:
- Use the dbt-staging-builder agent for the first models
- Use the data-profiler skill to understand the source data
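
The first steps above leave traces on disk, so a quick check can report which ones still look unfinished. The sketch below is a hypothetical helper, not part of the skill. It assumes that `dbt deps` has run once a `dbt_packages/` folder exists (dbt's default install location) and it can only see whether CSV files were placed, not whether they were loaded into the database.

```python
from pathlib import Path


def pending_setup_steps(project_root):
    """Return the post-initialization steps that still look incomplete."""
    root = Path(project_root)
    pipeline = root / "3 - Data Pipeline"
    pending = []
    if not (pipeline / "profiles.yml").is_file():
        pending.append("Configure dbt profile")
    # Assumption: dbt deps has run once dbt_packages/ exists.
    if not (pipeline / "dbt_packages").is_dir():
        pending.append("Install dbt packages")
    source_dir = root / "2 - Source Files"
    if not (source_dir.is_dir() and any(source_dir.glob("*.csv"))):
        pending.append("Place source CSV files")
    return pending
```

An empty list means the profile, packages, and source files are all in place and modeling can start.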

Integration with Agents

The generated CLAUDE.md configures the project for these agents:

| Agent | Purpose |
|---|---|
| dbt-staging-builder | Create stg_* models from sources |
| dbt-dimension-builder | Create dim_* tables |
| dbt-fact-builder | Create fct_* tables with incremental materialization |
| dbt-test-writer | Add comprehensive dbt tests |
| dbt-pipeline-validator | End-to-end validation |
| business-analyst | Requirements gathering |

Customization

Modifying Templates

Template files are located in the skill's templates/ directory:

CLAUDE.md.template - Project CLAUDE.md template
dbt_project.yml.template - dbt project template
setup_environment.ps1 - Environment setup script
gitignore.template - .gitignore template

Edit these to customize default configurations.

Adding Reference Materials

The skill copies reference materials from ${CLAUDE_PLUGIN_ROOT}/reference/ in this plugin. To add new reference materials:

Add files to ${CLAUDE_PLUGIN_ROOT}/reference/
They will be copied automatically during initialization
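
The copy step can be pictured as a recursive directory copy. This is a sketch only: this document does not say where the initializer places the copied files, so the `destination` argument is an assumption, and `copy_reference_materials` is a hypothetical name.

```python
import shutil
from pathlib import Path


def copy_reference_materials(plugin_root, destination):
    """Copy everything under <plugin_root>/reference/ into destination.

    The destination folder is an assumption; this document does not say
    where the initializer places the copied files.
    """
    source = Path(plugin_root) / "reference"
    if not source.is_dir():
        return []
    # dirs_exist_ok lets the copy be repeated over an existing folder.
    shutil.copytree(source, destination, dirs_exist_ok=True)
    return sorted(p.relative_to(source).as_posix()
                  for p in source.rglob("*") if p.is_file())
```

Because the whole folder is copied, a new file dropped into `reference/` needs no registration step, which matches the "copied automatically" behavior described above.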

Troubleshooting

Python 3.12 Not Found

# Install Python 3.12
winget install Python.Python.3.12

# Verify installation
py -3.12 --version

Virtual Environment Issues

# Force recreate (the path contains spaces, so quote it and use the call operator)
& ".\0 - Architecture Setup\setup_environment.ps1" -Force

# Skip venv, just update packages
& ".\0 - Architecture Setup\setup_environment.ps1" -SkipVenvCreation

dbt Connection Issues

dbt debug --project-dir "3 - Data Pipeline"

Check the installed ODBC drivers separately:

python -c "import pyodbc; print(pyodbc.drivers())"
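
The one-liner prints every installed ODBC driver. A slightly longer sketch narrows the list to SQL Server drivers, which is usually what matters for the connection profile. `sql_server_drivers` is a hypothetical helper, and matching on the text "SQL Server" in the driver name is an assumption about how the drivers are named on your machine.

```python
def sql_server_drivers(driver_names):
    """Keep only the ODBC drivers whose name mentions SQL Server."""
    return [name for name in driver_names if "sql server" in name.lower()]


def installed_sql_server_drivers():
    """Ask pyodbc for installed drivers; returns [] if pyodbc is missing."""
    try:
        import pyodbc
    except ImportError:
        return []
    return sql_server_drivers(pyodbc.drivers())
```

If `installed_sql_server_drivers()` returns an empty list, either pyodbc is not installed in the active environment or no SQL Server ODBC driver is present. The driver named in profiles.yml should match one of the returned names exactly.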

Related Skills

dbt-runner: Execute dbt commands after project setup
sql-executor: Load CSV data into SQL Server
data-profiler: Profile source data before modeling
pbip-from-dbt: Generate a Power BI Project (PBIP) from the completed dbt pipeline
