datafusion-skills

Name: datafusion-skills
Author: datafusion-contrib

Query local or cloud (S3/GCS) Parquet, CSV, JSON, Arrow, and Avro files with SQL using DataFusion in Claude Code sessions. Register persistent external tables, create and refresh materialized views, visualize and optimize execution plans, inspect schemas, and search DataFusion documentation.

npx claudepluginhub datafusion-contrib/datafusion-skills --plugin datafusion-skills

Popularity

Stars: Top 25% (Med: 0 · Avg: 285)

Installs: Med: 0 · Avg: 1

What's Inside

Skills (7)

create-table

/create-table

Register a data file as a persistent external table in the DataFusion session. Supports Parquet, CSV, JSON, Arrow IPC, and Avro files. Explores the schema and writes to the session state file for reuse across skills.

datafusion-docs

/datafusion-docs

Search Apache DataFusion documentation, user guide, and API reference. Returns relevant documentation for a question or keyword. Searches the official DataFusion repository and website.

explain-plan

/explain-plan

Visualize and analyze DataFusion query execution plans. Shows logical and physical plans, identifies performance bottlenecks, and suggests optimizations. Supports EXPLAIN and EXPLAIN ANALYZE.

install-datafusion

/install-datafusion

Install or update datafusion-cli. Supports installation via cargo install, Homebrew, or pre-built binaries. Checks the current version and offers to upgrade if outdated.

materialized-view

/materialized-view

Create and manage materialized views using DataFusion. Persist SQL query results as Parquet files for fast repeated access. Track source dependencies and refresh when data changes. Powered by datafusion-cli's COPY TO.

Stats

Version: 0.1.0
Stars: 12
Maintenance: Good
License: Apache-2.0
Last Commit: Mar 21, 2026
Added: Mar 28, 2026

README

datafusion-skills

A Claude Code plugin that adds Apache DataFusion-powered skills for data exploration, querying, and materialized views.

Installation

From GitHub

Add the repository as a plugin source and install:

/plugin marketplace add datafusion-contrib/datafusion-skills
/plugin install datafusion-skills@datafusion-skills

This registers the GitHub repo as a marketplace and installs the plugin. Skills will be available as /datafusion-skills:<skill-name> in all future sessions.

Updating

/plugin marketplace update datafusion-skills
/plugin update datafusion-skills@datafusion-skills

Skills

`query`

Run SQL queries against registered tables or ad-hoc against files. Accepts raw SQL or natural language questions. Supports Parquet, CSV, JSON, Arrow IPC, and Avro.

/datafusion-skills:query SELECT * FROM 'trades.parquet' WHERE symbol = 'AAPL' LIMIT 10
/datafusion-skills:query "what are the top 5 symbols by volume?"
/datafusion-skills:query FROM sales WHERE amount > 100

`read-file`

Read and explore any data file — Parquet, CSV, JSON, Arrow IPC, Avro — locally or from S3/GCS. Auto-detects format by extension.

/datafusion-skills:read-file trades.parquet what columns does it have?
/datafusion-skills:read-file s3://my-bucket/data.parquet describe the schema
/datafusion-skills:read-file metrics.csv how many rows?

`create-table`

Register a data file as a persistent external table. Explores the schema and persists the registration so all other skills can access the table automatically.

/datafusion-skills:create-table trades.parquet
/datafusion-skills:create-table data.csv --name sales --format csv
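
As a rough illustration of what a registration might translate to, the sketch below builds a DataFusion CREATE EXTERNAL TABLE statement from a path, a table name, and a format. The helper name `external_table_ddl` is hypothetical and not part of the plugin; how the skill actually derives names and handles options is not documented here.

```shell
# Hypothetical helper (not part of the plugin): build the DDL that would
# register a file as an external table in DataFusion.
external_table_ddl() {
  path=$1
  name=$2
  # DataFusion expects the format keyword after STORED AS (e.g. PARQUET, CSV)
  format=$(printf '%s' "$3" | tr '[:lower:]' '[:upper:]')
  printf "CREATE EXTERNAL TABLE %s STORED AS %s LOCATION '%s';\n" "$name" "$format" "$path"
}

external_table_ddl data.csv sales csv
# prints: CREATE EXTERNAL TABLE sales STORED AS CSV LOCATION 'data.csv';
```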

`materialized-view`

Create and manage materialized views — persist SQL query results as Parquet files for fast repeated access. Track source dependencies and refresh when data changes.

/datafusion-skills:materialized-view "create a daily summary of trades grouped by symbol"
/datafusion-skills:materialized-view refresh trades_daily
/datafusion-skills:materialized-view status
/datafusion-skills:materialized-view list
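
Because the skill is described as powered by datafusion-cli's COPY TO, a single materialization could plausibly look like the sketch below. The table `trades`, its columns (`symbol`, `trade_date`, `volume`), the `views/` directory, and the script name are all hypothetical; the plugin's real file layout and dependency tracking are not shown.

```shell
# Hypothetical names throughout: trades, trade_date, volume, views/,
# refresh_trades_daily.sql. This is a sketch, not the plugin's implementation.
mkdir -p views

cat > refresh_trades_daily.sql <<'SQL'
-- Source table (in the plugin this registration would come from state.sql)
CREATE EXTERNAL TABLE trades STORED AS PARQUET LOCATION 'trades.parquet';

-- Materialize the query result as a Parquet file
COPY (
  SELECT symbol, trade_date, SUM(volume) AS total_volume
  FROM trades
  GROUP BY symbol, trade_date
) TO 'views/trades_daily.parquet' STORED AS PARQUET;

-- Expose the materialized result as a queryable table
CREATE EXTERNAL TABLE trades_daily STORED AS PARQUET LOCATION 'views/trades_daily.parquet';
SQL

# Run the script only if datafusion-cli is available
if command -v datafusion-cli >/dev/null 2>&1; then
  datafusion-cli --file refresh_trades_daily.sql || echo "run failed (is trades.parquet present?)"
else
  echo "datafusion-cli not found; wrote refresh_trades_daily.sql only"
fi
```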

`explain-plan`

Visualize and analyze query execution plans. Identifies performance bottlenecks and suggests optimizations.

/datafusion-skills:explain-plan SELECT * FROM trades WHERE date > '2024-01-01'
/datafusion-skills:explain-plan --analyze SELECT COUNT(*) FROM large_table GROUP BY category
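
For reference, the two statements the skill is said to support can also be run directly with datafusion-cli. The sketch below assumes a local file named `trades.parquet` with a `date` column (both hypothetical) and queries it ad hoc by path.

```shell
# Hypothetical file and column names (trades.parquet, date).
cat > explain.sql <<'SQL'
-- Shows the logical and physical plans without running the query
EXPLAIN SELECT * FROM 'trades.parquet' WHERE date > '2024-01-01';

-- Runs the query and annotates the physical plan with runtime metrics
EXPLAIN ANALYZE SELECT * FROM 'trades.parquet' WHERE date > '2024-01-01';
SQL

# Run the statements only if datafusion-cli is available
if command -v datafusion-cli >/dev/null 2>&1; then
  datafusion-cli --file explain.sql || echo "explain failed (is trades.parquet present?)"
else
  echo "datafusion-cli not found; wrote explain.sql only"
fi
```

EXPLAIN alone is cheap because nothing is executed; EXPLAIN ANALYZE pays the full query cost, so it is usually worth trying on a filtered or limited query first.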

`datafusion-docs`

Search Apache DataFusion documentation — user guide, SQL reference, and API docs. Returns relevant documentation for a question or keyword.

/datafusion-skills:datafusion-docs window functions
/datafusion-skills:datafusion-docs "how do I create an external table?"
/datafusion-skills:datafusion-docs APPROX_PERCENTILE_CONT

`install-datafusion`

Install or update datafusion-cli. Supports Homebrew, cargo install, and pre-built binaries.

/datafusion-skills:install-datafusion
/datafusion-skills:install-datafusion --update
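
The sketch below shows one way the choice between the listed installation methods could be made by hand. It only prints a suggested command and installs nothing; the function name `suggest_install` is hypothetical, and the order of preference is an assumption.

```shell
# Hypothetical helper: print how datafusion-cli could be installed on this
# machine, without actually installing anything.
suggest_install() {
  if command -v datafusion-cli >/dev/null 2>&1; then
    echo "already installed: $(datafusion-cli --version)"
  elif command -v brew >/dev/null 2>&1; then
    echo "brew install datafusion"
  elif command -v cargo >/dev/null 2>&1; then
    echo "cargo install datafusion-cli"
  else
    echo "no brew or cargo found; download a pre-built binary instead"
  fi
}

suggest_install
```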

Session state

All skills share a single state.sql file per project — a plain SQL file containing CREATE EXTERNAL TABLE statements and configuration. When state is first needed, you'll be asked where to store it:

In the project directory (.datafusion-skills/state.sql) — colocated with the project, optionally gitignored
In your home directory (~/.datafusion-skills/<project>/state.sql) — keeps the repo clean

Any skill restores the session via datafusion-cli --file state.sql.
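
To make the idea concrete, here is a minimal sketch of what a project-local state file might contain and how it can be replayed. The table names and file paths are hypothetical, and the real file written by the skills may contain additional configuration.

```shell
# Hypothetical contents: table names and file paths are placeholders.
mkdir -p .datafusion-skills

cat > .datafusion-skills/state.sql <<'SQL'
-- One CREATE EXTERNAL TABLE statement per registered table
CREATE EXTERNAL TABLE trades STORED AS PARQUET LOCATION 'trades.parquet';
CREATE EXTERNAL TABLE sales STORED AS CSV LOCATION 'data.csv';
SQL

# Replay the registrations if datafusion-cli is available
if command -v datafusion-cli >/dev/null 2>&1; then
  datafusion-cli --file .datafusion-skills/state.sql || echo "replay failed (do the data files exist?)"
else
  echo "datafusion-cli not found; wrote .datafusion-skills/state.sql only"
fi
```

Keeping the state as plain SQL means it can be reviewed, diffed, and edited by hand like any other project file.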

How the skills work together

Skills reference each other where it makes sense:

read-file suggests query for follow-up exploration and create-table for persisting data
query uses session state from create-table automatically
materialized-view creates persistent Parquet files registered via create-table
explain-plan helps optimize queries from query
All skills use datafusion-docs to troubleshoot DataFusion errors automatically

Why DataFusion?

Apache DataFusion is a fast, extensible query engine built in Rust on top of Apache Arrow. It offers:

High performance: Vectorized execution, predicate pushdown, partition pruning
Standard SQL: Full SQL support including window functions, CTEs, subqueries
Extensibility: Custom table providers, UDFs, optimizer rules
File format support: Parquet, CSV, JSON, Arrow IPC, Avro
Cloud native: S3, GCS, Azure object store support
Materialized views: Persist query results and track source dependencies (in this plugin, built on datafusion-cli's COPY TO)

Local development

# Clone the repo
git clone https://github.com/datafusion-contrib/datafusion-skills.git
cd datafusion-skills

# Launch Claude Code with the local plugin directory
claude --plugin-dir .

Test individual skills:

View full README on GitHub
