Choose and configure the right acceleration engine — Arrow, DuckDB, SQLite, Cayenne, PostgreSQL, or Turso. Use this skill whenever the user needs to pick an accelerator engine, compare engines (e.g. "should I use DuckDB or Cayenne?"), configure engine-specific parameters (duckdb_file, sqlite_file), tune memory vs file mode, or understand engine capabilities and limitations. This skill is the engine selection and tuning guide. For the broader acceleration feature (refresh modes, retention, snapshots, indexes), see spice-acceleration.
Accelerators materialize data locally from connected sources for faster queries and reduced load on source systems.
```yaml
datasets:
  - from: postgres:my_table
    name: my_table
    acceleration:
      enabled: true
      engine: duckdb # arrow, duckdb, sqlite, cayenne, postgres, turso
      mode: memory # memory or file
      refresh_check_interval: 1h
```
| Use Case | Engine | Why |
|---|---|---|
| Small datasets (<1 GB), max speed | arrow | In-memory, lowest latency |
| Medium datasets (1-100 GB), complex SQL | duckdb | Mature SQL, memory management |
| Large datasets (100 GB-1+ TB), analytics | cayenne | Built on Vortex (Linux Foundation), 10-20x faster scans |
| Point lookups on large datasets | cayenne | 100x faster random access vs Parquet |
| Simple queries, low resource usage | sqlite | Lightweight, minimal overhead |
| Async operations, concurrent workloads | turso | Native async, modern connection pooling |
| External database integration | postgres | Leverage existing PostgreSQL infra |
Choose Cayenne when datasets exceed ~1 TB, multi-file ingestion is needed, or point lookups are common. Choose DuckDB when datasets are under ~1 TB, complex SQL (window functions, CTEs) is needed, or DuckDB tooling is beneficial.
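The size bands in the table and the Cayenne/DuckDB guidance above can be condensed into a small decision helper. This is a minimal sketch, not part of Spice: the function name `suggest_engine` and the `complex_sql` flag are hypothetical, and the thresholds are the approximate figures quoted above rather than hard limits.

```python
def suggest_engine(size_gb: float, complex_sql: bool = False) -> str:
    """Suggest an accelerator engine from approximate dataset size.

    Hypothetical helper: thresholds mirror the guidance table and are
    rules of thumb, not limits enforced by the runtime.
    """
    if size_gb < 0:
        raise ValueError("size_gb must be non-negative")
    if size_gb < 1:
        return "arrow"      # small dataset, in-memory, lowest latency
    if size_gb < 100:
        return "duckdb"     # medium dataset, mature SQL support
    if size_gb < 1000 and complex_sql:
        return "duckdb"     # under ~1 TB and needs window functions, CTEs
    return "cayenne"        # large dataset, analytics, or point lookups


print(suggest_engine(0.5))                     # arrow
print(suggest_engine(500, complex_sql=True))   # duckdb
print(suggest_engine(2000))                    # cayenne
```

The helper only covers the size-driven rows of the table. Choosing sqlite, turso, or postgres depends on resource limits, concurrency needs, or existing infrastructure, which a size check cannot capture.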
| Engine | Mode | Status |
|---|---|---|
| arrow | memory | Stable |
| duckdb | memory, file | Stable |
| sqlite | memory, file | Release Candidate |
| cayenne | file | Beta |
| postgres | N/A (attached) | Release Candidate |
| turso | memory, file | Beta |
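The engine and mode combinations in the table above can be checked before a configuration is written. The following is a minimal sketch that encodes the table as a lookup. `SUPPORTED_MODES` and `is_valid_mode` are hypothetical names, not part of any Spice API, and treating postgres as "no mode" is an assumption based on the table's N/A entry.

```python
from typing import Optional

# Modes per engine, copied from the table above.
SUPPORTED_MODES = {
    "arrow": {"memory"},
    "duckdb": {"memory", "file"},
    "sqlite": {"memory", "file"},
    "cayenne": {"file"},
    "postgres": set(),  # attached database; mode does not apply (assumption)
    "turso": {"memory", "file"},
}


def is_valid_mode(engine: str, mode: Optional[str] = None) -> bool:
    """Return True if the engine/mode pair appears in the table."""
    if engine not in SUPPORTED_MODES:
        raise ValueError(f"unknown engine: {engine!r}")
    modes = SUPPORTED_MODES[engine]
    if not modes:
        # Engines without a mode are valid only when no mode is given.
        return mode is None
    return mode in modes


print(is_valid_mode("duckdb", "file"))     # True
print(is_valid_mode("cayenne", "memory"))  # False
print(is_valid_mode("postgres"))           # True
```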
| Mode | Description | Use Case |
|---|---|---|
| full | Complete dataset replacement on each refresh | Small, slowly-changing datasets |
| append (batch) | Adds new records based on a time_column | Append-only logs, time-series data |
| append (stream) | Continuous streaming without time column | Real-time event streams (Kafka, Debezium) |
| changes | CDC-based incremental updates via Debezium or DynamoDB Streams | Frequently updated transactional data |
| caching | Request-based row-level caching | API responses, HTTP endpoints |
```yaml
# Full refresh every 8 hours
acceleration:
  refresh_mode: full
  refresh_check_interval: 8h
```

```yaml
# Append mode: check for new records from the last day every 10 minutes
acceleration:
  refresh_mode: append
  time_column: created_at
  refresh_check_interval: 10m
  refresh_data_window: 1d
```

```yaml
# Continuous ingestion using Kafka
acceleration:
  refresh_mode: append
```

```yaml
# CDC with Debezium or DynamoDB Streams
acceleration:
  refresh_mode: changes
```
```yaml
acceleration:
  enabled: true
  engine: arrow
  refresh_check_interval: 5m
```

```yaml
datasets:
  - from: postgres:events
    name: events
    time_column: created_at
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      refresh_mode: append
      refresh_check_interval: 1h
      refresh_data_window: 7d
```
Retention policies prevent unbounded growth of accelerated datasets. Spice supports time-based and custom SQL-based retention strategies:
```yaml
datasets:
  - from: postgres:events
    name: events
    time_column: created_at
    acceleration:
      enabled: true
      engine: duckdb
      retention_check_enabled: true
      retention_period: 30d
      retention_check_interval: 1h
```

```yaml
acceleration:
  retention_check_enabled: true
  retention_check_interval: 1h
  retention_sql: "DELETE FROM logs WHERE status = 'archived'"
```
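To make the time-based policy concrete, the sketch below computes the cutoff implied by a `retention_period` and selects the rows that fall before it. It is an illustration only: `parse_duration`, `retention_cutoff`, and `expired_rows` are hypothetical helpers, the parser handles only single-unit values such as `30d` or `1h`, and the strict "older than the cutoff" comparison is an assumption about the eviction semantics, not confirmed runtime behavior.

```python
import re
from datetime import datetime, timedelta, timezone

_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text: str) -> timedelta:
    """Parse a single-unit duration such as '30d', '1h', or '10m'."""
    match = re.fullmatch(r"(\d+)([smhd])", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(value)})


def retention_cutoff(now: datetime, retention_period: str) -> datetime:
    """Timestamp before which rows are assumed eligible for removal."""
    return now - parse_duration(retention_period)


def expired_rows(rows, time_column, now, retention_period):
    """Rows whose time_column value is older than the cutoff (assumed strict)."""
    cutoff = retention_cutoff(now, retention_period)
    return [row for row in rows if row[time_column] < cutoff]


now = datetime(2024, 3, 1, tzinfo=timezone.utc)
rows = [
    {"id": 1, "created_at": datetime(2024, 1, 15, tzinfo=timezone.utc)},
    {"id": 2, "created_at": datetime(2024, 2, 20, tzinfo=timezone.utc)},
]
print(retention_cutoff(now, "30d"))  # 2024-01-31 00:00:00+00:00
print([r["id"] for r in expired_rows(rows, "created_at", now, "30d")])  # [1]
```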
```yaml
acceleration:
  enabled: true
  engine: sqlite
  indexes:
    user_id: enabled
    '(created_at, status)': unique
  primary_key: id
```

```yaml
acceleration:
  engine: duckdb
  mode: file
  params:
    duckdb_file: ./data/cache.db
```

```yaml
acceleration:
  engine: sqlite
  mode: file
  params:
    sqlite_file: ./data/cache.sqlite
```
Accelerated datasets support primary key constraints and indexes:
```yaml
acceleration:
  enabled: true
  engine: duckdb
  primary_key: order_id # Creates non-null unique index
  indexes:
    customer_id: enabled # Single column index
    '(created_at, status)': unique # Multi-column unique index
```
Bootstrap file-based accelerations from S3 or filesystem snapshots on startup. This dramatically reduces cold-start latency in distributed deployments.
Snapshot triggers vary by refresh mode:
- refresh_complete: Creates snapshots after each refresh (full and batch-append modes)
- time_interval: Creates snapshots on a fixed schedule (all refresh modes)
- stream_batches: Creates snapshots after every N batches (streaming modes: Kafka, Debezium, DynamoDB Streams)

```yaml
snapshots:
  enabled: true
  location: s3://my_bucket/snapshots/
  bootstrap_on_failure_behavior: warn # warn | retry | fallback
  params:
    s3_auth: iam_role
```
Per-dataset opt-in:
```yaml
acceleration:
  enabled: true
  engine: duckdb
  mode: file
  snapshots:
    enabled: true
```
When using mode: memory (the default), the entire dataset is loaded into RAM. Ensure the host has enough memory for the dataset plus overhead for queries and the runtime. If memory is constrained, use mode: file, which is available for the duckdb, sqlite, turso, and cayenne accelerators.
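The memory-versus-file trade-off can be sketched as a simple capacity check. Everything here is illustrative: `recommend_mode` is a hypothetical helper, and the `headroom` fraction reserved for queries and the runtime is an assumed placeholder, not a figure from Spice. Measure real memory usage before relying on any such number.

```python
# Engines that support each mode, per the guidance above.
MEMORY_MODE_ENGINES = {"arrow", "duckdb", "sqlite", "turso"}
FILE_MODE_ENGINES = {"duckdb", "sqlite", "turso", "cayenne"}


def recommend_mode(engine: str, dataset_gb: float, ram_gb: float,
                   headroom: float = 0.5) -> str:
    """Pick 'memory' or 'file' from a rough capacity check.

    headroom is the fraction of RAM kept free for queries and the
    runtime; 0.5 is an arbitrary assumption for illustration.
    """
    budget_gb = ram_gb * (1.0 - headroom)
    if engine in MEMORY_MODE_ENGINES and dataset_gb <= budget_gb:
        return "memory"
    if engine in FILE_MODE_ENGINES:
        return "file"
    raise ValueError(
        f"{engine!r} cannot hold {dataset_gb} GB in memory and has no file mode"
    )


print(recommend_mode("duckdb", dataset_gb=10, ram_gb=64))  # memory
print(recommend_mode("duckdb", dataset_gb=50, ram_gb=64))  # file
print(recommend_mode("cayenne", dataset_gb=1, ram_gb=64))  # file
```

An arrow dataset that exceeds the budget raises an error because arrow has no file mode. In that case the fix is to switch to an engine that does.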