Skill

evidence-data-sources

Evidence.dev data source configuration — BigQuery, PostgreSQL, MySQL, DuckDB, Snowflake, CSV, SQLite, Motherduck, Databricks, Redshift, Trino, Google Sheets, JavaScript sources

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/evidence-dev:evidence-data-sources

User invocable

Model invocable

Inline context

Default effort

SKILL.md

608 lines · ~4k tokens

Stats

Language: Python

Parent stars: 0

Maintenance: Good

Last Commit: Jun 16, 2026

When to use

Invoke when the user asks to:

Connect a new data source (database, CSV, API) to Evidence.dev
Configure connection.yaml or troubleshoot a source that isn't loading
Understand which source connector to use for a given database type

Evidence Data Sources

Evidence extracts all data sources into Parquet (a common storage format) so you can query across multiple sources with SQL. Each source lives in sources/[source_name]/ and has a connection.yaml (non-secret config) plus an optional connection.options.yaml (secrets, generated by the settings UI).

Level 1 — Available Sources and Minimal Config

How to Add a Source

1. Run npm run dev and navigate to http://localhost:3000/settings
2. Select the source type, name it, and enter credentials
3. Add .sql files to sources/[source_name]/. Each file becomes a queryable table as [source_name].[query_name]
4. Run npm run sources

Selective source runs (useful for large sources):

npm run sources -- --sources my_source
npm run sources -- --sources my_source --queries query_one,query_two
npm run sources -- --changed
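
When these selective runs are scripted (for example in a CI step), the command can be assembled from its parts. The sketch below is only an illustration: `buildSourcesCommand` is a hypothetical helper, not part of Evidence, and it uses only the three flags shown above.

```javascript
// Hypothetical helper (not an Evidence API): assembles an `npm run sources`
// command string from the flags shown above.
function buildSourcesCommand({ source, queries = [], changed = false } = {}) {
  const flags = [];
  if (source) flags.push('--sources', source);
  if (queries.length > 0) flags.push('--queries', queries.join(','));
  if (changed) flags.push('--changed');
  // npm needs the bare `--` so the flags are passed through to the script
  return flags.length > 0
    ? `npm run sources -- ${flags.join(' ')}`
    : 'npm run sources';
}

console.log(buildSourcesCommand({ source: 'my_source', queries: ['query_one', 'query_two'] }));
console.log(buildSourcesCommand({ changed: true }));
```

The bare `--` matters: without it npm consumes the flags itself instead of forwarding them.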

Minimal connection.yaml per Source

BigQuery

name: my_bq
type: bigquery
options:
  project_id: my-gcp-project
  authenticator: service-account  # or: gcloud-cli, oauth

PostgreSQL / TimescaleDB / Cube

name: my_pg
type: postgres
options:
  host: localhost
  port: 5432
  database: my_db
  user: my_user
  ssl: false

MySQL

name: my_mysql
type: mysql
options:
  host: localhost
  port: 3306
  database: my_db
  user: my_user

Microsoft SQL Server

name: my_mssql
type: mssql
options:
  server: localhost
  database: my_db
  authenticationType: default  # SQL Login
  trust_server_certificate: false
  encrypt: false

Snowflake

name: my_snowflake
type: snowflake
options:
  account: myorg-myaccount
  username: my_user
  database: MY_DB
  warehouse: MY_WH
  authenticator: userpass  # or: snowflake_jwt, externalbrowser, okta

Redshift

name: my_redshift
type: redshift
options:
  host: my-cluster.us-east-1.redshift.amazonaws.com
  port: 5439
  database: my_db
  user: my_user
  ssl: true

DuckDB

name: my_duckdb
type: duckdb
options:
  filename: my_database.duckdb

MotherDuck

name: my_motherduck
type: motherduck
options:
  database: my_md_db   # optional — omit to use default

SQLite

name: my_sqlite
type: sqlite
options:
  filename: my_database.sqlite

Databricks

name: my_databricks
type: databricks
options:
  host: adb-123456789.azuredatabricks.net
  path: /sql/1.0/warehouses/abc123
  token: dapi...  # personal access token — use env var in production

Trino / Starburst

name: my_trino
type: trino
options:
  host: localhost
  port: 8080
  user: my_user
  ssl: false

CSV

name: my_csv
type: csv
options: {}

Then copy .csv files into sources/my_csv/. Query with select * from my_csv.my_file (no .csv extension in the query).

Google Sheets (plugin — install separately)

name: my_sheets
type: gsheets
options: {}
sheets:
  my_workbook_name: 1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgVE2upms

Query with select * from my_sheets.my_workbook_name_tab_name (tab spaces replaced by underscores).

JavaScript

name: my_api
type: javascript
options: {}

Then add .js files to sources/my_api/ — see Level 3 for the export contract.

Level 2 — Connection Options and Common Gotchas

BigQuery

Field	Required	Notes
`project_id`	Yes	GCP project ID
`location`	No	Default: `US`
`authenticator`	Yes	`service-account`, `gcloud-cli`, or `oauth`
`client_email`	Yes (service-account)	From JSON key file
`private_key`	Yes (service-account)	From JSON key file
`token`	Yes (oauth)	Expires after 1 hour
`enable_connected_sheets`	No	Adds Drive API scope

Gotchas:

gcloud-cli auth requires browser access — local dev only.
OAuth tokens expire in 1 hour; not suitable for automated builds.
Service account needs at minimum BigQuery User role; may also need BigQuery Data Viewer depending on org settings.

PostgreSQL / Redshift

Field	Required	Notes
`host`	Yes	Default: `localhost`
`port`	Yes	Default: `5432` (Redshift: `5439`)
`database`	Yes	Default: `postgres`
`user`	Yes
`password`	Yes
`ssl`	No	`false`, `true`, or `no-verify`
`schema`	No	Overrides search_path

SSL options:

false — no SSL (default)
true — SSL with certificate validation (self-signed certs will fail)
no-verify — SSL without certificate validation

For self-signed certs with a CA certificate, use a connection string:

postgresql://user:password@host:port/db?sslmode=require&sslrootcert=/path/to/ca.crt
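
A connection string in this shape breaks if the user name or password contains characters such as `@` or `/`. The following is a minimal sketch of building the string with those parts percent-encoded; `buildPgConnectionString` and its parameter names are hypothetical, and the CA path is passed through unchanged as in the example above.

```javascript
// Hypothetical helper: builds a connection string in the shape shown above,
// percent-encoding the user and password so special characters survive.
function buildPgConnectionString({ user, password, host, port = 5432, database, sslRootCert }) {
  const auth = `${encodeURIComponent(user)}:${encodeURIComponent(password)}`;
  // The CA certificate path is left as-is, matching the example above.
  return `postgresql://${auth}@${host}:${port}/${database}?sslmode=require&sslrootcert=${sslRootCert}`;
}

console.log(
  buildPgConnectionString({
    user: 'user',
    password: 'p@ss/word',
    host: 'host',
    database: 'db',
    sslRootCert: '/path/to/ca.crt'
  })
);
```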

For client certificate auth, manually edit connection.yaml:

name: mydatabase
type: postgres
options:
  host: example.myhost.com
  port: 5432
  database: mydatabase
  ssl:
    sslmode: require

And connection.options.yaml:

user: "USERNAME_AS_BASE64"
ssl:
  rejectUnauthorized: true
  key: "USER_KEY_AS_BASE64"
  cert: "USER_CERT_AS_BASE64"

Redshift uses the postgres connector under the hood — same options apply, default port is 5439.

Cube (semantic layer): use the postgres connector pointing at the Cube SQL API. Credentials found in Cube BI Integrations > SQL API Connection.

MySQL

Field	Required	Notes
`host`	Yes
`port`	No	Default: `3306`
`database`	Yes
`user`	Yes
`password`	Yes
`ssl`	No	`false`, `true`, `Amazon RDS`, or a credentials object
`socketPath`	No	Required for Google Cloud MySQL

Microsoft SQL Server

Field	Required	Notes
`server`	Yes	Hostname
`database`	Yes
`port`	No	Default: `1433`
`authenticationType`	Yes	`default` (SQL Login), `azure-active-directory-default`, `azure-active-directory-access-token`, `azure-active-directory-password`, `azure-active-directory-service-principal-secret`
`trust_server_certificate`	No	Default: `false`. Set `true` for local dev / self-signed certs
`encrypt`	No	Default: `false`. Set `true` for Azure
`connection_timeout`	No	Default: `15000` ms
`request_timeout`	No	Default: `15000` ms

Gotchas:

encrypt: true is required for Azure SQL.
trust_server_certificate: true bypasses certificate chain validation — don't use in production unless necessary.

Snowflake

Field	Required	Notes
`account`	Yes	Format: `orgname-accountname`
`username`	Yes
`database`	Yes
`warehouse`	Yes
`role`	No
`schema`	No
`authenticator`	Yes	`userpass`, `snowflake_jwt`, `externalbrowser`, `okta`
`password`	Yes (userpass/okta)
`private_key`	Yes (snowflake_jwt)	PEM format
`passphrase`	Yes (snowflake_jwt)
`okta_url`	Yes (okta)

Gotchas:

All column names are lowercased in Evidence regardless of Snowflake casing.
externalbrowser requires browser access — local dev only.
Okta SSO requires MFA to be disabled on the Okta account.

DuckDB

Field	Required	Notes
`filename`	Yes	Relative to `sources/[source_name]/`, e.g. `my.duckdb`

The .duckdb file must live inside sources/[source_name]/. :memory: is not supported as a filename, and an in-memory DuckDB is not directly supported; provide a file or use MotherDuck instead.

MotherDuck

Field	Required	Notes
`token`	Yes	MotherDuck service token (use env var in production)
`database`	No	Specific MD database to connect to

SQLite

Field	Required	Notes
`filename`	Yes	Relative to `sources/[source_name]/`, e.g. `my.sqlite`

The .sqlite file must live inside sources/[source_name]/.

Databricks

Field	Required	Notes
`token`	Yes	Personal access token
`host`	Yes	Server hostname (e.g. `adb-xxx.azuredatabricks.net`)
`path`	Yes	HTTP path (e.g. `/sql/1.0/warehouses/abc`)
`port`	No	Default: `443`

Trino / Starburst

Field	Required	Notes
`host`	Yes	Default: `localhost`
`port`	Yes	Default: `443`
`user`	Yes
`password`	No
`ssl`	No	Set `true` and port `443`/`8443` for HTTPS
`catalog`	No
`schema`	No
`engine`	No	`trino` (default) or `presto`

Starburst Galaxy config:

host: <YOUR_DOMAIN>-<YOUR_CLUSTER_NAME>.galaxy.starburst.io
port: 443
user: <YOUR_EMAIL>/accountadmin
ssl: true

Gotchas:

Only password-based auth is supported (Basic auth). LDAP, certificate, and JWT auth types are not supported.

Google Sheets

Gotchas:

This is a plugin — install first: npm install @evidence-dev/datasource-gsheets
Share the Google Sheet with the service account's email address, or it cannot read the data.
Sheet ID is the long string after /spreadsheets/d/ in the sheet URL.
Tab names with spaces become underscores in the query name.
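
The last two points can be sketched in code. This is an illustration under the rules stated above, not the plugin's implementation: `extractSheetId` and `sheetTableName` are hypothetical helpers, and only the space-to-underscore replacement described here is modelled.

```javascript
// Hypothetical helper: pulls the sheet ID out of a Google Sheets URL,
// i.e. the path segment that follows /spreadsheets/d/.
function extractSheetId(url) {
  const match = url.match(/\/spreadsheets\/d\/([^/?#]+)/);
  return match ? match[1] : null;
}

// Hypothetical helper: derives the query name for a tab, replacing
// spaces in the tab name with underscores.
function sheetTableName(sourceName, workbookName, tabName) {
  return `${sourceName}.${workbookName}_${tabName.replace(/ /g, '_')}`;
}

const url = 'https://docs.google.com/spreadsheets/d/1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgVE2upms/edit#gid=0';
console.log(extractSheetId(url));
console.log(sheetTableName('my_sheets', 'sales_data', 'Q1 Results'));
```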

CSV

Gotchas:

Source names and file names can only contain letters, numbers, and underscores.
Do not include .csv in the query — select * from my_source.my_file not my_file.csv.
CSV options (DuckDB read_csv() args) are passed without spaces, with double-quoted strings: header=false,delim="|".
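
The naming and option-formatting rules above can be checked before running sources. The helpers below are hypothetical and only encode the rules as stated here (letters, numbers, underscores; no spaces; double-quoted strings); they are not part of Evidence.

```javascript
// Names may only contain letters, numbers, and underscores.
const VALID_NAME = /^[A-Za-z0-9_]+$/;

// Hypothetical helper: returns the table name for a CSV file, or throws
// if the source name or file name breaks the naming rule.
function csvTableName(sourceName, fileName) {
  const base = fileName.replace(/\.csv$/i, '');
  if (!VALID_NAME.test(sourceName) || !VALID_NAME.test(base)) {
    throw new Error(`Invalid name: ${sourceName}/${fileName}`);
  }
  return `${sourceName}.${base}`;
}

// Hypothetical helper: formats read_csv() arguments without spaces,
// wrapping string values in double quotes.
function formatCsvOptions(options) {
  return Object.entries(options)
    .map(([key, value]) => (typeof value === 'string' ? `${key}="${value}"` : `${key}=${value}`))
    .join(',');
}

console.log(csvTableName('my_csv', 'sales_2024.csv'));
console.log(formatCsvOptions({ header: false, delim: '|' }));
```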

Level 3 — Full Config Reference, Environment Variables, and JavaScript Sources

Environment Variable Pattern for Production

Credentials are managed in production via environment variables. The pattern is:

EVIDENCE_SOURCE__[SOURCE_NAME]__[OPTION_NAME]=value

Examples:

# PostgreSQL source named "my_pg"
EVIDENCE_SOURCE__MY_PG__host=db.example.com
EVIDENCE_SOURCE__MY_PG__port=5432
EVIDENCE_SOURCE__MY_PG__database=analytics
EVIDENCE_SOURCE__MY_PG__user=evidence_user
EVIDENCE_SOURCE__MY_PG__password=s3cr3t

# BigQuery source named "my_bq"
EVIDENCE_SOURCE__MY_BQ__project_id=my-gcp-project
EVIDENCE_SOURCE__MY_BQ__authenticator=service-account
EVIDENCE_SOURCE__MY_BQ__client_email=svc@project.iam.gserviceaccount.com
EVIDENCE_SOURCE__MY_BQ__private_key=-----BEGIN RSA PRIVATE KEY-----...

# MotherDuck source named "my_md"
EVIDENCE_SOURCE__MY_MD__token=eyJhbGc...

# Snowflake source named "my_sf"
EVIDENCE_SOURCE__MY_SF__account=myorg-myaccount
EVIDENCE_SOURCE__MY_SF__username=my_user
EVIDENCE_SOURCE__MY_SF__database=MY_DB
EVIDENCE_SOURCE__MY_SF__warehouse=MY_WH
EVIDENCE_SOURCE__MY_SF__authenticator=userpass
EVIDENCE_SOURCE__MY_SF__password=s3cr3t

Source names in env vars are uppercased. Nested options (e.g. ssl.sslmode) use double underscores for nesting.
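
The naming rule can be expressed as a small function. This is a sketch of the convention as described above (uppercased source name, double underscores for nesting); `sourceEnvVarName` is a hypothetical helper, and the dotted option path is an assumed input format.

```javascript
// Hypothetical helper: builds the environment variable name for a source
// option. Nested options are given as a dotted path, e.g. "ssl.sslmode".
function sourceEnvVarName(sourceName, optionPath) {
  const source = sourceName.toUpperCase();
  const option = optionPath.split('.').join('__');
  return `EVIDENCE_SOURCE__${source}__${option}`;
}

console.log(sourceEnvVarName('my_pg', 'host'));
console.log(sourceEnvVarName('my_pg', 'ssl.sslmode'));
```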

Build-Time Query Variables

Pass variables into source .sql queries at build time:

# .env (local) or CI environment
EVIDENCE_VAR__client_id=123

-- sources/my_source/customers.sql
select * from customers
where client_id = ${client_id}

Note: these variables only work in source queries (.sql files in sources/), not in markdown-embedded queries.
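
To make the substitution concrete, here is a rough model of what happens to the query text. It is an illustration only, not Evidence's implementation: `interpolateBuildVars` is hypothetical, and leaving unknown placeholders untouched is an assumption made for this sketch.

```javascript
// Hypothetical model of build-time substitution: each ${name} placeholder
// is replaced with the value of EVIDENCE_VAR__name from the environment.
function interpolateBuildVars(sql, env) {
  return sql.replace(/\$\{(\w+)\}/g, (placeholder, name) => {
    const value = env[`EVIDENCE_VAR__${name}`];
    // Assumption for this sketch: unknown variables are left as written.
    return value === undefined ? placeholder : value;
  });
}

const query = 'select * from customers\nwhere client_id = ${client_id}';
console.log(interpolateBuildVars(query, { EVIDENCE_VAR__client_id: '123' }));
```

Because the value is substituted as plain text, string values need their own quotes in the SQL.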

Increasing Memory for Large Sources

For sources with ~1M+ rows that cause heap errors:

# macOS / Linux
NODE_OPTIONS="--max-old-space-size=4096" npm run sources

# Windows
set NODE_OPTIONS=--max-old-space-size=4096 && npm run sources

JavaScript Data Sources — Full Reference

Add .js files to sources/[source_name]/. The file must export a variable named data.

Minimal example:

// sources/my_api/results.js
const response = await fetch('https://api.example.com/data');
const json = await response.json();
const data = json.results;

export { data };

With API key from environment variable:

// sources/my_api/results.js
// Environment variable must be prefixed with EVIDENCE_
// Set in .env locally; add to build environment for production
let key = process.env.EVIDENCE_API_KEY;
let url = 'https://api.example.com/data';

const response = await fetch(url, {
  headers: {
    'x-api-key': key
  }
});

const json = await response.json();
const data = json.results;

export { data };

Type support:

JS Type	Supported	Notes
String	Yes
Number	Yes
Boolean	Yes
Date	Yes
Array	Partial	Converted to comma-separated string, e.g. `[1,2,3]` → `"1,2,3"`
Object	No	Will be dropped or error
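
Given these limits, API responses with nested fields are easier to load if each row is flattened before it is exported. The sketch below is one possible approach: `flattenRow` is a hypothetical helper, and serializing objects to JSON strings is a choice made here, not something Evidence does for you.

```javascript
// Hypothetical helper: makes a row safe to export from a JavaScript source.
// Arrays become comma-separated strings; plain objects become JSON strings
// so their contents are kept as text rather than dropped.
function flattenRow(row) {
  const out = {};
  for (const [key, value] of Object.entries(row)) {
    if (Array.isArray(value)) {
      out[key] = value.join(',');
    } else if (value instanceof Date) {
      out[key] = value; // dates are supported as-is
    } else if (value !== null && typeof value === 'object') {
      out[key] = JSON.stringify(value);
    } else {
      out[key] = value; // strings, numbers, booleans, null
    }
  }
  return out;
}

console.log(flattenRow({ id: 1, tags: [1, 2, 3], meta: { a: 1 } }));
```

In a source file this would be applied as `json.results.map(flattenRow)` before the `data` export.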

Environment variable for JS sources:

# .env
EVIDENCE_API_KEY=your_api_key_here

Variables must be prefixed with EVIDENCE_. They are available via process.env.EVIDENCE_* inside .js source files.

Full BigQuery connection.yaml (Service Account)

name: my_bq
type: bigquery
options:
  project_id: my-gcp-project
  location: US
  authenticator: service-account
  client_email: svc@project.iam.gserviceaccount.com
  private_key: "-----BEGIN RSA PRIVATE KEY-----\n..."
  enable_connected_sheets: false

Full Snowflake connection.yaml (Username/Password)

name: my_snowflake
type: snowflake
options:
  account: myorg-myaccount
  username: my_user
  database: MY_DB
  warehouse: MY_WH
  role: MY_ROLE
  schema: PUBLIC
  authenticator: userpass
  password: s3cr3t

Full Snowflake connection.yaml (Key-Pair JWT)

name: my_snowflake
type: snowflake
options:
  account: myorg-myaccount
  username: my_user
  database: MY_DB
  warehouse: MY_WH
  authenticator: snowflake_jwt
  private_key: "-----BEGIN PRIVATE KEY-----\n..."
  passphrase: my_key_passphrase

Full MSSQL connection.yaml (Azure Service Principal)

name: my_mssql
type: mssql
options:
  server: myserver.database.windows.net
  database: my_db
  port: 1433
  authenticationType: azure-active-directory-service-principal-secret
  spclientid: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  spclientsecret: my-client-secret
  sptenantid: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  encrypt: true
  trust_server_certificate: false
  connection_timeout: 15000
  request_timeout: 15000

Google Sheets — Full connection.yaml

name: my_sheets
type: gsheets
options: {}
sheets:
  sales_data: 1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgVE2upms
  hr_data: 2CyiNWt1YSB6nGNLwCeCajhXhVrqumcmt85PhWF3vqnt

Query:

-- Tab named "Q1 Results" in sales_data workbook
select * from my_sheets.sales_data_Q1_Results

Trino — Full connection.yaml (Starburst Galaxy)

name: my_trino
type: trino
options:
  host: myorg-mycluster.galaxy.starburst.io
  port: 443
  user: <YOUR_EMAIL>/accountadmin
  password: my_starburst_password
  ssl: true
  catalog: tpch
  schema: tiny
  engine: trino

Source Directory Layout Reference

sources/
  my_pg_source/
    connection.yaml          # non-secret config (committed to git)
    connection.options.yaml  # secrets (generated by settings UI, gitignored)
    orders.sql               # → queryable as my_pg_source.orders
    customers.sql            # → queryable as my_pg_source.customers

  my_duckdb/
    connection.yaml
    my_database.duckdb       # DB file must live here

  my_csv/
    connection.yaml
    sales_2024.csv           # → queryable as my_csv.sales_2024

  my_api/
    connection.yaml
    results.js               # → queryable as my_api.results

Parallelisation

Caution — writes to connection.yaml and triggers npm run sources. Do not run alongside other data-source agents on the same project.
