From redshift-comment-mcp
Reconstruct ACTUAL table-to-table data flow from Redshift query history — catches ad-hoc, BI-tool (Tableau / Looker), and manual-fix usage dbt manifest cannot see. Read-only. Use when auditing who reads / writes a table, or before deprecating one. Do NOT use for dbt-internal lineage (use dbt manifest) or real-time monitoring. Triggers: /redshift-lineage-from-stl / actual lineage / who reads / Tableau usage / 實際 lineage / 誰在讀 / クエリ履歴.
How this skill is triggered — by the user, by Claude, or both
Slash command
/redshift-comment-mcp:redshift-lineage-from-stlThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Reconstructs **actual** lineage by mining query history + sqlglot
Reconstructs actual lineage by mining query history + sqlglot parsing. Ships a helper script — the only one in this plugin that does, justified because LLM-driven SQL parsing is unreliable.
STL retention warning (mention upfront in chat):
SYS_QUERY_HISTORY.SYSLOG ACCESS UNRESTRICTED or admin.| Form | Behavior |
|---|---|
--since <date> or --since 7d | window (clamped to STL retention) |
--table <s>.<t> | scope to queries touching this table |
--user <name> | scope to one user |
--limit N | cap pulled queries (default 5000) |
--include-system | keep pg_catalog/STL/etc (default: filter out) |
--output mermaid | also emit Mermaid graph |
Detect cluster type: execute_sql("SELECT version()"). Look
for "Redshift Serverless" → use SYS_QUERY_HISTORY; else STL.
Pull queries — provisioned (STL_QUERY + STL_QUERYTEXT):
WITH recent AS (
SELECT q.query, q.userid, q.starttime,
u.usename AS user_name
FROM STL_QUERY q LEFT JOIN PG_USER u ON u.usesysid = q.userid
WHERE q.starttime >= '<since>'::timestamp AND q.aborted = 0
/* if --user */ AND u.usename = '<user>'
ORDER BY q.starttime DESC LIMIT <limit>
)
SELECT r.query, r.user_name, r.starttime,
LISTAGG(qt.text, '') WITHIN GROUP (ORDER BY qt.sequence) AS sql_text
FROM recent r JOIN STL_QUERYTEXT qt ON qt.query = r.query
GROUP BY r.query, r.user_name, r.starttime;
Caveat: LISTAGG result is VARCHAR(65535) — long queries truncate.
Mention to user; truncated SQL parses worse.
Serverless (SYS_QUERY_HISTORY — full SQL is one column already):
SELECT query_id AS query, user_id AS user_name,
start_time AS starttime, query_text AS sql_text
FROM SYS_QUERY_HISTORY
WHERE start_time >= '<since>'::timestamp
AND status NOT IN ('failed', 'aborted')
/* if --user */ AND user_id = '<user>'
ORDER BY start_time DESC LIMIT <limit>;
Save NDJSON + run helper: write each query as a JSON line
{query, user, starttime, sql} to $TMPDIR/queries.ndjson, then:
cp "<SKILL_DIR>/assets/parse_stl_lineage.py" "$TMPDIR/"
"$TMPDIR/parse_stl_lineage.py" --input "$TMPDIR/queries.ndjson" \
--output "$TMPDIR/lineage.json" \
[--filter-table <s>.<t>] [--include-system]
Script handles multi-statement SQL (sqlglot.parse), CTE-name
filtering, system-table filtering, per-edge aggregation. PEP 723 +
uv run --script resolves sqlglot automatically.
Note on sql_text escaping. Some MCP server / transport layers
serialize sql_text with literal \n / \r / \t (two-char escape
sequences) instead of real control characters. The helper decodes
these defensively (idempotent), so writing the field verbatim is
fine. If you inspect the NDJSON manually and see literal \n in the
SQL, that's the cause — not a bug in your write.
Read lineage.json and render. Surface parse_errors verbatim.
Adjacency table (chat, primary):
| from | → | to | queries | distinct users | last seen |
| dbt_marts.fct_orders | → | (read by ad-hoc) | 243 | 12 | 2026-05-03 11:42:08 |
| dbt_staging.stg_orders | → | dbt_marts.fct_orders | 31 | 1 (dbt) | 2026-05-03 04:00:12 |
(read by ad-hoc) for SELECTs that don't write to a target.
Mermaid (only with --output mermaid):
flowchart LR
raw_orders.events --> dbt_staging.stg_orders
dbt_staging.stg_orders --> dbt_marts.fct_orders
dbt_marts.fct_orders -. ad-hoc 243× .-> ((readers))
Footer: "Mined N queries from to . Found M edges across
K tables. P parse errors. Earliest STL row: (warn if --since
exceeds retention)."
sqlglot helper script.STL_QUERY (provisioned) and SYS_QUERY_HISTORY (serverless) in one report — different schemas; pick one based on version() detection.LISTAGG hits VARCHAR(65535) — partial parses produce phantom edges. Surface the truncation visibly.parse_errors — list them verbatim so the user can audit what was missed and decide whether the result is trustworthy.| Condition | Behavior |
|---|---|
| STL access denied | _error: stl_access_denied: <verbatim> |
--since > earliest STL row | warn, proceed |
| Helper script missing | _error: helper_script_missing |
| Helper script crashed | _error: parser_failed: <stderr> |
uv not available | _error: uv_unavailable |
| Zero queries in window | not error — render empty + hint widen window |
| > 50% queries failed parse | warn loudly, proceed with what parsed |
| Need | Use |
|---|---|
| dbt internal model lineage | dbt manifest / dbt docs |
npx claudepluginhub kouko/redshift-comment-mcp --plugin redshift-comment-mcpGuides creation, editing, and verification of skills for AI coding agents using test-driven development with subagent scenarios. Use when authoring or debugging skills.