Use when you are working on an empirical research project and need to track, log, query, or decide what to do with regression results. Invoke when the user asks what results are significant, asks what goes in the paper vs appendix vs dropped, wants a results ledger or narrative, needs to log a new estimate, wants to update the status of a result, needs to see the full picture before writing a section, or wants to check the pre-trend / Honest DiD status of any result. Works across quantitative social-science and economics projects. At the start of each empirical writing session, automatically run `status` and `show --in_paper tbd` without being asked.
Slash command: `/results-db:results-db`
A structured ledger for every regression estimate in a paper. One CSV per project. Query it to decide inclusion, generate the "story" narrative, and never forget a result again.
Empirically heavy papers accumulate hundreds of estimates across specifications, estimators, heterogeneity splits, and robustness checks. Without a ledger, individual results are easy to forget.
The results DB is the single source of truth. Every time you run an analysis, log it. Every time you open a paper-writing session, start with `story` or `status`.
Initialize a new project database (creates results/results_database.csv):
python ~/.claude/skills/results-db/scripts/results_db.py init --project /path/to/project
The DB file path defaults to <project>/results/results_database.csv. Pass --db to override.
Each row is one estimate (one DV × one sample × one estimator):
| Column | Type | Description |
|---|---|---|
| id | int | Auto-incremented row ID |
| section | str | Paper section: market / package / mechanism / robustness / heterogeneity / welfare |
| hypothesis | str | Which hypothesis tested (e.g. H1, H2a, free text) |
| estimator | str | C&S / TWFE / S&A / SDiD / OLS / ITS / RI / Honest DiD etc. |
| dv | str | Outcome variable column name (machine-readable) |
| dv_label | str | Human-readable outcome label |
| sample | str | Sample description (Full / DepQ4 / Active / Data Science etc.) |
| att | float | Point estimate (ATT or coefficient) |
| se | float | Standard error |
| p | float | p-value |
| sig | str | Stars: *** / ** / * / n.s. |
| n | int | Sample size |
| ci_lo | float | Lower 95% CI (optional) |
| ci_hi | float | Upper 95% CI (optional) |
| in_paper | str | main / appendix / dropped / tbd |
| paper_version | str | Which paper version (e.g. JPE, MS IS, WP) |
| referee_round | str | original / R1 / R2 etc. |
| language | str | Execution language: Python / R / Stata |
| model_spec | str | Model spec summary (e.g. C&S DiD, pkg+month FE, HC1 SEs) |
| pre_trend_test | str | Pre-trend / placebo test result (e.g. RI p=0.000, event study F p=0.43) |
| pre_trend_pass | str | pass / fail / conditional: did parallel trends hold? |
| honest_did_m | str | Honest DiD breakdown M value (e.g. 0.0000, 0.025) |
| honest_did_pass | str | pass / fail / conditional: did Honest DiD sensitivity hold? |
| table_file | str | Path to LaTeX table (relative to project root) |
| figure_file | str | Path to figure PDF (relative to project root) |
| source_csv | str | CSV file the estimate was read from |
| notes | str | Caveats: failed placebo, RI issues, small N, Honest DiD breakdown etc. |
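Because the ledger is a plain CSV, it can also be read directly from a script. The sketch below is only an illustration of the schema above, not the skill's own loader: it uses a subset of the documented columns, the second row's values are made up, and `load_ledger` and `NUMERIC` are hypothetical names.

```python
import csv
import io

# Hypothetical two-row ledger using a subset of the documented columns.
# The second row's numbers are invented for illustration.
SAMPLE_CSV = """id,section,estimator,dv,sample,att,se,p,sig,n,in_paper
1,package,C&S,delta_num_as_dep,DepQ4,0.2194,0.0789,0.007,***,24336,main
2,market,TWFE,fork_hhi,Full,-0.0120,0.0300,0.690,n.s.,5120,tbd
"""

# Columns typed int or float in the schema table (subset used here).
NUMERIC = {"id": int, "n": int, "att": float, "se": float, "p": float}


def load_ledger(handle):
    """Read ledger rows, casting the numeric columns listed in NUMERIC."""
    rows = []
    for raw in csv.DictReader(handle):
        row = dict(raw)
        for col, cast in NUMERIC.items():
            if row.get(col):  # leave blank cells as empty strings
                row[col] = cast(row[col])
        rows.append(row)
    return rows


ledger = load_ledger(io.StringIO(SAMPLE_CSV))
undecided = [r["dv"] for r in ledger if r["in_paper"] == "tbd"]
print(undecided)
```

For a real project, the `io.StringIO` handle would be replaced by an open file on `results/results_database.csv`.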
All commands accept --project /path/to/project (defaults to current directory) or --db /explicit/path/to/db.csv.
python ~/.claude/skills/results-db/scripts/results_db.py init --project .
# All results
python ~/.claude/skills/results-db/scripts/results_db.py show --project .
# Filter by section and minimum significance
python ~/.claude/skills/results-db/scripts/results_db.py show --project . --section mechanism --sig "*"
# What still needs a decision?
python ~/.claude/skills/results-db/scripts/results_db.py show --project . --in_paper tbd
# Show only main-text results
python ~/.claude/skills/results-db/scripts/results_db.py show --project . --in_paper main
# Show only Python-executed results
python ~/.claude/skills/results-db/scripts/results_db.py show --project . --language Python
# Show results where parallel trends failed
python ~/.claude/skills/results-db/scripts/results_db.py show --project . --pre_trend_pass fail
Supports: --section, --sig (minimum: *, **, ***), --in_paper, --estimator, --dv, --sample, --language, --pre_trend_pass, --honest_did_pass, --paper_version, --referee_round
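One way to read "minimum significance" is as a rank comparison on the stars. The following is a minimal sketch under that assumption; `SIG_RANK` and `at_least` are hypothetical names, and the script's actual filter may differ.

```python
# Assumed ordering of the sig column, weakest to strongest.
SIG_RANK = {"n.s.": 0, "*": 1, "**": 2, "***": 3}


def at_least(rows, min_sig):
    """Keep rows whose stars are at least as strong as min_sig."""
    threshold = SIG_RANK[min_sig]
    return [r for r in rows if SIG_RANK.get(r["sig"], 0) >= threshold]


# Illustrative rows only.
rows = [
    {"dv": "a", "sig": "***"},
    {"dv": "b", "sig": "*"},
    {"dv": "c", "sig": "n.s."},
]

one_star = [r["dv"] for r in at_least(rows, "*")]
two_star = [r["dv"] for r in at_least(rows, "**")]
print(one_star, two_star)
```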
python ~/.claude/skills/results-db/scripts/results_db.py story --project .
python ~/.claude/skills/results-db/scripts/results_db.py story --project . --section package
python ~/.claude/skills/results-db/scripts/results_db.py story --project . --forest # ASCII forest plot
Groups by section → hypothesis → outcome. Shows ATT, sig, in_paper status, language, and (when present) pre-trend test result and Honest DiD breakdown M. Good for sending to co-authors or starting a writing session.
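The section → hypothesis → outcome grouping can be pictured as a nested dictionary. This is a sketch of the idea rather than the script's implementation; `group_for_story` is a hypothetical helper and the rows are illustrative.

```python
from collections import defaultdict


def group_for_story(rows):
    """Nest rows as section -> hypothesis -> dv_label -> [rows]."""
    tree = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for r in rows:
        tree[r["section"]][r["hypothesis"]][r["dv_label"]].append(r)
    return tree


# Illustrative rows; the TWFE numbers are made up.
rows = [
    {"section": "package", "hypothesis": "H1", "dv_label": "Downstream flow",
     "estimator": "C&S", "att": 0.2194, "sig": "***", "in_paper": "main"},
    {"section": "package", "hypothesis": "H1", "dv_label": "Downstream flow",
     "estimator": "TWFE", "att": 0.1800, "sig": "**", "in_paper": "appendix"},
]

tree = group_for_story(rows)
for section, hypotheses in tree.items():
    print(section)
    for hypothesis, outcomes in hypotheses.items():
        print(f"  {hypothesis}")
        for label, estimates in outcomes.items():
            print(f"    {label}")
            for e in estimates:
                print(f"      {e['estimator']}: ATT={e['att']:+.4f} "
                      f"{e['sig']} [{e['in_paper']}]")
```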
python ~/.claude/skills/results-db/scripts/results_db.py status --project .
Shows counts by in_paper value and by section. Quick health check before writing.
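The same tallies are easy to reproduce by hand when needed. A minimal sketch with illustrative rows, assuming only the `in_paper` and `section` columns from the schema:

```python
from collections import Counter

# Illustrative rows; only the two columns being counted are included.
rows = [
    {"section": "package", "in_paper": "main"},
    {"section": "package", "in_paper": "tbd"},
    {"section": "robustness", "in_paper": "appendix"},
    {"section": "market", "in_paper": "tbd"},
]

by_status = Counter(r["in_paper"] for r in rows)
by_section = Counter(r["section"] for r in rows)

for status, count in by_status.most_common():
    print(f"{status:<10} {count}")
for section, count in by_section.most_common():
    print(f"{section:<12} {count}")
```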
python ~/.claude/skills/results-db/scripts/results_db.py add --project . \
--section package --hypothesis H1 --estimator "C&S" \
--dv delta_num_as_dep --dv_label "Δ Downstream (monthly flow)" \
--sample DepQ4 --att 0.2194 --se 0.0789 --p 0.007 --sig "***" --n 24336 \
--in_paper main \
--language Python --model_spec "C&S DiD, pkg+month FE, HC1 SEs" \
--pre_trend_test "RI p=0.000" --pre_trend_pass pass \
--honest_did_m "0.025" --honest_did_pass pass \
--table_file "results/tables/downstream_table_all_estimators_by_dep_quartile_py.tex" \
--notes ""
All fields except att, dv, sample are optional but encouraged.
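When only `att`, `se`, and `p` are at hand, the `sig`, `ci_lo`, and `ci_hi` fields can be derived before logging. The sketch below assumes the conventional star thresholds (0.01 / 0.05 / 0.10) and a normal-approximation 95% interval; neither is stated in this document, and `stars` and `ci95` are hypothetical helpers.

```python
def stars(p):
    """Map a p-value to stars, assuming 0.01 / 0.05 / 0.10 cutoffs."""
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.10:
        return "*"
    return "n.s."


def ci95(att, se, z=1.96):
    """Normal-approximation 95% confidence interval: att +/- z * se."""
    return att - z * se, att + z * se


# Values taken from the add example above.
sig = stars(0.007)
ci_lo, ci_hi = ci95(0.2194, 0.0789)
print(sig, round(ci_lo, 4), round(ci_hi, 4))
```

If the estimator reports its own interval (for example a bootstrap CI), that value is the one to log.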
# By ID
python ~/.claude/skills/results-db/scripts/results_db.py update --project . --id 42 --in_paper appendix
# By DV + sample (updates all matching rows)
python ~/.claude/skills/results-db/scripts/results_db.py update --project . \
--dv fork_hhi --sample Full --in_paper dropped \
--pre_trend_pass fail \
--notes "Placebo fails p=0.004, pre-trend non-parallel"
# Update Honest DiD result
python ~/.claude/skills/results-db/scripts/results_db.py update --project . \
--id 17 --honest_did_m "0.031" --honest_did_pass conditional
Updatable fields: in_paper, notes, section, hypothesis, paper_version, referee_round, language, model_spec, pre_trend_test, pre_trend_pass, honest_did_m, honest_did_pass, table_file, figure_file
python ~/.claude/skills/results-db/scripts/results_db.py export --project . --format md
python ~/.claude/skills/results-db/scripts/results_db.py export --project . --format latex --in_paper main
python ~/.claude/skills/results-db/scripts/results_db.py lint --project .
Checks for:
main and dropped
python ~/.claude/skills/results-db/scripts/results_db.py compare --project . --dv delta_num_as_dep
python ~/.claude/skills/results-db/scripts/results_db.py compare --project . --dv log_fork_count --sample DepQ4
Shows all estimators for a given DV × sample in a table. Useful for checking TWFE vs C&S agreement.
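A rough sketch of the comparison, plus a sign-agreement check of the kind one might run across estimators. The rows and their numbers are invented for illustration, and `compare` and `signs_agree` here are hypothetical functions, not the script's internals.

```python
def compare(rows, dv, sample=None):
    """Return every estimate for one DV, optionally limited to one sample."""
    picked = [r for r in rows
              if r["dv"] == dv and (sample is None or r["sample"] == sample)]
    return sorted(picked, key=lambda r: (r["sample"], r["estimator"]))


def signs_agree(rows):
    """True when all point estimates fall on the same side of zero."""
    return len({r["att"] > 0 for r in rows}) <= 1


# Made-up estimates for illustration.
rows = [
    {"dv": "log_fork_count", "sample": "DepQ4", "estimator": "TWFE", "att": 0.051},
    {"dv": "log_fork_count", "sample": "DepQ4", "estimator": "C&S", "att": 0.064},
    {"dv": "log_fork_count", "sample": "Full", "estimator": "C&S", "att": -0.010},
]

subset = compare(rows, "log_fork_count", sample="DepQ4")
for r in subset:
    print(f"{r['estimator']:<6} {r['att']:+.3f}")
print("signs agree:", signs_agree(subset))
```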
# See what's new without writing
python ~/.claude/skills/results-db/scripts/results_db.py sync --project . --source-dir results/tables/
# Apply new estimates to DB
python ~/.claude/skills/results-db/scripts/results_db.py sync --project . --source-dir results/tables/ --apply
# Also scan Stargazer .tex files
python ~/.claude/skills/results-db/scripts/results_db.py sync --project . --source-dir results/tables/ --include-tex
Auto-detects CSV format (generic, modelsummary, statsmodels) and Stargazer .tex. Reports new estimates not yet in DB and existing estimates where the value has changed.
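The new-versus-changed report can be thought of as a keyed diff between the ledger and freshly parsed estimates. The sketch below assumes an estimate is identified by (dv, sample, estimator) and that values are compared within a small tolerance; both are assumptions, and `diff_estimates` is a hypothetical name.

```python
import math


def diff_estimates(db_rows, source_rows, tol=1e-6):
    """Split source rows into new estimates and ones whose ATT changed."""
    def key(r):
        # Assumed identity of an estimate: DV x sample x estimator.
        return (r["dv"], r["sample"], r["estimator"])

    known = {key(r): r for r in db_rows}
    new, changed = [], []
    for s in source_rows:
        old = known.get(key(s))
        if old is None:
            new.append(s)
        elif not math.isclose(old["att"], s["att"], abs_tol=tol):
            changed.append((old, s))
    return new, changed


# Made-up rows for illustration.
db_rows = [
    {"dv": "fork_hhi", "sample": "Full", "estimator": "TWFE", "att": -0.012},
]
source_rows = [
    {"dv": "fork_hhi", "sample": "Full", "estimator": "TWFE", "att": -0.015},
    {"dv": "fork_hhi", "sample": "Full", "estimator": "C&S", "att": -0.009},
]

new, changed = diff_estimates(db_rows, source_rows)
print(len(new), "new;", len(changed), "changed")
```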
python ~/.claude/skills/results-db/scripts/results_db.py check --project .
For every row with a source_csv, re-reads the source and verifies the ATT still matches. Reports drift.
python ~/.claude/skills/results-db/scripts/results_db.py history --project . --id 42
python ~/.claude/skills/results-db/scripts/results_db.py history --project . --dv log_fork_count
Shows every field change ever made to a result (appended to results_history.csv by all update calls).
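An append-only change log of this kind might look like the sketch below. The column layout of `results_history.csv` is not documented here, so `HISTORY_FIELDS` is an assumed layout and `apply_update` is a hypothetical helper; an in-memory buffer stands in for the file.

```python
import csv
import io
from datetime import datetime, timezone

# Assumed history layout; the real file's columns may differ.
HISTORY_FIELDS = ["timestamp", "id", "field", "old", "new"]


def apply_update(row, changes, history_writer):
    """Apply field changes to a row, logging each value that actually changes."""
    for field, new_value in changes.items():
        old_value = row.get(field, "")
        if old_value != new_value:
            history_writer.writerow({
                "timestamp": datetime.now(timezone.utc).isoformat(),
                "id": row["id"],
                "field": field,
                "old": old_value,
                "new": new_value,
            })
            row[field] = new_value
    return row


buf = io.StringIO()  # stands in for an append-mode file handle
writer = csv.DictWriter(buf, fieldnames=HISTORY_FIELDS)
writer.writeheader()

row = {"id": 42, "in_paper": "tbd", "notes": ""}
apply_update(row, {"in_paper": "appendix", "notes": ""}, writer)
print(buf.getvalue())
```

Only `in_paper` is logged in this run, because `notes` is unchanged.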
python ~/.claude/skills/results-db/scripts/results_db.py template --project . --paper-name mypaper
Generates a populate_mypaper.py skeleton with the r() helper pre-configured.
Claude reads this skill description and invokes it automatically when you are working on an empirical paper. You do not need to ask for it explicitly.
At the start of every empirical paper session Claude will automatically run status and show --in_paper tbd without being asked.
Trigger phrases — querying results
Trigger phrases — pre-trends and validation
Trigger phrases — narrative and writing
Trigger phrases — logging and updating
Trigger phrases — referee and revision
Trigger phrases — quality control
Trigger phrases — overview and status
python ~/.claude/skills/results-db/scripts/results_db.py status --project .
python ~/.claude/skills/results-db/scripts/results_db.py show --project . --in_paper tbd
- `add` each primary estimate (C&S or primary estimator) with `--language`, `--model_spec`, `--pre_trend_test`, `--pre_trend_pass`
- `add` robustness variants with `--in_paper appendix`
- Record caveats in `--notes`; set `--pre_trend_pass fail` or `--honest_did_pass fail` when checks don't pass
- Use `--in_paper tbd` until you decide

python ~/.claude/skills/results-db/scripts/results_db.py lint --project .
Fix all errors; review all warnings.
- Fill in `notes` when anything is imperfect: small N, failed placebo, RI failure, borderline pre-trend.
- Record `pre_trend_test` and `pre_trend_pass` for every DiD result: pass / fail / conditional.
- Record `honest_did_m` and `honest_did_pass` whenever you run Rambachan & Roth sensitivity.
- Record `language` and `model_spec` so co-authors can reproduce any result from the ledger alone.
- Keep results that stay out of the main text (`in_paper dropped` or `appendix`). They matter for the story.
- Review past changes with `history`.

| section | what goes here |
|---|---|
| market | Market-level DiD (HHI, entropy, etc.) |
| package | Package-level DiD by dep/pop/age quartile |
| mechanism | Attention market, dep composition, community detection |
| heterogeneity | Category, company vs community, activity splits |
| robustness | Placebos, alt dates, RI, Honest DiD, PPML |
| welfare | Weitzman variety loss, ITS magnitudes |
| replication | Rust/Haskell, pooled 4-eco |
See references/publishing.md for instructions on packaging and sharing this skill on GitHub.
npx claudepluginhub batikas/results-db-skill --plugin results-db