results-db skill

Structured results ledger for empirical research papers.
This skill keeps a running ledger of your regression estimates so you do not have to remember which specification produced which result. It is designed for quantitative social-science and economics projects with many specifications, heterogeneity splits, robustness checks, and revision cycles.
In practice, you log each estimate once, label the outcome and sample, record the estimator and validation checks, and then use the ledger to decide what belongs in the main text, what goes in the appendix, and what should be dropped.
For example, if you estimate the same dependent variable with the Callaway and Sant'Anna (C&S) estimator and with two-way fixed effects (TWFE), and also run a placebo check, the skill stores each result as a separate row, lets you compare them side by side, and keeps the narrative tied to the recorded evidence rather than to memory.
Requirements
- Python 3.10 or newer
- No third-party Python dependencies for the core CLI
Installation
This repo ships both the plugin manifest and a one-plugin Claude marketplace catalog, so it can be installed through the standard Claude marketplace flow.
Add the marketplace from GitHub:
claude plugin marketplace add batikas/results-db-skill
Install the plugin:
claude plugin install results-db@results-db-skill
For local development and testing:
claude --plugin-dir .
What this skill does
- Logs one row per estimate into a project CSV database
- Filters and summarizes results by section, estimator, sample, significance, and validation status
- Tracks which results belong in the main text, which go in the appendix, and which should be dropped
- Records pre-trend and Honest DiD checks
- Exports summary tables and checks database integrity before submission
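To make the filtering and integrity-check ideas concrete, here is a minimal Python sketch that works on ledger rows held as plain dictionaries. It is not the skill's own implementation: the `estimator` and `section` column names, the `filter_rows` and `check_integrity` helpers, and the set of allowed `in_paper` values are assumptions for illustration, and the real schema in `results_db.py` may differ.

```python
# Hypothetical allowed placements; the real ledger may use other values.
ALLOWED_PLACEMENTS = {"main", "appendix", "dropped", "tbd"}


def filter_rows(rows, **criteria):
    """Return the rows whose fields equal every given criterion."""
    return [r for r in rows if all(r.get(k) == v for k, v in criteria.items())]


def check_integrity(rows):
    """Collect readable problems instead of stopping at the first one."""
    problems = []
    for i, row in enumerate(rows, start=1):
        if row.get("in_paper") not in ALLOWED_PLACEMENTS:
            problems.append(f"row {i}: unknown in_paper value {row.get('in_paper')!r}")
        if not row.get("estimator"):
            problems.append(f"row {i}: missing estimator")
    return problems


# Illustrative rows with hypothetical column names.
rows = [
    {"estimator": "C&S", "section": "main_effects", "in_paper": "main"},
    {"estimator": "TWFE", "section": "robustness", "in_paper": "appendix"},
    {"estimator": "", "section": "placebo", "in_paper": "maybe"},
]

twfe_rows = filter_rows(rows, estimator="TWFE")
problems = check_integrity(rows)
print(len(twfe_rows), problems)
```

Returning a list of problems rather than raising on the first one lets a pre-submission check report every broken row in a single pass.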
Example
Suppose you estimate the effect of a treatment on a dependent variable with three specifications:
- a main C&S estimate
- a TWFE robustness check
- a placebo that should not show an effect
Results-db stores each of those estimates as a separate row. You can mark the main result as main, keep the robustness check in appendix, move the placebo to dropped if it fails validation, and write the paper from the same ledger instead of chasing outputs across files.
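The three-specification example can be sketched as a small CSV ledger. This is an illustration under stated assumptions, not the skill's actual file format: the `result_id`, `outcome`, `estimator`, and `role` columns are hypothetical, and only the `in_paper` field and its `main`, `appendix`, `dropped`, and `tbd` values are taken from this README.

```python
import csv
import io

# Hypothetical column layout; the real ledger schema may differ.
FIELDS = ["result_id", "outcome", "estimator", "role", "in_paper"]

estimates = [
    {"result_id": "r1", "outcome": "y", "estimator": "C&S",
     "role": "main estimate", "in_paper": "main"},
    {"result_id": "r2", "outcome": "y", "estimator": "TWFE",
     "role": "robustness check", "in_paper": "appendix"},
    {"result_id": "r3", "outcome": "y", "estimator": "C&S",
     "role": "placebo", "in_paper": "dropped"},
]

# Write one row per estimate to an in-memory CSV, then read it back.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(estimates)

buffer.seek(0)
ledger = list(csv.DictReader(buffer))

# Group result ids by where they are placed in the paper.
placement = {}
for row in ledger:
    placement.setdefault(row["in_paper"], []).append(row["result_id"])

print(placement)
```

Keeping one row per estimate means a placement change is a single-field update, and the paper's structure can be read straight off the grouped ids.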
When to use it
Use this skill when you need to:
- Decide which estimates belong in the paper
- Check what is still tbd
- Summarize the story of your results
- Add new estimates after running analysis
- Update a result after changing specifications
- Audit whether parallel trends or Honest DiD checks passed
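Two of the tasks above, finding results still marked `tbd` and auditing validation checks, amount to simple scans over the ledger. The sketch below assumes hypothetical `result_id` and `pretrend_pass` columns and a hypothetical `audit` helper; the skill's own commands and column names may differ.

```python
def audit(rows):
    """Summarize pending placements and main-text results lacking a passed pre-trend check."""
    pending = [r["result_id"] for r in rows if r["in_paper"] == "tbd"]
    unvalidated_main = [
        r["result_id"]
        for r in rows
        if r["in_paper"] == "main" and r["pretrend_pass"] != "yes"
    ]
    return {"pending": pending, "unvalidated_main": unvalidated_main}


# Illustrative ledger rows; pretrend_pass is an assumed column.
rows = [
    {"result_id": "r1", "in_paper": "main", "pretrend_pass": "yes"},
    {"result_id": "r2", "in_paper": "main", "pretrend_pass": "no"},
    {"result_id": "r3", "in_paper": "tbd", "pretrend_pass": ""},
    {"result_id": "r4", "in_paper": "appendix", "pretrend_pass": "yes"},
]

report = audit(rows)
print(report)
```

In this illustration, `r2` is flagged because it sits in the main text without a passed pre-trend check, and `r3` is listed as still awaiting a placement decision.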
Workflow
flowchart LR
A[Analyze] --> B[Log results]
B --> C[Ledger]
C --> D[Review]
D --> E[Place results]
E --> F[Validate]
F --> G[Package]
G --> H[Release]
The workflow is intentionally simple: analyze, log, review, place, validate, package, release.
Repository layout
results-db-skill/
├── README.md
├── .gitignore
├── LICENSE
├── VERSION
├── CITATION.cff
├── RELEASE_CHECKLIST.md
├── scripts/
│   └── package_skill.py
├── examples/
│   ├── README.md
│   ├── example_estimates.csv
│   └── example_results_database.csv
├── tests/
│   └── test_package_skill.py
├── .claude-plugin/
│   └── plugin.json
├── .github/
│   ├── pull_request_template.md
│   ├── ISSUE_TEMPLATE/
│   │   ├── bug_report.md
│   │   └── feature_request.md
│   └── workflows/
│       ├── ci.yml
│       └── release.yml
├── skills/
│   └── results-db/
│       ├── SKILL.md
│       └── scripts/
│           ├── results_db.py
│           └── populate_example.py
└── references/
    └── publishing.md
If you are using the repo as a local plugin during development, the Quick start commands below stay the same, but drop the skills/results-db/ prefix from the script path.
Quick start
- Initialize a database for your project:
python skills/results-db/scripts/results_db.py init --project .
- Check what is already in the paper and what is still pending:
python skills/results-db/scripts/results_db.py status --project .
python skills/results-db/scripts/results_db.py show --project . --in_paper tbd
- Log a new estimate: