From great-econometrics
Econometrics skill for Difference-in-Differences (DID) analysis. Activates when the user asks about: "difference in differences", "DID", "DiD", "diff-in-diff", "parallel trends", "treatment group", "control group", "pre-treatment", "post-treatment", "policy evaluation", "natural experiment", "staggered DID", "event study regression", "two-way fixed effects DID", "callaway santanna", "sun and abraham", "双重差分", "倍差法", "平行趋势", "处理组", "对照组", "政策评估", "事件研究", "交错DID", "渐进处理"
Slash command
/great-econometrics:did-analysis
This skill guides complete DID analysis: from assumption validation and model specification to staggered treatment designs and event study regressions. Designed for policy evaluation and natural experiment settings.
DID compares the change in outcomes for a treatment group before and after treatment to the change for a control group over the same period.
DID Estimator = (Ȳ_treat,post − Ȳ_treat,pre) − (Ȳ_ctrl,post − Ȳ_ctrl,pre)
Key Assumption (Parallel Trends): In the absence of treatment, the treatment group's outcome would have evolved in parallel with the control group.
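The four-means estimator above can be computed directly. The following is a minimal sketch using only the Python standard library; the toy rows and the helper names `cell_mean` and `did_2x2` are illustrative assumptions, not part of the skill.

```python
from statistics import mean

# Hypothetical toy data: (treat, post, y). Values are illustrative only.
rows = [
    (1, 0, 10.0), (1, 0, 12.0),   # treated, pre
    (1, 1, 17.0), (1, 1, 19.0),   # treated, post
    (0, 0, 8.0),  (0, 0, 10.0),   # control, pre
    (0, 1, 11.0), (0, 1, 13.0),   # control, post
]

def cell_mean(rows, treat, post):
    """Mean outcome for one (treat, post) cell."""
    return mean(y for t, p, y in rows if t == treat and p == post)

def did_2x2(rows):
    """(treated post - treated pre) - (control post - control pre)."""
    treated_change = cell_mean(rows, 1, 1) - cell_mean(rows, 1, 0)
    control_change = cell_mean(rows, 0, 1) - cell_mean(rows, 0, 0)
    return treated_change - control_change

did_estimate = did_2x2(rows)
print(did_estimate)
```

In this toy example the treated group changes by 7 and the control group by 3, so the estimate is the difference between the two changes. It only has a causal reading if the parallel trends assumption holds.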
Y_it = β₀ + β₁·Treat_i + β₂·Post_t + β₃·(Treat_i × Post_t) + ε_it
β₃ = DID estimate (ATT)
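To see why β₃ matches the four-means estimator, take conditional expectations of the regression equation in each cell. This short derivation assumes E[ε_it | Treat_i, Post_t] = 0, which is an assumption added here for the sketch.

```latex
\begin{align*}
E[Y \mid \text{Treat}=0, \text{Post}=0] &= \beta_0 \\
E[Y \mid \text{Treat}=1, \text{Post}=0] &= \beta_0 + \beta_1 \\
E[Y \mid \text{Treat}=0, \text{Post}=1] &= \beta_0 + \beta_2 \\
E[Y \mid \text{Treat}=1, \text{Post}=1] &= \beta_0 + \beta_1 + \beta_2 + \beta_3 \\[4pt]
\text{DID} &= \bigl[(\beta_0 + \beta_1 + \beta_2 + \beta_3) - (\beta_0 + \beta_1)\bigr]
            - \bigl[(\beta_0 + \beta_2) - \beta_0\bigr] \\
           &= \beta_3
\end{align*}
```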
# Python — 2×2 DID with TWFE
import statsmodels.formula.api as smf
from linearmodels.panel import PanelOLS
# treat_post is the interaction treat * post, assumed to already exist in df
# Simple 2x2
model = smf.ols('y ~ treat + post + treat_post', data=df).fit(cov_type='HC3')
# TWFE with entity and time FE (preferred); treat and post are absorbed by the FE
df_panel = df.set_index(['entity_id', 'year'])
twfe = PanelOLS(df_panel['y'], df_panel[['treat_post']],
                entity_effects=True, time_effects=True)
# Cluster standard errors at the entity level
result = twfe.fit(cov_type='clustered', cluster_entity=True)
print(result.summary)
# R — TWFE
library(plm); library(lmtest); library(sandwich)
panel_df <- pdata.frame(df, index = c("entity_id", "year"))
twfe <- plm(y ~ treat_post, data = panel_df, model = "within", effect = "twoways")
coeftest(twfe, vcov = vcovHC(twfe, cluster = "group"))
* Stata — TWFE with clustered SE
xtset entity_id year
xtreg y treat_post i.year, fe cluster(entity_id)
* Or equivalently:
reghdfe y treat_post, absorb(entity_id year) cluster(entity_id)
Replace the single treat_post dummy with relative-time dummies to visualize pre-trends:
* Stata — event study
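* Factor variables cannot take negative values, so shift rel_time to start at 0
summarize rel_time, meanonly
local shift = -r(min)
gen rel_shift = rel_time + `shift'
* The omitted base level corresponds to rel_time == -1
local base = `shift' - 1
reghdfe y ib`base'.rel_shift, absorb(entity_id year) cluster(entity_id)
coefplot, vertical yline(0) drop(_cons) ///
title("Event Study: Pre/Post Treatment Effects") ///
xlabel(, angle(45))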
# R — event study
library(fixest)
es_model <- feols(y ~ i(rel_time, treat, ref = -1) | entity_id + year,
data = df, cluster = ~entity_id)
iplot(es_model, xlab = "Periods relative to treatment")
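Interpreting the event study plot: pre-treatment coefficients that are close to zero and statistically insignificant are consistent with parallel trends, while post-treatment coefficients trace the dynamic treatment effect relative to the omitted period (-1).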
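In the simplest setting (a balanced panel, one common treatment date, and a never-treated control group), each event-study coefficient should coincide with the treated-minus-control gap in that period, measured relative to the gap in the reference period. The sketch below illustrates that reading with standard-library Python. The function name `event_study_means` and the toy data are assumptions for illustration, and `rel_time` for control units is taken to be calendar time minus the common treatment date.

```python
from collections import defaultdict
from statistics import mean

def event_study_means(rows, ref=-1):
    """rows: iterable of (treat, rel_time, y).

    Returns {rel_time: gap_k - gap_ref}, where gap_k is the
    treated-minus-control mean outcome in relative period k.
    """
    cells = defaultdict(list)
    for treat, rel_time, y in rows:
        cells[(treat, rel_time)].append(y)
    periods = sorted(k for (t, k) in cells if t == 1)
    gap = {k: mean(cells[(1, k)]) - mean(cells[(0, k)]) for k in periods}
    return {k: gap[k] - gap[ref] for k in periods}

# Hypothetical toy data: common trend of +1 per period,
# treatment effect of 2 at k = 0 and 3 at k = 1.
rows = [
    (1, -2, 5.0), (1, -1, 6.0), (1, 0, 9.0), (1, 1, 11.0),
    (0, -2, 3.0), (0, -1, 4.0), (0, 0, 5.0), (0, 1, 6.0),
]

coefs = event_study_means(rows)
for k, b in coefs.items():
    print(k, b)
```

Here the pre-period coefficient at k = -2 is zero, which is what parallel pre-trends look like, and the post-period coefficients recover the built-in effects. With staggered adoption or an unbalanced panel this shortcut no longer matches the regression.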
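When units adopt treatment at different times, standard TWFE can be biased if treatment effects vary across cohorts or over time, because already-treated units end up serving as controls for later-treated units. The Callaway-Sant'Anna and Sun-Abraham estimators address this.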
# R — Callaway-Sant'Anna estimator (did package)
library(did)
cs_result <- att_gt(yname = "y",
gname = "cohort_year", # year of first treatment (0 if never treated)
idname = "entity_id",
tname = "year",
xformla = ~x1 + x2,
data = df)
# Aggregate to average ATT
aggte(cs_result, type = "simple") # Overall ATT
aggte(cs_result, type = "dynamic") # Dynamic effects
ggdid(cs_result)
# R — Sun-Abraham (fixest)
library(fixest)
sa_model <- feols(y ~ sunab(cohort_year, year) | entity_id + year,
data = df, cluster = ~entity_id)
iplot(sa_model)
* Stata — Callaway-Sant'Anna (csdid from SSC)
csdid y x1 x2, ivar(entity_id) time(year) gvar(cohort_year)
* Aggregate to event-time effects before plotting
estat event
csdid_plot
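The building block behind the staggered estimators above is a group-time effect, ATT(g, t): a 2×2 comparison between cohort g and a clean control group, with period g − 1 as the baseline. The following standard-library sketch shows the unconditional version with never-treated units as controls and no covariates, for post-treatment periods t ≥ g only. It is a simplified illustration rather than a reimplementation of the `did` or `csdid` packages, and the function name and toy data are assumptions.

```python
from collections import defaultdict
from statistics import mean

def att_gt_never_treated(rows, g, t):
    """rows: iterable of (cohort, year, y); cohort 0 means never treated.

    Returns (mean change for cohort g from g-1 to t)
          - (mean change for never-treated from g-1 to t).
    """
    cell = defaultdict(list)
    for cohort, year, y in rows:
        cell[(cohort, year)].append(y)
    base = g - 1  # last period before cohort g is treated
    treated_change = mean(cell[(g, t)]) - mean(cell[(g, base)])
    control_change = mean(cell[(0, t)]) - mean(cell[(0, base)])
    return treated_change - control_change

# Hypothetical toy data: common trend of +1 per year.
# Cohort 2001 has an effect that grows by 2 each year; cohort 2002 has a flat effect of 1.
rows = [
    (0, 2000, 0.0),     (0, 2001, 1.0),     (0, 2002, 2.0),     (0, 2003, 3.0),
    (2001, 2000, 10.0), (2001, 2001, 13.0), (2001, 2002, 16.0), (2001, 2003, 19.0),
    (2002, 2000, 20.0), (2002, 2001, 21.0), (2002, 2002, 23.0), (2002, 2003, 24.0),
]

for g, t in [(2001, 2001), (2001, 2003), (2002, 2002), (2002, 2003)]:
    print(g, t, att_gt_never_treated(rows, g, t))
```

Because each ATT(g, t) only ever compares a treated cohort with never-treated units, already-treated cohorts never act as controls. That is the contamination which can bias a pooled TWFE coefficient when effects differ across cohorts.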
See references/did-reference.md for heterogeneous treatment effects, triple-difference models, synthetic control comparison, Borusyak-Jaravel-Spiess imputation estimator, de Chaisemartin-D'Haultfoeuille estimator, and Roth (2022) pre-trends power analysis.
npx claudepluginhub zhouziyue233/great-econometrics --plugin econometrics