From grimoire
Identifies probability distributions for phenomena, computes probabilities/quantiles, fits distributions to data, and validates distributional assumptions.
Slash command: /grimoire:calculate-probability-distribution
Identify and apply the correct probability distribution for a phenomenon — computing probabilities, fitting parameters from data, and validating distributional assumptions with goodness-of-fit tests — to support reliable probabilistic inference and simulation.
Adopted by: Actuarial science (loss distributions), reliability engineering (lifetime distributions), queueing theory (arrival processes), machine learning (likelihood functions), Bayesian inference (prior/posterior), and risk management all depend on correct probability distribution selection. SOA (Society of Actuaries) and CAS (Casualty Actuarial Society) exams test distribution fitting as a core competency. SciPy, R, and MATLAB provide standardized implementations of more than 100 distributions.
Impact: Misidentifying the distribution for a phenomenon leads to incorrect probability estimates. DeGroot & Schervish (2012) demonstrate that normal and heavy-tailed (e.g., Cauchy) assumptions produce probability estimates that differ by orders of magnitude in the tails, which is precisely where risk decisions are made. For example, fitting a normal distribution to financial returns, which are leptokurtic, severely underestimates tail risk (Black Swan events).
Select distribution family based on the variable's nature:
Discrete distributions:
| Phenomenon | Distribution | Parameters |
|---|---|---|
| Binary outcome (success/failure) | Bernoulli | p |
| Count of successes in n trials | Binomial | n, p |
| Count until first success | Geometric | p |
| Count of rare events in interval | Poisson | λ |
| Count of successes without replacement | Hypergeometric | N, K, n |
Continuous distributions:
| Phenomenon | Distribution | Parameters |
|---|---|---|
| Symmetric bell-shaped (Central Limit Theorem) | Normal (Gaussian) | μ, σ |
| Positively skewed, right-tailed | Log-normal, Gamma, Weibull | Log-normal: μ, σ; Gamma: k, θ; Weibull: k, λ |
| Uniform random selection | Uniform | a, b |
| Time between Poisson events | Exponential | λ |
| Lifetime / reliability modeling | Weibull | k, λ |
| Heavy-tailed extreme values | Pareto, GEV | Pareto: α, x_min; GEV: μ, σ, ξ |
| Proportions (bounded 0,1) | Beta | α, β |
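To make the tables above usable from code, the following sketch maps each family to the name it goes by in `scipy.stats`. The dictionary `SCIPY_NAMES` and the helper `get_family` are hypothetical names introduced only for this illustration.

```python
from scipy import stats

# Family name from the tables -> scipy.stats attribute name.
# The mapping itself is illustrative and not part of SciPy.
SCIPY_NAMES = {
    "Bernoulli": "bernoulli",
    "Binomial": "binom",
    "Geometric": "geom",
    "Poisson": "poisson",
    "Hypergeometric": "hypergeom",
    "Normal": "norm",
    "Log-normal": "lognorm",
    "Gamma": "gamma",
    "Weibull": "weibull_min",
    "Uniform": "uniform",
    "Exponential": "expon",
    "Pareto": "pareto",
    "GEV": "genextreme",
    "Beta": "beta",
}


def get_family(name):
    """Return the scipy.stats distribution object for a family named in the tables."""
    return getattr(stats, SCIPY_NAMES[name])


# Example: a frozen Poisson distribution with rate 3 (an assumed value).
poisson_rate_3 = get_family("Poisson")(mu=3.0)
print(poisson_rate_3.pmf(0))  # P(X = 0) = exp(-3)
```

SciPy's parameterizations do not always match the symbols in the tables: `expon` takes `scale = 1/λ` rather than the rate, `hypergeom` takes its arguments in the order (population size, number of success states, number of draws), and `geom` counts trials up to and including the first success.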
For a Normal distribution X ~ N(μ, σ²):
from scipy import stats
dist = stats.norm(loc=mu, scale=sigma)
prob = dist.cdf(x) # P(X ≤ x)
prob_range = dist.cdf(b) - dist.cdf(a) # P(a ≤ X ≤ b)
x_q = dist.ppf(q) # quantile: P(X ≤ x_q) = q
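As a worked instance of the calls above, here is a small sketch with concrete numbers. The parameters μ = 100 and σ = 15 are assumed purely for illustration.

```python
from scipy import stats

mu, sigma = 100.0, 15.0                        # assumed example parameters
dist = stats.norm(loc=mu, scale=sigma)

p_below_130 = dist.cdf(130)                    # P(X <= 130), two sigma above the mean
p_within_1sd = dist.cdf(115) - dist.cdf(85)    # P(85 <= X <= 115)
p_above_130 = dist.sf(130)                     # P(X > 130); sf is the survival function 1 - cdf
x_95 = dist.ppf(0.95)                          # 95th percentile

print(f"P(X <= 130)       = {p_below_130:.4f}")
print(f"P(85 <= X <= 115) = {p_within_1sd:.4f}")
print(f"P(X > 130)        = {p_above_130:.4f}")
print(f"95th percentile   = {x_95:.2f}")
```

For tail probabilities, `sf` is usually preferable to computing `1 - cdf` by hand, since it avoids subtracting two numbers that are both close to 1.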
For discrete distributions (Binomial):
dist = stats.binom(n=n, p=p)
prob_exact = dist.pmf(k) # P(X = k)
prob_at_most = dist.cdf(k) # P(X ≤ k)
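A concrete version of the binomial calls, assuming ten flips of a fair coin as the example, could look like this. The values can be checked by hand against binomial coefficients.

```python
from scipy import stats

n, p = 10, 0.5                       # assumed example: 10 fair coin flips
dist = stats.binom(n=n, p=p)

prob_exact = dist.pmf(5)             # P(X = 5)  = C(10,5) / 2**10 = 252/1024
prob_at_most = dist.cdf(3)           # P(X <= 3) = (1 + 10 + 45 + 120) / 1024 = 176/1024
prob_at_least = dist.sf(3)           # P(X >= 4) = P(X > 3) = 1 - P(X <= 3)

print(prob_exact, prob_at_most, prob_at_least)
```

Note that for a discrete variable `sf(k)` gives P(X > k), so "at least k" is `sf(k - 1)`.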
Key quantiles:
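As an illustration, the standard normal quantiles commonly used for one- and two-sided intervals can be read off with `ppf`. The particular levels below are an assumption chosen for the example.

```python
from scipy import stats

z = stats.norm()  # standard normal, mean 0 and standard deviation 1

# Levels chosen for illustration only.
levels = (0.90, 0.95, 0.975, 0.99, 0.995)
quantiles = {q: z.ppf(q) for q in levels}

for q, z_q in quantiles.items():
    print(f"q = {q:<5}  z_q = {z_q:.4f}")
```

The 0.975 quantile (about 1.96) is the multiplier behind a two-sided 95% interval, and the 0.95 quantile (about 1.645) serves the one-sided 95% case.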
Maximum Likelihood Estimation (MLE) — the standard approach:
import numpy as np
from scipy import stats
data = np.array([...])
# Fit normal distribution (returns mu_hat, sigma_hat)
mu_hat, sigma_hat = stats.norm.fit(data)
# Fit exponential distribution
loc_hat, scale_hat = stats.expon.fit(data, floc=0) # fix location at 0
lambda_hat = 1 / scale_hat # rate parameter
# Compare multiple candidate distributions
# (KS p-values are optimistic when parameters are fitted to the same data)
for dist_name in ['norm', 'lognorm', 'expon', 'gamma', 'weibull_min']:
    dist = getattr(stats, dist_name)
    params = dist.fit(data)  # MLE: shape parameters (if any), then loc, scale
    ks_stat, ks_p = stats.kstest(data, dist_name, args=params)
    print(f"{dist_name}: KS p-value = {ks_p:.4f}")
Method of Moments: equate sample moments (mean, variance) to theoretical moments; less efficient than MLE but useful as starting value.
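A minimal sketch of the method of moments for a Gamma(k, θ) model follows, compared against the MLE. The synthetic data and the true values k = 2, θ = 3 are assumptions made for the example.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
data = rng.gamma(shape=2.0, scale=3.0, size=5000)   # synthetic sample, illustration only

# Gamma(k, theta): mean = k * theta, variance = k * theta**2
# Solving for the parameters: k = mean**2 / var, theta = var / mean
mean, var = data.mean(), data.var(ddof=1)
k_mom = mean**2 / var
theta_mom = var / mean

# MLE with the location fixed at 0; fit returns (shape, loc, scale).
k_mle, _, theta_mle = stats.gamma.fit(data, floc=0)

print(f"MoM: k = {k_mom:.3f}, theta = {theta_mom:.3f}")
print(f"MLE: k = {k_mle:.3f}, theta = {theta_mle:.3f}")
```

The two estimates should land close together on a sample this large. The moment estimates need no optimizer, which is what makes them convenient as starting values.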
Always verify fit before using in downstream analysis:
Graphical tests:
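One common graphical check is the Q-Q plot. The sketch below uses `scipy.stats.probplot` on synthetic normal data (an assumption for the example) and summarizes the plot numerically, so it runs without a display.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
data = rng.normal(loc=10.0, scale=2.0, size=500)    # synthetic sample, illustration only

# probplot returns the Q-Q coordinates and a least-squares line through them.
(theoretical_q, ordered_data), (slope, intercept, r) = stats.probplot(data, dist="norm")

# For a good normal fit the points lie near a straight line: r is close to 1,
# the slope approximates the standard deviation and the intercept the mean.
print(f"Q-Q correlation r = {r:.4f}, slope = {slope:.2f}, intercept = {intercept:.2f}")

# To draw the plot (assumes matplotlib is installed):
# import matplotlib.pyplot as plt
# stats.probplot(data, dist="norm", plot=plt)
# plt.show()
```

Systematic curvature at the ends of the plot points to tails heavier or lighter than the candidate distribution.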
Statistical goodness-of-fit tests:
P-value > 0.05 → insufficient evidence to reject the distribution (not proof it is correct).
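The decision rule above can be sketched with two tests available in SciPy, Shapiro-Wilk and Kolmogorov-Smirnov. The choice of these two tests, the helper name `normality_report`, and the synthetic samples are assumptions made for this illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
normal_data = rng.normal(size=300)        # synthetic, drawn from a normal
skewed_data = rng.exponential(size=300)   # synthetic, clearly non-normal


def normality_report(x):
    """Return p-values of two normality checks for the sample x."""
    sw_stat, sw_p = stats.shapiro(x)
    # KS against a normal with parameters estimated from x itself; the
    # resulting p-value is optimistic, so treat it as a rough screen.
    ks_stat, ks_p = stats.kstest(x, "norm", args=(x.mean(), x.std(ddof=1)))
    return {"shapiro_p": sw_p, "ks_p": ks_p}


for label, sample in (("normal", normal_data), ("skewed", skewed_data)):
    report = normality_report(sample)
    verdict = "reject normality" if report["shapiro_p"] < 0.05 else "no evidence against normality"
    print(f"{label}: {report} -> {verdict}")
```

A large p-value here only means the test did not detect a departure at this sample size.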
For two random variables X, Y:
Joint: f(x, y) = probability density at (x, y)
Marginal: f_X(x) = ∫ f(x, y) dy
Conditional: f(y|x) = f(x, y) / f_X(x)
Independence: f(x, y) = f_X(x) × f_Y(y)
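The same relationships hold for discrete variables with sums in place of integrals. A minimal sketch on a hypothetical 2×3 joint probability table:

```python
import numpy as np

# Hypothetical joint pmf of X (rows: x = 0, 1) and Y (columns: y = 0, 1, 2).
joint = np.array([[0.10, 0.20, 0.10],
                  [0.15, 0.30, 0.15]])

p_x = joint.sum(axis=1)               # marginal of X: sum over y
p_y = joint.sum(axis=0)               # marginal of Y: sum over x
p_y_given_x = joint / p_x[:, None]    # conditional p(y | x); each row sums to 1

# Independence holds when the joint equals the product of the marginals.
independent = np.allclose(joint, np.outer(p_x, p_y))

print("p_X =", p_x)
print("p_Y =", p_y)
print("p(y | x) =\n", p_y_given_x)
print("independent:", independent)
```

In this table every row of `p_y_given_x` equals `p_y`, which is another way of seeing that X and Y are independent here.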
Covariance and correlation:
Cov(X, Y) = E[(X-μX)(Y-μY)]
ρ = Cov(X,Y) / (σX σY) ∈ [−1, +1]
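These two formulas can be checked numerically. The sketch below applies them to synthetic data (the linear relationship y = 2x + noise is an assumption for the example) and compares the result with `np.corrcoef`.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=1000)
y = 2.0 * x + rng.normal(size=1000)    # y depends linearly on x, plus noise

# Cov(X, Y) = E[(X - mu_X)(Y - mu_Y)], estimated with sample means.
cov_xy = np.mean((x - x.mean()) * (y - y.mean()))

# rho = Cov(X, Y) / (sigma_X * sigma_Y); ddof=0 throughout for consistency.
rho = cov_xy / (x.std() * y.std())

print(f"Cov = {cov_xy:.3f}, rho = {rho:.3f}")
print(f"np.corrcoef gives {np.corrcoef(x, y)[0, 1]:.3f}")
```

For this construction the theoretical correlation is 2/√5 ≈ 0.894, so the sample value should fall close to that.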
For sample mean X̄ of n i.i.d. random variables with mean μ and variance σ²:
X̄ ~ N(μ, σ²/n) approximately, for large n (typically n ≥ 30)
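This justifies normal-approximation methods and confidence intervals for population means for any underlying distribution with finite variance. It does not apply to distributions such as the Cauchy, whose mean and variance are undefined.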
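A small simulation can make the approximation concrete. This sketch draws repeated samples of size n = 30 from a strongly skewed exponential distribution (mean 1, variance 1, both assumed for the example) and compares the behavior of the sample means with N(μ, σ²/n).

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps = 30, 20000
mu, sigma = 1.0, 1.0                   # mean and sd of Exponential(scale=1)

samples = rng.exponential(scale=1.0, size=(reps, n))
sample_means = samples.mean(axis=1)

mean_of_means = sample_means.mean()            # should be near mu
sd_of_means = sample_means.std(ddof=1)         # should be near sigma / sqrt(n)

# Share of sample means within 1.96 standard errors of mu (about 0.95 under normality).
se = sigma / np.sqrt(n)
coverage = np.mean(np.abs(sample_means - mu) <= 1.96 * se)

print(f"mean of sample means = {mean_of_means:.4f} (theory {mu})")
print(f"sd of sample means   = {sd_of_means:.4f} (theory {se:.4f})")
print(f"coverage within 1.96 SE = {coverage:.3f}")
```

The n ≥ 30 rule of thumb works reasonably here. More heavily skewed or heavy-tailed populations can need considerably larger samples.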
npx claudepluginhub jeffreytse/grimoire --plugin grimoire