From grainulator
Scores past predictions against actual sprint outcomes, creates calibration claims, computes accuracy scorecards by evidence tier and claim type. Useful for feedback loops after implementations.
Trigger: slash command
/grainulator:calibrate
The user wants to check what actually happened after a sprint's recommendations were implemented.
$ARGUMENTS
Expected format: /calibrate --outcome "what happened" or /calibrate <claim_id> "actual result"
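The two invocation forms above could be told apart with a small parser. The following is only a sketch of one way to do it, not the skill's actual implementation: the `CalibrateArgs` type, the `parse_calibrate_args` function, and the example claim ID `r012` are hypothetical names introduced for illustration.

```python
import shlex
from dataclasses import dataclass
from typing import Optional


@dataclass
class CalibrateArgs:
    """Parsed /calibrate arguments (hypothetical structure)."""
    claim_id: Optional[str]  # None when the outcome applies to the sprint as a whole
    outcome: str             # free-text description of what actually happened


def parse_calibrate_args(raw: str) -> CalibrateArgs:
    """Parse either `--outcome "text"` or `<claim_id> "text"`."""
    tokens = shlex.split(raw)  # honours shell-style double quotes
    if len(tokens) == 2 and tokens[0] == "--outcome":
        return CalibrateArgs(claim_id=None, outcome=tokens[1])
    if len(tokens) == 2 and not tokens[0].startswith("--"):
        return CalibrateArgs(claim_id=tokens[0], outcome=tokens[1])
    raise ValueError(
        'expected: --outcome "what happened" or <claim_id> "actual result"'
    )


if __name__ == "__main__":
    print(parse_calibrate_args('--outcome "latency dropped after the cache rollout"'))
    print(parse_calibrate_args('r012 "migration took five days, not three"'))
```

Anything that matches neither form raises an error here; a real skill might instead fall back to treating the whole input as free text.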
Parse the outcome: The user provides outcome data as free text or claim-specific results.
Match outcomes to predictions: Use wheat_search to find the original estimate, recommendation, or risk claims that predicted something. Compare prediction to actual outcome.
Create calibration claims as cal### claims with evidence tier production (these are real outcomes):
Compute accuracy scorecard:
How accurate were stated vs. web vs. documented vs. tested claims?
Run wheat_compile.
Print scorecard:
Calibration results:
Predictions scored: <N>
Accurate: <N> (<percent>)
Partially accurate: <N>
Wrong: <N>
Accuracy by evidence tier:
stated: <percent>
web: <percent>
documented: <percent>
tested: <percent>
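The scorecard computation and printout above can be sketched as follows. This is an illustration under stated assumptions, not the skill's own code: `ScoredPrediction`, `build_scorecard`, and the verdict labels `accurate`, `partial`, and `wrong` are hypothetical, and per-tier accuracy is assumed to count only fully accurate predictions.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List

# Evidence tiers named in the scorecard, in display order.
TIERS = ("stated", "web", "documented", "tested")


@dataclass
class ScoredPrediction:
    claim_id: str
    tier: str     # evidence tier of the original prediction
    verdict: str  # assumed labels: "accurate", "partial", or "wrong"


def pct(n: int, total: int) -> str:
    """Format n/total as a whole-number percentage, or n/a for an empty group."""
    return f"{round(100 * n / total)}%" if total else "n/a"


def build_scorecard(scored: List[ScoredPrediction]) -> str:
    total = len(scored)
    counts = Counter(p.verdict for p in scored)
    lines = [
        "Calibration results:",
        f"Predictions scored: {total}",
        f"Accurate: {counts['accurate']} ({pct(counts['accurate'], total)})",
        f"Partially accurate: {counts['partial']}",
        f"Wrong: {counts['wrong']}",
        "Accuracy by evidence tier:",
    ]
    for tier in TIERS:
        in_tier = [p for p in scored if p.tier == tier]
        # Assumption: only fully accurate predictions count as hits.
        hits = sum(1 for p in in_tier if p.verdict == "accurate")
        lines.append(f"{tier}: {pct(hits, len(in_tier))}")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [
        ScoredPrediction("e001", "stated", "accurate"),
        ScoredPrediction("e002", "stated", "wrong"),
        ScoredPrediction("r003", "tested", "accurate"),
        ScoredPrediction("x004", "web", "partial"),
    ]
    print(build_scorecard(sample))
```

A tier with no scored predictions prints `n/a` rather than 0%, so an absence of data is not mistaken for poor accuracy.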
Next steps:
/brief -- recompile with calibrated data
/research <topic> -- investigate where predictions went wrong
npx claudepluginhub grainulation/grainulator --plugin grainulator