From gilfoyle
Replaces writing-plans. Decomposes a falsifiable design into small slices, each with mandatory complexity budget, scale budget, stress fixture, and oracle. Refuses to run until falsifiable-design has produced an approved design with a passing cheapest-falsifier.
Slash command
/gilfoyle:budgeted-plan
A plan is a sequence of falsifiable hypotheses, not a sequence of pre-typed code blocks.
The standard convention is a 2949-line plan that pre-types every line of code into checklist form, then "executes" by typing the code into source files. That is not planning. That is dictation. We do not produce dictation.
Plan unit = slice.
Slice = (claim, falsifier, smallest code change, complexity budget, stress fixture).
A slice is done when:
- its unit tests pass, AND
- its stress fixture produces the expected result, AND
- the prove-it-prototype oracle still agrees with the binary, AND
- the complexity budget holds at the slice's scale.
Not before.
Use this skill after falsifiable-design has produced an approved design with a passing cheapest-falsifier. Not before.
Take the claim list from the design. Each claim becomes one slice. If a claim is too big for one slice — more than ~50 lines of code, or touches more than 2 files — split the claim until it isn't.
A slice should be implementable in 30 minutes or less. If it takes longer, you have not decomposed enough.
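The split rule can be written down as a check. This is a minimal sketch: `SliceEstimate`, `needs_split`, and the constant names are hypothetical, and only the thresholds (about 50 lines, 2 files, 30 minutes) come from the text.

```rust
/// Hypothetical size estimate for a proposed slice.
/// The type and field names are illustrative, not part of the skill.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SliceEstimate {
    pub lines_of_code: usize,
    pub files_touched: usize,
    pub minutes_to_implement: u32,
}

// Thresholds taken from the text: ~50 lines, 2 files, 30 minutes.
pub const MAX_LINES: usize = 50;
pub const MAX_FILES: usize = 2;
pub const MAX_MINUTES: u32 = 30;

/// Returns true when the claim behind this estimate should be split further.
pub fn needs_split(estimate: SliceEstimate) -> bool {
    estimate.lines_of_code > MAX_LINES
        || estimate.files_touched > MAX_FILES
        || estimate.minutes_to_implement > MAX_MINUTES
}
```

A slice sitting exactly on every threshold passes. Exceeding any single one is enough to send the claim back for decomposition.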
Each slice has these fields before any code:
## Slice N: [one-sentence purpose]
**Claim:** [the design claim this slice implements]
**Oracle:** [independent computation; default = the prove-it-prototype oracle]
**Stress fixture:** [input designed to break a plausible bug, not just exercise the code]
**Loop budget:** [for each new loop: asymptotic cost AND production scale]
**Wall budget:** [only for always-on phases: max wall-clock at production scale]
**Files:** [exact paths to create or modify]
**Code (advisory):**
[you may pre-type code here, but the implementer is permitted to deviate if the deviation keeps the slice within budget and the oracle passing]
**Verification:**
- [ ] Unit tests pass
- [ ] Stress fixture produces expected outcome
- [ ] prove-it-prototype oracle still agrees with binary
- [ ] Loop and wall budgets hold at fixture scale
If you cannot fill in all the mandatory fields, the slice is not ready. Do not fake it. Go think.
For every new loop the slice introduces:
- State the asymptotic cost, e.g. O(files × crates), O(symbols log symbols), O(n + m).
- State the production scale, e.g. files ≈ 50k, crates ≈ 200.
A loop without a complexity statement is a budget violation. Do not write one.
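One way a loop budget might look next to the code it governs, as a sketch only. `LoopBudget`, `count_owned_files`, and the `max_iterations` ceiling are hypothetical names and values. The scale figures (files ≈ 50k, crates ≈ 200) are the ones quoted above.

```rust
/// Hypothetical budget record for one loop.
/// The type and its fields are illustrative, not part of the skill.
pub struct LoopBudget {
    pub asymptotic: &'static str,
    pub files: u64,
    pub crates: u64,
    /// Assumed ceiling on iterations at production scale.
    pub max_iterations: u64,
}

impl LoopBudget {
    /// Iterations implied by an O(files × crates) loop at the stated scale.
    pub fn estimated_iterations(&self) -> u64 {
        self.files.saturating_mul(self.crates)
    }

    /// True when the estimate fits under the assumed ceiling.
    pub fn holds(&self) -> bool {
        self.estimated_iterations() <= self.max_iterations
    }
}

/// Loop budget: O(files × crates); production scale files ≈ 50k, crates ≈ 200.
/// Counts (file, crate) pairs where the file path starts with the crate directory.
pub fn count_owned_files(files: &[&str], crates: &[&str]) -> usize {
    let mut owned = 0;
    for &file in files {
        for &krate in crates {
            if file.starts_with(krate) {
                owned += 1;
            }
        }
    }
    owned
}
```

At the quoted scale the nested loop runs about 50,000 × 200 = 10,000,000 times. Writing that number down is what lets a reviewer argue about whether it is acceptable.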
For every slice that implements logic (not pure types, not pure schema):
The stress fixture's expected output is written down before the implementation. Not after.
If you cannot think of a plausible bug for a slice, the slice is either too small to be worth its own test (combine with the next one) or you haven't thought hard enough (think harder).
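A sketch of a stress fixture written before the implementation. The function `strip_crate_prefix` and the fixture rows are hypothetical. The plausible bug they target is a bare `starts_with` check that treats crate `foo` as the owner of `foobar::baz`.

```rust
/// Stress fixture, written down before the implementation exists.
/// Each row is (crate name, symbol, expected result).
/// Hypothetical data: the rows target prefix-collision and empty-input bugs.
pub const STRESS_FIXTURE: &[(&str, &str, Option<&str>)] = &[
    ("foo", "foo::bar", Some("bar")), // ordinary case
    ("foo", "foobar::baz", None),     // prefix collision: a different crate
    ("foo", "foo", None),             // crate name alone, no symbol path
    ("", "foo::bar", None),           // empty crate name must not match
];

/// Returns the symbol path relative to `krate`, or None if the symbol
/// does not belong to that crate.
pub fn strip_crate_prefix<'a>(krate: &str, symbol: &'a str) -> Option<&'a str> {
    if krate.is_empty() {
        return None;
    }
    // Require the `::` separator right after the crate name, so that
    // `foo` does not claim symbols from `foobar`.
    symbol.strip_prefix(krate)?.strip_prefix("::")
}
```

The second row is the one that earns the fixture its name. An implementation that only exercised the first row would pass while still carrying the collision bug.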
For every doc comment in the slice that says "callers must X" or "Y is a precondition" or "Z must be non-empty," classify the precondition's enforcement strength FIRST, then enforce accordingly:
Load-bearing for correctness. If violating the precondition would silently produce wrong output (e.g., empty prefix → SQL LIKE '%' matches everything), add a runtime check that survives release builds. Return an error or a documented refusal value. debug_assert! alone is wrong here — it compiles out in release and the contract becomes a fiction.
Sanity hint for callers. If violating the precondition is "programmer error that would never reach production with a sane caller," debug_assert! is appropriate. Dev/test catches it; release tolerates it without semantic damage.
The test: ask "what does the function silently produce in release builds if this precondition is violated?" If the answer is "wrong output," you need a runtime check. If the answer is "nonsense that the caller would have caught upstream anyway," debug_assert! is fine.
A documented precondition without ANY enforcement is a documentation lie. We do not ship those.
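The two enforcement strengths might look like this in Rust. This is a sketch under assumptions: `like_pattern`, `PatternError`, and `percent_done` are hypothetical names, and escaping of LIKE wildcards inside the prefix is deliberately left out.

```rust
/// Hypothetical error type for the load-bearing case.
#[derive(Debug, PartialEq, Eq)]
pub enum PatternError {
    EmptyPrefix,
}

/// Load-bearing precondition: `prefix` must be non-empty.
/// An empty prefix would yield the pattern `%`, which matches every row,
/// so the check is a runtime check that survives release builds.
/// (Escaping of `%` and `_` inside `prefix` is omitted from this sketch.)
pub fn like_pattern(prefix: &str) -> Result<String, PatternError> {
    if prefix.is_empty() {
        return Err(PatternError::EmptyPrefix);
    }
    Ok(format!("{prefix}%"))
}

/// Sanity hint: callers should pass `done <= total`.
/// A violation is programmer error; in release builds the value is
/// clamped, so the result stays within 0..=100.
pub fn percent_done(done: u64, total: u64) -> u64 {
    debug_assert!(done <= total, "done must not exceed total");
    if total == 0 {
        return 100;
    }
    done.min(total).saturating_mul(100) / total
}
```

Applying the test from the text: a violated `like_pattern` precondition would silently select everything, so it gets an `Err`. A violated `percent_done` precondition produces a clamped progress figure, which is tolerable, so `debug_assert!` is enough.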
For every thing the slice writes to:
| jq, | grep) want to see this?" If yes, data. If no, diagnostic.
If the rule is violated, justify it in writing in the slice. Don't ship an unexamined println! to a file descriptor.
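A sketch of the data/diagnostic split, assuming the rule maps data to stdout and diagnostics to stderr (the surviving text names the two classes but not the streams). The function `report` is hypothetical. It takes both sinks as parameters so that the routing decision is explicit and testable.

```rust
use std::io::{self, Write};

/// Writes one symbol per line to `data` and a progress note to `diag`.
/// Assumption: in a real binary `data` is stdout and `diag` is stderr,
/// so that a downstream `| jq` or `| grep` sees only the data lines.
pub fn report<D: Write, G: Write>(
    symbols: &[&str],
    data: &mut D,
    diag: &mut G,
) -> io::Result<()> {
    // Diagnostic: a downstream pipe would not want this line.
    writeln!(diag, "scanning {} symbols", symbols.len())?;
    for symbol in symbols {
        // Data: this is what a downstream pipe consumes.
        writeln!(data, "{symbol}")?;
    }
    Ok(())
}
```

In a binary, the call would be something like `report(&symbols, &mut io::stdout().lock(), &mut io::stderr().lock())`. In a test, two `Vec<u8>` buffers stand in for the streams.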
Before saving the plan, run these five lists:
- debug_assert! for the latter)?
- falsifiable-design: every "deferred to rivets-XXX" / "out of scope per rivets-YYY" / "tracked elsewhere" / "follow-up" in the plan must resolve to an existing tracker issue whose content covers the deferred work. Deferrals without a citation get an issue filed now, before the plan is finalized; don't wait for review feedback to surface the gap.
If any of the five lists has gaps, the plan is incomplete. Don't save it.
The next skill — checkpointed-build — refuses to run until:
- falsifiable-design)
- O(?) or "depends on input." Write down what it depends on, and bound it.
This is not a script for an executor to type from. The plan is a hypothesis the implementer is allowed to update as they learn. Pre-typed code blocks are suggestions. If the implementer finds a better algorithm during execution, they take it, provided the oracle still passes and the budget still holds. The fields are the contract. The code blocks are scaffolding.
A plan document with:
- A `## Plan Self-Review` section at the bottom listing the five lists from step 7, all empty (no gaps).
If the plan has gaps, the skill didn't run. Run it.
npx claudepluginhub dwalleck/gilfoyle --plugin gilfoyle