From hillclimb
Run the hill-climbing loop autonomously: spawn subagents that invoke the `hillclimb-execute` → `hillclimb-verify` → (when stuck) `hillclimb-brainstorm` skills, iteration after iteration, until a stop condition fires. Uses git for per-iteration checkpoints with automatic rollback on failed verifications. Requires `.hillclimb/` to exist (run the `hillclimb-onboard` skill first); on a dirty git tree, asks how to handle the pending changes (commit / stash / abort) before starting.
Slash command: /hillclimb:hillclimb-loop
This skill orchestrates the cycle the user would otherwise run by hand. Each
phase runs in a fresh general-purpose subagent invoking the matching
hillclimb-* skill via the Skill tool. The orchestrator's main context stays
small and each iteration starts clean. Git provides per-iteration
checkpoints; failed iterations roll back code, successful ones become
commits on a hillclimb-loop branch the user can merge or discard. Each
pass commit's SHA is recorded on its run via state.py set-commit; later,
state.py rollback-to <run_id> restores that run's code.
The orchestrator never edits code or runs the verifier itself. It dispatches, parses replies, decides keep-or-roll-back, and updates loop counters.
The First principle from hillclimb-execute applies recursively here: a
loop that commits cheated results across many iterations is catastrophic.
If a subagent's behavior makes you suspect spec gaming, stop the loop.
Check prerequisites first, and bail on the first failure with a clear message.
STATE_PY="$PWD/.hillclimb/state.py"; STATE_HTML="$PWD/.hillclimb/state.html"
[ -f "$STATE_PY" ] && [ -f "$STATE_HTML" ] || { echo "no .hillclimb/, run the hillclimb-onboard skill first"; exit 1; }
git rev-parse --is-inside-work-tree >/dev/null 2>&1 \
|| { echo "the hillclimb-loop skill needs a git repo for checkpoints. Run \`git init\` first."; exit 1; }
If git status --porcelain is non-empty, ask the user (one
AskUserQuestion) whether to: (a) commit the pending changes on the
current branch, (b) git stash push -u -m "pre-hillclimb-loop" (the -u
captures untracked files too; onboarding's freshly scaffolded
.hillclimb/ artifacts are usually untracked), or (c) abort. Don't
silently include uncommitted work in iteration 1's diff. After a stash,
re-check git status --porcelain; if it's still non-empty (rare:
ignored files survive stash), abort with the residual paths so the user
can resolve manually.
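As a rough illustration of the "abort with the residual paths" step, the sketch below extracts paths from `git status --porcelain` output. The helper name `residual_paths` is hypothetical and not part of the skill. It assumes the default v1 porcelain format (two status letters, a space, then the path) and does not handle paths that git quotes because of special characters.

```python
from typing import List


def residual_paths(porcelain_output: str) -> List[str]:
    """Return the paths named in `git status --porcelain` (v1) output.

    Hypothetical helper: pass the raw output, not a stripped copy, because
    the first status column can legitimately be a space.
    """
    paths = []
    for line in porcelain_output.splitlines():
        if len(line) < 4:
            continue  # blank or malformed line
        path = line[3:]  # skip the two status letters and the separator
        if " -> " in path:  # renames are reported as "old -> new"
            path = path.split(" -> ", 1)[1]
        paths.append(path)
    return paths
```

An empty list means the tree is clean and the loop can proceed; anything else is what the abort message would list.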
Then switch to (or create) the hillclimb-loop branch and capture the
pre-loop SHA so the final report can show only this loop's commits:
git rev-parse --verify hillclimb-loop >/dev/null 2>&1 \
&& git switch hillclimb-loop || git switch -c hillclimb-loop
ROOT_SHA=$(git rev-parse HEAD)
All loop knobs are derived from state.py. The user expresses intent
once during onboarding (or by editing state later via state.py set);
the loop reads, never re-parses. Ignore $ARGUMENTS entirely — the
orchestrator does not pass any, and a stray value should not change
behavior.
python3 "$STATE_PY" read "$STATE_HTML" | python3 -c "
import json, sys
p = json.load(sys.stdin)['project']
print('target:', p['objective'].get('target'))
print('stop_criteria:', p.get('stop_criteria'))
print('loop:', json.dumps(p.get('loop') or {}))
"
The fields that matter: project.objective.target (numeric or null),
project.stop_criteria (freeform string from onboarding), project.loop
(opt-in override block; onboarding never seeds it). Use semantic
judgment on stop_criteria, not lexical matching. Phrases that mean
forever: "until I interrupt", "no automatic stop", "loop forever",
"until user stops". Phrases that do NOT: "user satisfied", "manual
review", "stop on first regression". Below, forever stands for that
resolved boolean.
| Setting | Default |
|---|---|
| MAX_ITER | 100 |
| STUCK_THRESHOLD | 3 |
| STALL_BUDGET | 5 |
| GREEDY | yes |
Resolution (first match wins):
MAX_ITER = loop.max_iter if set
| 100000 if forever
| 300 if target is set
| 100
STUCK_THRESHOLD = loop.stuck_threshold if set
| 5 if loop.patient is JSON true
| 3
STALL_BUDGET = loop.stall_budget if set
| 100000 if forever # effectively off
| 5
GREEDY = "no" if loop.greedy is JSON false
| "yes"
GREEDY="yes" (default, greedy): only verified runs that improve
best are kept in HEAD; other passing runs are committed then rolled
back. GREEDY="no" (lazy, opt in with loop.greedy=false): every
passing run's code is kept. Only Phase D's pass branch differs; the
best update is unchanged. See WORKFLOW.md section 9.14 for the design
rationale.
Target-met always stops the loop when target is set — Phase E #1 has no
user-facing override. To turn that stop off, clear project.objective.target.
STUCK_THRESHOLD is intentionally unaffected by "forever": brainstorms
remain the loop's escape hatch even when no iteration cap is in play.
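The first-match-wins resolution above can be sketched in Python as follows. This is an illustrative sketch, not the orchestrator's actual code. The function name `resolve_knobs` is hypothetical, `forever` is passed in as the boolean already resolved from stop_criteria, and "if set" is assumed to mean "key present and not null".

```python
def resolve_knobs(project: dict, forever: bool) -> dict:
    """Resolve loop knobs from the project block; the first matching rule wins."""
    loop = project.get("loop") or {}
    target = (project.get("objective") or {}).get("target")

    # MAX_ITER: explicit override, then forever, then target-aware default.
    if loop.get("max_iter") is not None:
        max_iter = loop["max_iter"]
    elif forever:
        max_iter = 100_000
    elif target is not None:
        max_iter = 300
    else:
        max_iter = 100

    # STUCK_THRESHOLD: deliberately ignores `forever`.
    if loop.get("stuck_threshold") is not None:
        stuck_threshold = loop["stuck_threshold"]
    elif loop.get("patient") is True:  # must be JSON true, not merely truthy
        stuck_threshold = 5
    else:
        stuck_threshold = 3

    # STALL_BUDGET: effectively off under forever.
    if loop.get("stall_budget") is not None:
        stall_budget = loop["stall_budget"]
    elif forever:
        stall_budget = 100_000
    else:
        stall_budget = 5

    # GREEDY: only an explicit JSON false opts into lazy mode.
    greedy = "no" if loop.get("greedy") is False else "yes"

    return {
        "MAX_ITER": max_iter,
        "STUCK_THRESHOLD": stuck_threshold,
        "STALL_BUDGET": stall_budget,
        "GREEDY": greedy,
    }
```

For example, a project with a target and an explicit `loop.max_iter` of 20 resolves to a cap of 20 even when `forever` is true, because the explicit override is checked first.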
State a one-line plan to the user before starting, derived from the resolved knobs, so they can interrupt early if it's wrong. Examples:
- TARGET=2.6, STOP=interrupt: "Running on hillclimb-loop branch until best score reaches 2.6 or you interrupt; brainstorm after 3 stuck rounds."
- TARGET unset, STOP=interrupt: "Running on hillclimb-loop branch until you interrupt; brainstorm after 3 stuck rounds (stall budget disabled under forever)."
- TARGET=0.05, STOP=target met: "Running up to 300 iterations on hillclimb-loop branch; stop on target met; brainstorm after 3 stuck rounds, stop after 5 stalled rounds following a brainstorm."
- TARGET unset, STOP="user satisfied": "Running up to 100 iterations on hillclimb-loop branch (no target set, no forever signal); brainstorm after 3 stuck rounds, stop after 5 stalled rounds following a brainstorm."
- TARGET=2.6, STOP=interrupt, loop.max_iter=20: "Running up to 20 iterations on hillclimb-loop branch (explicit cap overrides 'forever') or until best score reaches 2.6; brainstorm after 3 stuck rounds."
- TARGET=2.6, STOP=interrupt, loop.greedy=false: "Running on hillclimb-loop branch until best score reaches 2.6 or you interrupt; lazy mode (greedy disabled): every passing run's code is kept, HEAD walks sideways across plateaus; brainstorm after 3 stuck rounds."

Each phase below dispatches with this shape. Fresh subagents inherit no context, so each prompt is self-contained:

"Run the /<skill> skill via the Skill tool on the project at $PWD. Follow that skill's instructions exactly, including its First principle about honest work. Do NOT run git push, git remote, or any network git command.

When done, reply with a single fenced json block containing only these keys (no extra prose, no logs, no file contents): <reply contract for this phase>"
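Because the orchestrator's job includes parsing replies, it may help to see one way a reply could be checked against the "single fenced json block, only these keys" rule. The sketch below is an assumption about how such a check might look. `parse_reply` and its parameters are hypothetical names, not part of the skill.

```python
import json
import re

# Built programmatically so the example does not contain a literal code fence.
FENCE = "`" * 3
JSON_BLOCK = re.compile(FENCE + r"json[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def parse_reply(reply: str, allowed: set, required: set = frozenset()) -> dict:
    """Extract the single fenced json block and validate its keys."""
    blocks = JSON_BLOCK.findall(reply)
    if len(blocks) != 1:
        raise ValueError(f"expected exactly one json block, found {len(blocks)}")
    data = json.loads(blocks[0])
    if not isinstance(data, dict):
        raise ValueError("reply block must be a JSON object")
    unexpected = set(data) - set(allowed)
    missing = set(required) - set(data)
    if unexpected or missing:
        raise ValueError(f"unexpected keys {sorted(unexpected)}, missing keys {sorted(missing)}")
    return data
```

Splitting `allowed` from `required` accommodates keys such as blocked_reason, which the execute contract sets only when status is blocked.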
Initialize ITER_N=0, NO_IMPROVE=0, BRAINSTORMS_DONE=0. NO_IMPROVE
resets to 0 after every brainstorm, so the "stalled-after-brainstorm"
condition is just NO_IMPROVE >= STALL_BUDGET AND BRAINSTORMS_DONE >= 1;
no POST_BRAINSTORM_RUNS counter is needed.

Phase A (checkpoint), at the start of each iteration:
ITER_N=$((ITER_N + 1))
PRE_SHA=$(git rev-parse HEAD)
Phase B (execute): dispatch the template with <skill> = hillclimb-execute and reply contract:
{ "run_id": "<R-id or null>",
"idea_id": "<I-id or null>",
"status": "ok" | "blocked" | "error",
"summary": "<one sentence>",
"blocked_reason": "<set only if status is blocked>" }
Capture RUN_ID, STATUS. If STATUS == "blocked" and blocked_reason
indicates no open ideas:
- BRAINSTORMS_DONE == 0: jump to Phase F without verifying; don't
  count this as an iteration.
- Otherwise: stop with reason "out of ideas after ${BRAINSTORMS_DONE} brainstorms".

Phase C (verify): dispatch the template with <skill> = hillclimb-verify and reply contract:
{ "run_id": "<R-id>",
"status": "pass" | "fail" | "inconclusive",
"score": <number or null>,
"notes": "<one sentence>" }
Capture STATUS, SCORE. Compute IMPROVED orchestrator-side:
IMPROVED=$(python3 "$STATE_PY" read "$STATE_HTML" \
| python3 -c "import json,sys;s=json.load(sys.stdin);print('yes' if s.get('best',{}).get('run_id')=='${RUN_ID}' else 'no')")
case "$STATUS" in
pass)
ROLLBACK=no
if [ "$IMPROVED" = "yes" ]; then
MSG="hillclimb-iter-${ITER_N}: pass score=${SCORE} (new best, ${RUN_ID})"
NO_IMPROVE=0
elif [ "$GREEDY" = "yes" ]; then
MSG="hillclimb-iter-${ITER_N}: pass score=${SCORE} (no improvement, rolling back, ${RUN_ID})"
NO_IMPROVE=$((NO_IMPROVE + 1))
ROLLBACK=yes
else
MSG="hillclimb-iter-${ITER_N}: pass score=${SCORE} (no improvement, ${RUN_ID})"
NO_IMPROVE=$((NO_IMPROVE + 1))
fi
git add -A && git commit -m "$MSG" --allow-empty
COMMIT_SHA=$(git rev-parse HEAD)
python3 "$STATE_PY" set-commit "$STATE_HTML" "$RUN_ID" "$COMMIT_SHA" "$MSG"
if [ "$ROLLBACK" = "yes" ]; then
git checkout "$PRE_SHA" -- . ':!.hillclimb/state.html'
git commit -m "hillclimb-iter-${ITER_N}: rolled back code to PRE_SHA (greedy, ${RUN_ID})" --allow-empty
fi
;;
fail|inconclusive)
git checkout "$PRE_SHA" -- . ':!.hillclimb/state.html'
git commit -m "hillclimb-iter-${ITER_N}: ${STATUS} (rolled back code, ${RUN_ID})" --allow-empty
NO_IMPROVE=$((NO_IMPROVE + 1)) ;;
esac
Why this exact rollback form (load-bearing, do not "simplify" to
git reset --hard): the dashboard's value comes from showing every
attempt, including failures; chart, log, and brainstorm diagnosis all
depend on the run record in state.html. state.html is
gitignored, so git operations don't touch it; the
:!.hillclimb/state.html pathspec is belt-and-suspenders should the
gitignore ever fail to apply (git reset --hard would still clobber it
in that case). The checkout form reverts every tracked path except
state.html in one operation, no temp files, no race window.
Why greedy is commit-then-rollback, not rollback-only. The
pass-but-no-improvement run's code is real, verified work; we want
state.py rollback-to <R-id> to be able to restore it for replay. The
orchestrator commits the run's tree (so set-commit has a real SHA
pointing at the run's actual code), then reverts the working tree with
a marker commit so the next iteration starts from PRE_SHA. The cost is
two commits per no-improvement greedy iteration instead of one (the
run's tree, then a revert marker); the benefit is replay parity with
lazy mode.
No git add -A on the rollback marker commits. The checkout
updates the index for tracked paths to PRE_SHA's tree, so each marker
commit's tree equals PRE_SHA's tree. Adding git add -A here would
promote the iteration's untracked artifacts (logs, checkpoints) into
tracked files baked into the next iteration's PRE_SHA. Without it,
untracked files stay untracked in the working tree (the loop never
runs git clean; those files are usually what the user wants to inspect).
--allow-empty on all three commit sites (the pass commit, the
fail/inconclusive marker, the greedy revert marker) guards the case
where the iteration's only on-disk change was state.html — gitignored
and excluded by the pathspec, so both git add -A and the
post-checkout index can leave nothing to commit.
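The keep-or-roll-back logic described above reduces to a small decision table. The sketch below restates it as a pure Python function for clarity. `Decision` and `decide` are hypothetical names, and the actual git commands stay in the shell snippet.

```python
from typing import NamedTuple


class Decision(NamedTuple):
    commit_run: bool  # commit the run's tree and record its SHA via set-commit
    rollback: bool    # restore PRE_SHA's code and add a marker commit
    no_improve: int   # updated NO_IMPROVE counter


def decide(status: str, improved: bool, greedy: bool, no_improve: int) -> Decision:
    """Mirror the case statement: what to do with one verified iteration."""
    if status == "pass":
        if improved:
            # New best: keep the code and reset the stall counter.
            return Decision(commit_run=True, rollback=False, no_improve=0)
        # Passing but not better: always commit for replay; greedy then reverts.
        return Decision(commit_run=True, rollback=greedy, no_improve=no_improve + 1)
    if status in ("fail", "inconclusive"):
        # The run's tree is not committed; only the rollback marker is.
        return Decision(commit_run=False, rollback=True, no_improve=no_improve + 1)
    raise ValueError(f"unknown verify status: {status!r}")
```

Greedy and lazy mode differ in exactly one cell of this table: the `rollback` flag for a passing run that did not improve on best.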
Phase E (stop conditions): check in order; stop on the first match.
| # | Condition | Reason text |
|---|---|---|
| 1 | objective.target is set AND best.score crosses it | "target met" |
| 2 | ITER_N >= MAX_ITER | "max iterations" |
| 3 | BRAINSTORMS_DONE >= 1 AND NO_IMPROVE >= STALL_BUDGET | "stalled after brainstorm" |
| 4 | Out of ideas after a brainstorm produced none | from Phase F |
| 5 | no-signal brainstorm diagnosis | from Phase F |
Direction-aware target check: <= target for minimize, >= target for
maximize. If none triggers, continue to Phase F.
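The first three stop conditions and the direction-aware target check could be expressed as below. This is only a sketch. `stop_reason` is a hypothetical name, and the `direction` argument ("minimize" or "maximize") is assumed to be read from project state, since the text does not say where it is stored. Conditions 4 and 5 are raised by Phase F and are not modeled here.

```python
from typing import Optional


def stop_reason(*, target: Optional[float], direction: str,
                best_score: Optional[float], iter_n: int, max_iter: int,
                brainstorms_done: int, no_improve: int,
                stall_budget: int) -> Optional[str]:
    """Return the first matching stop reason, or None to continue to Phase F."""
    # 1. Target met: <= target when minimizing, >= target when maximizing.
    if target is not None and best_score is not None:
        if direction == "minimize":
            met = best_score <= target
        else:
            met = best_score >= target
        if met:
            return "target met"
    # 2. Iteration cap.
    if iter_n >= max_iter:
        return "max iterations"
    # 3. Stalled, but only once at least one brainstorm has happened.
    if brainstorms_done >= 1 and no_improve >= stall_budget:
        return "stalled after brainstorm"
    return None
```

Ordering matters: an iteration that both hits the target and exhausts MAX_ITER reports "target met", because that check comes first.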
Phase F (brainstorm): trigger on NO_IMPROVE >= STUCK_THRESHOLD OR a Phase B "no open ideas"
short-circuit. Otherwise skip to Phase A.
Dispatch the template with <skill> = hillclimb-brainstorm and reply contract:
{ "diagnosis": "cold-start" | "promising" | "stuck" | "no-signal",
"ideas_added": <integer>,
"titles": ["<title>", ...] }
Increment BRAINSTORMS_DONE; reset NO_IMPROVE=0.
- diagnosis == "no-signal" → stop with that reason; tell the user the
  verifier or objective needs fixing before further iteration is meaningful.
- ideas_added == 0 AND no open ideas remain → stop with reason
  "out of ideas; brainstorm produced none".
- Otherwise, record the brainstorm in git log:
git commit -m "hillclimb-brainstorm-${BRAINSTORMS_DONE}: added ${IDEAS_ADDED} ideas" --allow-empty
Final report. Show the user, in under 15 lines:
- "Stopped after ${ITER_N} iterations: ${STOP_REASON}."
- best.score, with the delta in absolute and percent terms, and the chain of best.run_id updates over the run.
- The output of git log --oneline ${ROOT_SHA}..HEAD, so the user
  sees iteration commits, improvements, no-improvements, and rollbacks
  as distinct entries.
- To continue: stay on hillclimb-loop and run the hillclimb-loop skill again.
- To keep the work: git switch <main> && git merge hillclimb-loop (or cherry-pick).
- git switch <main> && git branch -D hillclimb-loop to discard.
- The dashboard: file://${PWD}/.hillclimb/state.html.

The dashboard is the long-form artifact; this hand-off is just orientation.
The hillclimb-loop branch is local until
the user explicitly merges or discards it.
Never git add -f .hillclimb/state.html. The
rationale lives in Phase D; don't paraphrase it elsewhere.
npx claudepluginhub ntt123/hill-climbing-skills --plugin hillclimb