Skill

hillclimb-loop

Run the hill-climbing loop autonomously: spawn subagents that invoke the `hillclimb-execute` → `hillclimb-verify` → (when stuck) `hillclimb-brainstorm` skills, iteration after iteration, until a stop condition fires. Uses git for per-iteration checkpoints with automatic rollback on failed verifications. Requires `.hillclimb/` to exist (run the `hillclimb-onboard` skill first); on a dirty git tree, asks how to handle the pending changes (commit / stash / abort) before starting.

Slash command

/hillclimb:hillclimb-loop

Last commit: May 27, 2026

hillclimb-loop: autonomous execute → verify → brainstorm loop

This skill orchestrates the cycle the user would otherwise run by hand. Each phase runs in a fresh general-purpose subagent invoking the matching hillclimb-* skill via the Skill tool. The orchestrator's main context stays small and each iteration starts clean. Git provides per-iteration checkpoints; failed iterations roll back code, successful ones become commits on a hillclimb-loop branch the user can merge or discard. Each pass commit's SHA is recorded on its run via state.py set-commit; later, state.py rollback-to <run_id> restores that run's code.

The orchestrator never edits code or runs the verifier itself. It dispatches, parses replies, decides keep-or-roll-back, and updates loop counters.

The First principle from hillclimb-execute applies recursively here: a loop that commits cheated results across many iterations is catastrophic. If a subagent's behavior makes you suspect spec gaming, stop the loop.

Pre-flight

Bail on the first failure with a clear message.

STATE_PY="$PWD/.hillclimb/state.py"; STATE_HTML="$PWD/.hillclimb/state.html"
[ -f "$STATE_PY" ] && [ -f "$STATE_HTML" ] || { echo "no .hillclimb/, run the hillclimb-onboard skill first"; exit 1; }
git rev-parse --is-inside-work-tree >/dev/null 2>&1 \
  || { echo "the hillclimb-loop skill needs a git repo for checkpoints. Run \`git init\` first."; exit 1; }

If git status --porcelain is non-empty, ask the user (one AskUserQuestion) whether to: (a) commit the pending changes on the current branch, (b) git stash push -u -m "pre-hillclimb-loop" (the -u captures untracked files too; onboarding's freshly scaffolded .hillclimb/ artifacts are usually untracked), or (c) abort. Don't silently include uncommitted work in iteration 1's diff. After a stash, re-check git status --porcelain; if it is still non-empty (rare: ignored files survive stash), abort and list the residual paths so the user can resolve them manually.
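The residual check after a stash can be sketched as a small parser over the output of git status --porcelain. This is an illustration only: residual_paths and tree_is_clean are hypothetical helper names, and the sketch assumes the v1 porcelain format (two status characters, a space, then the path) with unquoted paths.

```python
def residual_paths(porcelain_output: str) -> list[str]:
    """Return the paths still reported by `git status --porcelain` (v1 format).

    Each non-empty line is assumed to look like 'XY <path>', where XY is the
    two-character status code. Rename lines ('R  old -> new') yield the new path.
    """
    paths = []
    for line in porcelain_output.splitlines():
        if not line.strip():
            continue
        path = line[3:]                      # skip the 'XY ' prefix
        if " -> " in path:                   # rename or copy entry
            path = path.split(" -> ", 1)[1]
        paths.append(path)
    return paths


def tree_is_clean(porcelain_output: str) -> bool:
    """A clean tree produces no porcelain lines at all."""
    return not residual_paths(porcelain_output)
```

If tree_is_clean returns False after the stash, the list from residual_paths is what the abort message would show.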

Then switch to (or create) the hillclimb-loop branch and capture the pre-loop SHA so the final report can show only this loop's commits:

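if git rev-parse --verify --quiet refs/heads/hillclimb-loop >/dev/null; then
  git switch hillclimb-loop        # branch exists: reuse it
else
  git switch -c hillclimb-loop     # first run: create it from the current HEAD
fi
ROOT_SHA=$(git rev-parse HEAD)     # pre-loop SHA, used by the final report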

Read loop settings from state

All loop knobs are derived from state.py. The user expresses intent once during onboarding (or by editing state later via state.py set); the loop reads, never re-parses. Ignore $ARGUMENTS entirely — the orchestrator does not pass any, and a stray value should not change behavior.

python3 "$STATE_PY" read "$STATE_HTML" | python3 -c "
import json, sys
p = json.load(sys.stdin)['project']
print('target:', p['objective'].get('target'))
print('stop_criteria:', p.get('stop_criteria'))
print('loop:', json.dumps(p.get('loop') or {}))
"

The fields that matter: project.objective.target (numeric or null), project.stop_criteria (freeform string from onboarding), project.loop (opt-in override block; onboarding never seeds it). Use semantic judgment on stop_criteria, not lexical matching. Phrases that mean forever: "until I interrupt", "no automatic stop", "loop forever", "until user stops". Phrases that do NOT: "user satisfied", "manual review", "stop on first regression". Below, forever stands for that resolved boolean.

| Setting | Default |
| --- | --- |
| `MAX_ITER` | `100` |
| `STUCK_THRESHOLD` | `3` |
| `STALL_BUDGET` | `5` |
| `GREEDY` | `yes` |

Resolution (first match wins):

MAX_ITER         = loop.max_iter         if set
                 | 100000                if forever
                 | 300                   if target is set
                 | 100

STUCK_THRESHOLD  = loop.stuck_threshold  if set
                 | 5                     if loop.patient is JSON true
                 | 3

STALL_BUDGET     = loop.stall_budget     if set
                 | 100000                if forever    # effectively off
                 | 5

GREEDY           = "no" if loop.greedy is JSON false
                 | "yes"

GREEDY="yes" (default, greedy): only verified runs that improve best are kept in HEAD; other passing runs are committed then rolled back. GREEDY="no" (lazy, opt in with loop.greedy=false): every passing run's code is kept. Only Phase D's pass branch differs; the best update is unchanged. See WORKFLOW.md section 9.14 for the design rationale.

Target-met always stops the loop when target is set — Phase E #1 has no user-facing override. To turn that stop off, clear project.objective.target. STUCK_THRESHOLD is intentionally unaffected by "forever": brainstorms remain the loop's escape hatch even when no iteration cap is in play.
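The first-match-wins resolution can be sketched as a pure function. This is a minimal sketch, not the skill's actual implementation: resolve_knobs is a hypothetical name, and "if set" is assumed to mean "present and not null" in project.loop.

```python
def resolve_knobs(loop, target, forever):
    """Resolve loop knobs; for each knob the first matching rule wins.

    loop    -- dict from project.loop (may be None or empty)
    target  -- project.objective.target (number or None)
    forever -- bool, the orchestrator's reading of project.stop_criteria
    """
    loop = loop or {}

    if loop.get("max_iter") is not None:
        max_iter = loop["max_iter"]
    elif forever:
        max_iter = 100000
    elif target is not None:
        max_iter = 300
    else:
        max_iter = 100

    if loop.get("stuck_threshold") is not None:
        stuck_threshold = loop["stuck_threshold"]
    elif loop.get("patient") is True:        # only JSON true counts
        stuck_threshold = 5
    else:
        stuck_threshold = 3                  # unaffected by "forever"

    if loop.get("stall_budget") is not None:
        stall_budget = loop["stall_budget"]
    elif forever:
        stall_budget = 100000                # effectively off
    else:
        stall_budget = 5

    # Only JSON false switches greedy off; anything else keeps the default.
    greedy = "no" if loop.get("greedy") is False else "yes"

    return {"MAX_ITER": max_iter, "STUCK_THRESHOLD": stuck_threshold,
            "STALL_BUDGET": stall_budget, "GREEDY": greedy}
```

For example, resolve_knobs({"max_iter": 20}, 2.6, True) keeps the explicit cap of 20 even though forever is true, because the explicit override is the first rule checked.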

State a one-line plan to the user before starting, derived from the resolved knobs, so they can interrupt early if it's wrong. Examples:

TARGET=2.6, STOP=interrupt: "Running on hillclimb-loop branch until best score reaches 2.6 or you interrupt; brainstorm after 3 stuck rounds."
TARGET unset, STOP=interrupt: "Running on hillclimb-loop branch until you interrupt; brainstorm after 3 stuck rounds (stall budget disabled under forever)."
TARGET=0.05, STOP=target met: "Running up to 300 iterations on hillclimb-loop branch; stop on target met; brainstorm after 3 stuck rounds, stop after 5 stalled rounds following a brainstorm."
TARGET unset, STOP="user satisfied": "Running up to 100 iterations on hillclimb-loop branch (no target set, no forever signal); brainstorm after 3 stuck rounds, stop after 5 stalled rounds following a brainstorm."
TARGET=2.6, STOP=interrupt, loop.max_iter=20: "Running up to 20 iterations on hillclimb-loop branch (explicit cap overrides 'forever') or until best score reaches 2.6; brainstorm after 3 stuck rounds."
TARGET=2.6, STOP=interrupt, loop.greedy=false: "Running on hillclimb-loop branch until best score reaches 2.6 or you interrupt; lazy mode (greedy disabled) — every passing run's code is kept, HEAD walks sideways across plateaus; brainstorm after 3 stuck rounds."

Subagent prompt template

Each phase below dispatches with this shape. Fresh subagents inherit no context, so each prompt is self-contained:

"Run the /<skill> skill via the Skill tool on the project at $PWD. Follow that skill's instructions exactly, including its First principle about honest work. Do NOT run git push, git remote, or any network git command.

When done, reply with a single fenced json block containing only these keys (no extra prose, no logs, no file contents):
<reply contract for this phase>
"
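One way to fill the template above is plain string formatting. The sketch below is illustrative: PROMPT_TEMPLATE and build_prompt are hypothetical names, and the reply contract is passed through as an opaque string.

```python
PROMPT_TEMPLATE = (
    "Run the /{skill} skill via the Skill tool on the project at {project_dir}. "
    "Follow that skill's instructions exactly, including its First principle "
    "about honest work. Do NOT run git push, git remote, or any network git "
    "command.\n\n"
    "When done, reply with a single fenced json block containing only these "
    "keys (no extra prose, no logs, no file contents):\n"
    "{reply_contract}\n"
)


def build_prompt(skill, project_dir, reply_contract):
    """Return a self-contained prompt for one phase's subagent.

    Braces inside reply_contract are safe: str.format does not re-scan
    substituted values for placeholders.
    """
    return PROMPT_TEMPLATE.format(
        skill=skill, project_dir=project_dir, reply_contract=reply_contract
    )
```

Because the subagent starts with no orchestrator context, everything it needs (skill name, project path, reply shape) has to be in the returned string.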

The loop

Initialize ITER_N=0, NO_IMPROVE=0, BRAINSTORMS_DONE=0. NO_IMPROVE resets to 0 after every brainstorm, so the "stalled-after-brainstorm" condition is just NO_IMPROVE >= STALL_BUDGET AND BRAINSTORMS_DONE >= 1; no separate POST_BRAINSTORM_RUNS counter is needed.

Phase A: Checkpoint

ITER_N=$((ITER_N + 1))
PRE_SHA=$(git rev-parse HEAD)

Phase B: Execute (subagent)

Dispatch the template with <skill> = hillclimb-execute and reply contract:

{ "run_id": "<R-id or null>",
  "idea_id": "<I-id or null>",
  "status": "ok" | "blocked" | "error",
  "summary": "<one sentence>",
  "blocked_reason": "<set only if status is blocked>" }

Capture RUN_ID, STATUS. If STATUS == "blocked" and blocked_reason indicates no open ideas:

If BRAINSTORMS_DONE == 0: jump to Phase F without verifying; don't count this as an iteration.
Else: stop with reason "out of ideas after ${BRAINSTORMS_DONE} brainstorms".
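Parsing a subagent reply could look like the sketch below. It is an assumption-laden illustration: parse_reply and EXECUTE_KEYS are hypothetical names, and the strictness (exactly one fenced json block, required keys present) is one reasonable reading of the reply contract rather than documented behavior.

```python
import json
import re

FENCE = "`" * 3  # built at runtime so this example can sit inside a fenced block

# Keys the execute contract always carries; blocked_reason is set only when blocked.
EXECUTE_KEYS = ("run_id", "idea_id", "status", "summary")


def parse_reply(text, required_keys):
    """Extract and validate the single fenced json block from a subagent reply."""
    pattern = re.escape(FENCE) + r"json\s*\n(.*?)" + re.escape(FENCE)
    blocks = re.findall(pattern, text, flags=re.DOTALL)
    if len(blocks) != 1:
        raise ValueError(f"expected exactly one fenced json block, found {len(blocks)}")
    reply = json.loads(blocks[0])
    missing = [key for key in required_keys if key not in reply]
    if missing:
        raise ValueError(f"reply is missing keys: {missing}")
    return reply
```

Rejecting malformed replies up front keeps the decide step from acting on a half-parsed status.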

Phase C: Verify (subagent)

Dispatch the template with <skill> = hillclimb-verify and reply contract:

{ "run_id": "<R-id>",
  "status": "pass" | "fail" | "inconclusive",
  "score": <number or null>,
  "notes": "<one sentence>" }

Capture STATUS, SCORE. Compute IMPROVED orchestrator-side:

IMPROVED=$(python3 "$STATE_PY" read "$STATE_HTML" \
  | python3 -c "import json,sys; s=json.load(sys.stdin); print('yes' if (s.get('best') or {}).get('run_id')==sys.argv[1] else 'no')" "$RUN_ID")

Phase D: Decide (orchestrator-side)

case "$STATUS" in
  pass)
    ROLLBACK=no
    if [ "$IMPROVED" = "yes" ]; then
      MSG="hillclimb-iter-${ITER_N}: pass score=${SCORE} (new best, ${RUN_ID})"
      NO_IMPROVE=0
    elif [ "$GREEDY" = "yes" ]; then
      MSG="hillclimb-iter-${ITER_N}: pass score=${SCORE} (no improvement, rolling back, ${RUN_ID})"
      NO_IMPROVE=$((NO_IMPROVE + 1))
      ROLLBACK=yes
    else
      MSG="hillclimb-iter-${ITER_N}: pass score=${SCORE} (no improvement, ${RUN_ID})"
      NO_IMPROVE=$((NO_IMPROVE + 1))
    fi
    git add -A && git commit -m "$MSG" --allow-empty
    COMMIT_SHA=$(git rev-parse HEAD)
    python3 "$STATE_PY" set-commit "$STATE_HTML" "$RUN_ID" "$COMMIT_SHA" "$MSG"
    if [ "$ROLLBACK" = "yes" ]; then
      git checkout "$PRE_SHA" -- . ':!.hillclimb/state.html'
      git commit -m "hillclimb-iter-${ITER_N}: rolled back code to PRE_SHA (greedy, ${RUN_ID})" --allow-empty
    fi
    ;;
  fail|inconclusive)
    git checkout "$PRE_SHA" -- . ':!.hillclimb/state.html'
    git commit -m "hillclimb-iter-${ITER_N}: ${STATUS} (rolled back code, ${RUN_ID})" --allow-empty
    NO_IMPROVE=$((NO_IMPROVE + 1)) ;;
esac

Why this exact rollback form (load-bearing, do not "simplify" to git reset --hard): the dashboard's value comes from showing every attempt, including failures; chart, log, and brainstorm diagnosis all depend on the run record in state.html. state.html is gitignored, so git operations don't touch it; the :!.hillclimb/state.html pathspec is belt-and-suspenders should the gitignore ever fail to apply (git reset --hard would still clobber it in that case). The checkout form reverts every tracked path except state.html in one operation, no temp files, no race window.

Why greedy is commit-then-rollback, not rollback-only. The pass-but-no-improvement run's code is real, verified work; we want state.py rollback-to <R-id> to be able to restore it for replay. The orchestrator commits the run's tree (so set-commit has a real SHA pointing at the run's actual code), then reverts the working tree with a marker commit so the next iteration starts from PRE_SHA. The cost is two commits per no-improvement greedy iteration instead of one (the run's tree, then a revert marker); the benefit is replay parity with lazy mode.

No git add -A on the rollback marker commits. The checkout resets the index entry of every path in PRE_SHA's tree, so each marker commit matches PRE_SHA for those paths. One caveat: in git's default overlay mode, git checkout <tree> -- <pathspec> never removes files, so files that a committed greedy run added stay tracked unless the checkout is given --no-overlay. Adding git add -A here would promote the iteration's untracked artifacts (logs, checkpoints) into tracked files baked into the next iteration's PRE_SHA. Without it, untracked files stay untracked in the working tree; the loop never runs git clean, because those files are usually what the user wants to inspect.

--allow-empty on all three commit sites (the pass commit, the fail/inconclusive marker, the greedy revert marker) guards the case where the iteration's only on-disk change was state.html — gitignored and excluded by the pathspec, so both git add -A and the post-checkout index can leave nothing to commit.
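The keep-or-roll-back decision in Phase D can be restated as a pure function, which makes the four outcomes easy to see side by side. This is a sketch under assumptions: Decision and decide are hypothetical names, and the git commands themselves are left to the shell block above.

```python
from collections import namedtuple

# commit_run: commit the run's tree first (pass cases only)
# rollback:   check out PRE_SHA's code afterwards and add a marker commit
# message:    message of the first commit made for this iteration
Decision = namedtuple("Decision", "commit_run rollback message no_improve")


def decide(status, improved, greedy, iter_n, score, run_id, no_improve):
    """Mirror Phase D's case statement without touching git."""
    prefix = f"hillclimb-iter-{iter_n}"
    if status == "pass":
        if improved:
            msg = f"{prefix}: pass score={score} (new best, {run_id})"
            return Decision(True, False, msg, 0)
        if greedy:
            msg = f"{prefix}: pass score={score} (no improvement, rolling back, {run_id})"
            return Decision(True, True, msg, no_improve + 1)
        msg = f"{prefix}: pass score={score} (no improvement, {run_id})"
        return Decision(True, False, msg, no_improve + 1)
    if status in ("fail", "inconclusive"):
        msg = f"{prefix}: {status} (rolled back code, {run_id})"
        return Decision(False, True, msg, no_improve + 1)
    raise ValueError(f"unexpected verify status: {status!r}")
```

Only the greedy no-improvement case sets both commit_run and rollback, which is the two-commit cost described above.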

Phase E: Stop conditions

Check in order; stop on the first match.

| # | Condition | Reason text |
| --- | --- | --- |
| 1 | `objective.target` is set AND `best.score` crosses it | `"target met"` |
| 2 | `ITER_N >= MAX_ITER` | `"max iterations"` |
| 3 | `BRAINSTORMS_DONE >= 1` AND `NO_IMPROVE >= STALL_BUDGET` | `"stalled after brainstorm"` |
| 4 | Out of ideas after a brainstorm produced none | from Phase F |
| 5 | `no-signal` brainstorm diagnosis | from Phase F |

Direction-aware target check: <= target for minimize, >= target for maximize. If none triggers, continue to Phase F.
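Conditions 1 to 3 and the direction-aware comparison can be sketched as follows. The function and parameter names are hypothetical, and the direction values "minimize" and "maximize" are assumed to be how the objective's direction is encoded; conditions 4 and 5 are omitted because they are raised from Phase F.

```python
def target_met(best_score, target, direction):
    """Direction-aware check: <= target when minimizing, >= target when maximizing."""
    if target is None or best_score is None:
        return False
    if direction == "minimize":
        return best_score <= target
    if direction == "maximize":
        return best_score >= target
    raise ValueError(f"unknown direction: {direction!r}")


def check_stop(best_score, target, direction, iter_n, max_iter,
               brainstorms_done, no_improve, stall_budget):
    """Return the first matching stop reason, or None to continue to Phase F."""
    if target_met(best_score, target, direction):
        return "target met"
    if iter_n >= max_iter:
        return "max iterations"
    if brainstorms_done >= 1 and no_improve >= stall_budget:
        return "stalled after brainstorm"
    return None
```

The checks run in table order, so a run that both meets the target and hits the iteration cap reports "target met".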

Phase F: Brainstorm (conditional, subagent)

Trigger on NO_IMPROVE >= STUCK_THRESHOLD OR a Phase B "no open ideas" short-circuit. Otherwise skip to Phase A.

Dispatch the template with <skill> = hillclimb-brainstorm and reply contract:

{ "diagnosis": "cold-start" | "promising" | "stuck" | "no-signal",
  "ideas_added": <integer>,
  "titles": ["<title>", ...] }

Increment BRAINSTORMS_DONE; reset NO_IMPROVE=0.

diagnosis == "no-signal" → stop with that reason; tell the user the verifier or objective needs fixing before further iteration is meaningful.
ideas_added == 0 AND no open ideas remain → stop with reason "out of ideas; brainstorm produced none".
Otherwise drop a marker commit and loop back. state.html is gitignored, so this commit has no diff content; it just chronologically marks where brainstorm fired in git log:

git commit -m "hillclimb-brainstorm-${BRAINSTORMS_DONE}: added ${IDEAS_ADDED} ideas" --allow-empty
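The handling of a brainstorm reply can be sketched as below. The names after_brainstorm and marker_message are hypothetical, open_ideas_remaining is assumed to be a count the orchestrator reads from state, and using the literal string "no-signal" as the stop reason is an assumption about what "that reason" means.

```python
def after_brainstorm(reply, open_ideas_remaining):
    """Return a stop reason, or None to drop the marker commit and loop back."""
    if reply["diagnosis"] == "no-signal":
        return "no-signal"                   # verifier or objective needs fixing
    if reply["ideas_added"] == 0 and open_ideas_remaining == 0:
        return "out of ideas; brainstorm produced none"
    return None


def marker_message(brainstorms_done, ideas_added):
    """Commit message for the empty marker commit that records the brainstorm."""
    return f"hillclimb-brainstorm-{brainstorms_done}: added {ideas_added} ideas"
```

A reply with zero new ideas does not stop the loop on its own; open ideas left over from earlier rounds are enough to continue.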

Final report

Show the user, in under 15 lines:

Outcome. "Stopped after ${ITER_N} iterations: ${STOP_REASON}."
Improvements. Baseline score, final best.score, delta absolute and percent. Chain of best.run_id updates over the run.
Brainstorms. Count, total ideas added.
Git trajectory. git log --oneline ${ROOT_SHA}..HEAD, so the user can tell iteration commits, improvements, no-improvements, and rollbacks apart.
Next steps, neutrally:
- Stay on hillclimb-loop and run the hillclimb-loop skill again.
- git switch <main> && git merge hillclimb-loop (or cherry-pick).
- git switch <main> && git branch -D hillclimb-loop to discard.
Dashboard: file://${PWD}/.hillclimb/state.html.

The dashboard is the long-form artifact; this hand-off is just orientation.
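The "delta absolute and percent" figure in the report is simple arithmetic; a minimal sketch follows. The name improvement is hypothetical, and expressing the percentage relative to the absolute baseline (with no percentage when the baseline is zero) is an assumption about how the report is meant to read.

```python
def improvement(baseline, final):
    """Return (absolute delta, percent change relative to |baseline|).

    The percent is None when the baseline is zero, since the ratio is undefined.
    """
    delta = final - baseline
    percent = None if baseline == 0 else 100.0 * delta / abs(baseline)
    return delta, percent
```

The sign of the delta is left as-is; whether a negative delta is good depends on whether the objective is minimized or maximized.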

Rules

Honest execution always. Each subagent invokes the underlying skill, so its rules carry through. If you suspect a subagent gamed a check, stop the loop and surface it.
One iteration at a time. Never spawn two execute subagents in parallel. The state's in-progress invariant catches concurrent writers, but the loop's logic assumes serialized iterations.
Subagent prompts must be self-contained. Use the template above; fresh subagents inherit no orchestrator context.
No network git commands. The hillclimb-loop branch is local until the user explicitly merges or discards it.
No state.html rollback. state.html is gitignored, and Phase D's pathspec exclusion is belt-and-suspenders against the gitignore ever failing to apply. Don't git add -f .hillclimb/state.html. The rationale lives in Phase D; don't paraphrase it elsewhere.
