From autonomo
Use when the user types `/autonomo <prompt>` to autonomously turn a freeform task description into a pull request without supervision. For well-scoped, single-package tasks; not for high-stakes work (auth, billing, security, data migration, external API contracts).
How this skill is triggered — by the user, by Claude, or both
Slash command
/autonomo:autonomoThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
`/autonomo <prompt>` takes a freeform task description and runs the full `superpowers` pipeline — brainstorm a spec, write a plan, execute the plan, open a PR — without prompting the user. The user is not watching: subagents make best-effort decisions on small calls and bail with `BLOCKED:` only on high-stakes ambiguity. On success it prints a PR URL; on bail it leaves the branch and a one-page...
evals/brainstorm-q-eval.jsonevals/evals.jsonevals/fixtures/escalation-bait/lib/main.jsevals/fixtures/escalation-bait/lib/main.tsevals/fixtures/escalation-bait/lib/widget.jsevals/fixtures/escalation-bait/lib/widget.tsevals/fixtures/high-stakes-disguised-as-simple/src/auth.jsevals/fixtures/high-stakes-disguised-as-simple/src/login.jsevals/fixtures/progress-emission/plan.mdevals/fixtures/recursion-bait/CONTRIBUTING.mdevals/fixtures/recursion-bait/README.mdevals/fixtures/small-call-with-assumption/lib/date-helper.jsevals/fixtures/small-call-with-assumption/lib/helper-types.tsevals/fixtures/small-call-with-assumption/lib/main.jsevals/fixtures/small-call-with-assumption/lib/string-helper.jsevals/grade-progress-emission.pyreferences/autonomy-directive.mdreferences/failure-report.mdreferences/maintenance.mdreferences/pr-body.md/autonomo <prompt> takes a freeform task description and runs the full superpowers pipeline — brainstorm a spec, write a plan, execute the plan, open a PR — without prompting the user. The user is not watching: subagents make best-effort decisions on small calls and bail with BLOCKED: only on high-stakes ambiguity. On success it prints a PR URL; on bail it leaves the branch and a one-page report at .autonomo/<slug>-<timestamp>.md.
Runtime artifacts: logs and bail reports go in .autonomo/<slug>-<RUN_TIMESTAMP>.{log,md}. Add .autonomo/ to your repo's .gitignore — these files are per-run and never committed. (User preferences for artifact location override this default.)
Use /autonomo for tasks that are well-scoped, single-package, and don't touch auth, billing, security, data migration, or external API contracts. Typical fits: typo fixes, internal renames, adding a small utility, fleshing out a clearly-described function, doc tweaks. A concrete actionable one-liner you'd otherwise paste into a fresh session.
Do NOT use it for:
/autonomo is unattended by design. If you want to review the spec or plan before implementation, run the underlying superpowers skills directly.main/master. The skill refuses both; clean up first or switch to main/master / a freshly cut branch./autonomo --dry-run "<task>" runs the full pipeline (brainstorm → plan → execute) and produces the spec, plan, and commits locally, but skips git push and gh pr create. Useful for previewing what /autonomo would do on a borderline task, exercising the controller end-to-end while iterating on the skill, or running pressure tests against the real wiring (rather than per-phase isolation).
Every run writes a structured log to .autonomo/<slug>-<RUN_TIMESTAMP>.log and a parallel pretty stream to stdout. Both surfaces are written by scripts/emit.sh; the controller and every subagent emit through it. The full subcommand list, stdout shapes, structured-line format, and canonical stage vocabulary live in references/run-log.md.
The two surfaces feed three audiences with one artifact: the watching user (top-level + nested transcripts), a tmux tailer (tail -f on the log), and post-mortem grep against the structured lines. Skipping either surface defeats one audience.
Recursion guard. Before doing anything else, check the AUTONOMO_RUNNING environment variable.
if [ "${AUTONOMO_RUNNING:-0}" = "1" ]; then
echo "BLOCKED: /autonomo invoked recursively. Inner subagents must not call /autonomo."
exit 1
fi
export AUTONOMO_RUNNING=1
Set AUTONOMO_RUNNING=1 for the rest of the run. Subagents inherit it; if any of them tries to invoke /autonomo, that nested run sees the variable set and refuses.
Dependency check. Confirm gh is authenticated and git is available. Either missing → exit before any branch is created.
command -v gh >/dev/null 2>&1 || { echo "BLOCKED: gh CLI not installed"; exit 1; }
command -v git >/dev/null 2>&1 || { echo "BLOCKED: git not installed"; exit 1; }
gh auth status >/dev/null 2>&1 || { echo "BLOCKED: gh not authenticated; run 'gh auth login'"; exit 1; }
The superpowers plugin is also a hard requirement, but plugin presence isn't shell-checkable — the dispatcher will surface it when the first Agent tool call fails.
Capture skill paths. The skill loader announces this skill's base directory at the top of the loaded content (Base directory for this skill: <path>). Capture it for use throughout the run:
SKILL_DIR="<base directory announced by the loader>"
AUTONOMO_EMIT="${SKILL_DIR}/scripts/emit.sh"
AUTONOMO_EMIT and AUTONOMO_LOG (set in step 3) are passed in every subagent dispatch prompt; subagents source the directive from ${SKILL_DIR}/references/autonomy-directive.md.
Budget ceilings. Read the per-run ceilings from the environment, with safe defaults. Both ceilings are checked between phases by scripts/budget.sh (§4); see ## Limits for the full contract.
export AUTONOMO_MAX_TOKENS="${AUTONOMO_MAX_TOKENS:-80000}"
export AUTONOMO_MAX_DURATION_S="${AUTONOMO_MAX_DURATION_S:-900}"
Accumulator state lives in ${AUTONOMO_LOG%.log}.budget and is owned by scripts/budget.sh; the controller does not initialize it.
The <input> argument from /autonomo <input> is a freeform task description. One optional flag is recognized; everything else is the task verbatim.
| Shape | Match | Resolution |
|---|---|---|
--dry-run flag | leading --dry-run token (anywhere before the first non-flag word) | set DRY_RUN=1; remove the token from the input; continue parsing |
| Prompt | non-empty residual input | {title: <input>, body: ""} — pass through verbatim |
| Empty | residual input is empty / whitespace only | exit BLOCKED: usage: /autonomo [--dry-run] "<task description>" |
Store the resolved task object as TASK and the flag state as DRY_RUN (defaulting to 0 when unset). DRY_RUN=1 only changes §5 — the brainstorm / plan / execute pipeline runs unchanged so the spec, plan, and commits are produced as usual.
Compute the slug, then call the workspace-picker. The script enforces the decision table; the prose below documents what it does so a reader (or auditor) doesn't have to read the script to understand the controller's git assumptions.
SLUG=$(bash "${SKILL_DIR}/scripts/slugify.sh" "$INPUT")
RUN_TIMESTAMP=$(date +%s)
BRANCH_NAME=$(bash "${SKILL_DIR}/scripts/pick-workspace.sh" "${SLUG}") || exit 1
On a non-zero exit the script has already written BLOCKED: <reason> to stderr; propagate and stop. No bail report is written — no subagent has run yet, the branch was not created, and the §5 report writer is for phase failures only.
The slug rules live in scripts/slugify.sh (scripts/tests/slugify.test.sh); the workspace decision lives in scripts/pick-workspace.sh (scripts/tests/pick-workspace.test.sh). Re-run the matching test set after any edit to either.
Decision table (enforced by scripts/pick-workspace.sh, top-down — first matching row wins):
| Current state | Action | Branch printed |
|---|---|---|
| Working tree dirty (any uncommitted changes) | exit BLOCKED: working tree dirty; stash or commit first | — |
| On the default branch (origin/HEAD), clean | git checkout -b autonomo/<slug> | autonomo/<slug> |
| Inside a linked worktree, clean | git checkout -b autonomo/<slug> in place. Do not spawn a sibling worktree. | autonomo/<slug> |
| On a feature branch with no commits ahead of the default, clean | reuse the current branch as-is | <current branch> |
| On a feature branch with commits ahead of the default, clean | exit BLOCKED: feature branch already has commits; start from main or a fresh branch | — |
Detection details (also enforced by the script):
git symbolic-ref refs/remotes/origin/HEAD points at, with the origin/ prefix stripped; falls back to main if origin/HEAD is unset.git rev-parse --git-dir ending in /.git/worktrees/<name> (vs .git for the primary checkout).git log origin/<default>..HEAD --oneline is empty.The script's stdout is the canonical BRANCH_NAME value used by the PR opener (§5).
Open the run log. Once SLUG, RUN_TIMESTAMP, and BRANCH_NAME exist:
mkdir -p .autonomo
export AUTONOMO_LOG=".autonomo/${SLUG}-${RUN_TIMESTAMP}.log"
bash "${AUTONOMO_EMIT}" run-start "${BRANCH_NAME}"
echo " log: ${AUTONOMO_LOG}"
echo " tail live: tail -f ${AUTONOMO_LOG}"
AUTONOMO_LOG and AUTONOMO_EMIT are referenced by every subsequent emission and passed to every subagent.
Dispatch three subagents in sequence using the Agent tool. Each invocation passes the autonomy directive (verbatim — read from ${SKILL_DIR}/references/autonomy-directive.md) plus the dispatch prompt. After each return, check whether the output starts with BLOCKED:; treat any other failure (crash, tool error) the same way. The PR-open phase is handled by the controller directly — see §5.
Every dispatch prompt body must include:
AUTONOMO_LOG=<absolute path>
AUTONOMO_EMIT=<absolute path>
so the subagent can export them at the top of every bash session it spawns.
The wrapper around each Agent call is the same three lines — phase-start before dispatch, phase-end on success (with duration and key artifacts), or phase-bail if the return starts with BLOCKED:. The script writes both surfaces; do not echo by hand.
Per-phase budget accounting. When the Agent tool call returns, the controller (Claude) sees a notification carrying total_tokens and duration_ms for that subagent invocation. Bash cannot read those values; the controller must read them off the notification and pass them as PHASE_TOKENS / PHASE_DURATION_MS to scripts/budget.sh, which maintains the accumulators (in ${AUTONOMO_LOG%.log}.budget) and enforces both ceilings. Run it after every successful phase-end (including phase 3):
if ! bash "${SKILL_DIR}/scripts/budget.sh" check \
"${PHASE}" "${PHASE_INDEX}" 3 "${PHASE_TOKENS}" "${PHASE_DURATION_MS}"; then
# Script already emitted `budget-exceeded` (the ✗ Phase … BUDGET line on
# stdout, the structured event in the log) and wrote the bail reason to a
# sibling file. Read it and jump to §5 bail.
BAIL_REASON=$(<"${AUTONOMO_LOG%.log}.bail")
# → §5 bail with PHASE = the phase that just ran, REASON = $BAIL_REASON
fi
BAIL_REASON is one of Budget exceeded: tokens; <total>/<max> or Budget exceeded: duration; <total>s/<max>s. Tokens take precedence when both ceilings breach in the same call.
A breach after phase 3 still bails — the work exists locally on the branch, but /autonomo does not auto-push or open a PR; the suggested next step in the report points at a manual gh pr create. Mid-phase runaway is not caught; that cost is sunk by the time the phase returns.
PHASE_START=$(date +%s)
bash "${AUTONOMO_EMIT}" phase-start brainstorm 1 3
Dispatch the Agent tool with prompt: "Run the superpowers:brainstorming skill on this task. Produce a spec. Task: <prompt>. AUTONOMO_LOG=${AUTONOMO_LOG} AUTONOMO_EMIT=${AUTONOMO_EMIT}" plus the autonomy directive (verbatim).
On a non-BLOCKED: return, parse SPEC_PATH from the subagent's output. Derive ASSUMPTIONS_COUNT by counting event=assumption lines in the log — that's the authoritative surface (directive rule 5).
DURATION=$(( $(date +%s) - PHASE_START ))
ASSUMPTIONS_COUNT=$(grep -c 'phase=brainstorm event=assumption' "${AUTONOMO_LOG}" || true)
bash "${AUTONOMO_EMIT}" phase-end brainstorm 1 3 ${DURATION} \
spec=${SPEC_PATH} assumptions=${ASSUMPTIONS_COUNT} tokens=${PHASE_TOKENS}
Then run the per-phase budget check above with PHASE=brainstorm and PHASE_INDEX=1. If budget.sh exits nonzero, read BAIL_REASON from the bail file and jump to §5 with the bail path; do not proceed to §4.2.
If the return starts with BLOCKED::
bash "${AUTONOMO_EMIT}" phase-bail brainstorm 1 3 "<reason>"
Then jump to §5 with the bail path. Do not proceed to §4.2.
Identical scaffold to §4.1, with phase=plan and phase index 2 3. Dispatch prompt: "Run the superpowers:writing-plans skill against the spec at <SPEC_PATH>. AUTONOMO_LOG=${AUTONOMO_LOG} AUTONOMO_EMIT=${AUTONOMO_EMIT}" plus the directive. On non-BLOCKED: return, parse PLAN_PATH and pass plan=${PLAN_PATH} to phase-end in place of the brainstorm spec=… field.
Identical scaffold to §4.1, with phase=execute and phase index 3 3. Capture the branch base before dispatch:
BRANCH_BASE=$(git merge-base HEAD origin/main 2>/dev/null || git merge-base HEAD origin/master)
Dispatch prompt: "Run the superpowers:executing-plans skill against the plan at <PLAN_PATH>. Commit each task as you go on the current branch. AUTONOMO_LOG=${AUTONOMO_LOG} AUTONOMO_EMIT=${AUTONOMO_EMIT}" plus the directive.
On non-BLOCKED: return, run phase-end, then echo each new commit:
for sha in $(git log "${BRANCH_BASE}..HEAD" --format='%H' --reverse); do
msg=$(git show --no-patch --format='%s' "$sha")
bash "${AUTONOMO_EMIT}" commit "$sha" "$msg"
done
If any return starts with BLOCKED: or the subagent errors out, jump to the report writer. Do not proceed to the next phase. Do not retry.
On success (all three subagents returned non-BLOCKED:):
BRANCH_NAME starts with worktree-, strip that prefix. (The worktrunk / EnterWorktree workflow adds it locally; remote should see the clean name.)"<task prompt, truncated to ~70 chars at a word boundary>") and body from the template in references/pr-body.md (template + composition rules; the conditional ## Spec / ## Plan rendering is the load-bearing part). Composition is the same in normal and dry-run modes.DRY_RUN:
DRY_RUN=0 (default). Run git push -u origin <remote-branch-name>, then gh pr create --title "<title>" --body "<body>" — single command, base branch is the repo default (usually main). Print the PR URL, then echo the log path so post-mortem has it handy:
echo " log: ${AUTONOMO_LOG}"
DRY_RUN=1. Skip the push and the gh pr create call. Print a summary block to stdout (branch, commit count, head SHA, log path, would-be PR title, would-be PR body), emit dry-run-complete, then exit 0:
COMMITS=$(git rev-list --count "${BRANCH_BASE}..HEAD")
HEAD_SHA=$(git rev-parse HEAD)
cat <<EOF
dry-run summary
branch: ${BRANCH_NAME} (would push as ${REMOTE_BRANCH_NAME})
commits: ${COMMITS}
head: ${HEAD_SHA:0:12}
log: ${AUTONOMO_LOG}
would-be title: ${PR_TITLE}
would-be body:
${PR_BODY}
EOF
bash "${AUTONOMO_EMIT}" dry-run-complete "${BRANCH_NAME}" "${COMMITS}"
Branch + commits stay on disk. The operator can manually git push + gh pr create later if the dry-run output looks right.
If the push or gh pr create fails (only relevant on DRY_RUN=0), fall through to the report writer with phase = pr — the branch and commits exist locally; the human can retry the PR open manually.
On bail (any phase returned BLOCKED: or errored):
Write .autonomo/<slug>-<RUN_TIMESTAMP>.md using the template in references/failure-report.md. Create the directory if needed. Do not push, do not open a PR. Print the report path to the user. Leave the branch / worktree in place. No retry. No rollback. Bail on first failure. Bail behavior is identical in dry-run mode — the report is the same artifact regardless of DRY_RUN.
| Phase | Failure mode | Autonomo response |
|---|---|---|
| Input parse | empty input, gh auth missing | exit before any branch created; print one-line cause |
| Workspace | dirty tree, branch already has commits ahead of default, branch creation fails | exit before any subagent runs; print one-line cause and suggested fix |
| Brainstorm | returns BLOCKED:, errors, no spec path | write report, leave artifacts, exit |
| Plan | same | same |
| Execute | same; OR completes but tests / lint fail | same — no auto-retry, no auto-fix |
| PR open | push rejected, gh pr create fails | branch + commits exist locally; report points user at manual gh pr create |
| Budget | accumulated tokens or duration exceed AUTONOMO_MAX_TOKENS / AUTONOMO_MAX_DURATION_S after a phase | emit event=budget_exceeded, write report (phase = the phase that just ran, reason includes total/max), exit. Branch + any commits stay; manual gh pr create if work is salvageable. |
BLOCKED: is the controlled-failure marker. Returns starting with that prefix are expected; anything else is uncontrolled and flagged in the report. Both surface the same way — see references/failure-report.md for the report template.
/autonomo is unattended by design — a runaway subagent has no human stop button. Two ceilings cap the worst case:
| Variable | Default | What it bounds |
|---|---|---|
AUTONOMO_MAX_TOKENS | 80000 | Sum of total_tokens across all phase subagents in the run. |
AUTONOMO_MAX_DURATION_S | 900 (15 min) | Sum of duration_ms across all phase subagents, in seconds. |
Both are read once at preflight (§1) and checked between phases (§4) — never mid-phase, since interrupting a running subagent is messy and the cost is already incurred. A breach emits event=budget_exceeded reason=<tokens|duration> and routes to the bail path (no PR, branch and any commits stay on disk). Defaults are loose enough that routine runs (~30–50k tokens, sub-200s per phase observed in evals) never trip them; only a runaway run does.
Override per-run by setting either variable before invoking the skill, e.g. AUTONOMO_MAX_TOKENS=20000 /autonomo "<task>" for a tight cap on a debug run. Tokens are the budget primitive (observable on every Agent return) rather than dollars — different models cost different amounts per token, so dollar conversion belongs in a future enhancement with a price config, not here.
Operator-side footguns. (Subagent-side mistakes — escalation, recursion, scope creep — are covered by the directive and pressure-tested by evals/evals.json; this list is what trips up the controller and the human around the run.)
| Mistake | Symptom | Fix |
|---|---|---|
Forgetting to inline AUTONOMO_LOG= / AUTONOMO_EMIT= in the dispatch prompt body | Subagent's bash sessions can't find ${AUTONOMO_LOG}; emit.sh exits 2 with AUTONOMO_LOG not set; tmux tailer sees silence during multi-minute phases | Substitute both env vars literally into the prompt body — Agent subagents do not inherit the controller's environment. See §4 lines around the "Every dispatch prompt body must include" block. |
Forgetting .autonomo/ in repo .gitignore | Run logs and bail reports show up in git status and risk getting committed | Add .autonomo/ to the repo's .gitignore once. The skill itself doesn't write the entry — that's the operator's setup step (called out in the SKILL.md "Runtime artifacts" callout). |
Running /autonomo from a feature branch that already has commits ahead of main/master | Skill exits cleanly with BLOCKED: feature branch already has commits before any subagent runs | Switch to main/master or cut a fresh branch (git switch main then git checkout -b <name>) and re-run. The decision table in §3 is what enforces this. |
Editing references/autonomy-directive.md without re-running evals | Wording regresses subagent behavior in non-obvious ways (rules look fine in prose but a pressure scenario flips RED) | Re-run evals/evals.json after every directive edit. Each eval has a rerun_trigger field naming what edits warrant a re-run. See references/maintenance.md. |
Confusing AUTONOMO_LOG (the file) with AUTONOMO_RUNNING (the recursion-guard env var) | A subagent that sets AUTONOMO_LOG=1 instead of using the inherited path; or a controller that re-exports AUTONOMO_RUNNING=1 mid-run and clobbers nested checks | They are unrelated. AUTONOMO_LOG is an absolute path to a file; AUTONOMO_RUNNING is a 0/1 flag set once at preflight. The recursion guard is AUTONOMO_RUNNING; the log path is AUTONOMO_LOG. |
Leaving --dry-run in a scripted invocation | The pipeline runs to completion but no PR is opened — the operator wonders why nothing landed | Watch for the ✓ /autonomo · dry-run complete · … stdout line and the event=dry_run_complete log entry; both are emitted exactly when the push and PR are skipped. Strip --dry-run from the invocation when scripting for real runs. |
references/autonomy-directive.md — verbatim block passed to every dispatched subagent. The wording is load-bearing; pressure-test evals under evals/evals.json exercise the rules.references/run-log.md — full subcommand reference and structured-line shapes for scripts/emit.sh, plus the canonical stage vocabulary.references/pr-body.md — PR body template and composition rules (referenced by §5 success path).references/failure-report.md — bail-report template (referenced by §5 bail path and ## Failure handling).references/maintenance.md — how to re-run the pressure-test evals after editing the directive, and when to bump plugin.json version.scripts/emit.sh — the dual-write helper. bash scripts/emit.sh --help prints the subcommand list.scripts/slugify.sh — slug derivation used by §3 (kebab-case, ASCII, 40-char word-boundary truncation, auto-<ts> fallback). scripts/tests/slugify.test.sh is the canonical test set.scripts/pick-workspace.sh — workspace decision used by §3 (dirty-tree refusal, default-branch detection, linked-worktree detection, "commits ahead of default" check). Prints BRANCH_NAME on stdout, exits nonzero with BLOCKED: <reason> on stderr. scripts/tests/pick-workspace.test.sh is the canonical test set.scripts/budget.sh — per-run budget accumulator + ceiling check used by §4. Maintains state in ${AUTONOMO_LOG%.log}.budget, emits budget-exceeded on a breach, and writes the bail reason to ${AUTONOMO_LOG%.log}.bail for the controller to read. scripts/tests/budget.test.sh is the canonical test set.evals/evals.json — pressure-test evals for the autonomy directive (one eval per rule, with prompts, fixture paths, machine-checkable expectations, and RED/GREEN baselines). evals/fixtures/<eval-name>/ holds the canonical input files the runner copies into a per-run workspace. evals/brainstorm-q-eval.json separately grades clarifying-question quality (Q:/A: surfacing under rule 1). evals/grade-progress-emission.py is the deterministic grader for eval id=6.Creates, edits, and optimizes skills for Claude Code, including drafting, evaluating with test prompts, iterating on performance, and improving skill descriptions for better triggering accuracy.
npx claudepluginhub martinciu/gruppo --plugin autonomo