From my-auto-goal-plugin
Use when the user wants an autonomous goal loop that keeps iterating inside the current project directory, with upfront approval, locked scope, and minimal interruptions during research, coding, debugging, app work, SEO work, or content iteration. Trigger on phrases like "auto goal", "keep iterating", "autonomous loop", "research loop", "improve until", or any request for repeated autonomous improvement cycles.
Slash command: /my-auto-goal-plugin:my-auto-goal
> *Give it a goal. It keeps iterating until it gets there or cleanly pauses.*
Autonomous goal-achievement loop inspired by karpathy/autoresearch, generalized for research, code, debug, app, content, and web/SEO work.
CRITICAL: Do NOT invoke Skill(my-auto-goal-plugin:my-auto-goal) from within this skill. You are already inside it. Doing so creates an infinite loop.
You MUST execute these phases in strict order. Do NOT skip to Phase 4 (the loop). Each phase has a gate — you cannot proceed past it until it is satisfied.
Phase 1: Goal Clarification → GATE: all required fields have values
Phase 2: Plan Brief & Approval → GATE: user said "yes" or "proceed"
Phase 3: Session Setup → GATE: session directory + files exist on disk
Phase 4: The Loop → only after all 3 gates passed
The main session (lead) owns the loop: clarifies the goal, gets approval, dispatches subagents, scores iterations, and makes all keep/discard decisions.
Subagents are bounded workers. Dispatch them via the Agent tool for:
- Research (subagent_type: "general-purpose")
- Exploration (subagent_type: "Explore")

Always dispatch multiple subagents in parallel when possible: use run_in_background: true for concurrent work.
Agent({
subagent_type: "general-purpose",
run_in_background: true,
prompt: "Research [topic]. Use WebSearch and WebFetch. Write findings to <path>. Return summary.",
description: "Research: [topic]"
})
Subagent fallback: If subagents return empty or failed results (WebSearch/WebFetch can fail due to permissions or model limitations), the lead must perform the research directly or try alternative data sources. See the domain-specific loop guide for fallback details.
Background agent polling: When dispatching agents with run_in_background: true:
Non-negotiable rules:
MANDATORY. Use AskUserQuestion to gather the goal spec. Infer what you can from context, ask about the rest.
Required fields (all must have values before proceeding):
goal: <specific objective>
success_criteria: <measurable completion condition>
constraints: <boundaries / must-not-change>
domain_type: <research | code | debug | app | content | web-seo>
project_root: <default = cwd>
scope: <default = project_root, or narrower>
eval_command: <for code/app/web-seo; repro steps for debug; "rubric" for research/content>
metric_key: <metric name>
metric_direction: <higher-is-better | lower-is-better>
iteration_timeout: <default 5min code, 10min research/content>
soft_pivot: <default 5>
max_no_improve: <default 10>
Smart defaults: For research/content, default to rubric-based scoring. For debug, default to bug_status metric.
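The field list and the Phase 1 gate can be sketched as a small data structure. This is a minimal illustration, not part of the plugin: the class name `GoalSpec`, the `missing_fields` helper, and the 5-minute timeout for debug/app/web-seo are assumptions (the source only states 5 minutes for code and 10 for research/content).

```python
from dataclasses import dataclass, fields
from pathlib import Path

DOMAINS = {"research", "code", "debug", "app", "content", "web-seo"}
RUBRIC_DOMAINS = {"research", "content"}


@dataclass
class GoalSpec:  # hypothetical name, for illustration only
    goal: str
    success_criteria: str
    constraints: str
    domain_type: str
    eval_command: str = ""
    metric_key: str = ""
    metric_direction: str = "higher-is-better"
    project_root: str = ""
    scope: str = ""
    iteration_timeout_min: int = 0
    soft_pivot: int = 5
    max_no_improve: int = 10

    def __post_init__(self):
        if self.domain_type not in DOMAINS:
            raise ValueError(f"unknown domain_type: {self.domain_type}")
        rubric = self.domain_type in RUBRIC_DOMAINS
        # project_root defaults to cwd, scope defaults to project_root.
        self.project_root = self.project_root or str(Path.cwd())
        self.scope = self.scope or self.project_root
        # Smart defaults: rubric scoring for research/content, bug_status for debug.
        if not self.eval_command and rubric:
            self.eval_command = "rubric"
        if not self.metric_key and self.domain_type == "debug":
            self.metric_key = "bug_status"
        # Assumption: every non-rubric domain uses the 5-minute "code" default.
        if not self.iteration_timeout_min:
            self.iteration_timeout_min = 10 if rubric else 5

    def missing_fields(self):
        """Names of fields that still have no value (the gate requires none)."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]
```

The gate is satisfied when `missing_fields()` returns an empty list; anything it names is what still needs to be asked about.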
Before proceeding to Phase 2, verify ALL prerequisites. Do NOT skip this.
| Domain | Check | How to verify |
|---|---|---|
| All | Writable project directory | ls -la <project_root> |
| code/app/debug | Git repo exists | git -C <project_root> status |
| code/app/debug | Working tree is clean | git -C <project_root> diff --quiet |
| code/app/debug | Eval command works | Run <eval_command> and check exit code |
| code/app/debug | Test runner available | which pytest or which npm etc. |
| web-seo | Lighthouse available | npx lighthouse --version |
| research/content | None required | — |
If a prerequisite fails:
- Tell the user what to fix, e.g. "git init in your project first"

GATE: All prerequisites pass. All required fields have values. Only then proceed to Phase 2.
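As a rough sketch of the prerequisite table, the checks can be expressed as (name, command) pairs chosen by domain. The function names are hypothetical, and the test-runner row is left out because the right `which ...` command depends on the project.

```python
import shlex
import subprocess

GIT_DOMAINS = {"code", "app", "debug"}


def prerequisite_checks(domain_type, project_root, eval_command=""):
    """Return (check name, shell command) pairs for one domain."""
    root = shlex.quote(project_root)
    checks = [("Writable project directory", f"ls -la {root}")]
    if domain_type in GIT_DOMAINS:
        checks.append(("Git repo exists", f"git -C {root} status"))
        # `git diff --quiet` exits non-zero when the working tree has changes.
        checks.append(("Working tree is clean", f"git -C {root} diff --quiet"))
        if eval_command:
            checks.append(("Eval command works", eval_command))
    elif domain_type == "web-seo":
        checks.append(("Lighthouse available", "npx lighthouse --version"))
    return checks


def failed_checks(checks):
    """Run each command and return the names of those that exit non-zero."""
    failed = []
    for name, command in checks:
        result = subprocess.run(command, shell=True, capture_output=True)
        if result.returncode != 0:
            failed.append(name)
    return failed
```

An empty list from `failed_checks(...)` corresponds to the gate passing; any name it returns is the prerequisite to report back to the user.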
MANDATORY. Present this brief and wait for explicit approval. Do NOT start the loop without it.
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Goal: <goal>
Domain: <domain_type> (LOCKED)
Project: <project_root> (LOCKED)
Scope: <scope> (LOCKED)
Eval: <eval_command or "rubric-based self-scoring">
Metric: <metric_key> (<direction>)
Timeout: <iteration_timeout> per iteration
Stagnation: soft pivot at <N>, hard pause at <N>
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Will create: <project_root>/.autogoal-<YYYYMMDD-HHMMSS>/
Will do: autonomous loop, subagent dispatch, git branch (code domains)
Proceed? (yes / modify first)
Use AskUserQuestion with options ["Yes, proceed", "Modify first"] to get approval.
GATE: User approved. Only then proceed to Phase 3.
MANDATORY. Create these files on disk before starting any iteration.
<project_root>/.autogoal-<YYYYMMDD-HHMMSS>/
├── session.md # goal spec + session log
├── results.tsv # iter, id, score, delta, status, description
├── resume.md # recovery state (overwritten each iteration)
├── outputs/ # iteration artifacts (research-v1.md, etc.)
└── domain/ # domain-specific work (subagent outputs go here)
Create ALL directories: outputs/ and domain/ must exist before dispatching subagents. For research, also create domain/research/ for subagent write targets.
- Create results.tsv with header: iter\tid\tscore\tdelta\tstatus\tdescription
- Create git branch autogoal/<YYYYMMDD-HHMMSS> (code domains)
- Write session.md with the goal spec (see templates/goal-session.md)
- Write resume.md with iteration = 0 (see templates/resume-state.md)
- Read loops/<domain_type>.md

GATE: Verify session directory exists with ls <session_dir>. Only then proceed to Phase 4.
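The on-disk layout from this phase could be created along these lines. This is a sketch under assumptions: `create_session` is a hypothetical helper, and the contents written to session.md and resume.md are placeholders standing in for the real templates, which are not shown here.

```python
from datetime import datetime
from pathlib import Path

TSV_HEADER = "iter\tid\tscore\tdelta\tstatus\tdescription\n"


def create_session(project_root, domain_type, now=None):
    """Create the .autogoal-<timestamp>/ directory tree and its starter files."""
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    session = Path(project_root) / f".autogoal-{stamp}"
    # outputs/ and domain/ must exist before any subagent is dispatched.
    (session / "outputs").mkdir(parents=True)
    (session / "domain").mkdir()
    if domain_type == "research":
        (session / "domain" / "research").mkdir()
    (session / "results.tsv").write_text(TSV_HEADER)
    # Placeholder bodies; the real ones come from the plugin's templates.
    (session / "session.md").write_text("# Session\n")
    (session / "resume.md").write_text("iteration = 0\n")
    return session
```

Passing `now` explicitly makes the directory name deterministic, which is mainly useful for testing.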
Iteration 0 is always baseline. Then iterate.
BASELINE (iter 0):
Run eval / self-score current state → baseline score
Log to results.tsv: 0\t-\t<score>\t-\tbaseline\tInitial state
If eval fails → ask user to fix eval_command, do NOT proceed
ITERATE (iter >= 1):
1. Choose ONE hypothesis (one focused change)
2. Dispatch subagents in parallel for research/exploration if needed
3. Implement or synthesize
4. Evaluate (run eval or self-score)
5. Compute delta → apply keep/discard
6. Log to results.tsv + update resume.md
7. Check stopping conditions → if goal met, generate Final Report and stop
8. Continue
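Logging a row to results.tsv (the baseline row and step 6 above) might look like the following. `log_result` is a hypothetical helper; it writes `-` for values that do not apply, matching the baseline row format shown above.

```python
import csv


def log_result(results_path, iteration, change_id, score, delta, status, description):
    """Append one tab-separated row to results.tsv."""
    row = [
        iteration,
        change_id if change_id is not None else "-",
        score,
        delta if delta is not None else "-",
        status,
        description,
    ]
    with open(results_path, "a", newline="") as handle:
        csv.writer(handle, delimiter="\t", lineterminator="\n").writerow(row)
```

For the baseline, a call such as `log_result(path, 0, None, 42.0, None, "baseline", "Initial state")` appends `0\t-\t42.0\t-\tbaseline\tInitial state`.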
| Condition | Action |
|---|---|
| delta > 0 (improved) | KEEP — commit (code domains) or save (research/content) |
| delta == 0 and simpler | KEEP |
| delta < 0 (worse) | DISCARD — git restore . && git clean -fd (code) or delete files (research/content) |
Delta is direction-aware: delta = current - best (higher-is-better) or delta = best - current (lower-is-better). Positive always means improvement.
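A minimal sketch of the direction-aware delta and the keep/discard rule follows. The function names are illustrative, and treating "delta == 0 and not simpler" as a discard is an assumption, since the table does not cover that case.

```python
def compute_delta(current, best, direction):
    """Positive result always means improvement, whatever the direction."""
    if direction == "higher-is-better":
        return current - best
    if direction == "lower-is-better":
        return best - current
    raise ValueError(f"unknown metric_direction: {direction}")


def decide(delta, simpler=False):
    """Apply the keep/discard table to one iteration."""
    if delta > 0:
        return "keep"
    if delta == 0 and simpler:
        return "keep"
    # Assumption: an equal score without simplification is not worth keeping.
    return "discard"
```

For a lower-is-better metric such as latency, going from a best of 150 to 120 gives `compute_delta(120, 150, "lower-is-better") == 30`, which `decide` keeps.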
| Domain | Method |
|---|---|
| code/app/web-seo | Parse metric_key=<number> from eval output, or JSON dot-path, or exit code |
| debug | bug_status: 0 (reproduces), 50 (partial), 100 (fixed + test added) |
| research/content | Rubric self-score: Completeness + Accuracy + Clarity + Actionability = 0-100 |
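For the `metric_key=<number>` form, extraction from eval output can be sketched with a regular expression. This covers only that one form (not the JSON dot-path or exit-code variants), and taking the last match when the key appears several times is an assumption.

```python
import re


def parse_metric(output, metric_key):
    """Return the last `metric_key=<number>` value in the output, or None."""
    pattern = re.compile(rf"(?<!\w){re.escape(metric_key)}=(-?\d+(?:\.\d+)?)")
    matches = pattern.findall(output)
    return float(matches[-1]) if matches else None
```

The `(?<!\w)` lookbehind keeps a key like `coverage` from matching inside a longer name such as `line_coverage`.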
Rubric dimension tracking (research/content):
- Record per-dimension scores in session.md (not just totals)
- Note the weakest dimension in resume.md so recovery sessions know which dimension to target next
- See loops/content.md or loops/research.md for scoring band definitions

| Condition | Action |
|---|---|
| Goal met (score meets success_criteria) | MUST generate Final Report (see below) → update session.md handoff notes → stop |
| Soft pivot (N consecutive no-improve) | Try meaningfully different approach, don't reset counter |
| Hard pause (max_no_improve reached) | Ask user: continue with new direction, or stop? |
| Scope boundary reached | Write handoff note → stop |
| User stop | Final report → stop |
On goal met: Do NOT silently continue or skip the report. The final report is a mandatory output.
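The stagnation rules can be sketched as a small counter. `StagnationTracker` is a hypothetical name, and two behaviours here are assumptions: the counter resets when an iteration improves (implied by "consecutive"), and the soft pivot fires once, at exactly `soft_pivot` misses.

```python
from dataclasses import dataclass


@dataclass
class StagnationTracker:  # hypothetical helper, not part of the plugin
    soft_pivot: int = 5
    max_no_improve: int = 10
    no_improve: int = 0

    def record(self, improved):
        """Update the counter for one iteration and return the next action."""
        if improved:
            self.no_improve = 0
            return "continue"
        self.no_improve += 1
        if self.no_improve >= self.max_no_improve:
            return "hard_pause"
        if self.no_improve == self.soft_pivot:
            # The counter is deliberately not reset on a soft pivot.
            return "soft_pivot"
        return "continue"
```

With the defaults, the fifth consecutive miss returns `"soft_pivot"` and the tenth returns `"hard_pause"`, at which point the lead asks the user how to proceed.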
Before keeping ANY iteration, verify:
- No .. traversal, no symlink escapes

Violation → mark as violation, discard, continue.
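One way to check the scope guard is to resolve each touched path and confirm it still sits under the locked scope. This is a sketch, with `within_scope` as a hypothetical helper; `Path.resolve()` collapses `..` segments and follows symlinks, so both escape routes are covered by the same comparison.

```python
from pathlib import Path


def within_scope(candidate, scope):
    """True if candidate, once fully resolved, is the scope dir or inside it."""
    scope_real = Path(scope).resolve()
    target = Path(candidate)
    if not target.is_absolute():
        # Relative paths are interpreted relative to the scope directory.
        target = scope_real / target
    target_real = target.resolve()
    return target_real == scope_real or scope_real in target_real.parents
```

A path such as `../outside.txt`, or a file reached through a symlink pointing outside the scope, resolves to a location outside `scope_real` and is reported as a violation.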
After every iteration, overwrite resume.md with current state (see templates/resume-state.md).
If context is compacted:
- Read resume.md → results.tsv → session.md
- Resume from iteration + 1; do NOT re-run baseline

MANDATORY: Generate this report and present it to the user whenever the loop ends (goal met, stagnation, user stop, or scope boundary). For content/research domains, also copy the best artifact to a -final version (e.g., content-final.md, research-final.md).
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
AUTOGOAL SESSION COMPLETE
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Goal: <goal>
Domain: <domain_type>
Total iters: <N>
Best score: <score> (iter <N>)
Improvement: <delta from baseline> (<percentage>%)
Status: <goal met | stagnated | scope boundary | user stopped>
Top improvements:
1. iter N — <description> (+X)
2. iter N — <description> (+X)
Final artifact: <path to final output>
Full log: <session_dir>/results.tsv
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Read the relevant guide when starting a loop. Loop files are in the same plugin directory as this SKILL.md:
# Find loop files relative to SKILL.md
Glob({ pattern: "**/.claude/skills/my-auto-goal/loops/<domain_type>.md", path: "~/.claude/plugins/cache/my-auto-goal-plugin" })
| Domain | Guide | Key pattern |
|---|---|---|
| research | loops/research.md | Parallel subagent dispatch, source evaluation, rubric scoring |
| code | loops/code.md | Quality gates, git workflow, retry protocol (3 attempts) |
| debug | loops/debug.md | Reproduce → isolate → hypothesize → fix → verify |
| app | loops/app.md | Foundation → logic → UI → polish phases |
| content | loops/content.md | Rubric scoring, revision by weakest dimension |
| web-seo | loops/web-seo.md | Lighthouse audit-first, priority-ordered fixes |
npx claudepluginhub akhil41/my-auto-goal-plugin --plugin my-auto-goal-plugin