Skill

auton

Diagnoses why the jleechanclaw + AO automation system is not autonomously driving PRs to green and merge. Checks config divergence, worker path, and skeptic pipeline health.

automation

devops

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/claude-commands:auton

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

When invoked, diagnose WHY the jleechanclaw + AO system is NOT autonomously driving PRs to N-green and merged. The system is supposed to do this without human intervention — if it isn't, something is broken.

SKILL.md

183 lines · ~2k tokens

Stats

Language: Python

Stars: 27

Forks: 4

Maintenance: Excellent

Last Commit: Jun 2, 2026

Autonomy Diagnostic Skill

Purpose

Read first (mandatory before answering)

⚠️ DUAL-CONFIG WARNING (2026-04-15): Two AO config directories exist with different purposes:

~/.openclaw/ - CLI/interactive use (used by ao spawn, ao config, etc.)

~/.openclaw_prod/ - worker use (used by lifecycle-workers via the launchd plist's AO_CONFIG_PATH)

Always diagnose BOTH, or verify which one is active. They can diverge and cause auth failures.

~/.hermes_prod/ — Hermes agent config (the agent since 2026-04-12; OpenClaw is dead)
~/.openclaw_prod/agent-orchestrator.yaml — AO worker config (used by lifecycle-workers)
~/.openclaw/agent-orchestrator.yaml — AO CLI config (used by interactive ao commands)
~/.codex/AGENTS.md — agent policies
~/.openclaw/SOUL.md — AO decision-making policy (legacy; Hermes uses its own)

NOTE: ao-pr-poller is DEPRECATED and removed. Do NOT check for it or report its absence as a problem.

The system's intended behavior

AO lifecycle-worker (every ~5 min via launchd)
  ↓ detects non-green PR (backfillAllPRs: true)
  ↓ spawns agento session (ao spawn --claim-pr)
  ↓ agento: reads comments, fixes code, pushes
  ↓ agento: posts @coderabbitai all good?
  ↓ agento: runs /er evidence review
  ↓ CI passes, CR APPROVED, Bugbot neutral, comments resolved
  ↓ worker-signals-completion reaction fires → skeptic-review
  ↓ skeptic-review runs `ao skeptic verify` and posts VERDICT comment
  ↓ skeptic-cron.yml checks 7-green and merges when all applicable gates pass

Key config to verify (from the ACTIVE config — run Step 0 first):

worker-signals-completion:
  auto: true
  action: skeptic-review
  skepticModel: codex

And the repo must have a healthy skeptic-cron.yml workflow. approved-and-green may still exist, but it is no longer the canonical merge executor.

If worker-signals-completion is missing, auto: false, or not action: skeptic-review, that is the local skeptic trigger gap.
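
The three gap conditions above can be checked mechanically. The following is a minimal sketch, assuming the config has already been parsed into a dict and that the reaction sits under a top-level `reactions` key (the same layout the wiring check later in this skill reads). `skeptic_trigger_gap` is a hypothetical helper name, not part of AO.

```python
from typing import Optional


def skeptic_trigger_gap(cfg: dict) -> Optional[str]:
    """Describe the local skeptic trigger gap, or return None if the hook is wired."""
    reactions = cfg.get("reactions") or {}
    reaction = reactions.get("worker-signals-completion")
    if not reaction:
        return "worker-signals-completion is missing"
    if reaction.get("auto") is not True:
        return "worker-signals-completion is not auto: true"
    if reaction.get("action") != "skeptic-review":
        return f"action is {reaction.get('action')!r}, expected 'skeptic-review'"
    return None


# Sample dict mirroring the key config shown above.
healthy = {
    "reactions": {
        "worker-signals-completion": {
            "auto": True,
            "action": "skeptic-review",
            "skepticModel": "codex",
        }
    }
}
print(skeptic_trigger_gap(healthy))  # None
print(skeptic_trigger_gap({}))       # worker-signals-completion is missing
```

A `None` result only says the local hook is configured; it says nothing about whether skeptic-cron.yml is healthy in the repo.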

Full autonomy means: idea in → merged PR out, zero human clicks.

Diagnostic questions (answer each with evidence)

0. Verify active config path (MUST DO FIRST — 2026-04-15 lesson)

# Find which config workers actually use
ps aux | grep "lifecycle-worker" | grep -v grep | head -5

# Check whether both configs exist and differ (diff exits 0 = same, 1 = differ, 2 = error)
diff -q ~/.openclaw/agent-orchestrator.yaml ~/.openclaw_prod/agent-orchestrator.yaml >/dev/null 2>&1
case $? in
  0) echo "configs IDENTICAL" ;;
  1) echo "⚠️ configs DIVERGE" ;;
  *) echo "⚠️ could not compare (is one config missing?)" ;;
esac

# If configs differ, the WORKER config is the source of truth for auth/model issues
# Workers read AO_CONFIG_PATH from their process env. Plain `ps aux` prints arguments,
# not the environment, so request the environment (e) with unlimited width (ww):
ps axeww | grep "lifecycle-worker" | grep -v grep | grep -o "AO_CONFIG_PATH=[^ ]*"

Why this matters: ~/.openclaw/ (CLI) and ~/.openclaw_prod/ (workers) can diverge. The 2026-04-15 auth outage was caused by ~/.openclaw_prod/ having model: gemini-3-flash-preview while ~/.openclaw/ had MiniMax-M2.7. Always check the worker config first for auth issues.
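
When the two files do diverge, it helps to know which keys differ rather than reading a raw diff. The following is a minimal sketch, assuming both YAML files have already been loaded into dicts. `diverging_keys` is a hypothetical helper, and the position of `model` at the top level is an assumption made for illustration only.

```python
def diverging_keys(cli_cfg: dict, worker_cfg: dict, prefix: str = "") -> list:
    """Return dotted key paths whose values differ between the two configs."""
    diffs = []
    for key in sorted(set(cli_cfg) | set(worker_cfg)):
        path = f"{prefix}{key}"
        a, b = cli_cfg.get(key), worker_cfg.get(key)
        if isinstance(a, dict) and isinstance(b, dict):
            # Recurse so nested sections report the exact leaf that differs.
            diffs.extend(diverging_keys(a, b, prefix=f"{path}."))
        elif a != b:
            diffs.append(path)
    return diffs


# Model values are the ones from the 2026-04-15 outage; the key layout is assumed.
cli = {"model": "MiniMax-M2.7"}
worker = {"model": "gemini-3-flash-preview"}
print(diverging_keys(cli, worker))  # ['model']
```

Any path reported here is a candidate explanation for behavior that differs between interactive `ao` commands and the lifecycle-workers.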

1. Are the AO lifecycle-worker and orchestrator running?

# Lifecycle-worker
launchctl list com.agentorchestrator.lifecycle-jleechanclaw

# Orchestrator: session names are hash-prefixed, so never pass the literal name
# "ao-orchestrator" to tmux -t. Match by substring and take the first hit:
orch_session=$(tmux list-sessions 2>/dev/null | grep "ao-orchestrator" | head -1 | cut -d: -f1)
if [ -n "$orch_session" ]; then
  echo "ao-orchestrator FOUND: $orch_session"
  tmux capture-pane -t "$orch_session" -p -S -10
else
  echo "ao-orchestrator NOT FOUND"
fi

# Recent lifecycle-worker activity
tail -20 /tmp/ao-lifecycle-jleechanclaw.log

3. Are sessions being spawned?

tmux list-sessions | grep -E "^[a-z]{2}-[0-9]+"
ao session ls --project agent-orchestrator 2>/dev/null || echo "ao session ls failed"

4. Are sessions doing work? (or idle/zombie)

# For each jc-* session, check if agent is alive
tmux list-sessions -F '#{session_name}' | grep jc- | while read s; do
  cmd=$(tmux list-panes -t "$s" -F '#{pane_current_command}' 2>/dev/null)
  echo "$s: $cmd"
done
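
The loop above prints each session's foreground command, which still has to be interpreted. The following is a minimal sketch of that interpretation, assuming a pane whose foreground command is a bare shell has no live agent. The `SHELLS` set and `classify_session` are hypothetical, and the labels match the alive/zombie/none states used in the output format.

```python
# Assumption: these command names mean "nothing but a shell is running".
SHELLS = {"bash", "zsh", "sh", "fish", "-bash", "-zsh"}


def classify_session(pane_commands: list) -> str:
    """Classify a tmux session from its panes' #{pane_current_command} values."""
    if not pane_commands:
        return "none"      # no panes reported, e.g. the session no longer exists
    if all(cmd in SHELLS for cmd in pane_commands):
        return "zombie"    # session exists but only idle shells remain
    return "alive"         # at least one pane runs something other than a shell


print(classify_session(["zsh"]))          # zombie
print(classify_session(["zsh", "node"]))  # alive
print(classify_session([]))               # none
```

A session classified as alive may still be stuck; this only separates idle shells from sessions that are running some process.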

5. What are the non-green reasons per PR?

gh pr list --repo jleechanorg/agent-orchestrator --state open \
  --json number,title,mergeable,mergeStateStatus,reviewDecision

# Verify skeptic-review trigger wiring (REQUIRED for autonomous reviewing):
python3 - <<'PY'
import yaml, os
# Try worker config first, fall back to CLI config
for path in [os.environ.get("AO_CONFIG_PATH", ""), "~/.openclaw_prod/agent-orchestrator.yaml", "~/.openclaw/agent-orchestrator.yaml"]:
    if path and os.path.exists(os.path.expanduser(path)):
        cfg = yaml.safe_load(open(os.path.expanduser(path)))
        print(f"Config: {path}")
        print(((cfg.get("reactions") or {}).get("worker-signals-completion") or {}))
        break
PY

# Verify skeptic-cron is present and running:
gh run list --repo jleechanorg/agent-orchestrator --workflow skeptic-cron.yml --limit 3
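
The JSON from the `gh pr list` call above can be reduced to one line of reasons per PR. The following is a minimal sketch that looks only at the three fields requested by that command; it does not cover every gate of the 7-green check, and `non_green_reasons` is a hypothetical helper. The sample PR is invented.

```python
import json


def non_green_reasons(pr: dict) -> list:
    """List reasons a PR is not green, based only on gh's merge/review fields."""
    reasons = []
    if pr.get("mergeable") == "CONFLICTING":
        reasons.append("merge conflict")
    if pr.get("mergeStateStatus") != "CLEAN":
        reasons.append(f"mergeStateStatus={pr.get('mergeStateStatus')}")
    if pr.get("reviewDecision") != "APPROVED":
        reasons.append(f"reviewDecision={pr.get('reviewDecision') or 'none'}")
    return reasons


# Invented sample in the shape printed by `gh pr list --json ...`.
sample = json.loads(
    '[{"number": 1, "title": "example", "mergeable": "CONFLICTING",'
    ' "mergeStateStatus": "DIRTY", "reviewDecision": "CHANGES_REQUESTED"}]'
)
for pr in sample:
    print(f"#{pr['number']}: {', '.join(non_green_reasons(pr)) or 'green'}")
# #1: merge conflict, mergeStateStatus=DIRTY, reviewDecision=CHANGES_REQUESTED
```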

6. Is CR rate-limited?

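# Quote the URL so the shell does not glob "?"; sort newest-first so recent comments are checked
gh api "repos/jleechanorg/jleechanclaw/issues/comments?per_page=5&sort=created&direction=desc" | \
  python3 -c "import json,sys; [print(c['user']['login'],c['body'][:100]) for c in json.load(sys.stdin) if 'rate limit' in c['body'].lower()]"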

7. Is the 7-green review + merge chain working correctly?

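# Verify latest skeptic markers and VERDICT comment are bound to the head SHA
# (replace <PR_NUM>; the URL is quoted so the shell does not treat < > as redirection,
#  and jq -s merges the per-page arrays that --paginate emits)
gh api "repos/jleechanorg/agent-orchestrator/issues/<PR_NUM>/comments" --paginate | \
  jq -s 'add | [.[] | select(.body | test("skeptic-(gate|cron)-trigger|VERDICT:"; "i"))] | .[-5:]'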
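
The same filter can be applied in Python when the comments are already loaded, with a rough check that a comment refers to the current head commit. This is a minimal sketch: the regex is the one used in the command above, but the comment bodies below are invented and the idea that a marker comment contains the SHA as plain text is an assumption. Both helper names are hypothetical.

```python
import re

# Same pattern as the jq filter: skeptic trigger markers or a VERDICT line.
MARKER = re.compile(r"skeptic-(gate|cron)-trigger|VERDICT:", re.IGNORECASE)


def latest_skeptic_comments(comments: list, limit: int = 5) -> list:
    """Return the last `limit` comments whose body matches a skeptic marker."""
    matching = [c for c in comments if MARKER.search(c.get("body") or "")]
    return matching[-limit:]


def mentions_sha(comment: dict, head_sha: str) -> bool:
    """Loose check (assumption): the body contains the 7-char prefix of the head SHA."""
    return head_sha[:7] in (comment.get("body") or "")


# Invented sample comments, oldest first, as the issues API returns them.
comments = [
    {"body": "LGTM"},
    {"body": "<!-- skeptic-cron-trigger -->"},
    {"body": "VERDICT: PASS for abc1234"},
]
recent = latest_skeptic_comments(comments)
print(len(recent))                              # 2
print(mentions_sha(recent[-1], "abc1234def0"))  # True
```

If the newest marker comment does not mention the head SHA, the verdict may belong to an older commit, which would fit the "marker mismatch" failure mode.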

8. Is the stray-worktree bug blocking spawns?

# git prints absolute paths, so match against $HOME rather than a literal "~"
git -C ~/.openclaw worktree list | grep -Ev "^$HOME/\.(openclaw|worktrees)(/| )"
# Any /private/tmp/ or unexpected paths = stray worktree blocking new spawns
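
For scripted checks, the machine-readable output of `git worktree list --porcelain` is easier to parse than the default listing, because each worktree starts with a `worktree <path>` line. The following is a minimal sketch, assuming the allowed roots are the ones named above. `stray_worktrees` is a hypothetical helper and the paths in the sample are invented.

```python
def stray_worktrees(porcelain: str, allowed_roots: list) -> list:
    """Return worktree paths that are neither an allowed root nor nested under one."""
    paths = [
        line[len("worktree "):]
        for line in porcelain.splitlines()
        if line.startswith("worktree ")
    ]

    def allowed(path: str) -> bool:
        return any(
            path == root.rstrip("/") or path.startswith(root.rstrip("/") + "/")
            for root in allowed_roots
        )

    return [p for p in paths if not allowed(p)]


# Invented sample in the shape of `git worktree list --porcelain` output.
sample = (
    "worktree /Users/example/.openclaw\n"
    "HEAD 1111111111111111111111111111111111111111\n"
    "branch refs/heads/main\n"
    "\n"
    "worktree /private/tmp/ao-stray\n"
    "HEAD 2222222222222222222222222222222222222222\n"
    "detached\n"
)
roots = ["/Users/example/.openclaw", "/Users/example/.worktrees"]
print(stray_worktrees(sample, roots))  # ['/private/tmp/ao-stray']
```

Matching on a trailing slash keeps a sibling such as `~/.openclaw_prod` from being treated as part of `~/.openclaw`.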

Common failure modes

| Symptom | Root cause | Fix |
|---|---|---|
| Sessions spawn but die immediately | `--claim-pr` fails (stray worktree) | Clear stale paths / targeted unlock |
| Sessions alive but no pushes | Agent hits rate limit or auth failure | Check agent logs, re-auth |
| CR never APPROVED | Rate limited (too many PRs) | Wait for limit reset, or reduce simultaneous PRs |
| 7-green checks pass except skeptic | `worker-signals-completion` missing, marker mismatch, or skeptic-review failed | Verify skeptic-review hook and latest VERDICT comment markers |
| `skeptic-cron` runs but merges 0 PRs | No PR is truly 7-green | Inspect gate-by-gate failures in workflow logs |
| `/tmp/ao-pr-poller.log` missing | NOT A BUG: ao-pr-poller is deprecated/removed | Ignore; its absence is correct and expected |
| PRs cycling CR changes_requested | Agent not reading CR comments correctly | Check agento's comment-reading skill |
| Spawned sessions are idle shells | `is_agent_alive_in_session` returns false | Check Hermes gateway is running (`hermes gateway status`) |

Output format

## Autonomy Diagnostic — <date>

### System health
- AO lifecycle-worker: RUNNING / STOPPED
- Orchestrator session: FOUND (hash-prefixed name) / NOT FOUND
- Skeptic-review hook: worker-signals-completion auto=true/false, action=X
- Skeptic-cron workflow: RUNNING / FAILING / NOT FOUND
- Active sessions: N (M with live agents)
- Open PRs: N total, N non-green

### Per-PR status
| PR | Non-green reason | Session | Session state |
|---|---|---|---|
| #NNN | <reason from log> | jc-NNN | alive/zombie/none |

### Root cause
<Primary reason the system is not progressing PRs autonomously>

### Recommended fix
<Concrete next step>
