From juvenal
Creates and runs verified AI agent workflows using Juvenal to orchestrate Claude or Codex coding agents through alternating implementation prompts and verifier checks.
Slash command: /juvenal:juvenal [goal or command]
You are helping the user create and manage Juvenal workflows. Juvenal orchestrates AI coding agents through alternating implementation and verification phases, preventing agents from cheating on success criteria.
Juvenal is a framework where a deterministic Python runtime orchestrates AI coding agents (Claude or Codex) through verified phases. Each phase pairs an implementation prompt with one or more checks run by a separate agent.
The key insight: the implementing agent and the checking agent are separate, so the implementer can't cheat by weakening tests.
name: "my-workflow"
backend: claude  # or "codex"
working_dir: "."
max_bounces: 999  # global bounce limit
backoff: 2.0  # exponential backoff between bounces (seconds)
max_backoff: 60.0  # cap on backoff delay
notify:
  - https://example.com/webhook  # webhook on completion/failure
vars:  # template variable defaults
  ENV: staging
  TEST_DIR: tests/
include:
  - shared-phases.yaml  # merge phases from other workflows
phases:
  - id: setup
    prompt: "Set up the {{PROJECT}} project scaffolding in {{ENV}}."
    timeout: 300  # seconds
    env:
      NODE_ENV: development
    checks:
      - prompt: |
          Run `pytest tests/ -x` from the working directory and verify the result.
          Do not modify code while checking.
          Emit `VERDICT: FAIL: <reason>` on failure, otherwise emit `VERDICT: PASS`.
      - tester  # built-in role shorthand
      - role: senior-engineer  # role as dict
      - prompt: "Review the code for security issues."  # inline prompt
  - id: implement
    prompt_file: phases/implement/prompt.md
    bounce_target: setup  # on failure, bounce back to setup
    checks:
      - prompt: "Run `make test` and emit `VERDICT: PASS` only if it succeeds."
      - role: senior-engineer
parallel_groups:
  # Lanes: concurrent mini-pipelines with per-lane bounce loops
  - lanes:
      - [feature-a, check-a]  # lane 1
      - [feature-b, check-b]  # lane 2
  # Legacy flat: run implement phases concurrently (no per-phase checking)
  - phases: [independent-x, independent-y]
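The backoff and max_backoff settings can be read as a capped exponential schedule. The following is a minimal sketch, not Juvenal's actual implementation: it assumes the first bounce waits backoff seconds and each later bounce doubles the delay until max_backoff is reached. The name bounce_delay is hypothetical.

```python
def bounce_delay(bounce_index: int, backoff: float = 2.0, max_backoff: float = 60.0) -> float:
    """Delay in seconds before retrying after the given bounce (0-based).

    Assumption: the base delay doubles on every bounce and is capped at max_backoff.
    """
    return min(backoff * (2 ** bounce_index), max_backoff)


# Delays for the first six bounces with the values from the workflow above.
delays = [bounce_delay(i) for i in range(6)]
print(delays)
```

Under these assumptions the cap takes effect on the sixth bounce, where the uncapped delay would be 64 seconds.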
my-workflow/
  phases/
    01-setup/
      prompt.md        # implement phase
      check.md         # check phase (auto-bounces to 01-setup)
    02-parallel/       # "parallel" in name → parallel lane group
      feature-a/       # each subdir is a lane
        prompt.md      # implement phase
        check.md       # check phase (auto-bounces to implement)
      feature-b/
        prompt.md
        check.md
    03-check-final/    # "check-" prefix → standalone check phase
      prompt.md
In any phase directory, extra .md files (besides prompt.md) become check phases with bounce_target set to the implement phase. If a checker needs to run a command, write that command into the markdown prompt. Directories with check- prefix or -check- in the name are standalone check phases (only prompt.md is used).
Lanes can also use subdirectories: 02-parallel/a/01-implement/prompt.md, 02-parallel/a/02-check-review/prompt.md.
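The naming rules above can be summarized as a small classifier. This is a sketch under stated assumptions, not Juvenal's own loader: the function name classify_phase_dir is hypothetical, and checking the check- rule before the parallel rule is a guess about precedence that the text does not specify.

```python
def classify_phase_dir(name: str) -> str:
    """Classify a phase directory by its name, following the directory convention."""
    if name.startswith("check-") or "-check-" in name:
        return "check"           # standalone check phase (only prompt.md is used)
    if "parallel" in name:
        return "parallel-group"  # each subdirectory is a lane
    return "implement"           # prompt.md is the implement phase


for d in ["01-setup", "02-parallel", "03-check-final", "02-check-review"]:
    print(d, "->", classify_phase_dir(d))
```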
A single .md file becomes a single implement phase:
juvenal run task.md
| Type | Description |
|---|---|
| implement | Agent executes a prompt to build/modify code (default) |
| check | Separate agent verifies work, emits VERDICT: PASS or VERDICT: FAIL: reason |
| workflow | Sub-workflow: dynamic (from prompt) or static (from file/dir) |
# Dynamic: LLM plans the sub-workflow from the prompt
- id: dynamic-feature
  type: workflow
  prompt: "Build a REST API with authentication and tests."
  max_depth: 2  # recursion depth limit (default: 3)
# Static: execute an existing workflow YAML or directory
- id: auth-module
  type: workflow
  workflow_file: auth/workflow.yaml
- id: frontend
  type: workflow
  workflow_dir: frontend/
Static sub-workflows skip the LLM planning step. Paths resolve relative to the declaring YAML file. Parent workflow vars propagate to sub-workflows. workflow_file and workflow_dir are mutually exclusive.
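The distinction between dynamic and static sub-workflows, and the mutual exclusion of workflow_file and workflow_dir, could be checked along these lines. This is an illustrative sketch only: subworkflow_mode is a hypothetical helper, and the error messages are made up rather than taken from Juvenal.

```python
def subworkflow_mode(phase: dict) -> str:
    """Return "static" or "dynamic" for a phase declared with type: workflow."""
    if phase.get("type") != "workflow":
        raise ValueError(f"phase {phase.get('id')!r} is not a workflow phase")
    has_file = "workflow_file" in phase
    has_dir = "workflow_dir" in phase
    if has_file and has_dir:
        raise ValueError("workflow_file and workflow_dir are mutually exclusive")
    if has_file or has_dir:
        return "static"   # existing YAML or directory, no LLM planning step
    if "prompt" in phase:
        return "dynamic"  # the LLM plans the sub-workflow from the prompt
    raise ValueError("workflow phase needs prompt, workflow_file, or workflow_dir")


print(subworkflow_mode({"id": "auth-module", "type": "workflow",
                        "workflow_file": "auth/workflow.yaml"}))
```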
Checks are defined inline on implement phases. Each entry can be:
- a built-in role name as a plain string: tester, architect, pm, senior-tester, senior-engineer, security-engineer, technical-writer, professor, grant-reviewer
- `role: NAME`: agent checker with built-in role
- `role: NAME` + `prompt: TEXT`: built-in role plus extra checker instructions
- `prompt: TEXT`: agent checker with inline prompt
- `prompt_file: PATH`: agent checker with prompt from file

Checks can also carry timeout and env.
- `bounce_target` (singular, fixed): always bounces to this phase on failure
- `bounce_targets` (list, agent-guided): checker picks which phase to bounce to via `VERDICT: FAIL(target-id): reason`. Falls back to first in the list.

These are mutually exclusive.
- id: review
  type: check
  bounce_targets:
    - design-experiments  # agent can bounce here
    - write-paper  # or here
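To make the verdict protocol concrete, here is a hedged sketch of how checker output might be parsed. It is not Juvenal's parser: the names Verdict and parse_verdict are hypothetical, and both the exact line format accepted by the regex and the choice to honor the last verdict line are assumptions. The fallback to the first entry of bounce_targets follows the rule stated above.

```python
import re
from typing import NamedTuple, Optional, Sequence


class Verdict(NamedTuple):
    passed: bool
    target: Optional[str]  # phase id to bounce to on failure, if any
    reason: str


# Matches "VERDICT: PASS", "VERDICT: FAIL: reason", and "VERDICT: FAIL(target-id): reason".
_VERDICT_RE = re.compile(
    r"^VERDICT:\s*(PASS|FAIL)(?:\(([^)]+)\))?(?::\s*(.*))?$", re.MULTILINE
)


def parse_verdict(output: str, bounce_targets: Sequence[str] = ()) -> Optional[Verdict]:
    """Parse the last verdict line in checker output; return None if there is none."""
    matches = list(_VERDICT_RE.finditer(output))
    if not matches:
        return None
    status, target, reason = matches[-1].groups()
    if status == "PASS":
        return Verdict(True, None, "")
    # Unknown or missing target: fall back to the first configured bounce target.
    if target not in bounce_targets:
        target = bounce_targets[0] if bounce_targets else None
    return Verdict(False, target, (reason or "").strip())


targets = ["design-experiments", "write-paper"]
print(parse_verdict("VERDICT: FAIL(write-paper): results section is thin", targets))
```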
Each lane is a mini-pipeline (implement + check) that runs its own internal bounce loop. All lanes run concurrently with a shared global bounce budget.
parallel_groups:
  - lanes:
      - [feature-a, check-a]
      - [feature-b, check-b]
      - [feature-c, check-c]
Legacy flat groups run implement phases concurrently with no per-phase checking. A single failure aborts the group.
parallel_groups:
  - phases: [a, b, c]
Compose workflows from reusable pieces:
# main.yaml
include:
  - shared/setup.yaml
  - shared/linting.yaml
phases:
  - id: feature
    prompt: "Build the feature."
Included phases, parallel groups, and other settings are merged. Circular includes are detected.
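Include merging with circular-include detection can be sketched with an in-memory mapping standing in for the YAML files. This is a simplified illustration, not Juvenal's loader: resolve_phases is a hypothetical name, the phase ids setup and lint are invented, only phases are merged, and placing included phases before the including file's own phases is an assumption.

```python
from typing import Dict, List, Tuple


def resolve_phases(name: str, workflows: Dict[str, dict],
                   _stack: Tuple[str, ...] = ()) -> List[dict]:
    """Return the phases of `name`, included phases first; raise on circular includes."""
    if name in _stack:
        cycle = " -> ".join(_stack + (name,))
        raise ValueError(f"circular include: {cycle}")
    workflow = workflows[name]
    phases: List[dict] = []
    for included in workflow.get("include", []):
        phases.extend(resolve_phases(included, workflows, _stack + (name,)))
    phases.extend(workflow.get("phases", []))
    return phases


# Stand-ins for parsed YAML files (contents of the shared files are hypothetical).
workflows = {
    "main.yaml": {"include": ["shared/setup.yaml", "shared/linting.yaml"],
                  "phases": [{"id": "feature"}]},
    "shared/setup.yaml": {"phases": [{"id": "setup"}]},
    "shared/linting.yaml": {"phases": [{"id": "lint"}]},
}
print([p["id"] for p in resolve_phases("main.yaml", workflows)])
```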
Use {{VAR}} placeholders in prompts. Variables are resolved at runtime.
vars:
  ENV: staging
  PROJECT: myapp
phases:
  - id: deploy
    prompt: "Deploy {{PROJECT}} to {{ENV}}."
    checks:
      - prompt: |
          Run `curl -f https://{{ENV}}.example.com/health` and verify the deployment health check.
          Emit `VERDICT: FAIL: <reason>` if it fails; otherwise emit `VERDICT: PASS`.
# Override defaults from CLI
juvenal run workflow.yaml -D ENV=prod -D PROJECT=api
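The substitution behavior can be approximated in a few lines. This is a sketch rather than Juvenal's template engine: render_prompt is a hypothetical name, the placeholder pattern (word characters between double braces) is an assumption, and {{REGION}} is an invented variable used to show a placeholder with no value being left as-is.

```python
import re
from typing import Mapping

# Assumed placeholder syntax: {{NAME}} where NAME consists of word characters.
_PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def render_prompt(prompt: str, defaults: Mapping[str, str],
                  overrides: Mapping[str, str]) -> str:
    """Fill {{VAR}} placeholders; CLI overrides win over vars: defaults."""
    values = {**defaults, **overrides}
    # Placeholders without a value are returned unchanged.
    return _PLACEHOLDER.sub(lambda m: values.get(m.group(1), m.group(0)), prompt)


defaults = {"ENV": "staging", "PROJECT": "myapp"}  # from vars:
overrides = {"ENV": "prod", "PROJECT": "api"}      # from -D ENV=prod -D PROJECT=api
print(render_prompt("Deploy {{PROJECT}} to {{ENV}} in {{REGION}}.", defaults, overrides))
```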
- `vars:` sets defaults; CLI `-D` overrides them
- An undefined `{{VAR}}` passes through unchanged (safe for prompts containing literal `{{`)
- `-D T=a -D T=b` duplicates phases using `{{T}}` into parallel lanes (with checks grouped)
- `--dry-run` shows active variables

juvenal run <workflow> [--resume] [--rewind N] [--rewind-to PHASE_ID] [--phase X]
    [--max-bounces N] [--backend claude|codex] [--dry-run]
    [--backoff SECONDS] [--notify URL] [--working-dir DIR]
    [--state-file PATH] [--checker SPEC] [--implementer ROLE]
    [--push-main] [--clear-context-on-bounce] [-D VAR=VAL] [--serialize]
juvenal plan "goal" [-o output.yaml] [--backend claude|codex] [--push-main]
juvenal do "goal" [--backend claude|codex] [--max-bounces N] [-D VAR=VAL] [--push-main] [--serialize]
juvenal status [--state-file path]
juvenal init [directory] [--template name]
juvenal validate <workflow>
- `--checker SPEC`: Inject a checker on every implement phase. SPEC is `tester`, `tester:extra instructions`, or `prompt:TEXT`. Repeatable.
- `--implementer ROLE`: Prepend an implementer role prompt to every implement phase (e.g., software-engineer, professor-writer).
- `--push-main`: When injecting implementer roles, tell them to push after committing even on main or master.
- `--clear-context-on-bounce`: Start a fresh agent session on bounce instead of resuming (default: resume session, preserving conversation context).
- `-D VAR=VAL`: Set a template variable. Use `{{VAR}}` in prompts. Repeatable. Overrides `vars:` defaults in YAML. Multiple values for the same key (`-D T=a -D T=b`) duplicate phases into parallel lanes.
- `--serialize`: Disable all parallelization (run parallel groups and lanes sequentially).
- `--backoff SECONDS`: Exponential backoff between bounces (base delay, doubles each bounce, capped at `--max-backoff` or the workflow's `max_backoff`).
- `--notify URL`: Webhook URL for JSON notifications on completion/failure. Repeatable.
- `--dry-run`: Print execution plan, validation, and phase summary without running.

research-paper: Write a research paper

A 14-phase workflow for producing a research paper from an initial idea, with 6 agent roles: professor, postdoc, graduate researcher, research engineer, and two academic reviewers (one positive, one skeptical).
Setup: Create an IDEA.md in your working directory with the research idea, then run:
juvenal run $(python -c "from pathlib import Path; print(Path(__import__('juvenal').__file__).parent / 'workflows' / 'research-paper.yaml')") --backend claude
Phases:
| # | Phase | Agent | Type |
|---|---|---|---|
| 1 | research-plan | Professor | implement |
| 2 | design-experiments | Graduate Researcher | implement |
| 3 | design-implementation | Research Engineer | implement |
| 4 | implement-project | Research Engineer | implement |
| 5 | ensure-tests | Research Engineer | implement |
| 5b | run-tests | — | check |
| 6 | implement-experiments | Graduate Researcher | implement |
| 7 | run-experiments | Graduate Researcher | implement |
| 8 | results-review | Postdoc | check -> design-experiments |
| 9 | design-paper | Professor | implement |
| 10 | write-paper | Graduate Researcher | implement |
| 11 | professor-review | Professor | check -> [design-experiments, write-paper] |
| 12 | reviewer-a-review | Reviewer A (positive) | check -> [design-experiments, write-paper] |
| 13 | reviewer-b-review | Reviewer B (skeptical) | check -> [design-experiments, write-paper] |
Artifacts produced: PLAN.md, DESIGN.md, IMPLEMENTATION.md, RESULTS.md, OUTLINE.md, PAPER.md, reviews/
When the user invokes /juvenal, help them by:
- Creating a workflow.yaml file for their goal
- Running it with juvenal run via Bash

Always create workflows that are specific, testable, and have meaningful checks.
npx claudepluginhub zardus/juvenal