From ce
Downstream: pega Issue especificada e entrega PR com validation report. Autônomo (Claude sozinho). Fases: research, decompose (plan interno), implement (subagentes), validate (subagente isolado), deliver (PR). Triggers: '/delivery #N', 'implementa #N', 'faz #N', 'entrega #N', 'build #N'.
How this skill is triggered — by the user, by Claude, or both
Slash command
/ce:deliverysonnetThis skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
**Input:** `$ARGUMENTS` — must be a GitHub Issue number (e.g., `#10`)
Input: $ARGUMENTS — must be a GitHub Issue number (e.g., #10)
Architecture reference:
ARCHITECTURE.md§ "Delivery — downstream" This skill runs inline — needs to spawn subagents for research, implementation, and validation.
gh issue view $ISSUE_NUMBER --json title,body,milestone,labels
Parse the Issue spec:
If the Issue lacks acceptance criteria: abort. Return to user:
"Issue #N has no acceptance criteria. Run
/discovery #Nto complete the spec first."
Check if execution-state.json exists at .claude/state/milestones/{milestone-slug}/issue-{N}-{slug}/:
If state file exists:
execution-state.jsonplan_hash with current plan (if plan.md exists)
If no state file: proceed to Step 2 (fresh delivery).
Subagent reads:
Disk output: append findings to execution-log.md § Step 2 (full analysis, risks, alternatives considered, code references with file:line).
Return to parent (max 10 lines):
Agent nudge: If research surfaces a deep technical uncertainty that the research subagent cannot resolve confidently (e.g., unknown performance characteristics, untested integration path, ambiguous API behavior): suggest @spike before proceeding. "Research found uncertainty: {question}. Consider @spike to validate before decomposing."
Purpose: Verify external premises from research before committing to a plan. Prevents implementing against wrong assumptions (changed APIs, deprecated libs, outdated patterns).
Heuristic — skip grounding when ALL of these are true:
When grounding runs, subagent receives:
Subagent verifies each premise via WebSearch against live sources (official docs, changelogs, npm/pypi, API references).
Disk output: append full grounding report to discovery.md with timestamp:
## Grounding — 2026-04-08
| Premise | Source | Status | Notes |
|---------|--------|--------|-------|
| jose v5.2 supports ES256 | npmjs.com/package/jose | ✅ verified | current: v5.4, ES256 supported since v4.0 |
| /api/v3/auth accepts PKCE | developer.example.com | ⚠️ changed | v3 deprecated, v4 is current — PKCE only on v4 |
### Flagged risks
- **BLOCKING:** /api/v3/auth is deprecated. Spec should target v4.
### Corrections
- Update auth integration to use API v4 endpoints
Return to parent (max 5 lines):
If blocking risks found: surface to orchestrator before proceeding to decompose:
"Grounding found invalid premise for #N: [detail]. Research assumed X but current state is Y. Adjusting plan to account for Y."
If no blocking risks: proceed to decompose. Parent reads grounding report from discovery.md only when writing PR body (Step 6).
──▶ Notification checkpoint: "Approaching #N as X in N parts: [summary]. Grounding: N premises verified, N flagged."
Create internal plan at .claude/state/milestones/{milestone-slug}/issue-{N}-{slug}/plan.md
Create execution log at .claude/state/milestones/{milestone-slug}/issue-{N}-{slug}/execution-log.md
Using templates: templates/state/issue/plan.md, templates/state/issue/execution-log.md
This is agent infrastructure. Deliverables do NOT become GitHub Issues.
plan.md is temporary infrastructure — deleted after PR opens (Step 6). If delivery is interrupted, it can be regenerated from the Issue spec + git diff.
Initialize execution-state.json with all batches as pending and a hash of plan.md:
{
"issue": N,
"delivery_started": "ISO-8601",
"plan_hash": "sha256 of plan.md content",
"batches": [
{"id": 1, "status": "pending"},
{"id": 2, "status": "pending"}
],
"grounding": {"verified_at": "ISO-8601", "flags": []},
"last_checkpoint": "ISO-8601"
}
After creating plan.md, post a ## Decomposition comment on the Issue:
gh issue comment $ISSUE_NUMBER --body "$(cat <<'EOF'
## Decomposition
{list of deliverables with size and deps — from plan.md}
**Git strategy:** {mode}, branch: `{branch_name}`
EOF
)"
For each batch (respecting deps):
| Deliverable complexity | Subagent tier |
|---|---|
| Simple (config, rename, boilerplate, unit tests) | haiku |
| Medium (new feature, refactor, integration) | sonnet |
| Complex (architecture, multi-file, non-trivial logic) | sonnet |
Parallel deliverables within a batch use isolation: "worktree".
Subagent return contract: Each implementation subagent writes code (persisted via git commits) and appends notes to execution-log.md. Returns to parent only:
The parent never receives raw code, diffs, or implementation rationale — those live in git history and execution-log.md.
Decision markers: After each batch, update execution-log.md with:
<!-- DECISION: {id} | chose: {choice} | alternatives: {alt1 (reason), alt2 (reason)} | reason: {why} -->
<!-- DECISION-FAILED: {id} | {what went wrong} | revert-to: {decision-id to revisit} -->
Mark decisions when: choosing between alternatives (lib, architecture, approach), committing to irreversible technical choices (schema, API contract), or explicitly skipping something. Don't mark routine implementation details.
Checkpoint after each batch: update execution-state.json:
completed with completed_at, branch, and pr numberin_progress with started_atlast_checkpoint timestampThis ensures that if the agent crashes mid-delivery, the next /delivery #N resumes from the last completed batch (via Step 1.5).
──▶ Architectural gate (if: new DB table, public API change, interface refactor):
"Created this structure for #N. OK to proceed?" Blocks until user approves.
Critical: the validator receives ONLY:
git diff main...HEAD)execution-log.md (what choices were made and why)The validator does NOT inherit implementation context.
Disk output: write full critique to execution-log.md § Step 5 (all blocking/warning/observation items with file:line references, scoring table, fix instructions).
Return to parent (max 8 lines):
Validator produces a structured critique, not just pass/fail:
## Validation Critique — Issue #N
### Blocking (must fix before merge)
1. **{what's wrong}**
- Criterion: "{which acceptance criterion}"
- What's wrong: {specific issue with file:line reference}
- Fix instruction: {actionable — what to change, where}
- Files affected: {paths}
### Warnings (should fix, not blocking)
2. **{concern}**
- Criterion: {related criterion or "best practice"}
- Suggestion: {what to do}
### Observations (informational)
3. **{note}** — {context}
### Scoring
| Dimension | Result |
|-----------|--------|
| Correctness | N/M criteria met |
| Completeness | N% of criteria addressed |
| Safety | ok / warning / blocking |
| Coherence | high / medium / low (follows codebase patterns?) |
| Regressions | N tests passing, N failing |
**Result: PASS / NEEDS REFINEMENT / FAIL**
──▶ If PASS: proceed to step 6.
──▶ If NEEDS REFINEMENT (has blocking items but they're fixable):
Cycle 1: Send fix instructions to workers (scoped to specific files/functions):
Cycle 2 (if still has blocking items):
──▶ If still failing after 2 refinement cycles — Failure gate:
"Validation failed for #N after 2 refinement cycles. Critique history:
- Cycle 0: {original issues}
- Cycle 1: {what was fixed, what persists}
- Cycle 2: {what was fixed, what persists} Remaining blocking items: [list with fix attempts]. Recommend: {human judgment needed — change approach? override? abandon?}"
──▶ If FAIL (fundamental design issue, not fixable with targeted patches):
"Validation identified fundamental issue for #N: {description}. Decision point '{id}' may be the root cause. Alternatives from execution-log: {alt1, alt2}. Recommend: revisit decision '{id}' or escalate to human."
Open PR:
gh pr create \
--title "feat: {issue title}" \
--body "$(cat <<'EOF'
## Summary
{what was built, 2-3 bullets}
## Validation Report
{from step 5 — structured critique with scoring}
## Grounding Report
{from step 2.5 — omit section if grounding was skipped}
## Key Decisions
{from execution-log.md decision markers — list each DECISION with id, choice, alternatives, and reason. This helps the human reviewer understand WHY, not just WHAT.}
| Decision | Chose | Alternatives | Reason |
|----------|-------|-------------|--------|
| {id} | {choice} | {alt1, alt2} | {why} |
## Test plan
- [ ] CI passes
- [ ] {specific checks}
Closes #{issue_number}
🤖 Generated with [Claude Code](https://claude.com/claude-code)
EOF
)"
CI runs as quality gate.
If CI fails: diagnose, fix, push. Never --no-verify.
Delete temporary files (plan + execution state):
rm -f .claude/state/milestones/{milestone-slug}/issue-{N}-{slug}/plan.md
rm -f .claude/state/milestones/{milestone-slug}/issue-{N}-{slug}/execution-state.json
Note: execution-log.md is NOT deleted — it's the permanent history.
Agent nudge: If the delivery involved external dependencies, APIs, or premises that may have changed since discovery: suggest @reviewer for independent grounding + quality review. "PR #{N} ready. The implementation relies on {N} external premises. Consider @reviewer for independent verification before merging."
When /orchestrate detects human feedback on a PR and re-invokes delivery:
Input: Issue number + PR number + human comment(s) classified as change requests.
DECISION-OVERRIDDEN marker to execution-log:
<!-- DECISION-OVERRIDDEN: {id} | original: {choice} | override: {new_choice} | by: human (PR #{pr} comment) | reason: {human's feedback} -->
Scope discipline: rework touches ONLY what the human asked to change. Don't redo unaffected batches. Don't "improve" adjacent code.
/discovery.Creates, edits, and optimizes skills for Claude Code, including drafting, evaluating with test prompts, iterating on performance, and improving skill descriptions for better triggering accuracy.
npx claudepluginhub rmolines/context-engineering --plugin ce