Implements a reflexion loop for self-correcting deliverables: implement, validate (tests, security, accessibility), self-critique the issues found, and retry for up to 3 iterations. Applies to code, content, AI, and service deliverables.
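The loop described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: `reflexion_loop`, `Attempt`, and the three callables (`implement`, `validate`, `critique`) are hypothetical names, and the only value taken from the description is the cap of 3 iterations.

```python
from dataclasses import dataclass
from typing import Callable, List

MAX_ITERATIONS = 3  # the description caps the loop at 3 iterations


@dataclass
class Attempt:
    iteration: int
    deliverable: str
    issues: List[str]  # validation failures plus self-critique findings


def reflexion_loop(
    implement: Callable[[List[str]], str],
    validate: Callable[[str], List[str]],
    critique: Callable[[str], List[str]],
    max_iterations: int = MAX_ITERATIONS,
) -> List[Attempt]:
    """Implement, validate, self-critique, and retry until clean or capped."""
    history: List[Attempt] = []
    feedback: List[str] = []  # issues carried over from the previous attempt
    for iteration in range(1, max_iterations + 1):
        deliverable = implement(feedback)
        # Self-critique always runs, even when validation reports nothing.
        issues = list(validate(deliverable)) + list(critique(deliverable))
        history.append(Attempt(iteration, deliverable, issues))
        if not issues:
            break
        feedback = issues
    return history
```

Returning the full history rather than only the final deliverable keeps every iteration's issues available for later comparison and logging.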
The auto-dogfood system is an inferential verification loop.
Order: Always attempt rules-based → computational → inferential. Only escalate to the next mode when the previous mode cannot verify the property in question.
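One way to picture this escalation order is a chain of verifiers tried from cheapest to most expensive. The sketch below assumes each verifier returns `True` or `False` when it can decide and `None` when it cannot verify the property. The `Mode` enum, the `verify` function, and the `None` convention are illustrative assumptions, not part of the source.

```python
from enum import Enum
from typing import Callable, Optional, Sequence, Tuple


class Mode(Enum):
    RULES_BASED = 1
    COMPUTATIONAL = 2
    INFERENTIAL = 3


# A verifier returns True/False when it can decide, None when it cannot.
Verifier = Callable[[str], Optional[bool]]


def verify(
    artifact: str, verifiers: Sequence[Tuple[Mode, Verifier]]
) -> Tuple[Optional[Mode], Optional[bool]]:
    """Try modes in fixed order; escalate only when a mode cannot decide."""
    for mode, check in sorted(verifiers, key=lambda pair: pair[0].value):
        verdict = check(artifact)
        if verdict is not None:
            return mode, verdict  # decided here, so later modes never run
    return None, None  # no mode could verify the property
```

Because the loop stops at the first decisive verdict, an inferential check is only invoked when the rules-based and computational checks both abstain.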
Source: Trivedy (Anatomy of an Agent Harness, LangChain blog). Three-mode taxonomy adapted from Böckeler (Harness Engineering, martinfowler.com — computational vs inferential distinction). Note: harnesses continue to matter even as models improve — they engineer systems around model intelligence, not just patch deficiencies.
Rules
Each iteration must show measurable improvement over the previous.
If the same issue recurs across iterations, investigate root cause rather than patching symptoms.
Never skip the self-critique step, even if tests pass.
Log the reflexion loop outcome in delivery-journal.md.
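The rules above could be checked mechanically along these lines. This is a sketch under stated assumptions: "measurable improvement" is read as strictly fewer open issues, a recurring issue is one that appears in two or more iterations, and the journal entry format is invented for illustration. Only the file name `delivery-journal.md` comes from the rules; `improved`, `recurring`, and `log_outcome` are hypothetical helpers.

```python
from pathlib import Path
from typing import List, Set


def improved(previous: List[str], current: List[str]) -> bool:
    """Assumed metric: an iteration improves if it has fewer open issues."""
    return len(current) < len(previous)


def recurring(iterations: List[List[str]]) -> Set[str]:
    """Issues seen in two or more iterations: candidates for root-cause work."""
    seen: Set[str] = set()
    repeated: Set[str] = set()
    for issues in iterations:
        for issue in set(issues):
            if issue in seen:
                repeated.add(issue)
            else:
                seen.add(issue)
    return repeated


def log_outcome(iterations: List[List[str]], journal=Path("delivery-journal.md")) -> str:
    """Append a one-line summary (hypothetical format) to the journal."""
    status = "clean" if iterations and not iterations[-1] else "open issues remain"
    repeated = ", ".join(sorted(recurring(iterations))) or "none"
    entry = (
        f"- reflexion loop: {len(iterations)} iteration(s), {status}; "
        f"recurring: {repeated}\n"
    )
    with Path(journal).open("a", encoding="utf-8") as fh:
        fh.write(entry)
    return entry
```

A count of open issues is the simplest measurable signal. A real harness might weight issues by severity instead.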
Applies adversarial fresh-context review to non-trivial decisions in code. Use when correctness matters more than speed, in unfamiliar code, or for high-stakes operations.
Runs a clean-context blind-verification loop where a sub-agent verifies the main agent's response against user requirements, revising until no blockers remain. Use when the user explicitly asks for verification or review.
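A blind-verification loop of this kind might look like the sketch below. The key property is that the verifier receives only the requirements and the response, never the main agent's reasoning. `blind_verify`, `verifier`, and `revise` are hypothetical names, and the `max_rounds` cap is an added safety assumption: the description itself only says to revise until no blockers remain.

```python
from typing import Callable, List, Tuple


def blind_verify(
    requirements: List[str],
    response: str,
    verifier: Callable[[List[str], str], List[str]],
    revise: Callable[[str, List[str]], str],
    max_rounds: int = 5,  # assumption: the source states no explicit cap
) -> Tuple[str, List[str]]:
    """Return the final response and any blockers still open."""
    blockers: List[str] = []
    for _ in range(max_rounds):
        # Clean context: the verifier sees requirements and response only.
        blockers = verifier(requirements, response)
        if not blockers:
            return response, []
        response = revise(response, blockers)
    # Re-check the last revision so the reported blockers match the response.
    return response, verifier(requirements, response)
```

Returning the remaining blockers alongside the response lets the caller distinguish a verified result from one that merely ran out of rounds.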
Subjects non-trivial decisions to a fresh-context adversarial review before finalizing. Use for high-stakes code, unfamiliar logic, or when correctness outweighs speed.