Cross-model verification protocol for high-risk code. Triggers when authoring or reviewing code in categories prone to subtle errors (regex, async / concurrency, security-sensitive code, database migrations, edge-case-heavy parsing). Defines when to verify Claude output with a non-Claude model (ChatGPT, Gemini), the handoff-protocol YAML schema, and how to interpret verification results. Lifted from Pillarworks's 2-verifier pattern.
Slash command: /cross-model-verification:cross-model-verification
For code where Claude is likely to miss subtle errors (regex / async / security / migrations / heavy parsing), have a non-Claude model review the output before commit. Pillarworks formalised this with 2 ChatGPT verifier agents + a YAML handoff protocol; promoting to cross-cutting in Wave 1.
The verifier doesn't fix — it flags. Claude (or the human) acts on the flags.
| Code type | Verify? | Why |
|---|---|---|
| Regex | YES | Backreferences, anchors, escape handling — easy to get wrong subtly |
| Async / concurrency | YES | Race conditions are nearly invisible to single-model review |
| Security-sensitive (auth, crypto, input validation) | YES | High blast radius if wrong |
| Database migrations | YES | Irreversible if shipped |
| Heavy parsing (CSV, EDI, complex JSON) | YES | Edge cases dominate |
| Routine CRUD | NO | Verification overhead > value |
| UI / styling | NO | Visual review catches errors |
| Logging / observability | NO | Low-stakes |
| Doc generation | NO | Low-stakes |
# handoff-protocol.yaml: declared once per repo
verification_required:
  - file_pattern: "**/*.test.ts"  # for test code? sometimes
    when: pre-merge
    verifier: gpt-4-turbo
  - file_pattern: "**/migrations/**"
    when: pre-commit
    verifier: gpt-4-turbo
  - file_pattern: "**/auth/**"
    when: pre-merge
    verifier: gpt-4-turbo
    severity: high

verifier_prompts:
  gpt-4-turbo: |
    You are reviewing Claude-generated code for the kinds of errors Claude reliably misses:
    - Off-by-one in array/string boundaries
    - Regex edge cases (unicode, multiline, escape sequences)
    - Async race conditions (parallel writes to shared state)
    - SQL injection / XSS in user-input handling
    - Missing error handling for the unhappy path
    - Type coercion bugs (== vs ===, JS automatic conversions)
    Flag each issue with: severity (low/medium/high), location (file:line), explanation.
    Do NOT fix; only flag.
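A protocol file like the one above is easy to get subtly wrong (a misspelled hook name, a verifier with no prompt). The sketch below shows one possible sanity check. It assumes the YAML has already been parsed into a plain dictionary by whatever YAML library the repo uses. The `validate_protocol` helper and the allowed `when` and `severity` values are assumptions inferred from the example, not a published schema.

```python
from typing import Any

# Assumption: only the hook points and severities seen in the example are valid.
ALLOWED_WHEN = {"pre-commit", "pre-merge"}
ALLOWED_SEVERITY = {"low", "medium", "high"}


def validate_protocol(protocol: dict[str, Any]) -> list[str]:
    """Return human-readable problems; an empty list means the config looks sane."""
    problems: list[str] = []
    rules = protocol.get("verification_required", [])
    prompts = protocol.get("verifier_prompts", {})

    for index, rule in enumerate(rules):
        label = f"verification_required[{index}]"
        for key in ("file_pattern", "when", "verifier"):
            if key not in rule:
                problems.append(f"{label}: missing '{key}'")
        if "when" in rule and rule["when"] not in ALLOWED_WHEN:
            problems.append(f"{label}: unknown 'when' value {rule['when']!r}")
        if "severity" in rule and rule["severity"] not in ALLOWED_SEVERITY:
            problems.append(f"{label}: unknown 'severity' value {rule['severity']!r}")
        # Every verifier named by a rule needs a prompt to run with.
        if "verifier" in rule and rule["verifier"] not in prompts:
            problems.append(f"{label}: no prompt defined for verifier {rule['verifier']!r}")
    return problems


# The same structure as the YAML example, written as a Python literal.
EXAMPLE_PROTOCOL = {
    "verification_required": [
        {"file_pattern": "**/*.test.ts", "when": "pre-merge", "verifier": "gpt-4-turbo"},
        {"file_pattern": "**/migrations/**", "when": "pre-commit", "verifier": "gpt-4-turbo"},
        {"file_pattern": "**/auth/**", "when": "pre-merge", "verifier": "gpt-4-turbo",
         "severity": "high"},
    ],
    "verifier_prompts": {"gpt-4-turbo": "You are reviewing Claude-generated code..."},
}
```

Calling `validate_protocol(EXAMPLE_PROTOCOL)` returns an empty list. A rule that names a verifier with no entry under `verifier_prompts` is reported as a problem.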
Run a standard Claude session, plan-mode-first, and generate the code.
Does any modified file match a verification_required.file_pattern?
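# Pre-commit hook (suggested): list staged files whose paths match a verification pattern.
# Filter the file names themselves; passing them to grep as arguments would search file contents.
git diff --cached --name-only | grep -E '(^|/)(migrations|auth)/|\.test\.ts$'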
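The check "does any modified file match a `verification_required.file_pattern`?" can also be driven by the glob patterns in the protocol file instead of a hand-written regex. The following is a minimal sketch under stated assumptions: `glob_to_regex` and `files_requiring_verification` are hypothetical helpers, and the glob translation covers only `**`, `*`, and `?`, which is enough for the patterns shown in this document.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern[str]:
    """Translate a glob with '**' support into a compiled regex (simplified sketch)."""
    parts: list[str] = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append("(?:.*/)?")  # zero or more leading directories
            i += 3
        elif pattern.startswith("**", i):
            parts.append(".*")        # anything, including '/'
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")     # anything within one path segment
            i += 1
        elif pattern[i] == "?":
            parts.append("[^/]")      # one character within a path segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def files_requiring_verification(changed: list[str], patterns: list[str]) -> list[str]:
    """Return the changed files that match at least one verification pattern."""
    compiled = [glob_to_regex(p) for p in patterns]
    return [path for path in changed if any(rx.fullmatch(path) for rx in compiled)]
```

The `changed` list would typically be the output of `git diff --cached --name-only`, split on newlines. Anchoring on path segments means a file such as `docs/authoring.md` does not trigger the `**/auth/**` rule.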
Pillarworks's pattern: 2 ChatGPT subagents (one general, one security-focused) read the diff and return structured findings.
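With two verifiers reading the same diff, their findings need to be combined before anyone acts on them. The sketch below is one possible shape for that, not Pillarworks's actual implementation. The `Finding` fields mirror what the verifier prompt asks for (severity, `file:line` location, explanation). The `verifier` label and the "keep the most severe finding per location" merge rule are assumptions made for illustration.

```python
from dataclasses import dataclass

SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2}


@dataclass(frozen=True)
class Finding:
    severity: str     # "low", "medium", or "high"
    location: str     # "file:line", as requested in the verifier prompt
    explanation: str
    verifier: str     # hypothetical label for the subagent that raised it


def merge_findings(*reports: list[Finding]) -> list[Finding]:
    """Combine verifier reports, keeping the most severe finding per location."""
    by_location: dict[str, Finding] = {}
    for report in reports:
        for finding in report:
            current = by_location.get(finding.location)
            if current is None or (
                SEVERITY_RANK[finding.severity] > SEVERITY_RANK[current.severity]
            ):
                by_location[finding.location] = finding
    # Most severe first, then by location for a stable order.
    return sorted(
        by_location.values(),
        key=lambda f: (-SEVERITY_RANK[f.severity], f.location),
    )
```

Deduplicating by location keeps the report short, at the cost of hiding a second, distinct issue on the same line. A stricter variant could key on location plus explanation instead.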
Implementation options:
| Finding severity | Action |
|---|---|
| high | Stop. Address before commit/merge. |
| medium | Investigate. Acknowledge in PR description if intentional. |
| low | Optional. Worth addressing as low-hanging fruit. |
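The table above reduces to a simple rule: the most severe finding decides the action. A minimal sketch follows. The function name `gate_decision` and the action labels `"stop"`, `"investigate"`, and `"proceed"` are illustrative assumptions.

```python
from collections import Counter


def gate_decision(severities: list[str]) -> str:
    """Map the severities of all findings to a single action."""
    counts = Counter(severities)
    if counts["high"] > 0:
        return "stop"         # address before commit/merge
    if counts["medium"] > 0:
        return "investigate"  # fix, or acknowledge in the PR description
    return "proceed"          # only low findings (or none); addressing them is optional
```

For example, `gate_decision(["low", "high"])` returns `"stop"`, because a single high finding blocks regardless of how many low ones accompany it.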
If verification passes, add to commit message:
verified-by: gpt-4-turbo (handoff-protocol@orryx 0.1.0)
verification: pass / 0 high / 2 medium addressed / 1 low acknowledged
If verification fails: don't commit until addressed.
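The commit-message lines can be generated, so the format stays consistent and a failing verification cannot be stamped as a pass by accident. The sketch below assumes a hypothetical helper `format_trailer`. Its parameters are illustrative, and the output format is copied from the example above.

```python
def format_trailer(
    verifier: str,
    protocol_version: str,
    high_open: int,
    medium_addressed: int,
    low_acknowledged: int,
) -> str:
    """Build the two trailer lines; refuse if any high finding is still open."""
    if high_open > 0:
        raise ValueError(
            "verification failed: high-severity findings must be addressed first"
        )
    return (
        f"verified-by: {verifier} (handoff-protocol@orryx {protocol_version})\n"
        f"verification: pass / {high_open} high / "
        f"{medium_addressed} medium addressed / "
        f"{low_acknowledged} low acknowledged"
    )
```

Raising an error when high findings remain, instead of returning a "fail" line, mirrors the rule that a failed verification must not be committed.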
Claude, like any single model, has consistent blind spots: regex anchors, async ordering, type coercion. A model with a different architecture and different training tends to have different blind spots, so a second model can catch errors the first one misses. This matches the general experience that independent reviewers find more defects than one reviewer alone.
The cost is small (one extra API call per high-risk diff). The saving is large (avoided production incidents).
- handoff-protocol.yaml: reference implementation
- D:\pillarworks-build-mvp\agents\chatgpt-*
- references/verifier-prompts.md: proven verifier prompts (Wave 1)
- references/example-findings.md: what verifier output looks like in practice
npx claudepluginhub alexmclaren/orryx-knowledge --plugin cross-model-verification