From total-recall
Run a cross-model comparison: scan files with haiku, sonnet, and opus study agents to see how model size affects trigger phrase generation.
Slash command
/total-recall:compare
Compare trigger phrase generation across model sizes. Same file, same prompt, three models.
*.{ts,tsx,js,jsx,py,go,rs,md,json,yaml,sh,sql,css,scss,html,svelte,vue}), exclude node_modules/, dist/, build/, .git/, *.lock, *.min.*
Spawn the compare-researcher agent:
Use Agent tool with:
  subagent_type: total-recall:compare-researcher
  prompt: |
    Compare trigger phrase generation across models for these files:
    <list each absolute file path, one per line>
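The file-selection rule stated earlier (the extension glob plus the exclusion list) could be approximated as below. This is a minimal Python sketch, not the plugin's own code: the function name `is_candidate` and the exact matching behavior (directory names matched at any depth, case-sensitive patterns) are assumptions made for illustration.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Extensions taken from the glob in the skill text.
EXTENSIONS = {
    "ts", "tsx", "js", "jsx", "py", "go", "rs", "md", "json",
    "yaml", "sh", "sql", "css", "scss", "html", "svelte", "vue",
}
# Directories and filename patterns taken from the exclusion list.
EXCLUDED_DIRS = {"node_modules", "dist", "build", ".git"}
EXCLUDED_NAME_PATTERNS = ("*.lock", "*.min.*")


def is_candidate(rel_path: str) -> bool:
    """Return True if a project-relative path passes the filter (hypothetical helper)."""
    path = PurePosixPath(rel_path)
    # Reject anything located under an excluded directory, at any depth.
    if any(part in EXCLUDED_DIRS for part in path.parts[:-1]):
        return False
    # Reject lock files and minified assets by filename pattern.
    if any(fnmatchcase(path.name, pattern) for pattern in EXCLUDED_NAME_PATTERNS):
        return False
    # Accept only the listed extensions.
    return path.suffix.lstrip(".") in EXTENSIONS
```

Paths that pass this filter would then be resolved to absolute paths and listed one per line in the agent prompt.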
Read .claude/compare-results.json if it exists, otherwise create it. Append a new run entry:
{
  "version": 1,
  "runs": [
    {
      "runAt": "<ISO timestamp>",
      "files": {
        "relative/path.md": {
          "models": {
            "haiku": { "phrase": "...", "samples": [...], "convergence": [...], "confidence": 0.0 },
            "sonnet": { "phrase": "...", "samples": [...], "convergence": [...], "confidence": 0.0 },
            "opus": { "phrase": "...", "samples": [...], "convergence": [...], "confidence": 0.0 }
          },
          "crossModelConvergence": [...],
          "modelSpecificTerms": { "haiku": [...], "sonnet": [...], "opus": [...] }
        }
      }
    }
  ]
}
Use relative paths as keys (relative to project root).
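The read-or-create-then-append step can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the plugin's implementation: `append_run` and `to_relative_key` are hypothetical helper names, and the `files` argument is assumed to already hold per-file entries in the shape shown above.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def to_relative_key(project_root: Path, absolute_path: Path) -> str:
    """Convert an absolute file path into a project-relative key with forward slashes."""
    return absolute_path.relative_to(project_root).as_posix()


def append_run(project_root: Path, files: dict) -> dict:
    """Append one run entry to .claude/compare-results.json, creating the file if needed."""
    results_path = project_root / ".claude" / "compare-results.json"
    if results_path.exists():
        data = json.loads(results_path.read_text(encoding="utf-8"))
    else:
        # First run: start from the empty version-1 structure.
        data = {"version": 1, "runs": []}

    run = {"runAt": datetime.now(timezone.utc).isoformat(), "files": files}
    data.setdefault("runs", []).append(run)

    results_path.parent.mkdir(parents=True, exist_ok=True)
    results_path.write_text(json.dumps(data, indent=2) + "\n", encoding="utf-8")
    return data
```

Earlier runs are preserved because the new entry is appended to `runs` rather than replacing it, which keeps a history of comparisons over time.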
Show a comparison table:
## Cross-Model Comparison Results
| File | Haiku | Sonnet | Opus | Shared Terms |
|------|-------|--------|------|--------------|
| testing.md | test behavior not implementation (0.99) | ... (0.xx) | ... (0.xx) | behavior, implementation |
### Analysis
- Files where all models agree: [list] — these are stable attractors
- Files where models diverge: [list] — larger models may surface deeper associations
- Average confidence: haiku X.XX | sonnet X.XX | opus X.XX
This is experimental data. Production triggers in triggers.json are not affected.
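The shared terms and the figures in the Analysis list could be derived from one run's `files` object roughly as below. This is a hedged Python sketch: the source does not define how agreement or convergence is computed, so it assumes that "agree" means all three models produced the identical phrase, that shared terms are the intersection of the per-model `convergence` lists, and that model-specific terms are those appearing for one model only. All function names are hypothetical.

```python
from statistics import mean

MODELS = ("haiku", "sonnet", "opus")


def shared_terms(models: dict) -> list:
    """Terms present in every model's convergence list (assumed definition)."""
    term_sets = [set(models[name]["convergence"]) for name in MODELS]
    return sorted(set.intersection(*term_sets))


def model_specific_terms(models: dict) -> dict:
    """Terms that appear in exactly one model's convergence list (assumed definition)."""
    result = {}
    for name in MODELS:
        others = set().union(
            *(set(models[other]["convergence"]) for other in MODELS if other != name)
        )
        result[name] = sorted(set(models[name]["convergence"]) - others)
    return result


def summarize(files: dict) -> dict:
    """Split files into agree/diverge and average each model's confidence."""
    agree = sorted(
        path
        for path, entry in files.items()
        if len({entry["models"][name]["phrase"] for name in MODELS}) == 1
    )
    diverge = sorted(path for path in files if path not in agree)
    # mean() raises StatisticsError when files is empty, so call this on non-empty runs.
    average_confidence = {
        name: round(mean(entry["models"][name]["confidence"] for entry in files.values()), 2)
        for name in MODELS
    }
    return {"agree": agree, "diverge": diverge, "averageConfidence": average_confidence}
```

Exact phrase equality is a strict test of agreement; a looser comparison (for example, overlap of shared terms above some threshold) may be closer to what the agent reports.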
npx claudepluginhub brewpirate/zen-flow --plugin total-recall