From behavior-diagnostics
This skill should be used when the user asks to "diagnose this output", "debug this behavior", "analyze what went wrong", "perform root cause analysis", "why did the agent do this", mentions unexpected AI tooling output, or needs to understand why a skill, agent, or command deviated from its intended behavior. Provides gap analysis and targeted introspection questions against source instructions.
How this skill is triggered — by the user, by Claude, or both
Slash command
/behavior-diagnostics:diagnosingopusThis skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Analyze AI tooling output against its source instructions to identify why behavior deviated from intent, then generate targeted introspection questions for the misbehaving session.
Analyze AI tooling output against its source instructions to identify why behavior deviated from intent, then generate targeted introspection questions for the misbehaving session.
The user pastes unsatisfying output, optionally with a description of what they expected instead. Extract and hold:
Determine which skill, agent, subagent, or command produced the output. Use session context — recently discussed files, working directory, conversation history — to infer this. If the tooling cannot be identified with confidence, ask the user.
Load everything that defines the intended behavior:
references/ directory (if present)Read thoroughly. The quality of the analysis depends on understanding the full instruction set, not just the top-level file.
Compare the actual output against the source instructions. For each instruction or behavioral rule, classify it:
Focus on violations and ambiguities — these are the diagnostic targets. For each, note the specific instruction passage and the corresponding output behavior. These pairs feed directly into Phase 5 and Phase 6.
Classify the likely cause(s) behind each violation or ambiguity:
| Category | Description |
|---|---|
| Instruction ambiguity | The instruction can be read multiple ways; the model chose a valid but unintended interpretation |
| Missing constraint | The intended behavior was never explicitly stated |
| Conflicting rules | Two instructions pull in opposite directions; the model resolved the conflict differently than intended |
| Over-broad scope | The instruction is too general, allowing the model to take unwanted liberties |
| Weak instruction | The instruction exists but lacks enforcement strength; model tendencies override it |
| Instruction burial | The instruction exists but is buried in dense text, reducing its salience |
| Context overflow | Too many instructions compete for attention; critical ones get deprioritized |
Invoke llm-author:prompt-engineering with:
The generated questions must:
Include a preamble block with the questions that sets the behavioral frame for the answering session: answer honestly, no excuses, no fixes, no deflection.
Ask the user if they want the questions copied to clipboard. If yes, pipe the full question block (preamble + questions) to pbcopy via Bash.
npx claudepluginhub it-bens/ai-tools --plugin behavior-diagnosticsCreates, edits, and optimizes skills for Claude Code, including drafting, evaluating with test prompts, iterating on performance, and improving skill descriptions for better triggering accuracy.