agent-observability - Claude Code Plugin | ClaudePluginHub

Stats

Actions

Available In

Tags

agent-observability - Claude Code Plugin | ClaudePluginHub

Agent Observability Plugin

Expert guidance for instrumenting AI agents in production. This Claude Code plugin provides best practices, hooks against anti-patterns, and ready-to-use templates for observing multi-agent systems and workflows.

Features

14 Specialized Skills - LLM tracing, tool calls, multi-agent coordination, cost tracking, prompt A/B testing, guardrails, decision tracing, and more
2 AI Agents - Codebase analyzer and instrumentation reviewer
2 Commands - /instrument for planning and /audit for assessment
7 Anti-Pattern Hooks - Catch common mistakes before they ship
4 Production Templates - Copy-paste instrumentation code
9 Framework Guides - LangChain, LangGraph, Claude Agent SDK, CrewAI, AutoGen, Pydantic AI, etc.
10 Vendor Integrations - Langfuse, LangSmith, Arize, OpenTelemetry, Sentry, Datadog, etc.

Installation

# From Claude Code
/plugin install agent-observability@caleb-davis-plugins

# Development mode
claude --plugin-dir /path/to/agent-observability

Quick Start

Audit Existing Telemetry

/audit

Scans your codebase for:

Existing observability SDKs
Instrumentation coverage gaps
Anti-patterns with file:line references
Prioritized recommendations

Generate Instrumentation Plan

/instrument

Creates a tiered implementation plan:

T0: Foundation - SDK init, basic spans, error capture
T1: Core Tracing - LLM calls, tool executions
T2: Context - Token/cost tracking, user context
T3: Multi-Agent - Parent-child spans, handoffs
T4: Evaluation - Quality metrics, feedback loops

Framework-Specific

/instrument langgraph --vendor=langfuse
/instrument langchain --vendor=langsmith
/instrument crewai

Skills

Skill	Priority	Triggers
`instrumentation-planning`	P1	"what should I measure", "observability strategy"
`llm-call-tracing`	P1	"trace LLM calls", "token tracking"
`tool-call-tracking`	P1	"instrument tools", "tool execution spans"
`multi-agent-coordination`	P1	"multi-agent tracing", "agent handoffs"
`token-cost-tracking`	P1	"track tokens", "cost monitoring"
`prompt-versioning`	P1	"prompt A/B testing", "prompt versions", "compare prompts"
`guardrails-safety`	P1	"guardrails", "safety checks", "PII detection"
`decision-tracing`	P1	"agent decisions", "tool selection", "why did agent"
`production-eval-strategy`	P1	"production evaluation", "sampling strategy", "regression detection"
`memory-rag-instrumentation`	P2	"RAG tracing", "retrieval instrumentation"
`human-in-the-loop`	P2	"human approval tracking", "feedback loops"
`error-retry-tracking`	P2	"error tracking", "retry instrumentation"
`evaluation-quality`	P2	"agent evaluation", "quality metrics"
`session-conversation-tracking`	P2	"session tracking", "multi-turn tracing"

Anti-Patterns Detected

The plugin's hooks warn you about:

Full prompt/response logging - PII risk, storage explosion
Missing parent spans - Broken traces, orphaned operations
Secrets in traces - API keys, tokens in span attributes
Blocking telemetry - Sync calls in agent hot path
High cardinality - Dynamic values in span names
No token tracking - Cost blindness
Missing error context - Can't debug failures

Supported Frameworks

Framework	Detection	Guide
LangChain	`from langchain`	`references/frameworks/langchain.md`
LangGraph	`from langgraph`	`references/frameworks/langgraph.md`
Claude Agent SDK	`from claude_agent_sdk`	`references/frameworks/claude-agent-sdk.md`
OpenAI Agents	`from openai`	`references/frameworks/openai-agents-sdk.md`
CrewAI	`from crewai`	`references/frameworks/crewai.md`
AutoGen	`from autogen`	`references/frameworks/autogen.md`
Semantic Kernel	`semantic_kernel`	`references/frameworks/semantic-kernel.md`
Haystack	`from haystack`	`references/frameworks/haystack.md`
Pydantic AI	`from pydantic_ai`	`references/frameworks/pydantic-ai.md`

Supported Vendors