From axiom-cli
Analyzes OpenTelemetry distributed traces from Axiom to find traces by ID, errors, latency, or service. Helps debug distributed system issues.
Slash command
/axiom-cli:find-traces
Analyze OpenTelemetry distributed traces to identify errors, latency issues, and root causes.
When the skill is invoked with a trace ID (e.g., /find-traces abc123...), the ID is available as $ARGUMENTS.
First, find trace datasets:
```bash
axiom dataset list -f json
```
Look for datasets containing trace data (often named *traces*, *spans*, or otel-*).
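As a rough illustration of the naming heuristic above, the following Python sketch filters a list of dataset names against those three patterns. The function name and the sample dataset names are hypothetical, and it assumes you have already pulled the names out of the CLI's JSON output, whose exact shape is not shown here.

```python
from fnmatch import fnmatch

# Name patterns taken from the guidance above; real deployments may differ.
TRACE_PATTERNS = ("*traces*", "*spans*", "otel-*")

def likely_trace_datasets(names):
    """Return the dataset names that match any trace-like pattern."""
    return [n for n in names if any(fnmatch(n.lower(), p) for p in TRACE_PATTERNS)]

# Hypothetical dataset names, for illustration only.
candidates = likely_trace_datasets(["app-logs", "otel-demo", "prod-traces", "billing_spans"])
print(candidates)
```

A name match is only a hint; the schema check in the next step is what confirms that a dataset really holds spans.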
Always verify field names first:
```bash
axiom query "['<trace-dataset>'] | getschema" --start-time -1h
```
Find a trace by ID:
```bash
axiom query "['<dataset>']
| where trace_id == '<TRACE_ID>'
| sort by _time asc
| limit 100" --start-time -1h -f json
```
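Find traces with errors. The extend runs before anything reads error, so datasets that lack the field still work, and the error filter is applied after summarize so the aggregates cover every span in the trace rather than only the failing ones:
```bash
axiom query "['<dataset>']
| where _time >= ago(1h)
| extend error = coalesce(ensure_field(\"error\", typeof(bool)), false)
| summarize
    start_time = min(_time),
    total_duration = max(duration),
    span_count = count(),
    error_count = countif(error),
    services = make_set(['service.name']),
    root_operation = arg_min(_time, name)
  by trace_id
| where error_count > 0
| sort by start_time desc
| limit 20" --start-time -1h -f json
```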
Find slow traces (traces containing a span of 1 s or longer; span_count here counts only those slow spans):
```bash
axiom query "['<dataset>']
| where _time >= ago(1h)
| where duration >= 1000000000
| summarize
    start_time = min(_time),
    total_duration = max(duration),
    span_count = count(),
    services = make_set(['service.name'])
  by trace_id
| sort by total_duration desc
| limit 20" --start-time -1h -f json
```
Find traces for a service:
```bash
axiom query "['<dataset>']
| where _time >= ago(1h)
| where ['service.name'] == '<SERVICE>'
| summarize
    start_time = min(_time),
    total_duration = max(duration),
    span_count = count(),
    error_count = countif(error == true)
  by trace_id
| sort by start_time desc
| limit 20" --start-time -1h -f json
```
Show the error spans within a trace:
```bash
axiom query "['<dataset>']
| where trace_id == '<TRACE_ID>'
| where error == true
| project _time, ['service.name'], name, duration, ['status.message']" --start-time -1h -f json
```
List the spans of a trace with their parent links, slowest first:
```bash
axiom query "['<dataset>']
| where trace_id == '<TRACE_ID>'
| project span_id, parent_span_id, ['service.name'], name, duration, error
| sort by duration desc" --start-time -1h -f json
```
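The span_id and parent_span_id columns returned by the last query are enough to rebuild the call hierarchy. The Python sketch below shows one way to do that. It assumes the rows have already been parsed into dictionaries keyed by the projected field names, with duration as a number of nanoseconds; the helper names and the sample spans are hypothetical, and the CLI's actual JSON envelope may need unwrapping first.

```python
from collections import defaultdict

def build_span_tree(spans):
    """Group spans by parent and return (roots, children_by_parent_id)."""
    ids = {s["span_id"] for s in spans}
    children = defaultdict(list)
    roots = []
    for span in spans:
        parent = span.get("parent_span_id")
        # Treat a span as a root if it has no parent or its parent is not in the result set.
        if not parent or parent not in ids:
            roots.append(span)
        else:
            children[parent].append(span)
    return roots, children

def render(span, children, depth=0):
    """Return indented lines with service, operation, and duration in milliseconds."""
    line = f"{'  ' * depth}{span['service.name']} - {span['name']} ({span['duration'] / 1e6:.1f} ms)"
    if span.get("error"):
        line += " ERROR"
    lines = [line]
    # Show the slowest children first, matching the sort order of the query.
    for child in sorted(children[span["span_id"]], key=lambda s: s["duration"], reverse=True):
        lines.extend(render(child, children, depth + 1))
    return lines

# Hypothetical rows, for illustration only.
spans = [
    {"span_id": "a1", "parent_span_id": "", "service.name": "gateway",
     "name": "GET /checkout", "duration": 250_000_000, "error": False},
    {"span_id": "b2", "parent_span_id": "a1", "service.name": "cart",
     "name": "LoadCart", "duration": 40_000_000, "error": False},
    {"span_id": "c3", "parent_span_id": "a1", "service.name": "payments",
     "name": "Charge", "duration": 180_000_000, "error": True},
]
roots, children = build_span_tree(spans)
tree_lines = [line for root in roots for line in render(root, children)]
print("\n".join(tree_lines))
```

Spans whose parent falls outside the query's time window or row limit show up as extra roots, which is a useful hint that the result set is incomplete.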
| Field | Bracket? | Description |
|---|---|---|
| trace_id | No | 32-char trace identifier |
| span_id | No | 16-char span identifier |
| parent_span_id | No | Parent span (empty for root) |
| name | No | Operation name |
| duration | No | Duration in nanoseconds |
| kind | No | CLIENT, SERVER, INTERNAL, PRODUCER, CONSUMER |
| error | No | Boolean error flag |
| ['service.name'] | Yes | Service identifier |
| ['status.code'] | Yes | OK, ERROR, or nil |
| ['status.message'] | Yes | Error description |
| ['scope.name'] | Yes | Instrumentation library |
OTel durations are in nanoseconds:
| Human | Nanoseconds | Filter |
|---|---|---|
| 1 ms | 1,000,000 | duration >= 1000000 |
| 100 ms | 100,000,000 | duration >= 100000000 |
| 1 s | 1,000,000,000 | duration >= 1000000000 |
Convert for display:
```kusto
| extend duration_ms = duration / 1000000.0
```
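When durations are post-processed outside the query, the same nanosecond arithmetic applies. This Python sketch formats a nanosecond value for a report, picking the largest unit that keeps the number at or above 1. The function name and the two-decimal rounding are illustrative choices, not part of the skill.

```python
def format_duration(ns):
    """Format a nanosecond duration using the largest unit that keeps the value >= 1."""
    if ns >= 1_000_000_000:
        return f"{ns / 1_000_000_000:.2f} s"
    if ns >= 1_000_000:
        return f"{ns / 1_000_000:.2f} ms"
    if ns >= 1_000:
        return f"{ns / 1_000:.2f} us"
    return f"{ns} ns"

print(format_duration(1_500_000_000))  # 1.50 s
print(format_duration(100_000_000))    # 100.00 ms
print(format_duration(750))            # 750 ns
```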
Non-standard span attributes are stored in the ['attributes.custom'] map:
```kusto
// Filter by custom attribute
| where ['attributes.custom']['user_id'] == "123"
// Aggregation requires explicit cast
| summarize count() by tostring(['attributes.custom']['tenant'])
```
Without tostring(), aggregations fail with "grouping by field of type unknown".
When working in a repository that matches the traced service, correlate trace data with source code to identify root causes.
1. Extract the package/module path from ['scope.name']: github.com/org/repo/pkg/auth → pkg/auth
2. Find code from the operation name: the name field often contains function names or HTTP routes
3. Trace the call chain
Note: Codebase correlation is optional. Proceed with trace-only analysis if code is unavailable or doesn't match the traced services.
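The scope-to-path step can be sketched in a few lines of Python. This assumes the instrumentation scope name is a Go-style import path that begins with the repository's module path, as in the example above; the function name is hypothetical, and other languages or instrumentation libraries may name scopes differently.

```python
def scope_to_path(scope_name, module_prefix):
    """Strip the repository's module prefix from a scope name, if present."""
    prefix = module_prefix.rstrip("/") + "/"
    if scope_name.startswith(prefix):
        return scope_name[len(prefix):]
    # The scope belongs to another module or a third-party library.
    return None

print(scope_to_path("github.com/org/repo/pkg/auth", "github.com/org/repo"))  # pkg/auth
```

A None result is a reasonable signal to fall back to trace-only analysis for that span.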
When analyzing a trace, provide:
```markdown
## Trace Summary
- **Trace ID:** <id>
- **Duration:** <human-readable>
- **Services:** <list>
- **Outcome:** success/failure

## Sequence of Events
1. <Service> - <operation> (<duration>)
2. <Service> - <operation> (<duration>) ⚠️ ERROR
...

## Error Analysis
<What failed, when, why>

## Root Cause
<Deepest error and explanation>

## Codebase Locations (if applicable)
- **Service:** <service.name>
- **Package:** <scope.name>
- **Files:** <specific files to investigate>

## Recommended Actions
1. <Specific action>
2. <What to investigate next>
```
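For the Root Cause section, one plausible reading of "deepest error" is the error span with the most ancestors in the span tree, since errors usually propagate upward from the call that first failed. The Python sketch below implements that reading. It is an interpretation, not a rule stated by the skill, and the function name, field layout, and sample spans are hypothetical.

```python
def deepest_error(spans):
    """Return the error span with the greatest depth in the span tree, or None."""
    by_id = {s["span_id"]: s for s in spans}

    def depth(span):
        levels = 0
        seen = set()
        parent = span.get("parent_span_id")
        # Walk up to the root; stop on missing parents or cycles in malformed data.
        while parent and parent in by_id and parent not in seen:
            seen.add(parent)
            levels += 1
            parent = by_id[parent].get("parent_span_id")
        return levels

    errors = [s for s in spans if s.get("error")]
    return max(errors, key=depth, default=None)

# Hypothetical spans: the failure starts in the database call and bubbles up.
spans = [
    {"span_id": "a1", "parent_span_id": "", "name": "GET /checkout", "error": True},
    {"span_id": "b2", "parent_span_id": "a1", "name": "Charge", "error": True},
    {"span_id": "c3", "parent_span_id": "b2", "name": "INSERT payments", "error": True},
    {"span_id": "d4", "parent_span_id": "a1", "name": "LoadCart", "error": False},
]
culprit = deepest_error(spans)
print(culprit["name"])  # INSERT payments
```

When several error spans share the same depth, this returns the first one encountered, so it is worth checking ['status.message'] on the siblings as well.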
For query syntax, invoke the axiom-apl skill, which provides trace analysis patterns and duration unit guidance.
npx claudepluginhub axiomhq/cli --plugin axiom-cli