Build a trace specification document per feature. The spec defines the trace shape (root span + child spans + key attributes per OpenTelemetry semantic conventions) that production code MUST emit. It drives both implementation reviews AND trace-assertion tests, so a single declarative document is the source of truth for what observability "looks like" for a feature.
Slash command: `/qa-distributed-tracing:trace-spec-author`
Builds a per-feature trace specification document. The spec is
canonical input for: (a) implementation review during code review,
(b) trace-shape regression tests via opentelemetry-trace-assertions
and jaeger-trace-tests, (c) onboarding for new team members.
Save under docs/observability/<feature>.md:
```markdown
# Trace Spec: <Feature>

**Owner:** team-checkout
**Status:** active | draft | deprecated
**Last reviewed:** YYYY-MM-DD
**Implementations:** orders-svc 1.4+, payments-svc 2.1+

## Root span

| Field | Value |
|---|---|
| name | `order.create` |
| kind | `INTERNAL` |
| status semantics | OK on 2xx, ERROR on 4xx + 5xx |

### Required attributes

| Attribute | Type | Source | Notes |
|---|---|---|---|
| `order.id` | string | UUID v4 | |
| `order.item_count` | int | int.between(1, 1000) | |
| `customer.id` | string | low-cardinality hash, never PII | |

## Child spans

### `db.query` (when persisting order)

| Field | Value |
|---|---|
| name | `db.query` |
| kind | `CLIENT` |
| parent | `order.create` |

Attributes per [OpenTelemetry DB semantic conventions]:

| Attribute | Type | Notes |
|---|---|---|
| `db.system` | string | "postgresql" |
| `db.operation` | string | "INSERT" |

### `payments.charge` (cross-service)

| Field | Value |
|---|---|
| name | `payments.charge` |
| kind | `CLIENT` (in orders-svc); peer is `SERVER` in payments-svc |
| parent | `order.create` |

Attributes:

- `payment.amount_cents` (int)
- `payment.currency` (string, ISO 4217)
- per [OpenTelemetry HTTP semantic conventions]: `http.request.method`, `url.full`, `http.response.status_code`

## Status mapping

| Outcome | Span status | Notes |
|---|---|---|
| Success (2xx) | OK | |
| Validation error (4xx) | UNSET on server side; ERROR on client side | Per [HTTP semantic conventions] |
| Server error (5xx) | ERROR | Always |

## Required test assertions

- [ ] root span name == `order.create`
- [ ] root span has all required attrs
- [ ] `db.query` is child of root
- [ ] `payments.charge` is child of root
- [ ] On 5xx from payments, `payments.charge.status == ERROR` AND `order.create.status == ERROR`
- [ ] `payment.amount_cents` always set when payment succeeds

## Change log

| Date | Change | Reviewer |
|---|---|---|
| 2026-04-01 | Initial spec | @reviewer |
| 2026-04-15 | Added `customer.id` low-cardinality req | @reviewer |
```
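
One way to keep the "Required attributes" table machine-checkable is to mirror its rows as data and run them against exported spans. The following is a minimal sketch under stated assumptions, not part of any OpenTelemetry API: `AttrRule`, `ORDER_CREATE_RULES`, and `check_required_attrs` are hypothetical names, and the span's attributes are assumed to be available as a plain mapping.

```python
import uuid
from dataclasses import dataclass
from typing import Any, Callable, Mapping


@dataclass(frozen=True)
class AttrRule:
    """One row of the 'Required attributes' table (hypothetical helper)."""
    name: str
    type_: type
    check: Callable[[Any], bool] = lambda value: True


def _is_uuid4(value: str) -> bool:
    """True when the string parses as a version-4 UUID."""
    try:
        return uuid.UUID(value).version == 4
    except ValueError:
        return False


# Mirrors the example table for the `order.create` root span.
ORDER_CREATE_RULES = [
    AttrRule("order.id", str, _is_uuid4),
    AttrRule("order.item_count", int, lambda n: 1 <= n <= 1000),
    AttrRule("customer.id", str, lambda s: "@" not in s),  # crude PII guard
]


def check_required_attrs(attrs: Mapping[str, Any], rules=ORDER_CREATE_RULES) -> list[str]:
    """Return human-readable violations; an empty list means the span conforms."""
    problems = []
    for rule in rules:
        if rule.name not in attrs:
            problems.append(f"missing {rule.name}")
        elif not isinstance(attrs[rule.name], rule.type_):
            problems.append(f"{rule.name} is not {rule.type_.__name__}")
        elif not rule.check(attrs[rule.name]):
            problems.append(f"{rule.name} failed its value check")
    return problems
```

Returning a list of violations rather than raising on the first one lets a single test run report every gap between spec and implementation.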
For each span, decide:

- Is it a feature-specific operation (e.g., `order.create`)? Use a namespaced name + custom attributes (e.g., `order.id`, `order.item_count`).

Per the OpenTelemetry HTTP semantic conventions, required attributes for HTTP client spans: `http.request.method`, `url.full`, `server.address`, `server.port`, plus `http.response.status_code` (once a response is received) and `error.type` (when the request ends in an error).
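
The attribute list above can be turned into a small conformance check. This is a sketch only: `missing_http_client_attrs` is a hypothetical helper, the span attributes are assumed to be a plain mapping, and treating `http.response.status_code` as applicable only once a response was received is an assumption of this sketch.

```python
from typing import Any, Mapping

# Attribute names as listed above for HTTP client spans.
ALWAYS_REQUIRED = ("http.request.method", "url.full", "server.address", "server.port")


def missing_http_client_attrs(
    attrs: Mapping[str, Any], *, got_response: bool, failed: bool
) -> list[str]:
    """List the attributes a client span lacks (hypothetical helper).

    got_response: a response was received from the peer.
    failed: the request ended in an error (e.g. a timeout or an error status).
    """
    required = list(ALWAYS_REQUIRED)
    if got_response:
        required.append("http.response.status_code")
    if failed:
        required.append("error.type")
    return [name for name in required if name not in attrs]
```

A spec author could call this from a trace-assertion test for `payments.charge` and fail the test whenever the returned list is non-empty.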
Decide per outcome (referenced in your test spec):
| HTTP outcome | Server span | Client span |
|---|---|---|
| 1xx, 2xx, 3xx | UNSET | UNSET |
| 4xx | UNSET | ERROR |
| 5xx | ERROR | ERROR |
Per the OpenTelemetry HTTP semantic conventions (paraphrased): status remains unset for 1xx, 2xx, and 3xx responses unless another error occurred. For 4xx codes, status stays unset on server spans but should be set to Error on client spans. 5xx responses should be marked as Error.
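
The outcome table can be expressed as a single lookup function that tests share, so the mapping is written down once. A minimal sketch, assuming span kind and status are passed around as plain strings; `expected_span_status` is a hypothetical name, and the "another error occurred" case for 1xx to 3xx responses is deliberately not modelled.

```python
def expected_span_status(status_code: int, kind: str) -> str:
    """Map an HTTP status code to the expected span status per the table above.

    kind is "SERVER" or "CLIENT"; the return value is "UNSET" or "ERROR".
    """
    if kind not in ("SERVER", "CLIENT"):
        raise ValueError(f"unsupported span kind: {kind!r}")
    if status_code >= 500:
        return "ERROR"  # 5xx is an error on both sides
    if status_code >= 400:
        # 4xx: the client made a bad request, so only the client span errors
        return "ERROR" if kind == "CLIENT" else "UNSET"
    return "UNSET"  # 1xx, 2xx, 3xx
```

A test can then compare an exported span's status against `expected_span_status(code, kind)` instead of hard-coding the expectation per case.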
The spec MUST flag attributes that risk cardinality explosion in your observability backend:
| Risk | Example | Mitigation |
|---|---|---|
| User ID as attribute → unbounded series | user.id: 123e4567-... | Hash to bucket OR put in span events instead |
| URL with query params | url.full includes raw ?token=...&id=999 | Use http.route for low-cardinality template |
| Free-text error message | error.message: "User 123 …" | Use error.type enum + log line for detail |
Add to spec: **High-cardinality attributes:** none / [list].
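
Two of the mitigations in the table can be sketched with the standard library. Both helpers are hypothetical: `user_bucket` illustrates "hash to bucket" with an assumed bucket count of 64, and `strip_query` only removes the query string and fragment. It does not produce an `http.route` template, which normally comes from the web framework's router.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit


def user_bucket(user_id: str, buckets: int = 64) -> str:
    """Map a raw user ID to one of `buckets` stable labels."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer; the same ID always lands in the same bucket.
    return f"bucket-{int.from_bytes(digest[:8], 'big') % buckets:02d}"


def strip_query(url: str) -> str:
    """Drop the query string and fragment so tokens and IDs are not recorded."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
```

Bucketing bounds the number of distinct attribute values at `buckets`, at the cost of no longer being able to look up a single user from the attribute alone.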
Each "Required test assertion" in Step 1 maps to one test in
opentelemetry-trace-assertions (in-process) or jaeger-trace-tests
(cross-service):
```python
# Assumed to come from your test harness: use_tracer (a context manager that
# installs a tracer backed by memory_exporter), create_order, and item.
def test_order_create_required_attrs():
    with use_tracer():
        create_order(items=[item])

    spans = memory_exporter.get_finished_spans()
    root = next(s for s in spans if s.name == "order.create")

    # Per trace-spec.md required attrs
    assert "order.id" in root.attributes
    assert isinstance(root.attributes["order.item_count"], int)
    assert "customer.id" in root.attributes

    # PII protection: must not contain raw email
    assert "@" not in root.attributes["customer.id"]
```
The spec ↔ test ↔ implementation triangle is the value: drift between any two of the three becomes visible.
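
The parent/child items in the assertion checklist (`db.query` is child of root, `payments.charge` is child of root) could be covered by a shared helper. This sketch assumes spans have been converted to plain dicts with hypothetical keys `name`, `span_id`, and `parent_id`; `find_span` and `assert_child_of` are illustrative names, not functions from either sister skill.

```python
from typing import Any, Mapping, Sequence

# Hypothetical span shape: {"name": ..., "span_id": ..., "parent_id": ...}
Span = Mapping[str, Any]


def find_span(spans: Sequence[Span], name: str) -> Span:
    """Return the single span with this name, failing loudly otherwise."""
    matches = [s for s in spans if s["name"] == name]
    if len(matches) != 1:
        raise AssertionError(f"expected exactly one {name!r} span, found {len(matches)}")
    return matches[0]


def assert_child_of(spans: Sequence[Span], child: str, parent: str) -> None:
    """Assert that span `child` is a direct child of span `parent`."""
    child_span = find_span(spans, child)
    parent_span = find_span(spans, parent)
    if child_span["parent_id"] != parent_span["span_id"]:
        raise AssertionError(f"{child} is not a direct child of {parent}")
```

Requiring exactly one match per name keeps the check strict: a duplicated or missing span fails the test instead of being silently skipped.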
Lock in a process:

- Attribute renames (e.g., `order.id` → `order_id`) require: (a) emit both during transition, (b) update consumers, (c) remove old after observation period.
- Every spec carries a `Last reviewed` date; quarterly review enforced via CODEOWNERS or a calendar nudge.

Maintain docs/observability/INDEX.md:
| Feature | Spec | Owner | Status |
|---|---|---|---|
| Order create | [order-create.md](./order-create.md) | @team-checkout | active |
| User signup | [user-signup.md](./user-signup.md) | @team-identity | active |
| Refund flow | [refund.md](./refund.md) | @team-payments | draft |
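
The quarterly review could be nudged by a script that reads each spec listed in the index and flags stale `Last reviewed` dates. A minimal sketch, assuming the header line has the exact form used in the template (`**Last reviewed:** YYYY-MM-DD`) and approximating a quarter as 92 days; `last_reviewed` and `is_stale` are hypothetical names.

```python
import re
from datetime import date, timedelta
from typing import Optional

# Matches the header line from the spec template, e.g. "**Last reviewed:** 2026-04-15".
LAST_REVIEWED = re.compile(
    r"^\*\*Last reviewed:\*\*\s*(\d{4}-\d{2}-\d{2})\s*$", re.MULTILINE
)


def last_reviewed(spec_text: str) -> Optional[date]:
    """Extract the `Last reviewed` date from a spec's header, if present."""
    match = LAST_REVIEWED.search(spec_text)
    return date.fromisoformat(match.group(1)) if match else None


def is_stale(spec_text: str, today: date, max_age_days: int = 92) -> bool:
    """True when the spec has no parsable review date or it is older than the window."""
    reviewed = last_reviewed(spec_text)
    return reviewed is None or today - reviewed > timedelta(days=max_age_days)
```

Treating a missing or placeholder date as stale means a spec copied from the template without filling in the header is flagged on the first run.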
| Anti-pattern | Why it fails | Fix |
|---|---|---|
| Spec lives in Confluence/Notion only | Drifts from code; not version-controlled | Repo-resident markdown (Step 1) |
| Spec uses prose ("emits a span when order is created") | Ambiguous; can't drive tests | Tabular required attrs + assertions (Step 1, Step 5) |
| Spec invents attribute names parallel to SemConv | Two ways to ask "what HTTP method"; analytics break | Use SemConv (Step 2) |
| Skip cardinality review | Bills surge from per-user spans | Mandatory cardinality section (Step 4) |
| No status semantics | Each implementer decides; alerts fire inconsistently | Status table (Step 3) |
- `opentelemetry-trace-assertions`, `jaeger-trace-tests` - sister skills that enforce spec conformance
- `trace-coverage-reviewer` - agent that audits spec ↔ implementation drift

Install: `npx claudepluginhub testland/qa --plugin qa-distributed-tracing`