From Sail
Use when a Voyage already ran but renders wrong in the dashboard: missing from the series list, no events, agents not appearing, model calls (LLM spans) stuck in_progress, no Sailbox exec evidence, or "Unscoped" / "Missing span" counts. A symptom→cause→fix diagnostic playbook grounded in real rollout failure modes. Use this to fix an existing trace, not to author one (see sail-voyage).
Slash command: /sail:sail-voyage-debugging
Use this when a Voyage you expected to see in the dashboard either doesn't appear, appears incomplete, or shows misleading state. Below: the high-frequency failure modes, what they look like, how to diagnose, and how to fix.
Voyage missing from dashboard list ──► section 1
│
├─ Voyage shows but Overview says "no events" ──► section 2
│
├─ Events present but agents missing ──► section 3
│
├─ Model calls stuck "in_progress" / striped ──► section 4
│
├─ No Sailbox exec evidence when you expected execs ──► section 5
│
├─ "Unscoped" model calls > 0 ──► section 6
│
└─ Voyage never reaches terminal status ──► section 7
Section 1. Voyage missing from dashboard list
Symptoms: You called sail.voyage.create(...) but the Voyage doesn't
appear at the env-qualified dashboard URL, for example
app.sailresearch.com/prod/voyages/<voyage_id>.
Most likely causes:
- SAIL_MODE defaults to prod in the SDK. If your key is a dev or staging key, set SAIL_MODE=dev or SAIL_MODE=staging explicitly. Check the script's env or os.environ.get("SAIL_MODE") at runtime.
- /voyages groups by exact name. If you changed capitalization or embedded the input/date in name, the run may be under a different series than expected.
- If SAIL_API_KEY is unset, the SDK creates a no-op Voyage and emits one RuntimeWarning per process saying telemetry is disabled. Look for it in stderr.
- If voyage.create() raises VoyageHTTPError, the Voyage was never created. Look for the exception in stderr.
Diagnose:
import sail
voyage = sail.voyage.create(name="diag")
print("voyage_id =", sail.voyage.id()) # None if telemetry is disabled
print("dashboard_url =", sail.voyage.dashboard_url())
If voyage_id is None at the top of your script, the SDK degraded to a
no-op Voyage (and warned why on stderr).
Check SAIL_MODE, SAIL_API_KEY, and try a single /v1/models curl
against the API endpoint to confirm the key works there.
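The environment checks above can be scripted as a quick preflight. This is a minimal sketch using only the standard library. The helper name preflight, its messages, and the VALID_MODES set are hypothetical; the only facts taken from this playbook are that SAIL_MODE defaults to prod, that dev and staging are valid values, and that an unset SAIL_API_KEY yields a no-op Voyage.

```python
import os

# Assumption: only the three modes named in this playbook are checked here.
VALID_MODES = {"prod", "staging", "dev"}


def preflight(env=None):
    """Return a list of likely reasons a Voyage would not show up."""
    env = os.environ if env is None else env
    problems = []

    # An unset mode means the SDK falls back to prod.
    mode = env.get("SAIL_MODE")
    if mode is None:
        problems.append("SAIL_MODE unset: SDK defaults to prod")
    elif mode not in VALID_MODES:
        problems.append(f"SAIL_MODE={mode!r} is worth double-checking")

    # Without a key the SDK creates a no-op Voyage.
    if not env.get("SAIL_API_KEY"):
        problems.append("SAIL_API_KEY unset: telemetry is disabled")

    return problems


if __name__ == "__main__":
    for line in preflight() or ["environment looks consistent"]:
        print(line)
```

Running this at the top of a script, before sail.voyage.create(...), narrows the search to the key and mode before any dashboard lookup.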
Section 2. Voyage shows but Overview says "no events"
Symptoms: Voyage row exists; Overview shows agents/events as 0.
Most likely causes:
- sys.exit(0) or letting the process die mid-flight can drop events. Always call voyage.complete() (or voyage.fail()) before exit.
- A child process or worker did not receive SAIL_VOYAGE_ID; it created its own Voyage and emitted events there.
Diagnose:
voyage.flush() # force buffered events to write before assertion
If events appear after explicit flush but not before, your process was exiting too early.
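When a script launches workers, one way to keep them on the same Voyage is to pass the id through the child's environment. The sketch below assumes the SDK in the child picks up SAIL_VOYAGE_ID from its environment, which this playbook implies but does not spell out. The run_child helper and the id value are hypothetical; in a real script the id would come from sail.voyage.id().

```python
import os
import subprocess
import sys


def run_child(voyage_id, code):
    """Run a Python snippet in a child process that sees SAIL_VOYAGE_ID."""
    # Copy the parent's environment and add the id explicitly, so the child
    # does not depend on what happens to be exported in the shell.
    child_env = {**os.environ, "SAIL_VOYAGE_ID": voyage_id}
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=child_env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Placeholder id; a real script would use the value of sail.voyage.id().
seen = run_child(
    "voyage-placeholder",
    "import os; print(os.environ.get('SAIL_VOYAGE_ID'))",
)
print(seen)
```

Printing the variable from inside the child is also a cheap way to confirm what a worker actually received.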
Section 3. Events present but agents missing
Symptoms: Events fire but the Overview's "Agents" panel is empty or wrong.
Most likely cause: events were emitted outside any
with voyage.agent(...) block. The dashboard derives the agent list
from stored event attribution, which is populated from the active agent
context at event time.
Fix: wrap event/span emission in a with voyage.agent("Agent Name", role=...) block.
See the sail-voyage multi-agent reference.
Section 4. Model calls stuck "in_progress" / striped
Symptoms: The Voyage waterfall shows model-call rows as striped /
"partial" / in_progress even after the Voyage completed.
Known issue: the dashboard reconciles terminal state on the read path, but a model call's producer-side close event (from the Responses streaming path) can be intermittently late or missing — for synchronous and background calls alike — so the row can look stuck.
Workaround: pass background=False to sail.inference.responses.create(); the
synchronous path tends to be more reliable about terminal-state events, though
it is not immune.
Section 5. No Sailbox exec evidence when you expected execs
Symptoms: You ran Sailbox.exec() inside a Voyage but Execution Trace,
Waterfall, or native exec evidence views do not show the command.
Most likely causes:
- The exec was not linked to the Voyage. Pass sailbox_id= to voyage.create() at start, or run the exec inside a Voyage agent/span context. Sail associates the exec row through request metadata from the active context.
- The exec ran outside an agent context. Run it inside @sail.agent(...) / with voyage.agent(...) so the dashboard can show who caused the work.
- The exec was dispatched without .wait(). Foreground Sailbox exec auto-spans close when .wait() observes the result. If you dispatch and drop the handle, the dashboard may show a partial started span.
Diagnose in the dashboard: open the Voyage detail page, then check
Execution Trace and the Sailbox/native exec evidence view. The exec should show
the expected Sailbox id, command preview, agent name, and span. If the command
is present but agent/span are missing, move the exec(...).wait() call inside
the intended agent/span function and re-run.
Section 6. "Unscoped" model calls > 0
Symptoms: Overview's Native Model Calls panel shows Unscoped: N or
Missing span: N with N > 0.
Most likely causes:
- The call ran outside any agent() / span() context. Headers carry whatever context is active at call time; with none active, only the voyage id is attached. Wrap the call:
with voyage.agent("Analyst"):
    with voyage.span("score"):
        ...
- OpenAI(default_headers=sail.voyage.headers()) freezes the context that was active when the client was built: usually none, or worse, the wrong span. Wrap the client instead; headers are then computed per call and un-spanned calls get synthesized auto-spans, same as sail.inference.*:
from openai import OpenAI
import sail
cfg = sail.Config.from_env()
# Wrapping makes the client compute attribution headers on every call.
client = sail.voyage.wrap_openai(
    OpenAI(api_key=cfg.api_key, base_url=f"{cfg.api_url.rstrip('/')}/v1")
)
# voyage is the handle returned by sail.voyage.create(...).
with voyage.agent("Analyst"):
    with voyage.span("score"):
        response = client.responses.create(model="zai-org/GLM-5", input="...")
For a non-OpenAI-style client, pass extra_headers=sail.voyage.headers()
per call — the helper carries the full attribution context (voyage id
plus the active span and agent) as of the moment you call it.
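A toy model shows why frozen default headers misattribute calls. This is not Sail's implementation: current_span, span, headers, FrozenClient and PerCallClient are hypothetical stand-ins built on contextvars, and the header name is made up for illustration.

```python
import contextvars
from contextlib import contextmanager

# Stand-in for the SDK's active-span context (hypothetical).
current_span = contextvars.ContextVar("current_span", default=None)


@contextmanager
def span(name):
    """Activate a span for the duration of the with block."""
    token = current_span.set(name)
    try:
        yield
    finally:
        current_span.reset(token)


def headers():
    """Snapshot the attribution context as of this call."""
    return {"x-span": current_span.get()}  # header name is illustrative


class FrozenClient:
    """Captures headers once, at construction time."""

    def __init__(self):
        self.default_headers = headers()

    def call(self):
        return self.default_headers


class PerCallClient:
    """Recomputes headers on every call."""

    def call(self):
        return headers()


frozen = FrozenClient()  # built while no span is active
per_call = PerCallClient()

with span("score"):
    frozen_seen = frozen.call()
    per_call_seen = per_call.call()

print(frozen_seen, per_call_seen)
```

The frozen client keeps reporting the empty context it saw at construction, while the per-call client reports the span that is active when the request is made. That is the behavior the wrapped client and per-call extra_headers are meant to give you.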
Section 7. Voyage never reaches terminal status
Symptoms: Voyage Overview keeps saying "in progress" indefinitely.
Most likely causes:
- The script never called voyage.complete() or voyage.fail().
- An exception escaped a with voyage.span(...) block and bypassed the terminal call. Wrap the whole script body in try/except and call voyage.fail(error_type=..., message=...) on exception.
- The terminal event was not delivered. complete()/fail() never raise on delivery failure; they warn on stderr and leave the event buffered for background/atexit retry. If the process exits immediately on a dead network, the event can be lost; check stderr for the could not deliver voyage.completed warning, and call voyage.flush() after the terminal call when you need raise-on-failure confirmation.
Pattern:
voyage = sail.voyage.create(name="task")
try:
    do_work()
    voyage.complete(message="ok")
except Exception as exc:
    voyage.fail(error_type=exc.__class__.__name__, message=str(exc))
    raise
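The same pattern can be packaged as a context manager so that both the success path and the error path reach a terminal call and then flush. This is a sketch that assumes only the complete(message=...), fail(error_type=..., message=...) and flush() calls named in this playbook. The voyage_scope helper and the RecordingVoyage stand-in are hypothetical; the stand-in exists so the example runs without the SDK.

```python
from contextlib import contextmanager


@contextmanager
def voyage_scope(voyage):
    """Make a terminal call, then flush, on success and on error."""
    try:
        yield voyage
    except Exception as exc:
        voyage.fail(error_type=exc.__class__.__name__, message=str(exc))
        raise
    else:
        voyage.complete(message="ok")
    finally:
        # Flush after the terminal call so a delivery failure surfaces here.
        voyage.flush()


class RecordingVoyage:
    """Stand-in that records calls instead of sending telemetry."""

    def __init__(self):
        self.calls = []

    def complete(self, message=None):
        self.calls.append(("complete", message))

    def fail(self, error_type=None, message=None):
        self.calls.append(("fail", error_type, message))

    def flush(self):
        self.calls.append(("flush",))


ok = RecordingVoyage()
with voyage_scope(ok):
    pass  # do_work() would go here

failed = RecordingVoyage()
try:
    with voyage_scope(failed):
        raise ValueError("bad input")
except ValueError:
    pass

print(ok.calls)
print(failed.calls)
```

With the real SDK, the object passed in would be the handle returned by sail.voyage.create(...).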
Symptoms: the Execution Trace shows spans with an "auto" chip (dashed badge), or Waterfall bars with diagonal striping, that your code never declared.
This is expected: those are auto-spans (ADR-025/026) — the SDK synthesizes a span around any Sail inference call or Sailbox exec made with no active span, named after your calling function when derivable. They mean your work was captured and scoped even where you declared nothing. They are not a bug and not double-counting: synthesis never happens inside an explicit span.
- To replace an auto-span with your own, declare it with @sail.span() / with voyage.span(...): explicit always wins and the auto chip disappears.
- To turn synthesis off, set SAIL_VOYAGE_AUTO_SPANS=0; rows then fall back to Missing span as before.
- Warnings are emitted as one RuntimeWarning per process unconditionally; set SAIL_VOYAGE_DEBUG=1 for per-occurrence repeats of those warnings.
npx claudepluginhub sailresearchco/sail-skills --plugin sail