claude-code-profiler
Profile a slice of a Claude Code session and measure wall time, API time, tool time, token cost, and per-bucket breakdowns (docker pull / build, dataset & checkpoint downloads, benchmark runs, tests, …).
Run /profile start to begin a profile, do your work, then run /profile stop to print the report. The profiler is a single Python file that uses only the standard library: no OpenTelemetry, no daemon, and no Claude Code config changes required.
Useful when you want to:
- Pin down where a debug session or benchmark actually spent its time (API? tools? docker pull?)
- Know how many tokens a chunk of conversation burned and the estimated USD cost at current prices
- Drop "this run took X minutes, $Y, of which docker pull was N seconds" into an issue / PR / weekly report
Usage — three lines, that's it
/profile start # begin a profile
…talk with Claude as you normally would: drive a benchmark, debug, run tools, whatever…
/profile stop # end the profile and print the report
That's the whole workflow. No extra prompting mid-session, no special syntax to sprinkle into your messages, no "remember to log this". On stop, the profiler scans the session transcript and aggregates everything that happened between start and stop into the table below. You can start → stop → start → stop … as many times as you like in one session — each is an independent profile.
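To make the start/stop windowing concrete, here is a minimal sketch of aggregating timestamped events that fall between two marks. The event fields (`ts`, `bucket`, `duration_s`), the function name `aggregate_window`, and the half-open window are assumptions made for this illustration; they are not the profiler's actual transcript schema or code.

```python
from collections import defaultdict
from datetime import datetime, timezone


def aggregate_window(events, start, stop):
    """Sum duration_s per bucket for events with start <= ts < stop.

    The event shape is hypothetical; a real transcript will look different.
    """
    totals = defaultdict(float)
    for event in events:
        if start <= event["ts"] < stop:
            totals[event["bucket"]] += event["duration_s"]
    return dict(totals)


def at_minute(minute):
    """Helper for the example: a fixed UTC timestamp at 12:<minute>."""
    return datetime(2025, 1, 1, 12, minute, tzinfo=timezone.utc)


# Made-up events; the last one falls outside the 12:00-12:30 window.
sample_events = [
    {"ts": at_minute(5), "bucket": "bash", "duration_s": 12.5},
    {"ts": at_minute(10), "bucket": "test", "duration_s": 3.0},
    {"ts": at_minute(20), "bucket": "bash", "duration_s": 2.5},
    {"ts": at_minute(45), "bucket": "bash", "duration_s": 100.0},
]

report = aggregate_window(sample_events, at_minute(0), at_minute(30))
print(report)  # {'bash': 15.0, 'test': 3.0}
```

Because each window is evaluated on its own, repeated start/stop pairs in one session produce independent totals under this model.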
Forgot to call /profile start? Use retro instead — it scans the existing transcript over a window you specify and emits the same report:
/profile retro # whole session so far
/profile retro --since 30m # last 30 minutes
/profile retro --since last-prompt # since your most recent message
retro writes the same artifact bundle as stop but never touches the active-profile pointer, so it's safe to run alongside an in-progress /profile start.
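The `--since` values shown above (`30m`, `last-prompt`, or no flag at all) can be read as "where does the window start". The sketch below shows one way such a value might be resolved. The function name `resolve_since`, the `s` and `h` unit suffixes, and the error handling are assumptions for illustration only; the README documents just the three forms listed.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed unit suffixes; only "m" appears in the documented examples.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours"}


def resolve_since(value, now, last_prompt_at=None):
    """Return the window start, or None to mean "whole session so far"."""
    if value is None:
        return None
    if value == "last-prompt":
        if last_prompt_at is None:
            raise ValueError("no user prompt found in the transcript")
        return last_prompt_at
    match = re.fullmatch(r"(\d+)([smh])", value)
    if match is None:
        raise ValueError(f"unsupported --since value: {value!r}")
    amount, unit = match.groups()
    return now - timedelta(**{_UNITS[unit]: int(amount)})


now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
print(resolve_since("30m", now))  # 2025-01-01 11:30:00+00:00
```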
Throughout this README, profile (the noun) means one such start→stop measurement. We're using the standard term in profiler tooling — cProfile, Go pprof, and Linux perf all call the captured measurement period a "profile" — so the name lines up with the slash command (/profile), the tool (claude-code-profiler), and the artifacts it produces (profile.json, profile.md).
What it computes
A typical /profile stop or /profile status prints something like:
╭─ profile: run [60943b15] 51m34s wall
├─ time
│ wall 51m34s
│ api time 15m21s
│ tool time sum 13m24s
│ tool time wall 13m24s (critical path; sum > wall = parallel)
│ user-thinking 30.1s
│ idle/wait 22m12s
├─ api time by model
│ claude-opus-4-7 10m22s
│ claude-sonnet-4-6 4m59s
│ (main) 28.6s
│ (subagent) 14m53s
├─ tool time by bucket
│ bash 5m47s
│ benchmark_run 4m26s
│ coding 2m12s
│ test 58.0s
│ mcp 0.4s
│ checkpoint_dl 0.0s
│ agent_dispatch 0.0s
├─ top tools (exact = solo bundle, ~ = N-way parallel split)
│ Bash 11m12s n=85 ok=79 fail=6 (~17/85)
│ Write 1m35s n=10 ok=9 fail=1
│ Edit 20.3s n=9 ok=9 fail=0
│ Read 16.6s n=38 ok=38 fail=0 (~11/38)
│ mcp__plugin_nautilus_nautilus__lookup_benchmark 0.2s n=1 ok=1 fail=0 (~1/1)
├─ tokens
│ input 147
│ output 26.4k
│ cache write 201.1k (5m=188.3k, 1h=12.9k)
│ cache read 8.37M (hit_ratio=0.98)
│ thinking blocks 3 (0 tok est)
│ tool result (est) 69.5k
├─ tokens / cost by source
│ subagent $ 13.4388 in=137 out=22.9k cw=188.3k cr=8.20M calls=130
│ main $ 1.0098 in=10 out=3.6k cw=12.9k cr=173.2k calls=5
├─ tokens / cost by model
│ claude-opus-4-7 $ 13.4222 in=96 out=20.8k calls=86
│ claude-sonnet-4-6 $ 1.0264 in=51 out=5.6k calls=49
├─ subagents (2 files)
│ nautilus:policy-generator $ 12.4124 in=86 out=17.2k agents=1 calls=81
│ nautilus:env-generator $ 1.0264 in=51 out=5.6k agents=1 calls=49
├─ cost
│ estimated $ 14.4486 (not billing truth)
├─ turns / errors
│ assistant turns 135
│ debug (w/tool) 131