ClaudeTokenVampire
An app that monitors your Claude Code token usage in real time.
Anthropic doesn't show you how much of your current 5-hour session quota you've consumed — ClaudeTokenVampire does.

Full documentation lives on the website:
gabrielmoraru.com/my-delphi-code/token-vampire
How Anthropic's 5-hour window actually works
Session-based, not sliding. Your 5-hour clock starts on your first message and runs for exactly 5 hours regardless of activity, at which point the counter hard-resets. The next session starts on the first message after that reset. Anthropic uses the word "rolling" in their docs but means "cycles session-to-session", not "continuously sliding".
Source: Anthropic support article 12429409 — "if you hit your limit at 2 PM, your next allocation begins at 7 PM, then 12 AM, and so on."
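
To make the difference between a session-based window and a sliding one concrete, here is a minimal Python sketch of the rule described above. It is an illustration, not the app's actual code: the function name `current_session` and the choice to treat a message sent exactly at the 5-hour mark as the start of the next session are assumptions.

```python
from datetime import datetime, timedelta

SESSION_LENGTH = timedelta(hours=5)

def current_session(timestamps, now):
    """Return (start, end) of the session active at `now`, or None.

    Hypothetical helper: walks message timestamps in order and opens a
    new 5-hour window at the first message after the previous one closed.
    """
    start = None
    for ts in sorted(t for t in timestamps if t <= now):
        # Assumption: a message exactly at start + 5h begins a new session.
        if start is None or ts >= start + SESSION_LENGTH:
            start = ts
    if start is None or now >= start + SESSION_LENGTH:
        return None  # no messages yet, or the last window already reset
    return start, start + SESSION_LENGTH

# Messages at 09:00, 11:30, 13:59, 14:05 and 15:00 on the same day.
day = datetime(2025, 1, 1)
messages = [day.replace(hour=h, minute=m)
            for h, m in [(9, 0), (11, 30), (13, 59), (14, 5), (15, 0)]]
session = current_session(messages, day.replace(hour=16))
```

In this example the 09:00 session closes at 14:00, so the 14:05 message opens a fresh window that runs until 19:05, even though the 13:59 message was sent only six minutes earlier.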

What it does
It puts you in control of your Claude Code tokens:
- Tracks all billable token types: input, output, cache creation, cache reads
- Shows the current 5-hour session with a per-bucket bar chart from session_start → session_end
- Color-coded bars: green → yellow → orange → red as you approach your limit
- Estimates cost (configurable $/1M token rates)
- Shows cache hit rate and warns when the 5-minute cache gap expires
- Counts down until the session hard-resets (all tokens reset at once, not gradually)
- Tracks the 7-day weekly cap with its own configurable limit and ratio bar
- Top tool calls — a third tab ranks the tools your sessions hit most over the last 7 days, with calls / cost / avg duration
- Runs quietly in the system tray — click the icon to show/hide
- USES 0 TOKENS by default — runs entirely offline, no API calls, no Claude queries (the optional auto-ping feature is opt-in and uses a few tokens per ping)
Features
Data Engine
- Parses all billable token types: input, output, cache creation, cache read
- Sorts entries by timestamp; skips non-assistant entries
- Detects the current 5-hour session: first message where no predecessor exists within 5h; session runs for exactly 5h from there
- Aggregates only entries inside [SessionStart, SessionEnd], matching what Anthropic counts
- Per-project breakdown, sorted descending by token usage
- Configurable bucket width (2-60 minutes per chart bar)
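
The filtering, windowing, and bucketing steps listed above could look roughly like the Python sketch below. The entry layout (dictionaries with `type`, `timestamp`, `project`, and `tokens` keys) and the function name `aggregate` are hypothetical and chosen for illustration; they are not the actual log schema or the app's internal API.

```python
import math
from collections import defaultdict
from datetime import datetime, timedelta

def aggregate(entries, session_start, session_end, bucket_minutes=10):
    """Bucket assistant-entry tokens inside [session_start, session_end].

    Returns (buckets, ranked_projects). Entry keys are assumed names.
    """
    if not 2 <= bucket_minutes <= 60:
        raise ValueError("bucket width must be 2-60 minutes")
    width = timedelta(minutes=bucket_minutes)
    n_buckets = math.ceil((session_end - session_start) / width)
    buckets = [0] * n_buckets
    per_project = defaultdict(int)

    # Sorting mirrors the list above; the sums do not depend on the order.
    for entry in sorted(entries, key=lambda e: e["timestamp"]):
        if entry["type"] != "assistant":
            continue  # only assistant entries are counted
        ts = entry["timestamp"]
        if not (session_start <= ts <= session_end):
            continue  # outside the current session window
        # An entry exactly at session_end goes into the last bucket.
        index = min((ts - session_start) // width, n_buckets - 1)
        buckets[index] += entry["tokens"]
        per_project[entry["project"]] += entry["tokens"]

    # Per-project breakdown, sorted descending by token usage.
    ranked = sorted(per_project.items(), key=lambda kv: kv[1], reverse=True)
    return buckets, ranked
```

With the default 10-minute width, a 5-hour session yields 30 buckets, one per chart bar.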
Computed Stats (per session, global and per-project)
- Total tokens: input + output + cache creation + cache read
- Per-type token breakdown
- Message count (assistant turns) in current session
- Cache hit rate: cache_read / (cache_read + input)
- Cost estimate in USD (four independently configurable $/1M rates)
- Minutes until session hard-reset (SessionEnd - Now)
- Idle minutes since the last message
- Cache gap warning with per-tier detection (5m and 1h shown as two separate gradient bars on the cache-status row)
- Cache tier breakdown: 1h ephemeral vs 5m ephemeral tokens
- Web search and web fetch counts
- 7-day total tokens (true sliding window) with optional weekly cap
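
The formulas in this list are simple enough to sketch in a few lines of Python. Everything named here (the `Usage` class, the helper functions, the rate dictionary keys) is hypothetical, and the $/1M rates in the usage example are placeholders rather than real prices.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Usage:
    input: int = 0
    output: int = 0
    cache_creation: int = 0
    cache_read: int = 0

def total_tokens(u):
    return u.input + u.output + u.cache_creation + u.cache_read

def cache_hit_rate(u):
    # cache_read / (cache_read + input); 0.0 when nothing was read or sent
    denominator = u.cache_read + u.input
    return u.cache_read / denominator if denominator else 0.0

def cost_usd(u, rates):
    # `rates` holds four independent $/1M-token prices (assumed key names).
    return (u.input * rates["input"]
            + u.output * rates["output"]
            + u.cache_creation * rates["cache_creation"]
            + u.cache_read * rates["cache_read"]) / 1_000_000

def minutes_until_reset(session_end, now):
    # SessionEnd - Now, clamped at zero once the session has reset
    return max(0.0, (session_end - now).total_seconds() / 60)

def weekly_total(entries, now):
    # True sliding window: sum (timestamp, tokens) pairs from the last 7 days.
    cutoff = now - timedelta(days=7)
    return sum(tokens for ts, tokens in entries if cutoff < ts <= now)

usage = Usage(input=1000, output=2000, cache_creation=3000, cache_read=9000)
placeholder_rates = {"input": 1.0, "output": 2.0,
                     "cache_creation": 3.0, "cache_read": 0.5}
print(total_tokens(usage))                 # 15000
print(cache_hit_rate(usage))               # 0.9
print(cost_usd(usage, placeholder_rates))  # 0.0185
```

Note the contrast between the two windows: the session countdown is anchored to a fixed SessionEnd, while the 7-day total moves with the current time.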
All Projects Tab
- Combined stats across all projects
- Five gradient progress bars: token usage, cache hit rate, session-reset countdown, cache 1h warmth, cache 5m warmth (last two share the cache-status row side-by-side)
- Bar chart spanning the current session window (left edge = SessionStart, right edge = SessionEnd)
- Configurable bucket width
- Color-coded bars: green → yellow → orange → red by % of per-slot budget
- Auto-scale blue mode when no limit is configured
- Token value labels above each active bar
- Y-axis with token count labels
- X-axis with hour offsets (start, +1h, +2h, +3h, +4h, end (reset))
- 10% horizontal grid lines; vertical hour-mark grid lines
- Legend (color key or auto-scale note)
- Cache status row: two side-by-side gradient bars (1h tier fills available width, 5m tier fixed-width on the right) plus a short "Xm idle" label — each bar fills as its tier ages toward expiry
- Hot hours warning (13:00-18:59 local time — Anthropic peak-load window, user-reported)
- Detailed tooltips on every stat label
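
As a rough illustration of the bar coloring and the hot-hours check, here is a small Python sketch. It rests on assumptions the list above does not spell out: that the per-slot budget is the session limit divided evenly across the buckets, and that the color thresholds are 50%, 75%, and 90%. Both are placeholders, as are the function names.

```python
from datetime import datetime

def bar_color(bucket_tokens, session_limit, n_buckets,
              thresholds=(0.5, 0.75, 0.9)):
    """Pick a bar color from the bucket's share of its per-slot budget.

    Thresholds and the even split of the limit are assumed values.
    """
    if not session_limit:
        return "blue"  # auto-scale mode: no limit configured
    slot_budget = session_limit / n_buckets
    ratio = bucket_tokens / slot_budget
    for upper_bound, color in zip(thresholds, ("green", "yellow", "orange")):
        if ratio < upper_bound:
            return color
    return "red"

def is_hot_hour(local_time):
    # 13:00-18:59 local time, per the user-reported peak-load window
    return 13 <= local_time.hour <= 18
```

For example, with a hypothetical limit of 3,000 tokens over 30 buckets, each slot has a budget of 100 tokens, so a 60-token bucket would be drawn yellow under these thresholds.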
Per Project Tab
- Project list: active projects (with token counts) and inactive known projects (gray, separated)
- Per-project stats: tokens, messages, cache hit rate, cost, expiry, cache status
- Per-project bar chart (same renderer, filtered data)
- Selection preserved across automatic refreshes