claude-usage-guard

A zero-dependency Claude Code plugin that warns you as you approach your subscription
usage limits and blocks new work once a window is nearly exhausted (95% by default).
Both checks run deterministically, via hooks. When a limit is hit during a tool
call, it steers the model to self-schedule a wakeup past the reset instead of burning
through retries.
Heads up: this reads an undocumented endpoint and fails open. Read the
Disclaimer before relying on it.
Why
There are two ways to run into a usage window in Claude Code, and both hurt when they
happen unintentionally:
- With "extra usage" enabled on your plan, work silently continues past your included
limits and is billed at API-token rates — an unattended loop, agent fan-out, or
long autonomous session can keep spending real money without you noticing.
- Without it, the session just hits the wall mid-task.
The guard is a deterministic backstop for both: it warns you while there is still time to
wind down, and hard-blocks new work before a window is exhausted — so you stay within
your plan's included usage instead of rolling into billed extra usage, and an autonomous
session pauses itself until the reset instead of burning retries (or your wallet).
What it does
The guard runs on two hook events and reads your usage from Anthropic's OAuth usage
endpoint (cached locally). It compares the worst utilization across your rolling windows
(5h, 7d, and the per-model 7d-opus / 7d-sonnet windows when present) against two
thresholds — WARN (default 80%) and HARD (default 95%). Windows whose reset time
has already passed are ignored entirely — stale data can never block you past the actual
reset.
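
The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the `windows` dict shape and the names `worst_utilization` and `classify` are hypothetical, and only the thresholds (80% and 95%) and the "ignore windows whose reset has passed" rule come from the text.

```python
from datetime import datetime, timedelta, timezone

WARN_PCT = 80.0  # default WARN threshold from the text
HARD_PCT = 95.0  # default HARD threshold from the text


def worst_utilization(windows, now):
    """Highest utilization among windows whose reset is still in the future.

    `windows` is a hypothetical shape: name -> {"utilization", "resets_at"}.
    Windows whose reset time has passed are skipped, so stale cache entries
    cannot trigger a block.
    """
    live = [w["utilization"] for w in windows.values() if w["resets_at"] > now]
    return max(live, default=0.0)


def classify(windows, now, warn=WARN_PCT, hard=HARD_PCT):
    """Map the worst live utilization to 'ok', 'warn', or 'hard'."""
    worst = worst_utilization(windows, now)
    if worst >= hard:
        return "hard"
    if worst >= warn:
        return "warn"
    return "ok"


now = datetime(2024, 6, 16, 12, 0, tzinfo=timezone.utc)
windows = {
    # Stale: the reset passed five minutes ago, so the 99% reading is ignored.
    "5h": {"utilization": 99.0, "resets_at": now - timedelta(minutes=5)},
    "7d": {"utilization": 82.0, "resets_at": now + timedelta(days=3)},
}
print(classify(windows, now))
```

Here the stale 5h reading is dropped, so only the 7d window at 82% counts and the result lands in the WARN band.
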
Reset times are shown in your local timezone with fixed English labels, e.g.
[usage] 5h: 32% (reset Sun 15:50) | 7d: 5% (reset Wed 19 Jun 19:00).
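
One way to produce a status line in that shape is sketched below. The rule for when the date is included is an assumption (here: resets less than 24 hours away show only weekday and time), as are the function names and the `windows` dict shape. The label tuples are hard-coded so the output does not depend on the system locale, and the datetimes are assumed to be already converted to local time (for example with `datetime.astimezone()`).

```python
from datetime import datetime, timedelta

# Fixed English labels, independent of the system locale.
DAYS = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def format_reset(reset_local, now_local):
    """Format a reset time; the 24-hour cutoff is an assumed rule."""
    clock = f"{reset_local.hour:02d}:{reset_local.minute:02d}"
    day = DAYS[reset_local.weekday()]  # weekday(): Monday is 0
    if reset_local - now_local < timedelta(hours=24):
        return f"{day} {clock}"
    return f"{day} {reset_local.day} {MONTHS[reset_local.month - 1]} {clock}"


def status_line(windows, now_local):
    """Join per-window summaries into a single [usage] line."""
    parts = []
    for name, w in windows.items():
        reset = format_reset(w["resets_at"], now_local)
        parts.append(f"{name}: {w['utilization']:.0f}% (reset {reset})")
    return "[usage] " + " | ".join(parts)


now_local = datetime(2024, 6, 16, 12, 0)  # a Sunday
windows = {
    "5h": {"utilization": 32.0, "resets_at": datetime(2024, 6, 16, 15, 50)},
    "7d": {"utilization": 5.0, "resets_at": datetime(2024, 6, 19, 19, 0)},
}
print(status_line(windows, now_local))
# [usage] 5h: 32% (reset Sun 15:50) | 7d: 5% (reset Wed 19 Jun 19:00)
```
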
| Event | Condition | Behavior | Exit |
|---|---|---|---|
| UserPromptSubmit | worst < WARN | Prints a one-line [usage] … status to context | 0 |
| UserPromptSubmit | WARN ≤ worst < HARD | Status line + WIND DOWN advisory (finish current work, don't start big tasks) | 0 |
| UserPromptSubmit | worst ≥ HARD | Blocks the prompt; stderr explains when it resets and how to bypass | 2 |
| PreToolUse | tool is ScheduleWakeup | Exempt: always allowed (exit 2 is never possible). When hard-blocked, the guard stamps [usage-guard:resume] onto the wakeup prompt via updatedInput so the wake turn is recognized as a resume hop | 0 |
| UserPromptSubmit | prompt starts with [usage-guard:resume], still blocked, reset ≤ 6h | Allowed through; re-instructs the model to reschedule with same prompt + computed delay | 0 |
| UserPromptSubmit | prompt starts with [usage-guard:resume], still blocked, reset > 6h | Allowed through; instructs chain termination (summarize + end turn, no multi-day reschedule) | 0 |
| UserPromptSubmit | prompt starts with [usage-guard:resume], window has reset | Allowed through; appends resume-ready suffix telling the model to resume the task | 0 |
| PreToolUse | worst < HARD | Allowed silently (no stdout, per hook contract) | 0 |
| PreToolUse | worst ≥ HARD, reset ≤ 6h away | Blocks the tool; instructs the model to call ScheduleWakeup until reset, then resume | 2 |
| PreToolUse | worst ≥ HARD, reset > 6h away | Blocks the tool; instructs the model to wrap up, summarize, and end the turn | 2 |
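
The table can be read as two small decision functions. The sketch below is illustrative only: the function names and the action labels are hypothetical, the real guard emits messages rather than labels, and treating "window has reset" as "worst is below HARD once stale windows are ignored" is an assumption. The exit codes, the 6-hour cutoff, the resume tag, and the ScheduleWakeup exemption come from the table.

```python
WARN_PCT = 80.0
HARD_PCT = 95.0
SLEEP_HORIZON_S = 6 * 3600          # the 6h cutoff between "sleep" and "wrap up"
RESUME_TAG = "[usage-guard:resume]"
EXEMPT_TOOL = "ScheduleWakeup"


def pre_tool_use(tool_name, worst_pct, seconds_to_reset):
    """Return (exit_code, action) for a PreToolUse event."""
    if tool_name == EXEMPT_TOOL:
        return 0, "allow-exempt"            # never blocked, even when hard-blocked
    if worst_pct < HARD_PCT:
        return 0, "allow-silent"
    if seconds_to_reset <= SLEEP_HORIZON_S:
        return 2, "block-schedule-wakeup"   # pause until the reset, then resume
    return 2, "block-wrap-up"               # summarize and end the turn


def user_prompt_submit(prompt, worst_pct, seconds_to_reset):
    """Return (exit_code, action) for a UserPromptSubmit event."""
    if prompt.startswith(RESUME_TAG):
        # Resume hops are always allowed through; only the advice differs.
        if worst_pct < HARD_PCT:
            return 0, "resume-ready"
        if seconds_to_reset <= SLEEP_HORIZON_S:
            return 0, "reschedule"
        return 0, "terminate-chain"
    if worst_pct >= HARD_PCT:
        return 2, "block-prompt"
    if worst_pct >= WARN_PCT:
        return 0, "status-wind-down"
    return 0, "status"


print(pre_tool_use("Bash", 97.0, 2 * 3600))
print(user_prompt_submit(RESUME_TAG + " continue the refactor", 97.0, 1800))
```

Note that only ordinary prompts and non-exempt tools ever produce exit 2; every resume hop and every ScheduleWakeup call exits 0.
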
Gate-and-self-sleep pause design
When a short (5-hour) window is exhausted mid-task, blocking every tool except
ScheduleWakeup turns the limit into a pause rather than a failure: the model is told
to schedule a wakeup (chaining 3600s sleeps if needed) and resume the same task after the
reset. ScheduleWakeup itself is always exempt from the gate, so the model can never be
trapped — it can always schedule its own wakeup.
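
The chained sleeps can be pictured with a small helper that splits the time until reset into hops of at most 3600 seconds. This is a sketch under assumptions: the name `plan_chain` and the 60-second safety margin are hypothetical, and in practice each hop would be recomputed from fresh usage data when the model wakes rather than planned up front.

```python
MAX_SLEEP_S = 3600  # per-hop cap implied by "chaining 3600s sleeps"


def plan_chain(seconds_to_reset, margin_s=60):
    """Split the wait into hops of at most MAX_SLEEP_S seconds.

    `margin_s` is an assumed buffer so the final wakeup lands just past
    the reset rather than right on it.
    """
    remaining = seconds_to_reset + margin_s
    delays = []
    while remaining > 0:
        step = min(MAX_SLEEP_S, remaining)
        delays.append(step)
        remaining -= step
    return delays


print(plan_chain(600))   # [660]
print(plan_chain(9000))  # [3600, 3600, 1860]
```
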
When the reset is more than 6 hours out (typically a weekly window), chaining sleeps
is pointless, so the guard instead tells the model to summarize state for you and end
the turn cleanly.