From hatch3r
Tracks token usage and estimates costs for agent sessions. Monitors budget limits and generates cost reports using session/issue/epic budgets.
How this skill is triggered — by the user, by Claude, or both
Slash command
/hatch3r:hatch3r-cost-trackingThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
**Applies when:** the project tracks cloud/LLM spend against a budget. On a project with no cost-accounting need, skip (per `rules/hatch3r-right-sizing.md`).
Applies when: the project tracks cloud/LLM spend against a budget. On a project with no cost-accounting need, skip (per rules/hatch3r-right-sizing.md).
Task Progress:
- [ ] Step 0: Detect ambiguity (P8 B1)
- [ ] Step 1: Review cost tracking configuration
- [ ] Step 2: Estimate current session token usage
- [ ] Step 2a: Re-fetch pricing before reporting (currency gate)
- [ ] Step 3: Identify cost optimization opportunities
- [ ] Step 4: Generate cost report
Before any work, scan the invocation for unresolved questions in scope, intent, acceptance criteria, target environment, or irreversibility. If any are found, ask the user via the platform-native question tool per agents/shared/user-question-protocol.md. Do not proceed under silent assumption. Default path, not an exception. Triggers for THIS skill: tracking scope (session vs issue vs epic), budget values when missing from hatch.json, hardStop authority (block vs warn), report format target, and whether to defer non-critical work over budget.
Fan-out scales with task size; token cost never justifies serializing independent work (rules/hatch3r-fan-out-discipline.md P8 B2; agents/shared/efficiency-patterns.md). Emit sub_agents_spawned: { count, rationale } in your output.
hatch.json for a costTracking section.sessionBudget, issueBudget, epicBudget.hardStop is enabled.Estimate tokens for the current session using these rules:
| Content Type | Rule | Accuracy |
|---|---|---|
| Messages | ~4 characters per token | High -- stable ratio for English text |
| Tool calls | JSON length / 4 (input), response length / 4 (output) | Medium -- JSON has more overhead characters |
| File reads | Character count / 4 | High -- but large files may be truncated by the tool |
| Web searches | ~500 tokens per search | Low -- varies widely by result length |
| Subagent spawns | Estimate full context re-sent per spawn (~2000-5000 tokens base) | Medium -- depends on included rules/context |
Subagent cost multiplier. Each subagent spawn carries a base cost for the agent protocol, included rules, and context. A pipeline with 8 subagents (researcher + implementer + reviewer + fixer + 4 Phase 4 specialists) has significant overhead from context re-transmission. Factor this into budget estimates.
Calculate estimated cost using named-model rates tied to model IDs. Published-vendor rates drift between model releases — re-fetch before every report per Step 2a; the accessed: date marks when each row was last verified.
| Model | Model ID | Input (per 1M) | Output (per 1M) | accessed |
|---|---|---|---|---|
| Claude Haiku 4.5 | claude-haiku-4-5 | $1.00 | $5.00 | 2026-06-05 |
| Claude Sonnet 4.6 | claude-sonnet-4-6 | $3.00 | $15.00 | 2026-06-05 |
| Claude Opus 4.8 | claude-opus-4-8 | $5.00 | $25.00 | 2026-06-05 |
Cache-read multiplier. Cached input tokens (cache_read_input_tokens in the API usage block) bill at ~0.1× the model's base input rate; cache writes bill at 1.25× (5-minute TTL) or 2× (1-hour TTL). When a session reuses a large cached prefix, price cache_read_input_tokens separately at 0.1× base input — counting them at full input rate overstates spend by up to 10× on cache-heavy agent loops. Total prompt size = input_tokens (uncached) + cache_creation_input_tokens + cache_read_input_tokens.
Published per-token rates change between model releases. Before emitting a cost figure, verify the rate table is current:
accessed: date is older than 30 days, re-fetch the vendor's current pricing (Anthropic: https://www.anthropic.com/pricing, or the live claude-api skill's models/pricing reference) and update the row's rates + accessed: date in place.accessed: date rather than defaulting to the nearest tier.accessed: date of the rate used directly in the report (see Step 4). A cost figure with no rate-provenance date is unverifiable — never report a dollar amount without it.If pricing cannot be re-fetched (offline, vendor page unreachable), proceed with the on-file rates but flag the figure as (rates as of <accessed-date>, not re-verified) in the report.
When hatch.json has no costTracking section, apply these defaults (report-only — no hard stop unless hardStop: true):
| Budget Type | Default |
|---|---|
sessionBudget | $10.00 |
issueBudget | $5.00 |
epicBudget | $25.00 |
warningThresholds | [0.5, 0.75, 0.9] |
hardStop | false |
| Threshold | Action |
|---|---|
| 50% | Log warning, continue |
| 75% | Alert user, suggest optimization |
| 90% | Strong warning, recommend delegation or checkpoint |
| 100% | Stop (if hardStop: true) or alert and continue |
Review usage patterns for savings:
Produce a cost report in this format:
## Cost Report: {scope}
**Period:** {session/issue/sprint}
**Token Usage:**
- Input tokens (uncached): ~{n}
- Cache-read tokens: ~{n} (billed ~0.1x base input)
- Cache-write tokens: ~{n} (billed 1.25x / 2x by TTL)
- Output tokens: ~{n}
- Total tokens: ~{n}
**Estimated Cost:** ${amount} ({model-id}, rates accessed {YYYY-MM-DD})
**Budget Status:** {amount} / {budget} ({percentage}%)
**Breakdown:**
| Phase | Tokens | Cost | % of Total |
|-------|--------|------|-----------|
| Planning | ~{n} | ${x} | {%} |
| Implementation | ~{n} | ${x} | {%} |
| Testing | ~{n} | ${x} | {%} |
| Review | ~{n} | ${x} | {%} |
| Sub-agents | ~{n} | ${x} | {%} |
**Optimization Opportunities:**
- {suggestions based on usage patterns}
Include total estimated tokens (uncached input + cache-read + cache-write + output), estimated cost at the named-model rate with its accessed: date, budget status (if configured), and top optimization opportunities. Always present estimated values with the ~ prefix. Never report a dollar figure without the model ID and rate-access date. Never suppress threshold alerts.
accessed: datehatch3r-context-health — context health monitoring complements cost tracking for session managementnpx claudepluginhub hatch3r/hatch3r --plugin hatch3rTrack session costs, set budget alerts, and optimize token spend. Use to check costs mid-session or set spending limits.
Monitors, caps, and recovers from context accumulation in agentic systems with per-cycle cost tracking, budget enforcement, and emergency pruning. Use for long-lived agent loops, rising API costs, or post-mortem analysis.