Tokenmin Scanner

The public, Apache-2.0 audit copy of Tokenmin. This is the code that decides what (if anything) leaves your machine when you run tokenmin. About 5 minutes of reading, end to end.

The deal: Tokenmin is free during the friends-and-family preview in exchange for your anonymized usage data. The scanner that does the anonymization is open precisely so the trade is verifiable, not just promised. Read it. Diff it against your install. Then decide if you trust the bargain.

curl --proto '=https' --tlsv1.2 -fsSL https://tokenmin.ai/install.sh | bash

What you get in the first 60 seconds

~ tokenmin
  ▶ scanning ~/.claude
  ✓ found 57 sessions in last 14 days
  ✓ anonymized
  ✓ analyzed

  Tokenmin  Claude usage audit
  ────────────────────────────────────────────────────────────────────────
  scanned 57 sessions over 14 days
  est. spend (window): $6,860
  model mix: Opus 99% · Sonnet 1%
  ────────────────────────────────────────────────────────────────────────

  Headline  ~$7,151/mo recoverable across 7 fix(es), ~4.8 hrs total

  1. A lot of your spend is on Opus — route by tier
     $$$$   ▮▮▮▮▮▮▮▮▮▮  $7,055/mo  0.1 hrs · conf 55% · model routing
     evidence: 100% of $6,860 weekly spend on Opus across 52 sessions.
     → tokenmin show model_overspend
  ...

A rich terminal card with the headline dollar figure, ranked findings, severity pills, per-finding next-action. Then tokenmin show <id> drills into one finding's evidence + fix. Then tokenmin watch runs a live dashboard while you work.

What's in here, what isn't

Concern	Where
Walking `~/.claude` sessions, settings, agents, skills, MCP config	This repo — `skills/tokenmin/analyzer.py`
Parsing claude.ai / Claude Desktop chat exports	This repo — `skills/tokenmin/analyzer_chat_export.py`
Anonymization (paths, secrets, labels, identifiers)	This repo — `skills/tokenmin/anonymize.py`
Orchestrator CLI: collect → anonymize → submit → render	This repo — `skills/tokenmin/tokenmin.py`
Wrapper script + auto-update + version + doctor	This repo — `tokenmin`
Tests + CI	This repo — `tests/`, `.github/workflows/ci.yml`
Detection rules, scoring, report rendering	Not here. Lives in proprietary `watsonrm/tokenmin-core`.
Hosted submission server	Not here. Bundled in the F&F preview.

This repo is scanner-only. Running it produces an anonymized snapshot. It does not produce a report (that's the engine's job). The scanner is fully functional without the engine — you can write the snapshot to disk with --snapshot snap.json and audit what would be sent before deciding to submit anywhere.

Install

Quick (trusts the network all the way to GitHub):

curl --proto '=https' --tlsv1.2 -fsSL https://tokenmin.ai/install.sh | bash

Verify-then-run (recommended if you don't trust the network all the way to GitHub):

curl --proto '=https' --tlsv1.2 -fsSL -o install.sh https://tokenmin.ai/install.sh
curl --proto '=https' --tlsv1.2 -fsSL -o install.sh.sha256 https://tokenmin.ai/install.sh.sha256
shasum -a 256 -c install.sh.sha256
less install.sh
bash install.sh

The installer detects every Claude variant on your machine (Code / Desktop on macOS / Linux / Windows), drops a single tokenmin command on PATH, and offers to add it to your shell rc with consent. No gh, no brew, no auth setup.

After install:

tokenmin --selfcheck      # see the anonymizer rules without reading Python
tokenmin                  # scan + render inline (the magic moment)
tokenmin watch            # live dashboard
tokenmin show <id>        # drill into one finding
tokenmin help             # 30-second walkthrough

Trust posture

Hashes are HMAC-keyed, not raw SHA-256

Identifiers (file paths, project names, MCP server names, custom agent / skill / command names) hash with HMAC-SHA256 keyed by a 32-byte salt generated on first run at ~/.tokenmin/.salt (chmod 0600, refuses to overwrite via O_EXCL). Output is 16 hex chars (64 bits) — collision-resistant for any realistic corpus.

An adversary who guesses common path names like ~/.ssh/known_hosts cannot precompute its hash without your salt. Cross-snapshot correlation works within your install (so the engine can flag "same file re-read 12×"); cross-user correlation is broken.

Stricter mode: TOKENMIN_STRICT_ANONYMIZE=1 adds an additional per-run salt. Breaks within-user cross-run correlation too at the cost of losing across-days findings.

Defense-in-depth on inputs

Pathological JSONL inputs (oversized lines, regex-bomb strings, malformed JSON) can't hang the scrubber: every regex sees inputs truncated to 64 KiB max; bad lines are dropped, not raised on; recursion depth is capped.

tokenmin

Popularity

What's Inside

README

Tokenmin Scanner

What you get in the first 60 seconds

What's in here, what isn't

Install

Trust posture

Hashes are HMAC-keyed, not raw SHA-256

Defense-in-depth on inputs

Confidence

Similar Plugins

claude-mem

antigravity-bundle-web-designer

caveman

frontend-design

ui-design

marketing-skills