watchmen

Watchmen turns every coding session into the skill bundles you’d never sit down and write yourself.

Install · What it does · Mission control · How it works · Cost & privacy

Reusable skills and workspace briefs, built from what you actually do, carried across Claude Code, Codex, and pi.dev. You install it once and never change how you work.

The manual fix is writing a CLAUDE.md or AGENTS.md by hand, but that's you doing the learning on behalf of the agent. That's backwards.

watchmen sits behind Claude Code, Codex, and pi.dev. It silently watches your sessions, mines what you actually do, and writes skill bundles + workspace briefs (CLAUDE.md / AGENTS.md) so your next session is smarter than your last. Same skills follow you across agents — switch between Claude Code and Codex on the same repo and they pick up where the other left off.

Why it matters

Fewer tokens burned re-explaining yourself. Your agent stops rediscovering the same procedures every session. The skill is already on disk, so context that used to be spent re-deriving it isn't.
Fewer tool errors per session. watchmen's own impact card tracks median tool errors before and after a project's skills land, on a 16-week curve.
Unified context layer across every agent. Switch from Claude Code to Codex in the middle of development and the skills you need will follow. No need to re-onboard the new agent.

Local storage, cross-agent, continuous.

watchmen stores transcripts, metrics, analyses, and generated bundles on your machine. Analysis runs send selected session excerpts to OpenRouter using your API key; nothing is uploaded outside those explicit LLM calls.

What watchmen actually does

While you work, it:

🤖 Captures sessions from every agent — Claude Code (~/.claude/projects/), Codex (~/.codex/sessions/), pi.dev. One corpus across tools.
📚 Auto-curates skills — recurring procedures get turned into runnable skill bundles your agent can call: SKILL.md + scripts + references.
✍️ Auto-writes CLAUDE.md + AGENTS.md — workspace brief, identical content for both, refreshed continuously.
📈 Surfaces what's working — mission control web UI, per-project impact tracking, friction signals, action queue.
💡 Retrospective skill hints — when you could've used an existing skill, the next statusLine update tells you. Never modifies your agent's context. Never blocks you.

You install it. It runs. Your agents get better every day you use them.

The CLI

watchmen init

Six steps. Most run in seconds; the LLM passes (analyze + curate) are the only slow ones — you see the cost estimate before they run. Stop at any confirmation gate; nothing partial is left behind.

Mission control

A local web dashboard at http://127.0.0.1:8979 — no hosted account or remote dashboard. Top-of-page tells you:

Skill calls this week vs last week — are your curated skills being invoked?
Tool errors per session — is friction going up or down?
Active repos — what's getting work this week?
Skill leaderboard — which repo's skills are firing most
Status tiles — traffic-light health per project (healthy / stale / uncurated)
Next actions — ranked queue, e.g. "kai-bench has 28 prompts to analyze · Run"

Per-project impact

Drill into any tracked repo and you get a before/after view scoped to that project. 16-week chart of tool errors per session with a dashed annotation at the date the curator first landed. Pre/post comparison table: sessions, median tool errors, median prompts, median cost. Honest empty states when there isn't enough post-treatment data yet — never silently disappears.

Subtitle reads "Correlation only — not a controlled experiment." We don't oversell the signal.

Three themes

Light comic-pulp newsprint by default. Doomsday noir mode for the dark-mode crowd. Rorschach sepia-typewriter for diary-mode fans. Switch instantly at /settings — picker persists per browser via localStorage, no reload.

Dashboard + impact-card screenshots ship with v0.6 — generated against a mock corpus so no real project data leaks into the docs.

Install

Three commands and a wizard. Total wall time ~10 min + 30–90 min per project of LLM passes.

git clone https://github.com/firstbatchxyz/watchmen.git
cd watchmen
uv sync && uv tool install --editable .
watchmen init

watchmen

Popularity

What's Inside

README

watchmen

Why it matters

What watchmen actually does

The CLI

Mission control

Per-project impact

Three themes

Install

Confidence

Similar Plugins

caveman

claude-mem

llm-council-plugin

self-improving-agent

Popularity

Health & Quality

Similar Plugins

caveman

claude-mem

llm-council-plugin

self-improving-agent