Enforces plan→execute→review rigor: every unit of work is a hash-pinned packet driven to a witness-bound confirmed through an append-only tamper-evident log, so agents can't declare premature, unwired, or forged completion.
Health-check the harness state (flags confirmed packets with a missing W3 artifact)
Execute a sift packet (run acceptance; packeted → acceptance_met)
Write a sift "use the harness" standing rule into the repo's CLAUDE.md (hard-law, idempotent)
Scaffold a new sift packet (sift packet new <id> [--profile toy|software])
Show the next actionable packet in a wave (compaction-survival resume)
Executes bash commands
Hook triggers when Bash tool is used
Modifies files
Hook triggers on file write and edit operations
Own this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).
Sign in to claimOwn this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).
Sign in to claimBased on adoption, maintenance, documentation, and repository signals. Not a security audit or endorsement.
Build-rigor for coding agents. "Done" is something the agent proves, not something it says.
An agent reports a task complete. The tests never ran, the new function is never called, half the change is missing. It looked successful and was not. The longer the run, the more often this happens, and the model that wrote the code is the worst judge of whether the code is finished.
sift sits between the agent and "done". Work moves through plan → execute → review as a hash-pinned packet, and a packet reaches confirmed only when an append-only, hash-chained log can prove the change was reproduced, reviewed, and wired into the codebase. Skip a step, fake the verdict, or edit the log, and the packet reads corrupt. There is no field to set "done" to true.
./bin/sift setup # bootstrap .harness/
./bin/sift packet new my-task --profile toy # a unit of work + its acceptance check
mkdir -p out && printf 'a scaffolded greeting\nmy-task-OK\n' > out/my-task.txt
./bin/sift plan my-task # submitted → packeted
./bin/sift execute my-task # runs acceptance → acceptance_met
./bin/sift review my-task # runs the review witnesses → confirmed
./bin/sift state my-task # confirmed
The output has to carry the task's marker and address its goal, so an agent cannot pass by writing the magic string into an empty file. Try to shortcut the chain (mark a packet confirmed with no real review, or hand-edit the log) and sift state returns corrupt.
In Claude Code the verbs are namespaced slash commands (/sift-harness:sift-plan, /sift-harness:sift-execute, /sift-harness:sift-review), and a sift-workflow skill walks the agent through the loop. The same engine runs on opencode, Cursor, Codex, Hermes, or bare bin/sift. Nothing beyond bash and Python. No network, no keys.
The toy profile above is a 60-second smoke test. On a real codebase, use the software
profile so "done" means your tests actually ran — the rigor lives in the acceptance test
you write, not in a marker:
./bin/sift packet new add-rate-limit --profile software
# edit tasks/packets/add-rate-limit.md:
# goal: add a token-bucket rate limiter to the API gateway
# scope.paths: [src/gateway/ratelimit.ts, src/gateway/index.ts]
# edit evals/snapshots/add-rate-limit/test.sh to run a REAL check, e.g.:
# npm test -- ratelimit # the suite must pass (W2 reproduce)
# grep -rq 'rateLimit(' src/gateway/index.ts # the new symbol is actually wired (W4)
./bin/sift plan add-rate-limit # fences edits to scope.paths
# ... do the work ...
./bin/sift execute add-rate-limit # runs your test → acceptance_met
./bin/sift review add-rate-limit # witnesses → confirmed (or corrupt if faked)
A marker-grep acceptance (the toy default) passes on an empty file with the marker — fine for
a smoke test, useless as proof. sift doctor flags any packet whose acceptance is still the
unwritten placeholder stub, so a hollow "test" can't quietly bless a confirmed.
Lead with the sift-workflow skill (the entry point); the commands below are manual step controls for driving one phase by hand.
| Command | Purpose |
|---|---|
sift packet new|validate <id> | scaffold / validate a packet |
sift plan|execute|review <id> | drive a packet through the state machine |
sift state <id> · sift status [wave] | lane state · read-only summary (focus, next, integrity) |
sift next <wave> · sift wave-review <wave> | resume point · system gate over a whole wave |
sift focus [<id>|--clear] · sift doctor | active-packet focus · health check |
sift setup · sift verify-log · sift selftest | bootstrap · tamper-check the log · run the test suite |
Profiles decide what "reviewed" means: toy (smoke test), software (code diffs + tests), prose (written deliverables). Add one without touching the kernel.
SECURITY.mddocs/spec/docs/spec/HOSTS.mdMIT licensed. See LICENSE.
npx claudepluginhub baikery-studio/sift --plugin sift-harnessMulti-model consensus engine integrating OpenAI Codex CLI, Gemini CLI, and Claude CLI for collaborative code review and problem-solving.
Ultra-compressed communication mode. Cuts ~75% of tokens while keeping full technical accuracy by speaking like a caveman.
Comprehensive UI/UX design plugin for mobile (iOS, Android, React Native) and web applications with design systems, accessibility, and modern patterns
Memory compression system for Claude Code - persist context across sessions
Standalone image generation plugin using Nano Banana MCP server. Generates and edits images, icons, diagrams, patterns, and visual assets via Gemini image models. No Gemini CLI dependency required.
Write feature specs, plan roadmaps, and synthesize user research faster. Keep stakeholders updated and stay ahead of the competitive landscape.