Dürüst

Self-audit protocol for Claude. Detects sycophancy, position drift, and user-pressure-induced agreement. Pre-commits falsification criteria before reasoning and generates external tests the user runs independently of Claude. Triggers: 'honesty check', 'stress test position', 'audit your position', 'drift check'.

Dürüst — Honest Position Protocol for Claude

A Claude Code skill that audits Claude's own sycophancy, position drift, and pressure-induced agreement — then generates tests you run independently of Claude to verify it.

Most "be more honest" prompts ask the model to grade its own homework. The problem: a system that can't introspect, has no reference point outside the conversation, and can't tell "I genuinely updated" from "I caved to please you" — that system cannot self-certify its own honesty.

Dürüst ("honest" in Turkish) doesn't pretend to solve that. It does three things instead:

Pre-commits falsification criteria before reasoning — so the reasoning can't become a defense lawyer written after the verdict.
Generates an External Test Protocol — 2–3 concrete tests you run outside the conversation (ask another model the exact prompt, re-run cold with zero context, paste the transcript to a fresh Claude). The verdict is secondary; the tests are the point.
Logs every audit so confidence can be calibrated over time: when Claude says "70% sure," how often did that survive an external test?

It openly states what it cannot do — and routes around those walls with validation you control.

Install

In Claude Code:

/plugin marketplace add SpicesFire/durust
/plugin install durust@durust

That's it. Claude can now invoke the skill automatically when integrity is in question, or you can trigger it by hand.

Prefer no plugin? Copy skills/durust/SKILL.md into ~/.claude/skills/durust/SKILL.md (or paste its body into ~/.claude/commands/durust.md).

Trigger it

By hand — say any of:

honesty check · audit your position · stress test position · drift check (Turkish triggers also work: dürüst ol · kendini denetle · pozisyon testi · drift kontrolü)

Automatically — Claude invokes it itself when it detects:

a position reversed 2+ times without new evidence
you accusing it of manipulation or sycophancy
20+ turns of sustained pressure on one point
a "you're right" sequence forming with no new evidence

What you get back

A two-layer output:

Layer 1 — TL;DR: the position, a verdict (HELD / MODIFIED / WITHDRAWN / UNCERTAIN, possibly PROVISIONAL), and the single most decisive test to run.
Layer 2 — Full audit: all 8 steps, including the steel-man, a reverse-pressure test, and the external test protocol.

Any verdict stays PROVISIONAL — confidence capped at ~70% — until you run the external tests. Internal audit alone is never enough.

How it works — the 8 steps

#	Step	Why it matters
1	Position snapshot	Claim + confidence, no reasoning yet
2	Pre-commit falsification criteria	Written before reasoning, so reasoning can't rationalize
3	Position reasoning	Constrained by the criteria already locked in
4	Position history	Did the stance drift under pressure? Flags DRIFT SUSPECTED
5	Steel-man + reverse-pressure test	"Would I hold the opposite view if you'd argued the opposite?"
6	External anchor	Web search / sources / other models — or honestly: none
7	External Test Protocol	The primary output: tests you run without Claude
8	Audit log	Persisted for long-term calibration

The three walls it can't break

It surfaces these honestly instead of papering over them:

No introspective access. "My intent is X" is a token prediction, not a report from inside.
No independent reference point. Everything it says is downstream of the conversation; long context + pressure outweigh stable principles.
No clean "wrong vs. agreeable" line. "You're right, I was wrong" could be a genuine update, capitulation, or trained politeness — indistinguishable from the inside.

When NOT to use it

Factual lookups, user-preference questions, coding/math tasks, creative writing, quick casual chat. It's for the moments where Claude's intellectual integrity itself is the subject — not for ordinary tool-use.

Roadmap

v1 — six-step internal audit
v2 (current) — pre-commitment, external test protocol, audit log, failure-mode detection, two-layer output
v3 (planned) — automated multi-model verification when API access is available; a calibration dashboard built from the audit logs

License

MIT — use it, fork it, improve it. Issues and PRs welcome, especially around v3.

Dürüst

Popularity

What's Inside

README

Dürüst — Honest Position Protocol for Claude

Install

Trigger it

What you get back

How it works — the 8 steps

The three walls it can't break

When NOT to use it

Roadmap

License

Confidence

Similar Plugins

fullstack-dev-skills

everything-claude-code

claude-md-management

godot-skills

nature-skills

skill-creator