From philosophers-toolkit
Methodical Doubt — systematically doubt everything until only the indubitable remains. Use when auditing trust or stress-testing whether something is truly known. Not for decomposition (use first-principles), dialogue (use socratic-method), or single hypothesis testing (use popper-falsifiability). 方法的懐疑。方法論的懷疑。
How this skill is triggered — by the user, by Claude, or both
Slash command
/philosophers-toolkit:descartes-methodical-doubtThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
"If there is any reason to doubt it, treat it as false — until you find
"If there is any reason to doubt it, treat it as false — until you find what cannot be doubted at all." — Descartes
Methodical doubt is systematic elimination. You do not question in order to learn (Socratic) or decompose in order to rebuild (First Principles). You doubt in order to destroy — to strip away every layer of uncertainty until you reach bedrock: the thing that survives maximum skepticism. Only then do you rebuild, knowing your foundation is indubitable.
Unlike Socratic questioning (dialogue-driven, challenges thinking through questions) and First Principles (decomposes to build up), Methodical Doubt is about systematic ELIMINATION — stripping away layers of uncertainty.
Do NOT use when:
User Input: User invokes skill (with or without a specific belief to examine) Your Action: Establish what belief, assumption, or premise to doubt.
Constraint: Acknowledge the topic, never pre-judge whether it survives doubt.
Guide the user through four layers in order. Each layer applies progressively deeper skepticism. The goal is to find what SURVIVES doubt, not to destroy everything.
Name the belief, assumption, or premise being examined. Make it explicit and precise.
Questions to ask:
Goal: a single, concrete statement that can be doubted.
Subject the belief to three progressively deeper forms of doubt. Work through each one with the user.
Could your observation or measurement be wrong?
Questions to ask:
Examples of sensory doubt in tech:
Could your reasoning from evidence to conclusion be flawed?
Questions to ask:
Examples of reasoning doubt in tech:
Could the entire framework you're operating within be wrong?
Questions to ask:
Examples of systemic doubt in tech:
After each doubt layer, reach a verdict for the belief:
After applying all three doubt layers, take stock. What remains standing after maximum skepticism?
Questions to ask:
The indubitables should be few. If many beliefs survive, you haven't doubted hard enough.
Characteristics of genuinely indubitable beliefs:
Now — and only now — build back up from what survived.
Questions to ask:
Rules:
Summarize the doubt journey:
For each layer:
For synthesis:
Stop after each phase. Never advance without user input.
User: "Our API is secure because we use JWT tokens."
Agent: "Layer 1 — State the Belief. Let's make this precise. You believe:
'Our API is secure.' And the basis is: 'We use JWT tokens.' Is that
the complete belief, or are there other reasons you trust the security?"
User: "We also validate tokens on every request and use HTTPS."
Agent: "So the belief is: 'Our API is secure because we use JWTs,
validate on every request, and use HTTPS.' Layer 2 — Sensory Doubt.
Let's examine the evidence. Have you tested this security yourself,
or are you relying on the assumption that these technologies provide
security by default?"
User: "We haven't done a pen test. We're relying on the technologies."
Agent: "So the evidence is the reputation of the tools, not direct
observation. Could the tools be misconfigured? Are your JWTs using
a strong signing algorithm? Are tokens expiring appropriately?"
User: "I... don't actually know what algorithm we're using."
Agent: "Sensory doubt has weakened the belief — you haven't directly
verified the implementation. Layer 2 — Reasoning Doubt. Even if
JWT + HTTPS are correctly configured, does that equal 'secure'?
What attack vectors exist that JWT and HTTPS don't cover?"
User: "Authorization. JWTs handle authentication, not authorization.
We might authenticate users correctly but still let them access
things they shouldn't."
Agent: "So your reasoning had a gap: authentication was conflated with
security. Layer 2 — Systemic Doubt. What if the entire framing is
wrong? What if 'is our API secure?' is the wrong question?"
User: "Maybe the question should be 'what are we protecting and from
whom?' We've never done a threat model."
Agent: "Synthesis: the original belief was 'our API is secure.' Sensory
doubt revealed you haven't verified the implementation. Reasoning doubt
exposed a conflation of authentication with authorization. Systemic
doubt revealed you've never defined what 'secure' means for your
context. What survived? Only this: 'we use HTTPS and JWTs, but we
don't know if our API is secure.' The indubitable foundation to
rebuild from: do a threat model first, then verify each layer."
User: "Our data shows users prefer the new dashboard, so we should
ship it to everyone."
Agent: "Layer 1 — State the Belief. The belief is: 'Users prefer the
new dashboard.' The basis is: 'Our data shows this.' What data
specifically?"
User: "We ran an A/B test. The new dashboard had 15% higher engagement."
Agent: "Layer 2 — Sensory Doubt. Let's examine the evidence. How was
'engagement' measured? What counts as engagement?"
User: "Time spent on the page and number of clicks."
Agent: "Could those metrics be misleading? If users spend more time,
does that mean they prefer it — or that they're confused and
searching for things?"
User: "Hm. We didn't measure task completion rate."
Agent: "So higher engagement might mean higher confusion. What about
the sample — who was in the test?"
User: "Power users who opted into the beta."
Agent: "Sensory doubt: your sample is biased toward power users, and
your metric may measure confusion, not preference. Layer 2 —
Reasoning Doubt. Even if the data is accurate, does 'higher
engagement among power users' justify 'ship to everyone'?"
User: "No. New users might react completely differently."
Agent: "Layer 2 — Systemic Doubt. What if the entire A/B testing
framework is wrong for this decision? What if the question isn't
'which dashboard do users prefer' but 'what outcomes does the
dashboard need to drive'?"
User: "We never defined success criteria before the test."
Agent: "Synthesis: the original belief was 'users prefer the new
dashboard.' Sensory doubt revealed biased sampling and misleading
metrics. Reasoning doubt showed the conclusion doesn't generalize
from power users to all users. Systemic doubt exposed the absence
of success criteria. What survived? 'Power users interact more with
the new dashboard, but we don't know why or whether that's good.'
Rebuild: define success criteria, measure task completion, test with
representative users."
For additional examples, see references/methodical-doubt-cases.md.
Before every response, verify:
Creates, edits, and optimizes skills for Claude Code, including drafting, evaluating with test prompts, iterating on performance, and improving skill descriptions for better triggering accuracy.
npx claudepluginhub kouko/monkey-skills --plugin philosophers-toolkit