From Evidentia — Medical Evidence & Citation Audit
Comprehensive medical fact-checking and critical appraisal skill. Evaluates any medical content — research papers, articles, social media posts, newsletters, YouTube/podcast transcripts, conference slides, clinical guidelines, pharma marketing, patient leaflets, health app content, and more — across 15 criteria for accuracy, evidence quality, and appropriateness. Generates a structured Markdown report with an A–F score and actionable improvement suggestions. Triggers: 'fact-check', 'evidence check', 'evaluate this article', 'check this post', 'ファクトチェック', 'エビデンスチェック', 'この記事を評価して', 'この投稿の問題点'.
Slash command: /evidentia:medical-fact-check
Summary shown in Claude's skill listing (used to decide when to auto-load this skill):
Comprehensive critical appraisal and fact-checking for medical information, producing a structured report.
Scope. This is a pre-publication aid for writers, editors, and researchers — not clinical decision support. It evaluates how medical content is written and sourced; it does not diagnose, treat, or replace professional medical judgment.
Reference files. This skill bundles three reference files. Read them with paths relative to this skill's directory: references/checklist.md, references/evidence-levels.md, and templates/report-template.md. Depending on how the skill was installed, that directory is ~/.claude/skills/medical-fact-check/ (manual copy) or ${CLAUDE_PLUGIN_ROOT}/skills/medical-fact-check/ (installed as the Evidentia plugin).
Deterministic citation engine. When checking citations (Step 4), prefer the local evidentia engine over verifying each identifier by hand. Use the evidentia binary first; fall back to npx -y evidentia only if a local install is unavailable. Evidentia resolves DOI/PMID/arXiv/NCT identifiers against CrossRef, PubMed, OpenAlex, arXiv, and ClinicalTrials.gov, and emits the 4-tier classification plus lookupVerified and resolverOutcomes lookup traces. Use a cache path when possible so repeat checks are stable and fast. See Step 4 for how to call it.
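The two install locations described above can be resolved with a small helper. This is a minimal sketch, not part of the skill: the function names are hypothetical, and it assumes that a set CLAUDE_PLUGIN_ROOT environment variable means the plugin install should win over the manual copy.

```python
import os
from pathlib import Path

SKILL_NAME = "medical-fact-check"

# The three bundled files named in the text, relative to the skill directory.
REFERENCE_FILES = (
    "references/checklist.md",
    "references/evidence-levels.md",
    "templates/report-template.md",
)


def skill_directory(env=None, home=None):
    """Return the skill directory for a plugin install or a manual copy."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    plugin_root = env.get("CLAUDE_PLUGIN_ROOT")
    if plugin_root:
        # Assumption: a non-empty CLAUDE_PLUGIN_ROOT means the plugin install is in use.
        return Path(plugin_root) / "skills" / SKILL_NAME
    # Otherwise fall back to the manual-copy location under the home directory.
    return home / ".claude" / "skills" / SKILL_NAME


def reference_paths(env=None, home=None):
    """Return the full paths of the three bundled reference files."""
    base = skill_directory(env, home)
    return [base / name for name in REFERENCE_FILES]
```

Passing `env` and `home` explicitly keeps the helper easy to check without touching the real environment.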
This skill evaluates medical information across 15 criteria — evidence quality, citation accuracy, statistical interpretation, ethical considerations, and more — then generates a structured Markdown report with an overall A–F score and actionable improvement suggestions.
The skill auto-detects the content type and adjusts its evaluation accordingly:
| Category | Examples | Key Focus |
|---|---|---|
| Research papers | Journal articles, preprints, systematic reviews | Evidence level, methodology, statistical rigor |
| News & articles | Health news, medical blogs, magazine articles | Accuracy of claims, source attribution, exaggeration |
| Social media | X (Twitter), Instagram, TikTok captions, Reddit, note | Brevity-induced omissions, clickbait, misinformation risk |
| Newsletters | Email newsletters, Substack, medical columns | Citation completeness, audience calibration |
| Patient materials | Leaflets, brochures, hospital handouts | Readability, completeness, fear-mongering |
| Video/audio transcripts | YouTube, podcasts, webinar transcripts | Verbal exaggeration, missing nuance, source attribution |
| Presentations | Conference slides, lecture materials, grand rounds | Slide oversimplification, citation on slides |
| Clinical guidelines | Practice guidelines, protocols, algorithms | AGREE II compliance, evidence grading, conflicts of interest |
| Marketing materials | Pharma ads, medical device brochures, supplement claims | Regulatory compliance, selective data presentation, COI |
| Health apps & digital | App descriptions, chatbot outputs, AI-generated content | Hallucination detection, accuracy of automated advice |
| Textbooks & education | Textbook chapters, CME/CPD materials, study guides | Currency, completeness, pedagogical accuracy |
| Infographics | Visual summaries, data visualizations, social cards | Data integrity, oversimplification, source attribution |
Each item is rated Excellent / Good / Fair / Poor. The overall score:
| Score | Criteria |
|---|---|
| A | 12+ Excellent, 0 Poor |
| B | 12+ Excellent or Good, ≤1 Poor |
| C | 12+ Fair or better, ≤2 Poor |
| D | 3+ Poor |
| F | 5+ Poor, or critical ethical issues |
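The score table can be read as a small decision procedure. The sketch below is one possible reading, not the skill's own implementation: the function name and the `critical_ethical_issue` flag are hypothetical, and it assumes the F and D rows are checked first so that the overlapping rows resolve to the lower grade.

```python
from collections import Counter

VALID_RATINGS = {"Excellent", "Good", "Fair", "Poor"}


def overall_grade(ratings, critical_ethical_issue=False):
    """Map 15 item ratings to an A-F grade following the score table."""
    if len(ratings) != 15:
        raise ValueError("expected exactly 15 item ratings")
    unknown = set(ratings) - VALID_RATINGS
    if unknown:
        raise ValueError(f"unknown ratings: {sorted(unknown)}")

    counts = Counter(ratings)
    poor = counts["Poor"]
    excellent = counts["Excellent"]
    good_or_better = excellent + counts["Good"]

    # Assumption: the most severe rows take precedence over the others.
    if critical_ethical_issue or poor >= 5:
        return "F"
    if poor >= 3:
        return "D"
    if excellent >= 12 and poor == 0:
        return "A"
    if good_or_better >= 12 and poor <= 1:
        return "B"
    # With 15 items and at most 2 Poor, at least 13 items are Fair or better,
    # so the C row (12+ Fair or better, at most 2 Poor) always holds here.
    return "C"
```

For example, twelve Good, two Fair, and one Poor yields a B under this reading, while the same ratings with a critical ethical issue flagged yield an F.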
Receive the target medical content from the user. Depending on the input format:
- URL: use WebFetch to retrieve the content
- File path: use Read to load the file

Identify the following:
Adjust the evaluation lens based on detected media type:
Social media posts:
Video/podcast transcripts:
Marketing materials:
Clinical guidelines:
AI-generated content:
Read references/checklist.md (in this skill's directory — see the note at the top) with the Read tool to load the detailed 15-item evaluation checklist.
If the content references research studies, read references/evidence-levels.md with the Read tool and evaluate:
If the content cites papers or sources, verify them. Prefer the deterministic engine for existence and bibliographic checks, then use WebSearch for the semantic context check that the engine cannot do.
If evidentia (or the verify_citations MCP tool) is available, run it on the content first. It resolves DOI/PMID/arXiv/NCT identifiers against CrossRef, PubMed, OpenAlex, arXiv, and ClinicalTrials.gov and returns Tiers 1, 3, and 4 with certainty - no model guesswork. Books (ISBN), guidelines, title-only citations, and other non-indexed sources are returned as Tier 2 ("verify manually"), never as fabrications:
evidentia check <file-or-url> --format json --cache "$HOME/.cache/evidentia/verification-cache.json" --mailto <your-email>
If the local binary is unavailable, fall back to the npm package:
npx -y evidentia check <file-or-url> --format json --cache "$HOME/.cache/evidentia/verification-cache.json" --mailto <your-email>
Use its output as the ground truth for citation existence. Inspect lookupVerified and resolverOutcomes when explaining why a citation was classified:
If the engine is not available, fall back to verifying each identifier manually with WebSearch (steps below).
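The binary-first, npx-second order described above can be sketched as a command builder. This is an illustrative helper, not part of Evidentia: the function name is hypothetical, the flags are only those shown in the commands above, and returning `None` as the signal to fall back to manual WebSearch checks is an assumption.

```python
import os
import shutil


def build_check_command(target, mailto, cache_path=None, which=shutil.which):
    """Build the evidentia check command, preferring a local binary over npx."""
    if cache_path is None:
        # Same cache location as in the commands above.
        cache_path = os.path.join(
            os.path.expanduser("~"), ".cache", "evidentia", "verification-cache.json"
        )
    if which("evidentia"):
        prefix = ["evidentia"]
    elif which("npx"):
        prefix = ["npx", "-y", "evidentia"]
    else:
        # Neither launcher is on PATH: the caller falls back to manual checks.
        return None
    return prefix + [
        "check", target,
        "--format", "json",
        "--cache", cache_path,
        "--mailto", mailto,
    ]
```

The resulting list can be handed to `subprocess.run(cmd, capture_output=True, text=True)`. The shape of the JSON output is not assumed here, so parsing is left to the caller.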
For every citation the engine marked Verified, still confirm it is used honestly:
AI-generated text (ChatGPT, Claude, Gemini, etc.) frequently contains plausible but fabricated citations. When a citation cannot be confirmed, perform these additional checks:
Do NOT stop at "could not verify." Actively determine whether the citation is unverifiable or provably fabricated.
Classify each citation into one of 4 tiers:
| Tier | Classification | Description |
|---|---|---|
| 1 | Verified | Paper exists and content matches the citation |
| 2 | Content mismatch | Paper exists but is cited out of context |
| 3 | Bibliographic mismatch | Paper exists but DOI, author, or journal info is wrong |
| 4 | Hallucination | DOI points to an unrelated paper, or the paper does not exist |
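For manual classification, the tier table can be expressed as a short decision function. This is a sketch under stated assumptions: the four boolean inputs and the function name are hypothetical, and the order of the checks (existence, then bibliographic details, then context) is one reasonable reading of the table rather than a rule the table states.

```python
from enum import IntEnum


class Tier(IntEnum):
    VERIFIED = 1
    CONTENT_MISMATCH = 2
    BIBLIOGRAPHIC_MISMATCH = 3
    HALLUCINATION = 4


def classify_citation(paper_exists, doi_points_to_unrelated_paper,
                      metadata_matches, content_matches):
    """Assign one of the 4 tiers from the outcome of the individual checks."""
    # Tier 4: the paper does not exist, or the DOI resolves to an unrelated paper.
    if not paper_exists or doi_points_to_unrelated_paper:
        return Tier.HALLUCINATION
    # Tier 3: the paper exists but DOI, author, or journal details are wrong.
    if not metadata_matches:
        return Tier.BIBLIOGRAPHIC_MISMATCH
    # Tier 2: the paper exists but is cited out of context.
    if not content_matches:
        return Tier.CONTENT_MISMATCH
    # Tier 1: the paper exists and its content matches the citation.
    return Tier.VERIFIED
```

Checking existence first means a citation is only judged on context once it is known to point at a real, correctly described paper.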
Rate each of the 15 items using these dimensions:
| Criterion | Social Media | Marketing | Guidelines | Patient Materials |
|---|---|---|---|---|
| #1 Evidence level | Expect source links | Heightened scrutiny | GRADE required | Simplified OK |
| #2 Citations | At minimum, name sources | Full disclosure required | Systematic search required | Source available on request |
| #6 Exaggeration | Very common — flag aggressively | Primary concern | Should be absent | Watch for false reassurance |
| #7 Population fit | Often ignored — flag | Check indication scope | Must be explicit | Must match audience |
| #9 Readability | Platform-appropriate | Accessible to HCPs + public | HCP-level acceptable | 6th-grade reading level |
| #10 Ethics | Check stigma/fear | Check manipulation | Check COI panel | Check dignity/autonomy |
| #12 Images | Memes, infographics | Selective visuals | Evidence figures | Clear illustrations |
Aggregate the 15 item ratings into an A–F score using the criteria table in the Overview section.
Additionally, flag a Public Health Risk Assessment:
Read the report template from templates/report-template.md (in this skill's directory) with the Read tool and produce the structured report.
Required sections:
Save the completed report as a Markdown file using Write:
- Filename: medical-fact-check-report-YYYY-MM-DD.md in the current directory
- If that name is already taken: -2, -3, etc.

If the user revises the content based on the report and requests re-evaluation:
- -rev2 (or -rev3, etc.) suffix

Short-form content requires particular attention to:
Audio/video content often has unique issues:
Slide decks present compressed information:
Guidelines demand the highest methodological standards:
Marketing materials require heightened skepticism:
LLM-generated content requires the most rigorous citation checking:
Patient materials prioritize accessibility and safety:
- references/checklist.md: detailed 15-item evaluation checklist
- references/evidence-levels.md: evidence hierarchy & quality assessment tools
- templates/report-template.md: structured report template
npx claudepluginhub kgraph57/evidentia --plugin evidentia