From paper-to-slides
Convert one or more academic papers (PDF, LaTeX, Overleaf project) into polished HTML presentations. Supports multiple papers in one deck, explicit language selection, screen-aware sizing for 4K displays, and style extraction from reference PPT/HTML templates. Outputs to a dedicated slides directory at the project root (never into the paper source directory). Delegates to a slide generation skill (frontend-slides or frontend-design) for rendering and automatically runs overleaf-cleanup for LaTeX/zip/directory inputs.
How this skill is triggered — by the user, by Claude, or both
Slash command
/paper-to-slides:paper-to-slidesThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Convert one or more academic papers into a compelling HTML presentation. Reads each paper, extracts the narrative structure, adapts to the target audience, and produces a polished slide deck via the `frontend-slides` or `frontend-design` skill (see Prerequisites for resolution).
Convert one or more academic papers into a compelling HTML presentation. Reads each paper, extracts the narrative structure, adapts to the target audience, and produces a polished slide deck via the frontend-slides or frontend-design skill (see Prerequisites for resolution).
Detailed reference material (outline templates, CSS rules, OOXML transforms, fidelity checklists, etc.) is in REFERENCE.md. Read it on-demand when a specific step requires it.
This skill works with Claude Code CLI, OpenAI Codex CLI, and other SKILL.md-compatible agents. Instructions use Claude Code tool names — see PLATFORM_COMPAT.md for the full cross-platform tool mapping and script path resolution.
All script invocations below use $SKILL_DIR — set it once at the start of the session per the instructions in PLATFORM_COMPAT.md.
.tex file, Overleaf zip, or project directory and wants slides/paper-to-slides # Interactive
/paper-to-slides paper.pdf # Single PDF
/paper-to-slides main.tex # Single LaTeX file
/paper-to-slides ./my-overleaf-project # Overleaf directory
/paper-to-slides project.zip # Overleaf zip
/paper-to-slides paper.pdf academic # With style preset
/paper-to-slides paper1.pdf paper2.pdf # Multiple papers → one deck
/paper-to-slides paper.pdf --template ref.pptx # Use a PPTX as style reference
Use your vision and semantic understanding over script-based heuristics. If your platform supports multimodal input (Claude Code does natively; Codex via view_image):
Read tool with pages param. Codex: extract pages as images via pdf2image, then view_image.)Read on image files. Codex: view_image.)snapshot_slides.py to take screenshots, then visually inspect slide quality, layout, typography, and figure placement.Scripts (extract_pdf.py, diff_supplement.py) are auxiliary tools for data extraction and metadata. They should never be the primary decision-maker for content quality or similarity.
Verify at the start of every invocation. Stop with actionable guidance if anything is missing.
| Skill | Purpose | Check | Invoke |
|---|---|---|---|
frontend-slides or frontend-design | HTML slide generation | See resolution below | See resolution below |
overleaf-cleanup | LaTeX project cleanup | Only for .zip/.tex/directory inputs | Claude: Skill("overleaf-cleanup") / Codex: $overleaf-cleanup |
At the start of every invocation, determine which slide generation skill is available. Check in order:
frontend-slides — Check if Skill("frontend-slides") is available. This is the preferred provider (ships with recent Claude Code versions).frontend-design — Fallback. Check if Skill("frontend-design") is available. Can produce slide-like output but is not specialized for presentations.Use whichever is found first. If neither is available, stop and tell the user to update Claude Code.
| Package | Purpose | When Needed |
|---|---|---|
pdfplumber | PDF text extraction (auxiliary) | PDF inputs with long docs |
python-pptx | PPTX template analysis / export | Template extraction, PPTX export |
Pillow | Image processing, OOXML transforms, PDF export | Template assets, PDF export |
lxml | PPTX XML parsing | Template extraction |
pdf2image | PDF figures → PNG (needs poppler) | PDF figure extraction |
matplotlib | LaTeX math → PNG for PPTX | PPTX export with math |
beautifulsoup4 | HTML parsing for PPTX conversion | PPTX export |
playwright | Headless slide screenshots + PDF export | Snapshots, visual QA, PDF export |
Auto-install missing packages on demand with sys.executable -m pip install.
Playwright also requires a one-time browser install:
python -m playwright install chromium
python --versionconda install -c conda-forge poppler. Mac: brew install poppler. Linux: apt install poppler-utils.snapshot_slides.py on first run. Required for slide screenshots, visual QA, and PDF export.paper_path accepts one or more paths. For each, determine type:
overleaf-cleanup skill first (Claude Code: Skill("overleaf-cleanup", "<path>"); Codex: $overleaf-cleanup <path>), then parse cleaned .tex (Step 3)Primary approach — read the PDF directly. You are the primary reader — form your own understanding of the paper's structure, arguments, and contributions.
Claude Code — use the Read tool with pages parameter (gives both text and visual content):
Read(file_path="paper.pdf", pages="1-20") # first 20 pages (you see both text AND images)
Read(file_path="paper.pdf", pages="21-40") # next batch
Codex — extract text via shell, view figures as images:
python -c "
import pdfplumber
with pdfplumber.open('paper.pdf') as pdf:
for p in pdf.pages[:20]:
print(p.extract_text() or '')
"
Read in batches of 20 pages. Visual reading advantages (when platform supports multimodal):
Auxiliary script — for long PDFs (>40 pages) where you need metadata before deciding what to read:
PDF_SCRIPT="$SKILL_DIR/extract_pdf.py"
python "$PDF_SCRIPT" "<pdf_path>" --output-dir "<output_dir>"
The script produces <output_dir>/_pdf_cache/ with:
| File | Purpose |
|---|---|
metadata.json | Page count, classification breakdown, title/abstract guess |
index.json | Section → chunk → page mapping, figure page list |
chunk_001.txt ... | Raw text per chunk (15-20 pages each) |
Use the script only for:
Do NOT rely on the script's page classifications as ground truth. Its heuristics (text density thresholds, keyword matching) can misclassify pages. Always verify by reading the actual content yourself.
Cleanup: Delete
_pdf_cache/after slide generation is finalized (Step 13).
overleaf-cleanup skill for zip/directory inputs (mandatory)..tex file, read it and all \input/\include'd files.\title, \author, \abstract, \section, \begin{figure}, etc.You are the semantic analyst. Read the paper content from Step 2/3 and extract:
Semantic understanding is critical here. Do not just extract keywords or section headings — understand the paper's narrative arc, what problem it solves, why the approach works, and what the results mean. This understanding drives slide quality.
For PDF inputs, read the actual text from Step 2. If you need more detail on a specific section, re-read those pages directly. The extraction script's metadata is only for navigation, not for understanding.
When multiple papers provided, extract from each separately and label with identifiers.
Multi-paper merging: See
REFERENCE.md → Multi-Paper Merge Guidelinesfor terminology collision, notation conflict, and attribution rules.
Ask per paper using the metadata from Step 2a/2b:
Question: "How many pages should I read from '<filename>'? (Total: <N> pages, ~<M> content / ~<R> references / ~<A> appendix)"
Options:
- "Main body only (auto-detect)" — "Skip references and appendix pages. Recommended."
- "All pages" — "Read everything including appendices (chunked for long PDFs)"
- "First <N/2> pages" — "Read only the first half"
- "Let me specify sections" — "I'll tell you which page ranges or sections to focus on"
The page classification from Step 2b makes this question informative — the user sees the breakdown before deciding. For LaTeX inputs, estimate from content sections instead.
Must be explicitly confirmed. Never assume. Options: English, 中文, Same as paper.
When non-English: translate all text, keep technical acronyms, use CJK fonts, set lang attribute.
Must be confirmed. Options: Academic, Popular science, Pitch, Tutorial.
See
REFERENCE.md → Style Presetsfor slide count, density, tone, and audience adaptation examples per style.
Always ask for a reference template unless template argument was provided.
When a template is available, its visual assets MUST be used with 100% fidelity — backgrounds, logos, colors, fonts, aspect ratio.
See
REFERENCE.md → Template Fidelity Requirementsfor the full element-by-element checklist and common mistakes.
For PPTX templates — run the persisted extraction script:
EXTRACT_SCRIPT="$SKILL_DIR/extract_pptx_template.py"
python "$EXTRACT_SCRIPT" "<pptx_path>" --output-dir "<output_dir>/assets"
The script extracts images (with OOXML transform handling), theme colors, fonts, decorative shapes, and produces style_report.json.
OOXML Image Transforms: The script auto-detects
<a:grayscl/>,<a:duotone>, etc. and applies Pillow equivalents. SeeREFERENCE.md → OOXML Image Transformsfor details and manual fallback.
For HTML templates: Extract CSS properties, fonts, colors, layout, background images, logos.
Distill into _style_template.html: A <style> block with CSS variables/fonts/backgrounds, 3 skeleton slides (title, content, end), logo positioned via CSS, and background images via assets/.
Read aspect_ratio_label from style_report.json. Offer to generate the other ratio version.
See
REFERENCE.md → CSS Transformation Rulesfor the 16:9 ↔ 4:3 conversion table.
Naming: Primary = slides.html, secondary = slides_{ratio}.html (e.g., slides_4_3.html).
Always ask this question. The original HTML with assets is always kept; this only controls whether an additional self-contained copy is generated.
Question: "Would you also like a single-file HTML version (all images embedded, easy to share)?"
Options:
- "Yes, generate single-file version (Recommended)" — "Creates slides_single.html with all assets inlined as base64. The original slides.html + assets/ folder is kept as-is."
- "No, just the original" — "Only output slides.html with the assets/ folder."
If yes: after generating the primary slides.html (with assets/ folder), produce an additional slides_single.html by running inline_assets.py:
INLINE_SCRIPT="$SKILL_DIR/inline_assets.py"
python "$INLINE_SCRIPT" "<slides.html>" --output "<slides_single.html>"
For dual aspect ratio versions, also produce slides_4_3_single.html etc.
Output naming: {original_stem}_single.html — never overwrite the original.
Important — export tools must always use the original HTML:
html_to_pptx.py, html_slides_to_pptx.py) → use slides.html (not _single)snapshot_slides.py) → use slides.html (not _single)_single.html files are for sharing/viewing only — they contain bloated base64 data that slows down Playwright rendering and parsing. Always point export tools at the original HTML + assets/ folder.If no: skip. The original slides.html + assets/ is the only output.
Ask: "Auto-detect from template", "I'll provide a logo file", or "No logo needed".
Two approaches:
Include logo in the HTML generation prompt to frontend-slides. Provide guidelines per slide type:
Always use the original logo image — never apply CSS filters that alter colors.
Lesson learned: Do NOT use
filter: brightness(0) invert(1)on logos — destroys colored logos.
LOGO_SCRIPT="$SKILL_DIR/inject_logo.py"
python "$LOGO_SCRIPT" <slides.html> <logo.png> --alt-text "Org Name"
Idempotent, supports --position (top-right/top-left/bottom-right/bottom-left).
Never write into the paper's source directory. Create <project_root>/slides_<slug>/ with assets/ subdirectory.
Naming: single paper → slides_<paper_name>, multiple → slides_<combined_name>. Sanitize: lowercase, _ for spaces, max 40 chars.
Create a structured outline and present for approval. For multi-paper decks, ask organization strategy: Sequential, Interleaved by theme, or Comparative.
See
REFERENCE.md → Slide Outline Templatesfor per-style outline structures andMulti-Paper Merge Guidelinesfor the pre-generation checklist.
View figures directly — don't just list filenames. (Claude Code: Read tool on image files. Codex: view_image or describe from filename/context.)
assets/.pdf2image to extract specific pages as PNGs for figure-heavy pagesPresent your visual assessment to the user: "Figure 3 (page 8) shows the architecture diagram — worth including on the methods slide. Figure 5 (page 12) is a bar chart comparing baselines — key for the results slide."
SCREEN_SCRIPT="$SKILL_DIR/detect_screen.py"
python "$SCREEN_SCRIPT"
If css_recommendation is "boost" (HiDPI/4K): increase CSS clamp() upper bounds ~50%, add @media (min-width: 2500px) breakpoint. If "default": use standard values.
Invoke the resolved slide provider from the Prerequisites step (Claude Code: Skill("frontend-slides", ...) or Skill("frontend-design", ...); Codex: $frontend-slides ... or $frontend-design ...) with: structured slide content (type, heading, body, figure, notes per slide), style direction, language, screen resolution, _style_template.html, and logo info from Step 7b.
If dual version requested in Step 7a:
slides.html.slides_{ratio}.html, apply CSS transformations (see REFERENCE.md → CSS Transformation Rules).Visual QA before asking the user. Take screenshots and inspect them yourself:
snapshot_slides.py to capture all slides as images.Read tool. Codex: view_image):
Then open in browser. Ask: "Great, done!" / "Adjust content" / "Adjust style" / "Both". Iterate.
Audit and fix slides that use too little vertical space (common with few bullet points).
Workflow:
audit_space.py on the primary HTML:AUDIT_SCRIPT="$SKILL_DIR/audit_space.py"
python "$AUDIT_SCRIPT" "<slides.html>" --threshold 0.55
Report and screenshots are saved to _space_audit/ next to the HTML file.
Visual Review — If any slides are flagged, view each screenshot in _space_audit/ to visually assess the layout.
Determine Fix — Use this lookup table:
| Symptom | Fix |
|---|---|
| 1-3 bullets, fill < 45% | Increase --body-size clamp floor by 0.2-0.4rem |
| All content slides 45-55% | Increase line-height on ul (1.65 → 1.9) |
li items too cramped | Increase margin-bottom on li (0.45em → 0.65em) |
slide-figure low fill | Increase max-height on .fig-panel img |
Apply — Edit CSS in <style> block. Prefer global :root changes when multiple slides flagged. Apply same delta to dual-version HTML.
Re-audit — Run audit_space.py again to verify improvement (max 2 passes).
Cleanup — Remove _space_audit/ directory after optimization is complete:
import shutil; shutil.rmtree('_space_audit/')
Capture high-resolution screenshots of each slide for review/sharing:
SNAP_SCRIPT="$SKILL_DIR/snapshot_slides.py"
# Default: 2x scale, auto-detect aspect ratio
python "$SNAP_SCRIPT" "<slides.html>" --output-dir "<output_dir>/snapshots"
# 4:3 slides at 3x resolution
python "$SNAP_SCRIPT" "<slides_4_3.html>" --output-dir "<output_dir>/snapshots_4_3" --scale 3
# Specific slides only
python "$SNAP_SCRIPT" "<slides.html>" --output-dir "<output_dir>/snapshots" --slides 5-24
The script runs headless Chromium, hides progress bars and navigation overlays, and outputs slide_001.png, slide_002.png, etc. Viewport auto-selects 1920×1080 (16:9) or 1440×1080 (4:3) based on slide CSS.
Always ask the export format:
Question: "Which export format(s) do you need?"
Options:
- "PPTX only" — "PowerPoint file for editing and presenting"
- "PDF only" — "Fixed-layout document for sharing and printing"
- "Both PPTX and PDF" — "PowerPoint for editing + PDF for distribution"
- "No export needed" — "Keep HTML only"
EXPORT_SCRIPT="$SKILL_DIR/html_to_pptx.py"
python "$EXPORT_SCRIPT" "<slides.html>" --output "<slides.pptx>" --style-report "<style_report.json>"
For dual versions, export each separately. Use --width 10 --height 7.5 for 4:3, defaults for 16:9.
See
REFERENCE.md → PPTX Export Notesfor math rendering, figure scaling, and known limitations.
Use the snapshot script to produce high-resolution slide images, then combine into a PDF:
SNAP_SCRIPT="$SKILL_DIR/snapshot_slides.py"
python "$SNAP_SCRIPT" "<slides.html>" --output-dir "<output_dir>/pdf_frames" --scale 2 --format png
Then combine the PNGs into a single PDF using Pillow:
from PIL import Image
from pathlib import Path
frames = sorted(Path("<output_dir>/pdf_frames").glob("slide_*.png"))
images = [Image.open(f).convert("RGB") for f in frames]
images[0].save("<slides.pdf>", save_all=True, append_images=images[1:], resolution=150)
For dual versions, export each separately.
You are the semantic judge. Do NOT rely on the diff script's string-similarity scores to decide what's new vs. existing.
DIFF_SCRIPT="$SKILL_DIR/diff_supplement.py"
python "$DIFF_SCRIPT" "<slides.html>" <supplements...> --extract-only -o "<content.json>"
Read the extracted content yourself. For each supplement section, determine:
Present your semantic assessment to the user with specific reasoning, not just similarity scores.
If you want a rough starting point, run without --extract-only — but treat the SequenceMatcher scores as unreliable hints. Content with different wording about the same topic often scores below 0.6 and gets misclassified as "new".
Apply to all versions. Re-export if needed.
All scripts live in the same directory as this SKILL.md (on Claude Code: ~/.claude/skills/paper-to-slides/). Set SKILL_DIR once per session (see Platform Compatibility above), then run directly — do NOT copy into the project.
| Script | Purpose | Usage |
|---|---|---|
extract_pdf.py | Auxiliary. Chunked PDF extraction for metadata/navigation on long PDFs. Prefer reading PDFs directly via Read tool. | python ... <pdf> -o <dir> [--scope main-body|all|pages] [--pages 1-50] |
detect_screen.py | Screen resolution & DPI detection | python "$SKILL_DIR/detect_screen.py" |
extract_pptx_template.py | PPTX asset extraction with OOXML transforms | python ... <pptx> -o <dir> |
html_to_pptx.py | HTML → PPTX export with math/figure handling | python ... <html> --output <pptx> [--style-report <json>] |
inject_logo.py | Batch logo injection (CSS + elements) | python ... <slides.html> <logo.png> [--position top-right] |
diff_supplement.py | Auxiliary. Content extraction (--extract-only) for LLM semantic comparison. Legacy syntactic diff available but unreliable. | python ... <slides.html> <supp...> --extract-only -o <content.json> |
audit_space.py | DOM-based content fill audit, flags sparse slides, captures screenshots to _space_audit/ | python ... <slides.html> [--threshold 0.55] [--no-screenshots] [--slides 1-5,8] |
snapshot_slides.py | Headless slide screenshots (Playwright) | python ... <slides.html> -o <dir> [--scale 2] [--slides 1-5,8] |
inline_assets.py | Embed images/CSS as base64 for single-file HTML | python ... <slides.html> [-o <standalone.html>] |
_single.html version (Step 7c)assets/, flag figure pages (Step 10)inject_logo.py (Step 7b Approach B)inline_assets.py to produce slides_single.html (Step 7c)npx claudepluginhub whiskychoy/whisky-claude-plugins --plugin paper-to-slidesBuilds slide decks and presentations for research talks, conferences, seminars, and thesis defenses. Provides slide structure, design templates, timing guidance, and visual validation. Works with PowerPoint and LaTeX Beamer.
Creates slide decks for scientific presentations in PowerPoint and LaTeX Beamer, offering structure, design templates, timing guidance, and visual validation. Use for conferences, seminars, defenses.
Creates slide presentations from topics, URLs, PDFs, git repos, or vault notes. Handles research, synthesis, outlining, and editing existing decks. Default output is reveal.js HTML; pptx available on request.