Skill

pdf-to-markdown

Converts PDFs to structured Markdown, preserving headings, tables, lists, and reading order. Use for text extraction, batch processing, RAG ingestion, LLM context, or PDF analysis tasks.

Bash

Markdown

cli-tools

automation

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/pdf-to-markdown:pdf-to-markdown

User invocable

Model invocable

Inline context

Default effort

Supporting Files

bin/pdf-to-markdown

SKILL.md

57 lines · ~798 tokens

Stats

Language: Python

Parent stars: 11

Maintenance: Good

Last Commit: Apr 2, 2026

PDF to Markdown

Convert PDFs into structured, semantic Markdown that preserves the document's logical structure — headings, tables, lists, and reading order — rather than producing flat text. This is significantly higher quality than reading a PDF directly with the read tool, which only extracts raw text without structure.

Usage

Before running any commands, set SKILL_DIR to the absolute path of the directory containing this SKILL.md file. Use $SKILL_DIR/bin/pdf-to-markdown in all commands below.

The $SKILL_DIR/bin/pdf-to-markdown wrapper automatically installs the platform-specific binary into ~/.local/share/nutrient/cli/ from the CDN. It caches the binary and only checks for updates every 6 hours, so subsequent runs are fast.
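As a minimal sketch of the setup step, the snippet below resolves a directory to an absolute path and exports it as SKILL_DIR. The helper name `set_skill_dir` and the example path in the usage comment are hypothetical; only SKILL_DIR, SKILL.md, and the bin/pdf-to-markdown location come from the text above.

```shell
#!/usr/bin/env bash
# set_skill_dir is a hypothetical helper, not part of the skill itself.
# It resolves the given directory to an absolute path and exports SKILL_DIR.
set_skill_dir() {
  local dir="$1"
  # Refuse directories that do not contain SKILL.md.
  if [ ! -f "$dir/SKILL.md" ]; then
    echo "no SKILL.md found in: $dir" >&2
    return 1
  fi
  SKILL_DIR="$(cd "$dir" && pwd)" || return 1
  export SKILL_DIR
}

# Usage (the path is a placeholder for wherever the skill is installed):
#   set_skill_dir ./skills/pdf-to-markdown && echo "$SKILL_DIR/bin/pdf-to-markdown"
```

Checking for SKILL.md first is a design choice in this sketch: it catches a wrong directory before any converter command is attempted.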

Single file

$SKILL_DIR/bin/pdf-to-markdown INPUT.pdf OUTPUT.md

If OUTPUT.md is omitted, the converter writes the Markdown to stdout instead.
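Because the converter writes to stdout when no output path is given, its result can be piped into other tools. The sketch below wraps that in a hypothetical `preview_pdf` helper that prints only the first lines of the converted Markdown. It assumes SKILL_DIR is already set as described under Usage.

```shell
# preview_pdf is a hypothetical helper: it prints the first N lines
# (default 20) of the Markdown that the converter writes to stdout.
# Assumes SKILL_DIR is already set to the skill's directory.
preview_pdf() {
  local input="$1"
  local lines="${2:-20}"
  "$SKILL_DIR/bin/pdf-to-markdown" "$input" | head -n "$lines"
}

# Usage: preview_pdf INPUT.pdf 40
```

Limiting the preview keeps large documents from flooding the terminal or an agent's context.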

Batch directory (2+ files)

For multiple files, pass directories instead of individual files. The converter processes all PDFs in the input directory in parallel, which is much faster than converting one at a time.

$SKILL_DIR/bin/pdf-to-markdown INPUT_DIR/ OUTPUT_DIR/

Workflow

1. Choose mode: Use batch directory mode for 2+ files, single file mode otherwise.
2. Run the converter: $SKILL_DIR/bin/pdf-to-markdown INPUT [OUTPUT]
3. Check the exit code: Exit 0 means success. On failure, read stderr for the error message.
4. Validate the output: If the output file is empty or near-empty, see Troubleshooting below.
5. Report the output path: Tell the user where the converted file(s) are. Do NOT read the markdown back into context by default: converted documents can be very large and will fill the context window. Only read the output if the user's task specifically requires analyzing or summarizing the content (e.g., "summarize this PDF", "what does this contract say about X").
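The workflow above can be sketched as a single shell function. `convert_and_report` is a hypothetical wrapper, not something the skill ships. It assumes SKILL_DIR is set and that an explicit OUTPUT path is always passed. It runs the converter, surfaces stderr on failure, warns on an empty output file, and prints only the output path.

```shell
# convert_and_report is a hypothetical wrapper around the documented workflow.
# Assumes SKILL_DIR is set and that both INPUT and OUTPUT are given.
convert_and_report() {
  local input="$1" output="$2"
  local errlog
  errlog="$(mktemp)" || return 1

  # Run the converter. Passing directories as INPUT and OUTPUT selects
  # batch mode; the command itself is the same in both modes.
  if ! "$SKILL_DIR/bin/pdf-to-markdown" "$input" "$output" 2>"$errlog"; then
    # Non-zero exit code: show the converter's stderr and stop.
    echo "conversion failed:" >&2
    cat "$errlog" >&2
    rm -f "$errlog"
    return 1
  fi
  rm -f "$errlog"

  # Validate: an empty output file suggests a scanned or image-only PDF.
  if [ -f "$output" ] && [ ! -s "$output" ]; then
    echo "warning: $output is empty; see Troubleshooting" >&2
  fi

  # Report the path only; the Markdown is not read back.
  echo "$output"
}

# Usage: convert_and_report INPUT.pdf OUTPUT.md
```

Printing just the path mirrors the guidance not to load converted documents into context unless the task requires it.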

Troubleshooting

Empty or minimal output: The PDF may be scanned or image-only and contain no extractable text.
Non-zero exit code: Read stderr for the specific error. Common causes: corrupted PDF, unsupported encryption, or network issues during first-run binary download.
First run is slow: The wrapper downloads the platform binary on first use, which typically takes a few seconds. Subsequent runs use the cached binary.
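One way to detect the "empty or minimal output" case is to count the non-whitespace characters in the converted file. The sketch below uses a hypothetical `is_near_empty` helper. Its default threshold of 50 characters is an arbitrary assumption, not a value defined by the converter.

```shell
# is_near_empty is a hypothetical check for the "empty or minimal output" case.
# It counts non-whitespace characters; the default threshold of 50 is an
# arbitrary assumption and should be tuned to the documents being converted.
is_near_empty() {
  local file="$1"
  local threshold="${2:-50}"
  local count
  # Arithmetic expansion strips any padding that wc adds on some platforms.
  count=$(( $(tr -d '[:space:]' < "$file" | wc -c) ))
  [ "$count" -lt "$threshold" ]
}

# Usage: is_near_empty OUTPUT.md && echo "possibly a scanned PDF" >&2
```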

License

Free for processing up to 1,000 documents per calendar month.

Commercial license required for:

processing over 1,000 documents/month
redistributing the binary
OEM/white-label use

Contact [email protected] for commercial licensing.
