From text-utils
Use when fetching web content for analysis, summarization, or reference — especially when context window efficiency matters. Triggers on "fetch this page", "get the content from", "read this URL", "summarize this article", or when WebFetch would return bloated HTML.
Slash command
/text-utils:fetch-markdown
Get clean markdown from a URL. Uses less context than WebFetch by stripping navigation, scripts, and chrome.
Try these in order. Stop at the first one that returns clean content.
All curl calls use a browser user agent to avoid bot-blocking:
UA="Mozilla/5.0 (compatible)"
Step 1: markdown.new. Returns clean markdown. No dependencies.
curl -sL -A "$UA" "https://markdown.new/$URL"
If it returns an error or is blocked, move to step 2.
Step 2: Jina (r.jina.ai). Handles JavaScript rendering, returns clean markdown. No dependencies.
curl -sL -A "$UA" "https://r.jina.ai/$URL"
If it returns an error or is blocked, move to step 3. Between the two proxies, most domains are covered — they tend to fail on different sites.
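Both proxy calls share one shape: the target URL is appended to the proxy's base URL. A minimal sketch of that pattern follows. `proxy_url` and `fetch_via_proxy` are hypothetical helper names, not part of the skill, and the `-f` flag is an addition here so that HTTP errors produce a non-zero exit status.

```shell
# Browser-like user agent, as used by every curl call in this skill.
UA="Mozilla/5.0 (compatible)"

# proxy_url NAME URL: print the proxied address for one of the two proxies.
# (Hypothetical helper; "markdown.new" and "jina" are labels chosen for this sketch.)
proxy_url() {
  case "$1" in
    markdown.new) printf '%s\n' "https://markdown.new/$2" ;;
    jina)         printf '%s\n' "https://r.jina.ai/$2" ;;
    *)            echo "unknown proxy: $1" >&2; return 1 ;;
  esac
}

# fetch_via_proxy NAME URL: fetch the page through the chosen proxy.
# -f (an assumption added here) makes curl fail on HTTP errors so a caller can move on.
fetch_via_proxy() {
  target=$(proxy_url "$1" "$2") || return 1
  curl -fsSL -A "$UA" "$target"
}
```

For example, `fetch_via_proxy markdown.new "$URL" || fetch_via_proxy jina "$URL"` tries the second proxy only when the first one fails.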
Step 3: trafilatura. Purpose-built article extractor. Strips nav, ads, and chrome. Returns the article text.
uvx trafilatura -u "$URL"
Or if installed globally: trafilatura --URL "$URL". Install with pipx install trafilatura.
Step 4: curl + pandoc. Downloads the full page and converts it. The output includes navigation, footers, and page chrome, so it is usable but not clean.
curl -sL -A "$UA" "$URL" | pandoc -f html -t markdown-raw_html-native_divs-native_spans --wrap=none
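Since this step keeps page chrome, a quick check on the output can show whether it is worth falling back further. The sketch below counts lines that open or close a pandoc fenced div (lines starting with `:::`) and flags output where they make up too large a share. `looks_like_chrome` and the 10% default threshold are hypothetical choices for illustration, not part of the skill.

```shell
# looks_like_chrome [PERCENT]: read markdown on stdin and exit 0 when more than
# PERCENT of the non-empty lines are fenced-div markers (":::"), a sign of page chrome.
# The default of 10 is an arbitrary illustrative threshold.
looks_like_chrome() {
  awk -v limit="${1:-10}" '
    NF { total++ }            # count non-empty lines
    /^[ \t]*:::/ { divs++ }   # count fenced-div open/close markers
    END { exit !(total > 0 && divs * 100 > total * limit) }
  '
}
```

A caller could save the pandoc output to a file and run `looks_like_chrome < page.md` before deciding whether to try trafilatura or lynx instead.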
Step 5: lynx. Last resort. lynx is not bundled with macOS, so it may need installing with a package manager such as Homebrew. Gets the article content but loses all markdown formatting (headers, links, emphasis become plain text).
lynx -dump -nolist "$URL"
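The "try these in order, stop at the first clean result" rule can be sketched as a small fallback loop. `first_success` and the `step_*` wrappers are hypothetical names introduced here. "Clean" is approximated as "exit status 0 and non-empty output", which is an assumption, and curl's `-f` flag is added so that HTTP errors count as failures.

```shell
UA="Mozilla/5.0 (compatible)"

# One wrapper per step; each takes the URL as $1 and prints markdown or text on stdout.
step_markdown_new() { curl -fsSL -A "$UA" "https://markdown.new/$1"; }
step_jina()         { curl -fsSL -A "$UA" "https://r.jina.ai/$1"; }
step_trafilatura()  { uvx trafilatura -u "$1"; }
step_pandoc()       { curl -fsSL -A "$UA" "$1" | pandoc -f html -t markdown-raw_html-native_divs-native_spans --wrap=none; }
step_lynx()         { lynx -dump -nolist "$1"; }

# first_success URL STEP...: run each step in order and print the output of the
# first one that exits 0 with non-empty output. Returns 1 if every step fails.
first_success() {
  url=$1; shift
  for step in "$@"; do
    if out=$("$step" "$url" 2>/dev/null) && [ -n "$out" ]; then
      printf '%s\n' "$out"
      return 0
    fi
  done
  return 1
}
```

Usage would look like `first_success "$URL" step_markdown_new step_jina step_trafilatura step_pandoc step_lynx`. A proxy that returns an error page with a 200 status would still pass this check, so the result is worth a glance.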
| Situation | Strategy |
|---|---|
| Blog post, article, documentation | Step 1 (markdown.new) or step 2 (Jina) |
| One proxy blocked or rate-limited | Try the other proxy |
| Both proxies blocked | Step 3 (trafilatura) |
| No Python tools available | Step 4 (curl + pandoc) |
| Nothing else works | Step 5 (lynx) |
| Need exact HTML fidelity | Don't use this skill — use WebFetch |
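Several rows of the table depend on which tools are installed. A minimal sketch of that check follows, using the POSIX `command -v` builtin. `have` and `local_steps` are hypothetical helper names, and the two proxy steps are left out because they only need curl and network access.

```shell
# have TOOL: succeed if TOOL can be found on PATH.
have() { command -v "$1" >/dev/null 2>&1; }

# local_steps: print the locally runnable fallback steps (3 to 5),
# in the same order the skill tries them.
local_steps() {
  if have uvx || have trafilatura; then echo "3 trafilatura"; fi
  if have curl && have pandoc; then echo "4 curl+pandoc"; fi
  if have lynx; then echo "5 lynx"; fi
  return 0
}
```

Running `local_steps` before fetching shows which fallbacks are actually available on the current machine.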
Requirements:
curl (standard on macOS/Linux)
pandoc (for step 4)
trafilatura (optional, for step 3: pipx install trafilatura)
lynx (for step 5; not bundled with macOS, install with a package manager such as Homebrew)
If the output is mostly ::: fenced divs and schema attributes, that's page chrome, not article content. Try trafilatura or lynx instead.
Install: npx claudepluginhub jackwillis/claude-plugins --plugin text-utils