From tavily
Extracts clean markdown or text from URLs via the Tavily CLI. Handles JavaScript-rendered pages, supports query-focused chunking, and processes up to 20 URLs per call.
Slash command
/tavily:tavily-extract
Extract clean markdown or text content from one or more URLs.
If tvly is not found on PATH, install it first:
curl -fsSL https://cli.tavily.com/install.sh | bash && tvly login
Do not skip this step or fall back to other tools.
See tavily-cli for alternative install methods and auth options.
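The PATH check described above can be sketched as a small guard that runs before any extraction. This is only an illustration: `have_cmd` and `ensure_tvly` are hypothetical helper names, not part of the Tavily CLI, and the install command is the one quoted above.

```sh
# Hypothetical helper: succeeds if the named command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Hypothetical wrapper around the install step quoted above.
# It does nothing when tvly is already installed.
ensure_tvly() {
  if have_cmd tvly; then
    return 0
  fi
  echo "tvly not found on PATH; installing..." >&2
  curl -fsSL https://cli.tavily.com/install.sh | bash && tvly login
}
```

Calling `ensure_tvly` once at the top of a script keeps the install step from being skipped.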
# Single URL
tvly extract "https://example.com/article" --json
# Multiple URLs
tvly extract "https://example.com/page1" "https://example.com/page2" --json
# Query-focused extraction (returns relevant chunks only)
tvly extract "https://example.com/docs" --query "authentication API" --chunks-per-source 3 --json
# JS-heavy pages
tvly extract "https://app.example.com" --extract-depth advanced --json
# Save to file
tvly extract "https://example.com/article" -o article.md
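Because a single call processes at most 20 URLs, a longer list has to be split across several calls. The following is a minimal sketch, assuming URLs arrive one per line on standard input. `extract_in_batches` and the `TVLY_BIN` override are hypothetical names introduced here; the override only exists so a stub can stand in for `tvly` when testing.

```sh
# Hypothetical helper: read URLs from stdin and call tvly once per batch of 20.
extract_in_batches() {
  set --    # reuse the positional parameters as the current batch
  while IFS= read -r url || [ -n "$url" ]; do
    [ -n "$url" ] || continue    # skip blank lines
    set -- "$@" "$url"
    if [ "$#" -eq 20 ]; then
      # </dev/null keeps the command from consuming the URL list on stdin
      "${TVLY_BIN:-tvly}" extract "$@" --json </dev/null
      set --
    fi
  done
  # Flush the final, partially filled batch.
  if [ "$#" -gt 0 ]; then
    "${TVLY_BIN:-tvly}" extract "$@" --json </dev/null
  fi
}
```

For example, `extract_in_batches < urls.txt > results.json`, where `urls.txt` is an assumed input file. Each call writes its own output, so the combined file would hold one result per batch rather than a single merged document.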
| Option | Description |
|---|---|
| --query | Rerank chunks by relevance to this query |
| --chunks-per-source | Chunks per URL (1-5, requires --query) |
| --extract-depth | basic (default) or advanced (for JS pages) |
| --format | markdown (default) or text |
| --include-images | Include image URLs |
| --timeout | Max wait time (1-60 seconds) |
| -o, --output | Save output to file |
| --json | Structured JSON output |
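The constraints in the option table (chunks between 1 and 5 and only with a query, timeout between 1 and 60 seconds) can be checked before a call is made. The sketch below is a hypothetical pre-flight check; `is_uint` and `check_extract_opts` are illustrative names, and the CLI may well perform its own validation.

```sh
# Hypothetical helper: succeeds only for a non-empty string of digits.
is_uint() {
  case $1 in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Hypothetical pre-flight check based on the option table above.
# Usage: check_extract_opts QUERY CHUNKS TIMEOUT   (pass "" for an unset value)
check_extract_opts() {
  co_query=$1
  co_chunks=$2
  co_timeout=$3
  if [ -n "$co_chunks" ]; then
    if [ -z "$co_query" ]; then
      echo "--chunks-per-source requires --query" >&2
      return 1
    fi
    if ! is_uint "$co_chunks" || [ "$co_chunks" -lt 1 ] || [ "$co_chunks" -gt 5 ]; then
      echo "--chunks-per-source must be an integer from 1 to 5" >&2
      return 1
    fi
  fi
  if [ -n "$co_timeout" ]; then
    if ! is_uint "$co_timeout" || [ "$co_timeout" -lt 1 ] || [ "$co_timeout" -gt 60 ]; then
      echo "--timeout must be an integer from 1 to 60" >&2
      return 1
    fi
  fi
  return 0
}
```

A script might run `check_extract_opts "authentication API" 3 30 || exit 1` before building the `tvly extract` command line.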
| Depth | When to use |
|---|---|
| basic | Simple pages, fast; try this first |
| advanced | JS-rendered SPAs, dynamic content, tables |
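The "try basic first" guidance in the table above can be sketched as a two-pass wrapper. This is an illustration under stated assumptions: "content is missing" is approximated by the output file being smaller than a byte threshold, which is a heuristic chosen here and not something the CLI defines. `extract_with_fallback`, the default threshold of 200 bytes, and the `TVLY_BIN` override are all hypothetical.

```sh
# Hypothetical wrapper: extract with basic depth, retry with advanced
# when the saved output looks empty or too small.
# Usage: extract_with_fallback URL OUTPUT_FILE [MIN_BYTES]
extract_with_fallback() {
  fb_url=$1
  fb_out=$2
  fb_min_bytes=${3:-200}    # assumed threshold; tune per site
  rm -f "$fb_out"           # avoid measuring a stale file from an earlier run

  # First pass: basic depth. A failure here just falls through to the size check.
  "${TVLY_BIN:-tvly}" extract "$fb_url" --extract-depth basic -o "$fb_out" || true

  if [ -f "$fb_out" ]; then
    fb_size=$(( $(wc -c < "$fb_out") ))
  else
    fb_size=0
  fi

  # Second pass: advanced depth, only when the first result looks incomplete.
  if [ "$fb_size" -lt "$fb_min_bytes" ]; then
    echo "basic output is ${fb_size} bytes; retrying with advanced" >&2
    "${TVLY_BIN:-tvly}" extract "$fb_url" --extract-depth advanced -o "$fb_out"
  fi
}
```

For example, `extract_with_fallback "https://app.example.com" page.md`. A size check is crude: a page that renders only a loading shell may still exceed the threshold, so a content-specific check (such as searching for an expected heading) could be more reliable.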
- Use --query + --chunks-per-source to get only relevant content instead of full pages.
- Try basic first, fall back to advanced if content is missing.
- Set --timeout for slow pages (up to 60s).
- If search results already contain the content you need (via --include-raw-content), skip the extract step.