From anakin
Scrape a single URL using anakin-cli and return clean markdown, structured JSON, or the full raw API response. Use when extracting content from a web page — an article, product page, documentation, or any URL.
Slash command: /anakin:scrape-website
Extracting content from a single web page — an article, product page, documentation, or any URL.
1. Run anakin status.
2. Create the output directory: mkdir -p .anakin
3. Derive the output filename from the URL:
   https://docs.react.dev/learn -> .anakin/docs.react.dev-learn.md
   https://example.com/pricing -> .anakin/example.com-pricing.md
4. Check whether $ARGUMENTS contains --browser. If the URL is a known SPA/JS-heavy site (React apps, Next.js, Angular, etc.), add --browser automatically.
5. Run the scrape: anakin scrape "$ARGUMENTS" -o .anakin/<filename>.md
6. Inspect the result: wc -l .anakin/<filename>.md && head -80 .anakin/<filename>.md
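The URL-to-filename mapping is only shown through two examples, so the exact rule is not stated. A minimal bash sketch of one rule consistent with both examples follows. The function name url_to_filename is hypothetical, and dropping the query string, fragment, and a trailing slash are assumptions rather than documented behavior.

```shell
# Hypothetical helper: derive an output path under .anakin/ from a URL.
# The rule is inferred from the two examples in this skill, not from anakin-cli itself.
url_to_filename() {
  local url="$1"
  local name="${url#*://}"   # drop the scheme (https://, http://)
  name="${name%%\?*}"        # assumption: drop any query string
  name="${name%%\#*}"        # assumption: drop any fragment
  name="${name%/}"           # assumption: drop one trailing slash
  name="${name//\//-}"       # turn path separators into hyphens
  printf '.anakin/%s.md\n' "$name"
}
```

With this helper in scope, the scrape step could be written as anakin scrape "$url" -o "$(url_to_filename "$url")".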
# Clean readable text (default)
anakin scrape "<url>" -o .anakin/output.md
# Structured data
anakin scrape "<url>" --format json -o .anakin/output.json
# Full API response with HTML and metadata
anakin scrape "<url>" --format raw -o .anakin/output.json
# JavaScript-heavy or single-page app sites
anakin scrape "<url>" --browser -o .anakin/output.md
# Geo-targeted scraping
anakin scrape "<url>" --country gb -o .anakin/output.md
# Custom timeout for slow pages
anakin scrape "<url>" --timeout 300 -o .anakin/output.md
--browser - Use headless browser (for JS-rendered/SPA pages)
--country <code> - Country code for geo-targeting (default: us)
--format <fmt> - Output format: markdown (default), json, or raw
--timeout <seconds> - Max wait time (default: 120)
-o, --output <path> - Save to file
Starts a job, then polls every 3s until complete.
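Since --browser is meant for pages where a standard scrape comes back empty or incomplete, one cautious pattern is to try the standard scrape first and retry with --browser only on failure. A minimal bash sketch follows, using only the flags listed above. The function name scrape_with_fallback is hypothetical, and treating an empty output file as the signal for "empty content" is an assumption; it does not detect content that is merely incomplete.

```shell
# Hypothetical helper: standard scrape first, headless-browser retry second.
scrape_with_fallback() {
  local url="$1" out="$2"
  # [ -s FILE ] is true only if the file exists and is non-empty.
  if anakin scrape "$url" -o "$out" && [ -s "$out" ]; then
    return 0
  fi
  # Assumption: a failed command or an empty file means the page needs JS rendering.
  anakin scrape "$url" --browser -o "$out" && [ -s "$out" ]
}
```

The function returns a non-zero status if the --browser retry also fails or still produces an empty file, which is the point at which checking anakin status becomes the next step.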
Quote URLs that contain ?, &, or # characters.
Use --browser only when a standard scrape returns empty or incomplete content.
Use -o to save output to a file rather than flooding the terminal.
If a scrape fails, retry with --browser. If it still fails, check anakin status.
Install the plugin: npx claudepluginhub anakin-inc/anakin-claude-plugin --plugin anakin