Skill

scrape-website

Scrape a single URL using anakin-cli and return clean markdown, structured JSON, or the full raw API response. Use when extracting content from a web page — an article, product page, documentation, or any URL.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/anakin:scrape-website

User invocable

Model invocable

Inline context

Default effort

Tool Access

This skill is limited to the following tools:

Bash(anakin scrape *), Bash(mkdir *), Bash(head *), Bash(wc *), Bash(jq *)

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Extracting content from a single web page — an article, product page, documentation, or any URL.

SKILL.md

82 lines · ~737 tokens

Stats

Stars: 0

Maintenance: Fair

Last Commit: Feb 18, 2026

Scrape a website

Trigger

Extracting content from a single web page — an article, product page, documentation, or any URL.

Workflow

1. Verify that anakin-cli is authenticated by running anakin status.
2. Create the output directory if it doesn't exist:
```
mkdir -p .anakin
```
3. Determine the output filename from the URL (use the domain and path):
- https://docs.react.dev/learn -> .anakin/docs.react.dev-learn.md
- https://example.com/pricing -> .anakin/example.com-pricing.md
4. Check whether $ARGUMENTS contains --browser. If the URL is a known SPA or JS-heavy site (React apps, Next.js, Angular, etc.), add --browser automatically.
5. Run the scrape:
```
anakin scrape "$ARGUMENTS" -o .anakin/<filename>.md
```
6. Check the length of the output file and read it incrementally:
```
wc -l .anakin/<filename>.md && head -80 .anakin/<filename>.md
```
7. Summarize the content for the user. If the page is long, highlight the key sections and offer to read specific parts.
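
The filename rule in the workflow (domain plus path, joined with hyphens) can be sketched as a small shell function. This is a minimal sketch, not part of anakin-cli: the name url_to_filename is hypothetical, and dropping the query string, fragment, and trailing slash is an assumption the skill does not spell out.

```shell
# Hypothetical helper (not part of anakin-cli): derive an output path from a URL.
url_to_filename() {
  url=$1
  # Drop the scheme (https://, http://) if present.
  rest=${url#*://}
  # Assumption: ignore any query string or fragment.
  rest=${rest%%[?#]*}
  # Assumption: ignore a single trailing slash.
  rest=${rest%/}
  # Join the domain and path segments with hyphens.
  name=$(printf '%s' "$rest" | tr '/' '-')
  printf '.anakin/%s.md\n' "$name"
}

url_to_filename "https://docs.react.dev/learn"
url_to_filename "https://example.com/pricing"
```

The two calls reproduce the example mappings listed in the workflow.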

Commands

```
# Clean readable text (default)
anakin scrape "<url>" -o .anakin/output.md

# Structured data
anakin scrape "<url>" --format json -o .anakin/output.json

# Full API response with HTML and metadata
anakin scrape "<url>" --format raw -o .anakin/output.json

# JavaScript-heavy or single-page app sites
anakin scrape "<url>" --browser -o .anakin/output.md

# Geo-targeted scraping
anakin scrape "<url>" --country gb -o .anakin/output.md

# Custom timeout for slow pages
anakin scrape "<url>" --timeout 300 -o .anakin/output.md
```

Options

- --browser: Use a headless browser (for JS-rendered or SPA pages)
- --country <code>: Country code for geo-targeting (default: us)
- --format <fmt>: Output format: markdown (default), json, or raw
- --timeout <seconds>: Max wait time in seconds (default: 120)
- -o, --output <path>: Save output to a file
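
One way to combine these options is a thin wrapper that picks the file extension from the format and passes any remaining flags through unchanged. This is a sketch under assumptions: the function name scrape_page and the fixed output name .anakin/output.<ext> are illustrative, and only the flags listed above are used.

```shell
# Hypothetical wrapper (not part of anakin-cli).
# Usage: scrape_page URL FORMAT [extra anakin flags...]
scrape_page() {
  url=$1
  format=$2
  shift 2
  # markdown output goes to .md; json and raw both produce JSON files.
  case $format in
    markdown) ext=md ;;
    json|raw) ext=json ;;
    *) echo "unknown format: $format" >&2; return 2 ;;
  esac
  mkdir -p .anakin
  # The URL stays quoted so ?, & and # reach anakin untouched.
  anakin scrape "$url" --format "$format" "$@" -o ".anakin/output.$ext"
}

# Example call (requires an installed, authenticated anakin-cli):
#   scrape_page "https://example.com/pricing?plan=pro" json --country gb --timeout 300
```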

Async behavior

The scrape command starts a job, then polls every 3 seconds until the job completes.

Guardrails

- Always quote URLs to prevent shell interpretation of ?, &, and # characters.
- Default to markdown format unless the user asks for structured data or raw output.
- Use --browser only when the user requests it, the site is known to be JS-heavy, or a standard scrape returns empty or incomplete content.
- On 429 (rate limit) errors, wait before retrying rather than looping immediately.
- Always use -o to save output to a file rather than flooding the terminal.
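
The "wait before retrying" guardrail can be sketched as a generic exponential-backoff loop. This is an illustration only: retry_with_backoff is a hypothetical helper, and it assumes anakin exits with a non-zero status when a request is rate limited, which the skill does not confirm. It retries on any failure, not specifically on 429.

```shell
# Hypothetical helper (not part of anakin-cli).
# Usage: retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY_SECONDS command [args...]
retry_with_backoff() {
  max=$1
  delay=$2
  shift 2
  attempt=1
  while :; do
    # Run the command; stop as soon as it succeeds.
    "$@" && return 0
    # Give up once the attempt budget is spent.
    if [ "$attempt" -ge "$max" ]; then
      return 1
    fi
    # Wait, then double the delay for the next attempt.
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# Example call (assumes a non-zero exit status on rate limiting):
#   retry_with_backoff 4 5 anakin scrape "https://example.com/pricing" -o .anakin/example.com-pricing.md
```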

Output

- Scraped content in the requested format.
- The file path where results were saved.
- If the scrape fails or returns empty content, retry with --browser. If it still fails, check anakin status.
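
The fallback described above can be sketched as a function that tries a standard scrape first and retries with --browser only if the command fails or the output file is empty. The name scrape_with_fallback is hypothetical, and treating an empty file as "returns empty" is an assumption about how such a failure shows up.

```shell
# Hypothetical helper (not part of anakin-cli).
# Usage: scrape_with_fallback URL OUTPUT_PATH
scrape_with_fallback() {
  url=$1
  out=$2
  # First attempt: standard scrape. [ -s FILE ] is true only when the
  # file exists and is non-empty.
  if anakin scrape "$url" -o "$out" && [ -s "$out" ]; then
    return 0
  fi
  # Second attempt: headless browser for JS-rendered pages.
  anakin scrape "$url" --browser -o "$out" && [ -s "$out" ]
}

# Example call; if it returns non-zero, run anakin status next:
#   scrape_with_fallback "https://docs.react.dev/learn" .anakin/docs.react.dev-learn.md
```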
