Skill

crw-scrape

Scrapes a single known URL into clean markdown, HTML, links, or structured JSON via the fastCRW CLI or MCP. Handles JavaScript-rendered SPAs automatically.

automation

cli-tools

Popularity

Stars

190

Forks

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/crw:crw-scrape

User invocable

Model invocable

Inline context

Default effort

Tool Access

This skill is limited to the following tools:

Bash(crw:*)Bash(curl:*)Read

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

- You have one (or a handful of) **known URLs** and want their content.

SKILL.md

89 lines · ~989 tokens

Stats

LanguageRust

Stars190

Forks16

MaintenanceExcellent

Last CommitJun 18, 2026

Actions

View Source View Plugin View on GitHub View README

crw-scrape — single-page extraction

When to use

You have one (or a handful of) known URLs and want their content.
Step 2 in the crw ladder: if you don't have a URL yet, go to crw-search (step 1) first. For many pages under a site, use crw-crawl (step 4). For a local PDF, use crw-parse (step 5).
JS-heavy page? You usually don't need anything special — crw auto-detects and renders. This is a crw advantage: no separate "interact"/browser step.

Quick start

CLI (binary on PATH):

crw scrape "https://example.com"                       # → markdown to stdout
crw scrape "https://example.com" --format json -o page.json
crw scrape "https://example.com" --js --css "article.main"
crw scrape "https://example.com" --format links -o .crw/links.txt

MCP (inside an agent harness):

crw_scrape(url="https://example.com", formats=["markdown"], onlyMainContent=true)

REST (drop-in for Firecrawl SDKs — just swap the base URL):

curl -X POST "$CRW_API_URL/v1/scrape" -H "Authorization: Bearer $CRW_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"url":"https://example.com","formats":["markdown"],"onlyMainContent":true}'

Options

Need	CLI flag	MCP / REST field
Output format	`--format markdown\|html\|rawhtml\|text\|links\|json`	`formats: [...]`
Strip nav/footer/sidebar	(on by default; `--raw` to disable)	`onlyMainContent: true`
Force JS rendering	`--js`	`renderJs: true` (null = auto)
Wait after load	—	`waitFor: 2000` (ms)
Keep only selectors	`--css "article"` / `--xpath …`	`includeTags: ["article"]`
Drop selectors	—	`excludeTags: ["nav","footer"]`
Pick renderer	—	`renderer: "auto\|lightpanda\|chrome\|chrome_proxy\|playwright"` (`auto` is default)
Save to file	`-o FILE`	(write the response yourself)
Structured JSON	`--extract '<schema>'`	`extract: {schema: {...}}` — see crw-extract
Use a proxy	`--proxy URL` `--stealth`	`proxy`, `proxyRotation`, `stealth`

Tips

Quote URLs — ? and & are shell-special. Always wrap in quotes.
Multiple URLs = run them concurrently. Fire several crw scrape … & and wait, or issue parallel MCP calls.
Blank page / loading skeleton? Add --js / renderJs: true, optionally a waitFor. crw's auto-detect covers most SPAs without it.
Don't dump huge pages into context. Write to .crw/, then grep/head. MCP truncates to ~15 000 chars (maxLength: 0 to opt out).
Want a typed object, not prose? --format json returns the raw full-page object (metadata + content), not schema-extracted data. For structured extraction against a schema use --extract '<schema>' — this calls an LLM and requires a configured LLM provider. See the dedicated crw-extract skill.
Source is a file, not a URL? Use crw-parse instead.

crw-scrape

Popularity

Invocation

Tool Access

Context Preview

SKILL.md

crw-scrape

Popularity

Invocation

Tool Access

Context Preview

SKILL.md

crw-scrape — single-page extraction

When to use

Quick start

Options

Tips

See also

Similar Skills

crw-scrape — single-page extraction

When to use

Quick start

Options

Tips

See also

Similar Skills