From benedictking-skills
Scrapes web pages, extracts structured data, takes screenshots, parses PDFs, batch-scrapes URLs, and crawls sites using the Firecrawl API.
Slash command
/benedictking-skills:firecrawl-scraper
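Choose Firecrawl endpoint based on user intent:
- scrape: fetch a single page (markdown, structured JSON, screenshot, or PDF)
- crawl: crawl a site (asynchronous; add --wait to block until the job finishes)
- map: discover the URLs of a site
- batch-scrape: scrape a list of URLs (asynchronous; add --wait to block until the job finishes)
This skill uses a two-phase architecture: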
Use the Task tool to invoke the firecrawl-fetcher sub-skill, passing the command and the JSON payload on stdin:
Task parameters:
- subagent_type: Bash
- description: "Call Firecrawl API"
- prompt: cat <<'JSON' | node scripts/firecrawl-api.cjs <scrape|crawl|map|batch-scrape|crawl-status|batch-status|batch-errors> [--wait]
{ ...payload... }
JSON
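The prompt handed to the Task tool is just the heredoc pipeline shown above. A minimal sketch of composing it, assuming a hypothetical helper named buildFetcherPrompt (not part of the skill); only the command names and the --wait flag are taken from the usage line above, and the helper is limited to the four commands that take a JSON payload in the examples:

```javascript
// Hypothetical helper: composes the Task prompt string for the firecrawl-fetcher sub-skill.
// It only builds text; nothing here calls the Firecrawl API.
const PAYLOAD_COMMANDS = ["scrape", "crawl", "map", "batch-scrape"];

function buildFetcherPrompt(command, payload, { wait = false } = {}) {
  if (!PAYLOAD_COMMANDS.includes(command)) {
    throw new Error(`Unknown payload command: ${command}`);
  }
  const flag = wait ? " --wait" : "";
  // The quoted 'JSON' delimiter keeps the shell from expanding $ or backticks in the payload.
  return [
    `cat <<'JSON' | node scripts/firecrawl-api.cjs ${command}${flag}`,
    JSON.stringify(payload, null, 2),
    "JSON",
  ].join("\n");
}

console.log(
  buildFetcherPrompt("scrape", { url: "https://example.com", formats: ["markdown"] })
);
```

Serializing with JSON.stringify rather than concatenating strings means the payload is always valid JSON by the time it reaches stdin.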
cat <<'JSON' | node scripts/firecrawl-api.cjs scrape
{
  "url": "https://example.com",
  "formats": ["markdown", "links"],
  "onlyMainContent": true,
  "includeTags": [],
  "excludeTags": ["nav", "footer"],
  "waitFor": 0,
  "timeout": 30000
}
JSON
Available formats:
- "markdown", "html", "rawHtml", "links", "images", "summary"
- {"type": "json", "prompt": "Extract product info", "schema": {...}}
- {"type": "screenshot", "fullPage": true, "quality": 85}
cat <<'JSON' | node scripts/firecrawl-api.cjs scrape
{
  "url": "https://example.com",
  "formats": ["markdown"],
  "actions": [
    {"type": "wait", "milliseconds": 2000},
    {"type": "click", "selector": "#load-more"},
    {"type": "wait", "milliseconds": 1000},
    {"type": "scroll", "direction": "down", "amount": 500}
  ]
}
JSON
Available actions:
wait, click, write, press, scroll, screenshot, scrape, executeJavascript
cat <<'JSON' | node scripts/firecrawl-api.cjs scrape
{
  "url": "https://example.com/document.pdf",
  "formats": ["markdown"],
  "parsers": ["pdf"]
}
JSON
cat <<'JSON' | node scripts/firecrawl-api.cjs scrape
{
  "url": "https://example.com/product",
  "formats": [
    {
      "type": "json",
      "prompt": "Extract product information",
      "schema": {
        "type": "object",
        "properties": {
          "name": {"type": "string"},
          "price": {"type": "number"},
          "description": {"type": "string"}
        },
        "required": ["name", "price"]
      }
    }
  ]
}
JSON
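Because the json format returns whatever the model managed to extract, it can be worth checking the result against the schema before relying on it. A rough sketch, assuming a hypothetical checkExtraction helper that only understands flat object schemas whose type keywords happen to match JavaScript's typeof ("string", "number", "boolean"); a real JSON Schema validator covers far more, and the sample objects are made up for illustration:

```javascript
// Schema copied from the example above.
const productSchema = {
  type: "object",
  properties: {
    name: { type: "string" },
    price: { type: "number" },
    description: { type: "string" },
  },
  required: ["name", "price"],
};

// Hypothetical helper: returns a list of human-readable problems (empty when the value passes).
function checkExtraction(schema, value) {
  const problems = [];
  for (const key of schema.required ?? []) {
    if (!(key in value)) problems.push(`missing required field: ${key}`);
  }
  for (const [key, rule] of Object.entries(schema.properties ?? {})) {
    // Only primitive checks; "integer", "array" and nested objects are out of scope here.
    if (key in value && typeof value[key] !== rule.type) {
      problems.push(`${key} should be ${rule.type}, got ${typeof value[key]}`);
    }
  }
  return problems;
}

console.log(checkExtraction(productSchema, { name: "Widget", price: "9.99" }));
```

A price that comes back as a string, as in the sample call, is the kind of mismatch this catches early.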
cat <<'JSON' | node scripts/firecrawl-api.cjs crawl
{
  "url": "https://docs.example.com",
  "formats": ["markdown"],
  "includePaths": ["^/docs/.*"],
  "excludePaths": ["^/blog/.*"],
  "maxDiscoveryDepth": 3,
  "limit": 100,
  "allowExternalLinks": false,
  "allowSubdomains": false
}
JSON
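includePaths and excludePaths are regular expressions, so it helps to try the patterns locally before spending credits on a crawl. A small sketch of how such patterns partition URLs, assuming they are tested against the URL pathname and that exclusion wins over inclusion (both are assumptions made for illustration; the Firecrawl documentation defines the exact matching rules). wouldCrawl is a hypothetical helper, not part of the skill:

```javascript
// Hypothetical helper: predicts whether a URL passes the path filters.
// Assumption: patterns match against the pathname, and excludePaths takes precedence.
function wouldCrawl(url, { includePaths = [], excludePaths = [] } = {}) {
  const path = new URL(url).pathname;
  if (excludePaths.some((pattern) => new RegExp(pattern).test(path))) return false;
  if (includePaths.length === 0) return true; // no include filter: everything not excluded passes
  return includePaths.some((pattern) => new RegExp(pattern).test(path));
}

const filters = { includePaths: ["^/docs/.*"], excludePaths: ["^/blog/.*"] };
for (const url of [
  "https://docs.example.com/docs/intro",
  "https://docs.example.com/blog/launch",
  "https://docs.example.com/pricing",
]) {
  console.log(url, wouldCrawl(url, filters));
}
```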
cat <<'JSON' | node scripts/firecrawl-api.cjs crawl --wait
{
  "url": "https://docs.example.com",
  "formats": ["markdown"],
  "limit": 100
}
JSON
cat <<'JSON' | node scripts/firecrawl-api.cjs map
{
  "url": "https://example.com",
  "search": "documentation",
  "limit": 5000
}
JSON
cat <<'JSON' | node scripts/firecrawl-api.cjs batch-scrape
{
  "urls": [
    "https://example.com/page1",
    "https://example.com/page2",
    "https://example.com/page3"
  ],
  "formats": ["markdown"]
}
JSON
Returns async job response: { "success": true, "id": "<batch-id>", "url": "..." }
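The id in that response is what the status commands further down take as their argument. A tiny sketch of turning the response into the follow-up command, assuming a hypothetical statusCommand helper; the response shape and the command names are the ones shown in this document, and the sample id is made up:

```javascript
// Hypothetical helper: builds the status command for an accepted async job.
function statusCommand(kind, response, { wait = false } = {}) {
  if (!response.success || !response.id) {
    throw new Error("Job was not accepted: " + JSON.stringify(response));
  }
  const sub = kind === "crawl" ? "crawl-status" : "batch-status";
  return `node scripts/firecrawl-api.cjs ${sub} ${response.id}${wait ? " --wait" : ""}`;
}

// "abc123" is a placeholder id for illustration.
console.log(statusCommand("batch", { success: true, id: "abc123", url: "..." }, { wait: true }));
```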
cat <<'JSON' | node scripts/firecrawl-api.cjs batch-scrape --wait
{
  "urls": [
    "https://example.com/page1",
    "https://example.com/page2"
  ],
  "formats": ["markdown"]
}
JSON
node scripts/firecrawl-api.cjs batch-status <batch-id>
Wait for completion:
node scripts/firecrawl-api.cjs batch-status <batch-id> --wait
node scripts/firecrawl-api.cjs batch-errors <batch-id>
node scripts/firecrawl-api.cjs crawl-status <crawl-id>
Wait for completion:
node scripts/firecrawl-api.cjs crawl-status <crawl-id> --wait
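Conceptually, --wait amounts to polling the status endpoint until the job leaves its in-progress state. A minimal sketch of such a loop, assuming a caller-supplied fetchStatus function and that terminal states appear as "completed" or "failed" in the status field (the state names are an assumption here; check the API reference). pollUntilDone and its options are hypothetical and not the script's actual implementation:

```javascript
// Hypothetical polling loop; fetchStatus is any async function returning the parsed status response.
async function pollUntilDone(fetchStatus, { intervalMs = 2000, maxAttempts = 150 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const res = await fetchStatus();
    // Assumed terminal states; anything else is treated as "still running".
    if (res.status === "completed" || res.status === "failed") return res;
    if (attempt < maxAttempts) {
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Job still running after ${maxAttempts} attempts`);
}
```

Capping the number of attempts keeps a stuck job from blocking the caller indefinitely; the limits shown are arbitrary defaults.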
Scrape parameters:
- onlyMainContent: Extract only main content (default: true)
- includeTags: CSS selectors to include
- excludeTags: CSS selectors to exclude
- waitFor: Wait time before scraping (ms)
- maxAge: Cache duration (default: 48 hours)
Actions:
- wait: Wait for a specified time
- click: Click an element by selector
- write: Input text into a field
- press: Press a keyboard key
- scroll: Scroll the page
- executeJavascript: Run custom JS
Crawl parameters:
- includePaths: Regex patterns to include
- excludePaths: Regex patterns to exclude
- maxDiscoveryDepth: Maximum crawl depth
- limit: Maximum pages to crawl
- allowExternalLinks: Follow external links
- allowSubdomains: Follow subdomains
Two ways to configure the API key (priority: environment variable > .env):
- Environment variable: FIRECRAWL_API_KEY
- .env file: place the key in .env (can be copied from .env.example)
All endpoints return JSON with:
- success: Boolean indicating success
- data: Extracted content (format depends on endpoint)
Asynchronous jobs:
- crawl: use crawl-status (or GET /v2/crawl/{id}) to check status
- batch-scrape: returns an async job ({ success, id, url }); use batch-status (or GET /v2/batch/scrape/{id}) to poll status
- Status responses include status, total, completed, creditsUsed, expiresAt, next, data[]
- next: pagination URL for large/incomplete results (script returns raw response; follow manually if needed)
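Since the script returns the raw response, collecting a large result means following next until it is absent. A sketch of that loop, assuming a caller-supplied getPage function that fetches a URL (with the API key attached) and returns the parsed JSON; collectAllPages is a hypothetical helper, and the data and next fields are the ones listed above:

```javascript
// Hypothetical helper: gathers data[] across pages by following `next`.
// getPage(url) must return a parsed status response ({ data, next, ... }).
async function collectAllPages(firstPage, getPage, maxPages = 100) {
  const items = [...(firstPage.data ?? [])];
  let next = firstPage.next;
  let pages = 1;
  // maxPages is a safety cap so a repeating `next` cannot loop forever.
  while (next && pages < maxPages) {
    const page = await getPage(next);
    items.push(...(page.data ?? []));
    next = page.next;
    pages++;
  }
  return items;
}
```

Passing the fetcher in as a parameter keeps the loop independent of how requests are authenticated.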