From parallel
Extracts content from any URL (webpages, articles, PDFs, JS-heavy sites) via parallel-cli. Token-efficient fork context; prefer over built-in WebFetch.
How this skill is triggered — by the user, by Claude, or both
Slash command
/parallel:parallel-web-extract <url> [url2] [url3]<url> [url2] [url3]parallel:parallel-subagentThis skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Extract content from: $ARGUMENTS
Extract content from: $ARGUMENTS
Choose a short, descriptive filename based on the URL or content (e.g., vespa-docs, react-hooks-api). Use lowercase with hyphens, no spaces. Substitute it into the command inline — $FILENAME is a placeholder, not a shell variable.
parallel-cli extract "$ARGUMENTS" --json -o "/tmp/$FILENAME.json"
Concrete example:
parallel-cli extract "https://docs.parallel.ai" --json -o "/tmp/parallel-docs.json"
Note: -o always saves JSON. The extension must be .json.
Options if needed:
--objective "focus area" to focus extraction on a specific goal (also silences the "neither objective nor search_queries" warning that V1 emits when neither is set)-q "keyword" (repeatable) to prioritize keywords in excerpts--full-content to include the complete page body (for long articles, PDFs, or when excerpts may not capture what you need)--full-content-max-chars N to cap full-content size per result--no-excerpts to strip excerpts when you only want full contentIf the response has an errors field, an empty results array, or a 404/timeout for the URL, do NOT fabricate content. Tell the user the extraction failed, surface the upstream status, and suggest:
--full-content if excerpts came back empty but the page existsparallel-cli search to locate the current URL if the page was renamedReturn content as:
Then the extracted content verbatim, with these rules:
After the response, mention the output file path (/tmp/$FILENAME.json) so the user knows it's available for follow-up questions.
If parallel-cli is not found, install and authenticate:
/parallel:parallel-cli-setup
If parallel-cli extract returns 403, tell the user balance is likely required. Offer to run parallel-cli balance get, and if needed ask for explicit confirmation before running parallel-cli balance add <amount_cents>. Then retry the original extract command.
npx claudepluginhub parallel-web/parallel-agent-skills --plugin parallelExtracts clean markdown or text from URLs via the Tavily CLI. Handles JavaScript-rendered pages, supports query-focused chunking, and processes up to 20 URLs per call.
Extracts clean markdown from any URL, including JavaScript-rendered SPAs. Supports concurrent scraping, JS wait times, and content filtering.
Scrapes a single known URL into clean markdown, HTML, links, or structured JSON via the fastCRW CLI or MCP. Handles JavaScript-rendered SPAs automatically.