From Firecrawl Workflows
Ingests public or authenticated documentation portals using Firecrawl browser. Handles JS-heavy pages, login-gated content, paginated help centers, and structured markdown/JSON extraction.
Slash command
/firecrawl-workflows:firecrawl-knowledge-ingest
Use this when a docs portal needs browser navigation, auth, pagination, or JS rendering.
Infer the portal URL, output format, auth needs, and page limit from context. If the portal is clear, proceed immediately.
Ask at most 1-3 concise questions only if blocked, such as the portal URL, whether authentication is required, or the desired output format.
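The infer-first, ask-only-if-blocked rule above can be sketched as follows. This is a minimal illustration, not part of the skill itself: the `IngestInputs` class, its field names, and the `blocking_questions` helper are hypothetical. It also assumes one reading of the text, namely that a known portal URL is enough to proceed and that the remaining values can be settled during extraction.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_QUESTIONS = 3  # the skill asks at most 1-3 questions


@dataclass
class IngestInputs:
    """Hypothetical container for values inferred from the conversation."""
    url: Optional[str] = None
    output_format: Optional[str] = None   # "json", "markdown", or "merged"
    requires_auth: Optional[bool] = None
    max_pages: Optional[int] = None


def blocking_questions(inputs: IngestInputs) -> List[str]:
    """Return the questions to ask, or an empty list if work can start."""
    # Assumption: a clear portal URL means "proceed immediately".
    if inputs.url is not None:
        return []
    questions = ["Which portal URL should be ingested?"]
    if inputs.requires_auth is None:
        questions.append("Does the portal require authentication?")
    if inputs.output_format is None:
        questions.append("Which output format do you want: json, markdown, or merged?")
    return questions[:MAX_QUESTIONS]
```

With a URL already in context, `blocking_questions(IngestInputs(url="https://docs.example.com"))` returns an empty list, so the workflow starts without asking anything.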
Use Firecrawl browser to:
Try Firecrawl map as a supplement for public URLs, but use browser navigation for auth-gated or JS-heavy content.
# Knowledge Ingest: [Portal]
## Summary
[Pages extracted, sections covered, limitations]
## Output
[JSON/markdown/merged file path or content]
## Sections
[Section names and article counts]
## Failed Or Restricted Pages
[Any access/loading issues]
## Sources
[URLs extracted]
## Rerun Inputs
workflow: firecrawl-knowledge-ingest
url: [portal url]
format: [json/markdown/merged]
max_pages: [number]
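As a rough sketch of how the Rerun Inputs block above might be produced, the helper below renders the four keys from the template as `key: value` lines. The function name `render_rerun_inputs`, its parameters, and the example URL are hypothetical; only the keys and the workflow name come from the template.

```python
def render_rerun_inputs(url: str, output_format: str, max_pages: int) -> str:
    """Render the Rerun Inputs block as plain `key: value` lines."""
    lines = [
        "workflow: firecrawl-knowledge-ingest",
        f"url: {url}",
        f"format: {output_format}",   # json, markdown, or merged
        f"max_pages: {max_pages}",
    ]
    return "\n".join(lines)


print(render_rerun_inputs("https://docs.example.com", "markdown", 50))
```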
Use source, url, extractedAt, totalArticles, and sections[] with article title, url, section, content, and metadata.
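One way to picture the JSON shape described above is the sketch below. The exact nesting is an assumption: it treats `sections[]` as a list of sections, each with a hypothetical `name` and an `articles` list whose entries carry the listed fields. The `source` value, the example URLs, and the helper name `build_output` are placeholders, not part of the skill.

```python
import json
from datetime import datetime, timezone


def build_output(portal_url, articles):
    """Group flat article dicts by section and wrap them in the top-level fields."""
    sections = {}
    for article in articles:
        sections.setdefault(article["section"], []).append(article)
    return {
        "source": "firecrawl-browser",  # placeholder value
        "url": portal_url,
        "extractedAt": datetime.now(timezone.utc).isoformat(),
        "totalArticles": len(articles),
        "sections": [
            {"name": name, "articles": items} for name, items in sections.items()
        ],
    }


example = build_output(
    "https://docs.example.com",
    [
        {
            "title": "Getting started",
            "url": "https://docs.example.com/start",
            "section": "Basics",
            "content": "# Getting started\n...",
            "metadata": {},
        }
    ],
)
print(json.dumps(example, indent=2))
```

Keeping `section` on each article as well as on the enclosing group is redundant, but it lets the articles be flattened again without losing their grouping.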
npx claudepluginhub firecrawl/firecrawl-workflows --plugin firecrawl-workflows