LinkedIn prospecting pipeline knowledge — scraping companies and people with Crawl4AI, building org-charts via LLM, scoring decision makers, and interactive graph visualization. Use when the user asks about LinkedIn scraping, prospect workflows, JsonCssExtractionStrategy schemas for LinkedIn, geoUrn codes, browser profiles for LinkedIn, or the prospect wizard pipeline.
Slash command: /linkedin-prospect:linkedin-prospecting
The Prospect Wizard is a three-stage pipeline:
- Stage 1 (c4ai_discover.py): Scrapes LinkedIn company search results and employee pages using Crawl4AI's JsonCssExtractionStrategy with auto-generated CSS schemas
- Stage 2 (c4ai_insights.py): Generates embeddings, builds inter-company similarity graphs, infers org-charts via LLM, and scores decision makers
- Stage 3 (graph_view_template.html): Interactive graph UI with company clusters, org-chart drill-down, and AI chat

LinkedIn requires authentication. A persistent browser profile is created via:
crwl profiles
This opens a Chromium window where the user logs into LinkedIn. The profile name (e.g. profile_linkedin_uc) is reused across runs.
The scraper uses JsonCssExtractionStrategy.generate_schema() with sample HTML snippets and an LLM to create CSS extraction schemas. Schemas are cached in schemas/ and only regenerated if deleted. Two schemas exist:
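The caching rule above (reuse a schema file if it exists, regenerate only after it is deleted) can be sketched as follows. This is a minimal illustration and not the code in c4ai_discover.py: `load_or_generate_schema` and its `generate` callback are hypothetical names. In the real pipeline the callback would presumably wrap `JsonCssExtractionStrategy.generate_schema()` with a sample HTML snippet and an LLM configuration.

```python
import json
from pathlib import Path
from typing import Callable


def load_or_generate_schema(path: Path, generate: Callable[[], dict]) -> dict:
    """Return the schema cached at `path`; call `generate` only if it is missing.

    `generate` is a hypothetical stand-in for the LLM-backed schema
    generation step, so this sketch runs without network access.
    """
    if path.exists():
        # Cache hit: reuse the stored schema and skip the LLM call.
        return json.loads(path.read_text(encoding="utf-8"))

    # Cache miss: generate once, then persist for later runs.
    schema = generate()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(schema, indent=2), encoding="utf-8")
    return schema
```

Deleting a file such as schemas/company_card.json forces the next call to regenerate it, which matches the behaviour described above.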
- company_card.json: extracts handle, name, descriptor, about, followers from company search cards
- people_card.json: extracts profile_url, name, headline, followers, connection_degree, avatar_url from people cards

| Location | GeoUrn |
|---|---|
| Singapore | 103644278 |
| Malaysia | 102713980 |
| United States | 103644922 |
| United Kingdom | 102221843 |
| Australia | 101452733 |
Find more at: https://www.linkedin.com/search/results/companies/?geoUrn=XXX
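As a small illustration of how the codes in the table plug into the search URL pattern shown above, the sketch below builds a company search URL for a named location. The helper name `company_search_url` and the `GEO_URNS` mapping are hypothetical. Only the `geoUrn` query parameter from the URL above is used, and any further search parameters the real scraper adds are not shown in this document.

```python
from urllib.parse import urlencode

# GeoUrn codes copied from the table above.
GEO_URNS = {
    "Singapore": "103644278",
    "Malaysia": "102713980",
    "United States": "103644922",
    "United Kingdom": "102221843",
    "Australia": "101452733",
}

BASE_URL = "https://www.linkedin.com/search/results/companies/"


def company_search_url(location: str) -> str:
    """Build a company search URL for a known location (hypothetical helper)."""
    if location not in GEO_URNS:
        raise KeyError(f"No geoUrn known for {location!r}")
    return f"{BASE_URL}?{urlencode({'geoUrn': GEO_URNS[location]})}"


print(company_search_url("Singapore"))
# prints https://www.linkedin.com/search/results/companies/?geoUrn=103644278
```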
- companies.jsonl: one JSON object per line with handle, name, descriptor, about, followers, people_url, captured_at
- people.jsonl: one JSON object per line with profile_url, name, headline, followers, connection_degree, avatar_url, company_handle
- company_graph.json: nodes (companies) and edges (similarity, with embed_sim, industry_match, geo_overlap drivers)
- org_chart_<handle>.json: nodes (people with decision_score 0-1) and edges (reporting relationships)
- decision_makers.csv: flattened high-score contacts for outreach

Stage 2 uses LiteLLM, supporting any provider:
- openai/gpt-4.1
- gemini/gemini-2.0-flash
- anthropic/claude-sonnet-4-20250514

Pass --proxy http://user:pass@ip:port for residential proxy support.

Set C4AI_DEMO_DEBUG=1 or pass --debug to use bundled sample HTML snippets instead of live scraping. Stage 2 supports --stub to skip LLM calls and generate fake org-charts.
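How the scripts parse these switches is not shown here, so the following is only a sketch of one way the debug rule could be resolved: debug mode is on when either `--debug` is passed or `C4AI_DEMO_DEBUG` equals `1`. The function `parse_flags` is hypothetical, and combining `--debug` and `--stub` in a single parser is a simplification for illustration, since the flags belong to different stages.

```python
import argparse
import os
from typing import Mapping, Optional, Sequence


def parse_flags(argv: Sequence[str], env: Optional[Mapping[str, str]] = None) -> dict:
    """Resolve debug/stub switches from CLI arguments and the environment."""
    env = os.environ if env is None else env

    parser = argparse.ArgumentParser()
    parser.add_argument("--debug", action="store_true",
                        help="use bundled sample HTML instead of live scraping")
    parser.add_argument("--stub", action="store_true",
                        help="skip LLM calls and generate fake org-charts")
    args = parser.parse_args(list(argv))

    # Either the flag or the environment variable enables debug mode.
    debug = args.debug or env.get("C4AI_DEMO_DEBUG") == "1"
    return {"debug": debug, "stub": args.stub}
```

Passing `env` explicitly keeps the function easy to exercise without mutating the real process environment.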
Python dependencies: crawl4ai, litellm, sentence-transformers, pandas, numpy, scikit-learn, rich
- /li-discover: Run Stage 1 only (scrape)
- /li-insights: Run Stage 2 only (analyze)
- /li-prospect: Full pipeline (Stage 1 + Stage 2)

npx claudepluginhub theophiluschinomona/linkedin-prospect-plugin --plugin linkedin-prospect