Skill

build-list

This skill should be used when the user asks to "build a prospect list", "find prospects", "gather leads", or "explore targets", or otherwise wants to build a prospect list. It collects prospect candidates via web search based on BUSINESS.md and SALES_STRATEGY.md and registers them in the DB.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/lead-ace:build-list <project-id> [target-count=30]

User invocable

Model invocable

Inline context

Default effort

Argument hint: <project-id> [target-count=30]

Tool Access

This skill is limited to the following tools:

Bash
Read
Agent
WebSearch
WebFetch
mcp__plugin_lead-ace_api__add_prospects
mcp__plugin_lead-ace_api__check_prospect_dedup
mcp__plugin_lead-ace_api__get_outbound_targets
mcp__plugin_lead-ace_api__get_document
mcp__plugin_lead-ace_api__save_document
mcp__plugin_lead-ace_api__get_master_document
mcp__plugin_lead-ace_api__get_project_settings

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

A skill that collects prospect candidates via web search based on the information in BUSINESS.md and SALES_STRATEGY.md, retrieves contact information, and registers them in the database.

SKILL.md

410 lines · ~5.8k tokens (exceeds 5k compaction limit)

Stats

Language: TypeScript

Parent stars: 0

Maintenance: Good

Last Commit: May 28, 2026

Build List - Prospect List Building

A skill that collects prospect candidates via web search based on the information in BUSINESS.md and SALES_STRATEGY.md, retrieves contact information, and registers them in the database.

3-Phase Structure (Phases 1 to 3, with intermediate steps 1.5 and 1.7):

Phase 1 (Candidate Collection): Find prospect candidates broadly via web search (name, official URL, overview)
Phase 1.5 (Pre-dedup filter): Call check_prospect_dedup with the candidates' domains and drop any the server would reject. This saves the per-candidate cost of Phase 1.7 (signal WebSearch) and Phase 2 (contact-retrieval sub-agents) on already-known orgs
Phase 1.7 (Signal Collection): Pull a recent-signal slice for each surviving candidate (press release / funding / hiring) so /outbound has fresh hooks
Phase 2 (Contact + Keyperson Retrieval): Use sub-agents to explore each candidate's official site, retrieve email / form URL, and surface at least one keyperson (job title + name)
Phase 3 (Registration): Register the prospects via add_prospects, report the results, and update the search notes

Before starting: Read ${CLAUDE_PLUGIN_ROOT}/references/workspace-conventions.md and follow the cross-cutting conventions there (data storage, MCP error handling, document writes, output discipline).

Phase 1: Candidate Collection

1. Setup

Project ID: $0 (required)
Target count: $1 (default: 30; approximate is fine, "around N" is sufficient)

Load the following documents via MCP:

Call mcp__plugin_lead-ace_api__get_document with projectId: "$0" and slug: "business".
Call mcp__plugin_lead-ace_api__get_document with projectId: "$0" and slug: "sales_strategy".
Call mcp__plugin_lead-ace_api__get_master_document with slug: "tpl_industries" and keep the returned vocabulary list; every prospect's industry field MUST be set to one of those exact strings.

Call mcp__plugin_lead-ace_api__get_project_settings with projectId: "$0" and capture:

outboundChannels (subset of email | form | sns_twitter | sns_linkedin): the channels this project is allowed to use for outbound. Phase 2 contact retrieval should focus on the enabled channels — e.g. if only email is enabled, don't spend sub-agent effort discovering form URLs or SNS handles. A candidate with no contact channel matching the allowlist will be skipped at /outbound, so deprioritize discovering them. An empty outboundChannels array means the project has paused outbound entirely — stop and inform the user instead of building a list that can never be reached.
targetCountries (array of ISO 3166-1 alpha-2 codes; subset of US | CA | JP): when non-empty, restrict discovery to organizations in these countries — bias search queries with regional qualifiers, prefer country-specific portals, and drop candidates whose inferred country falls outside the set. When empty, use the full compliance allowlist (US / CA / JP).
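The two settings rules above can be sketched as a small helper. This is a minimal illustration, not the plugin's own code: `resolve_scope`, `in_scope`, and `DiscoveryScope` are hypothetical names, the input is assumed to be a plain dict shaped like the get_project_settings response, and keeping candidates whose country is unknown is an assumption the text does not state.

```python
from dataclasses import dataclass
from typing import List, Optional

# Compliance allowlist named in the text above.
COMPLIANCE_ALLOWLIST = ("US", "CA", "JP")


@dataclass
class DiscoveryScope:
    channels: List[str]
    countries: List[str]


def resolve_scope(settings: dict) -> DiscoveryScope:
    """Turn a settings dict into a discovery scope.

    Raises ValueError when outboundChannels is empty, i.e. outbound is paused
    and the run should stop and inform the user.
    """
    channels = list(settings.get("outboundChannels") or [])
    if not channels:
        raise ValueError("outbound is paused: outboundChannels is empty")
    # An empty targetCountries list means "use the full compliance allowlist".
    countries = list(settings.get("targetCountries") or []) or list(COMPLIANCE_ALLOWLIST)
    return DiscoveryScope(channels=channels, countries=countries)


def in_scope(country: Optional[str], scope: DiscoveryScope) -> bool:
    """Drop candidates whose inferred country falls outside the set.

    Assumption: a candidate with no inferred country yet is kept.
    """
    return country is None or country.upper() in scope.countries
```

For example, `resolve_scope({"outboundChannels": ["email"], "targetCountries": []})` yields a scope covering US, CA, and JP, while an empty channel list raises before any search budget is spent.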

If either project document is not found, guide the user to run /strategy.

2. Review Search Notes

Do NOT pre-fetch the registered-prospect list. Server-side dedup in add_prospects (Phase 3) is the single source of truth — it returns structured skippedDetails with reasons (email_duplicate, form_url_duplicate, already_in_project, do_not_contact, duplicate_in_batch) so this skill can adapt mid-flight without an O(N) identifier dump.

Call mcp__plugin_lead-ace_api__get_document with projectId: "$0" and slug: "search_notes". If found, use its content. It contains knowledge from previous explorations:

Exhausted keywords (do not repeat — they already returned heavy duplicates)
Coverage matrix (industry × region × company-size cells already covered)
Useful information source sites (not yet fully explored)
Directions to try next time

Use this to continue exploration from where the last session left off. If search_notes is missing, treat every cell of the matrix as unexplored.

3. Search Strategy

Based on the "Search Keywords" and "Target" sections of SALES_STRATEGY.md, formulate multiple search queries.

Pick from unexplored cells of the coverage matrix first. Each query should belong to a single (industry × region × size) cell, e.g. B2B SaaS × Pacific Northwest × Series A. Cells already marked exhausted in search_notes should not be retried unless the user explicitly asks.

Avoid every keyword listed under ## Exhausted Keywords in search_notes (those previously returned ≥ 70% duplicates). Pick a synonym or different angle instead.

Types of search queries (choose appropriate ones based on target type):

Search by target industry + region
Member lists of industry associations, federations
Prospect collection from industry media and news sites
Exhibitor lists from trade shows and events
Client case studies from competitors
Target exploration on job sites
Directories or public databases of schools and corporations

4. Web Search Execution

Combine WebSearch and fetch_url.py (Jina Reader + Claude Haiku) to broadly collect prospect candidates.

Use fetch_url.py for page retrieval (do not use WebFetch):

python3 ${CLAUDE_PLUGIN_ROOT}/scripts/fetch_url.py --url "https://example.com" --prompt "Extract company list" --timeout 15

The script enforces a timeout, so it won't hang on unresponsive sites. It also handles SPA sites.

Fallback when fetch_url.py is unavailable: if the invocation fails (either python3 or the claude CLI is missing from PATH, or any execution error), fall back to WebFetch for the rest of the run. WebFetch is blocked by some corporate B2B WAFs (typically 403) — when that happens, skip the candidate and continue with the others rather than retrying.
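The invocation and fallback rule above could be wrapped as follows. This is a sketch under stated assumptions: `build_fetch_command` and `fetch_page` are hypothetical helper names, the script is assumed to print its result to stdout and exit non-zero on failure, and the extra 30-second outer guard is an illustrative margin rather than anything the source specifies. Only the `--url`, `--prompt`, and `--timeout` flags come from the text.

```python
import subprocess
from typing import List, Optional


def build_fetch_command(plugin_root: str, url: str, prompt: str, timeout: int = 15) -> List[str]:
    """Assemble the fetch_url.py invocation shown above as an argv list."""
    return [
        "python3",
        f"{plugin_root}/scripts/fetch_url.py",
        "--url", url,
        "--prompt", prompt,
        "--timeout", str(timeout),
    ]


def fetch_page(plugin_root: str, url: str, prompt: str, timeout: int = 15) -> Optional[str]:
    """Return the script's stdout, or None when the caller should fall back to WebFetch."""
    cmd = build_fetch_command(plugin_root, url, prompt, timeout)
    try:
        done = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            check=True,            # non-zero exit counts as an execution error
            timeout=timeout + 30,  # assumed outer guard in case the script itself hangs
        )
    except (OSError, subprocess.CalledProcessError, subprocess.TimeoutExpired):
        # python3 missing from PATH, script failure, or hang: signal fallback.
        return None
    return done.stdout
```

A `None` result is the cue to switch to WebFetch for the rest of the run, as the fallback paragraph describes.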

This phase focuses on discovering candidates. Contact information (email, form, etc.) is collected in Phase 2, so only gather the following here:

Required (skip the candidate if missing):

Name (company name, school name, organization name, etc.)
Business overview (what the organization does; 1-2 sentences summarized from the official site)
Official site URL

If available:

Industry or field
Department or branch name (school name for school corporations, target department for large companies)
Country (ISO 3166-1 alpha-2, e.g., "US", "JP", "GB")
Email addresses or SNS accounts found incidentally during search (no need to look for these intentionally)
Organization name: the legal entity name if it differs from the prospect name (e.g., a school corporation that operates multiple schools)

Skip any prospect for which the official site URL and business overview cannot be obtained.

Search tips:

A single query finds limited prospects, so vary the angles broadly
Use portal sites and listing pages to find many candidates at once
Stop searching once the target count ($1, default 30) is reached. Deduplication rejections don't count (count only newly registered ones)
No need to deep-dive individual official sites in this phase -- focus on securing a quantity of candidates

Duplicate-rate response (threshold-driven):

The duplicate signal comes from two places: Phase 1.5's check_prospect_dedup decisions (most candidates are caught here, before signals / contact retrieval) and Phase 3's add_prospects.skippedDetails (the safety net that catches anything that slipped past 1.5). Combine both when judging a batch — but exclude plan_limit from the tally (it is a budget hit, not an angle-exhaustion signal; treating it as exhaustion would mark a perfectly good keyword as dead just because the user hit their plan cap mid-cycle).

< 30% skip rate — healthy. Continue with the same angle.
30–70% skip rate — the angle is fading. Deep-dive within the same target first before pivoting:
- Look beyond top results to page 2, 3, and beyond
- Add regional qualifiers (e.g., "SaaS companies" → "SaaS companies Portland", "SaaS companies Austin")
- Use synonyms / related terms (e.g., "consulting firm" → "advisory firm", "management consultancy")
- Follow industry-specific portal sites and directories
- Search for "competitors" / "similar services" of already-registered prospects to find new ones organically
≥ 70% skip rate — the angle is exhausted. Stop deep-diving on this keyword / cell, record it under ## Exhausted Keywords (step 9), mark the corresponding coverage-matrix cell as exhausted, and pivot to a different (industry × region × size) cell for the next pass.

The 70% rule is a hard pivot threshold, not advisory — repeating an exhausted angle just spends quota on duplicates.
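The threshold bands can be written down as a small classifier. This is a minimal sketch: `skip_rate` and `classify_angle` are hypothetical names, and removing `plan_limit` entries from both the numerator and the denominator is one reading of "exclude plan_limit from the tally", labeled here as an assumption.

```python
from typing import List

# Skip reasons that signal a budget hit rather than an exhausted angle.
NON_EXHAUSTION_REASONS = {"plan_limit"}


def skip_rate(attempted: int, skip_reasons: List[str]) -> float:
    """Share of attempted candidates that were skipped as duplicates.

    `skip_reasons` combines Phase 1.5 decisions and Phase 3 skippedDetails.
    Assumption: plan_limit entries are dropped from both sides of the ratio.
    """
    plan_hits = sum(1 for reason in skip_reasons if reason in NON_EXHAUSTION_REASONS)
    considered = attempted - plan_hits
    if considered <= 0:
        return 0.0
    return (len(skip_reasons) - plan_hits) / considered


def classify_angle(rate: float) -> str:
    """Map a skip rate onto the three bands described above."""
    if rate >= 0.70:
        return "exhausted"  # hard pivot: record the keyword and switch cells
    if rate >= 0.30:
        return "fading"     # deep-dive within the same target first
    return "healthy"        # continue with the same angle
```

With 14 skips out of 18 attempts the rate is about 0.78, which lands in the exhausted band.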

5. Priority and Match Reason Assessment

For each prospect, assign a match reason (why they're appropriate as a target, including their challenges and needs) and priority (1-5) based on SALES_STRATEGY.md criteria:

1: Top priority (perfectly matches target, needs are clear)
2: High priority (broadly matches target)
3: Standard (within target range)
4: Marginal (only partially meets criteria)
5: Under consideration (indirect possibility)

Factor in email retrieval ease: If the following signals are found during exploration, raise priority by 1 level (that is, lower the number by 1) for equal match quality (more email holders -> higher outbound success rate):

Has press releases on press release distribution sites (high rate of PR contact email inclusion)
Listed in startup DB or industry directory (more public information available)
Email explicitly shown on official site (e.g., info@) discovered during exploration

Note on email types: Both named individual addresses (a specific person's mailbox) and generic addresses (info@, contact@, sales@, support@, pr@) are valid outreach targets. Named addresses generally have higher reply rates and deserve slightly higher priority, but generic addresses must not be excluded: for many companies they are the only reachable channel.
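The priority adjustment and the named-versus-generic distinction might look like this in code. It is an illustrative sketch only: `is_generic_address` and `boost_priority` are hypothetical helpers, and the generic local parts are just the five examples listed above, not an exhaustive set.

```python
# Generic local parts taken from the examples in the note above.
GENERIC_LOCAL_PARTS = {"info", "contact", "sales", "support", "pr"}


def is_generic_address(email: str) -> bool:
    """True for role addresses such as info@; False for anything else."""
    local_part, _, _ = email.partition("@")
    return local_part.lower() in GENERIC_LOCAL_PARTS


def boost_priority(priority: int, has_email_signal: bool) -> int:
    """Raise priority by one level when an email-ease signal was found.

    Priority 1 is the top level, so raising means subtracting 1, floored at 1.
    """
    return max(1, priority - 1) if has_email_signal else priority
```

Both address types stay eligible; the classifier only informs the slight priority difference.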

Phase 1.5: Pre-dedup Filter

Before paying for Phase 1.7's per-candidate WebSearch and Phase 2's per-candidate sub-agent contact retrieval, drop candidates the server would reject anyway. The dedup decision needs only organizationDomain, which is already known at the end of Phase 1, so running this gate first saves both downstream costs.

Call mcp__plugin_lead-ace_api__check_prospect_dedup with:

projectId: "$0"
candidates: array of { organizationDomain, email?, contactFormUrl? } — one entry per Phase 1 candidate. organizationDomain is the apex domain derived from the candidate's website_url (strip www. and path). Include email / contactFormUrl if Phase 1 happened to surface them (rare but possible).

The response is a decisions array in the same order as the input. Drop any candidate whose kind === 'skip'. Tally the skip reasons (reason ∈ already_in_project | email_duplicate | form_url_duplicate | do_not_contact | duplicate_in_batch) and feed that tally into step 9 (## Exhausted Keywords) — the same threshold rule applies (≥ 70% skip in the batch = exhausted angle, switch keywords for the next pass).

If most candidates are dropped here, the search angle is exhausted; do not push through Phase 1.7 / Phase 2 with a near-empty list. Either (a) re-run Phase 1 with a different keyword / region / size cell from the coverage matrix, or (b) accept the smaller batch and continue. Phase 3's add_prospects re-runs the same dedup as a safety net, so passing through a few skip-marked candidates is harmless but wastes Phase 1.7 / Phase 2 effort.
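A sketch of the two mechanical steps in this phase: deriving organizationDomain and dropping skip-marked candidates. The function names are hypothetical, and the domain helper only strips `www.` and the path as the text describes; it does not consult the public suffix list, so other subdomains are left in place. The decision shape (`kind`, `reason`) follows the description above.

```python
from typing import Dict, List, Tuple
from urllib.parse import urlsplit


def organization_domain(website_url: str) -> str:
    """Strip scheme, port, path, and a leading "www." from a site URL."""
    # Tolerate bare hosts such as "example.com/about" by assuming https.
    normalized = website_url if "://" in website_url else f"https://{website_url}"
    host = urlsplit(normalized).hostname or ""
    return host[4:] if host.startswith("www.") else host


def drop_skipped(candidates: List[Dict], decisions: List[Dict]) -> Tuple[List[Dict], List[str]]:
    """Pair candidates with decisions by position; return (kept, skip_reasons)."""
    if len(candidates) != len(decisions):
        raise ValueError("decisions must match candidates one-to-one")
    kept: List[Dict] = []
    reasons: List[str] = []
    for candidate, decision in zip(candidates, decisions):
        if decision.get("kind") == "skip":
            reasons.append(decision.get("reason", "unknown"))
        else:
            kept.append(candidate)
    return kept, reasons
```

The returned `reasons` list is what feeds the skip-rate tally for the exhausted-keyword decision.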

Phase 1.7: Signal Collection

For each surviving (post-Phase-1.5) candidate, run one WebSearch query of the form "<organization name>" press release OR funding OR hiring 2025..2026 (or your equivalent for the prospect's region / language). Skim the top results for any of:

A press release dated within the last 6 months
A funding round announcement
A hiring spike, role expansion, or new department launch
A product launch, partnership, or named-customer announcement

When something concrete surfaces, append a ## Recent Signals section to the candidate's overview of the form:

## Recent Signals
- 2026-03-12: Announced Series B led by Acme Ventures (TechCrunch)
- 2026-02-04: Hiring 5 senior backend engineers (LinkedIn)

Each bullet is a date, one sentence, and a source. Do not invent signals; if nothing relevant turns up, leave the section out. /outbound reads ## Recent Signals and decides whether to open with a signal-aware hook; an absent section means no signal mention.
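The query shape and the section format can be captured in two small helpers. This is a sketch with hypothetical names (`signal_query`, `with_recent_signals`); the year range default simply mirrors the example query above, and signals are assumed to arrive as (date, sentence, source) tuples.

```python
from typing import List, Tuple


def signal_query(organization_name: str, years: str = "2025..2026") -> str:
    """Build the one-per-prospect WebSearch query described above."""
    return f'"{organization_name}" press release OR funding OR hiring {years}'


def with_recent_signals(overview: str, signals: List[Tuple[str, str, str]]) -> str:
    """Append a "## Recent Signals" section to an overview.

    Returns the overview unchanged when no signals surfaced, so the section
    is simply absent rather than empty.
    """
    if not signals:
        return overview
    bullets = "\n".join(f"- {date}: {sentence} ({source})" for date, sentence, source in signals)
    return f"{overview}\n\n## Recent Signals\n{bullets}"
```

Leaving the overview untouched on an empty list keeps the "do not invent signals" rule mechanical: nothing found means nothing written.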

This is one query per prospect, not deep research. The SaaS-side daily batch (B §4.2-B) refines signals over time; the goal here is to seed the field at registration time.

Phase 2: Contact + Keyperson Retrieval

6. Contact Retrieval via Sub-agents

Split the post-Phase-1.5 candidate list (only the kind === 'fresh' entries; Phase 1.7 may have enriched their overview with signals) into batches of 5 and launch a sub-agent for each batch to retrieve contact information.

Include the following in each sub-agent's prompt:

List of assigned candidates (name, organization_name, website_url, overview, industry, department, country, match_reason, priority)
Retrieve the contact enrichment procedure via mcp__plugin_lead-ace_api__get_master_document with slug: "tpl_enrich_contacts" and follow its procedure
Explore each candidate's official site to retrieve email addresses and contact form URLs
Keyperson lookup is required, not optional. Search the official site's team / leadership / about pages, then LinkedIn public results (site:linkedin.com/in "<organization name>" <target role>), then the press release page. Capture at least one (contactName, department) pair per candidate when any public source mentions one. If absolutely nothing surfaces, leave both null and note it.
Use python3 ${CLAUDE_PLUGIN_ROOT}/scripts/fetch_url.py --url <URL> --prompt <instructions> for page retrieval (do not use WebFetch). If fetch_url.py cannot run (either python3 or the claude CLI is missing from PATH), fall back to WebFetch and skip any candidate the WAF blocks (403)
After completion, return the results as a JSON array

Sub-agent allowed-tools: Bash, WebSearch, WebFetch, Read, mcp__plugin_lead-ace_api__get_master_document

Each object in the JSON array returned by the sub-agent includes the Phase 1 information (name, organization_name, overview, website_url, industry, department, country, match_reason, priority) plus the retrieved contacts (email, contact_form_url, form_type, sns_accounts, contact_name).
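Splitting the candidate list into batches of 5 is a plain chunking step. The helper below is a minimal sketch with a hypothetical name (`batches`); how each batch is then handed to a sub-agent is outside its scope.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def batches(items: Sequence[T], size: int = 5) -> List[List[T]]:
    """Split items into consecutive batches of at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [list(items[start:start + size]) for start in range(0, len(items), size)]
```

The last batch may be smaller than 5 when the candidate count is not a multiple of the batch size.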

6b. Re-search for Candidates Without Contact Info (only when applicable)

If Phase 2 results include candidates whose email and contact_form_url are both null, try to supplement contact info from sources other than the official site.

For each such candidate, run WebSearch queries such as:

"{company name}" email address
"{company name}" contact

Information may be found from industry directories, press release distribution sites, event speaker information, etc. If found, update the candidate's data.

Limit: Re-search up to a maximum of 10 candidates without contact info. Register the rest without contact info (they will be skipped during outbound).

Phase 3: Registration

7. Database Registration

Call mcp__plugin_lead-ace_api__add_prospects with:

projectId: "$0"
prospects: array of prospect objects

Field mapping for the MCP tool:

For each prospect, construct the object as follows:

organizationDomain: Extract the apex domain from website_url (e.g., https://www.example.com/about -> example.com). Strip www. prefix and path. Used for dedup.
organizationName: the legal entity name (or name if not separately available)
organizationWebsiteUrl: the organization's official website URL
name: prospect name (company name, school name, department, etc.)
contactName: contact person name (optional)
department: department within the organization (optional)
overview: business overview (1-2 sentences). If Phase 1.7 surfaced any signals, append the ## Recent Signals section after the overview text within the same field.
industry: must be one of the strings from tpl_industries (the vocabulary you fetched in step 1). Free-form industry strings break the /evaluate aggregator and the timing-aware ordering. If none fit, use Other.
country: ISO 3166-1 alpha-2 (e.g. US, CA, JP). Optional in the payload — when omitted the server falls back to TLD inference of the organization domain. Set this when you have stronger evidence than the TLD (LLM-derived from page content, address footer, etc.) and pass countrySource: 'ai_inferred'. LeadAce currently only sends to US, CA, and JP recipients; prospects from other countries register fine but the send paths block them at outreach time. If /strategy already identified a US-, CA-, or JP-only target audience, prefer those.
countrySource: optional, one of manual (operator confirmed) or ai_inferred. Skip this field when leaving country blank.
websiteUrl: the specific page URL for this prospect
email: email address (optional*)
contactFormUrl: contact form URL (optional*)
formType: one of google_forms, native_html, wordpress_cf7, iframe_embed, with_captcha (optional)
snsAccounts: { x?, linkedin?, instagram?, facebook? } (optional*)
matchReason: why this prospect is a good target
priority: 1-5 (default 3)
hypothesis: per-prospect targeting hypothesis as a structured object (optional but recommended). Built from the assembled overview + any ## Recent Signals + matchReason + SALES_STRATEGY context. Read by the inquiry-landing chat snapshot to ground answers about the visiting org. Shape:
- hypothesizedPain: 1–3 short pain hypotheses, one sentence each (e.g. ["Manual lead routing slows reps", "No central buyer-signal aggregation"])
- valueMapping: 1–3 bullets of how our offering addresses those pains (same order as hypothesizedPain when paired)
- timingSignals: 1–3 concrete reasons NOW is a good moment, drawn from ## Recent Signals (e.g. ["Series B announced 12d ago", "2 SDR roles open since 18d"]). Omit when no signals surfaced — do not invent.
- targetDepartment / targetRolePattern: optional. Department / role pattern most likely to buy (e.g. "Sales Operations", "Director of Sales Ops").
- bestChannel / bestKeyperson: optional. Skip when unclear; do NOT guess.
Keep each bullet to one short sentence. Skip fields when public info is too thin to fill them honestly. A partial hypothesis is fine; an invented one harms the chat AI's credibility.

* At least one of email, contactFormUrl, or snsAccounts is required. Prospects with no contact channel are rejected.
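The field mapping above can be sketched as a conversion from one sub-agent result (snake_case keys, as listed in step 6) to one add_prospects entry. This is illustrative only: `to_prospect` is a hypothetical helper, using website_url for both organizationWebsiteUrl and websiteUrl is an assumption for results that carry a single URL, and the hypothesis field is left out for brevity.

```python
from typing import Optional, Set
from urllib.parse import urlsplit


def to_prospect(result: dict, industries: Set[str]) -> Optional[dict]:
    """Map a sub-agent result to an add_prospects entry.

    Returns None when the result has no contact channel at all.
    """
    email = result.get("email")
    form_url = result.get("contact_form_url")
    sns = result.get("sns_accounts") or None
    if not (email or form_url or sns):
        return None

    host = urlsplit(result["website_url"]).hostname or ""
    domain = host[4:] if host.startswith("www.") else host
    industry = result.get("industry")
    country = result.get("country")

    prospect = {
        "organizationDomain": domain,
        "organizationName": result.get("organization_name") or result["name"],
        "organizationWebsiteUrl": result["website_url"],  # assumption: single URL reused
        "name": result["name"],
        "contactName": result.get("contact_name"),
        "department": result.get("department"),
        "overview": result["overview"],
        # Free-form industries are replaced by "Other".
        "industry": industry if industry in industries else "Other",
        "country": country,
        # countrySource is only sent when a country is set.
        "countrySource": "ai_inferred" if country else None,
        "websiteUrl": result["website_url"],
        "email": email,
        "contactFormUrl": form_url,
        "formType": result.get("form_type"),
        "snsAccounts": sns,
        "matchReason": result.get("match_reason"),
        "priority": result.get("priority", 3),
    }
    # Omit unset optional fields instead of sending nulls.
    return {key: value for key, value in prospect.items() if value is not None}
```

Filtering out `None` results before the call keeps unreachable entries from counting against the batch.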

The server automatically deduplicates by email, contact form URL, and organization domain within the project. Inspect skippedDetails after the call: each entry is {name, reason} with reason ∈ email_duplicate | form_url_duplicate | already_in_project | do_not_contact | duplicate_in_batch | plan_limit. If the same reason clusters tightly (e.g. ≥ 50% of skips are email_duplicate from one industry), record the keyword in ## Exhausted Keywords and switch angles for the next pass.
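Checking whether one skip reason dominates the batch can be sketched with a counter. `dominant_skip_reason` is a hypothetical helper; ignoring `plan_limit` here carries over the exclusion rule from the duplicate-rate section as an assumption, and the per-industry grouping mentioned in the example is not modeled.

```python
from collections import Counter
from typing import List, Optional


def dominant_skip_reason(skipped_details: List[dict], threshold: float = 0.5) -> Optional[str]:
    """Return the reason that makes up at least `threshold` of the skips, if any.

    Each entry is expected to look like {"name": ..., "reason": ...}.
    """
    reasons = [entry["reason"] for entry in skipped_details if entry["reason"] != "plan_limit"]
    if not reasons:
        return None
    reason, count = Counter(reasons).most_common(1)[0]
    return reason if count / len(reasons) >= threshold else None
```

A non-None result is a hint to record the keyword under ## Exhausted Keywords and switch angles for the next pass.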

Difference between organizations and prospects:

organizations = Legal entity unit (apex domain is PK)
prospects = Prospect unit (specific target within an organization)

Small company: organizationName = name (1:1, department is null)
School corporation operating multiple schools: organizationName = "Katayanagi Gakuen School Corporation", name = "Nihon Kogakuin College" (1:many possible)
Department within large company: name = "ABC Corp.", department = "Sales Planning Dept."

8. Results Report

After DB registration, check the reachable count:

Call mcp__plugin_lead-ace_api__get_outbound_targets with projectId: "$0" and limit: 1 to get the total and byChannel summary.

Report the following:

Number of newly registered prospects / target count
Reachable breakdown (among newly registered: N with email, N with form, N SNS-only, N without contacts)
Breakdown by priority
Number rejected as duplicates (if many, briefly describe how the search angle was changed)
Total project reachable remaining (from total field)
Guide the user to run /outbound as the next step
Append a single low-key dashboard line at the end: Dashboard: https://app.leadace.ai/prospects — purely informational, do not push the user to open it

9. Update Search Notes

Save search notes via mcp__plugin_lead-ace_api__save_document with projectId: "$0", slug: "search_notes". Record information useful for the next exploration in the following structure:

# Search Notes
Last updated: YYYY-MM-DD

## Coverage Matrix
Track which (industry × region × company-size) cells have been covered
this run. Cells where the combined dedup-skip rate (Phase 1.5 + Phase 3,
excluding `plan_limit`) is ≥ 70% are marked `exhausted`. New runs should
pick from `unexplored` cells first.

| Industry | Region | Size | Status | Notes |
|---|---|---|---|---|
| B2B SaaS | US-West | Series A | covered | 12 added, 0 dups |
| B2B SaaS | US-West | Series B | exhausted | 14 dups / 18 attempts |
| HealthTech | US-Northeast | bootstrapped | unexplored | next run |

## Exhausted Keywords
Keywords whose combined dedup-skip rate (Phase 1.5 + Phase 3, excluding
`plan_limit`) was ≥ 70% this run. **Do not re-use without a fresh angle**
(different region, different size band, different role seniority). Each
entry: `keyword — reason — date`.

- "B2B SaaS Series B" — 14/18 returned `already_in_project` — 2026-05-06

## Useful Sources
- (Portal sites or listing page URLs that haven't been fully explored yet)

## Directions to Try Next Time
- (Search methods not attempted this time, regions or angles not yet explored)

## Notes
- (Areas where prospects were found unexpectedly, insights for next time)

If the previous version (from step 2) has a ## Hints from evaluate section, preserve its content and carry it over to the end of the new document (to preserve response pattern info added by evaluate).

If the previous version already has ## Coverage Matrix / ## Exhausted Keywords sections, merge into them — don't overwrite. Only mark a cell exhausted when this run's data confirms it; old exhausted entries should be re-tested if the user asks for a sweep across previously-skipped cells.
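Carrying the ## Hints from evaluate section over to the new document can be sketched as a section extraction plus an append. The helper names are hypothetical, and the sketch assumes the notes use `## ` headings exactly as in the template above.

```python
import re
from typing import Optional


def extract_section(markdown: str, heading: str) -> Optional[str]:
    """Return the "## heading" block, up to the next "## " heading or end of text."""
    pattern = rf"^## {re.escape(heading)}[ \t]*(?:\n|\Z).*?(?=^## |\Z)"
    match = re.search(pattern, markdown, flags=re.MULTILINE | re.DOTALL)
    return match.group(0).rstrip() if match else None


def carry_over_hints(previous: str, new_document: str) -> str:
    """Append the previous "Hints from evaluate" section to the new notes, if present."""
    hints = extract_section(previous, "Hints from evaluate")
    if hints is None:
        return new_document
    return f"{new_document.rstrip()}\n\n{hints}\n"
```

When the previous notes have no such section, the new document is returned unchanged.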
