Skill

vc-signals

VC signal-to-thesis skill for Marathon-style deal discovery. Weekly all-sector radar, OSS radar, agent-native research workbench, theme drill-downs, company backtrace, and GitHub trending repos.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/vc-signals:vc-signals vc-signals radar all, vc-signals workbench docs/radar-runs/current, vc-signals oss ai-infra, vc-signals theme "agent evals", vc-signals setup

User invocable

Model invocable

Inline context

Default effort

Argument hint

vc-signals radar all, vc-signals workbench docs/radar-runs/current, vc-signals oss ai-infra, vc-signals theme "agent evals", vc-signals setup

Tool Access

This skill is limited to the following tools:

Bash, Read, Write, WebSearch, AskUserQuestion

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Turn noisy public internet chatter into ranked, investor-oriented theme briefs with company mapping.

SKILL.md

1351 lines · ~15.1k tokens (exceeds 5k compaction limit)

Stats

Language: Python

Stars: 2

Maintenance: Excellent

Last Commit: Jun 3, 2026

VC Signals: Emerging Theme Discovery for Venture Capital

Turn noisy public internet chatter into ranked, investor-oriented theme briefs with company mapping.

This is NOT a generic research tool. It is a signal-to-thesis skill built for a VC workflow.

Argument Parsing

Parse the user's input to determine the mode and arguments:

/vc-signals setup → Setup wizard mode
/vc-signals radar <sector|all> [time] → Company-first weekly radar (sectors: devtools, cybersecurity, ai-infra, vertical-ai, data-infra, oss; all runs one unified brief with sector sections)
/vc-signals weekly <sector> [time] → Alias for radar (kept for backward compatibility)
/vc-signals theme "<topic>" [time] → Theme drill-down
/vc-signals company "<name>" [time] → Company backtrace
/vc-signals oss <sector> [time] → OSS-native radar using repo velocity, community signal, contributor quality, and company-formation likelihood
/vc-signals github <sector> → GitHub trending repos (sectors: devtools, cybersecurity, ai-infra, vertical-ai, data-infra, oss, all)
/vc-signals workbench <run-dir> → Agent-native research workbench for weak-signal synthesis without promoting unverified leads
/vc-signals add-sector <name> → Add a new sector (guided)
/vc-signals compare "<company1>" "<company2>" → Head-to-head comparison (stretch)

Time window (optional): Users can append a time window like 7d, 14d, 30d, 60d, 90d to control how far back to search. Examples:

/vc-signals weekly devtools → default 14 days
/vc-signals weekly devtools 7d → last 7 days only
/vc-signals weekly devtools 30d → last 30 days
/vc-signals theme "agent evals" 90d → last 90 days of evidence

Defaults by mode:

radar / weekly: 14 days (focused on recent signal)
theme: 30 days (broader context for deep analysis)
company: 30 days

If there are no arguments, or the arguments are not recognized, show this help and ask the user what they'd like to do.
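The parsing rules above can be sketched in Python. This is only an illustration of the mode list, the optional trailing time window, and the per-mode defaults; `parse_command`, `KNOWN_MODES`, and `DEFAULT_DAYS` are hypothetical names, and modes without a stated default simply get no window here (an assumption).

```python
import re
import shlex

# Modes listed in the Argument Parsing section.
KNOWN_MODES = {
    "setup", "radar", "weekly", "theme", "company",
    "oss", "github", "workbench", "add-sector", "compare",
}

# Defaults stated in the text; other modes have no stated default (assumption: None).
DEFAULT_DAYS = {"radar": 14, "weekly": 14, "theme": 30, "company": 30}

TIME_WINDOW = re.compile(r"^(\d+)d$")


def parse_command(text):
    """Split '/vc-signals <mode> [args] [Nd]' into mode, args, and lookback days."""
    tokens = shlex.split(text)  # keeps quoted topics such as "agent evals" together
    if tokens and tokens[0].lstrip("/") == "vc-signals":
        tokens = tokens[1:]
    if not tokens or tokens[0] not in KNOWN_MODES:
        # No arguments or an unrecognized mode: fall back to showing help.
        return {"mode": "help", "args": [], "days": None}

    mode, args = tokens[0], tokens[1:]
    days = DEFAULT_DAYS.get(mode)
    # A trailing token such as 7d or 30d overrides the default window.
    if args:
        match = TIME_WINDOW.match(args[-1])
        if match:
            days = int(match.group(1))
            args = args[:-1]
    return {"mode": mode, "args": args, "days": days}
```

For example, `parse_command('/vc-signals theme "agent evals" 90d')` yields the `theme` mode with one argument and a 90-day window, while `parse_command('/vc-signals weekly devtools')` falls back to 14 days.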

Environment Detection (ALWAYS check this FIRST)

Before anything else, detect whether you're running in a sandboxed web environment (Claude.ai, Co-Work web) or a local environment (Claude Code CLI, Co-Work desktop with terminal access).

Quick test: Try running a simple Bash command:

echo "env_check"

If Bash is not available, or if you get network errors (403 Forbidden) when running Python scripts, you are in a web sandbox. In this case:

Web sandbox mode:

Use ONLY WebSearch for retrieval (built-in, always works)
Do NOT run any Python scripts (persistence.py, github_trending.py, last30days_adapter.py) — they will fail with network errors
Do NOT run setup wizard — API keys can't be used in the sandbox
Skip GitHub trending entirely
Skip persistence (save/load) — just print the full brief inline
Tell the user: "Running in web mode — using web search only. For full coverage (Reddit, HN, X, GitHub trending), install Claude Code locally and run from there."

Local mode:

Full access — run all Python scripts, use last30days if configured, save results locally
Proceed with the normal flow below

First-Run Detection (check this after environment detection)

Before executing ANY mode (including setup), check if this is a first run:

grep -q '^SETUP_COMPLETE=true' ~/.config/last30days/.env 2>/dev/null && echo "configured" || echo "not_configured"

If NOT configured — and the user did NOT explicitly run /vc-signals setup:

Say this to the user:

"Welcome to VC Signals! This is your first run. I need about 2 minutes to set things up — I'll install the research engine and ask you for a couple of optional API keys. You can skip any you don't have."

"Want me to run setup now, or would you rather jump straight in with basic web search? (Setup gives you Reddit, Hacker News, X/Twitter, YouTube, and GitHub trending coverage.)"

If they choose setup → run the Setup Wizard (below), then continue to their original command.
If they choose to skip → proceed with their command using the WebSearch path, and note: "Running with basic web search. You can run /vc-signals setup anytime to unlock more sources."

If the user explicitly runs /vc-signals setup:

Always run the setup/update wizard, even when SETUP_COMPLETE=true already exists. Treat setup as "add or update provider keys", not only "first install." First read ~/.config/last30days/.env, detect which keys are already present, and ask only for missing keys unless the user says they want to replace one. Never print existing secret values; show only configured/missing status.

Auto-install last30days during setup:

As part of setup Step 3, ALWAYS clone last30days if it's not already installed — don't ask the user. Just do it:

# Determine where to clone last30days
if [ -d ".claude/skills/vc-signals" ]; then
  # Project-level install — clone to vendor/
  mkdir -p vendor
  git clone --quiet --depth 1 https://github.com/mvanhorn/last30days-skill.git vendor/last30days-skill 2>/dev/null || true
elif [ -d "$HOME/.claude/skills/vc-signals" ]; then
  # Global install (Co-Work) — clone to ~/.claude/vendor/
  mkdir -p "$HOME/.claude/vendor"
  git clone --quiet --depth 1 https://github.com/mvanhorn/last30days-skill.git "$HOME/.claude/vendor/last30days-skill" 2>/dev/null || true
else
  # Marketplace plugin install — keep shared vendor under ~/.claude/vendor/
  mkdir -p "$HOME/.claude/vendor"
  git clone --quiet --depth 1 https://github.com/mvanhorn/last30days-skill.git "$HOME/.claude/vendor/last30days-skill" 2>/dev/null || true
fi

Then install requests into the same Python runtime the adapter will use:

RUNTIME_PY="${CODEX_BUNDLED_PYTHON:-}"
if [ -z "$RUNTIME_PY" ] || ! "$RUNTIME_PY" -c 'import sys; raise SystemExit(0 if sys.version_info >= (3, 12) else 1)' >/dev/null 2>&1; then
  RUNTIME_PY="$(
    for py in python3.14 python3.13 python3.12 python3; do
      command -v "$py" >/dev/null 2>&1 || continue
      "$py" -c 'import sys; raise SystemExit(0 if sys.version_info >= (3, 12) else 1)' >/dev/null 2>&1 && { command -v "$py"; break; }
    done
  )"
fi
if [ -n "$RUNTIME_PY" ]; then
  "$RUNTIME_PY" -m pip install 'requests>=2.32,<3' 2>/dev/null || "$RUNTIME_PY" -m pip install --user 'requests>=2.32,<3' 2>/dev/null || true
  "$RUNTIME_PY" -c 'import requests' 2>/dev/null && echo "requests ready for $RUNTIME_PY" || echo "requests missing for $RUNTIME_PY"
else
  echo "Python 3.12+ missing; install with: brew install python@3.12"
fi

Then check Node.js for cookie-based X. last30days uses its bundled bird-search.mjs backend for AUTH_TOKEN/CT0, and that backend requires node on PATH. If X cookies are configured and Node is missing, install it with Homebrew when available:

ENV_FILE="$HOME/.config/last30days/.env"
if [ -f "$ENV_FILE" ] && grep -Eq '^(AUTH_TOKEN|TWITTER_AUTH_TOKEN)=' "$ENV_FILE" && grep -Eq '^(CT0|TWITTER_CT0)=' "$ENV_FILE"; then
  if ! command -v node >/dev/null 2>&1; then
    if command -v brew >/dev/null 2>&1; then
      brew install node || true
    elif [ -x /opt/homebrew/bin/brew ]; then
      /opt/homebrew/bin/brew install node || true
    else
      echo "Node.js missing; install with: brew install node"
    fi
  fi
  command -v node >/dev/null 2>&1 && node --version || true
fi

If any of these installs fails, continue; the skill still works without them (WebSearch fallback, no GitHub trending). Tell the user what succeeded and what didn't.

Script Paths

Before running any script, determine the skill directory. Check these locations in order:

.claude/skills/vc-signals/ (relative to the current project root — check if it exists)
~/.claude/skills/vc-signals/ (global installation for Co-Work)

Once found, use that path for all script commands. For example:

If found at .claude/skills/vc-signals/, run: python3 .claude/skills/vc-signals/scripts/<script>.py
If found at ~/.claude/skills/vc-signals/, run: python3 ~/.claude/skills/vc-signals/scripts/<script>.py

Store the resolved path and reuse it for all script calls in this session.
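A minimal Python sketch of the lookup order described above, assuming only the two documented locations. `resolve_skill_dir` and `script_path` are hypothetical helper names, not part of the skill's scripts.

```python
from pathlib import Path


def resolve_skill_dir(project_root, home):
    """Return the first existing skill directory, or None if neither exists."""
    candidates = [
        Path(project_root) / ".claude" / "skills" / "vc-signals",  # project-level install
        Path(home) / ".claude" / "skills" / "vc-signals",          # global install (Co-Work)
    ]
    for candidate in candidates:
        if candidate.is_dir():
            return candidate
    return None


def script_path(skill_dir, name):
    """Build <skill_dir>/scripts/<name>.py for a resolved skill directory."""
    return Path(skill_dir) / "scripts" / f"{name}.py"
```

Resolving once and passing the result to `script_path` mirrors the instruction to store the path and reuse it for the whole session.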

Scripts:

<skill_dir>/scripts/persistence.py — save/load briefings, diffs, theme index
<skill_dir>/scripts/github_trending.py — GitHub star velocity search
<skill_dir>/scripts/last30days_adapter.py — last30days integration
<skill_dir>/scripts/attio.py — read-only Attio CRM matching from ATTIO_ACCESS_TOKEN
<skill_dir>/scripts/radar_run.py — weekly evidence collection, quality filtering, Attio merge, and partner-preview rendering
<skill_dir>/scripts/provider_ab_test.py — dry-run-first paid search provider comparison

Config:

<skill_dir>/config/sectors.json — sector taxonomy
<skill_dir>/config/company_aliases.json — company seed map

For free-text identifiers like company names or theme topics, pass --name "Free Text" to save-markdown; it slugifies internally (lowercases, replaces non-alphanumeric runs with hyphens, strips leading/trailing hyphens). Use --slug only when the value is already a known-safe identifier (sector slug, etc.).
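The slug rule described above (lowercase, replace non-alphanumeric runs with hyphens, strip leading and trailing hyphens) can be illustrated as follows. This is a sketch of the stated rule, not the actual code inside persistence.py, and the function name `slugify` is an assumption.

```python
import re


def slugify(name):
    """Lowercase, collapse non-alphanumeric runs into single hyphens, trim edge hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
```

Under this rule a topic like "Agent Evals" becomes `agent-evals`, which is why free text should go through --name rather than --slug.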

Mode: Setup Wizard

Trigger: /vc-signals setup

Walk the user through setup one step at a time. Use plain, non-technical language. Check what's already configured and skip completed steps. If setup was previously completed, do not exit early; run the provider-key update flow below so newly added sources like Exa and Product Hunt can be configured.

Before prompting, load existing config:

mkdir -p ~/.config/last30days
touch ~/.config/last30days/.env
chmod 600 ~/.config/last30days/.env

Use this file as the shared local config:

~/.config/last30days/.env

When a user provides a key, save it as the exact env var name shown below. Preserve existing keys unless the user explicitly asks to replace them.

Step 1: Python Check

RUNTIME_PY="${CODEX_BUNDLED_PYTHON:-}"
if [ -z "$RUNTIME_PY" ] || ! "$RUNTIME_PY" -c 'import sys; raise SystemExit(0 if sys.version_info >= (3, 12) else 1)' >/dev/null 2>&1; then
  RUNTIME_PY="$(
    for py in python3.14 python3.13 python3.12 python3; do
      command -v "$py" >/dev/null 2>&1 || continue
      "$py" -c 'import sys; raise SystemExit(0 if sys.version_info >= (3, 12) else 1)' >/dev/null 2>&1 && { command -v "$py"; break; }
    done
  )"
fi
[ -n "$RUNTIME_PY" ] && "$RUNTIME_PY" --version || true

If Python 3.12+ is available, say: "Python is ready at <runtime path>." and move on. If not, say: "You need Python 3.12 or newer. Here's how to install it:" and provide instructions for macOS:

brew install python@3.12

Step 2: Install Python Dependencies

"$RUNTIME_PY" -m pip install 'requests>=2.32,<3' 2>/dev/null || "$RUNTIME_PY" -m pip install --user 'requests>=2.32,<3'
"$RUNTIME_PY" -c 'import requests'

Say: "Installed requests into the Python runtime vc-signals will actually use."

Step 3: Provider Keys

Ask for these provider keys explicitly. Every key is optional, but do not hide Exa or Product Hunt behind a generic "web search" prompt.

Recommended now

Exa API Key (EXA_API_KEY) -- recommended for source-yield work:

"Exa helps us resolve Product Hunt and X launch chatter into official domains and richer page evidence. Paste your Exa API key, or type 'skip'."

If provided, save:

EXA_API_KEY=<value>

Product Hunt API Token (PRODUCT_HUNT_TOKEN) -- recommended for launch discovery:

"Product Hunt gives us structured launch data: products, makers, topics, launch copy, and Product Hunt URLs. Paste your Product Hunt token, or type 'skip'."

If provided, save:

PRODUCT_HUNT_TOKEN=<value>

GitHub Token (GITHUB_TOKEN) -- recommended for OSS market radar:

First check:

gh auth status 2>&1

If gh is not authenticated and GITHUB_TOKEN is missing:

"GitHub helps us find OSS momentum and company/market signals. Paste a GitHub Personal Access Token with public_repo, or type 'skip'."

If provided, save:

GITHUB_TOKEN=<value>

Attio Access Token (ATTIO_ACCESS_TOKEN) -- recommended for Marathon owner/status checks:

"Attio lets the packet show whether a company is already known, stale, passed, or missing an owner. Paste an Attio access token, or type 'skip'."

If provided, save:

ATTIO_ACCESS_TOKEN=<value>

X launch radar (XAI_API_KEY, AUTH_TOKEN + CT0, or TWITTER_AUTH_TOKEN + TWITTER_CT0) -- optional:

"X is useful as launch radar, not identity truth. Paste an XAI_API_KEY if you use xAI access for this lane, or paste browser cookies AUTH_TOKEN and CT0. Cookie-based X also needs Node.js for the bundled bird backend; setup can install it with brew install node if missing. TWITTER_AUTH_TOKEN and TWITTER_CT0 also work if your setup already uses those names. You can skip X."

If XAI_API_KEY is provided, save:

XAI_API_KEY=<value>

If browser cookies are provided, save:

AUTH_TOKEN=<value>
CT0=<value>

If the user already has Twitter-prefixed cookie names, preserve them too:

TWITTER_AUTH_TOKEN=<value>
TWITTER_CT0=<value>

After saving cookie-based X values, verify Node.js because the last30days X backend is the bundled bird-search.mjs script:

if ! command -v node >/dev/null 2>&1; then
  if command -v brew >/dev/null 2>&1; then
    brew install node || true
  elif [ -x /opt/homebrew/bin/brew ]; then
    /opt/homebrew/bin/brew install node || true
  else
    echo "Node.js missing; cookie-based X needs Node. Install with: brew install node"
  fi
fi
command -v node >/dev/null 2>&1 && node --version || true

Then re-run <skill_dir>/scripts/last30days_adapter.py check. If x_backend.ready is false with cookie keys present, report the missing dependency instead of saying X is active.

Optional search alternatives

Ask only if the user wants lower-cost provider trials or already has keys:

SERPER_API_KEY
DATAFORSEO_LOGIN and DATAFORSEO_PASSWORD
BRAVE_API_KEY (guarded; do not auto-enable broad last30days grounding)
YOU_API_KEY or YDC_API_KEY
PERPLEXITY_API_KEY

Make the cost rule clear:

"Normal weekly runs keep broad paid last30days grounding disabled. Exa can still be used for targeted hard-evidence resolution. To intentionally enable broad grounding, use VC_SIGNALS_ALLOW_LAST30DAYS_GROUNDING=1 and a dollar cap."

Optional deep research

OpenRouter API Key (OPENROUTER_API_KEY) -- optional:

"OpenRouter gives access to Perplexity-style deep research for theme drill-downs. Paste an OpenRouter key, or type 'skip'."

Optional social/video

ScrapeCreators API Key (SCRAPECREATORS_API_KEY) -- optional:

"ScrapeCreators unlocks TikTok, Instagram, and YouTube through last30days. Paste a key, or type 'skip'."

Manual-mode structured providers

Do not block setup on these. Say:

"Crunchbase, Coresignal, and LinkedIn stay manual-mode for now unless you already have compliant API access. You can add these later, but they are not required for this sprint."

Accepted optional direct-provider vars:

CORESIGNAL_API_KEY
CRUNCHBASE_API_KEY or CRUNCHBASE_TOKEN
LINKEDIN_ACCESS_TOKEN or LINKEDIN_API_KEY

Direct LLM API fallback

Do not ask for OpenAI, Gemini, or xAI as normal reasoning providers. Say:

"Claude Code/Codex is the reasoning layer. Direct OpenAI/Gemini/xAI LLM APIs are only for standalone non-harness runs and stay disabled unless VC_SIGNALS_ALLOW_DIRECT_LLM_API=1 is set."

Step 4: last30days Research Engine

Check availability:

"$RUNTIME_PY" <skill_dir>/scripts/last30days_adapter.py check

If not installed, tell the user:

"The last30days research engine gives us much better results by searching Reddit, Hacker News, X/Twitter, and YouTube independently. Without it, we'll use web search which still works but gives less detailed results."

"Want me to set it up? It takes about 5 minutes and requires a few API keys. Or we can skip this and use web search for now."

If they want to proceed:

git clone https://github.com/mvanhorn/last30days-skill.git vendor/last30days-skill

Step 5: Save Configuration

Save all collected keys to ~/.config/last30days/.env:

mkdir -p ~/.config/last30days

Write the .env file with all provided keys. Preserve existing keys, append missing keys, and never echo secret values back to the user.

Add SETUP_COMPLETE=true at the end.

Then lock down the file permissions:

chmod 600 ~/.config/last30days/.env

Tell the user: "Setting secure file permissions so only your user can read the keys."
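The merge rules for the .env file (preserve existing keys, append missing ones, end with SETUP_COMPLETE=true, never echo secrets) can be sketched on plain text. `merge_env` and `status_report` are hypothetical helpers for illustration; writing the file and the chmod 600 step stay as described above.

```python
def merge_env(existing_text, new_keys, replace=()):
    """Merge provider keys into .env text, keeping existing values unless listed in replace."""
    lines = existing_text.splitlines()
    present = {}
    for index, line in enumerate(lines):
        stripped = line.strip()
        if stripped and not stripped.startswith("#") and "=" in stripped:
            present[stripped.split("=", 1)[0]] = index

    for key, value in new_keys.items():
        if key == "SETUP_COMPLETE":
            continue  # handled separately below
        if key in present:
            if key in replace:  # only overwrite when the user asked to replace
                lines[present[key]] = f"{key}={value}"
        else:
            lines.append(f"{key}={value}")

    # Keep exactly one SETUP_COMPLETE=true marker, as the final line.
    lines = [line for line in lines if not line.strip().startswith("SETUP_COMPLETE=")]
    lines.append("SETUP_COMPLETE=true")
    return "\n".join(lines) + "\n"


def status_report(env_text, keys):
    """Report configured/missing per key without exposing any secret value."""
    configured = set()
    for line in env_text.splitlines():
        if "=" in line and not line.lstrip().startswith("#"):
            key, value = line.split("=", 1)
            if value.strip():
                configured.add(key.strip())
    return {key: ("configured" if key in configured else "missing") for key in keys}
```

`status_report` returns only the words "configured" or "missing", so its output is safe to show to the user.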

Step 6: Verify

Run a quick test:

"$RUNTIME_PY" <skill_dir>/scripts/last30days_adapter.py check

Then run source access detection:

"$RUNTIME_PY" <skill_dir>/scripts/source_access.py

Optionally verify key presence without printing secrets:

grep -q '^EXA_API_KEY=' ~/.config/last30days/.env && echo "Exa configured" || echo "Exa missing"
grep -q '^PRODUCT_HUNT_TOKEN=' ~/.config/last30days/.env && echo "Product Hunt configured" || echo "Product Hunt missing"
grep -q '^ATTIO_ACCESS_TOKEN=' ~/.config/last30days/.env && echo "Attio configured" || echo "Attio missing"

Step 7: Summary

Print what's configured and what each unlocks:

Setup complete. Here's what's active:

[x/skip] Exa -- targeted hard-evidence search and official-domain resolution

[x/skip] Product Hunt -- structured product launch source

[x/skip] GitHub API -- trending repo and OSS market-radar discovery

[x/skip] Attio -- owner/status/pass/stale context

[x/skip] last30days -- Reddit, HN, X/Twitter, YouTube

Harness LLM reasoning -- Claude/Codex plans searches and validates evidence

[x/skip] Deep research (OpenRouter) -- Perplexity synthesis for theme drill-downs

[x/skip] X -- launch radar

[manual] Crunchbase/Coresignal/LinkedIn -- manual-mode unless direct API keys were provided

(Show [x] for configured items, [ ] for skipped items, based on what the user actually set up.)

You can run /vc-signals setup again anytime to add more capabilities. Try it out: /vc-signals radar all

Mode: Radar (Weekly Marathon Scan)

Triggers: /vc-signals radar <sector|all> or /vc-signals weekly <sector|all> (alias)

Sectors: devtools, cybersecurity, ai-infra, vertical-ai, data-infra, oss. Use all for the default Marathon weekly artifact.

If the sector is not recognized, say so and list the valid sectors.

Default Marathon output: one unified all-sector brief delivered Monday 8:00 AM ET when scheduled. Put a cross-sector summary and top 10-15 companies first, then compact sections for each sector. Do not dump 50 rows into Slack; Slack gets the teaser/digest and a link or artifact for the full brief. If no Slack channel is configured yet, produce the Markdown artifact and include instructions to set the channel later.

Step 1: Load Configuration

Read the sector taxonomy:

cat <skill_dir>/config/sectors.json

Read the company alias map:

cat <skill_dir>/config/company_aliases.json

For deterministic weekly runs, use the orchestration helper as the canonical packet writer. Do not replace its generated report with a hand-written web-research memo:

python3 <skill_dir>/scripts/radar_run.py weekly --sectors all --output-dir <output_dir>

This saves raw evidence JSON, normalized signals, scored candidates, a weekly preview, a weekly focus packet, and quality-gate.json. The default weekly command is the full-quality safe path: it uses Product Hunt, YC, GitHub, X/social sources where configured, and direct Exa-first hard-evidence resolution for Product Hunt/X rows by default. Broad last30days paid grounding is disabled by default even when Brave, Exa, Serper, or Parallel keys exist; enable it only for intentional deep validations with VC_SIGNALS_ALLOW_LAST30DAYS_GROUNDING=1 or --allow-last30days-grounding. Disable targeted hard evidence only with --no-hard-evidence-live or VC_SIGNALS_HARD_EVIDENCE_DISABLE=1. Signal investigation is runtime-capped by default (--signal-investigation-max-runtime-seconds overrides it) so weekly runs finish with explicit partial investigation instead of hanging quietly.

After every weekly run, read <output_dir>/quality-gate.json before summarizing. The gate is the product truth:

passing means the generated packet cleared the canonical weekly shape.
partial means the row counts passed but some source coverage needs caveats.
thin means the packet is a partial review queue, not a full-quality weekly radar.
smoke means setup/artifact validation only; never judge Marathon output quality from it.

If quality-gate.json says do_not_freestyle_final_report: true, obey it. Summarize the generated weekly-preview.md and weekly-focus.md; do not create a replacement radar with ad hoc web research. Extra Claude intelligence is allowed only as a clearly labeled supplement, such as "Supplemental leads to verify", and must not promote unsupported companies into the canonical rows.
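The gate handling above can be illustrated with a small Python sketch. The four status values and the do_not_freestyle_final_report flag come from the text; the `status` field name, the `summarize_gate` helper, and the choice to treat a missing flag as "do not freestyle" are assumptions of this sketch.

```python
import json

# Meanings paraphrased from the status list above.
GATE_MEANINGS = {
    "passing": "packet cleared the canonical weekly shape",
    "partial": "row counts passed but some source coverage needs caveats",
    "thin": "partial review queue, not a full-quality weekly radar",
    "smoke": "setup/artifact validation only; do not judge output quality",
}


def summarize_gate(gate):
    """Turn a parsed quality-gate.json dict into reporting guidance."""
    status = gate.get("status")  # hypothetical field name
    if status not in GATE_MEANINGS:
        raise ValueError(f"unrecognized gate status: {status!r}")
    return {
        "status": status,
        "meaning": GATE_MEANINGS[status],
        "needs_caveats": status != "passing",
        # Assumption: when the flag is absent, stay on the conservative side.
        "generated_packet_only": bool(gate.get("do_not_freestyle_final_report", True)),
    }


example = json.loads('{"status": "thin", "do_not_freestyle_final_report": true}')
guidance = summarize_gate(example)
```

When `generated_packet_only` is true, the summary should come from weekly-preview.md and weekly-focus.md, with any extra leads labeled as supplemental.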

Official-site crawling is targeted/opt-in because it is useful but slow. Enable it only for focused evidence-completion passes with VC_SIGNALS_OFFICIAL_SITE_CRAWL_ENABLE=1; normal weekly runs rely on hard-evidence search and targeted manual enrichment first.

Paid-search guardrails are always expected for local weekly runs:

Shared provider cache: ~/.cache/vc-signals/provider-search-cache (VC_SIGNALS_PROVIDER_CACHE_DIR overrides it)
Spend ledger: ~/.cache/vc-signals/paid-search-ledger.jsonl (VC_SIGNALS_PAID_SEARCH_LEDGER_PATH overrides it)
Default caps: smoke/dev $0.50, manual enrichment $2, weekly $8, deep validation $25
Override cap: VC_SIGNALS_PAID_SEARCH_MAX_USD=<amount>
last30days passes --web-backend=none by default, even when paid web keys exist. With VC_SIGNALS_ALLOW_LAST30DAYS_GROUNDING=1 set, it can use the backend named in VC_SIGNALS_LAST30DAYS_WEB_BACKEND, or Exa, Serper, or Parallel; Brave is used only when VC_SIGNALS_ALLOW_BRAVE_AUTO=1 is also set
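As a rough illustration of the cap rules above, the sketch below resolves a dollar cap from the documented defaults and the VC_SIGNALS_PAID_SEARCH_MAX_USD override, then checks a planned spend against it. The run-kind labels, `resolve_cap`, and `within_cap` are hypothetical; how radar_run.py actually reads the ledger is not shown in this document.

```python
import os

# Default caps from the list above; the dictionary keys are labels chosen for this sketch.
DEFAULT_CAPS_USD = {
    "smoke": 0.50,
    "manual-enrichment": 2.00,
    "weekly": 8.00,
    "deep-validation": 25.00,
}


def resolve_cap(run_kind, env=None):
    """Return the spend cap in USD, honoring the environment override when set."""
    env = os.environ if env is None else env
    override = env.get("VC_SIGNALS_PAID_SEARCH_MAX_USD")
    if override:
        return float(override)
    return DEFAULT_CAPS_USD[run_kind]


def within_cap(spent_usd, next_cost_usd, cap_usd):
    """True when the next paid call would keep total spend at or under the cap."""
    return spent_usd + next_cost_usd <= cap_usd
```

A caller would sum prior spend from the ledger, then skip or stop paid calls once `within_cap` returns False.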

Before a validation sprint or any repeated run, preview paid-search cost and exit before live collection:

python3 <skill_dir>/scripts/radar_run.py weekly --sectors all --output-dir <output_dir> --paid-search-dry-run

To compare search providers without spending, use the A/B scaffold:

python3 <skill_dir>/scripts/provider_ab_test.py --queries-file queries.json --providers brave,exa,serper,dataforseo

Add --live --max-usd 1 only when the user intentionally wants a paid provider trial.

For a lightweight smoke test, and only when the user wants to try the flow quickly, run:

python3 <skill_dir>/scripts/radar_run.py weekly --sectors all --output-dir <output_dir> --first-pass

--first-pass uses one query per sector and a 45-second per-query cap. Do not use first-pass mode to judge final Marathon radar quality.

The weekly artifact must not silently omit sectors or source quality. If a sector has no qualified candidates, render a sector coverage note explaining whether the cause was no source evidence, source-not-candidate-eligible evidence, weak evidence, or missing grounded web/company enrichment. If Product Hunt, YC, X, hard evidence, Attio, or non-OSS company rows are missing or thin, rely on the generated quality gate instead of writing around the gap.

For lower-level debugging, run collection and preview separately:

python3 <skill_dir>/scripts/radar_run.py collect --output-dir <output_dir>

This writes raw evidence JSON, filters obvious GitHub/Reddit noise, uses the bundled Python 3.12 runtime for last30days when available, and keeps collection separate from investor judgment.

To build an automatic scored preview from saved evidence:

python3 <skill_dir>/scripts/radar_run.py preview --from-evidence <output_dir>/<YYYY-MM-DD>-raw-evidence.json --output <output_dir>/<YYYY-MM-DD>-auto-scored-preview.md

The automatic preview is intentionally conservative: it extracts candidate companies/projects, applies first-pass Investment Interest and Evidence Confidence scores, merges Attio context when ATTIO_ACCESS_TOKEN is present, filters low-interest candidates, and renders only Medium/High-interest rows. If it underfills the table, label the run as thin/partial through the generated quality gate. Do not lower the threshold, and do not replace the canonical packet with freestyled synthesis.

The preview schema includes LinkedIn, Founders, and X columns. Fill them only from evidence, Attio/CRM fields, structured seed input, or explicit user-provided data. Do not invent LinkedIn or founder links. Without grounded web or LinkedIn-capable evidence, leave those cells blank and treat them as next diligence.

Step 2: Check for Previous Briefing (Week-over-Week)

python3 <skill_dir>/scripts/persistence.py load-previous --sector <SECTOR> --before $(date +%Y-%m-%d)

If a previous briefing exists, save it for comparison in Step 7.

Also load the theme index to check for durable themes:

cat <skill_dir>/data/history/theme_index.json 2>/dev/null || echo "{}"

Use this to identify themes that have appeared in 3+ consecutive scans — these are candidates for the "Durable" section of the week-over-week comparison.
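The "3+ consecutive scans" check can be sketched as a streak count. The real layout of theme_index.json is not shown here, so this example assumes a simple mapping from theme name to the ISO dates of scans in which it appeared, and it reads "consecutive" as a streak ending at the most recent scan; both are assumptions, and the function names are hypothetical.

```python
def consecutive_run(appearances, scan_dates):
    """Count how many of the most recent scans, without a gap, include the theme."""
    seen = set(appearances)
    streak = 0
    # ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
    for scan in sorted(scan_dates, reverse=True):
        if scan not in seen:
            break
        streak += 1
    return streak


def durable_candidates(theme_index, scan_dates, min_streak=3):
    """Themes whose current streak meets the threshold, in alphabetical order."""
    return sorted(
        theme
        for theme, dates in theme_index.items()
        if consecutive_run(dates, scan_dates) >= min_streak
    )


scans = ["2026-05-04", "2026-05-11", "2026-05-18", "2026-05-25"]
index = {
    "agent evals": ["2026-05-11", "2026-05-18", "2026-05-25"],
    "browser sandboxing": ["2026-05-04", "2026-05-11", "2026-05-25"],
}
candidates = durable_candidates(index, scans)
```

Here only "agent evals" qualifies, because "browser sandboxing" missed the 2026-05-18 scan and its current streak is one.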

Step 3: Select Retrieval Path

python3 <skill_dir>/scripts/last30days_adapter.py check

If installed is true -> use last30days path. The engine has useful zero-config free sources (reddit, hackernews, github, polymarket) even when no optional keys are configured. Otherwise -> use WebSearch path.

Tell the user which path you're using:

"Using last30days for multi-source research. Free sources are active; optional keys unlock X, YouTube, TikTok/Instagram, and grounded web."
"Using web search for research. For deeper coverage across Reddit, HN, X, and YouTube, run /vc-signals setup."

Step 4: Retrieve Evidence

WebSearch path:

Generate 8-12 search queries from the taxonomy. Use the sector's discovery_queries plus queries built from each subcategory's seed_queries. Run each query using the WebSearch tool. Collect all results.

Example query generation for devtools:

Use the sector's discovery_queries directly (6 queries)
Pick the most important seed query from each subcategory (6 queries)
Total: ~12 queries

For each query, use WebSearch. Collect titles, URLs, snippets.

Filtering noise: Check the sector's negative_terms list from the taxonomy config. Skip or deprioritize results that are clearly tutorial content, beginner guides, or consumer product reviews. These terms exist to reduce noise — use them when evaluating search results.
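The query-building and noise-filtering steps above might look like this in Python. The field names discovery_queries, seed_queries, and negative_terms come from the text; the nesting of subcategories, the use of the first seed query as "most important", and the `title`/`snippet` result fields are assumptions of this sketch.

```python
def build_queries(sector_config, limit=12):
    """Combine discovery queries with one seed query per subcategory, deduplicated."""
    queries = list(sector_config.get("discovery_queries", []))
    for subcategory in sector_config.get("subcategories", {}).values():
        seeds = subcategory.get("seed_queries", [])
        if seeds:
            queries.append(seeds[0])  # assumption: first seed is the most important
    unique = list(dict.fromkeys(queries))  # drop repeats, keep original order
    return unique[:limit]


def is_noise(result, negative_terms):
    """True when a result's title or snippet contains any negative term."""
    text = f"{result.get('title', '')} {result.get('snippet', '')}".lower()
    return any(term.lower() in text for term in negative_terms)


config = {
    "discovery_queries": ["AI code review tools", "platform engineering portal"],
    "subcategories": {
        "testing": {"seed_queries": ["agent evals", "flaky test detection"]},
        "security": {"seed_queries": ["AI code review tools"]},
    },
    "negative_terms": ["tutorial", "beginner guide"],
}
queries = build_queries(config)
```

Substring matching on negative terms is deliberately blunt; the text asks to skip or deprioritize such results, so a caller could sort them last instead of dropping them.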

last30days path (if available):

Run 3-5 queries through the adapter with auto-resolve enabled. Auto-resolve discovers the right subreddits, X handles, and GitHub context automatically — no hardcoded lists needed.

Use --lookback-days to control the time window. Default for radar/weekly scans is 14 days. If the user appended a time window like 7d or 30d to the command, use that number of days instead (7d becomes --lookback-days 7).

Curated subreddits from sectors.json are passed alongside --auto-resolve so the known-good list is always covered while auto-resolve discovers new sources.

# Set LOOKBACK_DAYS once for this scan, then reuse across all queries below.
LOOKBACK_DAYS=14   # radar/weekly default; override with the user's time-window digits if specified

# Read curated subreddits from the sector taxonomy.
SUBREDDITS=$(python3 -c "import json; print(','.join(json.load(open('<skill_dir>/config/sectors.json'))['<SECTOR>'].get('subreddits', [])))")

python3 <skill_dir>/scripts/last30days_adapter.py query --topic "<specific theme query>" --sources "reddit,hackernews,x,youtube,github,polymarket,grounding" --subreddits "${SUBREDDITS}" --auto-resolve --store --lookback-days ${LOOKBACK_DAYS}

Run each of the sector's hn_queries from the config, plus 2-3 discovery queries. Auto-resolve handles subreddit and handle targeting.

For vertical-ai and prosumer/creator-heavy workflows, add TikTok/Instagram only when they plausibly carry customer pull:

python3 <skill_dir>/scripts/last30days_adapter.py query --topic "<vertical AI query>" --sources "reddit,hackernews,x,youtube,tiktok,instagram,github,grounding" --tiktok-hashtags "<tags>" --ig-creators "<handles>" --auto-resolve --store --lookback-days ${LOOKBACK_DAYS}

For infra-heavy sectors (devtools, cybersecurity, ai-infra, data-infra, oss), keep TikTok/Instagram off by default unless the user asks.

IMPORTANT: Query strategy matters. Hacker News search works best with specific, concise queries — NOT broad category dumps. Use 2-4 focused keywords per query.

Good queries (specific, get results):

"AI coding agent Cursor Claude Code"
"supply chain attack secrets security"
"AI code review tools"
"platform engineering internal developer portal"

Bad queries (too broad, return 0 results):

"CI CD testing observability platform engineering" (too many unrelated keywords)
"emerging developer tools trends 2026" (too generic)

After collecting results, filter by engagement:

HN: prioritize items with points > 20 or comments > 10
Reddit: prioritize items with score > 10 or num_comments > 5
X: prioritize items with high engagement
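The engagement thresholds above translate into a small prioritization helper. The numeric cutoffs and the points, comments, score, and num_comments names come from the text; the `source` field, the item dictionaries, and treating every other source as high-signal (no numeric threshold is given for X) are assumptions.

```python
def is_high_signal(item):
    """Apply the per-source engagement thresholds from the list above."""
    source = item.get("source")
    if source == "hackernews":
        return item.get("points", 0) > 20 or item.get("comments", 0) > 10
    if source == "reddit":
        return item.get("score", 0) > 10 or item.get("num_comments", 0) > 5
    # No numeric threshold is stated for other sources; keep them as-is.
    return True


def prioritize(items):
    """Stable sort that moves high-signal items first without dropping anything."""
    return sorted(items, key=lambda item: not is_high_signal(item))


items = [
    {"id": "a", "source": "hackernews", "points": 5, "comments": 2},
    {"id": "b", "source": "reddit", "score": 3, "num_comments": 9},
    {"id": "c", "source": "hackernews", "points": 120, "comments": 40},
]
ordered = [item["id"] for item in prioritize(items)]
```

Sorting rather than filtering matches the word "prioritize": low-engagement items stay available as supporting evidence.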

If auto-resolve fails or last30days results are thin, supplement with WebSearch using site: targeting:

"<topic> site:news.ycombinator.com"
"<topic> site:reddit.com/r/programming"

Step 5: Retrieve GitHub Trending Data

python3 <skill_dir>/scripts/github_trending.py --sector <SECTOR> --limit 15

This runs in addition to the general retrieval. GitHub data feeds into momentum scoring and company mapping.

Step 6: Synthesize Themes

Now you have all the evidence. This is where your reasoning does the heavy work.

Extract candidate themes:

Read through ALL retrieved evidence (search results, last30days items, GitHub repos)
Identify recurring topics, problems, technologies, or shifts mentioned across multiple sources
Name each candidate theme concisely (e.g., "AI-Powered Code Review", "Browser Sandboxing for AI Agents")
Tag each with the relevant subcategory from the taxonomy

Cluster and deduplicate:

Merge near-duplicate themes. Examples:
- "AI code review" + "LLM-powered code review" + "automated PR review" -> "AI-Powered Code Review"
- "shift-left security" + "developer-first security" -> "Developer-First Security Tooling"
Pick canonical names that are specific enough to be useful, generic enough to cover the cluster

Filter low-yield themes (Phase 1 radar requirement):

After clustering, drop any theme that produces fewer than 3 mappable companies in Step 7. A theme without investable companies is signal noise — better to surface 6 themes × 8 companies than 12 themes × 2 companies.

Operationally: do a quick first pass through Step 7's company mapping for each candidate theme. If a theme yields fewer than 3 companies (across seed map, evidence, and GitHub data combined), drop it from the radar before scoring momentum. Note the dropped themes in the persistence record so future scans can detect when a previously-dropped theme starts producing companies.
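A minimal sketch of the low-yield filter, assuming the quick first-pass mapping produces a dictionary of theme name to company names. `split_by_yield` is a hypothetical helper; it returns the dropped themes separately so they can be noted in the persistence record.

```python
def split_by_yield(theme_companies, min_companies=3):
    """Keep themes with enough distinct companies; report the rest as dropped."""
    kept, dropped = {}, []
    for theme, companies in theme_companies.items():
        # Count distinct companies across seed map, evidence, and GitHub data.
        unique = {name.strip().lower() for name in companies if name.strip()}
        if len(unique) >= min_companies:
            kept[theme] = sorted(unique)
        else:
            dropped.append(theme)
    return kept, dropped


first_pass = {
    "AI-Powered Code Review": ["CodeRabbit", "Graphite", "Greptile", "coderabbit"],
    "Browser Sandboxing for AI Agents": ["Browserbase", "browserbase "],
}
kept, dropped = split_by_yield(first_pass)
```

Case and whitespace are normalized before counting so that one company spelled two ways does not push a theme over the threshold.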

Score each theme -- Momentum (1-10):

Assign a transparent momentum score. For each theme, weigh these factors:

Factor	Weight	What to look for
Recency	High	Are discussions from the last 1-2 weeks?
Source diversity	High	Does it appear across multiple independent sources?
Repetition density	Medium	How many distinct mentions vs one viral post?
Engagement signals	Medium	High upvotes/comments/stars on related content?
Novelty	Medium	Is this a NEW conversation or evergreen background chatter?
Technical specificity	Low	Specific tools/approaches mentioned vs vague hand-waving?
GitHub velocity	Low	Are related repos showing star acceleration?

Rubric:

8-10: Breakout -- multiple sources, very recent, specific tools named, high engagement, new conversation
5-7: Rising -- clear signal but fewer sources, or overlaps with existing known trends
3-4: Ambient -- mentioned but not clearly accelerating, could be background noise
1-2: Faint -- single source, low engagement, or possibly stale

You MUST explain how you arrived at each score in 1-2 sentences.
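
One way to keep the score transparent is to rate each factor and combine the ratings with explicit weights. The sketch below is illustrative only: the numeric weights (High = 3, Medium = 2, Low = 1) and the 0-1 rating scale are assumptions, not values defined by the skill. The band boundaries follow the rubric above.

```python
# Hypothetical numeric weights for the High / Medium / Low labels above.
WEIGHTS = {
    "recency": 3, "source_diversity": 3,
    "repetition_density": 2, "engagement": 2, "novelty": 2,
    "technical_specificity": 1, "github_velocity": 1,
}


def momentum_score(ratings):
    """Combine per-factor ratings in [0, 1] into a 1-10 momentum score."""
    total = sum(WEIGHTS.values())
    weighted = sum(WEIGHTS[f] * ratings.get(f, 0.0) for f in WEIGHTS)
    return round(1 + 9 * weighted / total)


def momentum_band(score):
    """Map a 1-10 score onto the rubric bands."""
    if score >= 8:
        return "Breakout"
    if score >= 5:
        return "Rising"
    if score >= 3:
        return "Ambient"
    return "Faint"
```

The per-factor ratings double as the raw material for the required 1-2 sentence explanation.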

Rate confidence (low / medium / high):

High: Multiple independent sources, specific evidence, corroborated
Medium: Clear signal but limited sources or partially inferred
Low: Thin evidence, single source, or extrapolated

Assess investment timing (early / mid / late):

Early: Problem is being discussed, OSS projects emerging, no clear commercial winners
Mid: Commercial players exist, some funding, category is forming
Late: Well-known category, established players, late-stage rounds

Hype vs Durable verdict: One blunt sentence. Example: "Durable -- real pain point with multiple well-funded solutions." or "Likely hype -- single viral post driving most of the signal, unclear staying power."

Theme tags are computed in Step 9 from a snapshot of the theme index taken BEFORE the index is updated (computing them after the update would hide NEW entries). Do not pre-compute tags here.

Step 7: Map Companies

For each surviving theme (after Step 6 filtering), identify 8-12 relevant companies using three sources:

Seed map: Check company_aliases.json — does any known company map to this theme?
Evidence: Were any companies/projects mentioned in the search results for this theme?
GitHub data: Do any trending repos from Step 5 relate to this theme?

Target: 30-50 total companies across the radar. This is the centerpiece of the output, not an accessory. For `all` runs, spread coverage across sector sections so AI infra does not crowd out every other sector.

Deduplicate across themes:

A company that appears in evidence for multiple themes (e.g., CodeRabbit in both "AI Code Review" and "Agentic Coding") must appear in the company list ONCE, not twice. Pick the strongest theme as primary_theme, list the others in secondary_themes. The output table shows only primary_theme.

To pick primary_theme:

Prefer the theme with the most direct evidence
If tied, prefer the theme where the company is direct_solver over beneficiary
If still tied, pick the theme with higher momentum
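
The three tie-breakers above map naturally onto a sort key. This is a sketch under stated assumptions: the candidate keys (`evidence_count`, `role`, `momentum`) are hypothetical, and ranking `adjacent` and `unclear` below `beneficiary` extends the stated rule.

```python
# Lower rank wins; only direct_solver over beneficiary is stated in the rule.
ROLE_RANK = {"direct_solver": 0, "beneficiary": 1, "adjacent": 2, "unclear": 3}


def pick_primary_theme(candidates):
    """Return (primary_theme, secondary_themes) for one company.

    Each candidate is a dict with hypothetical keys: theme, evidence_count,
    role, momentum.
    """
    ranked = sorted(
        candidates,
        key=lambda c: (
            -c["evidence_count"],           # 1. most direct evidence
            ROLE_RANK.get(c["role"], 3),    # 2. direct_solver before beneficiary
            -c["momentum"],                 # 3. higher momentum
        ),
    )
    return ranked[0]["theme"], [c["theme"] for c in ranked[1:]]
```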

For each company, capture:

| Field | Source | Notes |
|-------|--------|-------|
| `name` | Display form | "MintMCP", "Anysphere (Cursor)" |
| `primary_theme` | Picked above | The theme this company most clearly rides |
| `secondary_themes` | Other themes the company touches | List of theme names |
| `why_on_radar` | One sentence | The single most concrete reason this is investable. Specific signal, not generic. |
| `evidence_url` | Best source | URL of the strongest piece of evidence |
| `role` | direct_solver / beneficiary / adjacent / unclear | Same taxonomy as before |
| `confidence` | confirmed / likely / inferred / speculative | Same as before |
| `stage` | null | Phase 2 fills this; leave null for now |
| `raised` | null | Phase 2 fills this |
| `headcount` | null | Phase 2 fills this |
| `founders` | null | Phase 2 fills this |
| `investment_interest` | high / medium / low or 0-100 | How much Marathon should care if the evidence is true |
| `evidence_confidence` | high / medium / low or 0-100 | How well-supported the claim is |
| `attio_status` | no_match / stale / no_owner / active / passed / unknown | Phase 4 fills this; use unknown until Attio is connected |
| `action` | assign owner / refresh note / monitor only / flag quietly / ignore | Use judgment; passed companies get flag quietly |
| `why_this_may_be_noise` | One sentence | Skeptical counter-case |
| `source_links` | List | 1-3 supporting URLs |

The why_on_radar field is critical. It's what Alex reads in the radar table. Bad: "AI testing company". Good: "AI-native test gen, launched 3 weeks ago, ex-Datadog founders". One sentence, specific signal.

Company tags are computed in Step 9 from a snapshot of the company index taken BEFORE the index is updated. Do not pre-compute tags here.

Do NOT:

Pretend to know things you don't (especially funding amounts — leave null until Phase 2)
Map a company to a theme just because the names sound related
Duplicate companies across themes — pick one primary

Step 7.5: Match Against Attio

If ATTIO_ACCESS_TOKEN is available, enrich the company rows before formatting:

python3 <skill_dir>/scripts/attio.py check

Then pass the deduplicated company list:

printf '%s' '<companies-json>' | python3 <skill_dir>/scripts/attio.py enrich

Input shape:

{"companies":[{"name":"Cascade","domain":"runcascade.com"}]}

Use the returned fields:

attio_status: no_match, no_owner, active, stale, passed, or unknown
attio_action: assign owner, refresh note, monitor only, flag quietly, etc.
attio_lists: human-readable list names
attio_attributes: selected CRM fields such as status, last interaction, round, headcount, and amount raised when present

Matching rule: prefer company domain over name. The Attio client verifies stored domains before accepting a domain-based fuzzy search result, because Attio search can return similarly named but wrong companies. If no domain is known, use company name but mark weak matches conservatively.

Status rule:

no_match -> new discovery; action assign owner
no_owner -> already in Attio but needs assignment; action assign owner
stale -> old pipeline or stale interaction context; action refresh note
passed or Deprioritized -> action flag quietly and explain what changed
active -> avoid duplicate outreach; action monitor only or enrich existing context
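
The status rule above reduces to a small lookup. The sketch below is not the Attio client's code; treating "Deprioritized" as part of the passed bucket and falling back to "monitor only" for unrecognized statuses are both assumptions.

```python
# Default action per Attio status, following the status rule above.
DEFAULT_ACTION = {
    "no_match": "assign owner",
    "no_owner": "assign owner",
    "stale": "refresh note",
    "passed": "flag quietly",
    "active": "monitor only",
}


def action_for(attio_status):
    """Return the default action for a status string."""
    key = attio_status.strip().lower()
    # Assumption: "Deprioritized" is handled like "passed", since the rule
    # lists them together.
    if key == "deprioritized":
        key = "passed"
    # Assumption: anything unrecognized (including "unknown") is only monitored.
    return DEFAULT_ACTION.get(key, "monitor only")
```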

Step 8: Format Output

Output uses the radar format: themes are short headers, the company table is the centerpiece, week-over-week diff lives at the top.

Begin with the week-over-week diff (only if a previous briefing exists):

echo '{"current": <CURRENT_COMPANIES_JSON>, "previous": <PREV_COMPANIES_JSON>}' | \
  python3 <skill_dir>/scripts/persistence.py company-diff

The output gives new_companies and faded_companies. Use these to populate the "New To Radar" and "Faded Off Radar" sections below.
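
For intuition, a week-over-week company diff is a set comparison on company names. The sketch below is not the code in `persistence.py`; it assumes each company is a dict with a `name` key and that matching is case-insensitive.

```python
def company_diff(current, previous):
    """Compare two company lists by case-insensitive name."""
    cur = {c["name"].lower(): c for c in current}
    prev = {c["name"].lower(): c for c in previous}
    return {
        # Present this week, absent last week.
        "new_companies": [cur[k] for k in cur if k not in prev],
        # Present last week, absent this week.
        "faded_companies": [prev[k] for k in prev if k not in cur],
    }
```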

Output template:

## VC Radar: {Sector Display Name} — Week of {YYYY-MM-DD}

### What's Moving

[For each theme, exactly 2 lines:]
- **{Theme Name}** — {TAG (Nw)}. {one-line why-now in <120 chars}.
  Companies riding this: {N}

[6-8 themes max. TAG is one of NEW, ACCELERATING (#prev → #curr), PERSISTENT (Nw), or omitted if no tag. (Nw) means "N weeks active".]

### Company Radar ({N} companies)

| Company | Sector | Theme | Interest | Evidence | Attio | Action | Why On Radar | Why This May Be Noise | Sources |
|---------|--------|-------|----------|----------|-------|--------|--------------|------------------------|---------|
| MintMCP | AI Infra | MCP Infra | High | Medium | unknown | assign owner | First SOC2-compliant MCP gateway, picking up enterprise pilots | New category; buyer urgency unproven | [1](...) |
| CodeRabbit | Devtools | AI Code Review | Medium | High | unknown | monitor only | 2M repos, 13M PRs reviewed | Crowded category; likely consensus | [1](...) |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |

[30-50 rows. For `all`, group by sector after the cross-sector top 10-15. Include NEW / RETURNING / PERSISTENT tags inside Why On Radar when useful instead of adding a separate tag column.]

### New To Radar This Week ({N} companies)

[Bulleted list, max 10. If more than 10 are new, show top 10 by best evidence and note "+N more in the table above".]
- {Company} — {primary_theme}. {why_on_radar}
- ...

### Faded Off Radar ({N} companies)

[Bulleted list, max 10. From company-diff faded_companies.]
- {Company} — last seen {date}, was in {primary_theme}.

### Theme Detail

[For each theme, 2-3 lines max. NOT the long-form analysis from v1. Pointer to deep-dive.]
- **{Theme}** ({company_count} companies) — {2-sentence context}.
  Run `/vc-signals deep "{Theme}"` for full evidence and subthemes.

Sorting rules:

Themes in "What's Moving" are sorted by tag (NEW → ACCELERATING → PERSISTENT → no-tag), then by momentum descending.
Company Radar rows are sorted by primary_theme (matching the order themes appear in "What's Moving"), then alphabetically by company name within each theme.
New To Radar is sorted by primary_theme matching above.
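
The sorting rules can be expressed as two sort keys. This is a sketch that assumes themes carry `name`, `tag`, and `momentum` keys and companies carry `name` and `primary_theme`; tags outside the three listed (for example RETURNING) are placed in the no-tag bucket, which is also an assumption.

```python
TAG_ORDER = {"NEW": 0, "ACCELERATING": 1, "PERSISTENT": 2, None: 3}


def sort_radar(themes, companies):
    """Order themes and company rows per the sorting rules above."""
    # Themes: tag bucket first, then momentum descending.
    themes_sorted = sorted(
        themes, key=lambda t: (TAG_ORDER.get(t.get("tag"), 3), -t["momentum"])
    )
    position = {t["name"]: i for i, t in enumerate(themes_sorted)}
    # Companies: follow theme order, then alphabetical by name.
    companies_sorted = sorted(
        companies,
        key=lambda c: (
            position.get(c["primary_theme"], len(position)),
            c["name"].lower(),
        ),
    )
    return themes_sorted, companies_sorted
```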

Length budget: the entire radar should fit in roughly 150-250 lines. If it's longer, the per-theme detail is too verbose — tighten it.

Slack Delivery Instructions

If the user asks to schedule or post to Slack:

Default schedule: Monday 8:00 AM ET.
Channel is intentionally not hardcoded. Ask for or read the configured channel name at scheduling time.
Slack teaser should include only the top 10-15 cross-sector companies/projects, their sector, Investment Interest, Evidence Confidence, Attio status, action, and one-line noise caveat.
Attach or link the full Markdown briefing. Do not paste the full 30-50 row table into Slack by default.

To modify later: change the channel name in the scheduler/Slack connector configuration, not in the radar logic. If no channel is configured, save the briefing and print: "Slack channel not configured yet; set VC_SIGNALS_SLACK_CHANNEL or update the scheduler prompt with the desired channel."
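
A minimal sketch of the channel lookup, assuming the channel is read from the `VC_SIGNALS_SLACK_CHANNEL` environment variable named above. The `resolve_slack_channel` helper is hypothetical; the fallback message is the one quoted in the text.

```python
import os

NOT_CONFIGURED = (
    "Slack channel not configured yet; set VC_SIGNALS_SLACK_CHANNEL or update "
    "the scheduler prompt with the desired channel."
)


def resolve_slack_channel(env=None):
    """Return (channel, message); channel is None when nothing is configured."""
    env = os.environ if env is None else env
    channel = (env.get("VC_SIGNALS_SLACK_CHANNEL") or "").strip()
    if not channel:
        # Caller should still save the briefing, then print this message.
        return None, NOT_CONFIGURED
    return channel, f"Posting teaser to {channel}"
```

Keeping the lookup outside the radar logic matches the instruction to change the channel in configuration only.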

Step 9: Persist Results

Order matters. Snapshot the OLD indices, compute tags from the snapshots (so NEW tags fire correctly for first-time entries), THEN update the indices, THEN save. Computing tags after the update breaks NEW: the entry exists in the index by then.

Step 9a: Snapshot the OLD indices (before any updates)

OLD_THEME_INDEX=$(cat <skill_dir>/data/history/theme_index.json 2>/dev/null || echo "{}")
OLD_COMPANY_INDEX=$(python3 <skill_dir>/scripts/persistence.py load-company-index)

Step 9b: Compute tags from the OLD snapshots

echo '{"themes": <THEMES_JSON>, "companies": <COMPANIES_JSON>, "theme_index": '"$OLD_THEME_INDEX"', "company_index": '"$OLD_COMPANY_INDEX"'}' | \
  python3 <skill_dir>/scripts/persistence.py compute-tags

The output enriches each theme and company with a tag field (NEW / ACCELERATING / RETURNING / PERSISTENT / null). Use these tags when rendering the markdown for Step 8 — re-render that section using the tagged data before saving.
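
The ordering requirement is easier to see with a toy example. The index shape (`weeks_active`) and the `tag_for` helper below are hypothetical and far simpler than the real `compute-tags` command; the point is only that a NEW check must run against the pre-update snapshot.

```python
def tag_for(name, index):
    """Return "NEW" if the name is absent from the given index."""
    return "NEW" if name not in index else None


old_index = {"AI Code Review": {"weeks_active": 3}}   # snapshot taken first
themes = ["AI Code Review", "MCP Infra"]

# Correct order: tag against the snapshot, then update the index.
tags_before = {t: tag_for(t, old_index) for t in themes}
new_index = {**old_index, **{t: old_index.get(t, {"weeks_active": 1}) for t in themes}}

# Wrong order: after the update every theme is already present,
# so nothing is ever tagged NEW.
tags_after = {t: tag_for(t, new_index) for t in themes}
```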

Step 9c: Update the theme index

cat <<'THEMES_EOF' | python3 <skill_dir>/scripts/persistence.py update-index --sector <SECTOR> --date $(date +%Y-%m-%d)
[the JSON themes array goes here]
THEMES_EOF

Step 9d: Update the company index (NEW in Phase 1)

cat <<'COMPANIES_EOF' | python3 <skill_dir>/scripts/persistence.py update-company-index --sector <SECTOR> --date $(date +%Y-%m-%d)
[the JSON companies array goes here]
COMPANIES_EOF

Step 9e: Save the briefing JSON (themes + companies in one call)

cat <<'BRIEFING_EOF' | python3 <skill_dir>/scripts/persistence.py save-briefing --sector <SECTOR> --retrieval-path <websearch|last30days> --date $(date +%Y-%m-%d)
{"themes": [the tagged themes array], "companies": [the tagged companies array]}
BRIEFING_EOF

Step 9f: Save the rendered markdown

cat <<'MD_EOF' | python3 <skill_dir>/scripts/persistence.py save-markdown --subdir briefings --slug <SECTOR> --date $(date +%Y-%m-%d)
[the rendered radar markdown content goes here]
MD_EOF

If any persistence step fails, warn the user but still display the full briefing inline. Do not crash.

Mode: Agent-Native Research Workbench

Trigger: /vc-signals workbench <run-dir>

Use this when the user wants Codex/Claude's own LLM judgment over a weekly run, especially when grounded company discovery is unavailable or the radar is OSS-heavy.

This mode is a verification workbench, not a canonical radar writer:

It may synthesize source gaps, theme hypotheses, possible companies requiring verification, and next searches.
It must not add rows to candidates.json.
It must not claim company domains, funding, headcount, founders, customers, or stage unless those facts appear in the supplied evidence.
Possible companies stay "requiring verification" until grounded source URLs support them.

When the user types /vc-signals workbench <run-dir>, do the whole flow. Do not make the user run a second prompt.

First create the machine-readable pack:

python3 <skill_dir>/scripts/radar_run.py workbench --from-run <run-dir> --output-dir <run-dir>-workbench

Then read both generated files yourself:

<run-dir>-workbench/research-workbench-prompt.md
<run-dir>-workbench/research-workbench-input.json

Use the JSON as the only factual source and write <run-dir>-workbench/research-workbench.md with:

Partner Notes
Source Gap Diagnosis
Theme Hypotheses
Possible Companies Requiring Verification
Recommended Next Searches

Finally tell the user where the summary was saved. If the requested run directory is missing, ask them to run /vc-signals radar all first.

If the user wants a lead promoted into the weekly radar, first run or request real verification searches that return credible company/product URLs.

Mode: Theme Drill-Down

Trigger: /vc-signals theme "<topic>"

Step 1: Load Config

Read sectors.json and company_aliases.json.

Check if the topic maps to a known subcategory. If yes, use its seed_queries as starting points. If no, generate queries from scratch.

Step 2: Retrieve Evidence

Use the same retrieval path selection as weekly scan (check last30days availability).

Run 5-8 targeted queries about the specific theme. Include:

"[topic] trends emerging"
"[topic] startups companies"
"[topic] open source projects github"
"[topic] hacker news discussion"
"[topic] why now 2026"
"[topic] problems challenges"

last30days path (if available):

Run 3-5 queries through the adapter with auto-resolve enabled. Default lookback for theme drill-downs is 30 days. If the user appended a time window like 7d or 30d to the command, use that number of days instead (7d means 7).

# Set LOOKBACK_DAYS once for this drill-down, then reuse across queries below.
LOOKBACK_DAYS=30   # theme drill-down default; override with the user's time-window digits if specified

python3 <skill_dir>/scripts/last30days_adapter.py query --topic "<topic-specific query>" --sources "reddit,hackernews,x" --auto-resolve --lookback-days ${LOOKBACK_DAYS}

Auto-resolve discovers the right subreddits, X handles, and GitHub context automatically for the theme.
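
Turning a user-supplied window such as 7d into the integer passed to `--lookback-days` can be sketched as follows. The `lookback_days` helper is hypothetical, and falling back to the default on unparseable input is an assumption.

```python
import re

DEFAULT_LOOKBACK_DAYS = 30  # drill-down default stated above


def lookback_days(time_arg=None):
    """Turn a window such as "7d" into an integer day count."""
    if not time_arg:
        return DEFAULT_LOOKBACK_DAYS
    match = re.fullmatch(r"(\d+)d?", time_arg.strip().lower())
    # Unparseable input falls back to the default instead of raising.
    return int(match.group(1)) if match else DEFAULT_LOOKBACK_DAYS
```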

Deep research (if OPENROUTER_API_KEY is configured):

For theme drill-downs, use the deep research mode for comprehensive synthesis with 50+ citations:

python3 <skill_dir>/scripts/last30days_adapter.py query --topic "<topic> emerging trends companies" --deep-research --auto-resolve --lookback-days ${LOOKBACK_DAYS}

This uses Perplexity Sonar Pro (~$0.90 per query) to produce a structured research report with citations. If deep research is not available (no OPENROUTER_API_KEY), fall back to the standard multi-query approach above.

Also run GitHub trending for related keywords:

python3 <skill_dir>/scripts/github_trending.py --sector all --limit 10

(Filter results to those matching the theme in post-processing.)
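
The post-processing filter can be a plain keyword match. The output shape of `github_trending.py` is not shown here, so the field names below (`full_name`, `description`, `topics`) are assumptions, and `matches_theme` is a hypothetical helper.

```python
def matches_theme(repo, keywords):
    """True when any keyword appears in the repo name, description, or topics."""
    haystack = " ".join(
        [
            repo.get("full_name", ""),
            repo.get("description") or "",   # description may be null
            " ".join(repo.get("topics", [])),
        ]
    ).lower()
    return any(k.lower() in haystack for k in keywords)
```

Substring matching is deliberately loose; expect to prune false positives by hand.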

Step 3: Synthesize and Output

## Theme Deep-Dive: [Topic]

### What Is This?
[2-3 sentence explanation of the theme for someone unfamiliar]

### Why Now?
[What changed recently that made this theme emerge? Be specific -- new tech, new problem, regulatory shift, etc.]

### Key Subthemes
- **Subtheme A:** [1-2 sentences]
- **Subtheme B:** [1-2 sentences]
- **Subtheme C:** [1-2 sentences]

### Evidence
- [Source 1: title, URL, key insight]
- [Source 2: title, URL, key insight]
- [Source 3+]

### Adjacent Categories
- [Related theme 1 -- how it connects]
- [Related theme 2 -- how it connects]

### Companies Solving the Problem
| Name | What They Do | Confidence | Source |
|------|-------------|------------|--------|

### Companies Benefiting From the Trend
| Name | How They Benefit | Confidence |
|------|-----------------|------------|

### Relevant OSS Projects
| Project | Stars | Velocity | Commercial Entity |
|---------|-------|----------|------------------|

### Durable vs Hype
[Blunt, honest assessment. 2-3 sentences. What could make this fade? What would confirm it's real?]

### Investment Implications
- **Timing:** Early/Mid/Late
- **What to watch for:** [Specific signals that would confirm or invalidate this theme]
- **Biggest risk:** [One sentence]

Step 4: Persist

cat <<'MD_EOF' | python3 <skill_dir>/scripts/persistence.py save-markdown --subdir themes --name "<topic>" --date $(date +%Y-%m-%d)
[the markdown content goes here]
MD_EOF

Mode: Company Backtrace

Trigger: /vc-signals company "<name>"

Step 1: Check Seed Map

Read company_aliases.json. Check if the company exists. If yes, note its known themes, sectors, and OSS projects.

Step 1.5: Select Retrieval Path

Check if last30days is available (same as weekly scan Step 3). If available, use it for source-specific searches in Step 2.

Step 2: Retrieve Evidence

Run 4-6 queries:

"[company name] trends news"
"[company name] product updates"
"[company name] competitors market"
"[company name] open source projects"
"[company name] funding investment"

last30days path (if available):

Search for the company across sources with auto-resolve. Default lookback for company backtrace is 30 days. If the user appended a time window like 7d or 30d to the command, use that number of days instead (7d means 7).

# Set LOOKBACK_DAYS once for this backtrace, then reuse across queries below.
LOOKBACK_DAYS=30   # company backtrace default; override with the user's time-window digits if specified

python3 <skill_dir>/scripts/last30days_adapter.py query --topic "<company name>" --sources "hackernews,reddit,x" --auto-resolve --quick --lookback-days ${LOOKBACK_DAYS}

If the company has known OSS projects from the seed map, also search for those.

GitHub deep search (if company has known GitHub org or repos):

If the company has known OSS projects from the seed map, search for the repo directly:

python3 <skill_dir>/scripts/last30days_adapter.py query --topic "<company name>" --github-repo "<owner/repo>" --auto-resolve --quick --lookback-days ${LOOKBACK_DAYS}

If you can identify a founder's GitHub username, search their activity:

python3 <skill_dir>/scripts/last30days_adapter.py query --topic "<founder name>" --github-user "<username>" --quick

X/Twitter deep search:

If the company or founder has a known X handle, search their timeline:

python3 <skill_dir>/scripts/last30days_adapter.py query --topic "<company name>" --x-handle "<handle>" --sources "x" --quick

Check GitHub:

python3 <skill_dir>/scripts/github_trending.py --sector all --limit 30

Filter for repos owned by or related to the company.

Step 3: Map to Rising Themes

Cross-reference the evidence against:

Known themes from previous weekly scans (load recent briefings)
Themes apparent from current evidence
Seed map themes

Step 4: Output

## Company Backtrace: [Company Name]

### Overview
[What the company does, 2-3 sentences]

### Theme Exposure
| Rising Theme | Role | Confidence | Evidence |
|-------------|------|------------|----------|
| Theme A | Direct solver | Confirmed | [Brief evidence] |
| Theme B | Beneficiary | Likely | [Brief evidence] |

### OSS / Ecosystem Signals
- [Project 1: stars, velocity, relevance]
- [Project 2 if applicable]

### Competitive Context
[Brief note on who else operates in the same themes. NOT a full competitive analysis -- just enough for context.]

### Confidence Notes
[What you're confident about, what's uncertain, what you couldn't verify]

Step 5: Persist

cat <<'MD_EOF' | python3 <skill_dir>/scripts/persistence.py save-markdown --subdir companies --name "<company name>" --date $(date +%Y-%m-%d)
[the markdown content goes here]
MD_EOF

Mode: OSS Radar

Trigger: /vc-signals oss <sector> [time]

Use this when the user asks for OSS startup discovery, fast-growing repos, open-source companies, or GitHub-first market signal. This mode is conceptually inspired by Gokul Rajaram's OSS Startup Radar, with permission to reuse ideas/code, but VC Signals should adapt it to Marathon's Seed-to-Series-B workflow.

Step 1: Scope

Valid sectors: ai-infra, devtools, cybersecurity, data-infra, vertical-ai, all. If no sector is provided, default to ai-infra for OSS radar.

Step 2: Collect Repo Candidates

Use GitHub trending and last30days together:

python3 <skill_dir>/scripts/github_trending.py --sector <SECTOR> --limit 50

Then search community signal:

python3 <skill_dir>/scripts/last30days_adapter.py query --topic "<repo or theme>" --sources "reddit,hackernews,x,youtube,github" --github-repo "<owner/repo>" --quick --store --lookback-days 30

Step 3: Rank OSS Signal

Do not rank by total stars alone. Score these separately:

30/60/90-day star velocity and stars/day
age-adjusted momentum
community signal from Reddit, HN, X, YouTube, and GitHub discussions/issues where available
contributor/maintainer quality
repo quality: installability, docs, releases, issue health, package/demo path
license and commercialization path
company formation probability
star authenticity/noise risk

Use the Alex/Gokul OSS email as the output bar: top themes, top 25 fast-growing projects, 30/60/90 velocity, community signal, funding/stage label, founder/contributor profiles, and clear methodology. Improve it by adding Investment Interest, Evidence Confidence, action, and "Why This May Be Noise."
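
The 30/60/90-day velocity and stars/day figures can be derived from a star-count history. This is a sketch assuming a date-to-cumulative-stars mapping is available; the skill's own collector may gather the data differently, and `star_velocity` is a hypothetical helper.

```python
from datetime import date


def star_velocity(history, today):
    """Compute stars gained and stars/day over 30, 60, and 90 day windows.

    history maps a date to the cumulative star count on that day
    (hypothetical input shape).
    """
    latest = history[max(d for d in history if d <= today)]
    result = {}
    for window in (30, 60, 90):
        # Use the most recent observation at or before the window start.
        earlier = [d for d in history if (today - d).days >= window]
        # A repo younger than the window has no baseline, so all stars count.
        baseline = history[max(earlier)] if earlier else 0
        gained = latest - baseline
        result[f"+{window}d"] = {"gained": gained, "per_day": round(gained / window, 1)}
    return result
```

For a repo younger than a given window, the longer windows report the same gain as the repo's lifetime total, which is one reason to read velocity alongside age.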

Step 4: Assign Repo Type And Action

Repo type must be one of:

company-backed
founder-likely
ecosystem signal
portfolio-support relevant
high-noise

Action must be one of:

watch
contact maintainer
map ecosystem
track company formation
ignore

Step 5: Output

## OSS Radar: {Sector} -- {YYYY-MM-DD}

### Top Themes
- **{Theme}** — {2 sentence why-now}. Key projects: `{repo}`, `{repo}`, `{repo}`.

### Top 25 OSS Projects

| Repo | Type | Action | Interest | Evidence | +30d | +60d | +90d | Community | Why Now | Why This May Be Noise |
|------|------|--------|----------|----------|------|------|------|-----------|---------|------------------------|
| owner/repo | founder-likely | contact maintainer | High | Medium | +1.5k | +4.9k | +4.9k | 2r+3hn | Agent memory is emerging | Star spike may be launch hype |

### Founder & Contributor Profiles
- **owner/repo** — top contributors, GitHub profiles, LinkedIn if found, caveat that identity should be verified before outreach.

### Methodology
Explain star velocity windows, community source weighting, funding/stage labeling, filtering, and star-authenticity caveats.

Mode: GitHub Trending

Trigger: /vc-signals github <sector>

Step 1: Run GitHub Trending Script

python3 <skill_dir>/scripts/github_trending.py --sector <SECTOR> --limit 15

If sector is all, run for each sector and merge results.
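
Merging per-sector results mostly means removing repos that appear under more than one sector. The sketch below assumes each repo row has `full_name` and `stars_7d` keys; both names are hypothetical, since the script's output format is not shown here.

```python
def merge_trending(per_sector):
    """Merge per-sector repo lists, keeping one row per repo."""
    merged = {}
    for sector, repos in per_sector.items():
        for repo in repos:
            # First sighting creates the row; later ones only add the sector.
            row = merged.setdefault(repo["full_name"], {**repo, "sectors": []})
            if sector not in row["sectors"]:
                row["sectors"].append(sector)
    # Rank by recent growth, matching the velocity-first ordering of this mode.
    return sorted(merged.values(), key=lambda r: r.get("stars_7d", 0), reverse=True)
```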

Step 2: Enrich with Company Mapping

For each repo in the results:

Check company_aliases.json -- is the owner/org a known company?
Check the repo owner type -- is it an organization (likely a company)?
Use your knowledge + the repo description to identify if there's a commercial entity behind it

Step 3: Map to Themes

For each repo, identify which sector themes it relates to. Use the taxonomy subcategories and your judgment.

Step 4: Output

## GitHub Trending: [Sector] -- YYYY-MM-DD

Repos ranked by star velocity (recent growth rate, not absolute count).

| # | Repo | Stars | 7d Growth | 30d Growth | Language | Commercial Entity |
|---|------|-------|-----------|------------|----------|------------------|
| 1 | owner/name | 12,500 | +450 | +1,800 | Rust | Company Name (Confirmed) |
| 2 | ... | | | | | |

### Standout Repos

**[Repo 1: owner/name]**
- **Description:** [What it does]
- **Why it's interesting:** [1-2 sentences -- what's driving the growth?]
- **Theme mapping:** [Which sector themes this relates to]
- **Commercial entity:** [Company behind it, if any. Monetization status if known.]

**[Repo 2: owner/name]**
...

### Acceleration Alerts
[Repos with unusually high velocity relative to their size -- the "0 to 10k in a week" signals]

### Theme Patterns
[Do multiple trending repos point to the same emerging theme? Call it out.]

Step 5: Persist

cat <<'MD_EOF' | python3 <skill_dir>/scripts/persistence.py save-markdown --subdir github --slug <SECTOR> --date $(date +%Y-%m-%d)
[the markdown content goes here]
MD_EOF

Mode: Add Sector

Trigger: /vc-signals add-sector <name> (e.g., /vc-signals add-sector fintech)

This mode lets users add a new sector without editing JSON manually.

Step 1: Get Sector Details

Ask the user for:

Sector name (slug, lowercase, hyphens ok): use what they provided or ask
Display name: suggest one based on the slug, let them confirm

Step 2: Generate Taxonomy

Based on the sector name, propose 4-6 subcategories. For example, if the user says "fintech":

"Here are the subcategories I'd suggest for Fintech:

Payments Infrastructure

Neobanking & Digital Banking

Lending & Credit Platforms

Regtech & Compliance

Embedded Finance

Crypto & DeFi Infrastructure

Want to add, remove, or rename any of these?"

For each subcategory, generate:

name: display name
aliases: 4-6 relevant search terms
seed_queries: 2-3 search queries

Also generate:

subreddits: 5-8 relevant subreddits for the sector
hn_queries: 4-6 HN-optimized short queries
discovery_queries: 4-6 broad discovery queries
negative_terms: 3-5 noise filters
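
Before showing the draft to the user, the counts above can be checked mechanically. This sketch assumes the draft sector is a dict whose subcategories live under a `subcategories` key; that key name and the `check_sector` helper are assumptions, so adjust them to the actual layout of sectors.json.

```python
def check_sector(sector):
    """Return a list of problems; empty means the draft meets the counts above."""
    problems = []

    def in_range(label, items, low, high):
        if not low <= len(items) <= high:
            problems.append(f"{label}: expected {low}-{high}, got {len(items)}")

    subs = sector.get("subcategories", [])
    in_range("subcategories", subs, 4, 6)
    for sub in subs:
        in_range(f"{sub.get('name')} aliases", sub.get("aliases", []), 4, 6)
        in_range(f"{sub.get('name')} seed_queries", sub.get("seed_queries", []), 2, 3)
    in_range("subreddits", sector.get("subreddits", []), 5, 8)
    in_range("hn_queries", sector.get("hn_queries", []), 4, 6)
    in_range("discovery_queries", sector.get("discovery_queries", []), 4, 6)
    in_range("negative_terms", sector.get("negative_terms", []), 3, 5)
    return problems
```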

Step 3: Confirm and Save

Show the user the complete sector JSON and ask for confirmation. Then:

cat <skill_dir>/config/sectors.json

Read the current config, add the new sector, and write it back:

python3 -c "
import json
from pathlib import Path
config_path = Path('<skill_dir>/config/sectors.json')
config = json.loads(config_path.read_text())
config['<sector_slug>'] = <new_sector_dict>
config_path.write_text(json.dumps(config, indent=2))
print('Sector added successfully')
"

Step 4: Confirm

"Added <display name> with <N> subcategories. Try it out: /vc-signals weekly <sector-slug>"

Graceful Degradation

At every step, handle failures gracefully:

last30days not available: Use WebSearch. Say so. Still produce useful output.
GitHub API rate limited: Use partial results. Warn the user. Suggest running again later.
GitHub token missing: Say "GitHub trending requires a token. Run /vc-signals setup to configure one." Still run the rest of the scan.
Persistence fails: Display full output inline. Warn that it wasn't saved. Do not crash.
WebSearch returns thin results: Note limited coverage. Still extract what themes you can. Be honest about confidence.
Unknown sector: List valid sectors. Don't guess.
No previous briefing for comparison: Skip the week-over-week section. Say "This is your first scan for [sector]. Future scans will include week-over-week comparisons."

Never crash. Never pretend you have data you don't. Always tell the user what worked and what didn't.
