From brightdata-plugin
Produces cited research briefs from live web data using Bright Data's Discover API by decomposing questions into multiple intent-ranked queries and synthesizing findings with inline citations.
How this skill is triggered — by the user, by Claude, or both
Slash command
/brightdata-plugin:live-researchThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Turn one research question into a **cited, synthesized brief** by fanning out
Turn one research question into a cited, synthesized brief by fanning out
intent-ranked Discover queries, reading the best sources, and writing up findings
with inline citations. This is a workflow on top of the discover-api skill
— read that for the API mechanics, modes, and parameters.
Use this when the deliverable is understanding (a report/briefing), not a link
list (that's search/discover-api) and not a standing system (that's
rag-pipeline).
Discover must be reachable. Quick check (CLI path):
command -v bdata >/dev/null 2>&1 || echo "CLI missing — see bright-data-best-practices/references/cli-setup.md"
bdata zones >/dev/null 2>&1 || echo "not authenticated — run: bdata login"
(SDK/REST paths just need BRIGHTDATA_API_TOKEN.)
If the question is broad or ambiguous, ask 2–3 clarifying questions before
spending API calls: time horizon, geography/market, depth, and what decision the
research supports. A sharp scope is what makes the intent parameters good.
Break the topic into 4–8 angles (definitions, key players, mechanisms, evidence,
counter-evidence, recent developments, risks). Each angle becomes one Discover
call with its own tailored intent. This beats one broad query — num_results
is capped at 20, so coverage comes from breadth of queries, not one big call.
# one call per angle; --include-content so you read sources in the same pass
bdata discover "stablecoin regulation 2026" \
--intent "recent regulatory actions and proposed legislation, primary sources" \
--include-content --num-results 15 -o angle_regulation.json &
bdata discover "stablecoin reserve transparency" \
--intent "audits, attestations, reserve composition disclosures" \
--include-content --num-results 15 -o angle_reserves.json &
wait
For maximum coverage on a hard topic, use the raw REST flow with "mode":"deep"
(see discover-api) — deep is exhaustive but slower and REST-only.
bdata discover -o file is an object {status, results: [...]} — flatten .results[] from every file before merging.relevance_score desc.relevance_score can still be a 404 stub or a nav-only page): drop rows where content is null, matches a block-page signature, is shorter than ~200 chars, or looks like "not found".# VERIFIED: this is the correct merge. `jq -s 'add | unique_by(.link)'` does NOT work —
# each file is {results:[...]}, so you must flatten .results[] first.
jq -s '
[ .[].results[] ] # flatten results from all files
| unique_by(.link) # dedup by URL
| map(select(
.content != null
and (.content | length) > 200 # drop empty / 404 stubs
and ((.content | test("just a moment|captcha|access denied|cf-browser|page not found|post not found"; "i")) | not)
))
| sort_by(-.relevance_score)
' angle_*.json > corpus.json
echo "kept $(jq length corpus.json) sources"
Or just run the helper (same logic, tested): scripts/merge_corpus.sh -o corpus.json angle_*.json
(-m <n> sets the min content length). Copying the jq by hand is error-prone — prefer the script.
Note: with
--include-content, the leading part ofcontentis usually page nav/boilerplate (menus, logos). When extracting claims (Step 5), skip past the chrome to the article body.
From each kept source's content, pull the specific claims, numbers, dates, and
quotes that answer a sub-question. Track which URL each claim came from — you'll
cite it.
Write the structured brief (template in references/brief-template.md). Every
non-obvious claim gets an inline citation [n] mapping to a numbered source list.
Note disagreements between sources rather than averaging them away.
intents > one vague query.[n] to a real URL from the corpus.relevance_score is comparable.content — every claim must come from the corpus.relevance_scores — if a call failed, report the gap.--include-content and just listing links — that's discover-api, not research.references/brief-template.md — the output structure (exec summary, findings per sub-question, contradictions, gaps, numbered sources) and a worked citation example.scripts/merge_corpus.sh — Step 4 as a tested one-liner: flatten .results[] across angle files, dedup by URL, quality-gate (null/short/404/block-page), sort by relevance_score.discover-api — the underlying API (modes, params, trigger/poll). Read first.rag-pipeline — when the user wants a reusable retrieval system, not a one-time report.competitive-intel — competitor-focused research (pricing, hiring, positioning).brand-listening — social/sentiment research across Reddit/X/TikTok/etc.npx claudepluginhub brightdata/skills --plugin brightdata-pluginMulti-source deep research using firecrawl and exa MCPs. Searches the web, synthesizes findings, and delivers cited reports with source attribution.
Runs deep research on topics via web searches, data collection, source verification, synthesis, and structured reports saved to 02-research/. Use for reports, blogs, or exploring new domains beyond quick answers.
Multi-source deep research using firecrawl and exa MCPs to search, scrape, and synthesize cited reports with source attribution. Use for in-depth topic exploration.