From brightdata-plugin
Extracts web data via Bright Data JS/TS SDK: scrape URLs, platform-specific scrapers (Amazon, LinkedIn, Instagram, etc.), SERP search, AI discovery, datasets, browser automation, and Scraper Studio.
How this skill is triggered — by the user, by Claude, or both
Slash command
/brightdata-plugin:js-sdk-best-practicesThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Access web data through a unified Node.js/TypeScript SDK (`@brightdata/sdk`). One
Access web data through a unified Node.js/TypeScript SDK (@brightdata/sdk). One
client, several services: web unlocking (scrapeUrl), platform scraping
(scrape.<platform>), SERP search (search.google/bing/yandex), AI discovery
(discover), datasets, browser automation, and Scraper Studio.
Requires Node.js ≥ 20. Ships ESM + CommonJS with full TypeScript types. The
client is exported as bdclient (lowercase — not a typo).
npm install @brightdata/sdk # or: pnpm add / yarn add
The client reads the BRIGHTDATA_API_TOKEN env var, or you pass { apiKey }.
Get a token at https://brightdata.com/cp/setting/users.
// ESM (package.json "type": "module", or .mjs)
import { bdclient } from '@brightdata/sdk';
// CommonJS (.cjs / "type": "commonjs")
const { bdclient } = require('@brightdata/sdk');
const client = new bdclient(); // reads BRIGHTDATA_API_TOKEN
// const client = new bdclient({ apiKey: '...' });
try {
const html = await client.scrapeUrl('https://example.com');
} finally {
await client.close(); // always close (or `await using`)
}
await using client = new bdclient() (TS 5.2+ / Node ≥20) auto-closes at scope end.
Pick the service BEFORE reaching for a specific method. Most routing mistakes come from skipping this step and pattern-matching on keywords.
Have a URL?
├── On a supported platform (Amazon, LinkedIn, Facebook, Instagram, YouTube,
│ TikTok, Reddit, Pinterest, ChatGPT, Perplexity, DigiKey)?
│ → Platform scraping: client.scrape.<platform>.<method>(urls, opts?)
│
├── Generic page (no dedicated platform scraper)?
│ → Web unlocker: client.scrapeUrl(url, { dataFormat: 'markdown' })
│
└── Need login / JS / click-scroll-fill / CAPTCHA / multi-step nav?
→ Browser API: client.browser.getConnectUrl() (connect via Playwright)
No URL?
├── Want entities matching natural-language criteria
│ ("find AI startups in Berlin", "competitors of Acme")?
│ → Discover: client.discover(query, { intent })
│
├── Want web pages / search-result links ("search Google for X")?
│ → SERP: client.search.google(query) [or .bing / .yandex]
│
├── Want to search WITHIN a platform ("Amazon products by keyword",
│ "LinkedIn jobs", "Instagram reels by profile")?
│ → Platform discovery: client.scrape.<platform>.discover*(filters)
│ or amazon.productSearch(...) (NOTE: client.search is SERP-only here)
│
└── Want bulk/historical data at scale?
→ Datasets: client.datasets.<name>.query(filter) → .download(snapshotId)
Edge cases:
client.scrapeUrl() (web unlocker).If you know the Python SDK (brightdata-sdk), the JS surface differs — do not
port names blindly:
| Concept | Python | JavaScript |
|---|---|---|
| Client | SyncBrightDataClient / BrightDataClient | bdclient (single, all async) |
| Web unlocker | client.scrape_url(url=...) | client.scrapeUrl(url, opts) |
| Platform search | client.search.amazon.products(...) | none — use scrape.amazon.productSearch / discover* |
client.search | SERP and platform search | SERP only (google/bing/yandex) |
| Batch | *_trigger methods + job.wait() | pass a string[] to one call, or *Trigger + job.wait() |
| Naming | snake_case | camelCase |
| Datasets | client.datasets.amazon_products | client.datasets.amazonProducts |
Before claiming a platform method exists/doesn't exist, consult
references/scrapers.md — it lists every platform's verified methods. The SDK
ships TypeScript types, so in a typed project you can also let the compiler/editor
confirm a method exists.
Each platform exposes up to three method styles (see references/scrapers.md):
collect<Thing>(input, opts?) — returns the rows directly (object[]).<thing>(input, opts?) — orchestrated (trigger → poll → download); returns { data, status, rowCount }. Default to this.discover<Thing>By<X>(filters, opts?) — find items by keyword/category/URL filter instead of by direct URL.Likely hallucinations (do NOT write these — verify in references/scrapers.md):
| Wrong | Right |
|---|---|
client.search.amazon.products(...) | client.scrape.amazon.productSearch(...) (no platform search router in JS) |
client.scrape.linkedin.people(...) | client.scrape.linkedin.profiles(urls) |
client.scrape.chatgpt... | client.scrape.chatGPT... (camelCase G+T) |
client.datasets.amazon_products | client.datasets.amazonProducts |
BrightDataClient / new BdClient() | bdclient (all lowercase) |
| Method | What it does |
|---|---|
client.scrapeUrl(url | url[], opts?) | Web unlocker — any URL → html / markdown / json / screenshot. Pass an array for parallel batch. |
client.search.google(q | q[], opts?) | SERP results (also .bing, .yandex). Array = batch. |
client.discover(query, { intent }) | AI-ranked entity/page discovery. |
client.discoverTrigger(query, opts?) | Non-blocking discover → Job with .wait() / .fetch(). |
client.datasets.list() | List all available datasets at runtime. |
client.scraperStudio.run(collectorId, { input }) | Run a custom Scraper Studio collector. |
client.browser.getConnectUrl({ country }) | CDP WebSocket URL for Playwright/Puppeteer/Selenium. |
client.listZones() | List active Bright Data zones. |
client.saveResults(data, { filename, format }) | Write results to a file. |
client.close() | Close HTTP connections. Always call when done. |
await client.close() (or await using). The client holds a keep-alive transport; leaking it hangs the process.client.search is SERP-only (google/bing/yandex). To search within a platform, use that platform's discover* / productSearch scraper methods — there is no client.search.amazon.chatGPT is camelCase on client.scrape — client.scrape.chatGPT.search(...), not .chatgpt.scrapeUrl([...urls]) and search.google([...queries]) run in parallel internally. For platform scrapers with many inputs, prefer the orchestrated method with an array, or the *Trigger + job.wait() pattern (see references/advanced.md). Don't wrap blocking calls in Promise.all of single-item calls — you'll fight the rate limiter.products/profiles/etc. trigger a job and poll until ready (often 2–10 min). Don't set tiny pollTimeouts; tune via { pollInterval, pollTimeout }. Defaults are calibrated.scrapeUrl. query() returns a snapshotId; download() blocks until the snapshot is ready.client.scrapeUrl().timeout / rateLimit on the constructor instead.All errors extend BRDError. Import the specific classes to branch:
import { bdclient, ValidationError, AuthenticationError, BRDError } from '@brightdata/sdk';
try {
const data = await client.scrape.amazon.products(['https://www.amazon.com/dp/B0D77BX8Y4']);
} catch (err) {
if (err instanceof AuthenticationError) { /* bad/expired token */ }
else if (err instanceof ValidationError) { /* bad input */ }
else if (err instanceof BRDError) { /* any SDK error */ }
else throw err;
}
Classes: BRDError (base), ValidationError, AuthenticationError, ZoneError,
NetworkError, NetworkTimeoutError, TimeoutError, APIError,
DataNotReadyError, FSError.
const md = await client.scrapeUrl('https://example.com/article', { dataFormat: 'markdown' });
const res = await client.scrape.amazon.products(['https://www.amazon.com/dp/B0D77BX8Y4']);
console.log(res.status, res.rowCount, res.data);
const results = await client.search.google(['best running shoes', 'best trail shoes'], { country: 'us' });
const startups = await client.discover('AI startups in Berlin', {
intent: 'early-stage machine-learning companies',
numResults: 10,
});
const ds = client.datasets;
const snapshotId = await ds.instagramProfiles.query({ url: 'https://www.instagram.com/natgeo/' }, { records_limit: 50 });
// download() blocks until the snapshot is ready
const rows = await ds.instagramProfiles.download(snapshotId);
AuthenticationError / 401 — token missing or invalid. Check BRIGHTDATA_API_TOKEN or the apiKey option.client.scrapeUrl() (web unlocker).timeout (constructor, 1000–300000 ms) or the orchestrated pollTimeout; do not lower it.client.datasets.list(); dataset names are camelCase (amazonProducts, linkedinProfiles).await client.close().references/advanced.md for zone/SSL options.{ rateLimit, ratePeriod } on the constructor, or batch via arrays / the trigger pattern instead of Promise.all.references/scrapers.md — when the user names Amazon, LinkedIn, Facebook, Instagram, YouTube, TikTok, Reddit, Pinterest, ChatGPT, Perplexity, DigiKey — verified per-platform method tables (collect / orchestrated / discover) and the scrapeUrl web-unlocker options.references/search.md — SERP engines (search.google/bing/yandex) and the Discover API (discover / discoverTrigger), with all options.references/datasets-overview.md — dataset names, the query → getStatus → download lifecycle, and filters.references/advanced.md — constructor options & env vars, batch/trigger orchestration, Browser API (Playwright/Puppeteer), Scraper Studio, error classes, and zones/SSL.npx claudepluginhub brightdata/skills --plugin brightdata-pluginExtracts web data from platforms (Amazon, LinkedIn, Instagram, etc.) and generic sites using the Bright Data Python SDK. Covers scraping, search, datasets, browser automation.
Scrape structured data from JavaScript-rendered pages, SPAs, or interactive sites using Bright Data Scraping Browser with Playwright or Puppeteer.
Scrapes web pages, SERPs, e-commerce, YouTube, and ChatGPT via ScrapingBee CLI with smart path extraction and AI rules for LLMs. Handles JS, CAPTCHAs, anti-bot automatically.