Skill

agent-webbridge

Drive the user's REAL Chrome — multiple profiles with their LIVE logins, and MULTIPLE TABS PER PROFILE, all IN PARALLEL — through agent-webbridge. Clean-room, open-source (MIT), no account, no telemetry. Automates the user's actual Chrome with their real logged-in sessions (not headless/scrape like Playwright or Firecrawl). Use for any task needing a real browser across one or more logged-in Chrome profiles: multi-account workflows, acting as the user across several accounts at once, or driving N tabs in one profile concurrently.

Popularity

Stars

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/agent-webbridge:agent-webbridge

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Clean-room, open-source (MIT) browser automation for AI agents. Drives the user's **actual**

SKILL.md

328 lines · ~4.7k tokens

Stats

LanguageJavaScript

Stars3

MaintenanceExcellent

Last CommitJun 16, 2026

Actions

View Source View Plugin View on GitHub View README

Stats

Actions

agent-webbridge

Clean-room, open-source (MIT) browser automation for AI agents. Drives the user's actual Chrome — multiple profiles with their live logins, and multiple tabs per profile, all in parallel. A lightweight Node daemon runs per profile on a deterministic hashed port; a router on http://127.0.0.1:10086 proxies /command to the right daemon by a top-level "profile" field. macOS-first (Google Chrome).

agent-webbridge is clean-room and standalone: its own daemon (src/daemon/, only runtime dep is ws) and its own MV3 Chrome extension (agent-webbridge-extension/, stable id ifodkkbkmngjlkhiphcjmbceeolhpfeo). No closed-source dependency, no account, no telemetry, and no curl|bash installer. It is localhost-only: the extension connects only to a daemon you run on 127.0.0.1; no data leaves the machine and no remote server is ever contacted.

When to use

Use this whenever a task needs a real browser with the user's real logins, especially across more than one account or with several pages in flight at once:

Any browser / web / "open this URL" / navigate / click / read-a-page task in the user's own Chrome.
Multi-account or multi-profile workflows — acting as the user across several accounts (Work + Personal, multiple Gmail / Drive / Ads accounts) at once.
Throughput tasks where many tabs in one profile should run concurrently — agent- webbridge attaches chrome.debugger per tab, so N tabs in one profile run in parallel (the killer feature vs typical browser bridges, which funnel every call through one global "current tab").
Anything you would otherwise reach for a headless tool (Playwright, Firecrawl) for, but where the real logged-in session matters — here you get the user's actual Chrome instead.

Prefer this over headless/scrape tools when login state, real cookies, multiple simultaneous profiles, or per-profile tab parallelism matter.

Health check (always do this first)

awb status

Then act on the result:

daemonUp: true and extensionConnected: true for the profile(s) you want — healthy. Proceed with the tool calls below.
Anything else (router/daemon not up, extension not connected) — bring the fleet up with the commands under How to use, then re-check. See AGENTS.md for the full setup, install, and diagnose flow.

How to use

Full, copy-pasteable setup is in AGENTS.md. Quick version:

npm i -g agent-webbridge         # daemon + `awb` CLI
awb setup "Work" "Personal"      # install the extension (guided Load unpacked) + bring the fleet up
awb connect "Work" "Personal"    # (re)point each extension at its daemon (zero clicks; closes Chrome)
awb up      "Work" "Personal"    # start each profile's daemon + router on :10086, open windows
awb status                       # verify: extensionConnected:true per profile
awb down                         # tear down the fleet

(Pre-publish, run node bin/awb.mjs <cmd> from the repo. Run awb doctor first for a read-only environment self-check.)

Installing the extension (one-time, no Chrome Web Store)

The extension installs via Chrome's built-in "Load unpacked" — there is no Chrome Web Store listing, and none is needed. awb setup <profile…> automates everything except the one click Chrome reserves for a human. When you (the agent) run it, guide the user through that click and poll for success rather than waiting blindly:

Run awb setup "<profile>". It prints the exact extension folder and opens chrome://extensions.
Tell the user to: toggle Developer mode ON (top-right) → click Load unpacked → select the printed folder.
Poll awb check "<profile>" --json. For each profile it reports developerMode, loaded, enabled, daemonUp, connected, a ready boolean, and a single nextStep hint. Relay nextStep to the user and loop until ready: true. (awb setup also polls and continues on its own once the load lands, then connects + brings the fleet up.)

Zero-click connect: awb connect points each profile's extension at its daemon by writing local_url directly into the extension's on-disk storage.local — no popup, no clicks. It quits Chrome to write (LevelDB is single-writer) and the value persists, so later awb up runs just reconnect. If Chrome is already quit, awb up alone does the write for you.

Driving a profile

POST to the router on http://127.0.0.1:10086/command. The body is the normal command body plus a top-level "profile" field (name / email / directory) selecting the target profile. Omit "profile" to hit the default profile. Every command also carries a top-level "session" naming the current task — see Sessions.

curl -s -X POST http://127.0.0.1:10086/command \
  -H 'Content-Type: application/json' \
  -d '{"action":"navigate","args":{"url":"https://mail.google.com"},"session":"s1","profile":"Work"}'

Parallelism is cross-profile (N profiles) × per-tab (N tabs/profile): fire several /command requests at once — across different profiles, or at different tabs within the same profile — and they run concurrently.

High-throughput batch jobs (fan-out + resume)

For a large worklist (hundreds–thousands of items — scrape N companies, check N accounts, fill N forms), drive it as a fleet fan-out:

Defaults (unless the user specifies otherwise): 5 tabs per profile, across 2 profiles. First check what's available (awb status / awb check --json / awb profiles): use 2 profiles if at least two are set up + connected, otherwise fall back to 1 profile. Within each profile keep 5 concurrent tabs. If the user asks for a specific tabs-per-profile number, use that instead (per-profile parallelism is proven safe up to 10 tabs; 5 is the gentler default).
Total concurrency = #profiles × tabs-per-profile. Bring up several profiles (awb up "P1" "P2" …), split the worklist into one chunk per profile, and within each profile keep up to M /command requests (M tabs) in flight at once — e.g. 5 profiles × 10 tabs = 50 items processed concurrently. Per-tab parallelism is real (chrome.debugger is attached per tab), so tabs in one profile don't block each other.
Make it resumable. Write results to disk incrementally — rewrite/append the output file after each item, not once at the end. On (re)start, load the existing output, skip the items already done, and process only the remainder. An interrupted run (crash, awb down, machine sleep) then resumes where it stopped instead of redoing everything — and you can safely stop a run to re-tune concurrency, then relaunch.
Mind the shared egress IP. All profiles and tabs share one outbound IP. Fanning wider speeds up visiting many distinct sites, but for a step that hammers one service (e.g. a search engine), more parallel requests get rate-limited / CAPTCHA'd sooner — not finished faster. Throttle those steps.

Tools

13 tools, full parity, all verified live in a real browser:

Tool	Args	Returns	Note
`navigate`	`url`, `newTab`(bool), `group_title`	`{success, url, tabId}`	First call opens a tab — see Tabs. `group_title` sets the group's visible label
`find_tab`	`url`, `active`(bool)	`{success, url, tabId}`	Select an already-open tab as the current one — see Tabs
`snapshot`	—	`{url, title, tree}` with `@e` refs	Accessibility tree (text) — use this to read page content and locate elements
`click`	`selector` (@e ref or CSS)	`{success, tag, text}`	Synthetic `el.click()`
`fill`	`selector`, `value`	`{success, tag, mode}`	Works on `<input>`/`<textarea>` AND `[contenteditable]` (ProseMirror/Lexical/Slate). `mode` is `"value"` or `"contenteditable"`
`evaluate`	`code` (supports async/await)	`{type, value}`
`screenshot`	`format`(png\|jpeg), `quality`(0-100), optional `selector` (@e/CSS), optional `path`	`{format, path, sizeBytes, mimeType}`	Returns a file path, not base64 — see Screenshots
`network`	`cmd`(start\|stop\|list\|detail), `filter`, `requestId`	request/response data	Capture requests/responses
`upload`	`selector`, `files`(string[])	`{success, fileCount}`
`save_as_pdf`	`paper_format`, `landscape`, `scale`, `print_background`, optional `path`	`{path, sizeBytes, mimeType, pageTitle}`	Render current page → PDF, returns a file path — see Save as PDF
`list_tabs`	—	`{success, tabs:[{tabId, url, title, active, groupTitle}]}`	Inspect tabs in the current session
`close_tab`	—	`{success, closed: bool}`	Close the current tab in the session
`close_session`	—	`{success, closed: int}`	Close all tabs in the session — `closed` is the count. See Sessions

Tabs and the current tab

Single-tab tools (snapshot, click, fill, screenshot, save_as_pdf) act on the current tab — the one you most recently opened with navigate or selected with find_tab. Because the daemon attaches chrome.debugger per tab, multiple tabs in the same profile can be driven concurrently from parallel /command requests.

Opening pages: use newTab:true when pages should coexist (comparing, cross-referencing, or running in parallel); omit it to send the current tab to a new URL.
Going back to an earlier tab: call find_tab to make a tab you already opened the current one again. Pass the tab's full URL — take it from list_tabs or the earlier navigate result. A bare root domain may miss a www. tab, so prefer the exact URL. active:true picks the tab the user is currently viewing; otherwise the leftmost match wins.
If find_tab returns "no open tab found", the page isn't open — navigate with newTab:true instead.

curl -s -X POST http://127.0.0.1:10086/command \
  -d '{"action":"find_tab","args":{"url":"https://www.example.com","active":true},"session":"research","profile":"Work"}'

Call Format

Every command carries a top-level session naming the current task — see Sessions. Add a top-level profile to target a specific profile (omit for the default). The examples in later sections omit them only for brevity; in real calls always include session (and profile when driving a non-default profile).

curl -s -X POST http://127.0.0.1:10086/command \
  -H 'Content-Type: application/json' \
  -d '{"action":"navigate","args":{"url":"https://example.com","newTab":true,"group_title":"My task"},"session":"my-task","profile":"Work"}'

Sessions

One task = one session = one tab group. A session collects every tab this task opens into a single Chrome tab group, so the user sees one group representing "what the agent is doing right now". Pass it as a top-level field of the request body (not inside args).

Rules:

Pick one session name when the task starts, put it on every command, and never change it mid-task.
One task uses one session — even across multiple sites. Searching and then opening results on three different domains all share the same session and land in the same group. Do not switch session names per site — that is the #1 cause of fragmented tab groups.
Name the session after the task, not the site or domain — e.g. camping-research, phone-compare.
group_title is the human-readable label shown on the group in the browser. Pass it on the first navigate; later calls in the same session don't need it.
Use multiple sessions only when the user asks for several unrelated tasks at once — one session per task.
Reuse an existing group — don't fragment. Before opening tabs, check what's already open (awb groups, or list_tabs and read each tab's groupTitle). If a group for this task already exists (agent:<session>), reuse that exact session name instead of starting a new one. The daemon already re-joins a group by session name across service-worker restarts — fragmentation comes from the caller picking a new name, not from the fleet.
Parallel sub-agents / workflows share ONE session. When you fan out several agents/workers for the same task, pass them all the same session name. Letting each pick its own (probe-x, probe-y, …) scatters the work across many groups — and orphans them if a worker crashes mid-run. (One task = one group still holds, no matter how many workers.)
Clean up temporary sessions — even on failure. For scratch/probe sessions, close_session when done and on error/early-exit paths. List strays with awb groups and prune any with awb close <session> (the name shown after agent:).

# First tab of the task: set session + human-readable label
curl -s -X POST http://127.0.0.1:10086/command \
  -d '{"action":"navigate","args":{"url":"https://www.google.com/search?q=tents","newTab":true,"group_title":"Camping gear research"},"session":"camping-research","profile":"Work"}'

# Another SITE, SAME task → same session → joins the same group automatically
curl -s -X POST http://127.0.0.1:10086/command \
  -d '{"action":"navigate","args":{"url":"https://www.example.com/search?q=tents","newTab":true},"session":"camping-research","profile":"Work"}'

# Every later command carries the same session
curl -s -X POST http://127.0.0.1:10086/command \
  -d '{"action":"snapshot","args":{},"session":"camping-research","profile":"Work"}'

When the task is finished and the user no longer needs these pages, close_session clears the whole group. If they might want to look further, deliver your answer first and leave the tabs open — closing too eagerly throws away work the user can still see.

Screenshots

The daemon writes the image to disk and returns {format, path, sizeBytes, mimeType} — never base64, since the model can't read raw image bytes. Take the .path and open it with the Read tool to actually see it.

# Default: PNG of the visible viewport, daemon picks a temp path
curl ... -d '{"action":"screenshot","args":{}}'
# Options (each independent): JPEG quality, element-only via @e/CSS selector, custom output path
curl ... -d '{"action":"screenshot","args":{"format":"jpeg","quality":60}}'
curl ... -d '{"action":"screenshot","args":{"selector":"@e123"}}'

A caller-supplied path is honored verbatim (parent dirs created, existing file overwritten) — use a unique name to avoid clobbering. save_as_pdf follows the same rule.

Prefer snapshot over CSS/JS selectors

snapshot returns interactive elements with stable @e refs based on semantic role/name. Use them directly with click/fill — they survive CSS class hash changes that break manually-written selectors.

Fall back to evaluate (JS) only when:

The target has no @e ref in the snapshot
You need attributes not in the snapshot (e.g., href)
You need to dispatch complex event sequences, or scroll

Evaluate tips

Always use compact JSON.stringify(data) — never add null, 2 formatting. Indentation and newlines can inflate the response several times over, causing truncation during transmission.
evaluate calls share the page's JS realm — re-declaring the same const/let across two calls throws SyntaxError. Wrap in an IIFE for a fresh scope: (() => { const x = ...; return x; })().

Text input — use `fill`

fill handles native inputs and contenteditable. Pass selector (CSS or @e ref) + value:

Target	What `fill` does	Returned `mode`
`<input>` / `<textarea>`	Sets `.value` via native setter, fires `input`/`change`.	`"value"`
`[contenteditable]` (ProseMirror / TipTap / Lexical / Slate / Quill etc.)	Focuses, selects all existing content, calls `document.execCommand('insertText', ...)` which fires `beforeinput`/`input` with `inputType:'insertText'` and `data:value`.	`"contenteditable"`
Other element	Best-effort `.value` + events.	`"value"`

fill is clear-and-insert: existing content is replaced. For "append to existing text", read the current value via evaluate, concatenate, then fill with the result.

Form submit / special keys

There's no separate "press Enter" tool. To submit a form, click the submit button directly (its @e ref or selector). To dispatch a key event programmatically (e.g. Escape to close a modal):

{"action":"evaluate","args":{"code":"document.activeElement.dispatchEvent(new KeyboardEvent('keydown',{key:'Escape',bubbles:true}))"}}

Save the current page as PDF

save_as_pdf renders the current page to PDF and returns the file path. All args optional:

paper_format: letter (default) | a4 | legal | a3 | tabloid
landscape: false (default)
scale: 1.0 (default), range [0.1, 2.0]
print_background: true (default) — keep background colors
path: caller-supplied output path; if absent, daemon picks a default under OS temp dir using the page title as the filename

path semantics match screenshot: written verbatim, parent dirs auto-created, existing files overwritten.

Known limitations

Sites that strictly check event.isTrusted (some banking portals, captcha challenges) reject fill and click because both go through DOM-level synthetic events (isTrusted=false). This is a product boundary, not a bug.
Cross-origin iframes: fill, click, evaluate, and snapshot operate on the top frame. If a target element lives in a same-page iframe from a different origin, navigate to the iframe's URL directly instead.

Scope and limits

macOS-first (Google Chrome). Linux/Windows is a documented follow-up.
One daemon per profile, each on a deterministic hashed port with its own isolated state.
:10086 is reserved for the router — it is never assigned to a profile.
Localhost-only: the extension connects only to a 127.0.0.1 daemon you run. No remote server is ever contacted, no analytics, no account.
node engines >=18.

agent-webbridge

Popularity

Invocation

Context Preview

SKILL.md

agent-webbridge

Popularity

Invocation

Context Preview

SKILL.md

agent-webbridge

When to use

Health check (always do this first)

How to use

Installing the extension (one-time, no Chrome Web Store)

Driving a profile

High-throughput batch jobs (fan-out + resume)

Tools

Tabs and the current tab

Call Format

Sessions

Screenshots

Prefer snapshot over CSS/JS selectors

Evaluate tips

Text input — use fill

Form submit / special keys

Save the current page as PDF

Known limitations

Scope and limits

Similar Skills

agent-webbridge

When to use

Health check (always do this first)

How to use

Installing the extension (one-time, no Chrome Web Store)

Driving a profile

High-throughput batch jobs (fan-out + resume)

Tools

Tabs and the current tab

Call Format

Sessions

Screenshots

Prefer snapshot over CSS/JS selectors

Evaluate tips

Text input — use fill

Form submit / special keys

Save the current page as PDF

Known limitations

Scope and limits

Similar Skills

Text input — use `fill`

Text input — use `fill`