ALWAYS use this skill when navigating websites, browsing the web, or interacting with web pages. Provides tool usage guide, best practices, and debugging tips for the ABP browser MCP tools.
Slash command: /agent-browser-protocol:abp-browser
ABP **pauses JavaScript and virtual time** between your actions. The page is frozen until your next tool call.
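When you call any action tool (browser_action, browser_scroll, browser_navigate, etc.), the browser runs a resume-wait-capture-pause cycle: the page resumes, the action runs, the page gets a short wait to settle (about 500ms), a screenshot is captured, and the page is paused again.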
One tool call = one complete turn. Screenshots are included automatically with every action response. There is no need to take a separate screenshot after performing an action.
browser_action accepts 1-3 actions per call. Batch common workflows to reduce round-trips:
- [{mouse_click, x, y}, {keyboard_type, text}, {keyboard_press, key: ENTER}]
- [{mouse_click, x, y}, {keyboard_type, text}]
- [{mouse_click, x, y}]

Actions execute sequentially with a 20ms pause between each. One screenshot is taken after all actions complete.
Do NOT batch scrolling — use browser_scroll separately.
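The batching rules above can be sketched as a small validation helper. This is only an illustration: the `actions` and `type` field names and the `build_action_batch` helper are assumptions made for the example, not the documented wire format of browser_action. The action names come from the tool reference in this guide.

```python
from typing import Any

MAX_ACTIONS = 3  # the guide states browser_action accepts 1-3 actions per call

# Action names listed for browser_action in this guide. Scrolling is
# intentionally absent because it goes through browser_scroll instead.
ALLOWED_ACTIONS = {
    "mouse_click",
    "keyboard_type",
    "keyboard_press",
    "mouse_hover",
    "mouse_drag",
}


def build_action_batch(*actions: dict[str, Any]) -> dict[str, Any]:
    """Validate a batch and wrap it in a hypothetical argument dict."""
    if not 1 <= len(actions) <= MAX_ACTIONS:
        raise ValueError(f"expected 1-{MAX_ACTIONS} actions, got {len(actions)}")
    for action in actions:
        kind = action.get("type")  # "type" is an assumed field name
        if kind not in ALLOWED_ACTIONS:
            raise ValueError(f"{kind!r} cannot be batched in browser_action")
    return {"actions": list(actions)}


# Click a search box, type a query, and submit it in a single round-trip.
search = build_action_batch(
    {"type": "mouse_click", "x": 640, "y": 120},
    {"type": "keyboard_type", "text": "abp browser"},
    {"type": "keyboard_press", "key": "ENTER"},
)
```

Rejecting anything outside the allowed set keeps a scroll from slipping into a batch by accident.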
Sometimes 500ms isn't enough for the page to finish loading (AJAX, animations, redirects). When the screenshot shows incomplete content:
Call browser_screenshot to wait and observe. It runs the same resume-wait-capture-pause cycle without performing any action, giving the page another chance to settle. Repeat until the content appears.
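The retry advice above amounts to a bounded polling loop. The sketch below assumes two caller-supplied callables, both hypothetical: `take_screenshot` stands in for a browser_screenshot call, and `looks_complete` stands in for whatever check decides the content has appeared. The attempt cap is an addition for safety, not something the guide specifies.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_until_settled(
    take_screenshot: Callable[[], T],
    looks_complete: Callable[[T], bool],
    max_attempts: int = 5,
) -> Optional[T]:
    """Re-observe the page until it looks complete or attempts run out."""
    for _ in range(max_attempts):
        shot = take_screenshot()  # each call gives the page another wait cycle
        if looks_complete(shot):
            return shot
    return None  # the caller decides what to do when the page never settles


# Stand-in frames: two loading states followed by the finished page.
frames = iter(["spinner", "spinner", "results"])
result = wait_until_settled(lambda: next(frames), lambda shot: shot == "results")
```

Capping the attempts avoids looping forever on a page that is genuinely stuck rather than slow.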
Pass markup to browser_screenshot to see visual overlays on the page. Available overlays:
- clickable: green outlines around clickable elements (links, buttons, onclick handlers)
- typeable: orange outlines around text inputs and textareas
- scrollable: purple dashed outlines around scrollable containers
- selected: blue outline around the currently focused element
- grid: red coordinate grid at 100px intervals with viewport coordinate labels

Use grid to map viewport coordinates for targeting clicks. Use clickable and typeable to identify interactive elements visually.
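Reading a click target off the grid overlay is simple arithmetic, sketched below. It assumes grid lines fall on multiples of 100px measured from the viewport origin, which is a plausible reading of "100px intervals" but not confirmed here. The `grid_to_viewport` helper and its parameters are hypothetical.

```python
GRID_STEP = 100  # grid interval in pixels, as described for the grid overlay


def grid_to_viewport(
    col: int, row: int, frac_x: float = 0.5, frac_y: float = 0.5
) -> tuple[int, int]:
    """Convert a grid cell plus a fractional offset into viewport coordinates.

    col/row count whole grid cells from the top-left corner; frac_x/frac_y
    give the position inside that cell (0.5 is the cell center).
    """
    if not (0 <= frac_x <= 1 and 0 <= frac_y <= 1):
        raise ValueError("fractions must be between 0 and 1")
    x = round((col + frac_x) * GRID_STEP)
    y = round((row + frac_y) * GRID_STEP)
    return x, y


# A button sitting in the middle of the cell between x=300-400 and y=200-300.
click_x, click_y = grid_to_viewport(3, 2)
```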
All tab_id parameters are optional and default to the active tab.
Input:
- browser_action: takes 1-3 actions, each one of mouse_click (x, y), keyboard_type (text), keyboard_press (key, modifiers?), mouse_hover (x, y), mouse_drag (start_x, start_y, end_x, end_y). Keys are ALL-CAPS (ENTER, TAB, ESCAPE, CONTROL, META, etc.). Abbreviations accepted: CTRL, CMD, ESC, DEL.
- browser_scroll: x, y (where wheel fires), delta_x?, delta_y? (positive=down/right)
- browser_slider: orientation (horizontal/vertical), track bounds, current position, min, max, target_value. Calculates and executes drag automatically.

Navigation:
- browser_navigate: url? OR action? (back, forward, reload)
- browser_tabs: action? (list, new, close, info, activate, stop; default: list), tab_id?, url?

Observation:
- browser_screenshot: markup?, disable_markup?, format?
- browser_javascript: expression (required). Data extraction and DOM inspection ONLY; do NOT use for interaction, prefer browser_action.
- browser_text: selector?

Situational:
- browser_dialog: action? (check, accept, dismiss; default: check), prompt_text?
- browser_downloads: action? (list, status, cancel, content; default: list), download_id?, state?, limit?, max_size?
- browser_files: chooser_id (required), files?, content_files?, path?, cancel?
- browser_select_picker: responds to pending select/dropdown popups

Browser:
- browser_get_status: no params
- browser_shutdown: timeout_ms?

Use browser_get_status to check if the browser is ready and connected.
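The browser_slider entry says the tool calculates the drag automatically from the track bounds, min, max, and target_value. One plausible way to do that is linear interpolation along the track, sketched below. This is an assumption about the approach, not the tool's actual implementation. The function and parameter names are hypothetical, and vertical sliders may need the axis inverted.

```python
def slider_target_position(
    track_start: float,
    track_length: float,
    min_value: float,
    max_value: float,
    target_value: float,
) -> float:
    """Map a slider value to a pixel position along the track (linear)."""
    if max_value == min_value:
        raise ValueError("min_value and max_value must differ")
    if track_length <= 0:
        raise ValueError("track_length must be positive")
    # Clamp so an out-of-range target lands on the nearest end of the track.
    clamped = min(max(target_value, min_value), max_value)
    fraction = (clamped - min_value) / (max_value - min_value)
    return track_start + fraction * track_length


# A 200px-wide track starting at x=100, value range 0-100, target value 25.
drag_end_x = slider_target_position(100, 200, 0, 100, 25)
```

The drag would then run from the current handle position to `drag_end_x` at the track's fixed cross-axis coordinate.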
Session data (history database, screenshots) is stored in a session directory. To find the paths, query the REST API:
curl http://localhost:8222/api/v1/browser/session-data
The session directory contains:
<session_dir>/
├── history.db # SQLite database with sessions, actions, events
└── screenshots/ # Auto-saved before/after WebP screenshots per action
Query the database:
-- Recent actions
SELECT id, type, status, url, error FROM actions ORDER BY id DESC LIMIT 10;
-- Events for an action
SELECT * FROM events WHERE action_id = <id>;
-- Screenshot paths
SELECT screenshot_before_path, screenshot_after_path FROM actions WHERE id = <id>;
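The same queries can be run from a script with Python's built-in sqlite3 module. The sketch below uses an in-memory stand-in table containing only the columns named in the queries above. The real history.db schema may differ, and the sample rows and status values are made up for illustration. Against a real session you would connect to the history.db path reported by the REST API instead.

```python
import sqlite3


def recent_actions(conn: sqlite3.Connection, limit: int = 10) -> list[dict]:
    """Return the most recent actions, newest first."""
    cur = conn.execute(
        "SELECT id, type, status, url, error FROM actions "
        "ORDER BY id DESC LIMIT ?",
        (limit,),
    )
    return [dict(row) for row in cur]


def screenshot_paths(conn: sqlite3.Connection, action_id: int):
    """Return (before_path, after_path) for an action, or None if missing."""
    row = conn.execute(
        "SELECT screenshot_before_path, screenshot_after_path "
        "FROM actions WHERE id = ?",
        (action_id,),
    ).fetchone()
    return (row[0], row[1]) if row else None


# Stand-in database: assumed schema and invented rows, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets rows convert to dicts by column name
conn.execute(
    "CREATE TABLE actions (id INTEGER PRIMARY KEY, type TEXT, status TEXT, "
    "url TEXT, error TEXT, screenshot_before_path TEXT, "
    "screenshot_after_path TEXT)"
)
conn.executemany(
    "INSERT INTO actions VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "navigate", "completed", "https://example.com", None,
         "before_1.webp", "after_1.webp"),
        (2, "mouse_click", "failed", "https://example.com", "timeout",
         "before_2.webp", "after_2.webp"),
    ],
)

latest = recent_actions(conn, limit=1)
paths = screenshot_paths(conn, 1)
```

Parameterized queries (`?` placeholders) replace the `<id>` placeholders from the raw SQL, so action IDs are never spliced into the query string.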
- browser_javascript uses expression as its parameter name (not script).
- browser_scroll requires x, y coordinates where the mouse wheel fires; target the element center.
- delta_y positive = scroll down, negative = scroll up.

npx claudepluginhub theredsix/abp-npm --plugin agent-browser-protocol