Control the macOS desktop via the `cu` CLI tool. Use when the user needs to interact with desktop applications — open apps, click buttons, fill forms, navigate menus, take screenshots, or read screen content. Works with any macOS app via the Accessibility API. Activate this skill whenever a task involves desktop automation, app control, GUI interaction, or any operation outside the terminal and web browser.
Slash command: `/computer-pilot:computer-pilot`
Control macOS desktop applications via bash commands. Uses the Accessibility API for precise element targeting — no coordinate guessing needed.
- `cu` binary must be installed (Rust, single binary, zero dependencies)
- `cu setup` to check Accessibility + Screen Recording permissions

cu apps # 1. see what's running
cu snapshot "App Name" --limit 30 # 2. get UI elements with [ref] numbers
cu click 3 --app "App Name" # 3. interact by ref
cu snapshot "App Name" --limit 30 # 4. verify result
cu snapshot returns the AX tree — a structured list of interactive UI elements:
[1] button "Back" (10,40 30x24)
[2] textfield "Search" (100,40 200x24)
[3] statictext "Favorites" (10,100 80x16)
[4] row "Documents" (10,120 300x20)
Each element has: [ref] number, role, title/value, position, size.
Use the [ref] number with cu click <ref> to interact.
Refs change after every action — always re-snapshot before clicking.
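
As a rough illustration of the element format described above, the following Python sketch parses one line of the text snapshot into its fields. The regular expression is inferred only from the sample lines shown here, and the `Element` type and `parse_snapshot_line` helper are hypothetical names, not part of `cu`.

```python
import re
from typing import NamedTuple, Optional


class Element(NamedTuple):
    """One snapshot entry: ref number, role, title/value, position, size."""
    ref: int
    role: str
    title: str
    x: int
    y: int
    width: int
    height: int


# Assumed line shape, based on the samples: [ref] role "title" (x,y WxH)
LINE_RE = re.compile(
    r'^\[(\d+)\]\s+(\w+)\s+"(.*)"\s+\((-?\d+),(-?\d+)\s+(\d+)x(\d+)\)$'
)


def parse_snapshot_line(line: str) -> Optional[Element]:
    """Return the parsed element, or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    ref, role, title, x, y, w, h = m.groups()
    return Element(int(ref), role, title, int(x), int(y), int(w), int(h))


element = parse_snapshot_line('[2] textfield "Search" (100,40 200x24)')
print(element)
```

Because refs change after every action, a parsed `Element` should be treated as valid only until the next command is sent.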
| Command | Description |
|---|---|
| `cu snapshot [app] --limit N` | AX tree with [ref] numbers (cheapest) |
| `cu ocr [app]` | Vision OCR text recognition (for non-AX apps) |
| `cu screenshot [app] --path file.png` | Window capture (for visual analysis) |
| `cu wait --text "X" --app Name --timeout 10` | Poll until text/element appears |
| `cu apps` | List running applications |
| Command | Description |
|---|---|
| `cu click <ref> --app Name` | Click element (AX action first, CGEvent fallback) |
| `cu click <ref> --right` | Right-click |
| `cu click <ref> --double-click` | Double-click (open files, select words) |
| `cu click <ref> --shift` | Shift+click (extend selection) |
| `cu click <x> <y>` | Click screen coordinates |
| `cu key <combo> --app Name` | Keyboard shortcut |
| `cu type "text" --app Name` | Type text (Unicode supported) |
| `cu scroll down 5 --x 400 --y 300` | Scroll |
| `cu hover <x> <y>` | Move mouse (tooltips) |
| `cu drag <x1> <y1> <x2> <y2>` | Drag |
| Command | Description |
|---|---|
| `cu setup` | Check permissions + version |
Use the cheapest observation method first:
1. `cu snapshot`: structured AX tree (lowest tokens, most precise)
2. `cu ocr`: Vision OCR (for apps with poor AX: games, Qt, Java)
3. `cu screenshot`: image file (use your own vision to analyze)

cu key cmd+space # open Spotlight
cu type "Calculator" --app Spotlight # search
cu key enter --app Spotlight # launch
cu wait --text "Calculator" --timeout 5 # wait for it
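
For callers driving `cu` from a script, the Spotlight launch sequence above could be sketched in Python as a list of argument vectors run in order. The helper names `spotlight_launch_commands` and `run_all` are hypothetical, and the sketch assumes (the source does not say so) that `cu` exits with a non-zero status when a step fails.

```python
import subprocess
from typing import List


def spotlight_launch_commands(app_name: str, timeout: int = 5) -> List[List[str]]:
    """Build the four documented steps for launching an app via Spotlight."""
    return [
        ["cu", "key", "cmd+space"],                       # open Spotlight
        ["cu", "type", app_name, "--app", "Spotlight"],   # search
        ["cu", "key", "enter", "--app", "Spotlight"],     # launch
        ["cu", "wait", "--text", app_name, "--timeout", str(timeout)],  # wait for it
    ]


def run_all(commands: List[List[str]]) -> None:
    """Run each command in order, stopping at the first failure.

    Assumption: cu signals failure through a non-zero exit status,
    which check=True turns into subprocess.CalledProcessError.
    """
    for argv in commands:
        subprocess.run(argv, check=True)
```

Calling `run_all(spotlight_launch_commands("Calculator"))` would then reproduce the sequence shown above. Passing arguments as a list avoids shell quoting problems for app names that contain spaces.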
# Method 1: Help menu search (works in most apps)
cu key cmd+shift+/ --app "App Name"
cu type "menu item name" --app "App Name"
cu key enter --app "App Name"
# Method 2: Direct menu shortcut
cu key cmd+, --app "App Name" # Preferences/Settings
cu key cmd+n --app "App Name" # New
cu key cmd+o --app "App Name" # Open
cu key cmd+s --app "App Name" # Save
cu key cmd+w --app "App Name" # Close window
cu key cmd+q --app "App Name" # Quit app
# Click app name in menu bar, then About
cu snapshot "App Name" --limit 50 # find menu bar items
cu click <menu-ref> --app "App Name"
# Or use keyboard: most apps respond to cmd+shift+/ → "About"
cu key cmd+a --app "App Name" # select all
cu key cmd+c --app "App Name" # copy
pbpaste # read clipboard
echo "text" | pbcopy # write clipboard
cu key cmd+v --app "App Name" # paste
cu snapshot Finder --limit 50 # see files
cu click <file-ref> --app Finder # select file
cu click <file-ref> --double-click --app Finder # open file
cu key cmd+delete --app Finder # move to trash
cu key cmd+shift+n --app Finder # new folder
cu snapshot --limit 30 # frontmost app (dialog is usually frontmost)
# Look for button refs like "OK", "Cancel", "Allow", "Save"
cu click <button-ref>
- Use `--app` to target a specific app. Without it, keys go to the frontmost app, which may have changed.
- Run `cu snapshot` first.
- Run `cu snapshot` again to confirm the action worked.
- `--no-snapshot` to disable.

When piped (default for agents), output is JSON:
{"ok":true,"app":"Finder","window":"Downloads","elements":[{"ref":1,"role":"button","title":"Back","x":10,"y":40,"width":30,"height":24}]}
Errors include context:
{"ok":false,"error":"element [99] not found in AX tree (scanned 50 elements)"}
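
A minimal Python sketch of consuming this JSON might look like the following. It relies only on the fields visible in the two samples above (`ok`, `error`, `app`, `elements`, `ref`, `role`, `title`); the names `CuError`, `parse_result`, `find_ref`, and `click_command` are hypothetical helpers, not part of `cu`.

```python
import json
from typing import Any, Dict, List, Optional


class CuError(RuntimeError):
    """Raised when a cu JSON result reports "ok": false."""


def parse_result(raw: str) -> Dict[str, Any]:
    """Decode one JSON result and raise CuError if it reports a failure."""
    result = json.loads(raw)
    if not result.get("ok", False):
        raise CuError(result.get("error", "unknown cu error"))
    return result


def find_ref(result: Dict[str, Any], title: str,
             role: Optional[str] = None) -> Optional[int]:
    """Return the ref of the first element with this title (and role, if given)."""
    for el in result.get("elements", []):
        if el.get("title") == title and (role is None or el.get("role") == role):
            return el["ref"]
    return None


def click_command(ref: int, app: str) -> List[str]:
    """Build the argv for `cu click <ref> --app <app>`."""
    return ["cu", "click", str(ref), "--app", app]
```

Looking elements up by title on each fresh snapshot, rather than caching ref numbers, fits the rule that refs change after every action.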
- For web content, prefer `bp` (browser-pilot) over `cu`: DOM-level precision beats the AX tree.
- Where Chrome CDP has no access, `cu` is the right tool: the AX tree provides element refs that CDP can't reach.
- If `cu snapshot` returns sparse results, try `cu ocr` (Vision OCR) or `cu screenshot` (visual fallback).
- Use `cu wait --text "X"` after actions that trigger loading or transitions.
- Use `pbcopy`/`pbpaste` directly for the clipboard; no `cu` wrapper is needed.
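
The sparse-snapshot fallback could be sketched as a small decision helper in Python. The threshold `min_elements`, the output filename `fallback.png`, and the function name `observation_plan` are illustrative assumptions; the source does not define what counts as "sparse".

```python
from typing import Any, Dict, List


def observation_plan(app: str, snapshot_elements: List[Dict[str, Any]],
                     min_elements: int = 3) -> List[List[str]]:
    """Return follow-up cu commands to try when a snapshot looks too sparse.

    An empty list means the snapshot already has enough elements to work with.
    The min_elements threshold is a hypothetical tuning knob.
    """
    if len(snapshot_elements) >= min_elements:
        return []
    return [
        ["cu", "ocr", app],                                    # Vision OCR next
        ["cu", "screenshot", app, "--path", "fallback.png"],   # visual fallback last
    ]
```

The order of the returned commands mirrors the cheapest-first rule: OCR is tried before a screenshot.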
npx claudepluginhub relixiaobo/computer-pilot --plugin computer-pilot