From open-computer-use
Use whenever the user wants to control a macOS GUI app — clicking buttons, reading on-screen content, typing into open windows, or operating their logged-in browser. Triggers on "Chrome を操作", "ログイン済みのブラウザで", "Mac アプリを動かして", "ocu", "computer use", "operate the browser". Prefer over Playwright when login state, cookies, SSO, passkeys, or extensions matter.
How this skill is triggered — by the user, by Claude, or both
Slash command
/open-computer-use:open-computer-useThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
macOS Accessibility API + CGEvent + `screencapture` packaged as one CLI/MCP binary. Model-agnostic: works from any agent that can call Bash or MCP.
ocu)macOS Accessibility API + CGEvent + screencapture packaged as one CLI/MCP binary. Model-agnostic: works from any agent that can call Bash or MCP.
curl -fsSL https://raw.githubusercontent.com/nogu66/open-computer-use/main/scripts/install.sh | bash
export PATH="$HOME/.local/bin:$PATH"
ocu --version
Pinned version: OCU_VERSION=v0.1.0 ./scripts/install.sh
Source build only: ./scripts/install.sh --from-source
When installed as a plugin, prefer the bundled wrapper:
OCU="${CLAUDE_PLUGIN_ROOT}/scripts/ocu-cli.sh"
# Codex also sets PLUGIN_ROOT; Cursor project checkout:
# OCU="$(git rev-parse --show-toplevel)/scripts/ocu-cli.sh"
After install:
OCU="$(command -v ocu)"
Wrappers auto-install the latest release to ~/.local/bin/ocu on first use unless OCU_SKIP_AUTO_INSTALL=1.
$OCU apps
$OCU activate --bundle-id com.google.Chrome
$OCU tree --bundle-id com.google.Chrome --depth 8
$OCU click --bundle-id com.google.Chrome --query "Address and search bar"
$OCU type --text "https://example.com"
$OCU key --key return
$OCU shot --bundle-id com.google.Chrome --out /tmp/after.png
| Command | Purpose |
|---|---|
apps | List GUI apps (bundle ID + PID) |
activate --bundle-id <id> | Bring app to front (call first) |
tree --bundle-id <id> [--depth N] | AX tree; add --json for structured output |
find --bundle-id <id> --query <q> | Substring match on AX labels |
wait --bundle-id <id> --query <q> [--timeout SEC] | Poll until element appears |
click / rclick | Left / right click by --query |
type --text <s> | Type into focused field |
key --key <name> [--mods cmd,shift,alt,ctrl] | Key press |
scroll | Scroll at element or cursor |
menu --bundle-id <id> --path <p> | Menubar path, e.g. "File/New Tab" |
shot [--bundle-id <id>] [--out <path>] | Screenshot PNG |
clip get / clip set --text <s> | Clipboard |
Same binary. Plugin install exposes MCP tools (list_apps, get_ax_tree, click_element, …). Bash agents use the CLI subcommands above.
Direct MCP (no wrapper):
claude mcp add open-computer-use -- $(command -v ocu)
activate first — otherwise type/key leak to the terminalmenu over Cmd+T when Chrome is playing video (keys may be captured)find/click return the first match — narrow queries or inspect tree first@eN refs — use --query (MCP click_ref uses refs from last tree)Repository: https://github.com/nogu66/open-computer-use
npx claudepluginhub nogu66/open-computer-use --plugin open-computer-useCreates, edits, and optimizes skills for Claude Code, including drafting, evaluating with test prompts, iterating on performance, and improving skill descriptions for better triggering accuracy.