From claude-commands
Runs bounded UI automation loops with screenshot → decide → act cycles, safety guardrails, and step limits. Useful for desktop/browser control and visual task execution.
How this skill is triggered — by the user, by Claude, or both
Slash command
/claude-commands:claude-code-computer-use
The summary Claude sees in its skill listing, used to decide when to auto-load this skill
Run Claude Code as a bounded UI-control agent that observes screenshots, takes one action, re-checks state, and repeats until success or stop conditions.
Require these tool primitives (or closest equivalents):
- screenshot()
- click(x,y) / double_click(x,y)
- type(text) / key(combo)
- scroll(delta)
- ocr(region) / accessibility_tree()
Never take multiple UI actions in a single reasoning step.
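The one-action-per-step rule can be sketched as a bounded loop. This is a minimal illustration, not the skill's actual implementation: `run_loop`, `Action`, `LoopResult`, and the `screenshot`, `decide`, `act`, and `is_done` callables are hypothetical names standing in for whatever tool primitives are available, and mapping an exhausted step budget to the `blocked` status is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional


@dataclass
class Action:
    """One UI action, e.g. Action("click", (10, 20)). Hypothetical type."""
    name: str
    args: tuple = ()


@dataclass
class LoopResult:
    status: str  # "success" | "blocked" | "needs-human"
    log: List[str] = field(default_factory=list)


def run_loop(
    screenshot: Callable[[], Any],
    decide: Callable[[Any], Optional[Action]],
    act: Callable[[Action], None],
    is_done: Callable[[Any], bool],
    max_steps: int,
) -> LoopResult:
    """Observe, take exactly one action, then re-observe, up to max_steps."""
    log: List[str] = []
    for step in range(1, max_steps + 1):
        frame = screenshot()          # observe current state
        if is_done(frame):            # verify before acting again
            log.append(f"step {step}: goal verified")
            return LoopResult("success", log)
        action = decide(frame)        # at most one action per step
        if action is None:            # no safe action: hand off to a human
            log.append(f"step {step}: no safe action; stopping")
            return LoopResult("needs-human", log)
        act(action)
        log.append(f"step {step}: {action.name}{action.args}")
    # Step budget spent: check the state one last time before giving up.
    if is_done(screenshot()):
        log.append("final check: goal verified")
        return LoopResult("success", log)
    log.append("step limit reached")
    return LoopResult("blocked", log)
```

Passing the primitives in as callables keeps the loop testable against a fake UI before it is pointed at a real desktop or browser.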
Use non-interactive Claude Code mode so each run prints one complete log that can be captured and compared:
claude --print --permission-mode bypassPermissions "<task prompt>"
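If the command is launched from a script rather than typed by hand, one possible wrapper looks like the sketch below. The flags are the ones shown above; the function names `build_claude_argv` and `run_claude` and the timeout default are assumptions made for illustration.

```python
import subprocess
from typing import List


def build_claude_argv(task_prompt: str) -> List[str]:
    """Build the argument list for the non-interactive invocation shown above."""
    # Passing a list (no shell) means the prompt needs no manual quoting.
    return [
        "claude",
        "--print",
        "--permission-mode",
        "bypassPermissions",
        task_prompt,
    ]


def run_claude(task_prompt: str, timeout_s: float = 600.0) -> str:
    """Run the command and return its stdout; raises on non-zero exit or timeout."""
    completed = subprocess.run(
        build_claude_argv(task_prompt),
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=True,
    )
    return completed.stdout
```

Keeping argument construction separate from execution makes it possible to inspect or log the exact command before anything is run.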
Task prompt template:
You are controlling a UI via tools.
Goal: <goal>
Allowed scope: <apps/sites>
Max steps: <N>
Rules:
- One action per step.
- After each action, screenshot and verify.
- If uncertain for 2 steps, stop and ask.
- Ask before destructive actions.
Return a step log and final status: success | blocked | needs-human.
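The template can also be filled in programmatically. The sketch below copies the template text from above verbatim; `render_task_prompt`, its parameter names, and the rule that `max_steps` must be at least 1 are assumptions for illustration only.

```python
from typing import Sequence

# Template text copied from the skill; only the placeholders are substituted.
TASK_PROMPT_TEMPLATE = """\
You are controlling a UI via tools.
Goal: {goal}
Allowed scope: {scope}
Max steps: {max_steps}
Rules:
- One action per step.
- After each action, screenshot and verify.
- If uncertain for 2 steps, stop and ask.
- Ask before destructive actions.
Return a step log and final status: success | blocked | needs-human.
"""

# Final statuses named in the template's last line.
FINAL_STATUSES = ("success", "blocked", "needs-human")


def render_task_prompt(goal: str, scope: Sequence[str], max_steps: int) -> str:
    """Fill in <goal>, <apps/sites>, and <N> in the task prompt template."""
    if not goal.strip():
        raise ValueError("goal must not be empty")
    if not scope:
        raise ValueError("scope must list at least one app or site")
    if max_steps < 1:
        raise ValueError("max_steps must be at least 1")
    return TASK_PROMPT_TEMPLATE.format(
        goal=goal.strip(),
        scope=", ".join(scope),
        max_steps=max_steps,
    )
```

For example, `render_task_prompt("Export the report as PDF", ["Reports app"], 15)` returns a prompt whose fourth line reads `Max steps: 15`.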
Declare completion only when:
Always return concise evidence:
npx claudepluginhub jleechanorg/claude-commands --plugin claude-commands
Guides GUI automation with computer use: when to use over shell/MCP/browser tools, visual validation for native apps, regression workflows, and verification patterns.
Automates desktop GUI workflows using Claude's Computer Use API for screenshot capture, mouse/keyboard control. Useful for GUI testing, form filling, and visual app interactions without CLI.
Controls desktop GUI as a fallback when APIs, CLIs, file editing, and browser automation are unavailable or have failed. Clicks, types, reads screen, and drives native apps on Windows/macOS/Linux.