Skill

CV Plugin Help

From computer-vision

Show available Computer Vision tools and usage examples.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command: /computer-vision:cv-help

User invocable

Model invocable

Inline context

Default effort

SKILL.md

62 lines · ~766 tokens

Stats

Language: Python

Parent stars: 1

Maintenance: Excellent

Last commit: Mar 21, 2026

CV Plugin Help

Show available Computer Vision tools and usage examples.

Available Tools

Tool	Description
`cv_list_windows`	List all visible windows with HWND, title, process, rect
`cv_screenshot_window`	Capture a specific window by HWND
`cv_screenshot_desktop`	Capture the entire desktop (all monitors)
`cv_screenshot_region`	Capture a rectangular region of the screen
`cv_focus_window`	Bring a window to the foreground
`cv_mouse_click`	Click at screen coordinates (left/right/double/middle/drag)
`cv_type_text`	Type text into the foreground window
`cv_send_keys`	Send key combinations (Ctrl+S, Alt+Tab, etc.)
`cv_move_window`	Move/resize a window or maximize/minimize/restore
`cv_ocr`	Extract text from a window or region with bounding boxes and confidence
`cv_find`	Find elements by natural language query (UIA + OCR fuzzy search)
`cv_get_text`	Extract all visible text from a window (UIA primary, OCR fallback)
`cv_list_monitors`	List all monitors with resolution, DPI, and position
`cv_read_ui`	Read the UI accessibility tree of a window
`cv_wait_for_window`	Wait for a window matching a title pattern to appear
`cv_wait`	Simple delay (max 30 seconds)
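
Because `cv_wait` is capped at 30 seconds per call, a longer pause has to be issued as several calls. The sketch below only plans the per-call durations. `plan_waits` is a hypothetical helper, not part of the plugin, and it does not invoke any tool itself.

```python
MAX_WAIT_SECONDS = 30.0  # per-call limit stated in the tool table above


def plan_waits(total_seconds: float, max_wait: float = MAX_WAIT_SECONDS) -> list[float]:
    """Split a long delay into chunks that each fit within the cv_wait limit."""
    if total_seconds < 0:
        raise ValueError("total_seconds must be non-negative")
    chunks = []
    remaining = total_seconds
    while remaining > 0:
        step = min(remaining, max_wait)
        chunks.append(step)
        remaining -= step
    return chunks


# Hypothetical usage: a 75-second pause becomes three separate cv_wait calls.
durations = plan_waits(75)
```

Each value in `durations` would then be passed to one `cv_wait` call in turn.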

Quick Start Examples

Find and click an element by description:

1. Call cv_find(query="Submit button", hwnd=<HWND>) to get the matching elements
2. Pass the center of the returned bbox to cv_mouse_click
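
Turning a bounding box into a click point is simple arithmetic. The following is a minimal sketch that assumes the bbox is a mapping with `x`, `y`, `width`, and `height` keys in pixels. That shape is an assumption for illustration; check the actual `cv_find` output before relying on it. `bbox_center` is a hypothetical helper.

```python
def bbox_center(bbox: dict) -> tuple[int, int]:
    """Return the integer center of a box given as x, y, width, height.

    The key names are assumed for illustration, not taken from the plugin.
    """
    center_x = bbox["x"] + bbox["width"] // 2
    center_y = bbox["y"] + bbox["height"] // 2
    return center_x, center_y


# Made-up match; a real result from cv_find may be shaped differently.
match = {"x": 400, "y": 300, "width": 120, "height": 40}
click_x, click_y = bbox_center(match)
```

The resulting `click_x` and `click_y` are the values to pass as `x` and `y` to `cv_mouse_click`.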

Extract text from any app:

cv_get_text(hwnd=<HWND>) — UIA for native apps, OCR fallback for Chrome/Electron

List windows and take a screenshot:

1. Call cv_list_windows to see all open windows
2. Find the HWND of the window you want
3. Call cv_screenshot_window with that HWND
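
Picking the right HWND from the window list usually means matching on the title. The sketch below assumes each entry is a mapping with `hwnd`, `title`, and `process` keys, mirroring the fields named in the tool table. The exact key names and the sample data are assumptions, and `find_windows` is a hypothetical helper.

```python
def find_windows(windows: list[dict], title_part: str) -> list[dict]:
    """Return windows whose title contains title_part, ignoring case."""
    needle = title_part.casefold()
    return [w for w in windows if needle in w.get("title", "").casefold()]


# Made-up sample in an assumed shape; real cv_list_windows output may differ.
sample_windows = [
    {"hwnd": 1001, "title": "Untitled - Notepad", "process": "notepad.exe"},
    {"hwnd": 1002, "title": "Inbox - Mail", "process": "mail.exe"},
]

matches = find_windows(sample_windows, "notepad")
target_hwnd = matches[0]["hwnd"] if matches else None
```

If several windows match, inspect the `process` field or the rect before choosing one rather than taking the first entry blindly.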

Click a button in an app:

1. Call cv_screenshot_window to see the current state
2. Identify the button coordinates from the screenshot
3. Call cv_mouse_click at those coordinates

Drag and drop (works with WebView, UWP, Electron, WPF apps):

cv_mouse_click(x=<END_X>, y=<END_Y>, start_x=<START_X>, start_y=<START_Y>, hwnd=<HWND>) — drag from start to end
Optionally tune the drag speed with drag_duration_ms (default: 300 ms)

OCR with bounding boxes:

cv_ocr(hwnd=<HWND>) — extract text with word-level bounding boxes and confidence scores
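
Confidence scores are useful for discarding noisy OCR output before acting on it. The sketch below assumes each recognized word is a mapping with `text` and `confidence` keys and that confidence ranges from 0 to 1. Both the shape and the scale are assumptions, and `confident_words` is a hypothetical helper.

```python
def confident_words(words: list[dict], min_confidence: float = 0.8) -> list[dict]:
    """Keep only OCR words at or above the confidence threshold."""
    return [w for w in words if w["confidence"] >= min_confidence]


# Made-up sample in an assumed shape; real cv_ocr output may differ.
sample_words = [
    {"text": "Save", "confidence": 0.97},
    {"text": "5ave", "confidence": 0.41},
    {"text": "Cancel", "confidence": 0.88},
]

reliable_text = " ".join(w["text"] for w in confident_words(sample_words))
```

The threshold of 0.8 is arbitrary; tune it against real captures from the application being automated.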

Background mode (work without disturbing the user):

cv_mouse_click(x=100, y=200, hwnd=<HWND>, background=True) — click without moving cursor
cv_type_text(text="hello", hwnd=<HWND>, background=True) — type without stealing focus
cv_send_keys(keys="ctrl+s", hwnd=<HWND>, background=True) — send keys in background
Screenshots already work per-window regardless of focus — no changes needed

Grid overlay for precise clicking:

1. cv_screenshot_window(hwnd=<HWND>, grid=True): take a screenshot with a coordinate grid overlay
2. Read the coordinate labels from the grid image (e.g., x=300, y=200)
3. cv_mouse_click(x=300, y=200, hwnd=<HWND>, coordinate_space="window_capture"): click at the exact grid position
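
The `coordinate_space="window_capture"` argument suggests that grid coordinates are measured inside the captured window image rather than on the full screen. Purely to illustrate that distinction, the sketch below converts a capture-relative point to screen coordinates, assuming a top-left origin, a 1:1 pixel scale with no DPI scaling, and a window rect with `left` and `top` keys. All of these are assumptions, and `capture_to_screen` is a hypothetical helper that is not needed when the tool is given the coordinate space directly.

```python
def capture_to_screen(x: int, y: int, window_rect: dict) -> tuple[int, int]:
    """Translate a point in the captured window image to screen coordinates.

    Assumes no DPI scaling and that the capture starts at the window's
    top-left corner; neither is confirmed by the plugin documentation.
    """
    return window_rect["left"] + x, window_rect["top"] + y


# Made-up rect; a real one would come from the window list.
rect = {"left": 100, "top": 50, "right": 900, "bottom": 650}
screen_point = capture_to_screen(300, 200, rect)
```

The same grid label therefore refers to different screen positions as the window moves, which is why the coordinate space is stated explicitly in the click call.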
