Skill

AgenticUI

Use when authoring, customizing, or extending an AgenticUI agent-observability web app — a generated standalone frontend that shows an AI agent working via a live in-browser terminal (xterm.js + node-pty over WebSocket), a pseudo-desktop (Dockerized Xvfb + x11vnc + noVNC), and optional computer-use (a VLM-driven click/type loop). Activates for the /gerdsenai:agentic-ui scaffolder and any work on its templates.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/gerdsenai:agentic-ui

User invocable

Model invocable

Inline context

Default effort

Supporting Files

references/architecture.md
references/design-ultraplan.md
references/models.md
references/security.md
references/ui-ux.md

SKILL.md

66 lines · ~1.1k tokens

Stats

Language: Python

Stars: 0

Maintenance: Excellent

Last Commit: May 28, 2026

AgenticUI

AgenticUI is a generator in the GerdsenAI toolkit: /gerdsenai:agentic-ui scaffolds a standalone web app (a deliverable, not a UI for this plugin) that lets anyone watch an AI agent work. It mirrors how /gerdsenai:init scaffolds projects and /gerdsenai:build makes PDFs.

The generated app is public-safe and BYO-model: it ships no credentials and no hardcoded model endpoints. The end user supplies their own model for the computer-use brain via environment variables.

The three phases (cumulative via `--phase 1|2|3`)

1. Terminal: a live in-browser terminal. xterm.js (frontend) ⇄ WebSocket ⇄ node-pty (backend PTY). The PTY runs a configurable command (by default, a shell), so the app can be pointed at any agent.
2. Pseudo-desktop (adds to 1): a Dockerized virtual display the agent renders into: Xvfb (headless X server) → x11vnc (exposes it as VNC) → websockify + noVNC (VNC in the browser). Shown in a Desktop tab.
3. Computer-use (adds to 1 and 2): a control loop that screenshots the virtual desktop, asks a VLM where to act, and simulates input with pyautogui/xdotool. Provider-agnostic (see below). A --read-only variant observes only (screenshots, no input simulation).
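The computer-use loop in phase 3 can be sketched as follows. This is a minimal illustration, not the generated app's code: `Action`, `run_loop`, and the three injected callables are hypothetical names, and the real loop would wire them to a screenshot of the virtual display, a VLM call, and pyautogui/xdotool.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Action:
    """One proposed input event (hypothetical shape, for illustration only)."""
    kind: str          # e.g. "click" or "type"
    x: int = 0
    y: int = 0
    text: str = ""


def run_loop(
    take_screenshot: Callable[[], bytes],
    ask_model: Callable[[bytes], Optional[Action]],
    perform: Callable[[Action], None],
    read_only: bool = False,
    max_steps: int = 10,
) -> list[Action]:
    """Screenshot, ask the model for the next action, then act unless read-only."""
    proposed: list[Action] = []
    for _ in range(max_steps):
        frame = take_screenshot()
        action = ask_model(frame)
        if action is None:       # the model signals that it is done
            break
        proposed.append(action)
        if not read_only:        # read-only: observe, never simulate input
            perform(action)
    return proposed
```

Passing the side effects in as callables keeps the read-only guarantee in one place: with `read_only=True`, `perform` is never called, whatever the model proposes.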

BYO-model contract (never hardcode endpoints)

The computer-use agent reads its model config from the environment, in this order:

1. VLM_BASE_URL + VLM_MODEL + VLM_API_KEY: any OpenAI-compatible vision endpoint, that is, one that accepts image content parts of the form {type:"image_url", image_url:{url:"data:image/png;base64,…"}} inside a message's content array. Works with hosted or local servers (the user points it wherever they like).
2. Otherwise, ANTHROPIC_API_KEY: use Claude's computer-use tool loop.
3. If neither is set, the agent runs in observe-only mode and says so. No localhost default ships.
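The resolution order above could look roughly like this. It is a sketch under assumptions: `resolve_model_config` and `image_part` are hypothetical names, the mode strings are invented for illustration, and treating all three `VLM_*` variables as required is an assumption (the source only lists them together).

```python
import base64
import os
from typing import Mapping, Optional


def resolve_model_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Pick the computer-use backend from the environment, in priority order."""
    env = os.environ if env is None else env
    base_url = env.get("VLM_BASE_URL")
    model = env.get("VLM_MODEL")
    key = env.get("VLM_API_KEY")
    if base_url and model and key:
        # The key is deliberately left out of the result so it is not logged.
        return {"mode": "openai-compatible", "base_url": base_url, "model": model}
    if env.get("ANTHROPIC_API_KEY"):
        return {"mode": "anthropic"}
    # No endpoint configured: no localhost fallback, just observe.
    return {"mode": "observe-only"}


def image_part(png_bytes: bytes) -> dict:
    """Wrap a PNG screenshot as an OpenAI-style image_url content part."""
    b64 = base64.b64encode(png_bytes).decode("ascii")
    return {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{b64}"}}
```

Accepting `env` as a parameter makes the ordering easy to test without touching the real process environment.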

Security model (bake into every generated app)

Bind all services to 127.0.0.1 by default; never 0.0.0.0 on the host.
Require a shared auth token on every WebSocket/HTTP control endpoint.
Default-recommend --read-only (watch, don't drive) in docs; make enabling input simulation explicit.
Run the desktop and computer-use pieces in containers (isolation); the agent only touches the virtual display, never the host input devices.
Ship a SECURITY.md: computer-use is powerful/dual-use — run it only against environments you own or control, inside the provided container, never exposed to the network.
Keys come from the environment / .env (git-ignored). .env.example holds placeholders only.
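Two of the rules above, loopback-only binding and a shared token on every control endpoint, reduce to small checks. The helpers below are a hedged sketch using only the Python standard library. The function names are hypothetical, and the generated backend may implement these checks differently (the terminal backend, for instance, is Node-based).

```python
import hmac
import ipaddress
import secrets


def new_auth_token() -> str:
    """Generate a random URL-safe token, suitable for writing into .env."""
    return secrets.token_urlsafe(32)


def is_authorized(supplied: str, expected: str) -> bool:
    """Compare tokens in constant time; empty tokens never match."""
    if not supplied or not expected:
        return False
    return hmac.compare_digest(supplied.encode(), expected.encode())


def is_loopback(host: str) -> bool:
    """Accept only literal loopback addresses such as 127.0.0.1 or ::1."""
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # hostnames (even "localhost") are rejected in this sketch
```

Rejecting hostnames outright is a deliberately strict choice for the sketch: it avoids depending on how a name resolves on a given machine.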

UI/UX quality bar (Apple-HIG-aware, accessible)

Semantic layout; visible focus; keyboard operable; sufficient contrast (WCAG AA).
Clear connection status and loading / error / empty states (a terminal that can't connect must say why, with a retry).
Responsive: usable from a laptop down to a tablet width.
Trust + AI transparency: label what's live, what the agent is doing, and when it's acting on the desktop. Never expose provider keys in client-side code.

How the scaffolder works

scripts/agentic-ui-build.py copies templates/agentic-ui/phase1..N/ into the target dir, rendering tokens (__APP_NAME__, __TERM_PORT__, __DESKTOP_PORT__, __AGENT_PORT__, __AUTH_TOKEN__, __READ_ONLY__). Later phases override earlier files (e.g., docker-compose.yml). It generates a random auth token into .env, refuses to clobber without --force, and emits the standard JSON result.
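The copy-and-render step can be pictured with the sketch below. It is not the contents of scripts/agentic-ui-build.py: `render_tokens` and `scaffold` are hypothetical names, templates are assumed to be UTF-8 text, and the token-generation and JSON-result steps are omitted.

```python
from pathlib import Path


def render_tokens(text: str, values: dict[str, str]) -> str:
    """Replace __NAME__ placeholders with their values."""
    for name, value in values.items():
        text = text.replace(f"__{name}__", value)
    return text


def scaffold(
    templates: Path,
    target: Path,
    phase: int,
    values: dict[str, str],
    force: bool = False,
) -> list[str]:
    """Copy phase1..phaseN into target; later phases override earlier files."""
    if target.is_dir() and any(target.iterdir()) and not force:
        raise FileExistsError(f"{target} is not empty; pass force=True to overwrite")
    written: set[str] = set()
    for n in range(1, phase + 1):
        root = templates / f"phase{n}"
        if not root.is_dir():
            continue
        for src in sorted(p for p in root.rglob("*") if p.is_file()):
            rel = src.relative_to(root)
            dest = target / rel
            dest.parent.mkdir(parents=True, exist_ok=True)
            rendered = render_tokens(src.read_text(encoding="utf-8"), values)
            dest.write_text(rendered, encoding="utf-8")  # a later phase overwrites
            written.add(rel.as_posix())
    return sorted(written)
```

Because phases are applied in ascending order, a file such as docker-compose.yml that exists in both phase1 and phase2 ends up with the phase2 contents.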

See references/architecture.md, references/ui-ux.md, references/security.md, references/models.md (computer-use capability tiers: Claude native vs reasoning/vision vs weak local), and references/design-ultraplan.md (design-driven planning via /gerdsenai:design-plan → /ultraplan → frontend-design) for detail.
