From cli-anything-web
Analyzes captured HTTP traffic from raw-traffic.json, identifies protocols and endpoints, designs Click CLI architecture, and implements Python CLI package for API wrappers.
Slash command: /cli-anything-web:methodology
Analyze captured traffic, design the CLI command structure, and implement the complete Python CLI package. This skill owns the core transformation from raw HTTP traffic to a production-ready CLI.
Do NOT start unless:
raw-traffic.json exists (with WRITE operations, or read-only GET-only traffic)
If raw-traffic.json is missing or has no WRITE operations, invoke the capture skill first.
Exception for read-only sites: If the site is genuinely read-only (search engine,
dashboard, analytics viewer with no create/update/delete), the trace may contain only
GET requests. In this case, note "read-only site — no write operations" in <APP>.md
and proceed. The generated CLI will have read-only commands (list, get, search) but
no create/update/delete commands. This is valid.
No-auth sites: If the target site requires no authentication (public API,
no login needed), the "Auth state captured" prerequisite does not apply. Note
"no-auth site" in <APP>.md and proceed.
Goal: Map raw traffic to a structured API model.
Process:
Read traffic-analysis.json first (if it exists alongside raw-traffic.json).
This file is auto-generated by parse-trace.py or mitmproxy-capture.py → analyze-traffic.py and contains
pre-detected protocol type, auth pattern, endpoint grouping, GraphQL operations,
batchexecute RPC IDs, and suggested CLI commands. Use it as a starting point —
verify its findings and fill in anything marked "unknown" by reading raw-traffic.json
manually.
Enhanced analysis (v1.3.0, when captured via mitmproxy-capture.py):
- request_sequence: Timeline-ordered requests with auth flow detection (login → token → API calls)
- session_lifecycle: Cookie inventory, auth cookie identification, session pattern (cookie_auth/token_refresh/no_session)
- endpoint_sizes: Response body size classification per endpoint (small/medium/large) and total data transferred
These fields are only present when mitmproxy-capture.py was used. If missing (has_timestamps: false), rely on manual analysis.
If traffic-analysis.json doesn't exist, run the analyzer:
python ${CLAUDE_PLUGIN_ROOT}/scripts/analyze-traffic.py \
<app>/traffic-capture/raw-traffic.json --summary
Parse raw-traffic.json (for details the analyzer couldn't extract)
Group requests by base path (e.g., /api/v1/boards/, /api/v1/items/)
For each endpoint group, identify:
:id)
Identify RPC protocol type -- classify the API transport:
| Protocol | Detection Signal | Client Pattern |
|---|---|---|
| REST | Resource URLs (/api/v1/boards/:id), standard HTTP methods | client.py with method-per-endpoint |
| GraphQL | Single /graphql endpoint, query/mutation in body | client.py with query templates |
| gRPC-Web | application/grpc-web content type, binary payloads | Proto-based client |
| Google batchexecute | batchexecute in URL, f.req= body, )]}'\n prefix | rpc/ subpackage (see references/google-batchexecute.md) |
| Custom RPC | Single endpoint, method name in body, proprietary encoding | Custom codec module |
| Public REST API | Documented /api/ endpoints, OpenAPI spec, JSON responses | Standard client.py with httpx |
| Plain HTML (no framework) | No SPA root, no framework globals, data in <table>/<div> | client.py with httpx + BeautifulSoup4 |
This determines client architecture in Step B -- REST uses simple client.py,
non-REST protocols need a dedicated rpc/ subpackage with encoder/decoder/types.
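The detection signals in the table can be approximated with a few string checks. The following is a minimal sketch, not the analyzer's actual logic: `classify_protocol` and its return labels are hypothetical names, and it assumes each captured request exposes its URL, content type, and body as strings.

```python
from urllib.parse import urlparse


def classify_protocol(url: str, content_type: str = "", body: str = "") -> str:
    """Return a protocol label for one captured request (heuristic only)."""
    path = urlparse(url).path
    content_type = content_type.lower()

    # Google batchexecute: "batchexecute" in the URL and an f.req= form body.
    if "batchexecute" in path and "f.req=" in body:
        return "batchexecute"
    # gRPC-Web is identified by its content type.
    if content_type.startswith("application/grpc-web"):
        return "grpc-web"
    # GraphQL: a single /graphql endpoint with query/mutation in the body.
    if path.rstrip("/").endswith("/graphql") and ("query" in body or "mutation" in body):
        return "graphql"
    # Resource-style URLs under /api/ are treated as REST (assumption).
    if "/api/" in path:
        return "rest"
    return "unknown"
```

A result of "unknown" is the cue to read raw-traffic.json manually, as the process above already requires for anything the analyzer could not classify.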
Detect data model:
Detect auth pattern:
WIZ_global_data),
not in HTTP headers. Requires CDP for initial cookies, HTTP for token extraction.
See references/auth-strategies.md "Browser-Delegated Auth" section.
Write <APP>.md -- software-specific SOP document
Output: <APP>.md with API map, data model, auth scheme.
References: traffic-patterns.md, google-batchexecute.md, ssr-patterns.md
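The "group requests by base path" step in the process above can be sketched as follows. This is an illustrative helper, not part of the plugin: `normalize_path` and `group_endpoints` are hypothetical names, the input is assumed to be (method, path) pairs already extracted from raw-traffic.json, and the ID-detection regex is a rough heuristic.

```python
import re
from collections import defaultdict

# Segments that look like IDs: all digits, or 8+ hex/dash characters (heuristic).
_ID_SEGMENT = re.compile(r"^(\d+|[0-9a-fA-F-]{8,})$")


def normalize_path(path: str) -> str:
    """Replace ID-like segments with :id, e.g. /api/v1/boards/42 -> /api/v1/boards/:id."""
    parts = [":id" if _ID_SEGMENT.match(p) else p for p in path.strip("/").split("/")]
    return "/" + "/".join(parts)


def group_endpoints(requests):
    """Group (method, path) pairs by resource name (the last non-ID path segment)."""
    groups = defaultdict(set)
    for method, path in requests:
        normalized = normalize_path(path)
        segments = [s for s in normalized.strip("/").split("/") if s and s != ":id"]
        resource = segments[-1] if segments else "root"
        groups[resource].add((method.upper(), normalized))
    # Sort for stable output when writing the API map.
    return {name: sorted(pairs) for name, pairs in groups.items()}
```

Each key of the returned dict is a candidate entry for the API map in <APP>.md.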
Before implementing, read an existing CLI that uses the same protocol as your target. These are battle-tested implementations that solved the same problems you'll face.
| Protocol | Reference CLI | Key files to read |
|---|---|---|
| Google batchexecute | notebooklm/agent-harness/cli_web/notebooklm/ | core/rpc/encoder.py, core/rpc/decoder.py, core/client.py, core/auth.py |
| GraphQL + WAF | booking/agent-harness/cli_web/booking/ | core/client.py (curl_cffi + GraphQL), core/auth.py (WAF tokens) |
| HTML scraping | futbin/agent-harness/cli_web/futbin/ | core/client.py (httpx + BS4), commands/players.py |
| HTML + Cloudflare | producthunt/agent-harness/cli_web/producthunt/ | core/client.py (curl_cffi impersonate) |
| REST API | unsplash/agent-harness/cli_web/unsplash/ | core/client.py, commands/photos.py |
| Simple HTML | gh-trending/agent-harness/cli_web/gh_trending/ | Minimal structure example |
How to use reference CLIs:
- core/client.py -- understand the request/response pattern
- core/auth.py -- copy the login_browser() pattern exactly for Google apps
- core/rpc/ (for batchexecute) -- understand encoder/decoder, DO NOT reinvent
- commands/ -- see how Click commands are structured, how --json works
- utils/helpers.py -- see handle_errors(), _resolve_cli(), repl patterns
For batchexecute apps specifically, the notebooklm CLI is your bible.
The agent implementing the CLI MUST read these files before writing code. Use the
Agent tool to dispatch a research agent that reads
the reference implementation while you design the command structure.
Before writing any code, note the command structure in <APP>.md (10 minutes max):
- /api/v1/boards/* → boards command group
- /api/v1/items/* → items command group
- HTTP methods map to commands (GET collection → list, GET single → get, POST → create, PUT/PATCH → update, DELETE → delete)
- Auth commands: auth login, auth status, auth refresh; credentials at ~/.config/cli-web-<app>/auth.json
- REPL: repl_skin.py
Goal: Generate the complete Python CLI package.
See HARNESS.md "Generated CLI Structure" for the complete package template.
Key points: cli_web/ namespace (NO __init__.py), <app>/ sub-package (HAS __init__.py),
core/, commands/, utils/, tests/ directories.
Before writing implementation code, read ${CLAUDE_PLUGIN_ROOT}/skills/boilerplate/SKILL.md
and follow its instructions to scaffold the core/ modules. This generates exceptions.py,
client.py skeleton, helpers.py, config.py, and (for batchexecute) the rpc/ subpackage.
After scaffolding, review the generated files and customize client.py with actual
endpoint methods from <APP>.md.
exceptions.py -- implement first. Required types: AppError (base), AuthError(recoverable), RateLimitError(retry_after), NetworkError, ServerError(status_code), NotFoundError. See references/exception-hierarchy-example.py for the complete template.
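A minimal sketch of the required types is shown here for orientation; references/exception-hierarchy-example.py remains the complete template. The constructor signatures, the helper `error_for_status`, and the choice of 401/403 for AuthError are assumptions made for this illustration.

```python
from typing import Optional


class AppError(Exception):
    """Base class for all CLI errors."""


class AuthError(AppError):
    def __init__(self, message: str, recoverable: bool = True):
        super().__init__(message)
        self.recoverable = recoverable  # True when a token refresh may fix it


class RateLimitError(AppError):
    def __init__(self, message: str, retry_after: Optional[float] = None):
        super().__init__(message)
        self.retry_after = retry_after  # seconds to wait, if the server said so


class NetworkError(AppError):
    """Connection failures and timeouts."""


class ServerError(AppError):
    def __init__(self, message: str, status_code: int):
        super().__init__(message)
        self.status_code = status_code


class NotFoundError(AppError):
    """The requested resource does not exist."""


def error_for_status(status_code: int, retry_after: Optional[float] = None):
    """Return the exception matching an HTTP status, or None for other statuses."""
    if status_code in (401, 403):  # assumption: both are treated as auth failures
        return AuthError(f"HTTP {status_code}", recoverable=(status_code == 401))
    if status_code == 404:
        return NotFoundError("HTTP 404")
    if status_code == 429:
        return RateLimitError("HTTP 429", retry_after=retry_after)
    if 500 <= status_code < 600:
        return ServerError(f"HTTP {status_code}", status_code=status_code)
    return None
```

Returning the exception rather than raising it lets the client decide whether to retry first.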
client.py -- HTTP client with exception mapping and auth retry:
- httpx (default) -- for most sites (REST, GraphQL, batchexecute)
- curl_cffi -- for Cloudflare-protected sites. Uses Chrome TLS fingerprint impersonation to bypass bot detection without cookies or auth:
from curl_cffi import requests as curl_requests
resp = curl_requests.get(url, impersonate="chrome")
Use curl_cffi when Phase 1 detects Cloudflare (cf-ray header, challenge page).
Add curl_cffi, beautifulsoup4 to setup.py instead of httpx.
- Exception mapping by status: auth failures→AuthError, 404→NotFoundError, 429→RateLimitError, 5xx→ServerError
- On AuthError(recoverable=True), refresh tokens and retry once
- Rate-limit retry with exponential backoff (see references/polling-backoff-example.py)
- Namespaced sub-clients (client.notebooks.list(), client.sources.add())
- See references/client-architecture-example.py for the full pattern
auth.py -- handles token storage, refresh, expiry. Implementation depends on auth type:
For no-auth sites: DO NOT create auth.py, session.py, or auth command groups.
These files are dead code for public APIs and confuse users. The CLI should have
NO auth-related files or commands. The only exception is if the site has optional
auth (e.g., API key for write operations) — in that case, implement a minimal
auth module.
For browser-delegated auth (Google, Microsoft, etc.): Full playwright-cli login flow with cookie domain priority for international users.
See references/auth-strategies.md for all patterns (browser login, cookie priority, API key, env var, context commands).
Store cookies at ~/.config/cli-web-<app>/auth.json with chmod 600.
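One way to satisfy the chmod 600 requirement is to create the file with owner-only permissions from the start. The sketch below assumes a POSIX system; `auth_path`, `save_auth`, and `load_auth` are hypothetical helper names, and the JSON layout of auth.json is left to the caller.

```python
import json
import os
from pathlib import Path


def auth_path(app: str) -> Path:
    """Location of the credentials file for a given app name."""
    return Path.home() / ".config" / f"cli-web-{app}" / "auth.json"


def save_auth(path: Path, data: dict) -> None:
    """Write credentials as JSON, readable and writable by the owner only."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create the file with mode 0600 so it is never briefly world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(data, fh)
    # Re-apply the mode in case the file already existed with looser permissions.
    os.chmod(path, 0o600)


def load_auth(path: Path) -> dict:
    """Read credentials back from disk."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

On Windows, os.chmod only toggles the read-only flag, so this does not provide the same protection there.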
Anti-bot resilient client construction (when detected in Phase 2):
- Build label (bl), session IDs (f.sid), or CSRF tokens -- extract dynamically at runtime
- Site-specific request headers (x-same-domain: 1 for Google apps)
- See references/google-batchexecute.md for the complete Google pattern
RPC codec subpackage (for non-REST protocols like batchexecute):
When the API uses a non-REST protocol, add core/rpc/ with:
- types.py -- method ID enum, URL constants
- encoder.py -- request encoding (protocol-specific format)
- decoder.py -- response decoding (strip prefix, parse chunks, extract results)
The client.py still exists but delegates encoding/decoding to rpc/.
Progress feedback -- Use rich>=13.0 spinners for operations >2s (suppress in --json mode). See references/rich-output-example.py.
JSON error output -- --json mode errors are JSON too, not plain text. Standard codes: AUTH_EXPIRED, RATE_LIMITED, NOT_FOUND, SERVER_ERROR, NETWORK_ERROR. Implement via utils/output.py json_error().
All commands use handle_errors(json_mode) context manager — centralizes error handling, exit codes (1=user, 2=system, 130=interrupt), and JSON errors. See references/helpers-module-example.py.
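The two rules above (JSON errors in --json mode, one context manager for all commands) can be sketched together. This is an illustration under stated assumptions, not the content of references/helpers-module-example.py: the JSON payload shape, the `code`/`exit_code` class attributes, and which errors count as "user" versus "system" are all choices made for this example.

```python
import json
import sys
from contextlib import contextmanager


class AppError(Exception):
    """Stand-in for the base exception defined in exceptions.py."""
    code = "SERVER_ERROR"
    exit_code = 2  # system error


class NotFoundError(AppError):
    code = "NOT_FOUND"
    exit_code = 1  # user error (assumption)


def json_error(code: str, message: str) -> str:
    """Serialize an error for --json mode (payload shape is an assumption)."""
    return json.dumps({"error": True, "code": code, "message": message})


@contextmanager
def handle_errors(json_mode: bool = False):
    """Convert exceptions raised inside a command into output plus an exit code."""
    try:
        yield
    except KeyboardInterrupt:
        sys.exit(130)
    except AppError as exc:
        if json_mode:
            print(json_error(exc.code, str(exc)))
        else:
            print(f"Error: {exc}", file=sys.stderr)
        sys.exit(exc.exit_code)
```

A command body then reduces to `with handle_errors(json_mode): ...` with no try/except of its own.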
Generation commands support --wait, --retry N, --output path — for agent-scriptable end-to-end workflows. See references/polling-backoff-example.py.
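Behind --wait sits a polling loop with exponential backoff. A minimal sketch follows; only the name poll_until_complete(check_fn) comes from this document, while the remaining parameters, their defaults, and the "return None while pending" convention are assumptions. The `sleep` and `clock` parameters exist so the loop can be tested without real delays.

```python
import time


def poll_until_complete(check_fn, initial_delay=1.0, max_delay=30.0, factor=2.0,
                        timeout=300.0, sleep=time.sleep, clock=time.monotonic):
    """Call check_fn() until it returns a non-None result, backing off exponentially."""
    deadline = clock() + timeout
    delay = initial_delay
    while True:
        result = check_fn()
        if result is not None:
            return result
        # Give up if the next wait would pass the deadline.
        if clock() + delay > deadline:
            raise TimeoutError(f"operation did not complete within {timeout}s")
        sleep(delay)
        # Grow the delay geometrically, capped at max_delay.
        delay = min(delay * factor, max_delay)
```

With the defaults, the waits are 1s, 2s, 4s, 8s, 16s, then 30s until the timeout is reached.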
Windows UTF-8 fix — Add at the top of <app>_cli.py before any imports that print:
import sys
if sys.stdout.encoding and sys.stdout.encoding.lower() not in ("utf-8", "utf8"):
    try:
        sys.stdout.reconfigure(encoding="utf-8", errors="replace")
    except AttributeError:
        pass  # stdout was replaced by a stream without reconfigure()
HTML table parsers MUST extract ALL visible columns — not just name/price,
because missing fields in --json output make the CLI useless for filtering and analysis.
If the site shows version, club, nation, stats, skills, weak foot — parse all of them.
Empty fields in --json output = incomplete parser.
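A parser that keys every cell by its column header cannot silently drop a column. The sketch below uses the standard library's html.parser only to stay dependency-free; the CLIs described here use BeautifulSoup4, and `TableParser` and `parse_table` are hypothetical names. It assumes a single table whose header cells are <th> elements.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect every <th> as a column name and every <td> as a cell value."""

    def __init__(self):
        super().__init__()
        self.headers = []
        self.rows = []
        self._row = None       # cells of the <tr> currently being read
        self._cell = None      # text chunks of the <th>/<td> currently being read
        self._is_header = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("th", "td"):
            if self._row is None:
                self._row = []
            self._cell = []
            self._is_header = tag == "th"

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            text = " ".join("".join(self._cell).split())  # collapse whitespace
            target = self.headers if self._is_header else self._row
            target.append(text)
            self._cell = None
        elif tag == "tr":
            if self._row:  # header-only rows leave this empty
                self.rows.append(self._row)
            self._row = None


def parse_table(html: str) -> list:
    """Return one dict per data row, with a key for every visible column."""
    parser = TableParser()
    parser.feed(html)
    return [dict(zip(parser.headers, row)) for row in parser.rows]
```

Because the keys come from the page's own header row, a column added by the site later shows up in --json output without a parser change.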
Entry point: cli-web-<app> via setup.py console_scripts
Namespace: cli_web.*
Copy repl_skin.py from plugin for consistent REPL experience
utils/helpers.py -- shared CLI helpers (generate for every CLI):
- resolve_partial_id(partial, items) -- prefix-match UUIDs for get/rename/delete
- handle_errors(json_mode) -- context manager replacing try/except in all commands
- require_notebook(notebook_arg) -- gets notebook ID from arg or persistent context
- sanitize_filename(name) -- safe filenames from artifact titles
- poll_until_complete(check_fn) -- exponential backoff polling
- get_context_value(key) / set_context_value(key, value) -- persistent context.json
See references/helpers-module-example.py for the complete module.
Not all helpers apply to every CLI. Include only what the CLI uses:
- handle_errors and print_json are always needed.
- resolve_partial_id only for UUID-based apps.
- require_notebook/context helpers only for apps with persistent context.
- poll_until_complete only for generation/async operations.
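Two of the smaller helpers can be sketched briefly. The function names come from the list above; the assumption that `items` is a list of dicts with an "id" key, the error type, the character set treated as unsafe, and the "untitled" fallback are choices made for this illustration.

```python
import re


def resolve_partial_id(partial: str, items: list) -> str:
    """Return the single full ID that starts with `partial`."""
    matches = [item["id"] for item in items if item["id"].startswith(partial)]
    if not matches:
        raise LookupError(f"no item matches ID prefix {partial!r}")
    if len(matches) > 1:
        raise LookupError(f"ID prefix {partial!r} is ambiguous ({len(matches)} matches)")
    return matches[0]


def sanitize_filename(name: str, max_length: int = 100) -> str:
    """Replace characters that are unsafe in filenames and trim the result."""
    cleaned = re.sub(r'[<>:"/\\|?*\x00-\x1f]', "_", name).strip(" .")
    return cleaned[:max_length] or "untitled"
```

Raising on an ambiguous prefix, rather than picking the first match, matters most for delete commands.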
These four bugs appear in almost every generated REPL. Get them right the first time:
1. Use shlex.split(), never line.split()
# ✓ Correct — handles quoted args: players search "messi" -> ['players', 'search', 'messi']
import shlex
args = shlex.split(line)
# ✗ Wrong — produces: ['players', 'search', '"messi"'] — quotes become part of the value
args = line.split()
2. Never pass **ctx.params to cli.main() in REPL dispatch
# ✓ Correct — preserve --json flag by prepending to args
repl_args = ["--json"] + args if ctx.obj.get("json") else args
cli.main(args=repl_args, standalone_mode=False)
# ✗ Wrong — ctx.params = {"json_mode": False} gets passed to Context.__init__()
# which doesn't accept it → TypeError: Context.__init__() got an unexpected
# keyword argument 'json_mode'
cli.main(args=args, standalone_mode=False, **ctx.params)
3. Keep _print_repl_help() in sync with the actual command surface
The _print_repl_help() function in <app>_cli.py is the user's first discovery surface — it's what they see when they type help in the REPL. It must mirror the real commands, including all key options. A REPL that shows outdated or incomplete help is confusing and makes the CLI feel broken.
# ✓ Correct — help lists actual options users can pass
def _print_repl_help():
    _skin.info("Available commands:")
    print(" players list [OPTIONS]")
    print(" --position <GK|ST|CM|...> Filter by position")
    print(" --rating-min N --rating-max N Rating range")
    print(" --cheapest Sort cheapest first")
# ✗ Wrong — stale help doesn't mention new --position, --rating-min, etc.
def _print_repl_help():
    print(" players list [--min-price N] List players with filters")
Rule: every time you add options to a command, update _print_repl_help() in the same commit.
4. Use @click.argument for positional REPL params, not @click.option("--x", required=True)
REPL commands show players search <query> in help. If query is a --query option,
users typing players search messi get "Error: Missing option '--query'".
Use positional arguments for natural command-line style:
# ✓ Correct — users type: players search messi OR players get 21610
@players.command()
@click.argument("query")
def search(query): ...
@players.command()
@click.argument("player_id", type=int)
def get(player_id): ...
# ✗ Wrong — users get an error unless they type: players search --query messi
@players.command()
@click.option("--query", required=True)
def search(query): ...
Rule of thumb: if a command takes a single required value that would be a positional arg
in a shell command (git checkout main, grep pattern), use @click.argument.
Use @click.option only for optional or named parameters (--rating-min, --platform).
When the CLI has 3+ command groups (e.g., notebooks, sources, chat, artifacts), dispatch parallel subagents -- one per command module. Each agent gets:
- <APP>.md API spec for its resource
- client.py and auth.py interfaces it depends on
- A scoped task, e.g. "commands/notebooks.py with list, get, create, delete"
Parallelization opportunities:
| Independent from each other | Dispatch in parallel |
|---|---|
| commands/notebooks.py, commands/sources.py, commands/chat.py | Yes -- each command file only depends on client.py |
| rpc/encoder.py and rpc/decoder.py | Yes -- encoder doesn't depend on decoder |
| auth.py and models.py | Yes -- no shared logic |
| client.py and commands/* | No -- commands depend on client |
| <app>_cli.py (entry point) | Last -- imports all commands, write after they're done |
Implementation order (with maximum parallelism):
Phase A (sequential): Write core foundation
exceptions.py → client.py → auth.py (if needed) → models.py
Phase B (parallel): Dispatch ALL independent work simultaneously
┌─ Agent 1: commands/notebooks.py
├─ Agent 2: commands/sources.py
├─ Agent 3: commands/chat.py
├─ Agent 4: commands/artifacts.py
├─ Agent 5: rpc/encoder.py + rpc/decoder.py (if non-REST)
└─ Agent 6 (background): test_core.py (unit tests for core modules)
All run concurrently — each only depends on Phase A modules
Phase C (sequential): Wire everything together
utils/helpers.py → <app>_cli.py → __main__.py → setup.py → copy repl_skin.py
Key parallelism rules:
- One agent per command module (one commands/*.py file)
- Entry point and packaging (<app>_cli.py, setup.py) must come last (depends on all commands)
Before invoking testing, install (pip install -e .) and verify:
- cli-web-<app> --help loads
- cli-web-<app> auth status --json shows valid (if auth-required)
- cli-web-<app> <resource> list --json returns real data
Red flags -- fix before testing:
- wrb.fr, af.httprm in output → decoder broken
- [] or null where data expected → wrong params or client-side operation; see references/google-batchexecute.md "Client-Side Operations"
Update phase state:
python ${CLAUDE_PLUGIN_ROOT}/scripts/phase-state.py complete <app> \
--phase methodology --output <app>/agent-harness/
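The red flags listed in the smoke check lend themselves to an automated scan of a command's --json output. The sketch below is one possible shape: `find_red_flags` is a hypothetical helper, and treating non-JSON output as a flag is an added assumption rather than a rule from this document.

```python
import json

# Raw batchexecute markers that should never survive decoding.
DECODER_MARKERS = ("wrb.fr", "af.httprm")


def find_red_flags(stdout: str) -> list:
    """Return human-readable red flags found in a command's --json output."""
    flags = []
    for marker in DECODER_MARKERS:
        if marker in stdout:
            flags.append(f"raw protocol marker {marker!r} in output: decoder broken")
    try:
        data = json.loads(stdout)
    except json.JSONDecodeError:
        flags.append("output is not valid JSON")  # assumption: --json must emit JSON
        return flags
    if data is None or data == []:
        flags.append("empty result where data was expected: "
                     "wrong params or client-side operation")
    return flags
```

An empty list means the output passed these particular checks; it does not replace looking at whether the data is real.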
When implementation is complete and the smoke check passes, invoke the testing
skill to plan and write tests.
Do NOT skip testing -- every CLI must have comprehensive tests before publishing.
| Skill | When it activates |
|---|---|
| capture | Phase 1 -- traffic recording (prerequisite for this skill) |
| testing | Phase 3 -- test writing, documentation |
| standards | Phase 4 -- publish, verify, smoke test |
| Relationship | Skill |
|---|---|
| Preceded by | capture (Phase 1) |
| Followed by | testing (Phase 3) |
| References | traffic-patterns.md, auth-strategies.md, google-batchexecute.md, ssr-patterns.md, exception-hierarchy-example.py, client-architecture-example.py, polling-backoff-example.py, rich-output-example.py |
- references/traffic-patterns.md -- Common API patterns (REST, GraphQL, RPC)
- references/auth-strategies.md -- Auth implementation strategies
- references/google-batchexecute.md -- Google batchexecute RPC protocol spec
- references/ssr-patterns.md -- SSR framework patterns and data extraction strategies
- references/exception-hierarchy-example.py -- Complete exception hierarchy with HTTP status mapping
- references/client-architecture-example.py -- Namespaced sub-client pattern with auth retry
- references/polling-backoff-example.py -- Exponential backoff polling and rate-limit retry
- references/rich-output-example.py -- Rich progress bars, JSON error responses, table formatting