Preflight UX

Preflight UX is an open toolkit for pre-ship UX risk review. It runs a structured panel of personas against a single product surface, normalizes their findings into a shared issue taxonomy, and keeps the raw evidence needed to score predictions against known product outcomes.

The project is designed for teams that want faster early critique without confusing model output for user research. A panel finding is a product-risk hypothesis until it is validated by benchmark scoring, user observation, telemetry, support data, or another real-world signal.

What It Contains

A persona library for expert and user-type review lenses.
An issue taxonomy for common pre-ship UX risks.
Benchmark scaffolds for products with documented post-launch UX issues.
JSON schemas for review briefs, panel runs, findings, benchmark metadata, and scores.
A small Python CLI for validation, benchmark scaffolding, run scaffolding, scoring, and report generation.
A deployable BYOK web UI for building review briefs, attaching screenshots, running a panel through a user-provided model key, and exporting repo-ready artifacts.

Current Status

This repository is an early public scaffold. The method is intentionally evidence-scoped:

The schemas, taxonomy, prompts, CLI, web UI, and structural checks are usable.
Seed benchmark entries are draft examples until their launch surfaces and scorer reviews are complete.
Personas are not benchmark-validated yet.
Reports should be treated as decision support, not as validated user research.

The next credibility step is to promote benchmark entries from draft to ready, run panels against them, and publish hits, misses, false positives, and persona-by-issue-class calibration notes.
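As a rough illustration of what publishing hits, misses, false positives, and persona-by-issue-class notes could involve, the sketch below compares normalized findings against a benchmark's known issue classes. It is not the scorer behind `uxpanel score`: the function name, the `(persona, issue_class)` tuple shape, and the persona and issue-class labels are all hypothetical and chosen only for the example.

```python
from collections import defaultdict


def score_findings(findings, known_issue_classes):
    """Compare panel findings with known issues (illustrative only).

    findings: iterable of (persona, issue_class) tuples (hypothetical shape).
    known_issue_classes: issue classes documented for the benchmark product.
    """
    known = set(known_issue_classes)
    predicted = {issue for _, issue in findings}

    # Tally, per persona, how many findings matched a known issue.
    by_persona = defaultdict(lambda: {"hits": 0, "false_positives": 0})
    for persona, issue in findings:
        key = "hits" if issue in known else "false_positives"
        by_persona[persona][key] += 1

    return {
        "hits": sorted(predicted & known),
        "misses": sorted(known - predicted),
        "false_positives": sorted(predicted - known),
        "by_persona": {p: dict(c) for p, c in by_persona.items()},
    }


# Hypothetical personas and issue classes, not taken from the repository.
example = score_findings(
    [
        ("accessibility-expert", "low-contrast"),
        ("novice-user", "unclear-onboarding"),
        ("novice-user", "hidden-pricing"),
    ],
    ["low-contrast", "unclear-onboarding", "data-loss-on-back"],
)
```

Keeping misses and false positives as separate lists, rather than collapsing them into one accuracy number, is what lets contributors see where a given persona over-reports or under-reports.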

Positioning

Preflight UX is adjacent to browser-agent usability testing, synthetic heuristic evaluation, persona-conditioned UI/UX evaluation, and LLM simulation benchmarks. It does not claim that LLM personas are new.

The project focuses on a different layer: an open, repo-native calibration loop for deciding when synthetic UX critique is useful. The intended contribution is the combination of shared issue classes, benchmark surfaces, normalized predictions, scored misses and false positives, persona reliability notes, and product-ready exports.

The open-source posture is part of the method. Prompts, schemas, taxonomy, benchmark entries, redacted execution receipts, score files, and report templates should be inspectable enough that contributors can challenge scoring decisions and improve the calibration loop. Public run folders keep normalized findings and redacted execution receipts; raw model transcripts are archived outside the public artifact surface.

See docs/POSITIONING.md and docs/EVALUATION_PROTOCOL.md.

Quick Start

Validate the repository:

python3 tools/validate_repo.py
python3 -m uxpanel validate

Install As Agent Skills

Preflight UX also ships as a small skill plugin for Claude Code/Cowork and Codex-style skill workflows.

Claude Code:

claude plugin marketplace add sparckix/preflight-ux
claude plugin install preflight-ux-review@preflight-ux

Codex:

codex plugin marketplace add sparckix/preflight-ux
codex plugin add preflight-ux-review@preflight-ux

Installed skills:

catch-ux-risks — review a product surface for launch risks
calibrate-ux-findings — compare findings against known failures
write-ux-risk-report — turn findings into a product-ready report

The skill plugin is a distribution layer. The benchmark, CLI, schemas, and scoring artifacts in this repo remain the source of truth. The plugin includes a small helper script that locates this checkout from the current workspace or PREFLIGHT_UX_REPO and delegates to python -m uxpanel when repo-backed validation, scoring, or report generation is available.
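The lookup described above might work roughly as follows. This is a minimal sketch, not the plugin's actual helper script: the function names, the choice to check PREFLIGHT_UX_REPO before the workspace, and the use of a `uxpanel` directory as the marker of a checkout are assumptions made for illustration.

```python
import os
import sys
from pathlib import Path


def find_repo(start, env=None):
    """Return the checkout root, or None if no checkout is found.

    Assumption: a checkout is any directory containing a `uxpanel` folder.
    Assumption: PREFLIGHT_UX_REPO takes precedence over the workspace.
    """
    env = os.environ if env is None else env
    override = env.get("PREFLIGHT_UX_REPO")
    if override and (Path(override) / "uxpanel").is_dir():
        return Path(override).resolve()

    # Walk upward from the workspace until a checkout marker appears.
    start = Path(start).resolve()
    for candidate in (start, *start.parents):
        if (candidate / "uxpanel").is_dir():
            return candidate
    return None


def build_command(args):
    """Build the argv used to delegate to the CLI (nothing is executed)."""
    return [sys.executable, "-m", "uxpanel", *args]
```

A caller could then pass `build_command(["validate"])` to `subprocess.run` with `cwd` set to the path returned by `find_repo`, and fall back to skill-only behaviour when `find_repo` returns `None`.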

From a repository checkout, create a benchmark entry:

python3 -m uxpanel new-benchmark example-product-2026

Inspect benchmark readiness and scored runs:

python3 -m uxpanel benchmark-status

Create a run scaffold:

python3 -m uxpanel run \
  --surface benchmark/products/example-product-2026/surface.md \
  --surface-type benchmark \
  --panel panels/default.yaml \
  --run-id example-product-2026-seed

Create a baseline scaffold for comparison:

python3 -m uxpanel baseline \
  --surface benchmark/products/example-product-2026/surface.md \
  --surface-type benchmark \
  --kind generic-critique \
  --run-id example-product-2026-generic-baseline

Generate a Markdown report from a run:

python3 -m uxpanel report \
  --run runs/example-product-2026-seed/run.json \
  --out reports/example-product-2026-seed.md

Score a run against known issues:

python3 -m uxpanel score \
  --run runs/example-product-2026-seed/run.json \
  --benchmark benchmark/products/example-product-2026 \
  --out calibration/example-product-2026-seed.score.json
