From harbormaster
Run scripted regression UI flows against iOS simulator and Android emulator for React Native, Expo, or native mobile apps using Maestro. Use this skill whenever the user wants to test mobile builds locally, smoke test before pushing to the App Store or Play Store, run click-through tests on a simulator, validate Expo dev build changes, exercise an iOS sim or Android emulator programmatically, or do regression testing on mobile. Trigger even if the user does not mention "Maestro" — any request to drive an iOS simulator or Android emulator, run mobile UI tests locally, or validate mobile app behaviour before release should activate this skill.
Slash command: /harbormaster:harbormaster
Drive iOS simulators and Android emulators with scripted [Maestro](https://maestro.mobile.dev) flows. Built for the loop: "I made a change → did I break anything obvious?" — before pushing to the stores.
This skill targets macOS (iOS simulator requires Xcode, which is macOS-only). Android automation works on Linux/Windows too, but this skill currently assumes macOS.
Required:
Optional:
For deeper setup help, see references/ios-setup.md and references/android-setup.md.
Always step through these in order. Do not skip preflight even if the environment "looks fine."
Run ${CLAUDE_PLUGIN_ROOT}/scripts/preflight.sh. It verifies:
- maestro on PATH (offers install command if missing — required for both platforms)

The script exits 0 if at least one of iOS or Android is fully usable. Its final line reads Platforms usable: iOS + Android, iOS only, or Android only — read it to constrain later steps. If neither path is usable or maestro is missing, preflight exits 1.
If a required tool is missing, walk the user through fixing it interactively. Don't dump the raw error output and stop — explain what's missing, why it matters, and the exact command to fix it. The preflight script's output is already structured this way; relay it cleanly to the user.
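One way to turn preflight's final line into something later steps can branch on is sketched below. The helper name usable_platforms is hypothetical, and the sketch assumes the final line has exactly the "Platforms usable: ..." form quoted above; any other output is treated as "nothing usable".

```python
def usable_platforms(preflight_output: str) -> set:
    """Return the platforms named on preflight's final 'Platforms usable:' line."""
    lines = [line.strip() for line in preflight_output.splitlines() if line.strip()]
    if not lines or not lines[-1].startswith("Platforms usable:"):
        # Preflight failed or printed something unexpected: treat nothing as usable.
        return set()
    summary = lines[-1].split(":", 1)[1].strip()
    known = {
        "iOS + Android": {"ios", "android"},
        "iOS only": {"ios"},
        "Android only": {"android"},
    }
    return set(known.get(summary, set()))
```

An empty set corresponds to the exit-1 case, so callers can stop and walk the user through the fix instead of continuing to device selection.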
Ask the user — or infer from project context (look for app.json, app.config.ts, ios/, android/):
- Platform: ios, android, or both. Default rules:
  - iOS + Android usable → default both.
  - iOS only → default ios (do not offer android or both — they will fail at boot).
  - Android only → default android (same).
- App source:
  - dev-build — a built .app (iOS sim) or .apk (Android). Recommended default for regression suites. No Expo Go overlays, no dev menu interruptions, no extra cold-reload tax. Slower first build, but cleaner steady-state.
  - expo-go — Expo Go installed, dev server running, deep-link to project. Inner-loop only. Expo Go can pop up dev tools / "What's new" / network-permission dialogs that block flows and require manual user input — fine for quick "did I break the launch?" checks, not suitable for a hands-off pre-release smoke run.
  - installed — app already installed; skip the install step.

If the user is testing before pushing a build to TestFlight / App Store / Play Store (the primary use case for this skill), default to dev build. Only fall back to Expo Go when the user explicitly chooses speed over reliability or asks for it.
If the project has <project>/.maestro/config.json, read it for these defaults rather than asking. See Project configuration below.
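The default rules above are small enough to write down as code. This is a sketch only: the function names and the user_prefers_speed flag are hypothetical, and the input is assumed to be a set of lowercase platform names derived from preflight.

```python
def default_platform(usable: set) -> str:
    """Pick the default platform from the set preflight reported as usable."""
    if usable == {"ios", "android"}:
        return "both"
    if usable == {"ios"}:
        return "ios"      # android and both would fail at boot, so never offer them
    if usable == {"android"}:
        return "android"  # same reasoning, mirrored
    raise ValueError("no usable platform; resolve preflight failures first")


def default_source(user_prefers_speed: bool = False) -> str:
    """dev-build unless the user explicitly trades reliability for speed."""
    return "expo-go" if user_prefers_speed else "dev-build"
```

Raising on an empty set keeps the "neither platform usable" case from silently falling through to a default.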
Don't assume a default device — let the user pick. Run ${CLAUDE_PLUGIN_ROOT}/scripts/list-devices.sh <ios|android|both> to enumerate available iOS simulators and Android AVDs, then present them to the user via AskUserQuestion.
When platform = both, you need one of each — an iOS sim and an Android AVD. Run the picker for each platform in turn (or merge into a single two-question prompt). Don't try to substitute one for the other if a platform's pick is unavailable; either fix the missing platform or downgrade to single-platform mode and tell the user.
Selection rules:
- If the config sets ios.preferredDevice (matched by simulator name) or android.preferredAvd, use that and skip the prompt — but still mention which device was picked so the user can override.
- An already-booted device is auto-selected. When platform = both, this rule applies per-platform: e.g. an already-booted iOS sim is auto-selected while the Android AVD still needs a pick.
- iOS options show <name> — <runtime> (e.g. iPhone 17 Pro — iOS 26.2); Android options show the AVD name. Pre-select the most recent runtime / first AVD as the recommended default.

Capture the chosen <udid> (iOS) and/or <avd-name> (Android) and pass them to the next step. Optionally offer to write the choice into <project>/.maestro/config.json as preferredDevice / preferredAvd so the prompt is skipped on future runs.
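A sketch of the iOS side of the picker follows. The device dictionaries (name, runtime, udid) are an assumed shape, not the documented output of list-devices.sh, and the udid values are placeholders. The point it illustrates is that "most recent runtime" should be compared numerically, because plain string comparison would rank iOS 9.3 above iOS 18.0.

```python
from typing import Optional


def runtime_version(runtime: str) -> tuple:
    """Turn 'iOS 26.2' into (26, 2) so runtimes compare numerically."""
    return tuple(int(part) for part in runtime.split()[-1].split("."))


def choose_ios_device(devices: list, preferred: Optional[str] = None) -> tuple:
    """Return (device, needs_prompt) for a list of simulator descriptions."""
    for device in devices:
        if preferred is not None and device["name"] == preferred:
            # Configured pick: skip the prompt, but still tell the user what was chosen.
            return device, False
    # No configured match: recommend the newest runtime and let the user confirm.
    newest = max(devices, key=lambda d: runtime_version(d["runtime"]))
    return newest, True


devices = [
    {"name": "iPhone SE", "runtime": "iOS 17.5", "udid": "PLACEHOLDER-1"},
    {"name": "iPhone 17 Pro", "runtime": "iOS 26.2", "udid": "PLACEHOLDER-2"},
]
```

With this list, choose_ios_device(devices, "iPhone SE") returns the configured device without a prompt, while choose_ios_device(devices) recommends the iPhone 17 Pro and flags that the user should still confirm.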
Run ${CLAUDE_PLUGIN_ROOT}/scripts/boot-sims.sh <ios|android|both> [--ios-udid <udid>] [--avd <name>]. This:
- Waits for boot_completed before returning

Run ${CLAUDE_PLUGIN_ROOT}/scripts/install-app.sh --platform <p> --source <s> [--path <p>] [--url <u>].
install-app.sh takes one platform per invocation. When platform = both, call it twice — once with --platform ios and once with --platform android — using the per-platform devBuildPath (and, for Expo Go, the same dev server URL for both). Surface a per-platform pass/fail to the user so they know which install succeeded if one of them errors.
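The "call it twice" rule could be expressed as a function that expands one request into one argument list per platform. This is a sketch under assumptions: the function name is hypothetical, and it assumes --path carries the dev-build path and --url carries the Expo Go dev server URL, which the flag names suggest but the text above does not spell out.

```python
def install_commands(platform: str, source: str, config: dict) -> list:
    """Build one install-app.sh argument list per target platform."""
    targets = ["ios", "android"] if platform == "both" else [platform]
    commands = []
    for target in targets:
        argv = ["install-app.sh", "--platform", target, "--source", source]
        if source == "dev-build":
            # Assumption: --path points at the per-platform build artifact.
            argv += ["--path", config[target]["devBuildPath"]]
        elif source == "expo-go":
            # Assumption: --url is the shared dev server URL.
            argv += ["--url", config["expoGo"]["devServerUrl"]]
        commands.append(argv)
    return commands


example_config = {
    "ios": {"devBuildPath": "ios/build/Build/Products/Debug-iphonesimulator/Example.app"},
    "android": {"devBuildPath": "android/app/build/outputs/apk/debug/app-debug.apk"},
    "expoGo": {"devServerUrl": "exp://192.168.1.10:8081"},
}
```

Running each returned command separately and recording its exit status gives the per-platform pass/fail that should be surfaced to the user.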
Cross-platform footgun for Expo Go: the APP_ID env var passed at run time is one value, but Expo Go's bundle ID differs by platform (host.exp.Exponent on iOS, host.exp.exponent on Android — capital E matters). On a --platform both run targeting Expo Go, neither value works for the other device. Recommend dev builds whenever the user asks for both; if they insist on Expo Go for both, run each platform sequentially with a single-platform invocation rather than trying to share an APP_ID.
For Expo Go, the script uses xcrun simctl openurl (iOS) or adb shell am start (Android) to deep-link into the running dev server.
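The APP_ID footgun can be sidestepped by planning one single-platform run per device, each with its own value. The two bundle IDs below come from the text above; the function name and the shape of the returned plan are hypothetical.

```python
# Expo Go's identifier differs only in the case of one letter between platforms.
EXPO_GO_APP_ID = {
    "ios": "host.exp.Exponent",
    "android": "host.exp.exponent",
}


def expo_go_runs(platform: str) -> list:
    """Plan sequential single-platform runs, each with its own APP_ID."""
    targets = ["ios", "android"] if platform == "both" else [platform]
    return [{"platform": t, "env": {"APP_ID": EXPO_GO_APP_ID[t]}} for t in targets]
```

A request for both therefore expands into two entries that are run one after the other, never into a single run sharing one APP_ID.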
This skill never stores credentials. Anything sensitive lives only in the user's environment for the duration of the run.
If a flow file references ${USERNAME}, ${PASSWORD}, or any other secret env var:
- If the variable is already set in the environment (e.g. via direnv), use it as-is and don't prompt. Read with printenv NAME; do not log the value.
- Otherwise, use AskUserQuestion for each missing value. State explicitly: "I won't store this — it's used only for this run."
- Pass the values via --env KEY=VALUE for this invocation. Do not write them to disk, do not echo them in logs, do not include them in error reports.
- Do not create a .env file. Do not call any keychain command.

If a user wants persistent creds, they manage that themselves outside the skill — for example by exporting in their shell rc file or using direnv with a gitignored .envrc. The skill stays out of that boundary entirely.
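The credential rules above might be sketched as follows. All helper names are hypothetical, and the sketch treats every ${NAME} reference with a plain identifier as a variable to resolve; deciding which of those are actually secret is left to the caller.

```python
import os
import re

# Matches ${NAME} where NAME is a plain identifier (dotted expressions are skipped).
ENV_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def referenced_env_vars(flow_text: str) -> list:
    """Names referenced as ${NAME} in a flow file, de-duplicated, in first-seen order."""
    return list(dict.fromkeys(ENV_REF.findall(flow_text)))


def split_by_availability(names, environ=os.environ):
    """Separate names already set in the environment from those to ask the user for."""
    present = [name for name in names if name in environ]
    missing = [name for name in names if name not in environ]
    return present, missing


def env_args(values: dict) -> list:
    """Build --env KEY=VALUE arguments for a single invocation; nothing is written to disk."""
    args = []
    for key, value in values.items():
        args += ["--env", f"{key}={value}"]
    return args


def redacted(args: list) -> list:
    """Copy of the argument list with every value masked, safe to show in logs."""
    return [arg.split("=", 1)[0] + "=***" if "=" in arg else arg for arg in args]
```

Only the redacted form should ever appear in logs or error reports; the real argument list is handed straight to the run and then discarded.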
Run ${CLAUDE_PLUGIN_ROOT}/scripts/run-flows.sh --flows <path> --platform <p> [--env KEY=VAL ...].
The flows path is typically <project>/.maestro/ — a directory of .yaml files Maestro will run in alphabetical order. The script:
- Targets the matching device (passes --device when both are running)
- Writes artifacts to <project>/.maestro/artifacts/<run-id>/

Summarise pass/fail per flow per platform. On failure, diagnose in tiers — cheapest first. Do not jump straight to screenshots; they are the most token-expensive tool and rarely needed first.
Tier 0 — Maestro's own output (read every time):
- artifacts/<run-id>/<platform>/run.log — Maestro names the failed step and the selector it couldn't match
- report.xml

Most failures (selector typos, missing waits, behaviour changes) resolve here. Stop if Tier 0 is enough.
Tier 1 — live UI hierarchy (on demand, only if Tier 0 isn't enough):
- Run maestro hierarchy inline to inspect the current screen state — text, ids, resource-ids, enabled state — without re-running the flow.
- For iOS: maestro --device <udid> hierarchy. For Android: maestro --device <serial> hierarchy. Pipe through jq to keep only the fields you need (e.g. jq '.. | objects | {text, "resource-id", "accessibility-label", enabled} | select(.text or ."resource-id" or ."accessibility-label")') — raw hierarchy on a busy RN screen can exceed 5k tokens.

Tier 2 — screenshot at failure point (only if Tier 0 + Tier 1 don't explain it):
- Load artifacts/<run-id>/<platform>/screenshot-*.png only when the question is visual: layout overlap, unexpected dialog, dark/empty render, animation state. Each image costs ~1.5k tokens — load deliberately.

Tier 3 — video (suggest to the user, do not load yourself):
- Point the user to artifacts/<run-id>/<platform>/*.mp4. Don't try to consume it.

After diagnosis, suggest the likely cause (element not found → selector wrong or screen not loaded; timeout → app slow / wrong screen; assertion failed → behaviour changed) and the specific fix.
Do NOT just say "tests failed, here's the log." Diagnose.
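For the Tier 0 pass, a per-flow summary can be pulled out of report.xml and paired with the cause mapping described above. This sketch assumes report.xml is JUnit-style XML with testcase elements that carry failure or error children; the sample document and the failure wording are illustrative, not captured Maestro output.

```python
import xml.etree.ElementTree as ET

# Cause mapping taken from the diagnosis guidance; matching is by substring.
LIKELY_CAUSES = [
    ("element not found", "selector wrong or screen not loaded"),
    ("timeout", "app slow or wrong screen"),
    ("assertion", "behaviour changed"),
]


def summarise_junit(xml_text: str) -> dict:
    """Map each testcase name to 'pass' or 'fail'."""
    root = ET.fromstring(xml_text)
    results = {}
    for case in root.iter("testcase"):
        failed = case.find("failure") is not None or case.find("error") is not None
        results[case.get("name", "<unnamed>")] = "fail" if failed else "pass"
    return results


def likely_cause(message: str) -> str:
    """Suggest a starting hypothesis for a failure message."""
    lowered = message.lower()
    for needle, cause in LIKELY_CAUSES:
        if needle in lowered:
            return cause
    return "unclassified: read run.log before guessing"


SAMPLE_REPORT = """<testsuites><testsuite name="smoke" tests="2" failures="1">
<testcase name="app-launch"/>
<testcase name="login"><failure>Element not found: id=login-button</failure></testcase>
</testsuite></testsuites>"""
```

The mapping only proposes where to look first; the specific fix still has to come from the named step and selector in run.log.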
This tiering applies to debugging failures. When authoring a new flow, use whatever you need (including screenshots) to get the flow correct first — optimise later.
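When jq is not available, the Tier 1 filter can be approximated in Python. This is a rough equivalent rather than an exact port: jq treats an empty string as truthy, while this sketch also drops nodes whose identifying fields are empty. The sample hierarchy shape (attributes plus children) is an assumption about the JSON layout, which the walker does not depend on because it visits every object.

```python
KEEP = ("text", "resource-id", "accessibility-label", "enabled")
IDENTIFYING = ("text", "resource-id", "accessibility-label")


def slim_hierarchy(node):
    """Yield a trimmed dict for every JSON object that has a non-empty identifying field."""
    if isinstance(node, dict):
        if any(node.get(key) for key in IDENTIFYING):
            yield {key: node.get(key) for key in KEEP}
        for value in node.values():
            yield from slim_hierarchy(value)
    elif isinstance(node, list):
        for item in node:
            yield from slim_hierarchy(item)


sample_hierarchy = {
    "attributes": {"text": "Log in", "resource-id": "login-button", "enabled": "true"},
    "children": [
        {"attributes": {"text": "", "resource-id": ""}, "children": []},
    ],
}
```

Here list(slim_hierarchy(sample_hierarchy)) keeps only the Log in node, which is the kind of reduction that keeps a busy screen well under the token cost of the raw dump.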
The skill ships four slash commands that cover the flow lifecycle. Use these as the entry points; do not reinvent the loop ad-hoc.
| Command | When to use | What it does |
|---|---|---|
| /initflow | Project has no .maestro/ directory yet | One-time bootstrap — discovers the project, detects auth, scaffolds .maestro/{config.json, README.md, app-launch.yaml, login.yaml?} with the project's actual appId substituted in |
| /buildsuite | After /initflow, when you want a real suite of flows fast | Guided tour — walks the running app once with the user, builds a shared selector + screen plan in .tour-plan.json, then loops over each planned flow with a per-flow checkpoint to compose, run, and commit. Reuses Phase D-F of /authorflow and the Tier 0/1/2 debug ladder for failed runs |
| /authorflow [flow-name] | After init, adding a new flow | Phased loop — Discover → Interview → Walk-the-screens (one screenshot + maestro hierarchy per step) → Compose → Run once → Commit + update .maestro/README.md |
| /stabiliseflow <flow> [N] | Before relying on a flow as a release-gate smoke | Runs the flow N times consecutively (default 3), reports flake rate, diagnoses non-deterministic failures |
The full process — phase definitions, auth-detection grep patterns, selector-priority order, screenshot/hierarchy capture commands, README template — lives in references/authoring-flows.md. Read it before authoring, do not improvise.
For /buildsuite specifically, the depth doc is references/building-suites.md — five phases (Discover & confirm → Guided tour → Plan materialisation → Authoring loop → Index & report), the .tour-plan.json schema, the coverage-checklist UX, and the edge cases. Read it before invoking the command. /buildsuite shares project-discovery logic with /initflow (extracted under "Project discovery (shared)" in authoring-flows.md) and reuses Phase D-F conventions from /authorflow for the per-flow deep dives.
/authorflow ships with a one-run stability bar — if a freshly-authored flow passes once, it ships. Multi-run stability is a separate explicit step (/stabiliseflow). Rationale: one-run is fast for the inner-loop case; users who want release-gate confidence opt into the multi-run bar by running /stabiliseflow. Don't impose multi-run requirements during authoring.
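What a multi-run report might compute is sketched below. The text does not define how /stabiliseflow calculates flake rate, so the definition here (failed runs divided by total runs) and the verdict labels are assumptions chosen for illustration.

```python
def flake_report(outcomes: list) -> dict:
    """Summarise N consecutive runs of one flow; each outcome is True for a pass."""
    runs = len(outcomes)
    if runs == 0:
        raise ValueError("at least one run is required")
    failures = outcomes.count(False)
    if failures == 0:
        verdict = "stable"
    elif failures == runs:
        verdict = "broken"  # fails every time: deterministic, not flaky
    else:
        verdict = "flaky"   # mixed results: a non-deterministic failure to diagnose
    return {"runs": runs, "failures": failures, "flake_rate": failures / runs, "verdict": verdict}
```

Separating "broken" from "flaky" matters for diagnosis: a flow that fails every time points to a selector or behaviour change, while mixed results point to timing or state leakage between runs.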
Phase C of /authorflow captures one screenshot per step into <project>/.maestro/authoring-evidence/<flow>/. These are gitignored but persisted on disk — useful when a flow breaks months later and someone wonders what the screen used to look like. Don't delete them at the end of authoring.
/buildsuite writes its working plan to <project>/.maestro/.tour-plan.json between Phase 3 and Phase 5. The file is the persistence boundary for tour-derived data — once written, the user can quit and resume in a new session via /buildsuite (which detects an unfinished plan and offers to resume). On successful Phase 5 completion the plan is moved to .tour-plan.archive.json so a future invocation starts clean. Both files are gitignored. Schema in references/building-suites.md → "Plan schema".
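The resume-and-archive behaviour reduces to two file operations, sketched here against a temporary directory. The function names are hypothetical and the plan contents are a placeholder; the real schema lives in references/building-suites.md.

```python
import tempfile
from pathlib import Path

PLAN = ".tour-plan.json"
ARCHIVE = ".tour-plan.archive.json"


def has_unfinished_plan(maestro_dir: Path) -> bool:
    """A plan file that is still present means an earlier run did not finish Phase 5."""
    return (maestro_dir / PLAN).is_file()


def archive_plan(maestro_dir: Path) -> Path:
    """Move the finished plan aside so the next invocation starts clean."""
    archived = maestro_dir / ARCHIVE
    (maestro_dir / PLAN).replace(archived)  # overwrites any older archive
    return archived


# Demonstration in a throwaway directory standing in for <project>/.maestro/.
with tempfile.TemporaryDirectory() as tmp:
    maestro_dir = Path(tmp)
    (maestro_dir / PLAN).write_text("{}")  # placeholder plan contents
    resumable_before = has_unfinished_plan(maestro_dir)
    archived = archive_plan(maestro_dir)
    resumable_after = has_unfinished_plan(maestro_dir)
    archive_exists = archived.is_file()
```

Using the plan file's presence as the only "unfinished" signal keeps resume detection independent of the plan's internal schema.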
Across all four commands, when picking selectors:
- testID (RN: see references/writing-flows.md for the accessible={false} container pattern when testIDs are missing from maestro hierarchy)
- accessibilityLabel
- (… accessibilityText)

For Maestro YAML syntax (commands, env vars, runFlow, retry, conditional logic), see references/writing-flows.md.
Two config files live under <project>/.maestro/, with overlapping names but different owners:
- config.json — read by this skill. Holds skill-level defaults (bundleId, dev-build paths, preferred device, Expo Go URL). Schema below.
- config.yaml — read by Maestro itself when you invoke maestro test. Workspace-level controls: flow discovery, executionOrder, tag filters, onFlowStart/onFlowComplete hooks. Template at references/flow-examples/config.yaml. /initflow scaffolds it; the skill does not otherwise touch it at runtime.

If the project has <project>/.maestro/config.json, read it for defaults:
{
  "ios": {
    "bundleId": "com.example.app",
    "devBuildPath": "ios/build/Build/Products/Debug-iphonesimulator/Example.app",
    "preferredDevice": "iPhone 17 Pro"
  },
  "android": {
    "package": "com.example.app",
    "devBuildPath": "android/app/build/outputs/apk/debug/app-debug.apk",
    "preferredAvd": "Pixel_8_Pro"
  },
  "expoGo": {
    "devServerUrl": "exp://192.168.1.10:8081"
  },
  "flowsDir": ".maestro"
}
If the file is missing, prompt for the values you need and offer to write the config so the user doesn't have to re-answer next time.
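Deciding which values still need a prompt can be done with a small dotted-key lookup over the parsed config. The helper names are hypothetical, and which keys a given run actually requires is left to the caller; the sketch only reports which of the requested keys are absent.

```python
import json


def lookup(config: dict, dotted_key: str):
    """Return config['a']['b'] for 'a.b', or None if any level is missing."""
    value = config
    for part in dotted_key.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value


def values_to_prompt(config: dict, wanted: list) -> list:
    """Keys from `wanted` that the config does not provide."""
    return [key for key in wanted if lookup(config, key) is None]


partial_config = json.loads('{"ios": {"bundleId": "com.example.app"}, "flowsDir": ".maestro"}')
```

For an iOS dev-build run, values_to_prompt(partial_config, ["ios.bundleId", "ios.devBuildPath"]) returns only ios.devBuildPath, so the user is asked one question instead of re-entering everything.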
Common failure modes and fixes:
- Run xcrun simctl shutdown all && xcrun simctl erase all to reset state
- Run maestro --device <udid-or-serial> hierarchy to inspect the live accessibility tree; prefer text or accessibility-label selectors
- The .app was built for a real device, not the simulator. Look for a Debug-iphonesimulator build directory.
- Restart bunx expo start and use the fresh URL.
- Run adb kill-server && adb start-server

For a fuller list see references/troubleshooting.md.
- references/writing-flows.md — Maestro YAML syntax cheat sheet
- references/ios-setup.md — Xcode + sim setup, troubleshooting
- references/android-setup.md — Android SDK + AVD setup, troubleshooting
- references/troubleshooting.md — common errors and fixes
- references/flow-examples/app-launch.yaml — verify app launches and reaches home
- references/flow-examples/login.yaml — log in using env-supplied creds (never stored)
- references/flow-examples/view-list.yaml — navigate into a list and verify items render
npx claudepluginhub connectivitychris/harbormaster --plugin harbormaster