From agency-os
Notion-as-source-of-truth dispatch board. Status flow Suggestion→Discussion→To-Do→In Progress→Done with subtasks, recurring tasks, deps, batch agent execution. Trigger: 'suggest X', 'discuss X', 'approve', 'start X', 'run todo', 'mark X done', 'kill X', 'next', 'capture this to Notion', 'agency-os status'.
How this skill is triggered — by the user, by Claude, or both
Slash command
/agency-os:agency-osThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Notion-as-source-of-truth dispatch board. One Tasks database, one Hub page, one page per Corpus, one page each for General Guidance and Resources. The skill mutates Notion via the Notion MCP (`mcp__*__notion-*` tools); only `references/notion-pointers.json` is committed to git.
references/bootstrapping.mdreferences/commands/add-subtask.mdreferences/commands/approve.mdreferences/commands/discuss.mdreferences/commands/done.mdreferences/commands/init.mdreferences/commands/kill.mdreferences/commands/list.mdreferences/commands/log.mdreferences/commands/move.mdreferences/commands/next.mdreferences/commands/refresh.mdreferences/commands/run.mdreferences/commands/scaffold.mdreferences/commands/show.mdreferences/commands/start.mdreferences/commands/status.mdreferences/commands/suggest.mdreferences/commands/tree.mdreferences/commands/update.mdNotion-as-source-of-truth dispatch board. One Tasks database, one Hub page, one page per Corpus, one page each for General Guidance and Resources. The skill mutates Notion via the Notion MCP (mcp__*__notion-* tools); only references/notion-pointers.json is committed to git.
Skill name decision: the skill is named agency-os (matching the repo). All commands are /agency-os <cmd>. This is the single plugin entry point; there is no agency-os/notion sub-namespace. If you embed this plugin alongside others, prefix commands with agency-os to avoid collisions.
This is the canonical positioning brief. Apply it to any user-facing copy you draft for agency-os — README sections, blog posts, X threads, LinkedIn posts, Reddit submissions, plugin-directory entries, awesome-list PR bodies, social cards, launch surfaces, anything operators or audiences will read.
Lead with these, in this order:
Exec=Agent is opt-in per row.Demote — mention later in the piece, never in the hook, headline, or opening tweet:
The product's first value is structuring AI-assisted work and keeping the human in control of what ships. Cost efficiency is a happy consequence of doing that well, not the pitch.
This section is mechanics-neutral: how the Sonnet/Haiku dispatch works is documented below under "Execution model" — that's how, not why. Don't conflate the two when writing copy.
This SKILL.md is the authoritative spec and is harness-agnostic. The slash-command interface (/agency-os ...), the status flow, the schema, the workspace layout, and every command's behavior are the same everywhere.
Per-harness wrappers only differ in two things:
/agency-os ... (or the natural-language equivalent) into chat. Either way, the parser is the same.See docs/harnesses/ for per-harness setup. If you're reading this in a non-Claude-Code harness, treat the subagent dispatch instructions as optional: execute commands inline instead.
Every /agency-os <command> invocation runs on Sonnet at medium reasoning effort via a subagent. Sonnet at medium is the right balance for ID resolution, dedup, and brief assembly. The orchestrator stays free for conversation.
Agent({
description: "Run /agency-os <command>",
subagent_type: "general-purpose",
model: "sonnet",
prompt: "Run /agency-os <command> <args>.\n\nRead .claude/skills/agency-os/SKILL.md AND .claude/skills/agency-os/references/commands/<command>.md for the specific command spec. Execute end-to-end: sync preflight (run `python .claude/skills/agency-os/scripts/sync-tasks.py` via Bash — this incrementally refreshes the local mirror at `state/tasks.json`; do NOT call notion-fetch on the full data source as a default read path), resolve IDs against the local mirror, mutate Notion via the Notion MCP, write-through patch the mirror with the MCP response in the same step, and return the exact output format the command file specifies. All task/result links must be formatted as CommonMark markdown links — `[title](url)` — never HTML `<a>` tags and never a bare URL. If anything fails (sync, MCP, ID resolution), stop and report — do not guess.\n\nSTRUCTURE CHECK (for `suggest` without `--parent`, and for any batch suggest): Before creating any row, read `.claude/skills/agency-os/state/task-tree.json` for the current active task hierarchy. If the file is absent, call `/agency-os tree` first. Scan In-Progress and To-Do rows for an obvious umbrella parent. If a candidate exists with high confidence, surface it and require the orchestrator to confirm before proceeding. Never silently create a top-level row when an obvious parent exists. Full rules: `references/structuring-work.md`.\n\nALWAYS PRODUCE OUTPUT. On success: the skill's standard output. On failure: one paragraph naming what failed and what state Notion was left in. Returning nothing is not an option."
})
Orchestrator MUST relay subagent output verbatim. If output is empty: say subagent returned no output; check Notion directly. Never re-run, never absorb silently.
Natural-language driving stays on the orchestrator. Parsing "let's discuss the X task" into /agency-os discuss <id>, asking clarifying questions during a discussion, deciding log vs add-subtask — that runs on the orchestrator so thread context survives. Only discrete mutations dispatch to Sonnet.
Rule of thumb: touches Notion via MCP → Sonnet subagent. Deciding what to touch → orchestrator.
On Cursor, Cline, Continue, and generic MCP harnesses, the skill can't spawn subagents. Instead:
/agency-os init to store your preferred models in .claude/skills/agency-os/config.json./agency-os run): read config.json and use the stored models' constraints when evaluating task complexity.The config file has this shape:
{
"harness": "cursor",
"models": {
"haiku": "claude-haiku-4-5-20251001",
"sonnet": "claude-sonnet-4-6",
"opus": "claude-opus-4-7"
}
}
The /agency-os run command still uses the same picker heuristic (Haiku for mechanical work, Sonnet default, Opus for strategic) — but on non-Claude harnesses, you need to ensure your harness has an API or SDK integration with those models. If your harness only supports one model, set all three to that model.
Three transitions require explicit user authorization every time. The skill must never make these moves on its own initiative.
Gate 1 — Promoting to To-Do. Any transition into To-Do (via approve, move --to todo, or direct Status write) is a scheduling commitment. MUST NOT promote unless the user named that row or an explicit batch the row clearly belongs to. "Cascade-approve discussion descendants because the user wants the parent executed" is forbidden — that's auto-promotion. Recurring-loop auto-flip on done is the only exception (the operator authorized it when marking Type=recurring).
Gate 2 — Marking Exec=Agent. Operator-owned. Skill MUST NOT set Exec=Agent on any row unless the user explicitly named that row and asked for it. Bulk-flipping across a subtree because "these look agent-runnable" is forbidden. Exec=Human and Exec=none are safe defaults; only Exec=Agent requires the gate.
Gate 3 — Firing the queue. /agency-os run is dry-run by default; --go is the user's explicit dispatch authorization. Orchestrator never adds --go to a run call the user did not type.
Worked failure mode. User says "execute all todo tasks under OS-XXX" → dispatch rows that are already Status=To-Do AND Exec=Agent under OS-XXX. Does NOT authorize flipping Discussion descendants to To-Do, flipping leaves to Exec=Agent, or any prep step. If literal scope yields zero runnable rows, say so and ask — do not invent prep.
When in doubt — ask. List candidates, wait for selection, then act. Never guess the set.
.claude/skills/, every doc under docs/, the notion-pointers.json binding, and references/general-guidance.md (the canonical General Guidance text — Notion's page is a one-way mirror).references/general-guidance.md → Notion General Guidance. Edit locally, push to Notion. Never reverse-sync.Briefs pull task state from Notion (via the local mirror) and stable spec from local files. The brief contains pointers to local files, not copies.
# setup
/agency-os scaffold [--parent=<page-id-or-url>] [--corpora="<n1>,<n2>,..."]
/agency-os init [--harness ...] [--haiku=...] [--sonnet=...] [--opus=...]
/agency-os sync
/agency-os tree [--depth N] [--corpus=<s>] [--status=<s>]
# suggestions
/agency-os suggest "<title>" [--corpus=<s>] [--type one-time|recurring]
[--cadence ...] [--notes "..."] [--effort S|M|L|XL]
[--parent=<id>] [--force-top-level]
# clarification & subtasks
/agency-os discuss <id>
/agency-os log <id> "<entry>"
/agency-os add-subtask <parent-id> "<title>" [--effort=<e>] [--notes "..."] [--deps=...]
/agency-os approve <id>
# execution
/agency-os start <id>
/agency-os refresh
/agency-os run [--go]
/agency-os done <id> [--result-link <url>] [--note "..."]
/agency-os kill <id> [--reason "..."]
# read
/agency-os next [N] [--corpus=<s>]
/agency-os status
/agency-os list <suggestion|discussion|todo|inprogress|done|killed|recurring|all> [--corpus=<s>]
/agency-os show <id> [--section description|discussion|donelog|all] [--entry <date>]
# escape hatches
/agency-os update <id> [--title="..."] [--notes="..."] [--priority=1|2|3|4]
[--effort=...] [--type=...] [--cadence=...] [--corpus=<s>]
[--deps=<id1>,<id2>,...|none] [--parent=<id>|none]
/agency-os move <id> --to <status>
/agency-os add-corpus "<name>" [--goal "..."]
Natural language is also a trigger — see the "Natural-language driving" table below.
Command specs live in references/commands/<name>.md. The subagent loads SKILL.md + the one command file for the dispatched command.
Suggestion --discuss--> Discussion --approve--> To-Do --start--> In Progress --done--> Done
^
any --kill--> Killed (terminal) |
Recurring tasks: done logs + loops to To-Do ---+
| Status | Meaning | Set by |
|---|---|---|
Suggestion | Idea in the inbox; not yet discussed | suggest, manual |
Discussion | Under clarification | discuss |
To-Do | Approved; scheduled | approve, recurring loop |
In Progress | Agent actively working | start |
Done | Closed (one-time) | done |
Killed | Intentionally dropped | kill |
next filters Status == "To-Do" sorted by Priority asc then Created asc. run requires Status == "To-Do" AND Exec == "Agent". suggest refuses dupes via Title-Jaccard ≥ 0.8 against any non-terminal status. If start crashes, manual recovery: /agency-os move <id> --to todo.
Every command reads from a local JSON mirror at state/tasks.json, not from live Notion. The mirror is kept fresh by an incremental delta sync that only fetches pages whose last_edited_time advanced since the last sync — typically a handful of rows per command, often zero. Live notion-fetch of the full data source is the escape hatch, not the default. Token cost per command drops by roughly an order of magnitude after the first bootstrap.
Preflight (every command). Run:
python .claude/skills/agency-os/scripts/sync-tasks.py
via the Bash tool. The script reads NOTION_KEY from .env, pulls the data source ID from notion-pointers.json, queries Notion with a last_edited_time filter, parses each changed page's body (Description / Discussion log / Done log toggleable H2 sections), and upserts into state/tasks.json. On first run (no mirror file) it does a full bootstrap. Then resolve <id-or-substring> against state/tasks.json.
Write-through on mutations. Every command that mutates Notion (via the Notion MCP — notion-create-pages, notion-update-page, etc.) MUST patch the corresponding row in state/tasks.json in the same step, using the response payload from the MCP call. This keeps the mirror authoritative across the same session and prevents the next command from re-syncing the row we just wrote. The shape of each row is what sync-tasks.py produces (see schema below).
Escape hatches.
python scripts/sync-tasks.py --full — full rebuild (e.g. after schema changes or suspected drift).notion-fetch <data_source_id> directly — only when you need a property the mirror doesn't carry, or when verifying the mirror against live data. Do not make this the default; the whole point is to stop dragging the full DB into LLM context on every command.Tree snapshot side-effect. After every successful sync, also write state/task-tree.json — a compact nested representation of all non-terminal tasks (Suggestion / Discussion / To-Do / In Progress), structured as a tree via Parent Task. This is the free structure reference for suggest preflight and orchestrator planning. Both files are gitignored.
{
"snapshot_at": "<iso>",
"tree": [
{
"id": "OS-12", "notion_id": "<uuid>", "url": "<url>",
"title": "Top-level initiative", "status": "In Progress",
"corpus": "General", "effort": "L", "depth": 0,
"children": [
{ "id": "OS-13", "notion_id": "...", "url": "...",
"title": "Container subtask", "status": "To-Do",
"corpus": "General", "effort": "M", "depth": 1,
"children": [] }
]
}
]
}
Mirror schema (state/tasks.json).
{
"synced_at": "<iso>",
"previous_synced_at": "<iso|null>",
"tasks": [
{
"notion_id": "<uuid>", "task_id": "OS-12", "url": "<url>",
"title": "...", "status": "To-Do",
"type": "one-time", "cadence": null,
"corpus": "General", "priority": "2", "impact": "high",
"effort": "M", "exec": "Agent", "tags": [],
"parent_task_id": "<uuid|null>", "dependency_ids": ["<uuid>"],
"last_done": null, "done_at": null, "result_link": null,
"created_time": "<iso>", "last_edited_time": "<iso>",
"description": "<first 600 chars of Description section>",
"latest_discussion_entry": "<most recent ### entry, truncated>",
"last_done_log_entry": "<most recent ### entry, truncated>"
}
]
}
If sync-tasks.py fails (Notion API down, OAuth expired, network error), print sync failed: <reason> and abort the command — do not fall back to a stale mirror for mutations. Read-only commands (list, next, show, status, tree) MAY proceed against the existing mirror with a sync failed; reading stale mirror warning; mutations (suggest, discuss, approve, start, done, kill, update, move, log, add-subtask) MUST abort.
Hub (page)
+-- intro: what this board is for
+-- General Guidance -> page
+-- General Plan -> table; one row per Corpus, linked to its page
+-- Suggestions Inbox (linked DB view: Status=Suggestion, sort Created desc)
+-- In Discussion (linked DB view: Status=Discussion, sort Created desc)
+-- To-Do (Scheduled) (linked DB view: Status=To-Do, sort Priority asc, group by Corpus)
+-- Recurring (linked DB view: Type=recurring, sort Last Done asc)
+-- In Progress (linked DB view: Status=In Progress)
+-- Recently Done (linked DB view: Status=Done, sort Done At desc, limit 25)
+-- Resources -> page
Tasks (database)
+-- one row per task, of any status, including subtasks
| Property | Type | Notes |
|---|---|---|
Title | title | Imperative phrase, <=80 chars |
Status | status | Suggestion, Discussion, To-Do, In Progress, Done, Killed. Default Suggestion |
Type | select | one-time (default), recurring |
Cadence | select | daily, weekly, biweekly, monthly, quarterly, yearly. Empty for one-time |
Last Done | date | Set on done for recurring |
Corpus | select | One of the configured corpora |
Priority | select | 1–4 — urgency (1 = this week / 4 = nice to have). Default 4 |
Impact | select | low, medium, high, outsized — outcome size within corpus. Default medium |
Effort | select | S, M, L, XL. Default M |
Exec | select | none (default), Agent, Human. Operator-set gate for run |
Parent Task | relation (self) | Subtask linkage; empty for top-level |
Subtasks | rollup | Inverse of Parent Task |
Created | created_time | Automatic |
Done At | date | Set on terminal Done |
Result Link | url | Live link, post URL, PR URL |
Tags | multi_select | Cross-cutting |
Dependencies | relation (self) | Must reach Done before this row is dispatchable by run. Ignored by start, next, list |
Every task page uses toggleable H2 sections (is_toggleable: true). If the MCP can't set the flag, fall back to plain H2.
> Launch this task
> Paste in Claude Code: `/agency-os start <id>`
Description <- starts expanded
<freeform: what to do, why, acceptance criteria, links>
Subtasks <- expanded if any exist
(linked DB view: Parent Task = this)
Discussion log <- collapsed
### 2026-01-10 — initial clarification
...
Done log <- collapsed; mainly for recurring
### 2026-01-12: completed by agent — link <url>
Related <- expanded; one-line each
Corpus: [-> <corpus>](<corpus-url>)
General guidance: [-> Guidance](<guidance-url>)
Parent: [-> <parent-title>](<parent-url>)
# Corpus: <name>
## Goal
1-3 sentences on what "done" looks like.
## Local guidance
Conventions, owners, references.
## Tasks
(linked DB view: Corpus = this, group by Status)
Project-wide rules. Kept short. Links into docs/ and .claude/skills/. Links beat duplication. Seeded from references/general-guidance.md — edit the local file, push the mirror.
.claude/skills/agency-os/references/notion-pointers.json (committed):
{
"hub": { "page_id": "<uuid>", "url": "...", "title": "..." },
"tasks_database": {
"database_id": "<uuid>",
"data_source_id": "<uuid>",
"url": "...",
"task_id_prefix": "OS"
},
"guidance": { "page_id": "<uuid>", "url": "...", "title": "..." },
"resources": { "page_id": "<uuid>", "url": "...", "title": "..." },
"corpora": {
"General": { "page_id": "<uuid>", "url": "...", "title": "General" }
},
"hub_views": { "Suggestions Inbox": "view://<uuid>", "...": "..." },
"corpus_views": { "General": "view://<uuid>" },
"schema_summary": { }
}
Each command's full spec lives in its own file. The dispatch subagent loads this SKILL.md plus the one relevant file.
| Command | Spec file |
|---|---|
init | references/commands/init.md |
scaffold, add-corpus | references/commands/scaffold.md |
suggest | references/commands/suggest.md |
discuss | references/commands/discuss.md |
log | references/commands/log.md |
add-subtask | references/commands/add-subtask.md |
approve | references/commands/approve.md |
start (alias launch) | references/commands/start.md |
refresh | references/commands/refresh.md |
run | references/commands/run.md |
done | references/commands/done.md |
kill | references/commands/kill.md |
next | references/commands/next.md |
status | references/commands/status.md |
list | references/commands/list.md |
tree | references/commands/tree.md |
show | references/commands/show.md |
update | references/commands/update.md |
move | references/commands/move.md |
Shared rules: parent-vs-subtask-vs-log decisions, structure preflight protocol, "move this chat to Notion" workflow, and a worked example → references/structuring-work.md. The suggest and add-subtask specs reference this file.
When the user says these things in chat, the skill translates to commands:
| User says | Skill calls |
|---|---|
| "what's the structure" / "show me the hierarchy" / "how is this organized" / "what tasks are active" | tree |
| "add a suggestion: " / "new idea: <title>" | suggest "<title>" — structure preflight runs inside; if an obvious parent exists the subagent will surface it and wait for confirmation before creating the row |
| "create N tasks for X, Y, Z" / "batch add: ..." / "set up tasks for " | orchestrator reads state/task-tree.json (or runs tree if absent), proposes the full planned tree in chat (existing umbrella? container needed?), waits for user confirmation, THEN dispatches suggest --parent=<id> per item |
| "reorganize " / "restructure under " / "move under " / "group these tasks" | update <id> --parent=<parent-id> per row (or --parent=none to demote) |
| "let's discuss X" / "open X for discussion" | discuss <id> |
| "log: " / "note that " (during discuss) | log <id> "<thing>" |
| "add a subtask: " / "we'll also need to <thing>" | add-subtask <parent-id> "<title>" |
| "approve" / "approve it" / "ship it" / "go ahead" | approve <id> (Gate-1 — user must name the row or explicit batch; never auto-cascade beyond what the user said) |
| "start X" / "launch X" / "let's do X now" | start <id> |
| "run todo" / "run all" / "execute the queue" | run (dry-run) -> user reviews -> run --go. Gate-3 — --go is added ONLY when the user explicitly says go/fire/yes. NEVER cascade-approve or flip Exec=Agent as a "prep step" — execute only rows that already meet Status=To-Do AND Exec=Agent literally |
| "X is done [link ]" / "mark X done" | done <id> [--result-link <url>] |
| "kill X" / "drop X" / "X is no longer relevant" | kill <id> |
| "make X recurring weekly" / "X is recurring monthly" | update <id> --type=recurring --cadence=weekly |
| "what's next" / "what should I work on" | next [N] |
| "show me X" / "details on X" | show <id> |
| "what's the status" / "where are we" | status |
| "save this to Notion" / "capture this conversation" | "move this chat to Notion" workflow — see references/structuring-work.md |
The skill is the user's hands. When in doubt, the agent asks: "Should I <command equivalent> for that?" before mutating. Trivial appends (a log entry during an open discuss, an add-subtask from an explicit "we'll need to") can proceed without asking each time.
Fresh setup and migration from a prior hub: see references/bootstrapping.md.
next shows; only start flips status and loads brief.Exec=Agent without explicit authorization (Gate 2).--go to a run the user didn't write (Gate 3).kill is archival; the row remains.notion-pointers.json (which scaffold writes).suggest. Manual Notion-UI rows bypass.start per task at a time. In Progress is the lock./agency-os suggest -> discuss -> log/add-subtask (clarify) -> approve -> start -> done
^ |
|recur |
+-------+
Sync runs as preflight via scripts/sync-tasks.py, refreshing the local mirror at state/tasks.json incrementally. Notion is canonical; the mirror is the read path; the pointer file is the only repo binding. The DB grid stays clean; details fold; briefs stay bounded.
Creates, edits, and optimizes skills for Claude Code, including drafting, evaluating with test prompts, iterating on performance, and improving skill descriptions for better triggering accuracy.
npx claudepluginhub automatelab-tech/agency-os