From rama-skill
Guides multi-phase Rama module development with strict phase isolation, artifact requirements, and production-first design (PStates, topologies, cooperative multitasking).
Slash command: `/rama-skill:rama`
Read `references/phases.md` for the overview of the multi-phase process. Each phase has a dedicated doc under `references/phase-N-*.md` with the full instructions for that phase, including which artifact template to copy and what to fill in.
- `references/aggregators.md`
- `references/app-design.md`
- `references/artifact-impl-validation.md`
- `references/artifact-implicit-spec.md`
- `references/artifact-plan-validation.md`
- `references/artifact-plan.md`
- `references/artifact-test-validation.md`
- `references/batch.md`
- `references/core-concepts.md`
- `references/dataflow.md`
- `references/depot-design.md`
- `references/depot-migration.md`
- `references/depot-reference.md`
- `references/external-depots.md`
- `references/foreign-client.md`
- `references/formal-model.md`
- `references/microbatch.md`
- `references/mirrors.md`
- `references/operate.md`
- `references/paths.md`
One phase per agent session. Each phase is a fresh context, reads exactly one phase doc, produces exactly one artifact, and stops. Validation phases emit a verdict consumed by the calling system; on FAIL the prior phase is re-invoked with the failure artifact as input. This isolation is intentional: empirically, agents that walk multiple phases in one session anchor on early design decisions and fail to revise them when later thinking surfaces problems.
Every phase produces a required artifact. Skipping artifacts leads to wrong PState schemas, wasted disk I/O, and topologies that need to be rewritten.
If context compacts, re-read the current phase doc and the artifact you're filling in — both contain the full instructions on disk.
Always design for production. InProcessCluster (IPC) is a test harness; production has node failures, process crashes, and retries. Never downgrade a correctness guarantee because "tests won't hit that case." Design the module to be correct under all conditions, then test it as thoroughly as possible with IPC.
Every task is single-threaded. All topology processing, query invocations, and foreign selects on a task execute sequentially with exclusive access to all PState partitions on that task. Within a single event, reads and writes to any number of PStates on the same task are atomic — no locking or concurrency coordination needed.
Because tasks are single-threaded, topology code must use cooperative multitasking. Long-running synchronous work on a task blocks everything else on that task, including query topology invocations, which creates latency spikes for reads. Risk factors: large `local-select>` calls on subindexed structures that iterate many entries, or `loop<-` with many iterations of PState reads/writes. Use `{:allow-yield? true}` on large reads of local PStates, such as iterating over a subindexed structure (e.g. `(local-select> [(keypath *k) MAP-VALS] $$p {:allow-yield? true} :> *v)`), and `(yield-if-overtime)` in long loops to periodically yield the task thread. See the "Yielding" section in `references/dataflow.md` for details on both.
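The effect of yielding on a single-threaded task can be illustrated with a toy scheduler. The sketch below is plain Python, not Rama code: `run_task`, `long_scan`, `query`, and `completion_order` are hypothetical names, and the scan yields after a fixed number of entries purely for illustration (the actual yield behavior is covered in `references/dataflow.md`).

```python
from collections import deque

def long_scan(n_entries, yield_every, log):
    """Hypothetical long-running event that iterates many entries."""
    for i in range(n_entries):
        # Per-entry work would happen here.
        done = i + 1
        if yield_every and done % yield_every == 0 and done < n_entries:
            yield  # hand the task thread back so other events can run
    log.append("scan-done")

def query(log):
    """Hypothetical short event waiting on the same task."""
    log.append("query-done")
    yield from ()  # no yield points; this only makes the function a generator

def run_task(events):
    """Single thread, round-robin: run each event until it yields or finishes."""
    queue = deque(events)
    while queue:
        event = queue.popleft()
        try:
            next(event)          # run until the next yield point
            queue.append(event)  # unfinished: requeue behind waiting events
        except StopIteration:
            pass                 # event completed

def completion_order(yield_every):
    log = []
    run_task([long_scan(1000, yield_every, log), query(log)])
    return log

print(completion_order(0))    # scan never yields, so the query waits for it
print(completion_order(100))  # scan yields, so the query finishes first
```

Without yield points the short query is stuck behind the whole scan; with them it completes after the first slice of the scan.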
Worker restart does NOT replay depot history. PStates are durable storage in their own right — replicated and persisted to RocksDB at write time, NOT a materialized view recomputed from depots — and topologies persist their consumed offset / microbatch ID and resume from where they left off on restart of worker processes.
Never trade I/O efficiency for code simplicity. Rama modules are production backends serving millions of queries. Unnecessary I/O is not just a latency concern for one query — it is wasted resources that will lower the throughput the module can handle. Code complexity is a one-time cost; unnecessary I/O is a per-query cost multiplied by every invocation in production. When there is a conflict between simpler code and fewer disk reads or network roundtrips, always choose fewer I/O operations.
Never trade fault tolerance for code simplicity. Duplicate side effects from retried processing are bugs, not acceptable tradeoffs. Every write must produce correct results even if processing retries. Do NOT dismiss retry-safety concerns as "acceptable for this use case."
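A toy model makes the retry concern concrete. This is plain Python rather than Rama code, and every name in it (`apply_naive`, `apply_idempotent`, `process_with_retry`, the event fields) is hypothetical. It assumes a retry re-applies the same event after its first application already took effect.

```python
def apply_naive(state, event):
    """Blind increment: every application changes the result."""
    state["likes"] = state.get("likes", 0) + 1

def apply_idempotent(state, event):
    """Set-based write: re-applying the same event is a no-op."""
    likers = state.setdefault("likers", set())
    likers.add(event["user_id"])
    state["likes"] = len(likers)

def process_with_retry(apply, events, retried_ids):
    """Apply each event once, and a second time if its id is in retried_ids."""
    state = {}
    for event in events:
        apply(state, event)
        if event["id"] in retried_ids:
            apply(state, event)  # simulated retry of already-applied work
    return state

events = [
    {"id": 1, "user_id": "a"},
    {"id": 2, "user_id": "b"},
]

naive = process_with_retry(apply_naive, events, retried_ids={2})
safe = process_with_retry(apply_idempotent, events, retried_ids={2})
print(naive["likes"])  # 3: the retried event was counted twice
print(safe["likes"])   # 2: correct despite the retry
```

The idempotent version stores more data (the set of likers), which is the kind of extra write-path cost this document treats as acceptable in exchange for correctness.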
Never delete data unless the spec explicitly requires it. Read operations can be called at any time, including after processing is complete. Deleting data that read operations depend on breaks the contract, even if the spec is silent on deletion.
Performance costs to design around:
- `foreign-select`/`foreign-select-one`/`foreign-invoke-query` from client code is a full network roundtrip. Partitioner calls in topologies also add network latency. Minimize roundtrips by using query topologies instead of multiple client-side foreign selects.
- Write-path work (updating an extra precomputed level or denormalized view) happens once per event and is amortized across all future queries. A small increase in write-path work that dramatically reduces read-path seeks is almost always worth it. When evaluating a design, estimate worst-case cost as: (number of seeks × ~0.5ms) + (number of iterated entries × ~5µs).
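The worst-case cost estimate above is simple arithmetic, sketched here in plain Python. The function name and the two example designs are hypothetical; the constants are the approximate figures given above (about 0.5 ms per seek, about 5 µs per iterated entry), kept in integer microseconds to avoid floating-point noise.

```python
SEEK_US = 500   # ~0.5 ms per seek, expressed in microseconds
ENTRY_US = 5    # ~5 µs per iterated entry

def worst_case_us(seeks, iterated_entries):
    """Estimated worst-case cost of one read, in microseconds."""
    return seeks * SEEK_US + iterated_entries * ENTRY_US

# Hypothetical design A: one seek, then iterate 10,000 entries.
design_a = worst_case_us(seeks=1, iterated_entries=10_000)
# Hypothetical design B: three seeks into precomputed data, iterate 100 entries.
design_b = worst_case_us(seeks=3, iterated_entries=100)

print(design_a / 1000, "ms")  # 50.5 ms
print(design_b / 1000, "ms")  # 2.0 ms
```

Under these assumed numbers, the design with more seeks is still far cheaper because it iterates so many fewer entries, which is why both terms need to be estimated.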
The full Phase 0-4 process applies to module implementations — tasks that involve designing depots, PStates, and topologies together.
For simpler tasks (e.g., implementing a single deframafn, fixing a bug, or explaining existing code), use judgment. Still load and consult references relevant to the task, but the full planning artifact may not be needed.
Every Rama implementation should optimize for:
Balanced computation across tasks. Work should distribute as evenly as possible across all N tasks. Avoid funneling writes through |global (task 0) for high-throughput data, unless it's filtered down first with two-phase aggregation in batch blocks. Partition depots by the key that will be used for PState lookups.
Balanced storage across tasks. PState data should spread evenly. Choose depot partitioners that match PState key access patterns so data colocates naturally. Global PStates (:global? true) are only for singletons (config, counters, ID generators).
Colocate related data. Design PState partitioning around the application's core queries, not just the top-level key. For example, in a social network: $$accounts, $$account->posts, and $$post->likes might all partition by account ID (not post ID for likes), because the core query is fetching a timeline — all information about a user's posts should be colocated on the same task. select> is convenient when repartitioning is needed, but the goal is to minimize how often that's necessary. There are always tradeoffs — optimizing colocation for one query pattern may require repartitioning for another.
Subindexing for large collections. PState collections that grow beyond ~100 elements should use {:subindex? true}. This enables O(1) lookups, efficient range queries, and avoids loading entire collections into memory. Don't subindex small collections (< 50 elements) — overhead without benefit.
Correct partition alignment. Every local-select>/local-transform> must be preceded by a partitioner that routes to the correct task for the key being accessed, if it's not already on that task. Misalignment is silent — it compiles, runs, and produces wrong results.
Appropriate topology choice. Stream for low-latency + ack coordination. Microbatch for everything else (exactly-once, cross-partition atomicity, higher throughput).
Minimize storage I/O. Each path navigation through a top-level key or subindexed structure is a RocksDB read. Choose the approach that minimizes total reads:
- Use `(term f)` for read+write in one operation (e.g., `(term inc)` to increment without reading separately).
- If you already have the value (e.g. from a prior `local-select>`): compute the new value and use `(termval *new-val)`. This does NO read, just a write. Do NOT use `(term f)` to re-read a value you already have.
- `keypath` + `termval` directly skips the read entirely; `keypath` + further navigation does read. See `references/paths.md` "No-Read Optimizations" for details and other similar optimizations.

Notation:

- `*var` - value binding
- `$$pstate` - PState reference
- `%frag` - fragment var (microbatch source binding or anonymous operations)
- `**unground` - outer join variable (nullable, batch mode)
- `:>` - output binding

Read `references/core-concepts.md` for the full reference on: dataflow language (`:>` binding, `<<if`, `loop<-`, emit semantics), paths (navigators, `local-select>`, `local-transform>`), foreign context (depot appends, PState queries, ack levels), mirrors, ACID/replication, serialization, and task model (partition alignment, tasks/threads/workers).
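The partition alignment point above (misalignment compiles, runs, and silently returns wrong results) can be shown with a toy model. This is plain Python, not Rama code: the CRC32-based `task_for` partitioner, the `partitions` list of dicts, and the other names are all hypothetical stand-ins.

```python
import zlib

NUM_TASKS = 4

# One dict per task stands in for that task's partition of a PState.
partitions = [dict() for _ in range(NUM_TASKS)]

def task_for(key):
    """Hypothetical deterministic hash partitioner."""
    return zlib.crc32(key.encode("utf-8")) % NUM_TASKS

def write(key, value):
    """Route to the owning task, then write locally."""
    partitions[task_for(key)][key] = value

def local_read(current_task, key):
    """Read only the partition on current_task, with no routing."""
    return partitions[current_task].get(key)

def aligned_read(key):
    """Route to the owning task first, then read locally."""
    return local_read(task_for(key), key)

write("alice", {"followers": 10})

wrong_task = (task_for("alice") + 1) % NUM_TASKS
print(aligned_read("alice"))             # {'followers': 10}
print(local_read(wrong_task, "alice"))   # None, with no error raised
```

The misaligned read does not fail; it just sees an empty partition, which is why this class of bug has to be caught at design time rather than by waiting for an exception.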
- `references/mirrors.md` - mirror declarations, cross-module depot/PState/query access, partition routing, ack semantics
- `references/pstate-migration.md` - PState schema migration patterns, idempotency, subindex conversion
- `references/depot-migration.md` - depot record migration, DEPOT-TOMBSTONE, migration IDs
- `references/operate.md` - cluster setup, CLI commands, module deploy/update/scale, monitoring, upgrades, module management functions, dynamic options
- `references/formal-model.md` - typed lambda calculus model: types, effects, judgments, ownership, visibility, transaction scope, invariants
- `references/syntax.md` - EBNF grammar for declarations, built-in ops with emit cardinality, custom operations, partitioners, state interaction, source/ingress

Install with: `npx claudepluginhub redplanetlabs/rama-ai-learn --plugin rama-skill`