Skill

swarm

Dispatches many independent items in parallel using table-based fan-out, merges results, and supports retry. Useful for batch-processing files or records.

automation

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/langchain-skills:swarm

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Process many independent items in parallel. `create` builds a table handle;

Supporting Files

scripts/batching.ts, scripts/executor.ts, scripts/filter.ts, scripts/index.ts, scripts/interpolate.ts, scripts/table.ts, scripts/types.ts, scripts/utils.ts

SKILL.md

338 lines · ~3k tokens

Stats

Language: TypeScript

Stars: 782

Forks: 68

Maintenance: Excellent

Last commit: Jun 8, 2026


Swarm

Process many independent items in parallel. `create` builds a table handle; `run` fans work out across rows and merges results back. One row = one unit of work; swarm handles batching automatically.

Flow

1. Create. Build a table from a source: files, a glob pattern, or pre-parsed records. One row per item. Returns a handle.
2. Run. Dispatch an instruction template across rows. Results are merged back into the table. Returns `{ completed, failed, skipped, failures }`.
3. Aggregate. Use `rows()` and plain JS to count, filter, or summarize. Do not spawn additional subagents for aggregation.
4. Retry. Re-run with `filter: { column: "<col>", exists: false }` to reprocess only failed rows.

Choosing a source

glob / filePaths — one file = one row. Use when each file is an independent unit of work. Each row gets { id, file }; the subagent reads the file itself via the {file} placeholder.

tasks — pass pre-built records directly. Use when the data lives inside a file (JSONL, CSV, JSON array). Read and parse the file first inside eval, then pass the records. One record = one row — do not group multiple items into a single row.

For small files (under ~500 lines), parse and create in one block:

const { create } = await import("@/skills/swarm");
const raw = await tools.readFile({ file_path: "/data.jsonl" });
const records = raw.trim().split("\n").map(l => JSON.parse(l));
const table = await create({ tasks: records });
console.log(table);

For large files, read in chunks of 500 lines to avoid truncation:

const { create } = await import("@/skills/swarm");
let records = [];
let offset = 0;
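while (true) {
  const chunk = await tools.readFile({ file_path: "/data.txt", offset, limit: 500 });
  // Count raw lines before dropping blanks, so blank lines cannot end the loop early
  const rawLines = chunk.split("\n");
  if (rawLines[rawLines.length - 1] === "") rawLines.pop(); // ignore a trailing newline
  for (const l of rawLines) {
    if (l.trim()) records.push({ id: `r${records.length}`, text: l });
  }
  if (rawLines.length < 500) break; // short page means end of file
  offset += 500;
}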
const table = await create({ tasks: records });
console.log(table);

When the file is too large to parse and dispatch in one eval call, split across two blocks. Only the block that calls swarm functions needs the import:

// eval 1: parse only — no swarm import needed
const raw = await tools.readFile({ file_path: "/data.jsonl" });
globalThis.records = raw.trim().split("\n").map(l => JSON.parse(l));
console.log(`Parsed ${globalThis.records.length} records`);

// eval 2: create and dispatch
const { create, run } = await import("@/skills/swarm");
const table = await create({ tasks: globalThis.records });
const result = await run(table.id, {
  instruction: "Classify {text}",
  responseSchema: {
    type: "object",
    properties: { label: { type: "string" } },
    required: ["label"],
  },
});
console.log(result);

Passing filePaths: ["/data.jsonl"] would produce a table with one row pointing at the file — not one row per record inside it.
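The "one record = one row" rule can be sketched as a small parsing helper. This is only an illustration: `parseJsonl` and its `idPrefix` parameter are hypothetical names, not part of the swarm API, and the choice to generate an id when a record has no string `id` is an assumption. It skips blank lines and rejects duplicate ids before the records reach `create({ tasks })`.

```typescript
// Hypothetical helper, not part of the swarm skill.
type Row = { id: string } & Record<string, unknown>;

function parseJsonl(raw: string, idPrefix = "r"): Row[] {
  const rows: Row[] = [];
  const seen = new Set<string>();

  raw.split("\n").forEach((rawLine, index) => {
    const line = rawLine.trim();
    if (line === "") return; // tolerate blank lines and a trailing newline

    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      throw new Error(`Line ${index + 1} is not valid JSON`);
    }
    if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
      throw new Error(`Line ${index + 1} is not a JSON object`);
    }

    const record = parsed as Record<string, unknown>;
    // Assumption: keep a string id if the record has one, otherwise generate one.
    const id = typeof record.id === "string" ? record.id : `${idPrefix}${rows.length}`;
    if (seen.has(id)) {
      throw new Error(`Duplicate row id "${id}" on line ${index + 1}`);
    }
    seen.add(id);
    rows.push({ ...record, id });
  });

  return rows;
}
```

Inside eval, the result would be passed on as `create({ tasks: parseJsonl(raw) })`, so each JSONL line becomes exactly one row.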

When to use `subagentType`

Omit subagentType for classification, extraction, labeling, and any task where a single model call with structured output is sufficient. This is the default and is significantly cheaper and faster — each dispatch is a direct model call, no tools, no iteration.

Set subagentType when the task requires tools, file access, or multi-step reasoning. Each dispatch runs a full agentic loop with the named subagent.

// Direct model call — classification, no tools needed
await run(table.id, {
  instruction: "Classify {text}",
  responseSchema: { type: "object", properties: { label: { type: "string" } }, required: ["label"] },
});

// Subagent — needs to read files and reason over multiple steps
await run(table.id, {
  subagentType: "reviewer",
  instruction: "Review {file} for security issues.",
  responseSchema: { type: "object", properties: { finding: { type: "string" } }, required: ["finding"] },
});

Instruction + context

instruction is a per-item template with {column} placeholders. The framework resolves the placeholders: in the prompt, each column name refers to a value listed alongside it, so subagents never see raw template syntax. Let subagents do the work; do not process items yourself in JS and write the results into rows.

context is free-form prose prepended to every subagent prompt. Use it for shared background: domain terms, classification rules, examples, etc.

const { create, run } = await import("@/skills/swarm");

const table = await create({ glob: "src/**/*.ts" });
const r = await run(table.id, {
  subagentType: "reviewer",
  instruction: "Review {file} for security issues. List findings or write 'no issues'.",
  context: "TypeScript Express backend using Prisma ORM. Focus on injection, auth bypass, path traversal.",
  responseSchema: {
    type: "object",
    properties: { review: { type: "string" } },
    required: ["review"],
  },
});
console.log(r);
// → { completed: 45, failed: 2, skipped: 0, failures: [...] }
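To make the `{column}` mechanics concrete, here is a minimal sketch of placeholder resolution and of the fail-fast check for unknown columns. It is not the skill's implementation: `placeholderNames`, `unknownColumns`, and `interpolate` are hypothetical names, and plain string substitution is a simplifying assumption (the framework lists values alongside the prompt rather than inlining them).

```typescript
// Hypothetical sketch; the real framework may render prompts differently.
type Row = Record<string, unknown>;

const PLACEHOLDER = /\{([A-Za-z_][A-Za-z0-9_]*)\}/g;

// Collect the distinct column names a template refers to.
function placeholderNames(template: string): string[] {
  const names: string[] = [];
  template.replace(PLACEHOLDER, (match: string, name: string) => {
    if (names.indexOf(name) === -1) names.push(name);
    return match;
  });
  return names;
}

// Columns referenced by the template that no row provides.
// A non-empty result is the case where run() is described as throwing.
function unknownColumns(template: string, rows: Row[]): string[] {
  return placeholderNames(template).filter(
    (name) => !rows.some((row) => row[name] !== undefined),
  );
}

// Substitute one row's values into the template (assumed rendering).
function interpolate(template: string, row: Row): string {
  return template.replace(PLACEHOLDER, (_match: string, name: string) => {
    const value = row[name];
    if (value === undefined) throw new Error(`Row has no column "${name}"`);
    return String(value);
  });
}
```

Checking `unknownColumns(instruction, rows)` before dispatch catches a misspelled placeholder without spending any subagent calls.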

Structured output

responseSchema is required. Schema properties become top-level columns on each row and constrain what subagents can return.

const { run } = await import("@/skills/swarm");
await run(table.id, {
  instruction: "Classify: {text}",
  responseSchema: {
    type: "object",
    properties: {
      sentiment: { type: "string", enum: ["positive", "negative", "neutral"] },
    },
    required: ["sentiment"],
  },
});
// Row after: { id: "r1", text: "...", sentiment: "positive" }
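The way schema properties become row columns can be sketched as a validate-then-merge step. This is an illustration under assumptions: `mergeResponse` is a hypothetical name, only string properties with an optional `enum` are handled, and dropping properties that the schema does not declare is a guess about how a constraint like this could be enforced.

```typescript
// Hypothetical sketch of merging a subagent response into a row.
type Row = Record<string, unknown>;
type StringProperty = { type: "string"; enum?: string[] };
type ResponseSchema = {
  type: "object";
  properties: Record<string, StringProperty>;
  required?: string[];
};

function mergeResponse(row: Row, response: unknown, schema: ResponseSchema): Row {
  if (typeof response !== "object" || response === null || Array.isArray(response)) {
    throw new Error("Response must be a JSON object");
  }
  const data = response as Record<string, unknown>;
  const required = schema.required ?? [];
  const merged: Row = { ...row };

  for (const [name, spec] of Object.entries(schema.properties)) {
    const value = data[name];
    if (value === undefined) {
      if (required.indexOf(name) !== -1) {
        throw new Error(`Missing required property "${name}"`);
      }
      continue;
    }
    if (typeof value !== "string") {
      throw new Error(`Property "${name}" must be a string`);
    }
    if (spec.enum && spec.enum.indexOf(value) === -1) {
      throw new Error(`Property "${name}" must be one of: ${spec.enum.join(", ")}`);
    }
    merged[name] = value; // schema property becomes a top-level column
  }
  return merged;
}

const sentimentSchema: ResponseSchema = {
  type: "object",
  properties: {
    sentiment: { type: "string", enum: ["positive", "negative", "neutral"] },
  },
  required: ["sentiment"],
};

// The merged row keeps id and text and gains a sentiment column.
const mergedRow = mergeResponse({ id: "r1", text: "great" }, { sentiment: "positive" }, sentimentSchema);
console.log(mergedRow);
```

An `enum` in the schema is what keeps later passes simple: a filter such as `{ column: "sentiment", equals: "negative" }` only works reliably when the column holds one of a fixed set of values.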

Batching

By default, swarm auto-batches to keep total dispatches at or below 10. For small tables (≤10 rows) each row gets its own subagent call. For larger tables, rows are grouped automatically.

Set batchSize to control grouping:

Number — uniform batch size for all rows. batchSize: 1 forces per-row dispatch; batchSize: 20 groups in twenties.
Function — (row, rowCount) => number. Returns the desired batch size for each row. Rows with the same batch size are grouped together, then chunked. Allows mixed dispatch where some rows go solo and others batch.

const { create, run } = await import("@/skills/swarm");
const table = await create({ tasks: items });

// Complex items get individual attention; simple ones batch together
await run(table.id, {
  instruction: "Analyze {text}",
  responseSchema: {
    type: "object",
    properties: { analysis: { type: "string" } },
    required: ["analysis"],
  },
  batchSize: (row) => (row.token_count > 1000 ? 1 : 10),
});

Batch sizes are clamped to [1, 50] after evaluation.
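The grouping rules above can be sketched as a planning function. This is not the skill's scheduler: `planBatches`, `autoBatchSize`, and `clampBatchSize` are hypothetical names, and the automatic size of `ceil(rowCount / 10)` is an assumption chosen to match the stated cap of 10 dispatches. Rows are grouped by their evaluated batch size, then each group is chunked.

```typescript
// Hypothetical sketch of batch planning; names and the auto formula are assumptions.
type Row = { id: string } & Record<string, unknown>;
type BatchSize = number | ((row: Row, rowCount: number) => number);

// Clamp to the documented range [1, 50].
function clampBatchSize(n: number): number {
  if (!Number.isFinite(n)) return 1;
  return Math.min(50, Math.max(1, Math.floor(n)));
}

// Assumed auto rule: smallest size that keeps dispatches at or below the cap.
// Note that clamping to 50 would exceed the cap for tables above 500 rows.
function autoBatchSize(rowCount: number, maxDispatches = 10): number {
  return clampBatchSize(Math.ceil(rowCount / maxDispatches));
}

function planBatches(rows: Row[], batchSize?: BatchSize): Row[][] {
  // Group rows that share the same evaluated batch size.
  const groups = new Map<number, Row[]>();
  for (const row of rows) {
    const size =
      batchSize === undefined
        ? autoBatchSize(rows.length)
        : typeof batchSize === "number"
          ? clampBatchSize(batchSize)
          : clampBatchSize(batchSize(row, rows.length));
    const members = groups.get(size) ?? [];
    members.push(row);
    groups.set(size, members);
  }

  // Chunk each group into batches of its size.
  const batches: Row[][] = [];
  groups.forEach((members, size) => {
    for (let start = 0; start < members.length; start += size) {
      batches.push(members.slice(start, start + size));
    }
  });
  return batches;
}
```

With the function form from the example above, three rows over the token threshold and twelve under it would yield three solo dispatches plus two batched ones (10 and 2 rows).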

Aggregation

After run(), use rows() and plain JS — no additional subagents needed.

const { rows } = await import("@/skills/swarm");
const data = await rows(table.id, { columns: ["sentiment"] });
const counts = {};
data.forEach(r => { counts[r.sentiment] = (counts[r.sentiment] || 0) + 1 });
console.log(counts);
// → { positive: 120, negative: 45, neutral: 35 }

Chaining passes

run updates the table in place — chain calls to accumulate columns.

const { create, run } = await import("@/skills/swarm");
const table = await create({ tasks: interviews });
await run(table.id, {
  instruction: "Classify sentiment of {text}",
  responseSchema: {
    type: "object",
    properties: { sentiment: { type: "string", enum: ["positive", "negative", "neutral"] } },
    required: ["sentiment"],
  },
});
await run(table.id, {
  filter: { column: "sentiment", equals: "negative" },
  instruction: "Summarize why {text} had negative sentiment.",
  responseSchema: {
    type: "object",
    properties: { summary: { type: "string" } },
    required: ["summary"],
  },
});

Action-only tasks

When subagents perform actions (write a file, apply a fix) rather than return data, use a simple schema with a status or marker field. The exists: false filter still works for retries.

const { create, run } = await import("@/skills/swarm");
const fixedSchema = {
  type: "object",
  properties: { fixed: { type: "string" } },
  required: ["fixed"],
};
const table = await create({ glob: "src/**/*.ts" });
await run(table.id, {
  subagentType: "fixer",
  instruction: "Add missing JSDoc to all exported functions in {file}.",
  responseSchema: fixedSchema,
});
// retry any that failed
await run(table.id, {
  subagentType: "fixer",
  instruction: "Add missing JSDoc to all exported functions in {file}.",
  responseSchema: fixedSchema,
  filter: { column: "fixed", exists: false },
});
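Where one retry is not enough, the same pattern can be wrapped in a bounded loop. The sketch below is a hypothetical helper, not part of the skill: `runWithRetries` and `RunOnce` are made-up names, and the caller supplies the actual `run` call, so the loop itself makes no assumptions about the swarm API beyond the documented `{ completed, failed, skipped, failures }` result shape.

```typescript
// Hypothetical retry wrapper; the caller injects the actual run() call.
type RunResult = {
  completed: number;
  failed: number;
  skipped: number;
  failures: unknown[];
};

// retryOnly is false on the first attempt and true afterwards, so the caller
// can add filter: { column: "<col>", exists: false } on retries.
type RunOnce = (retryOnly: boolean) => Promise<RunResult>;

async function runWithRetries(runOnce: RunOnce, maxAttempts = 3): Promise<RunResult[]> {
  const history: RunResult[] = [];
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const result = await runOnce(attempt > 0);
    history.push(result);
    if (result.failed === 0) break; // nothing left to retry
  }
  return history;
}

// Intended use inside eval, reusing the names from the example above:
//
//   const history = await runWithRetries((retryOnly) =>
//     run(table.id, {
//       subagentType: "fixer",
//       instruction: "Add missing JSDoc to all exported functions in {file}.",
//       responseSchema: fixedSchema,
//       ...(retryOnly ? { filter: { column: "fixed", exists: false } } : {}),
//     }),
//   );
```

Capping the attempts matters because a row that fails for a deterministic reason would otherwise be redispatched forever.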

Filtering

{ column: "status", equals: "done" }
{ column: "status", notEquals: "done" }
{ column: "category", in: ["A", "B"] }
{ column: "result", exists: false }      // not yet processed
{ and: [filter1, filter2] }
{ or: [filter1, filter2] }
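The filter shapes listed above can be read as a small recursive predicate. The following sketch shows one plausible evaluation; `matches` and the `Filter` type are hypothetical, and two details are assumptions: that `exists` treats `undefined` and `null` as missing, and that `exists: true` is accepted (only `exists: false` appears in this document).

```typescript
// Hypothetical evaluator for the filter shapes; semantics are assumed.
type Row = Record<string, unknown>;
type Filter =
  | { column: string; equals: unknown }
  | { column: string; notEquals: unknown }
  | { column: string; in: unknown[] }
  | { column: string; exists: boolean }
  | { and: Filter[] }
  | { or: Filter[] };

function matches(row: Row, filter: Filter): boolean {
  if ("and" in filter) return filter.and.every((f) => matches(row, f));
  if ("or" in filter) return filter.or.some((f) => matches(row, f));

  const value = row[filter.column];
  if ("equals" in filter) return value === filter.equals;
  if ("notEquals" in filter) return value !== filter.notEquals;
  if ("in" in filter) return filter.in.indexOf(value) !== -1;

  // Assumption: a column "exists" when its value is neither undefined nor null.
  const present = value !== undefined && value !== null;
  return present === filter.exists;
}

// Negative rows that have not been summarized yet.
const pendingNegative: Filter = {
  and: [
    { column: "sentiment", equals: "negative" },
    { column: "summary", exists: false },
  ],
};
```

Combining `equals` with `exists: false` in this way targets a retry at one slice of the table instead of every unprocessed row.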

Technical notes

Only import @/skills/swarm in blocks where you call swarm functions. Data preparation (reading files, parsing, storing in globalThis) does not need the import. Destructure only what you use: { create }, { run }, { create, run }, etc.
Console output is capped at ~5 KB. Never log raw file contents — log only counts and short samples.
readFile inside eval returns raw content — no line-number prefixes. Request at most 500 lines per call. For files with more than 500 lines, loop with incrementing offset.
When building a table from a file, read it inside eval. Data read inside the sandbox stays there; it never enters the agent's context window.
Never write to .swarm/ directly. Always use create().
Everything the subagent needs must be in instruction + context. Subagents can't see the agent's context.
Row ids must be unique. create() rejects sources that produce duplicate ids. For tasks, that's a caller-side responsibility; for glob / filePaths, ids are auto-disambiguated by parent directory.
Unknown columns fail fast. If instruction references {foo} and no matched row provides foo, run() throws before any subagent is dispatched.
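The advice to log only counts and short samples can be sketched as a small helper. `summarizeForLog` is a hypothetical name, and the default sample size and per-sample character limit are arbitrary choices meant to stay well under the ~5 KB console cap.

```typescript
// Hypothetical logging helper; limits are illustrative defaults.
function summarizeForLog(rows: unknown[], sampleSize = 3, maxChars = 200): string {
  const samples = rows.slice(0, sampleSize).map((row) => {
    const text = String(JSON.stringify(row));
    // Truncate long rows so a single sample cannot flood the console.
    return text.length > maxChars ? text.slice(0, maxChars) + "..." : text;
  });
  return [`${rows.length} rows`].concat(samples).join("\n");
}
```

Inside eval this would replace a raw dump, for example `console.log(summarizeForLog(globalThis.records))`.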

API Reference

`create(source)`

Create a table. Returns a handle { id, count, columns }.

| Source | Description |
| --- | --- |
| `{ glob: "src/**/*.ts" }` or `{ glob: ["src/**/*.ts", "lib/**/*.ts"] }` | Match files by one or more patterns. Columns: `id`, `file` |
| `{ filePaths: ["a.ts", "b.ts"] }` | Explicit file list. Columns: `id`, `file` |
| `{ tasks: [{ id: "t1", text: "..." }] }` | Custom rows. Each must have `id` |

`run(tableId, options)`

Dispatch work across rows. Returns { completed, failed, skipped, failures }.

| Option | Default | Description |
| --- | --- | --- |
| `instruction` | (required) | Template with `{column}` placeholders |
| `responseSchema` | (required) | JSON Schema (`type: "object"`); properties become row columns |
| `context` | none | Prose prepended to every subagent prompt |
| `filter` | none | Only dispatch matching rows |
| `subagentType` | none | Name of subagent to dispatch to. When set, runs a full agentic loop. When omitted, runs a direct model call |
| `batchSize` | auto | Number or `(row, rowCount) => number`. Auto caps dispatches at 10; `1` = per-row; function = per-row sizing |
| `concurrency` | `10` | Max concurrent subagent dispatches (clamped to 1–10) |

`rows(tableId, options?)`

Retrieve rows. Use for inspection and JS-based aggregation.

| Option | Description |
| --- | --- |
| `filter` | Only return matching rows |
| `columns` | Project to specific columns |
| `limit` | Max rows returned |
