Recursive Language Model (RLM) plugin for Claude Code
npx claudepluginhub ondrasek/spike-claude-code-rlm

Recursive Language Model (RLM) - Analyze documents far exceeding LLM context windows by recursively exploring them with LLM-generated Python code in a REPL.
A modern Python 3.11+ implementation of the Recursive Language Model paradigm from MIT CSAIL research (arXiv:2512.24601).
Unlike traditional RAG (Retrieval-Augmented Generation), RLM treats document context as an external variable in a Python REPL environment. The LLM doesn't see the full document - instead, it writes Python code to:
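- Inspect the document's size and structure (len(CONTEXT), CONTEXT[:1000])
- Search it for patterns (re.findall(r'pattern', CONTEXT))
- Delegate chunks to a sub-LLM (llm_query(f"Summarize: {chunk}"))
- Return the final answer (FINAL(answer))

This approach enables processing of documents far exceeding typical context windows while maintaining adaptive, task-specific exploration strategies.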
┌─────────────────────────────────────────────────────────┐
│ User Query │
└─────────────────────┬───────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────┐
│ RLM Orchestrator │
│ • Manages iteration loop │
│ • Parses code blocks from LLM response │
│ • Detects FINAL() answers │
└─────────────────────┬───────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────┐
│ Root LLM │
│ • Receives query + system prompt (NOT the context!) │
│ • Generates Python code to explore CONTEXT │
│ • Calls llm_query() for sub-processing │
└─────────────────────┬───────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────┐
│ REPL Environment │
│ • CONTEXT variable (holds the massive text) │
│ • llm_query(snippet, task) for sub-RLM calls │
│ • FINAL(answer) for returning results │
│ • Isolation delegated to container runtime │
└─────────────────────────────────────────────────────────┘
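The flow in the diagram can be sketched as a small loop. This is a minimal illustration under stated assumptions, not the project's orchestrator: run_rlm, FinalAnswer, and scripted_root_llm are hypothetical names, the root model is replaced by a scripted function, and llm_query takes a single prompt argument for brevity.

```python
import contextlib
import io
import re

FENCE = "`" * 3  # built at runtime to avoid a literal code fence in this example
CODE_BLOCK = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)


class FinalAnswer(Exception):
    """Raised by FINAL() to stop the loop and carry the answer out."""

    def __init__(self, answer):
        super().__init__(answer)
        self.answer = answer


def _final(answer):
    raise FinalAnswer(answer)


def run_rlm(query, context, root_llm, sub_llm, max_iterations=10):
    # The REPL namespace persists across iterations, like a live session.
    namespace = {"CONTEXT": context, "llm_query": sub_llm, "FINAL": _final, "re": re}
    # The root model sees the query and a size hint, never the context itself.
    transcript = [f"Query: {query}\nCONTEXT has {len(context)} characters."]
    for _ in range(max_iterations):
        reply = root_llm("\n".join(transcript))
        transcript.append(reply)
        for code in CODE_BLOCK.findall(reply):
            stdout = io.StringIO()
            try:
                with contextlib.redirect_stdout(stdout):
                    # Runs unsandboxed; the diagram delegates isolation
                    # to the container runtime.
                    exec(code, namespace)
            except FinalAnswer as done:
                return done.answer
            except Exception as exc:  # feed errors back so the model can retry
                stdout.write(f"{type(exc).__name__}: {exc}")
            transcript.append(f"Output:\n{stdout.getvalue()}")
    return None  # iteration budget exhausted without FINAL()


def scripted_root_llm(prompt):
    """Hypothetical stand-in for the root model: peeks once, then answers."""
    if "Output:" not in prompt:
        return FENCE + "python\nprint(len(CONTEXT))\n" + FENCE
    return FENCE + "python\nFINAL(llm_query(CONTEXT[:5]))\n" + FENCE


answer = run_rlm(
    query="What does the text start with?",
    context="hello world",
    root_llm=scripted_root_llm,
    sub_llm=lambda prompt: prompt.upper(),
)
print(answer)
```

Signalling FINAL() with an exception is one simple way to stop execution mid-block; a real implementation might instead set a flag in the namespace and check it after each block.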
pip install -r requirements.txt
Requirements:
- anthropic>=0.39.0 (for Anthropic API backend)
- openai>=1.0.0 (optional, for local models via Ollama/vLLM)

from rlm import RLM
from rlm.backends import AnthropicBackend
# Initialize with API key (or set ANTHROPIC_API_KEY env var)
backend = AnthropicBackend()
rlm = RLM(
    backend,
    model="claude-sonnet-4-20250514",
    sub_rlm_model="claude-haiku-3-20250813",  # Cheaper model for sub-RLM calls
    verbose=True,
)
# Process a large document
with open("large_document.txt") as f:
    context = f.read()
result = rlm.completion(
    context=context,
    query="What are the main themes discussed in this document?",
)
print(result.answer)
print(rlm.cost_summary())
OPENAI_API_KEY=... uvx --with openai rlm --backend openai --model gpt-4o --context-file doc.txt --query "Summarize"
OPENROUTER_API_KEY=... uvx --with openai rlm --backend openrouter --model anthropic/claude-sonnet-4 --context-file doc.txt --query "Summarize"
HF_TOKEN=... uvx --with openai rlm --backend huggingface --model Qwen/Qwen2.5-Coder-32B-Instruct --context-file doc.txt --query "Summarize"
from rlm import RLM
from rlm.backends import OpenAICompatibleBackend
backend = OpenAICompatibleBackend(
    base_url="http://localhost:11434/v1",
    api_key="ollama",
)
rlm = RLM(backend, model="llama3.2", verbose=True)
result = rlm.completion(context=doc, query="Summarize this document")
# With Anthropic API (set ANTHROPIC_API_KEY)
python demo.py --verbose
# With Ollama
python demo.py --backend ollama --model llama3.2 --verbose
# Custom context file
python demo.py --context-file /path/to/document.txt --query "Your question here"
Use --config rlm.yaml to configure per-role backends, models, and system prompts:
defaults:
  backend: anthropic
  model: claude-sonnet-4-20250514
roles:
  root:
    model: claude-sonnet-4-20250514
  sub_rlm:
    backend: ollama
    model: llama3.2
    base_url: http://localhost:11434/v1
  verifier:
    model: claude-haiku-4-5-20251001
settings:
  max_iterations: 10
  verbose: true
  verify: true
uvx rlm --config rlm.yaml --context-file doc.txt --query "Summarize"
CLI flags always override config values. Merge priority: CLI flags > roles.{role} > defaults > hardcoded defaults.
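The merge priority can be pictured as a layered lookup in which the first layer that defines a key wins. The sketch below is only an illustration of that ordering: resolve_role and HARDCODED_DEFAULTS are hypothetical names, the hardcoded value is made up, and the real loader may handle unset flags differently.

```python
from collections import ChainMap

# Illustrative only; the project's actual built-in defaults are not shown here.
HARDCODED_DEFAULTS = {"backend": "anthropic"}


def resolve_role(role, config, cli_flags):
    """Return effective settings for one role; earlier layers win."""
    # Flags the user did not pass (None) must not mask lower layers.
    cli = {key: value for key, value in cli_flags.items() if value is not None}
    return dict(ChainMap(
        cli,                                    # 1. CLI flags
        config.get("roles", {}).get(role, {}),  # 2. roles.{role}
        config.get("defaults", {}),             # 3. defaults
        HARDCODED_DEFAULTS,                     # 4. hardcoded defaults
    ))


config = {
    "defaults": {"backend": "anthropic", "model": "claude-sonnet-4-20250514"},
    "roles": {
        "sub_rlm": {
            "backend": "ollama",
            "model": "llama3.2",
            "base_url": "http://localhost:11434/v1",
        },
    },
}

# No CLI override: the role entry beats the defaults section.
sub = resolve_role("sub_rlm", config, {"model": None})
# CLI override on a role with no entry of its own: the flag wins,
# and the backend falls through to the defaults section.
root = resolve_role("root", config, {"model": "claude-haiku-4-5-20251001"})
# Empty config: only the hardcoded layer is left.
bare = resolve_role("verifier", {}, {})
```

With this config, sub resolves to the Ollama backend with llama3.2, while root takes its model from the flag and its backend from defaults.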