From total-recall
Manages total-recall memory system: search, store, forget memories, inspect tiers, run eval, and configure. Automatically activates on memory-related queries.
How this skill is triggered — by the user, by Claude, or both
Slash command
/total-recall:commandsThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
**IMMEDIATE ACTION REQUIRED — before responding to the user's first message, call the total-recall `session_start` MCP tool. This syncs host imports and loads hot tier context. If `session_start` has already run server-side, the call returns cached results instantly.**
IMMEDIATE ACTION REQUIRED — before responding to the user's first message, call the total-recall session_start MCP tool. This syncs host imports and loads hot tier context. If session_start has already run server-side, the call returns cached results instantly.
If session_start is blocked by permissions (e.g., in TUI fullscreen / dontAsk mode), do NOT just report the error. Instead:
mcp__plugin_total-recall_total-recall__session_start allowed to function. You may need to adjust permissions if you want total-recall active this session."/total-recall:commands setup to auto-configure permissionsMemory and knowledge base management for total-recall.
These behaviors run automatically throughout the session. Tool calls will be visible to the user.
session_start MCP tool to sync imports and assemble hot tier context (this may already be cached — call it regardless to receive the context)tierSummarystorage (e.g. "sqlite", "cortex", "postgres"). If it shows a fallback like "sqlite (cortex failed)", flag this prominently.lastSessionAge is present, mention when the last session washints are present, briefly surface the most relevant onespinned_budget_pressure is present in hints, surface it prominently: pinned entries are eating over half the context budget — suggest unpinning or trimming entriestotal-recall loaded — 2 pinned, 3 hot, 12 warm, 5 cold, 2 KB collections. Storage: cortex. Last session: 2 hours ago. Context: TODO list at docs/TODO.md; user prefers bundled PRs for refactors.
hints to inform your behavior throughout the session — they represent high-value memories like user corrections, preferences, and frequently accessed project contextcontext field to inform your responsesWhen you detect these patterns in user messages, call memory_store:
Do NOT ask permission — just store it.
On each user message that is a question or task request:
memory_search with the message, searching warm tiermemory_search and kb_search now return a retrievalId alongside results.
After you use retrieved memories in your work, call memory_feedback:
memory_feedback({ retrievalId, used: true })memory_feedback({ retrievalId, used: false })Report once per search, as soon as you know the outcome. Skip the call only when
retrievalId is empty. Do NOT ask permission — just report it. This is what makes
the Dashboard "Retrieval quality" metric real.
Pinned directives are re-asserted automatically near the live edge by the per-turn pinned floor (where the host supports it — see the capability matrix under "Pinned directive floor" below). You do not need to do anything for this.
Additionally: when the user makes a significant task switch (a clearly new
piece of work), call session_refresh once. This re-prepends the pinned block
and refreshes hot-tier context near the current generation point.
At session end, follow the directive in session-end.md — it is the single source of truth, injected verbatim by the SessionEnd hook (hooks/session-end/run.sh).
A UserPromptSubmit hook re-injects the pinned block on an adaptive throttle
(config: [tiers.pinned] floor_enabled / floor_every_n_turns / floor_growth_tokens)
so pinned directives keep being followed in long sessions. It re-asserts when
either trigger trips since the last injection: N turns elapsed, or ~growth_tokens
of transcript growth. Set floor_enabled = false to disable.
| Host | Per-turn floor | Mechanism |
|---|---|---|
| Claude Code | Active | UserPromptSubmit → additionalContext |
| Copilot CLI | Pending upstream fix | UserPromptSubmit → additionalContext |
| Cursor | Layered fallback | session-start + skill-guided session_refresh |
/total-recall:commands <subcommand> [args]
Print the command reference table below. Do not call any MCP tools.
| Command | Purpose |
|---|---|
help | Show this command reference |
setup | Auto-configure permissions for total-recall MCP tools |
status | Dashboard with tier sizes, session ID, DB stats |
search <query> | Semantic search across all tiers |
store <content> | Store a memory (--tier, --tags, --type) |
forget <query> | Find and delete memories (with confirmation) |
inspect <id> | Full details for a single entry |
promote <id> | Move an entry up one tier |
demote <id> | Move an entry down one tier |
pin <id> | Pin an entry to the pinned tier (always injected, never decays) |
unpin <id> | Release a pinned entry back to the warm tier |
history | Timeline of recent tier movements |
recent | List newest memories by timestamp (--limit, --tier, --type, --project, --order); --tier pinned supported |
lineage <id> | Compaction ancestry tree for an entry |
export | Export memories to JSON (--tiers, --types) |
import <path> | Import memories from a JSON file |
import-host | Import from host tools (Claude Code, etc.) |
ingest <path> | Ingest a file or directory into the knowledge base |
kb search <query> | Search the knowledge base |
kb list | List KB collections |
kb refresh <col> | Re-ingest a KB collection |
kb remove <id> | Remove a KB entry |
compact | Run hot-tier compaction now |
eval | Retrieval quality report (--benchmark, --compare, --snapshot) |
config [get|set] | View or update configuration |
update | Update the plugin to the latest version |
ui | Open the built-in web management UI in the browser (total-recall ui) |
Auto-configure Claude Code permissions so total-recall MCP tools are allowed (required for TUI fullscreen / dontAsk permission mode).
~/.claude/settings.json (create if it doesn't exist)permissions.allow already contains a rule matching mcp__plugin_total-recall_total-recall__* (exact or equivalent glob)"mcp__plugin_total-recall_total-recall__*" to the permissions.allow arrayImportant: Preserve all existing settings — only add to the allow array, never remove or overwrite other entries. If the file doesn't exist, create it with just the permissions block.
Call the status MCP tool. Format as a dashboard showing:
Call memory_search with the query, all tiers enabled, top_k=10. Format results grouped by tier, showing: content preview, similarity score, source, tags. Valid tier filter values: hot, warm, cold, pinned. Offer actions: /total-recall:commands promote <id>, /total-recall:commands pin <id>, or /total-recall:commands forget <id>.
Call memory_store with the provided content. Optionally accept flags:
--tier hot|warm|cold (default: hot)--tags tag1,tag2--type correction|preference|decision--pinned is not a tier value — to store-and-pin new content use memory_store with pinned: true (see ### pin).memory_search with the query to find matching entriesmemory_delete for each selected entryNever auto-delete without user confirmation.
Call memory_inspect with the entry ID. Show full entry details including content, tier, source, tags, access count, decay score, creation/update timestamps, and compaction history.
Call memory_promote with the entry ID and target tier/type. Default target: one tier up, same content type.
Call memory_demote with the entry ID and target tier/type. Default target: one tier down, same content type.
Call memory_pin with the entry ID and optional scope/project/type. Pin a memory so it is ALWAYS injected at session start and never decays or compacts. The only way out is unpin.
memory_pin { id, scope?: "project"|"global", project?, type? }total-recall memory pin <id> [--scope project|global] [--project <name>] [--type memory|knowledge]scope: "global" makes the pin visible in every project; scope: "project" scopes it to the given (or existing) project; omitted keeps the entry's current project.tiers.pinned.max_content_chars, default 500) — pins are directives, not reference material. Oversized content is rejected, never truncated or auto-summarized: distill the rule and pin the distillation; keep the full detail as a normal warm memory.memory_store with pinned: true stores-and-pins in one call (no pre-existing id required).Call memory_unpin with the entry ID. Release a pinned entry back to the warm tier, where the normal decay/compaction lifecycle resumes.
memory_unpin { id, type? }total-recall memory unpin <id> [--type memory|knowledge]Call memory_history. Show recent tier movements from the compaction log as a timeline.
Call memory_recent with the parsed flags (defaults: limit 20, all tiers, order created). Render the returned entries as a numbered list, newest first:
N. [<timestamp>] <tier> · <entry_type> · <project> — <preview>
The order field echoes which timestamp was used. Note that updated/accessed reflect activity (compaction bumps updated_at, retrieval bumps last_accessed_at), while created reflects authoring time. For full text of any entry, suggest /total-recall:commands inspect <id>; also offer promote <id> / forget <id> as follow-ups.
Call memory_lineage with the entry ID. Show the full compaction ancestry tree.
Call memory_export. Optionally accept --tiers hot,warm,cold,pinned and --types memory,knowledge to filter.
Call memory_import with the file path.
Determine if path is a file or directory:
kb_ingest_filekb_ingest_dirReport: collection name, document count, chunk count. Suggest a test query to verify.
Call kb_search with the query. Show results with content preview, score, collection, and source path.
If the response includes needsSummary: true, generate a 2-3 sentence summary of the collection's content based on the search results and call kb_summarize with the collection ID and summary. This improves future retrieval.
Call kb_list_collections. Show all collections with document and chunk counts.
Call kb_refresh with the collection ID. Report re-ingestion results.
Call kb_remove with the entry ID. Ask for confirmation first.
Call compact_now. Note: the response is informational only — real compaction is host-orchestrated via the Session End flow (session_context + memory_promote/memory_demote/memory_store/memory_delete). Surface the returned guidance and point the user at the Session End mechanism if they want to actually compact now.
eval_report for live retrieval quality metrics (7-day rolling)--benchmark: call eval_benchmark for synthetic benchmark results--compare <name>: call eval_compare to compare against a saved baseline--snapshot <name>: call eval_snapshot to save current config as a named baselineget: call config_get with the key (or omit for full config)set: call config_set with key and valueLaunch the built-in web management UI. This is a CLI command, not an MCP tool — instruct the user to run:
total-recall ui
Optionally pass flags: --port N (default 5577), --no-open (skip browser auto-open), --host H (bind address, default localhost), --smoke (CI probe: start, check /api/health, exit 0/1). The UI provides six sections: Dashboard · Memory · Knowledge Base · Usage · Insights · Config. The access token is embedded in the served page — no manual credential setup needed.
Call import_host to detect and import memories from host tools (Claude Code, Copilot CLI). Optionally restrict to a specific source.
Update the plugin to the latest published version. The correct mechanism depends on how total-recall is installed:
Detect the install mode. Check whether the current plugin directory is a git checkout or a marketplace tarball snapshot:
if [ -n "$CLAUDE_PLUGIN_ROOT" ] && [ -d "$CLAUDE_PLUGIN_ROOT/.git" ]; then
echo "git-checkout"
else
echo "marketplace-tarball"
fi
If git-checkout: run cd "$CLAUDE_PLUGIN_ROOT" && git pull origin main and report what changed (files updated, new version).
If marketplace-tarball (the common case for users who installed via /plugin): do NOT attempt git pull — the cache directory has no .git and is managed by Claude Code. Instead, instruct the user to update it via Claude Code itself:
/plugin (or the equivalent plugin UI in their Claude Code version) and update total-recall from the strvmarv-total-recall-marketplace marketplace./reload-plugins to apply.Also report the latest published version so the user knows the target — check it with:
npm view @strvmarv/total-recall version
In both cases, end by reminding the user to run /reload-plugins after the update completes.
npx claudepluginhub strvmarv/total-recall-marketplace --plugin total-recallCross-host durable memory for AI agents using the ling-mem CLI. Maintains a three-tier model of who the user is across sessions and hosts (Claude Code, Codex, OpenClaw).
Recall facts from past sessions via memoir. STORE PATH: ALWAYS compute first via `STORE=$(bash "$CLAUDE_PLUGIN_ROOT/scripts/derive-store-path.sh")` (or `$MEMOIR_STORE`). Pass `-s "$STORE"` on every call — never rely on memoir's connected default (frequently stale). PROCEDURE (single-shot default): ONE `summarize --depth 3 -n default` → pick at most 5–7 keys → batch `get`. Only escalate to drill (batched `--keys`) if the depth-3 response shows `total_memories > 1000` AND the query is broad. Never call `summarize --depth 1` separately — `--depth 3` already returns count + full key listing. EXCLUDE `metrics.*` unless args contain `--include-metrics`. NEVER shell out to `memoir recall` (legacy LLM-bundled, slow, requires OPENAI_API_KEY). First reply line MUST be a mode marker `[mode=get|fast|drill|flat|blame|diff]`. DEFAULT ON: invoke for any question or task that may depend on past preferences, decisions, conventions, or knowledge — questions touching prior state, meta/overview asks, design/implementation prompts where output may reflect prior style, SessionStart hints, or any moment you'd otherwise silently apply remembered facts. SKIP only for mechanical single-symbol lookups, throwaway scratch work, or explicit user opt-out. Defer to memoir-onboard for repo-structure questions (it owns `codebase:onboard`). Cost of an unused recall is low; cost of missing a remembered preference is high.
Searches and retrieves memories from Cortex persistent memory using WRRF retrieval. Use for past decisions, patterns, bugs, or architecture context.