From agentdb-learning
Ask the AgentDB bandit which RL algorithm / skill / pattern fits the current task best. Use at task start when there are multiple plausible approaches and you want the data-driven pick.
Slash command
/agentdb-learning:agentdb-route
Ask the Thompson Sampling bandit which approach to use for the current task.
agentdb_learning_route(
  task: <description>
  candidates?: [<skill_id> | <algo>, ...]  // omit to consider everything
  context?: { stack, project, ... }
)
Returns: { picked, expectedReward, confidence, alternatives: [...] }
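The call and return shapes above can be written down as types. This is only a sketch: the field names come from the signature, but every field type, the `RouteRequest`/`RouteResult` names, and the `buildRouteRequest` helper are assumptions made for illustration, not AgentDB's published API.

```typescript
// Hypothetical types inferred from the signature above; field types are assumptions.
interface RouteRequest {
  task: string;
  candidates?: string[];             // skill ids or algorithm names
  context?: Record<string, unknown>; // e.g. { stack, project }
}

interface RouteResult {
  picked: string;
  expectedReward: number;
  confidence: number;
  alternatives: unknown[];           // element shape is not specified in the source
}

// Hypothetical helper: leaves `candidates` out entirely when the list is empty,
// which per the signature comment means "consider everything".
function buildRouteRequest(
  task: string,
  candidates: string[] = [],
  context?: Record<string, unknown>
): RouteRequest {
  const request: RouteRequest = { task };
  if (candidates.length > 0) request.candidates = [...candidates];
  if (context !== undefined) request.context = context;
  return request;
}

// Illustrative values only.
const openRequest = buildRouteRequest("add retry logic to the HTTP client");
const narrowRequest = buildRouteRequest(
  "add retry logic to the HTTP client",
  ["skill-a", "skill-b"],
  { stack: "node", project: "demo" }
);
```

Leaving the key out, rather than sending an empty array, keeps the "no restriction" case unambiguous.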
Thompson Sampling: each candidate has a Beta(α, β) posterior over reward. The bandit draws one sample from each posterior and picks the candidate with the highest draw. Exploration emerges naturally: uncertain candidates keep getting tried until their posteriors tighten.
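The selection rule described above can be sketched in a few lines. This is a minimal illustration of textbook Thompson Sampling, not AgentDB's implementation: the `Arm` type and all function names are hypothetical, and the Beta sampler is built from a Marsaglia-Tsang gamma sampler because JavaScript has no built-in one.

```typescript
// Sketch of Thompson Sampling over Beta posteriors (names are hypothetical).
type Arm = { id: string; alpha: number; beta: number };

// Standard normal draw via the Box-Muller transform.
function sampleNormal(rand: () => number): number {
  const u = 1 - rand(); // in (0, 1], avoids log(0)
  const v = rand();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// Gamma(shape, 1) draw via Marsaglia-Tsang; shape < 1 is boosted to shape + 1.
function sampleGamma(shape: number, rand: () => number): number {
  if (shape < 1) {
    const u = 1 - rand();
    return sampleGamma(shape + 1, rand) * Math.pow(u, 1 / shape);
  }
  const d = shape - 1 / 3;
  const c = 1 / Math.sqrt(9 * d);
  for (;;) {
    const x = sampleNormal(rand);
    const v = Math.pow(1 + c * x, 3);
    if (v <= 0) continue;
    const u = 1 - rand();
    if (Math.log(u) < 0.5 * x * x + d - d * v + d * Math.log(v)) {
      return d * v;
    }
  }
}

// Beta(a, b) draw as X / (X + Y) with X ~ Gamma(a) and Y ~ Gamma(b).
function sampleBeta(a: number, b: number, rand: () => number): number {
  const x = sampleGamma(a, rand);
  const y = sampleGamma(b, rand);
  return x / (x + y);
}

// Draw once from each arm's posterior and return the arm with the highest draw.
function thompsonPick(arms: Arm[], rand: () => number = Math.random): Arm {
  let best: Arm | null = null;
  let bestDraw = -Infinity;
  for (const arm of arms) {
    const draw = sampleBeta(arm.alpha, arm.beta, rand);
    if (draw > bestDraw) {
      bestDraw = draw;
      best = arm;
    }
  }
  if (best === null) throw new Error("no candidates");
  return best;
}
```

An arm at Beta(1, 1) has a wide posterior, so its draws occasionally beat a well-established arm; that is where the exploration comes from, with no separate exploration parameter to tune.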
Four bandit decision points across AgentDB:
The router unifies them: it returns the picked candidate AND a decisionTrace showing which decision points fired.
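// 1. Ask the router for a pick
const { picked } = await agentdb_learning_route(...)
// 2. Run the task with the picked candidate
const result = await runWith(picked)
// 3. Report the observed reward so the bandit can learn from it
agentdb_bandit_update(arm: picked, reward: result.reward)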
The agentdb-feedback skill (also in this plugin) wraps this loop-closing step, the reward update.
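To show why the reward update matters, here is a minimal sketch of how one observed reward could be folded into a Beta posterior. The update rule (add the reward to α and its complement to β, for rewards in [0, 1]) is the standard textbook one and is an assumption here: the source does not say what `agentdb_bandit_update` does internally. All names are hypothetical.

```typescript
// Hypothetical posterior record; AgentDB's real storage format is not shown in the source.
type Posterior = { alpha: number; beta: number };

// Fold one reward in [0, 1] into the posterior (assumed textbook update rule).
function updatePosterior(p: Posterior, reward: number): Posterior {
  if (!(reward >= 0 && reward <= 1)) {
    throw new RangeError("reward must be in [0, 1]");
  }
  return { alpha: p.alpha + reward, beta: p.beta + (1 - reward) };
}

// Mean of Beta(alpha, beta): the arm's expected reward under the posterior.
function posteriorMean(p: Posterior): number {
  return p.alpha / (p.alpha + p.beta);
}

// Variance of Beta(alpha, beta): shrinks as evidence accumulates.
function posteriorVariance(p: Posterior): number {
  const n = p.alpha + p.beta;
  return (p.alpha * p.beta) / (n * n * (n + 1));
}

// Start from a uniform prior, Beta(1, 1), and observe rewards 1, 1, 1, 0.
const prior: Posterior = { alpha: 1, beta: 1 };
const updated = [1, 1, 1, 0].reduce(updatePosterior, prior);
// updated is Beta(4, 2): the mean moves from 1/2 to 2/3 and the variance shrinks.
```

Skipping the update leaves every posterior at its prior, so the bandit would keep sampling as if it had learned nothing.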
npx claudepluginhub ruvnet/agentdb --plugin agentdb-learning