From adk-evaluation
Use this skill to simulate the external environment an ADK 2.0 agent operates in — mocked APIs, fake databases, time-controlled clocks — so evals are deterministic and don't hit real services. Triggers on: "ADK environment simulation", "mock APIs for ADK eval", "ADK fake backend", "deterministic ADK testing", "ADK sandboxed eval", "stub external services ADK", "ADK eval without network". Generates fixture stubs and a test harness wiring them in place of real tool implementations.
How this skill is triggered — by the user, by Claude, or both
Slash command
/adk-evaluation:environment-simulationThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Stub out the world around your agent so eval runs are deterministic, fast, and free.
Stub out the world around your agent so eval runs are deterministic, fast, and free.
# tests/fixtures/mock_tools.py
from datetime import datetime
class FakeWeatherAPI:
def __init__(self):
self.calls = []
self.responses = {
"Tokyo": {"temperature": 68, "conditions": "clear"},
"Paris": {"temperature": 55, "conditions": "rain"},
}
def get_weather(self, city: str, units: str = "fahrenheit") -> dict:
self.calls.append((city, units))
if city not in self.responses:
return {"error": f"Unknown city: {city}"}
return {**self.responses[city], "city": city, "units": units}
import pytest
from google.adk.agents import LlmAgent
@pytest.fixture
def fake_weather():
return FakeWeatherAPI()
@pytest.fixture
def agent_under_test(fake_weather):
return LlmAgent(
name="weather_assistant",
model="gemini-2.5-flash",
instruction="Answer weather questions.",
tools=[fake_weather.get_weather],
)
async def test_known_city(agent_under_test, fake_weather):
result = await agent_under_test.run("Weather in Tokyo?")
assert "68" in result or "clear" in result
assert ("Tokyo", "fahrenheit") in fake_weather.calls
from freezegun import freeze_time
@freeze_time("2026-01-15 09:00:00")
async def test_morning_briefing(agent_under_test):
result = await agent_under_test.run("What's today?")
assert "January 15, 2026" in result or "2026-01-15" in result
from google.adk.testing import StubLlm
stub = StubLlm(responses=[
{"text": "I'll check the weather.", "tool_calls": [{"name": "get_weather", "args": {"city": "Tokyo"}}]},
{"text": "It's 68°F and clear in Tokyo."},
])
agent = LlmAgent(
name="test",
model=stub,
tools=[fake_weather.get_weather],
)
import json
from pathlib import Path
async def test_snapshot(agent_under_test, snapshot):
transcript = await run_eval_case(agent_under_test, "weather_tokyo")
expected = json.loads(Path("snapshots/weather_tokyo.json").read_text())
assert transcript == expected
pytest --forbid-network or similar)user-simulation-runner to drive the agent inside the simulated environmentcustom-metric-builder to score behavior in the simulationCreates, edits, and optimizes skills for Claude Code, including drafting, evaluating with test prompts, iterating on performance, and improving skill descriptions for better triggering accuracy.
npx claudepluginhub healthcare-ai-consulting-llc/adk-2-toolkit --plugin adk-evaluation