From adk-deployment
Use this skill to tune the ADK 2.0 agent runtime — concurrency, retries, timeouts, callback registration, custom service injection. Triggers on: "ADK runtime config", "tune ADK retries", "ADK timeout config", "ADK Runner customize", "register callbacks ADK", "custom service ADK", "ADK FastAPI integration", "AgentEngineSandboxCodeExecutor". Generates Runner / FastAPI integration code with the right knobs set for your workload profile.
Slash command: /adk-deployment:agent-runtime-config
Configure the ADK 2.0 runtime for production: concurrency, retries, timeouts, callbacks, custom services.
from google.adk.runners import Runner, RunnerConfig
from google.adk.sessions import VertexAiSessionService
from google.adk.artifacts import GcsArtifactService

session_service = VertexAiSessionService(project=..., location=...)
artifact_service = GcsArtifactService(bucket_name="my-artifacts")

runner = Runner(
    agent=root_agent,
    session_service=session_service,
    artifact_service=artifact_service,
    config=RunnerConfig(
        max_concurrent_invocations=50,
        per_invocation_timeout_seconds=120,
        tool_call_timeout_seconds=30,
        model_call_timeout_seconds=60,
        retry_policy={
            "max_retries": 3,
            "backoff_factor": 2.0,
            "retryable_errors": ["RateLimitError", "ServiceUnavailable"],
        },
    ),
)
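To make the retry_policy values above concrete, the following is a minimal sketch of how a policy with max_retries=3 and backoff_factor=2.0 is commonly interpreted: waits of 1s, 2s, and 4s before successive retries. It is a generic illustration, not ADK's implementation. The one-second base delay, the backoff_delays and call_with_retries helpers, and the local RateLimitError class are all assumptions made for this example.

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a retryable error type."""


def backoff_delays(max_retries, backoff_factor, base_delay=1.0):
    """Return the wait before each retry: base_delay * backoff_factor**n."""
    return [base_delay * backoff_factor ** n for n in range(max_retries)]


def call_with_retries(fn, max_retries, backoff_factor,
                      retryable=(RateLimitError,), base_delay=1.0,
                      sleep=time.sleep):
    """Call fn(); on a retryable error, wait and retry up to max_retries times."""
    delays = backoff_delays(max_retries, backoff_factor, base_delay)
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except retryable:
            if attempt == max_retries:
                raise  # retries exhausted, surface the last error
            sleep(delays[attempt])
```

Under these assumptions, three retries add up to 7 seconds of waiting. If each attempt can also run for the full model_call_timeout_seconds of 60, four attempts could exceed the 120-second per_invocation_timeout_seconds, so it is worth checking that the three values are consistent with each other for your workload.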
ADK 2.0 supports injecting custom services into the FastAPI server:
from google.adk.web import create_app

class AuditService:
    def log(self, msg):
        print(f"[AUDIT] {msg}")

app = create_app(
    agent=root_agent,
    session_service=session_service,
    custom_services={"audit": AuditService()},
)
Tools and callbacks can then request the audit service by its registered name via dependency injection.
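The page does not show how a tool actually receives an injected service. As a framework-independent sketch of name-keyed injection, the example below fills any function parameter whose name matches a key in the services dictionary. The inject helper, the fetch_record tool, and the entries list on AuditService are hypothetical and exist only for illustration; ADK's own mechanism may differ.

```python
import inspect


class AuditService:
    """Same idea as the AuditService above, but records entries for inspection."""

    def __init__(self):
        self.entries = []

    def log(self, msg):
        self.entries.append(f"[AUDIT] {msg}")


def inject(fn, services):
    """Wrap fn so parameters named after a registered service are filled in."""
    wanted = [name for name in inspect.signature(fn).parameters
              if name in services]

    def wrapper(*args, **kwargs):
        for name in wanted:
            # An explicitly passed argument wins over the registered service.
            kwargs.setdefault(name, services[name])
        return fn(*args, **kwargs)

    return wrapper


def fetch_record(record_id, audit):
    """Hypothetical tool that asks for the service by parameter name."""
    audit.log(f"fetch {record_id}")
    return {"id": record_id}
```

Wrapping the tool once, for example tool = inject(fetch_record, {"audit": AuditService()}), lets callers invoke tool("r1") without knowing about the audit dependency.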
from google.adk.callbacks import (
    on_before_model_call,
    on_after_model_call,
    on_before_tool_call,
    on_after_tool_call,
    on_session_created,
)

@on_before_model_call
async def trim_long_history(ctx, request):
    # See context-cache-compress skill
    return request

@on_after_tool_call
async def log_tool_usage(ctx, tool_name, args, result):
    print(f"{tool_name}({args}) -> {result}")

# Register with runner
runner = Runner(
    agent=root_agent,
    callbacks=[trim_long_history, log_tool_usage],
)
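The trim_long_history callback above is a stub that returns the request unchanged. A possible body is sketched below, assuming the request exposes an ordered messages list whose first entry is the system prompt. That shape, the FakeRequest class, and the MAX_MESSAGES budget are assumptions for illustration, not the ADK request type.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class FakeRequest:
    """Hypothetical request shape: an ordered list of message dicts."""
    messages: list = field(default_factory=list)


MAX_MESSAGES = 20  # assumed budget; tune to the model's context window


async def trim_long_history(ctx, request, max_messages=MAX_MESSAGES):
    """Keep the first message plus the newest turns, up to max_messages total."""
    msgs = request.messages
    if len(msgs) > max_messages:
        head = msgs[:1]
        # Slice from an explicit start index so max_messages=1 yields no tail.
        tail = msgs[len(msgs) - (max_messages - 1):]
        request.messages = head + tail
    return request


if __name__ == "__main__":
    demo = FakeRequest(messages=[{"i": n} for n in range(30)])
    trimmed = asyncio.run(trim_long_history(None, demo, max_messages=5))
    print([m["i"] for m in trimmed.messages])
```

Keeping the first message preserves instructions that usually sit at the start of a conversation, while dropping the oldest turns bounds the prompt size.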
from google.adk.agents import LlmAgent
from google.adk.tools import AgentEngineSandboxCodeExecutor

code_exec = AgentEngineSandboxCodeExecutor(
    project="my-project",
    location="us-central1",
)

root_agent = LlmAgent(
    name="data_analyst",
    model="gemini-2.5-pro",
    instruction="Use the code_executor tool to run analysis on uploaded CSVs.",
    tools=[code_exec],
)
This runs LLM-generated code inside a Vertex sandbox rather than on the host, so untrusted code never executes on the local machine.
| Workload | max_concurrent_invocations | Per-invocation timeout |
|---|---|---|
| Interactive chat (1-user) | 5 | 120s |
| Multi-tenant API | 50-200 | 60s |
| Batch processing | 10-20 | 600s |
| Streaming voice | depends on connections | n/a (long-lived) |
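The table can be captured as a small lookup so that a deployment picks its starting values by workload name. This is a sketch under assumptions: the profile keys and the runner_settings helper are hypothetical, the lower bound of each range in the table is used as the starting point, and streaming voice is left out because the table gives it no fixed values.

```python
# Starting values taken from the workload table; ranges use their lower bound.
PROFILES = {
    "interactive_chat": {
        "max_concurrent_invocations": 5,
        "per_invocation_timeout_seconds": 120,
    },
    "multi_tenant_api": {
        "max_concurrent_invocations": 50,
        "per_invocation_timeout_seconds": 60,
    },
    "batch": {
        "max_concurrent_invocations": 10,
        "per_invocation_timeout_seconds": 600,
    },
}


def runner_settings(profile, **overrides):
    """Return a copy of the profile's settings with any overrides applied."""
    if profile not in PROFILES:
        raise ValueError(f"unknown workload profile: {profile!r}")
    settings = dict(PROFILES[profile])  # copy so PROFILES is never mutated
    settings.update(overrides)
    return settings
```

For example, runner_settings("multi_tenant_api", max_concurrent_invocations=200) starts from the multi-tenant row and raises concurrency to the top of its range; the resulting dictionary could then be passed as keyword arguments to the runner configuration shown earlier.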
max_concurrent_invocations is respected
cloud-run-deployer / gke-deployer / vertex-agent-engine-deployer for the host
logging-callback-setup for observability callbacks
npx claudepluginhub healthcare-ai-consulting-llc/adk-2-toolkit --plugin adk-deployment