Skill

line-voice-agent

Build voice agents with the Cartesia Line SDK. Supports 100+ LLM providers via LiteLLM with tool calling, multi-agent handoffs, and real-time interruption handling.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/cartesia-skills:line-voice-agent

Supporting Files

references/advanced-patterns.md
references/calls-api.md
references/multi-agent-workflows.md
references/tool-patterns.md
references/troubleshooting.md

SKILL.md

728 lines · ~6k tokens (exceeds 5k compaction limit)

Stats

Stars: 4
Maintenance: Excellent
Last commit: Jun 12, 2026

Line SDK Voice Agent Guide

Build production voice agents with the Cartesia Line SDK. This guide covers agent creation, tool patterns, multi-agent workflows, and LLM provider configuration.

How Line Works

Line is Cartesia's voice agent deployment platform. You write Python agent code using the Line SDK, deploy it to Cartesia's managed cloud via the cartesia CLI, and Cartesia hosts it with auto-scaling. Cartesia handles STT (Ink), TTS (Sonic), telephony, and audio orchestration. Only one deployment per agent is active at a time; once deployed, your agent receives calls automatically.

┌─────────────────────────────────────────────────────────────────┐
│                     Cartesia Line Platform                       │
│  ┌──────────┐    ┌──────────────┐    ┌──────────┐              │
│  │   Ink    │───▶│  Your Agent  │───▶│  Sonic   │              │
│  │  (STT)   │    │  (Line SDK)  │    │  (TTS)   │              │
│  └──────────┘    └──────────────┘    └──────────┘              │
│       ▲                                    │                    │
│       │         Audio Orchestration        │                    │
│       └────────────────────────────────────┘                    │
└─────────────────────────────────────────────────────────────────┘
        ▲                                    │
        │            WebSocket               ▼
┌───────┴────────────────────────────────────┴───────┐
│              Client (Phone / Web / Mobile)          │
└─────────────────────────────────────────────────────┘

Your code handles:

LLM reasoning and conversation flow
Tool execution (API calls, database lookups)
Multi-agent coordination and handoffs

Cartesia handles:

Speech-to-text (Ink)
Text-to-speech (Sonic)
Real-time audio streaming
Turn-taking and interruption detection
Deployment and auto-scaling

Audio Input Options:

Cartesia Telephony - Managed phone numbers
Calls API - Web apps, mobile apps, custom telephony

Prerequisites

Python 3.10+ and uv (recommended package manager)
Cartesia API key — get one at play.cartesia.ai/keys (used by the CLI and for deployment)
LLM API key — for whichever LLM provider your agent calls (e.g. ANTHROPIC_API_KEY, OPENAI_API_KEY, GEMINI_API_KEY)
Cartesia CLI — install with:
```
curl -fsSL https://cartesia.sh | sh
```

Cartesia CLI Reference

# Authentication
cartesia auth login              # Login with Cartesia API key
cartesia auth status             # Check auth status

# Project Setup
cartesia create [project-name]   # Create project from template
cartesia init                    # Link existing directory to an agent

# Local Development
cartesia chat <port>             # Chat with local agent (text mode)

# Deployment
cartesia deploy                  # Deploy to Cartesia cloud
cartesia status                  # Check deployment status

# Environment Variables (encrypted, stored on Cartesia)
cartesia env set KEY=VALUE       # Set a single env var
cartesia env set --from .env     # Import all vars from .env file
cartesia env rm <name>           # Remove an env var

# Agents & Calls
cartesia agents ls               # List all agents
cartesia deployments ls          # List deployments
cartesia call <phone> [agent-id] # Make outbound call

Full command reference: docs.cartesia.ai/line/cli.

Quick Start

1. Create Project

cartesia auth login
cartesia create my-agent
cd my-agent

2. Write Agent Code

main.py:

import os
from line.llm_agent import LlmAgent, LlmConfig, end_call
from line.voice_agent_app import AgentEnv, CallRequest, VoiceAgentApp

async def get_agent(env: AgentEnv, call_request: CallRequest):
    return LlmAgent(
        model="anthropic/claude-haiku-4-5-20251001",
        api_key=os.getenv("ANTHROPIC_API_KEY"),
        tools=[end_call],
        config=LlmConfig(
            system_prompt="You are a helpful voice assistant.",
            introduction="Hello! How can I help you today?",
        ),
    )

app = VoiceAgentApp(get_agent=get_agent)

if __name__ == "__main__":
    app.run()

3. Test Locally

ANTHROPIC_API_KEY=your-key python main.py
cartesia chat 8000  # Text chat with your running agent

4. Deploy

cartesia env set ANTHROPIC_API_KEY=your-key  # Encrypted, stored on Cartesia
cartesia deploy
cartesia status  # Verify deployment is active

5. Make a Call

cartesia call +1234567890  # Outbound call via CLI

Or trigger calls from the Cartesia dashboard.

Project Structure

Every Line agent project MUST have:

my-agent/
├── main.py          # VoiceAgentApp entry point (REQUIRED)
├── cartesia.toml    # Deployment config, created by cartesia init or cartesia create (REQUIRED)
└── pyproject.toml   # Dependencies: cartesia-line

cartesia.toml declares deployment metadata, the local server address, and the env vars your agent requires:

[cartesia]
name = "My Agent"
description = "What this agent does"
version = "0.1.0"

[cartesia.server]
port = 8000
host = "0.0.0.0"

[cartesia.environment]
required_vars = ["ANTHROPIC_API_KEY"]
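
Before deploying, it can help to confirm that every variable listed in required_vars is actually set. The check below is a minimal sketch: `missing_required_vars` is a hypothetical helper, not part of the Line SDK or the cartesia CLI, and it takes the variable names as a plain list rather than parsing cartesia.toml.

```python
import os
from typing import Mapping, Sequence


def missing_required_vars(
    required_vars: Sequence[str],
    environ: Mapping[str, str] = os.environ,
) -> list[str]:
    """Return the names from required_vars that are unset or empty in environ."""
    return [name for name in required_vars if not environ.get(name)]


if __name__ == "__main__":
    # Hypothetical usage: mirror the list from [cartesia.environment].
    missing = missing_required_vars(["ANTHROPIC_API_KEY"])
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Passing the environment as an argument keeps the check easy to test without touching the real process environment.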

Core Concepts

LlmAgent

The main agent class that wraps LLM providers via LiteLLM:

import os

from line.llm_agent import LlmAgent, LlmConfig, end_call

agent = LlmAgent(
    model="gemini/gemini-2.5-flash-preview-09-2025",  # LiteLLM model string
    api_key=os.getenv("GEMINI_API_KEY"),              # Provider API key
    tools=[end_call, my_custom_tool],                  # List of tools
    config=LlmConfig(...),                             # Agent configuration
    max_tool_iterations=10,                            # Max tool call loops (default: 10)
    backend=None,                                      # Optional provider backend override
)

LlmConfig

Configuration for agent behavior and LLM sampling:

from line.llm_agent import LlmConfig

config = LlmConfig(
    # Agent behavior
    system_prompt="You are a helpful assistant.",
    introduction="Hello! How can I help?",  # Set to "" to wait for user first

    # Sampling parameters (optional)
    temperature=0.7,
    max_tokens=1024,
    top_p=0.9,
    stop=["\n\n"],
    seed=42,
    presence_penalty=0.0,
    frequency_penalty=0.0,
    # Reasoning models only: "none" | "minimal" | "low" | "medium" | "high"
    reasoning_effort="low",

    # Resilience (optional)
    num_retries=2,           # Default: 2
    timeout=30.0,
    fallbacks=["gpt-5-nano"],  # Fallback models

    # Advanced (optional)
    strict_tool_schemas=True,   # Default: True
    extra={},                   # Provider-specific pass-through kwargs to LiteLLM
)

reasoning_effort is validated against the model: passing it to a model that doesn't support reasoning raises ValueError. Use "none" (or omit it) for non-reasoning models.

Dynamic Configuration from CallRequest

Use LlmConfig.from_call_request() to pull configuration from the incoming call:

async def get_agent(env: AgentEnv, call_request: CallRequest):
    return LlmAgent(
        model="anthropic/claude-sonnet-4-5",
        api_key=os.getenv("ANTHROPIC_API_KEY"),
        tools=[end_call],
        config=LlmConfig.from_call_request(
            call_request,
            fallback_system_prompt="Default system prompt if not in request.",
            fallback_introduction="Default introduction if not in request.",
            temperature=0.7,  # Additional LlmConfig options
        ),
    )

Priority order: CallRequest value > fallback argument > SDK default
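
The priority order can be pictured as a first-set-value lookup. The function below is an illustration only: `resolve_setting` is a hypothetical helper, not the SDK's implementation, and it assumes an unset value is represented by `None`.

```python
from typing import Optional


def resolve_setting(
    request_value: Optional[str],
    fallback: Optional[str],
    sdk_default: str,
) -> str:
    """Return the first value that is set: request, then fallback, then default."""
    if request_value is not None:
        return request_value
    if fallback is not None:
        return fallback
    return sdk_default


if __name__ == "__main__":
    print(resolve_setting("Prompt from the call request.", "Fallback prompt.", "SDK default."))
    print(resolve_setting(None, "Fallback prompt.", "SDK default."))
    print(resolve_setting(None, None, "SDK default."))
```

The sketch compares against `None` rather than testing truthiness, so an explicit empty string (such as an introduction of "" to wait for the user) is treated as a set value.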

VoiceAgentApp

The application harness that manages HTTP endpoints and WebSocket connections:

from line.voice_agent_app import VoiceAgentApp, AgentEnv, CallRequest

async def get_agent(env: AgentEnv, call_request: CallRequest):
    # env.loop - asyncio event loop
    # call_request.call_id - unique call identifier
    # call_request.agent.system_prompt - from request
    # call_request.agent.introduction - from request
    # call_request.metadata - custom metadata dict
    return LlmAgent(...)

app = VoiceAgentApp(get_agent=get_agent)
app.run(host="0.0.0.0", port=8000)

Built-in Tools

Import from line.llm_agent:

from line.llm_agent import (
    end_call, send_dtmf, transfer_call, web_search,
    knowledge_base, mcp_tool, http_server_tool,
)

end_call

End the current call. Tell the LLM to say goodbye before calling this.

tools=[end_call]
# System prompt: "Say goodbye before ending the call with end_call."

send_dtmf

Send DTMF tones (touch-tone buttons). Useful for IVR navigation.

tools=[send_dtmf]
# Buttons: "0"-"9", "*", "#" (strings, not integers!)
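
If your own code builds a button sequence (for example, an extension stored in a database), validating it first avoids passing integers or stray characters. This is a minimal sketch; `to_dtmf_buttons` is a hypothetical helper and not part of the Line SDK.

```python
VALID_DTMF_BUTTONS = set("0123456789*#")


def to_dtmf_buttons(sequence: str) -> list[str]:
    """Split a string such as "42#" into single-character DTMF button strings."""
    buttons = list(sequence)
    invalid = [button for button in buttons if button not in VALID_DTMF_BUTTONS]
    if invalid:
        raise ValueError(f"Not valid DTMF buttons: {invalid}")
    return buttons


if __name__ == "__main__":
    print(to_dtmf_buttons("42#"))
```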

transfer_call

Transfer to another phone number (E.164 format required).

tools=[transfer_call]
# Example: +14155551234
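
Because the number must be in E.164 format, a quick format check before transferring can catch numbers with spaces, dashes, or a missing country code. The sketch below checks the shape only (a leading "+", a non-zero first digit, at most 15 digits); `is_e164` is a hypothetical helper and does not verify that the number is reachable.

```python
import re

# E.164 shape: "+" followed by 2 to 15 digits, the first of which is not 0.
E164_PATTERN = re.compile(r"\+[1-9]\d{1,14}")


def is_e164(number: str) -> bool:
    """Return True if number has the E.164 shape, e.g. +14155551234."""
    return E164_PATTERN.fullmatch(number) is not None


if __name__ == "__main__":
    for candidate in ["+14155551234", "4155551234", "+1 (415) 555-1234"]:
        print(candidate, is_e164(candidate))
```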

web_search

Search the web for real-time information. Uses the LLM's native web search when available and falls back to DuckDuckGo otherwise.

# Default settings
tools=[web_search]

# Custom settings
tools=[web_search(search_context_size="high")]  # "low", "medium", "high"

knowledge_base

Look up information from the agent's knowledge base via a natural-language query. Filters, top_k, and timeout_s are fixed at construction time — the LLM only chooses the query string.

# Default behavior — no filters
tools=[knowledge_base]

# Pre-filter every retrieval, override top_k, or run as a background lookup
tools=[knowledge_base(filters={"category": "billing"}, top_k=10)]
tools=[knowledge_base(description="Look up insurance policy terms.")]
tools=[knowledge_base(is_background=True)]

Prompt the agent to tell the user it is looking something up before calling the tool, since retrieval can take a moment. Raises KnowledgeBaseError (import from line) on failure.

mcp_tool

Expose a Model Context Protocol server to the LLM. Requires Python 3.10+ and the mcp package (already a Line dependency).

# Remote HTTP/SSE server
tools=[mcp_tool(name="dmcp", server_url="https://dmcp-server.deno.dev/sse")]

# Local stdio server
tools=[mcp_tool(name="memory", command="npx -y @modelcontextprotocol/server-memory")]

The LLM calls the tool with no arguments to list available tools, or with tool_name and tool_args to invoke one.
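
The two calling modes can be pictured as a small dispatcher. This is a conceptual sketch only, not how the SDK or the MCP client is implemented; `call_mcp` and the `roll` tool are hypothetical.

```python
from typing import Any, Callable, Optional


def call_mcp(
    tools: dict[str, Callable[..., Any]],
    tool_name: Optional[str] = None,
    tool_args: Optional[dict] = None,
) -> Any:
    """With no tool_name, list the available tools; otherwise invoke the named tool."""
    if tool_name is None:
        return sorted(tools)
    return tools[tool_name](**(tool_args or {}))


if __name__ == "__main__":
    # Hypothetical server exposing a single tool.
    tools = {"roll": lambda sides: f"rolled a d{sides}"}
    print(call_mcp(tools))
    print(call_mcp(tools, "roll", {"sides": 6}))
```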

http_server_tool

Create an HTTP/webhook tool from JSON schemas — no custom function needed. The LLM fills in the schema fields and the SDK makes the request. Properties with constant_value are hidden from the LLM and injected into every request; ${ENV_VAR} placeholders in auth are resolved from os.environ at build time.

create_ticket = http_server_tool(
    name="create_ticket",
    description="Creates a support ticket for the caller.",
    url="https://api.example.com/v1/{tenant_id}/tickets",  # {param} = path variable
    method="POST",
    request_body_schema={
        "type": "object",
        "required": ["subject", "priority"],
        "properties": {
            "subject": {"type": "string", "description": "Short summary."},
            "priority": {"type": "string", "enum": ["low", "medium", "high"]},
            "source": {"type": "string", "constant_value": "voice_agent"},  # hidden
        },
    },
    query_params_schema=None,   # same shape, scalar types only, for GET query params
    auth={"Authorization": "Bearer ${SUPPORT_API_KEY}"},
    content_type="application/json",  # or "application/x-www-form-urlencoded"
    timeout=5.0,
    is_background=True,  # default True
)

tools=[create_ticket, end_call]

The LLM always receives a structured JSON result, e.g. {"ok": true, "status": 201, "body": "..."} or {"ok": false, "status": 500, "error": "..."}.

Note: some Line docs/READMEs refer to this as webhook_tool; the exported function name is http_server_tool.
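
To make the constant_value and ${ENV_VAR} behavior concrete, here is a rough sketch of the two transformations described above. It is not the SDK's code: `split_constants` and `resolve_placeholders` are hypothetical names, and the sketch assumes a missing environment variable should raise `KeyError`.

```python
import copy
import re


def split_constants(schema: dict) -> tuple[dict, dict]:
    """Return (schema with constant_value properties removed, the constants)."""
    visible = copy.deepcopy(schema)
    constants = {}
    for name, prop in schema.get("properties", {}).items():
        if "constant_value" in prop:
            constants[name] = prop["constant_value"]
            del visible["properties"][name]
    return visible, constants


def resolve_placeholders(headers: dict, environ: dict) -> dict:
    """Replace ${NAME} placeholders in header values with values from environ."""
    pattern = re.compile(r"\$\{(\w+)\}")
    return {
        key: pattern.sub(lambda match: environ[match.group(1)], value)
        for key, value in headers.items()
    }


if __name__ == "__main__":
    schema = {
        "type": "object",
        "properties": {
            "subject": {"type": "string"},
            "source": {"type": "string", "constant_value": "voice_agent"},
        },
    }
    visible, constants = split_constants(schema)
    print(sorted(visible["properties"]), constants)
    print(resolve_placeholders({"Authorization": "Bearer ${SUPPORT_API_KEY}"},
                               {"SUPPORT_API_KEY": "example-key"}))
```

In this picture, the LLM only ever fills in the visible properties, and the constants are merged into the request body just before it is sent.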

Custom Tool Types

Three tool paradigms for different use cases:

| Type | Decorator | Use Case | Result Handling |
| --- | --- | --- | --- |
| Loopback | `@loopback_tool` | API calls, database lookups | Result sent back to LLM |
| Passthrough | `@passthrough_tool` | End call, transfer, DTMF | Bypasses LLM, goes to user |
| Handoff | `@handoff_tool` | Multi-agent workflows | Transfers control to another agent |

Tool Type Decision Tree

Does the result need LLM processing?
├─ YES → @loopback_tool
│   └─ Is it long-running (>1s)? → @loopback_tool(is_background=True)
│       └─ Yield interim status, then final result
├─ NO, deterministic action → @passthrough_tool
│   └─ Yields OutputEvent objects directly (AgentSendText, AgentEndCall, etc.)
└─ Transfer to another agent → @handoff_tool or agent_as_handoff()
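
The "yield interim status, then final result" branch relies on ordinary async generators. The sketch below shows only that mechanism, without the `@loopback_tool` decorator or `ctx` parameter so it runs on its own; `slow_search` and `collect` are hypothetical names, and the sleep stands in for a slow operation.

```python
import asyncio
from typing import AsyncIterator


async def slow_search(query: str) -> AsyncIterator[str]:
    """Yield an interim status right away, then the final result."""
    yield "Searching..."
    await asyncio.sleep(0.05)  # stand-in for a slow API call
    yield f"Finished searching for {query!r}"


async def collect(query: str) -> list[str]:
    """Consume the generator the way a caller would, in yield order."""
    return [update async for update in slow_search(query)]


if __name__ == "__main__":
    for update in asyncio.run(collect("red shoes")):
        print(update)
```

The first value is available to the consumer before the slow work starts, which is what lets the agent give immediate feedback.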

Loopback Tools

Results are sent back to the LLM to inform the next response:

from typing import Annotated
from line.llm_agent import loopback_tool, ToolEnv

@loopback_tool
async def get_order_status(
    ctx: ToolEnv,
    order_id: Annotated[str, "The order ID to look up"],
) -> str:
    """Look up the current status of an order."""
    order = await db.get_order(order_id)
    return f"Order {order_id} status: {order.status}, ETA: {order.eta}"

Parameter syntax:

First parameter MUST be ctx: ToolEnv
Use Annotated[type, "description"] for LLM-visible parameters
Tool description comes from the docstring
Optional parameters need default values (not just Optional[T])

@loopback_tool
async def search_products(
    ctx: ToolEnv,
    query: Annotated[str, "Search query"],
    category: Annotated[str, "Product category"] = "all",  # Optional with default
    limit: Annotated[int, "Max results"] = 10,
) -> str:
    """Search the product catalog."""
    ...
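
To see why defaults matter, it helps to look at what a function signature exposes. The sketch below uses only the standard library to read the `Annotated` descriptions and detect which parameters have defaults. It illustrates the general technique and is an assumption about how a tool schema could be derived, not the Line SDK's actual implementation; `describe_parameters` is a hypothetical helper.

```python
import inspect
from typing import Annotated, get_type_hints


def describe_parameters(func) -> dict:
    """Map each parameter (except ctx) to its description and whether it is required."""
    hints = get_type_hints(func, include_extras=True)
    params = {}
    for name, param in inspect.signature(func).parameters.items():
        if name == "ctx":
            continue
        metadata = getattr(hints.get(name), "__metadata__", ())
        params[name] = {
            "description": metadata[0] if metadata else "",
            # A parameter without a default value must be supplied by the caller.
            "required": param.default is inspect.Parameter.empty,
        }
    return params


async def search_products(
    ctx,
    query: Annotated[str, "Search query"],
    category: Annotated[str, "Product category"] = "all",
    limit: Annotated[int, "Max results"] = 10,
) -> str:
    """Search the product catalog."""
    return ""


if __name__ == "__main__":
    for name, info in describe_parameters(search_products).items():
        print(name, info)
```

A type such as `Optional[T]` changes only the annotation; without a default value, the parameter still shows up as required in the signature.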

Passthrough Tools

Results bypass the LLM and go directly to the user/system:

from typing import Annotated

from line.events import AgentSendText, AgentTransferCall
from line.llm_agent import passthrough_tool, ToolEnv

@passthrough_tool
async def transfer_to_support(
    ctx: ToolEnv,
    reason: Annotated[str, "Reason for transfer"],
):
    """Transfer the call to the support team."""
    yield AgentSendText(text="Let me transfer you to our support team now.")
    yield AgentTransferCall(target_phone_number="+18005551234")

Output event types (from line.events):

AgentSendText(text="...") - Speak text to user
AgentEndCall() - End the call
AgentTransferCall(target_phone_number="+1...") - Transfer call
AgentSendDtmf(button="5") - Send DTMF tone

Handoff Tools

Transfer control to another agent. See Multi-Agent Workflows.

Context Management

LlmAgent exposes a history object for injecting and transforming the conversation history the LLM sees.

agent = LlmAgent(model="gemini/gemini-2.5-flash-preview-09-2025", api_key=...)

# Inject a custom entry (defaults to role="user"; pass role="system" for a system note)
agent.history.add_entry("The customer's name is Alice and she has a premium account.")

# Anchor an insertion relative to an existing event
agent.history.add_entry("Reminder: stay concise.", role="system", after=some_event)

# Replace a segment of history with new events (filtering, summarization, etc.)
agent.history.update(new_events, start=first_event, end=last_event)

Entries are inserted lazily and survive across turns. Inside a tool you can call agent.history.add_entry(...) to persist rich context fetched from an external API.

Note: some Line READMEs show agent.add_history_entry(...) / agent.set_history_processor(...). The implemented API is agent.history.add_entry(...) and agent.history.update(...).

Model Selection Strategy

Use FAST models for the main conversational agent:

gemini/gemini-2.5-flash-preview-09-2025 (recommended)
anthropic/claude-haiku-4-5-20251001
gpt-5-nano

Use POWERFUL models only via background tool calls for complex reasoning:

anthropic/claude-opus-4-5
gpt-5.2

This pattern keeps conversations responsive while accessing deep reasoning when needed. See the Two-Tier Agent Pattern in Advanced Patterns for implementation.

LLM Providers

Line SDK uses LiteLLM model strings. Common formats:

| Provider | Format | Example |
| --- | --- | --- |
| OpenAI | `model_name` | `gpt-5.2`, `gpt-5-nano` |
| Anthropic | `anthropic/model_name` | `anthropic/claude-sonnet-4-5`, `anthropic/claude-haiku-4-5-20251001` |
| Google Gemini | `gemini/model_name` | `gemini/gemini-2.5-flash-preview-09-2025` |
| Azure OpenAI | `azure/deployment_name` | `azure/my-deployment` |

Set the appropriate API key environment variable:

OPENAI_API_KEY
ANTHROPIC_API_KEY
GEMINI_API_KEY
AZURE_API_KEY

Full list: https://docs.litellm.ai/docs/providers

Common Patterns

Agent with Custom Tools

import os
from typing import Annotated

from line.llm_agent import LlmAgent, LlmConfig, loopback_tool, end_call, ToolEnv
from line.voice_agent_app import AgentEnv, CallRequest

@loopback_tool
async def check_appointment(
    ctx: ToolEnv,
    date: Annotated[str, "Date in YYYY-MM-DD format"],
) -> str:
    """Check available appointment slots for a given date."""
    slots = await calendar.get_available_slots(date)
    return f"Available slots on {date}: {', '.join(slots)}"

@loopback_tool
async def book_appointment(
    ctx: ToolEnv,
    date: Annotated[str, "Date in YYYY-MM-DD format"],
    time: Annotated[str, "Time in HH:MM format"],
    name: Annotated[str, "Customer name"],
) -> str:
    """Book an appointment slot."""
    result = await calendar.book(date, time, name)
    return f"Appointment booked for {name} on {date} at {time}. Confirmation: {result.id}"

async def get_agent(env: AgentEnv, call_request: CallRequest):
    return LlmAgent(
        model="gemini/gemini-2.5-flash-preview-09-2025",
        api_key=os.getenv("GEMINI_API_KEY"),
        tools=[check_appointment, book_appointment, end_call],
        config=LlmConfig(
            system_prompt="""You are an appointment scheduling assistant.
Help users check availability and book appointments.
Always confirm the booking details before finalizing.""",
            introduction="Hi! I can help you schedule an appointment. What date works for you?",
        ),
    )

Wait for User to Speak First

Set introduction="" to have the agent wait for the user:

config=LlmConfig(
    system_prompt="You are a helpful assistant.",
    introduction="",  # Empty string = wait for user
)

Form Filling Pattern

See the form filler example for collecting structured data via voice. Key pattern:

@loopback_tool
async def record_answer(
    ctx: ToolEnv,
    answer: Annotated[str, "The user's answer"],
) -> dict:
    """Record an answer to the current question."""
    # Process and validate answer
    # Return next question or completion status
    return {"next_question": "What is your email?", "is_complete": False}
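
The state behind such a tool can be kept in a small object that tracks which question is current. This is a minimal sketch under stated assumptions: `FormState` is a hypothetical class (not taken from the form filler example), questions are asked in a fixed order, and validation is reduced to stripping whitespace.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FormState:
    questions: list[str]
    answers: list[str] = field(default_factory=list)

    def record_answer(self, answer: str) -> dict:
        """Store the answer to the current question and report what comes next."""
        if len(self.answers) < len(self.questions):
            self.answers.append(answer.strip())
        if len(self.answers) >= len(self.questions):
            return {"next_question": None, "is_complete": True}
        next_question: Optional[str] = self.questions[len(self.answers)]
        return {"next_question": next_question, "is_complete": False}


if __name__ == "__main__":
    form = FormState(["What is your name?", "What is your email?"])
    print(form.record_answer(" Alice "))
    print(form.record_answer("alice@example.com"))
    print(form.answers)
```

A loopback tool could hold one `FormState` per call and return the dictionary from `record_answer` to the LLM.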

Common Mistakes to Avoid

Missing end_call tool - Without end_call (or a similar custom tool), the agent cannot end the call on its own and must wait for the user to hang up.

Raising exceptions in tools - Return user-friendly error strings:

# BAD
raise ValueError("Invalid order ID")

# GOOD
return "I couldn't find that order. Please check the ID and try again."

Forgetting ctx parameter - First parameter must be ctx: ToolEnv:

# GOOD
@loopback_tool
async def my_tool(ctx: ToolEnv, order_id: Annotated[str, "Order ID"]): ...

Forgetting event in handoff tools - Handoff tools MUST have event parameter:

# GOOD
@handoff_tool
async def my_handoff(ctx: ToolEnv, param: Annotated[str, "desc"], event): ...

Missing Annotated descriptions - LLM needs parameter descriptions:

# GOOD
@loopback_tool
async def my_tool(ctx: ToolEnv, order_id: Annotated[str, "The order ID to look up"]): ...

Blocking on long operations - Use is_background=True and yield interim status:

@loopback_tool(is_background=True)
async def slow_search(ctx: ToolEnv, query: Annotated[str, "Query"]):
    yield "Searching..."  # Immediate feedback
    result = await slow_operation()
    yield result

Using sync APIs directly - Wrap sync calls with asyncio.to_thread():

```
result = await asyncio.to_thread(sync_api_call, params)
```

Using slow models for main conversation - Use fast models (haiku, flash, mini) for the main agent, powerful models only via background tools.
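
Two of the points above, returning friendly error strings and wrapping sync calls, combine naturally in one tool body. The sketch below omits the `@loopback_tool` decorator and `ctx` parameter so it runs standalone; `blocking_lookup` is a hypothetical stand-in for a synchronous client, and the "ORD-" prefix rule is invented for the example.

```python
import asyncio
import time


def blocking_lookup(order_id: str) -> str:
    """Stand-in for a synchronous API call (hypothetical)."""
    time.sleep(0.05)
    if not order_id.startswith("ORD-"):
        raise ValueError("Invalid order ID")
    return "shipped"


async def get_order_status(order_id: str) -> str:
    try:
        # Run the blocking call in a worker thread so the event loop stays free.
        status = await asyncio.to_thread(blocking_lookup, order_id)
    except ValueError:
        # Return a speakable message instead of letting the exception propagate.
        return "I couldn't find that order. Please check the ID and try again."
    return f"Order {order_id} status: {status}"


if __name__ == "__main__":
    print(asyncio.run(get_order_status("ORD-1001")))
    print(asyncio.run(get_order_status("1001")))
```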

Reference Documentation

In this skill:

Tool Patterns - Deep dive on tool implementation
Multi-Agent Workflows - Handoffs, wrappers, guardrails
Advanced Patterns - Background tools, state, events
Calls API - WebSocket integration for web/mobile apps
Troubleshooting - Common issues and debugging

On docs.cartesia.ai:

SDK Overview — architecture and installation
Tools Guide — tool types in depth
Agents Guide — LlmAgent, custom agents, conversation loop
Events Reference — input/output events
CLI Reference — deploy, env, agents, calls

Key Imports

# Core
from line.llm_agent import LlmAgent, LlmConfig
from line.voice_agent_app import VoiceAgentApp, AgentEnv, CallRequest

# Built-in tools
from line.llm_agent import (
    end_call, send_dtmf, transfer_call, web_search,
    knowledge_base, mcp_tool, http_server_tool,
)

# Tool decorators
from line.llm_agent import loopback_tool, passthrough_tool, handoff_tool

# Tool context
from line.llm_agent import ToolEnv

# Multi-agent
from line.llm_agent import agent_as_handoff

# Knowledge base (errors / client)
from line import KnowledgeBase, KnowledgeBaseError

# Events (for passthrough/handoff tools and custom agents)
from line.events import (
    AgentSendText,
    AgentEndCall,
    AgentTransferCall,
    AgentSendDtmf,
    AgentUpdateCall,
    AgentSendCustom,
    CustomHistoryEntry,
    HistoryEvent,
)

Key Reference Files

When implementing Line SDK agents, reference these example files in the cartesia-ai/line repo:

examples/basic_chat/main.py - Simplest agent pattern (web_search)
examples/form_filler/ - Loopback tools with state
examples/chat_supervisor/main.py - Background tools with two-tier model strategy
examples/transfer_agent/main.py - Multi-agent handoffs
examples/transfer_phone_call/main.py - IVR navigation & phone transfers
examples/guardrails_wrapper/ - Wrapping an agent with guardrails
examples/sales_with_leads/ - Stateful lead extraction + research
examples/echo/tools.py - Custom handoff tools
example_integrations/ - Exa, Tavily, Cerebras, Browserbase integrations

Related Cartesia skill

For direct HTTP/WebSocket integration (TTS/STT/voices in your own backend, SDKs, optional MCP)—not Line deployment—use cartesia-api in this repository.
