How a Message Executes
When a user sends a message, the Runtime processes it through a structured pipeline before returning a response. The cycle repeats until the agent produces a final text response or reaches the iteration limit. The default is 10 tool call iterations per turn, configurable in the agent’s execution settings.
Channels
The Runtime accepts messages from the following channels:

Realtime (WebSocket or persistent connection):

| Channel | Description |
|---|---|
| Web Chat | Browser-based chat widget |
| SDK | JavaScript/mobile SDK over WebSocket |
| API | Direct REST API call |
| Voice | Generic voice, Twilio, LiveKit, and real-time voice pipeline |
| A2A | Agent-to-Agent protocol for cross-service agent calls |

| Channel | Description |
|---|---|
| Meta Cloud, Gupshup, Infobip, Netcore | |
| Slack | Slack app |
| Microsoft Teams | MS Teams bot |
| Messenger | Meta Messenger |
| Instagram Direct | |
| Telegram | Telegram bot |
| SMS | Twilio SMS |
| Inbound email | |
| Zendesk | Zendesk ticket and chat |
| Genesys | Genesys Cloud contact center |
| HTTP Async | Generic webhook-based integration |
Tool Execution
Tools extend agent capabilities. When the LLM requests a tool call, the Runtime dispatches it to the appropriate executor and returns the result to the reasoning loop.

| Tool type | What it does |
|---|---|
| HTTP | Calls an external REST or GraphQL API with optional auth injection |
| MCP | Connects to a Model Context Protocol server; tools are discovered from the server’s capability manifest |
| Code | Executes JavaScript or Python in an isolated sandbox |
| Connector | Uses a named integration (Salesforce, Jira, and others) with credential injection |
| Workflow | Invokes a registered workflow; supports both synchronous and long-running async execution |
| Knowledge Base | Queries a SearchAI knowledge base and returns ranked results |
| Async Webhook | Sends a request to an external system and suspends the session until a callback is received |
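
The dispatch step described above, combined with the per-turn iteration limit, can be sketched roughly as follows. This is a minimal illustration, not the Runtime's implementation: `run_turn`, the `Step` tuple shape, the `executors` mapping, and the fallback string are all hypothetical names chosen for the example.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

MAX_TOOL_ITERATIONS = 10  # default per-turn limit stated in this document

# Hypothetical shape of one reasoning step:
#   ("tool", tool_type, args)  or  ("final", text, None)
Step = Tuple[str, str, Optional[Dict[str, Any]]]


def run_turn(
    next_step: Callable[[List[Any]], Step],
    executors: Dict[str, Callable[[Dict[str, Any]], Any]],
    max_iterations: int = MAX_TOOL_ITERATIONS,
) -> str:
    """Loop until the model returns final text or the iteration limit is hit."""
    results: List[Any] = []  # tool results fed back into the reasoning loop
    for _ in range(max_iterations):
        kind, value, args = next_step(results)
        if kind == "final":
            return value
        executor = executors[value]  # dispatch on tool type, e.g. "http"
        results.append(executor(args or {}))
    # Assumption: what the Runtime actually returns at the limit is not documented here.
    return "iteration limit reached"
```

Keeping the executors in a mapping keyed by tool type mirrors the table: adding a new tool type means registering one more executor rather than changing the loop.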
Execution Pipeline
When a tool executes, the Runtime processes it in sequence:

- Resolves the tool binding from the deployment configuration.
- Validates inputs against declared parameter types before the call.
- Makes the external call with the appropriate authentication.
- Processes the result, which becomes available in conversation context for reasoning agents, or as session variables for agents with steps.
- Handles errors using ON_ERROR handlers: retry logic, fallback responses, or escalation triggers.
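
The sequence above can be approximated in a few lines of Python. This is a sketch under stated assumptions: the `bindings` dictionary layout (`params`, `call`, `fallback`), the `max_retries` parameter, and the returned dictionary shape are hypothetical, and authentication injection and escalation are omitted.

```python
from typing import Any, Callable, Dict


def execute_tool(
    name: str,
    args: Dict[str, Any],
    bindings: Dict[str, Dict[str, Any]],
    max_retries: int = 2,
) -> Dict[str, Any]:
    # 1. Resolve the tool binding (here: a plain dictionary lookup).
    binding = bindings[name]

    # 2. Validate inputs against the declared parameter types before the call.
    for param, expected in binding["params"].items():
        if not isinstance(args.get(param), expected):
            raise TypeError(f"{param} must be {expected.__name__}")

    # 3-4. Make the call and wrap the result for the caller.
    # 5. On failure, retry; once retries are exhausted, return the fallback.
    call: Callable[..., Any] = binding["call"]
    last_error = ""
    for _ in range(max_retries + 1):
        try:
            return {"ok": True, "result": call(**args)}
        except Exception as exc:  # broad catch is deliberate in this sketch
            last_error = str(exc)
    return {"ok": False, "fallback": binding.get("fallback"), "error": last_error}
```

Validation runs before the first attempt, so a badly typed argument fails immediately instead of consuming retries.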
Session Management
Each conversation is represented as a session. A session stores the conversation history, variables, agent state, and execution metadata.
Session Lifecycle
| State | Description |
|---|---|
| Active | Processing a request |
| Idle | Waiting for the next user message |
| Completed | Conversation ended normally |
| Failed | Terminated due to a runtime error |
| Abandoned | Ended due to user inactivity |
| Escalated | Transferred to a human agent |
| Archived | Retained for historical reference |
Conversation Window and Compaction
The Runtime maintains a sliding window over the conversation history to control how much context is sent to the LLM on each turn. The default window is 40 messages. When the window fills, the Runtime can compact older turns into a summary rather than discarding them. This preserves context from earlier in long conversations without increasing token usage. Compaction is disabled by default and can be enabled in Runtime Config in Studio.

| Setting | Default | Description |
|---|---|---|
| Conversation window | 40 messages | Maximum messages sent to the LLM per turn |
| Compaction | Disabled | When enabled, summarizes older turns as the window fills |
| Compaction threshold | 80% | How full the window must be before compaction triggers |
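
To make the interaction between the three settings concrete, here is a small sketch using the documented defaults (40 messages, 80% threshold). The function name, the `keep_recent` parameter, and the placeholder summarizer are assumptions for illustration; how many recent messages the Runtime keeps verbatim, and how it produces the summary, are not specified in this document.

```python
from typing import Callable, List

WINDOW = 40      # default conversation window, in messages
THRESHOLD = 0.8  # default compaction threshold


def apply_window(
    messages: List[str],
    compaction: bool = False,
    window: int = WINDOW,
    threshold: float = THRESHOLD,
    keep_recent: int = 10,  # hypothetical: messages kept verbatim after compaction
    summarize: Callable[[List[str]], str] = lambda older: f"[summary of {len(older)} earlier messages]",
) -> List[str]:
    """Return the messages that would be sent to the LLM on this turn."""
    if not compaction:
        return messages[-window:]            # older messages are simply dropped
    if len(messages) < round(window * threshold):
        return messages                      # below the threshold: nothing to compact
    older, recent = messages[:-keep_recent], messages[-keep_recent:]
    return [summarize(older)] + recent       # older turns collapse into one summary
```

With the defaults, compaction first triggers at 32 messages (80% of 40), at which point the 22 oldest messages are replaced by a single summary entry and the 10 most recent are kept as-is.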
Concurrency
The Runtime uses a configurable strategy to handle multiple messages arriving within the same session:

| Strategy | Behavior | When to use |
|---|---|---|
| Serial (default) | Messages are queued and processed one at a time, in order. | Most conversational agents — ensures each message has full context from the previous one. |
| Preemptive | A new message cancels in-progress execution and starts fresh. | Real-time interfaces where the user may correct themselves mid-response. |
| Parallel | Multiple messages process simultaneously. | Batch operations where messages are independent. |
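
The behavioral difference between the three strategies can be illustrated with `asyncio`. This is only a sketch: `SessionDispatcher` and its `submit` method are hypothetical names, and the Runtime's actual queuing and cancellation mechanics are not described in this document.

```python
import asyncio
from typing import Any, Callable, Coroutine, Optional

Handler = Callable[[str], Coroutine[Any, Any, None]]


class SessionDispatcher:
    """Hypothetical per-session dispatcher; create it inside a running event loop."""

    def __init__(self, handler: Handler, strategy: str = "serial") -> None:
        self.handler = handler
        self.strategy = strategy
        self._lock = asyncio.Lock()  # serial: one message at a time
        self._current: Optional["asyncio.Task[None]"] = None  # preemptive: in-flight work

    async def submit(self, message: str) -> None:
        if self.strategy == "serial":
            async with self._lock:  # later messages wait their turn, in order
                await self.handler(message)
        elif self.strategy == "preemptive":
            if self._current is not None and not self._current.done():
                self._current.cancel()  # drop the in-progress execution
            task = asyncio.create_task(self.handler(message))
            self._current = task
            try:
                await task
            except asyncio.CancelledError:
                pass  # this message was superseded by a newer one
        else:  # "parallel": no coordination between messages
            await self.handler(message)
```

Submitting two messages at once with `asyncio.gather` shows the contrast: serial finishes the first before starting the second, preemptive completes only the latest, and parallel starts both before either finishes.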
Multi-Agent Orchestration
The Runtime executes multi-agent topologies defined in ABL. When routing rules match, the Runtime transitions the active thread to the target agent, forwards context, and manages the return path.

| Pattern | What happens at runtime |
|---|---|
| Supervisor | Receives every message; evaluates HANDOFF rules top-to-bottom; routes to first match |
| Handoff | Transfers conversation to the target agent; optionally returns control when RETURN: true |
| Delegate | Sends a task to a sub-agent; blocks the parent until the sub-agent completes or times out |
| Fan-out | Dispatches multiple agents in parallel; merges results when all complete |
| Escalation | Transfers the conversation to a human agent via a connected agent desktop |
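
The Supervisor's first-match evaluation can be modeled in plain Python. This is not ABL syntax; the `Rule` tuple, the `route` function, and the agent names are hypothetical stand-ins used only to show why rule order matters.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical rule shape: (condition on the message, target agent name)
Rule = Tuple[Callable[[str], bool], str]


def route(message: str, rules: List[Rule]) -> Optional[str]:
    """Evaluate rules top-to-bottom and return the first matching target."""
    for condition, target in rules:
        if condition(message):
            return target
    return None  # no rule matched


broad_first: List[Rule] = [
    (lambda m: "account" in m, "general"),  # broad condition placed first
    (lambda m: "refund" in m, "billing"),   # specific condition never reached
]
specific_first: List[Rule] = list(reversed(broad_first))
```

For the message `"refund on my account"`, `route` returns `"general"` with `broad_first` and `"billing"` with `specific_first`, because the first matching rule wins in both cases.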
Thread Hierarchy
When a supervisor hands off to a specialist, the Runtime creates a thread within the existing session, not a new session. Threads form a stack: handoffs push new threads, completions pop back to the parent. The user experiences one continuous conversation regardless of how many agents participate. Each thread maintains its own conversation history and gathered variables, but can read data from parent threads. For configuration, syntax reference, and examples, see Multi-Agent Orchestration.
Observability
Every execution path emits structured trace events. Traces are accessible from the Sessions page in Studio.

| Event type | What it captures |
|---|---|
| llm_call | LLM invocation: model used, token count, latency |
| tool_call | Tool invocation request with input parameters |
| tool_result | Tool execution result or error |
| decision | Routing or flow decision outcome |
| handoff | Agent-to-agent transfer |
| state_change | Session variable update |
| guardrail_eval | Guardrail policy evaluation result |
| error | Runtime or execution error |
Rate Limits
The Runtime enforces per-tenant rate limits on a rolling 1-minute window.

| Dimension | Default limit |
|---|---|
| Requests | 100 / min |
| LLM tokens | 100,000 / min |
| Concurrent sessions | 50 |
| Tool calls | 200 / min |
| Messages per session | 30 / min |

Requests that exceed a limit are rejected with HTTP 429. The response includes a Retry-After header (seconds to wait) and X-RateLimit-Remaining. Contact your account team to adjust limits for your plan.
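
A client can honor the Retry-After header along these lines. This is a transport-agnostic sketch: `send_with_retry`, the `Response` tuple, and the one-second default when the header is missing are assumptions, and `send` stands in for whatever HTTP call your client actually makes.

```python
import time
from typing import Callable, Dict, Tuple

# Hypothetical response shape: (status code, headers, body)
Response = Tuple[int, Dict[str, str], str]


def send_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send(); on HTTP 429, wait Retry-After seconds and try again."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    response = send()
    for _ in range(max_attempts - 1):
        status, headers, _body = response
        if status != 429:
            break
        # The header is documented as a number of seconds to wait.
        sleep(float(headers.get("Retry-After", "1")))
        response = send()
    return response
```

Passing `sleep` as a parameter keeps the function testable without real delays; if every attempt is rate limited, the last 429 response is returned so the caller can decide what to do next.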
Limits Reference
| Limit | Value |
|---|---|
| Request body size | 1 MB |
| WebSocket message size | 512 KB |
| Tool iterations per turn | 10 (default) |
| Conversation window | 40 messages |
| Session TTL (inactivity) | 24 hours |
| Conversation history retention | 90 days |
Troubleshooting
Agent stops responding mid-conversation
Open the session in Studio and check the Traces tab. Look for an error event or a tool_result with a non-200 status. Tool failures that aren’t handled by an ON_FAILURE block can cause the reasoning loop to stall.
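
If you have trace events in hand as data, the same check can be scripted. The sketch below assumes each event is a dictionary with `type` and `status` fields; those field names are hypothetical, since the trace event schema is not given in this document.

```python
from typing import Any, Dict, List


def find_failures(events: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return error events and tool_result events with a non-200 status."""
    return [
        event
        for event in events
        if event.get("type") == "error"
        or (event.get("type") == "tool_result" and event.get("status") != 200)
    ]
```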
Earlier context is missing in long conversations
Once the conversation window fills (40 messages by default), older messages are no longer sent to the LLM. Enable compaction in Runtime Config to preserve earlier context as a summary rather than dropping it.
Tool calls return unexpected results
Check the tool_call and tool_result trace events in Studio for the exact request sent and response received. Verify that the tool’s auth profile and endpoint are correct in Settings → Connections.
HTTP 429 — Too Many Requests
The Retry-After header tells you how many seconds to wait before retrying. If you hit limits consistently during evaluation, check whether multiple concurrent test sessions are sharing the same tenant quota.
Agents route to the wrong specialist
HANDOFF rules evaluate top-to-bottom and the first match wins. Check rule ordering in the Supervisor — overly broad conditions placed above specific ones will capture requests before the intended rule is reached. See Orchestration Troubleshooting.