Testing, deployment & operations
Studio provides a complete set of pages for evaluating agent quality, deploying agents to production environments, and monitoring live conversations. These capabilities span the Evaluate and Operate groups in the project sidebar.
Evaluations
Evaluations provide a systematic framework for testing and measuring agent quality. Studio’s evaluations system lets you define test personas, create realistic scenarios, configure automated evaluators, bundle them into evaluation sets, and run evaluations against your agents.
Evaluations page
Navigate to Evaluate > Evaluations from the project sidebar. The evaluations page is organized into five tabs:
| Tab | Purpose |
|---|---|
| Personas | Define synthetic user profiles that simulate real customers |
| Scenarios | Create test conversation scripts and situations |
| Evaluators | Configure automated judges that score agent responses |
| Eval Sets | Bundle personas, scenarios, and evaluators into reusable test suites |
| Runs | Execute evaluations and review results |
A Quick Eval button in the header provides a shortcut for running a fast, ad-hoc evaluation without setting up a full eval set.
Personas
Personas represent the types of users who interact with your agents. Each persona defines a user profile that shapes how test conversations unfold.
To create a persona, select the Personas tab, click Create Persona, and fill in:
- Name — a descriptive label (e.g., “Frustrated Customer,” “Technical Expert,” “New User”).
- Description — background information about this user type.
- Traits — behavioral characteristics that influence conversation style (e.g., impatient, detail-oriented, non-technical).
- Goals — what this persona is trying to achieve when talking to the agent.
- Context — additional background information (account type, history, preferences).
Tip: Create personas that represent your actual user segments. Include edge-case personas (e.g., users with accessibility needs, users who speak in short phrases, users who provide irrelevant information) to test agent robustness.
Scenarios
Scenarios define specific situations or conversation flows to test. To create a scenario, select the Scenarios tab, click Create Scenario, and configure:
- Name — a descriptive title (e.g., “Password Reset Request,” “Product Return with Missing Receipt”).
- Description — the situation being tested.
- Initial message — the opening message that starts the test conversation.
- Expected flow — key steps or outcomes the conversation should reach.
- Success criteria — what constitutes a successful resolution.
- Variables — dynamic data used in the scenario (order numbers, account IDs, etc.).
Evaluators
Evaluators are automated judges that score agent performance on specific dimensions. To create an evaluator, select the Evaluators tab, click Create Evaluator, and configure:
- Name — the quality dimension being measured (e.g., “Helpfulness,” “Accuracy,” “Tone”).
- Description — what this evaluator assesses.
- Scoring rubric — criteria for each score level (e.g., 1-5 scale with descriptions for each level).
- Evaluation prompt — the instructions given to the LLM judge for scoring.
Studio provides templates for common evaluation dimensions: Helpfulness, Accuracy, Tone, Completeness, and Efficiency.
Eval sets
Eval sets bundle personas, scenarios, and evaluators into reusable test suites. To create an eval set, select the Eval Sets tab, click Create Eval Set, and configure a name, description, and selections for personas, scenarios, and evaluators.
The evaluation engine runs every combination of persona and scenario, scoring each conversation with all selected evaluators.
Tip: Start with a small eval set (2-3 personas, 3-5 scenarios, 2-3 evaluators) to validate your setup before scaling to larger test suites.
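Because the engine pairs every persona with every scenario, run size grows multiplicatively. The following is a minimal Python sketch of that arithmetic, using example names from this page. The variable names, and the assumption that each evaluator scores each conversation exactly once, are illustrative and do not describe Studio's internals.

```python
from itertools import product

# Example selections for a small eval set (names taken from this page).
personas = ["Frustrated Customer", "Technical Expert", "New User"]
scenarios = ["Password Reset Request", "Product Return with Missing Receipt"]
evaluators = ["Helpfulness", "Accuracy", "Tone"]

# Every persona-scenario pair becomes one simulated conversation.
conversations = list(product(personas, scenarios))

# Assumption: each conversation is scored once by every selected evaluator.
judge_calls = len(conversations) * len(evaluators)

print(len(conversations))  # 6
print(judge_calls)         # 18
```

Adding one more persona to this set adds two conversations and six judge calls, which is why a small set is a sensible first step.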
Running evaluations
To start a run, select the Runs tab, click Run Evaluation (or use Quick Eval for an ad-hoc run), select the eval set, choose the target agent and environment, and click Start.
Each evaluation run follows this pipeline:
- Test data generation — creates conversations from persona-scenario combinations.
- Conversation execution — simulates conversations with the agent using each persona-scenario pair.
- LLM judge evaluation — runs each evaluator against completed conversations to generate scores.
- Recommendation generation — analyzes results and produces improvement recommendations.
Active runs display progress indicators showing total conversations to execute, completed vs. remaining, and current pipeline step.
Reviewing results
Completed runs show an overview with overall score averages, score distribution (high, medium, low), and pass/fail rates based on configured thresholds. Drill into individual conversations to see full transcripts, per-evaluator scores with justifications, and highlighted issues.
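To make the overview metrics concrete, here is a rough Python sketch of how an average, a high/medium/low distribution, and a pass rate could be derived from a list of scores. The cutoffs, the pass threshold, and the `summarize` function are hypothetical values on a 1-5 scale, not Studio's actual defaults.

```python
from statistics import mean

def summarize(scores, pass_threshold=3.5, low_below=2.5, high_from=4.0):
    """Summarize conversation scores; all thresholds are illustrative."""
    distribution = {"high": 0, "medium": 0, "low": 0}
    for score in scores:
        if score >= high_from:
            distribution["high"] += 1
        elif score < low_below:
            distribution["low"] += 1
        else:
            distribution["medium"] += 1
    passed = sum(1 for score in scores if score >= pass_threshold)
    return {
        "average": mean(scores),
        "distribution": distribution,
        "pass_rate": passed / len(scores),
    }

summary = summarize([4.5, 3.0, 2.0, 4.0])
print(summary)
```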
Comparison view: compare results across multiple runs to track improvement over time or to A/B test different agent configurations.
Heatmap view: displays scores for every persona-scenario combination, making it easy to spot patterns such as a persona that consistently scores low or a scenario that causes failures.
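The kind of pattern a persona-by-scenario grid surfaces can be sketched in a few lines of Python. The sample scores and the `results` layout below are made up for illustration; they are not an export format that Studio defines.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (persona, scenario, score) rows from a finished run.
results = [
    ("Frustrated Customer", "Password Reset Request", 2.0),
    ("Frustrated Customer", "Product Return", 2.5),
    ("New User", "Password Reset Request", 4.5),
    ("New User", "Product Return", 4.0),
]

# Pivot the rows into a persona -> scenario -> score grid.
grid = defaultdict(dict)
for persona, scenario, score in results:
    grid[persona][scenario] = score

# A persona whose row average is lowest is a candidate for closer review.
persona_avg = {persona: mean(row.values()) for persona, row in grid.items()}
weakest = min(persona_avg, key=persona_avg.get)
print(weakest, persona_avg[weakest])
```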
Quick Eval provides a streamlined path for fast evaluations: select an agent, choose a few scenarios (or let the system auto-generate them), and run. Quick Eval skips the full eval set configuration and produces results faster, making it useful for iterative development.
Deployment & channels
Deployment moves your agents from development into environments where real users can interact with them. Studio manages the deployment pipeline, environment configuration, channel setup, and API key management.
Deployments page
Navigate to Operate > Deployments from the project sidebar. The page is organized into three tabs:
| Tab | Purpose |
|---|---|
| Environments | Manage deployment environments and active deployments |
| Channels | Configure communication channels (web, voice, API) |
| API Keys | Create and manage API keys for programmatic access |
Environments
Environments represent deployment targets where agents run. Common environments include development, staging, and production. Each environment card displays the environment name, status badge (active, inactive, or deploying), entry agent, and created timestamp.
Creating a deployment:
- Select the Environments tab.
- Click Deploy Agent.
- Configure the environment, entry agent (the initial conversation entry point), and version (defaults to the current working copy).
- Click Deploy.
Promoting deployments — to promote a deployment from one environment to another (e.g., staging to production), click the promote action on the deployment card, select the target environment, optionally copy environment-specific variables, and confirm.
Environment variables — each environment can have its own set of configuration variables that override project-level defaults. Use environment variables for API endpoints that differ between staging and production, feature flags, and environment-scoped credentials.
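The override behavior can be pictured as a dictionary merge in which environment values win over project defaults. This Python sketch assumes that simple precedence rule; the variable names and URLs are placeholders, not values Studio defines.

```python
# Hypothetical project-level defaults.
project_defaults = {
    "API_BASE_URL": "https://api.staging.example.com",
    "ENABLE_BETA_FEATURES": "false",
    "TIMEOUT_SECONDS": "30",
}

# Hypothetical overrides configured on the production environment.
production_overrides = {
    "API_BASE_URL": "https://api.example.com",
}

def resolve(defaults, overrides):
    """Return the effective config: overrides replace defaults key by key."""
    return {**defaults, **overrides}

effective = resolve(project_defaults, production_overrides)
print(effective["API_BASE_URL"])     # https://api.example.com
print(effective["TIMEOUT_SECONDS"])  # 30
```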
Channels
Channels define how users connect to your deployed agents. Studio supports multiple channel types:
Web channel — deploy a web chat widget that can be embedded in your website. Configure the widget appearance (chat bubble position and color, welcome message, branding options) and copy the embed code snippet.
Voice channel — connect agents to voice interfaces. Configure speech-to-text provider, text-to-speech voice selection, latency targets, and voice interaction parameters.
API channel — expose agents through a REST API. Configure authentication method, rate limits, and response format preferences. Use the provided endpoint and keys for programmatic access.
Each channel card displays the channel type and name, active/inactive status, and configuration summary. Click a card to view or edit its detailed configuration.
API keys
The API Keys tab manages keys used for programmatic access to deployed agents.
To create an API key, click Create API Key, then configure a name, permissions scope, and optional expiration date. Copy the generated key immediately, because it is shown only once. From the keys list you can view active keys with usage metadata, revoke keys, and rotate a key by creating a new one and then revoking the old one.
Warning: Store API keys securely. Do not embed them in client-side code or commit them to version control.
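One common way to follow this advice in server-side code is to read the key from an environment variable at startup and to redact it in logs. The sketch below assumes a variable named `STUDIO_API_KEY`, which is a hypothetical name chosen for illustration; use whatever your secret manager or deployment tooling provides.

```python
import os

def load_api_key(env_var="STUDIO_API_KEY"):
    """Read the key from the environment; the variable name is an assumption."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before running.")
    return key

def redact(key, visible=4):
    """Mask all but the last few characters so the key is safe to log."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]

print(redact("sk-test-1234"))  # ********1234
```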
Deployment pipeline
The deployment pipeline supports a structured flow from development to production:
- Development — agents run in a development environment for initial testing.
- Staging — pre-production environment for integration testing and validation.
- Production — live environment serving real users.
Agent versions can be promoted through these stages, with each promotion creating an auditable record.
Operations
Operations pages provide tools for monitoring, troubleshooting, and intervening in live agent conversations.
Session browser
Navigate to Operate > Sessions from the project sidebar. The session browser shows all conversations between users and agents in your project.
Sessions list — conversations are displayed in a sortable, filterable table with columns for Session ID (click to copy), Agent Name, Created At, Message Count, and Trace Event Count. Use the date range filter (Last 24h, Last 48h, This Week, Last 7 Days, This Month, Last 30 Days, All), column sorting, and pagination (20 per page) to find specific sessions.
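With a fixed page size of 20, the number of pages and the rows on a given page follow from simple arithmetic. The helper names in this Python sketch are illustrative and are not part of any Studio API.

```python
import math

PAGE_SIZE = 20  # page size stated for the sessions list

def page_count(total_sessions):
    """Number of pages needed to show all sessions."""
    return math.ceil(total_sessions / PAGE_SIZE)

def page_slice(sessions, page):
    """Return the sessions shown on a 1-indexed page."""
    start = (page - 1) * PAGE_SIZE
    return sessions[start:start + PAGE_SIZE]

session_ids = list(range(45))
print(page_count(len(session_ids)))     # 3
print(len(page_slice(session_ids, 3)))  # 5
```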
The sessions page provides two tabs: Conversations (the session table view) and Traces (a dedicated trace viewer for exploring execution traces across all sessions).
Session detail view — click any session row to open the session detail page:
- Conversation tab — full conversation transcript, agent conversation tree visualization showing branching across agents in multi-agent projects, and session summary panel with metadata.
- Trace tab — execution trace timeline showing every action the agent took, including LLM calls, tool invocations, handoffs, state changes, and errors. Each event shows timing information and expandable request/response payloads.
Tip: Use the trace tab to diagnose why an agent behaved unexpectedly. Trace events show the complete decision chain, including which tools were called, what the LLM reasoned, and where handoffs occurred.
Human-in-the-loop inbox
Navigate to Operate > Inbox from the project sidebar. The inbox consolidates all tasks that require human attention.
Task types:
| Type | Description |
|---|---|
| Approval | A workflow step or agent action requires explicit approval before proceeding |
| Data Entry | The agent needs information that must be provided by a human operator |
| Review | Agent output or a decision requires human review before finalization |
| Decision | A choice point where a human must select the next course of action |
| Escalation | An agent has escalated an issue that it cannot resolve autonomously |
Filter tabs at the top of the inbox let you view all tasks or filter to a specific type. Each tab shows a count badge. Task cards display the title, priority indicator, SLA countdown, task type badge, and timestamp.
Click a task card to expand the action panel, where you can approve or reject approvals, fill in data entry forms, mark reviews as reviewed, select from decision options, or resolve escalations. After you respond, the task is removed from the inbox and the associated workflow or agent conversation resumes. The inbox polls for new tasks every 5 seconds.
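A fixed-interval polling loop like the one the inbox uses can be sketched as follows. This is a generic illustration, not Studio's client code: `fetch_tasks` and `on_task` are hypothetical callables you would supply, and the task dictionaries are assumed to carry an `id` field.

```python
import time

POLL_INTERVAL_SECONDS = 5  # interval stated for the inbox

def poll_inbox(fetch_tasks, on_task, max_polls, sleep=time.sleep):
    """Call fetch_tasks repeatedly and hand each new task to on_task once."""
    seen = set()
    for attempt in range(max_polls):
        for task in fetch_tasks():
            if task["id"] not in seen:
                seen.add(task["id"])
                on_task(task)
        # Wait between polls, but not after the final one.
        if attempt < max_polls - 1:
            sleep(POLL_INTERVAL_SECONDS)
    return seen

# Usage with canned data and a no-op sleep so the example finishes instantly.
batches = iter([[{"id": "t1"}], [{"id": "t1"}, {"id": "t2"}]])
handled = []
poll_inbox(lambda: next(batches), handled.append, max_polls=2, sleep=lambda s: None)
print([task["id"] for task in handled])  # ['t1', 't2']
```

Tracking seen IDs keeps a task that is still open on the next poll from being handled twice.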
Transfer sessions
Navigate to Operate > Transfer Sessions from the project sidebar. This page monitors active agent transfer sessions — conversations being handed off between agents or between an agent and a human representative.
The transfer session table displays Session ID, Status, Provider (e.g., SmartAssist, Genesys, NICE, Five9), Channel (Chat, Voice, Email, Messaging), and timestamps.
Status values:
| Status | Description |
|---|---|
| Pending | Transfer initiated, waiting to be picked up |
| Queued | Transfer is in the queue for the target agent or human |
| Active | Transfer is in progress |
| Post-Agent | Transfer completed, in post-processing |
| Ended | Transfer is complete |
Use filter dropdowns (provider, status, channel) to narrow the view. Click a transfer session row to open a detail modal with the full transfer timeline, context and metadata, and management actions.
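If you mirror transfer sessions in your own tooling, the status values and filter dropdowns map naturally onto an enum and a filter function. The record layout and the `filter_sessions` helper in this Python sketch are assumptions made for illustration; only the status, provider, and channel names come from this page.

```python
from enum import Enum

class TransferStatus(Enum):
    PENDING = "Pending"
    QUEUED = "Queued"
    ACTIVE = "Active"
    POST_AGENT = "Post-Agent"
    ENDED = "Ended"

def filter_sessions(sessions, provider=None, status=None, channel=None):
    """Keep sessions matching every filter that is set; None means 'any'."""
    return [
        s for s in sessions
        if (provider is None or s["provider"] == provider)
        and (status is None or s["status"] == status)
        and (channel is None or s["channel"] == channel)
    ]

# Hypothetical records shaped like rows of the transfer session table.
sessions = [
    {"id": "s1", "provider": "Genesys", "status": TransferStatus.QUEUED, "channel": "Chat"},
    {"id": "s2", "provider": "Five9", "status": TransferStatus.ACTIVE, "channel": "Voice"},
    {"id": "s3", "provider": "Genesys", "status": TransferStatus.ENDED, "channel": "Chat"},
]

queued_genesys = filter_sessions(sessions, provider="Genesys", status=TransferStatus.QUEUED)
print([s["id"] for s in queued_genesys])  # ['s1']
```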
Alerts
Alerts keep you informed about events that require attention across your project. Navigate to Operate > Alerts from the project sidebar.
Approval alerts
The Approvals tab displays workflow steps that are waiting for human approval. This is a focused view of the approval tasks that also appear in the Inbox. Each approval card shows the workflow name, step name, who requested it, when it was requested, and relevant context. Click a card to approve or reject with an optional comment.
Alert rules
The Alert Rules tab lets you configure automated notifications for important events:
- Agent errors exceed a threshold — get alerted when an agent’s error rate spikes above a configured percentage.
- Session volume changes — notifications for unusual increases or decreases in conversation volume.
- SLA breaches — alerts when human-in-the-loop tasks are not responded to within the configured time window.
- Evaluation score drops — notifications when evaluation scores fall below a minimum threshold.
- Deployment events — alerts when agents are deployed, promoted, or rolled back.
Note: Alert rules are being actively developed. The page provides visibility into the planned notification capabilities.
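As an illustration of the first rule type, an error-rate threshold check reduces to comparing a percentage against a configured limit. The function below is a hypothetical sketch of that comparison, not the logic Studio ships; in particular, treating a rate exactly at the threshold as not exceeded is an assumption.

```python
def error_rate_exceeded(error_count, total_count, threshold_percent):
    """Return True when errors exceed threshold_percent of total requests."""
    if total_count == 0:
        return False  # no traffic, so nothing to alert on
    # Compare with multiplication to avoid floating-point rounding at the boundary.
    return error_count * 100 > threshold_percent * total_count

print(error_rate_exceeded(6, 100, 5))  # True
print(error_rate_exceeded(5, 100, 5))  # False
```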
Notification channels
When alert rules are configured, notifications can be delivered through:
- In-app notifications — displayed within Studio.
- Email — sent to configured recipients.
- Webhook — sent to external systems for integration with tools like Slack, PagerDuty, or custom dashboards.