
Configure LLM providers, model catalog, Arch settings, voice services, guardrails, and auth profiles.

The AI Configuration settings control how your workspace connects to AI models and services. From here you register provider credentials, manage the model catalog, configure safety guardrails, and set up authentication profiles that tools and agents use at runtime.
  • Navigation: Settings > AI Configuration
  • Required role: Owner or Admin

LLM Providers

The Agent Platform is provider-neutral. Agents can use models from multiple LLM providers, and you can switch between them without changing agent definitions.
  • Navigation: Settings > AI Configuration > LLM Providers
  • Required role: Owner or Admin
The page is organized into three tabs: Credentials, Model Catalog, and Policy.

How Model Resolution Works

Model configuration follows a layered approach. When the runtime executes an agent, it resolves the model through a five-level priority cascade, stopping at the first match:
| Priority | Source | Description |
|---|---|---|
| 0 | Deployment override | Model pinned to a specific deployment, for example an A/B test. |
| 1 | Agent DSL | Model or operation_models declared in the .agent.abl file. |
| 2 | Agent DB config | Per-agent model override in the agent settings UI. |
| 3 | Project DB config | Project-level ModelConfig with tier-to-model mapping. |
| 4 | Tenant model | Workspace-level default model for the resolved tier. |
If no model is resolved at any level, the request fails with a clear error. The platform never falls back to a hard-coded default.
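The cascade above can be sketched in a few lines of Python. This is an illustration only, not the platform's implementation: the source keys, the `resolve_model` function, and the `ModelResolutionError` class are hypothetical names chosen to mirror the table.

```python
from typing import Optional

# Priority order taken from the table above; the keys are hypothetical names.
RESOLUTION_ORDER = [
    "deployment_override",  # priority 0
    "agent_dsl",            # priority 1
    "agent_db_config",      # priority 2
    "project_db_config",    # priority 3
    "tenant_model",         # priority 4
]


class ModelResolutionError(Exception):
    """Raised when no level of the cascade supplies a model."""


def resolve_model(sources: dict[str, Optional[str]]) -> tuple[str, str]:
    """Return (source, model_id) for the first level that defines a model."""
    for source in RESOLUTION_ORDER:
        model_id = sources.get(source)
        if model_id:
            return source, model_id
    # No hard-coded default: fail with an explicit error instead.
    raise ModelResolutionError("no model configured at any resolution level")
```

For example, `resolve_model({"agent_db_config": "gpt-4o", "tenant_model": "other"})` stops at the agent-level override and never consults the tenant model.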

Fallback Chains

Configure fallback models to handle provider outages or rate limits. When the primary model fails (timeout, rate limit, or provider error), the runtime automatically retries with each fallback in order. The agent receives a response regardless of which model served it, and analytics track which model was used.
Configure fallback models:
  1. Go to Settings > AI Configuration > Models.
  2. Select a registered model.
  3. In the Fallback section, add one or more fallback models in priority order.
  4. Click Save.
You can also declare fallback models directly in your agent definition.
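A minimal sketch of the fallback behavior described in this section, assuming a caller-supplied `call_model` function. `ProviderError`, `call_with_fallbacks`, and `call_model` are hypothetical names and not part of the platform API.

```python
from typing import Callable, Sequence


class ProviderError(Exception):
    """Stand-in for a timeout, rate limit, or provider error."""


def call_with_fallbacks(
    prompt: str,
    models: Sequence[str],
    call_model: Callable[[str, str], str],
) -> tuple[str, str]:
    """Try the primary model, then each fallback in order.

    Returns (model_used, response) so the caller can record which model served it.
    """
    errors: list[str] = []
    for model in models:
        try:
            return model, call_model(model, prompt)
        except ProviderError as exc:
            errors.append(f"{model}: {exc}")
    raise ProviderError("all models failed: " + "; ".join(errors))
```

Returning the model name alongside the response reflects the point above that analytics need to know which model actually served the request.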

Context Window Management

The runtime automatically manages the context window to stay within model limits:
  • Tool result compression — Large tool results are compressed before being added to the conversation.
  • Prior turn truncation — Tool results from previous turns are replaced with short placeholders.
  • Conversation compaction — When conversation history grows beyond the Compaction Threshold, older messages are summarized to reduce token usage while preserving context.
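The sketch below illustrates prior turn truncation and the compaction check. The message shape (`role`, `turn`, `content`), the placeholder text, and the assumption that the Compaction Threshold is measured in tokens are all illustrative choices, not documented platform behavior.

```python
PLACEHOLDER = "[tool result omitted]"  # hypothetical placeholder text


def truncate_prior_tool_results(messages: list[dict], current_turn: int) -> list[dict]:
    """Return a copy of messages with tool results from earlier turns replaced."""
    trimmed = []
    for msg in messages:
        if msg["role"] == "tool" and msg["turn"] < current_turn:
            # Copy the message so the original history is left untouched.
            msg = {**msg, "content": PLACEHOLDER}
        trimmed.append(msg)
    return trimmed


def needs_compaction(history_tokens: int, compaction_threshold: int) -> bool:
    """True once the history has grown beyond the threshold (assumed to be in tokens)."""
    return history_tokens > compaction_threshold
```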

Cost and Token Tracking

Every LLM call tracks token usage and estimated cost, broken down by agent, project, model, and time period:
| Metric | Description |
|---|---|
| Input tokens | Tokens sent to the model: system prompt, conversation history, tool definitions. |
| Output tokens | Tokens generated by the model: response text and tool calls. |
| Estimated cost | Calculated from token counts and the model’s known pricing. |
| Model used | Which model served the request, important when fallbacks are in play. |
| Latency | Time-to-first-token and total request duration. |
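As a worked example of the estimated cost metric, the sketch below assumes pricing is quoted per million tokens with separate input and output rates. The function name and the prices used in the example are hypothetical, not actual provider pricing.

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Estimate the cost of one call from token counts and per-million-token prices."""
    total = input_tokens * input_price_per_million + output_tokens * output_price_per_million
    return total / 1_000_000
```

With illustrative prices of 3.00 per million input tokens and 15.00 per million output tokens, a call with 1,000 input tokens and 500 output tokens costs (1,000 × 3.00 + 500 × 15.00) / 1,000,000 = 0.0105.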

Supported Providers

Each provider requires its own credentials configured in the Credentials tab:
| Provider | Auth type | Models | Key strengths |
|---|---|---|---|
| Anthropic | API key | Claude 4 Opus, Claude 4 Sonnet, Claude 3.5 Haiku | Strong reasoning, long context (200K tokens), tool use, vision. |
| OpenAI | API key | GPT-4.1, GPT-4o, GPT-4o mini, o3, o4-mini | Broad capabilities, vision, and function calling. |
| Azure OpenAI | API key + endpoint | GPT-4, GPT-4o (Azure deployments) | Enterprise compliance, regional deployment. |
| Google | API key | Gemini 2.5 Pro, Gemini 2.5 Flash, Gemini 2.0 Flash | Very large context windows (1M+ tokens), fast inference. |
| Amazon Bedrock | IAM credentials | Claude, Titan, Bedrock-hosted models | AWS integration, VPC-private access. |
| Custom / LiteLLM | API key, Bearer, or OAuth 2.0 | Any OpenAI-compatible endpoint | Unified interface to 100+ model providers. |

Credentials

The Credentials tab lists all provider credentials registered in your workspace, grouped by provider. Each credential card shows the name, provider, creation date, and the number of models using that credential.
Add a credential:
  1. Go to AI Configuration > LLM Providers > Credentials.
  2. Click + Add Credential.
  3. Enter a Name (for example, Production OpenAI), select a Provider, and enter the API Key.
  4. Click Add Credential.
API keys are encrypted at rest and never displayed in plaintext after the initial entry.
Delete a credential: Click the delete icon on the credential card. Deleting a credential disconnects all models linked to it. Those models display a No Keys status in the Model Catalog until a new credential is connected.

Model Catalog

The Model Catalog tab lists all models registered in your workspace. Each row shows the provider, model name, model ID, linked credentials, and status. The top of the page shows the current Default Model with a Change Default button.
Set the default model: The default model is used when no other model resolves through the model resolution chain. Click the star icon on any model row, or click Change Default in the default model banner.
Add a model from the catalog:
  1. Go to AI Configuration > LLM Providers > Model Catalog.
  2. Click + Add Model, then browse or search for a provider model.
  3. Review and configure the available model settings.
  4. Click Add to Workspace.
Add a custom model:
  1. Go to AI Configuration > LLM Providers > Model Catalog.
  2. Click + Add Model, then select the Custom Model tab.
  3. Fill in the following fields:
| Field | Description |
|---|---|
| Display Name | A label for the model in the platform. |
| Model ID | The model identifier, for example gpt-4o or claude-sonnet-4-20250514. |
| Provider | The provider this model belongs to. |
| Tier | Assign the model to a tier: Fast, Balanced, Powerful, or Voice. |
| Temperature | Controls response randomness. Default: 0.7. |
| Max Tokens | Maximum tokens the model can generate per response. Default: 4096. |
| Endpoint URL | Optional. For custom or self-hosted endpoints. |
  4. Click Add to Workspace.
Configure model settings: Expand any model row to view and edit its settings.
  • Connections — Shows credentials linked to this model. Click + Add Key to connect a credential. A model with no credentials shows No Keys status and cannot serve requests.
  • Settings — Configure Temperature, Max Output Tokens, Top P, Compaction Threshold, Tier, and Response Mode.
  • Capabilities — Enable supported capabilities: Default for Tier, Tools, Vision, Streaming, and Realtime Voice.
Click Save to apply changes, or Delete to remove the model from the workspace.

Model Tiers

Tiers decouple agent definitions from specific model choices. Agents reference tiers, and the platform resolves the tier to a specific model at runtime:
| Tier | Intended use | Typical models |
|---|---|---|
| Fast | Classification, routing, quick responses. | Claude 3.5 Haiku, GPT-4o mini, Gemini 2.0 Flash. |
| Balanced | General-purpose tasks balancing quality and cost. | Claude 4 Sonnet, GPT-4o, Gemini 2.5 Flash. |
| Powerful | Complex reasoning, analysis, content generation. | Claude 4 Opus, o3, Gemini 2.5 Pro. |
| Voice | Real-time voice interaction. | GPT-4o Realtime, Gemini 2.0 Flash (Live). |
Project-level tier overrides: Projects inherit the workspace’s model configuration but can customize how tiers map to operations. Go to Project > Settings > LLM Configuration to override tiers per operation:
| Operation | Default tier | Description |
|---|---|---|
| extraction | Balanced | Parsing structured data from user input. |
| validation | Fast | Input validation and format checking. |
| tool_selection | Balanced | Choosing which tool to invoke. |
| response_gen | Balanced | Generating user-facing responses. |
| summarization | Balanced | Summarizing conversations or documents. |
| reasoning | Powerful | Multi-step reasoning and analysis. |
| coordination | Balanced | Supervisor routing and orchestration. |
| realtime_voice | Voice | Real-time voice interactions. |
Overrides apply only to the current project. Other projects continue using workspace defaults.
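The operation-to-tier mapping and project-level overrides can be pictured as a simple lookup. The defaults below are copied from the table above; the `resolve_tier` function and the shape of `project_overrides` are hypothetical.

```python
from typing import Optional

# Default tier per operation, as listed in the table above.
DEFAULT_OPERATION_TIERS = {
    "extraction": "Balanced",
    "validation": "Fast",
    "tool_selection": "Balanced",
    "response_gen": "Balanced",
    "summarization": "Balanced",
    "reasoning": "Powerful",
    "coordination": "Balanced",
    "realtime_voice": "Voice",
}


def resolve_tier(operation: str, project_overrides: Optional[dict[str, str]] = None) -> str:
    """Return the project override for an operation if set, else the default tier."""
    overrides = project_overrides or {}
    if operation in overrides:
        return overrides[operation]
    return DEFAULT_OPERATION_TIERS[operation]
```

A project that sets `{"summarization": "Fast"}` changes only that operation; `resolve_tier("reasoning", {"summarization": "Fast"})` still returns `"Powerful"`.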

Policy

The Policy tab controls how the platform resolves credentials when multiple credential sources are available. Select a credential policy and click Save Policy:
| Option | Description |
|---|---|
| Organization First | Try organization-managed credentials first; fall back to user-provided keys if none are available. |
| User First | Try user-provided credentials first; fall back to organization-managed keys. |
| Organization Only | Only organization-managed credentials are allowed. Users cannot bring their own keys. |
| User Only | Only user-provided credentials are allowed. The organization does not centrally manage keys. |
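The four policy options reduce to a small decision function. The sketch below is illustrative: the policy identifiers and the `pick_credential` function are hypothetical names, and a missing credential is represented as `None`.

```python
from typing import Optional


def pick_credential(policy: str, org_key: Optional[str], user_key: Optional[str]) -> Optional[str]:
    """Choose a credential according to the workspace credential policy.

    Returns None when the policy allows no available credential.
    """
    if policy == "organization_first":
        return org_key or user_key
    if policy == "user_first":
        return user_key or org_key
    if policy == "organization_only":
        return org_key
    if policy == "user_only":
        return user_key
    raise ValueError(f"unknown credential policy: {policy}")
```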

Troubleshooting

| Issue | Potential causes | Resolution |
|---|---|---|
| Model shows No Keys status | No credential linked to the model. | Expand the model row and click + Add Key in the Connections section. |
| Model validation fails | Invalid API key, incorrect endpoint URL, or blocked outbound connections. | Confirm the API key with the provider directly. For Azure, verify the endpoint includes the deployment name. Check that outbound HTTPS connections to the provider are allowed. |
| Agent cannot find a model for the requested tier | No active model assigned to the tier, or the linked credential is inactive. | Verify at least one active model is assigned to the tier with a Ready status. Check that the model’s linked credential is active. |
| Token budget exceeded | High token consumption across operations. | Review consumption in Account > Billing & Usage. Switch high-volume operations to a Fast-tier model to reduce usage. |

Arch

Arch is the AI Architect component of the Agent Platform. It manages the model and credentials used for agent specification generation and chat-based orchestration.
  • Navigation: Settings > AI Configuration > Arch
  • Required role: Owner or Admin
The page is organized into two tabs: Settings and Audit Logs.

Settings

Current source: The Current Source banner displays the active credential source and model Arch is currently using, for example Model Hub · OpenAI · gpt-5.2.
Credential source: Choose how Arch authenticates with the LLM provider:
| Option | Description |
|---|---|
| Platform Credits | Use platform-provided API credits. Select which model to use in the Settings tab. |
| Direct API Key | Bring your own provider key and keep the model selection in Arch settings. |
| Model Hub | Use a model already configured with credentials in your Model Hub. |
Arch works best with models that support tool calling and have large context windows.
Select model: Select the model Arch uses from the dropdown. The selected model displays its provider, model ID, capability tags, and tier. If the model hasn’t been tested with Arch, a warning appears.
Generation parameters: Fine-tune generation behavior for chat and spec generation:
| Parameter | Description | Default |
|---|---|---|
| Reasoning Effort | Controls how much reasoning the model applies before responding. Slide between none and high. | Low |
| Verbosity | Controls the length and detail of generated responses. Slide between low and high. | Medium |
| Max completion tokens | Maximum tokens Arch can generate per response. | 6,000 |
Click Save Changes to apply updates.

Audit Logs

The Audit Logs tab shows Arch session activity for the selected time range, including cost, error count, and a browsable session list.
Summary metrics:
| Metric | Description |
|---|---|
| Sessions (24h) | Number of Arch sessions in the last 24 hours. |
| Total Cost (24h) | Estimated cost of Arch sessions in the last 24 hours. |
| Errors (24h) | Number of sessions that encountered errors in the last 24 hours. |
| Listed Sessions | Total number of sessions shown in the current session list. |
Sessions list: The list displays all Arch sessions. Use Has Errors to show only sessions with errors, or Today to show only today’s sessions. Each session row shows the session ID, timestamp, number of turns, and current phase. Click a session to view the full execution flow in the detail panel.

Voice Services

Voice Services manages credentials for speech-to-text (STT) and text-to-speech (TTS) providers that power voice-enabled agent interactions.
  • Navigation: Settings > AI Configuration > Voice Services
  • Required role: Owner or Admin
Voice channels require both Deepgram (STT) and ElevenLabs (TTS) credentials. Voice preview and live voice sessions will fail until these are configured.

Supported Providers

| Provider | Description |
|---|---|
| Deepgram Speech | Deepgram credentials for speech recognition and synthesis. |
| Google Cloud Speech | Google Cloud speech credentials for recognition and synthesis. |
| AWS Speech | AWS credentials for Amazon Transcribe and Polly. |
| Microsoft Speech | Azure / Microsoft Speech credentials for STT and TTS. |
| Nuance Speech | Nuance speech credentials for STT and TTS. |
Each provider shows its current configuration status. Providers that haven’t been configured display Not Configured.

Configure a Provider

  1. Go to Settings > AI Configuration > Voice Services.
  2. Find the provider you want to configure and click Configure.
  3. Enter the provider credentials in the configuration dialog.
  4. Click Save.
Configuration fields vary by provider. The following table shows an example for Google Cloud Speech:
| Field | Description |
|---|---|
| Display Name | A label to identify this credential set, for example Google Cloud Speech Credentials. |
| Service Account JSON | Paste the full Google service-account JSON used for Speech-to-Text. |
| STT Model ID | Optional. The Google STT model to use, for example chirp_3, chirp, latest_long, telephony. Default: chirp_3. |

Guardrails

Workspace guardrails define organization-wide content safety policies that apply across all agents. They evaluate agent inputs and outputs against configurable safety categories and take a configured action — block, warn, or log — when content violates a policy.
  • Navigation: Settings > AI Configuration > Guardrails
  • Required role: Owner or Admin
The page is organized into two sections: Guardrail Providers and Guardrail Policies.

Guardrail Providers

Guardrail providers are the evaluation services that assess content against your policies. Configure at least one provider before creating policies.
Add a provider:
  1. Go to Settings > AI Configuration > Guardrails.
  2. Click Add Provider.
  3. In the dialog, configure the provider using the Form or YAML tab.
  4. Click Add Provider to save.
Form tab fields:
| Field | Description |
|---|---|
| Name | A unique internal identifier, for example OpenAI Moderation. |
| Display Name | A human-readable label, for example Content Safety (OpenAI). |
| Adapter Type | The guardrail adapter type, for example OpenAI Moderation. |
| Hosting | Hosting model for the provider. Default: Cloud API. |
| Endpoint URL | The provider’s API endpoint URL. |
| Model | The model used for evaluation, for example text-moderation-latest. |
| Authentication | Toggle on to enable authentication. Use an Auth Profile for providers requiring credentials; raw API keys are not accepted. |
| Default Category | The default content category this provider evaluates, for example content_safety. |
| Default Threshold | Sensitivity threshold from 0.0 to 1.0. Default: 0.7. Lower values flag more content; higher values flag only high-confidence violations. |
Retry configuration:
| Field | Description |
|---|---|
| Max Retries | Retry attempts if the provider call fails. Default: 3. |
| Backoff Strategy | Interval in milliseconds between retries. Default: 1000. |
| Enabled | Toggle on to enable this provider for evaluation. |
Provider health monitoring: The platform periodically checks provider health and protects each provider with a circuit breaker. After repeated failures, the circuit breaker opens and the platform stops sending requests to that provider. After the reset timeout, it allows a test request through. While a provider’s circuit breaker is open, the platform follows the configured fail mode:
  • Fail-open — Content is delivered without guardrail evaluation. Violations may go undetected.
  • Fail-closed — Content is blocked until the provider recovers. Safer, but may interrupt service.
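The circuit breaker and fail modes can be sketched as follows. This is a generic circuit breaker written for illustration, not the platform's code: the class, the `failure_threshold` and `reset_timeout_s` parameters and their defaults, and the returned decision strings are all assumptions.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Opens after repeated failures; allows a test request after the reset timeout."""

    def __init__(
        self,
        failure_threshold: int = 3,      # hypothetical default
        reset_timeout_s: float = 30.0,   # hypothetical default
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout_s = reset_timeout_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow_request(self) -> bool:
        if self._opened_at is None:
            return True  # breaker is closed
        # Breaker is open: allow a test request once the reset timeout has elapsed.
        return self._clock() - self._opened_at >= self.reset_timeout_s

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()


def guardrail_decision(breaker: CircuitBreaker, fail_mode: str) -> str:
    """Decide what to do with content given the breaker state and fail mode."""
    if breaker.allow_request():
        return "evaluate"
    # Fail-open delivers content unchecked; fail-closed blocks it.
    return "deliver_unchecked" if fail_mode == "fail_open" else "block"
```

The injected `clock` makes the timeout behavior easy to test without waiting in real time.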

Guardrail Policies

Guardrail policies define the rules applied to agent inputs and outputs for a specific project. Select the target project from the Project dropdown in the Guardrail Policies section.
Content safety categories:
| Category | Description |
|---|---|
| Hate speech | Content promoting hatred or violence against groups based on protected characteristics. |
| Sexual content | Sexually explicit material inappropriate for the agent’s use case. |
| Violence | Graphic descriptions of violence or instructions for causing harm. |
| Self-harm | Content that encourages or instructs self-harm. |
| PII detection | Personally identifiable information in outputs: names, emails, phone numbers, SSNs. |
| Profanity | Offensive language and slurs. |
| Prompt injection | Attempts to override agent instructions through crafted inputs. |
| Off-topic | Content outside the agent’s intended domain. |
| Hallucination risk | Responses that may contain fabricated information (requires knowledge base context). |
Each category has a configurable threshold from 0.0 to 1.0. Lower thresholds flag more content; higher thresholds flag only high-confidence violations.
Create a policy:
  1. Select the target project from the Project dropdown.
  2. Click Create policy.
  3. Enter a policy name and description.
  4. Select the categories to evaluate.
  5. For each category, set:
    • Threshold — Sensitivity level (0.0 = flag everything, 1.0 = flag only high-confidence violations).
    • Action — What happens when content is flagged:
      • Block — Prevent the response from reaching the end user. A fallback message is shown instead.
      • Warn — Deliver the response with a warning annotation visible to operators.
      • Log — Record the violation for analytics without affecting the response.
  6. Set whether the policy evaluates inbound messages, outbound responses, or both.
  7. Click Save.
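To make the threshold and action settings concrete, the sketch below assumes the provider returns a violation score between 0.0 and 1.0 and that content is flagged when the score meets or exceeds the threshold. The comparison rule, the function name, and the `"allow"` result are assumptions for illustration.

```python
def evaluate_category(score: float, threshold: float, action: str) -> str:
    """Return the configured action if the score is flagged, otherwise "allow".

    Assumes a score at or above the threshold counts as a violation, so a
    threshold of 0.0 flags everything and 1.0 flags only the highest scores.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    if action not in ("block", "warn", "log"):
        raise ValueError(f"unknown action: {action}")
    return action if score >= threshold else "allow"
```

Under these assumptions, a score of 0.8 against a threshold of 0.7 with action `block` is blocked, while a score of 0.5 passes.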
Link policies to projects: Workspace guardrail policies are available to all projects, but each project must activate them:
  1. Open the project in Studio.
  2. Go to Project settings > Guardrails.
  3. Toggle on the workspace guardrail policies to enforce for this project.
  4. Optionally adjust thresholds at the project level. Project settings override workspace defaults when more restrictive.
How workspace and project guardrails interact:
  • Workspace policies evaluate first. If a workspace guardrail blocks content, the project guardrail is not consulted.
  • Project policies add specificity — use them for domain-specific rules, for example blocking financial advice in a customer support agent.
  • The most restrictive action wins. If the workspace policy says warn but the project policy says block for the same category, the content is blocked.
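The "most restrictive action wins" rule can be expressed as a severity ordering. The ordering log < warn < block is an assumption consistent with the action descriptions above, and `effective_action` is a hypothetical helper.

```python
# Assumed severity ordering: log is least restrictive, block is most restrictive.
SEVERITY = {"log": 0, "warn": 1, "block": 2}


def effective_action(workspace_action: str, project_action: str) -> str:
    """Return whichever of the two actions is more restrictive."""
    return max(workspace_action, project_action, key=SEVERITY.__getitem__)
```

For the example in the text, `effective_action("warn", "block")` yields `"block"`.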

Auth Profiles

Auth profiles store authentication configurations that agent tools use to call external services securely. Profiles are managed centrally at the workspace level and reused across all projects.
  • Navigation: Settings > AI Configuration > Auth Profiles
  • Required role: Owner or Admin
The page is organized into two tabs: All Profiles and Integrations.

All Profiles

The All Profiles tab lists all auth profiles configured for the workspace. Use the toolbar to search by profile name, filter by authentication type, or filter by status (active or inactive).
Add a profile: Click Add Profile. Two profile types are available:
| Type | Description |
|---|---|
| Custom Profile | Configure a custom authentication scheme with manually specified credentials and settings. |
| Integration Profile | Configure authentication for a supported third-party integration. |
Supported auth types:
| Type | Description |
|---|---|
| OAuth 2.0 | Authorization code, client credentials, and implicit flows. |
| API key | Header or query parameter injection. |
| Bearer token | Static or dynamically refreshed bearer tokens. |
| Basic auth | Username and password combinations. |
| Custom | Arbitrary header or body injection for non-standard auth schemes. |
Auth profile credentials are encrypted at rest and never exposed in logs or agent responses.
Guardrail providers that require credentials must reference an auth profile. Raw API keys are not accepted directly in the guardrail provider configuration.

Integrations

The Integrations tab lists auth profiles configured for supported third-party integrations. Create these using the Integration Profile option from the Add Profile menu.

Using Auth Profiles

Once created, reference an auth profile by name in tool configurations and guardrail provider settings. The platform resolves the profile at runtime and injects the correct credentials without exposing them in agent definitions, logs, or traces.
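As an illustration of what credential injection at runtime might look like, the sketch below builds HTTP headers for three of the auth types listed earlier. The profile dictionary shape, the default `X-API-Key` header name, and the `build_auth_headers` function are hypothetical; only the standard `Authorization: Bearer` and `Authorization: Basic` header formats are taken from the HTTP specifications.

```python
import base64


def build_auth_headers(profile: dict) -> dict[str, str]:
    """Build request headers from a resolved auth profile (header injection only)."""
    kind = profile["type"]
    if kind == "api_key":
        # The header name is configurable; X-API-Key is only an illustrative default.
        return {profile.get("header", "X-API-Key"): profile["key"]}
    if kind == "bearer":
        return {"Authorization": f"Bearer {profile['token']}"}
    if kind == "basic":
        raw = f"{profile['username']}:{profile['password']}".encode("utf-8")
        return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
    raise ValueError(f"unsupported auth type: {kind}")
```

Because the tool configuration only names the profile, the secret values appear in the outgoing request and nowhere in the agent definition.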