Configure LLM providers, model catalog, Arch settings, voice services, guardrails, and auth profiles.
The AI Configuration settings control how your workspace connects to AI models and services. From here you register provider credentials, manage the model catalog, configure safety guardrails, and set up authentication profiles that tools and agents use at runtime.
- Navigation: Settings > AI Configuration
- Required role: Owner or Admin
LLM Providers
The Agent Platform is provider-neutral. Agents can use models from multiple LLM providers, and you can switch between them without changing agent definitions.
- Navigation: Settings > AI Configuration > LLM Providers
- Required role: Owner or Admin
The page is organized into three tabs: Credentials, Model Catalog, and Policy.
How Model Resolution Works
Model configuration follows a layered approach. When the runtime executes an agent, it resolves the model through a five-level priority cascade, stopping at the first match:
| Priority | Source | Description |
|---|---|---|
| 0 | Deployment override | Model pinned to a specific deployment, for example an A/B test. |
| 1 | Agent DSL | Model or operation_models declared in the .agent.abl file. |
| 2 | Agent DB config | Per-agent model override in the agent settings UI. |
| 3 | Project DB config | Project-level ModelConfig with tier-to-model mapping. |
| 4 | Tenant model | Workspace-level default model for the resolved tier. |
If no model is resolved at any level, the request fails with a clear error. The platform never falls back to a hard-coded default.
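The cascade above can be sketched as a first-match lookup. This is only an illustration: the source names mirror the priority table, but `resolve_model`, `ModelResolutionError`, and the shape of the `candidates` dictionary are hypothetical and not part of the platform API.

```python
from typing import Dict, Optional, Tuple

# Source names follow the priority table; everything else here is hypothetical.
PRIORITY_ORDER = (
    "deployment_override",  # priority 0
    "agent_dsl",            # priority 1
    "agent_db_config",      # priority 2
    "project_db_config",    # priority 3
    "tenant_model",         # priority 4
)


class ModelResolutionError(LookupError):
    """Raised when no level of the cascade yields a model."""


def resolve_model(candidates: Dict[str, Optional[str]]) -> Tuple[str, str]:
    """Return (source, model_id) for the first level that defines a model."""
    for source in PRIORITY_ORDER:
        model_id = candidates.get(source)
        if model_id:
            return source, model_id
    # No hard-coded default: an unresolved model is an explicit error.
    raise ModelResolutionError("no model configured at any level")
```

For example, `resolve_model({"agent_dsl": "model-a", "tenant_model": "model-b"})` stops at the agent DSL level and never consults the tenant model.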
Fallback Chains
Configure fallback models to handle provider outages or rate limits. When the primary model fails (timeout, rate limit, or provider error), the runtime automatically retries with each fallback in order. The agent receives a response regardless of which model served it, and analytics track which model was used.
Configure fallback models:
- Go to Settings > AI Configuration > LLM Providers > Model Catalog.
- Select a registered model.
- In the Fallback section, add one or more fallback models in priority order.
- Click Save.
You can also declare fallback models directly in your agent definition.
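As a rough illustration of the retry behavior, the following sketch tries the primary model and then each fallback in order, returning which model served the request. `call_with_fallbacks`, `ProviderError`, and the `call_model` callback are hypothetical names chosen for this example; the runtime's actual implementation is not documented here.

```python
from typing import Callable, Optional, Sequence, Tuple


class ProviderError(Exception):
    """Stand-in for a timeout, rate limit, or provider error."""


def call_with_fallbacks(
    prompt: str,
    models: Sequence[str],
    call_model: Callable[[str, str], str],
) -> Tuple[str, str]:
    """Try the primary model first, then each fallback in priority order.

    Returns (model_used, response) so the caller can record which model served.
    """
    last_error: Optional[ProviderError] = None
    for model in models:
        try:
            return model, call_model(model, prompt)
        except ProviderError as exc:
            last_error = exc  # move on to the next model in the chain
    raise ProviderError(f"all models failed: {last_error}")
```

Returning the model name alongside the response is what lets analytics attribute the call to the model that actually answered.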
Context Window Management
The runtime automatically manages the context window to stay within model limits:
- Tool result compression — Large tool results are compressed before being added to the conversation.
- Prior turn truncation — Tool results from previous turns are replaced with short placeholders.
- Conversation compaction — When conversation history grows beyond the Compaction Threshold, older messages are summarized to reduce token usage while preserving context.
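Conversation compaction can be pictured with a small sketch like the one below. It assumes a simple policy (summarize everything except the most recent messages once a token threshold is exceeded) and a word-count token estimate; both are assumptions for illustration, and `compact_history`, `keep_recent`, and `summarize` are hypothetical names.

```python
from typing import Callable, List


def compact_history(
    messages: List[str],
    threshold_tokens: int,
    keep_recent: int,
    summarize: Callable[[List[str]], str],
    count_tokens: Callable[[str], int] = lambda text: len(text.split()),
) -> List[str]:
    """Replace older messages with one summary once the threshold is exceeded."""
    total = sum(count_tokens(m) for m in messages)
    if total <= threshold_tokens or len(messages) <= keep_recent:
        return list(messages)  # under the threshold: nothing to compact
    split = len(messages) - keep_recent
    older, recent = messages[:split], messages[split:]
    return [summarize(older)] + recent
```

In practice the `summarize` callback would itself be an LLM call; here it is left to the caller so the sketch stays self-contained.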
Cost and Token Tracking
Every LLM call tracks token usage and estimated cost, broken down by agent, project, model, and time period:
| Metric | Description |
|---|---|
| Input tokens | Tokens sent to the model: system prompt, conversation history, tool definitions. |
| Output tokens | Tokens generated by the model: response text and tool calls. |
| Estimated cost | Calculated from token counts and the model’s known pricing. |
| Model used | Which model served the request, important when fallbacks are in play. |
| Latency | Time-to-first-token and total request duration. |
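The estimated cost is ordinary arithmetic over token counts and per-token prices. The sketch below assumes prices are quoted per million tokens; the function name and the prices in the usage note are hypothetical and not actual provider pricing.

```python
from decimal import Decimal


def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: Decimal,
    output_price_per_million: Decimal,
) -> Decimal:
    """Estimate the cost of one call from token counts and per-million prices."""
    million = Decimal(1_000_000)
    return (
        Decimal(input_tokens) * input_price_per_million
        + Decimal(output_tokens) * output_price_per_million
    ) / million
```

With hypothetical prices of 3.00 per million input tokens and 15.00 per million output tokens, a call with 12,000 input tokens and 800 output tokens costs (12,000 × 3.00 + 800 × 15.00) / 1,000,000 = 0.048.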
Supported Providers
Each provider requires its own credentials configured in the Credentials tab:
| Provider | Auth type | Models | Key strengths |
|---|---|---|---|
| Anthropic | API key | Claude 4 Opus, Claude 4 Sonnet, Claude 3.5 Haiku | Strong reasoning, long context (200K tokens), tool use, vision. |
| OpenAI | API key | GPT-4.1, GPT-4o, GPT-4o mini, o3, o4-mini | Broad capabilities, vision, and function calling. |
| Azure OpenAI | API key + endpoint | GPT-4, GPT-4o (Azure deployments) | Enterprise compliance, regional deployment. |
| Google | API key | Gemini 2.5 Pro, Gemini 2.5 Flash, Gemini 2.0 Flash | Very large context windows (1M+ tokens), fast inference. |
| Amazon Bedrock | IAM credentials | Claude, Titan, Bedrock-hosted models | AWS integration, VPC-private access. |
| Custom / LiteLLM | API key, Bearer, or OAuth 2.0 | Any OpenAI-compatible endpoint | Unified interface to 100+ model providers. |
Credentials
The Credentials tab lists all provider credentials registered in your workspace, grouped by provider. Each credential card shows the name, provider, creation date, and the number of models using that credential.
Add a credential:
- Go to AI Configuration > LLM Providers > Credentials.
- Click + Add Credential.
- Enter a Name (for example, Production OpenAI), select a Provider, and enter the API Key.
- Click Add Credential.
API keys are encrypted at rest and never displayed in plaintext after the initial entry.
Delete a credential:
Click the delete icon on the credential card. Deleting a credential disconnects all models linked to it. Those models display a No Keys status in the Model Catalog until a new credential is connected.
Model Catalog
The Model Catalog tab lists all models registered in your workspace. Each row shows the provider, model name, model ID, linked credentials, and status. The top of the page shows the current Default Model with a Change Default button.
Set the default model:
The default model is used when no other model resolves through the model resolution chain. Click the star icon on any model row, or click Change Default in the default model banner.
Add a model from the catalog:
- Go to AI Configuration > LLM Providers > Model Catalog.
- Click + Add Model, then browse or search for a provider model.
- Review and configure the available model settings.
- Click Add to Workspace.
Add a custom model:
- Go to AI Configuration > LLM Providers > Model Catalog.
- Click + Add Model, then select the Custom Model tab.
- Fill in the following fields:
| Field | Description |
|---|---|
| Display Name | A label for the model in the platform. |
| Model ID | The model identifier, for example gpt-4o or claude-sonnet-4-20250514. |
| Provider | The provider this model belongs to. |
| Tier | Assign the model to a tier: Fast, Balanced, Powerful, or Voice. |
| Temperature | Controls response randomness. Default: 0.7. |
| Max Tokens | Maximum tokens the model can generate per response. Default: 4096. |
| Endpoint URL | Optional. For custom or self-hosted endpoints. |
- Click Add to Workspace.
Configure model settings:
Expand any model row to view and edit its settings.
- Connections — Shows credentials linked to this model. Click + Add Key to connect a credential. A model with no credentials shows No Keys status and cannot serve requests.
- Settings — Configure Temperature, Max Output Tokens, Top P, Compaction Threshold, Tier, and Response Mode.
- Capabilities — Enable supported capabilities: Default for Tier, Tools, Vision, Streaming, and Realtime Voice.
Click Save to apply changes, or Delete to remove the model from the workspace.
Model Tiers
Tiers decouple agent definitions from specific model choices. Agents reference tiers, and the platform resolves the tier to a specific model at runtime:
| Tier | Intended use | Typical models |
|---|---|---|
| Fast | Classification, routing, quick responses. | Claude 3.5 Haiku, GPT-4o mini, Gemini 2.0 Flash. |
| Balanced | General-purpose tasks balancing quality and cost. | Claude 4 Sonnet, GPT-4o, Gemini 2.5 Flash. |
| Powerful | Complex reasoning, analysis, content generation. | Claude 4 Opus, o3, Gemini 2.5 Pro. |
| Voice | Real-time voice interaction. | GPT-4o Realtime, Gemini 2.0 Flash (Live). |
Project-level tier overrides:
Projects inherit the workspace’s model configuration but can customize how tiers map to operations. Go to Project > Settings > LLM Configuration to override tiers per operation:
| Operation | Default tier | Description |
|---|---|---|
| extraction | Balanced | Parsing structured data from user input. |
| validation | Fast | Input validation and format checking. |
| tool_selection | Balanced | Choosing which tool to invoke. |
| response_gen | Balanced | Generating user-facing responses. |
| summarization | Balanced | Summarizing conversations or documents. |
| reasoning | Powerful | Multi-step reasoning and analysis. |
| coordination | Balanced | Supervisor routing and orchestration. |
| realtime_voice | Voice | Real-time voice interactions. |
Overrides apply only to the current project. Other projects continue using workspace defaults.
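To make the tier indirection concrete, here is a minimal sketch of how an operation might resolve to a model. The default mapping is copied from the table above; `resolve_operation_model`, the `tier_models` dictionary, and the model names in the usage note are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Default operation-to-tier mapping, copied from the table above.
DEFAULT_OPERATION_TIERS: Dict[str, str] = {
    "extraction": "Balanced",
    "validation": "Fast",
    "tool_selection": "Balanced",
    "response_gen": "Balanced",
    "summarization": "Balanced",
    "reasoning": "Powerful",
    "coordination": "Balanced",
    "realtime_voice": "Voice",
}


def resolve_operation_model(
    operation: str,
    tier_models: Dict[str, str],
    project_overrides: Optional[Dict[str, str]] = None,
) -> Tuple[str, str]:
    """Return (tier, model) for an operation, honoring project-level overrides."""
    overrides = project_overrides or {}
    tier = overrides.get(operation, DEFAULT_OPERATION_TIERS[operation])
    if tier not in tier_models:
        raise LookupError(f"no model assigned to tier {tier!r}")
    return tier, tier_models[tier]
```

A project that overrides `summarization` to the Fast tier would pass `{"summarization": "Fast"}` as `project_overrides`; every other operation keeps its default tier.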
Policy
The Policy tab controls how the platform resolves credentials when multiple credential sources are available.
Select a credential policy and click Save Policy:
| Option | Description |
|---|---|
| Organization First | Try organization-managed credentials first; fall back to user-provided keys if none are available. |
| User First | Try user-provided credentials first; fall back to organization-managed keys. |
| Organization Only | Only organization-managed credentials are allowed. Users cannot bring their own keys. |
| User Only | Only user-provided credentials are allowed. The organization does not centrally manage keys. |
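The four policies amount to different lookup orders over the two credential sources. The sketch below illustrates that idea; the policy identifiers and the `resolve_credential` function are hypothetical, chosen only to mirror the option names in the table.

```python
from typing import Optional


def resolve_credential(
    policy: str,
    org_key: Optional[str],
    user_key: Optional[str],
) -> Optional[str]:
    """Pick a credential according to the workspace credential policy."""
    # Hypothetical identifiers mirroring the four options in the table.
    lookup_order = {
        "organization_first": (org_key, user_key),
        "user_first": (user_key, org_key),
        "organization_only": (org_key,),
        "user_only": (user_key,),
    }[policy]
    for key in lookup_order:
        if key:
            return key
    return None  # no usable credential under this policy
```

Under `organization_only`, a user-provided key is never considered, even when no organization key exists.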
Troubleshooting
| Issue | Potential causes | Resolution |
|---|---|---|
| Model shows No Keys status | No credential linked to the model. | Expand the model row and click + Add Key in the Connections section. |
| Model validation fails | Invalid API key, incorrect endpoint URL, or blocked outbound connections. | Confirm the API key with the provider directly. For Azure, verify the endpoint includes the deployment name. Check that outbound HTTPS connections to the provider are allowed. |
| Agent cannot find a model for the requested tier | No active model assigned to the tier, or the linked credential is inactive. | Verify at least one active model is assigned to the tier with a Ready status. Check that the model’s linked credential is active. |
| Token budget exceeded | High token consumption across operations. | Review consumption in Account > Billing & Usage. Switch high-volume operations to a Fast-tier model to reduce usage. |
Arch
Arch is the AI Architect component of the Agent Platform. It manages the model and credentials used for agent specification generation and chat-based orchestration.
- Navigation: Settings > AI Configuration > Arch
- Required role: Owner or Admin
The page is organized into two tabs: Settings and Audit Logs.
Settings
Current source:
The Current Source banner displays the active credential source and model Arch is currently using, for example Model Hub · OpenAI · gpt-5.2.
Credential source:
Choose how Arch authenticates with the LLM provider:
| Option | Description |
|---|---|
| Platform Credits | Use platform-provided API credits. Select which model to use in the Settings tab. |
| Direct API Key | Bring your own provider key and keep the model selection in Arch settings. |
| Model Hub | Use a model already configured with credentials in your Model Hub. |
Arch works best with models that support tool calling and have large context windows.
Select model:
Select the model Arch uses from the dropdown. The selected model displays its provider, model ID, capability tags, and tier. If the model hasn’t been tested with Arch, a warning appears.
Generation parameters:
Fine-tune generation behavior for chat and spec generation:
| Parameter | Description | Default |
|---|---|---|
| Reasoning Effort | Controls how much reasoning the model applies before responding. Slide between none and high. | Low |
| Verbosity | Controls the length and detail of generated responses. Slide between low and high. | Medium |
| Max completion tokens | Maximum tokens Arch can generate per response. | 6,000 |
Click Save Changes to apply updates.
Audit Logs
The Audit Logs tab shows Arch session activity for the selected time range, including cost, error count, and a browsable session list.
Summary metrics:
| Metric | Description |
|---|---|
| Sessions (24h) | Number of Arch sessions in the last 24 hours. |
| Total Cost (24h) | Estimated cost of Arch sessions in the last 24 hours. |
| Errors (24h) | Number of sessions that encountered errors in the last 24 hours. |
| Listed Sessions | Total number of sessions shown in the current session list. |
Sessions list:
The list displays all Arch sessions. Use Has Errors to show only sessions with errors, or Today to show only today’s sessions. Each session row shows the session ID, timestamp, number of turns, and current phase. Click a session to view the full execution flow in the detail panel.
Voice Services
Voice Services manages credentials for speech-to-text (STT) and text-to-speech (TTS) providers that power voice-enabled agent interactions.
- Navigation: Settings > AI Configuration > Voice Services
- Required role: Owner or Admin
Voice channels require both Deepgram (STT) and ElevenLabs (TTS) credentials. Voice preview and live voice sessions will fail until these are configured.
Supported Providers
| Provider | Description |
|---|---|
| Deepgram Speech | Deepgram credentials for speech recognition and synthesis. |
| Google Cloud Speech | Google Cloud speech credentials for recognition and synthesis. |
| AWS Speech | AWS credentials for Amazon Transcribe and Polly. |
| Microsoft Speech | Azure / Microsoft Speech credentials for STT and TTS. |
| Nuance Speech | Nuance speech credentials for STT and TTS. |
Each provider shows its current configuration status. Providers that haven’t been configured display Not Configured.
- Go to Settings > AI Configuration > Voice Services.
- Find the provider you want to configure and click Configure.
- Enter the provider credentials in the configuration dialog.
- Click Save.
Configuration fields vary by provider. The following table shows an example for Google Cloud Speech:
| Field | Description |
|---|---|
| Display Name | A label to identify this credential set, for example Google Cloud Speech Credentials. |
| Service Account JSON | Paste the full Google service-account JSON used for Speech-to-Text. |
| STT Model ID | Optional. The Google STT model to use, for example chirp_3, chirp, latest_long, telephony. Default: chirp_3. |
Guardrails
Workspace guardrails define organization-wide content safety policies that apply across all agents. They evaluate agent inputs and outputs against configurable safety categories and take a configured action — block, warn, or log — when content violates a policy.
- Navigation: Settings > AI Configuration > Guardrails
- Required role: Owner or Admin
The page is organized into two sections: Guardrail Providers and Guardrail Policies.
Guardrail Providers
Guardrail providers are the evaluation services that assess content against your policies. Configure at least one provider before creating policies.
Add a provider:
- Go to Settings > AI Configuration > Guardrails.
- Click Add Provider.
- In the dialog, configure the provider using the Form or YAML tab.
- Click Add Provider to save.
Form tab fields:
| Field | Description |
|---|---|
| Name | A unique internal identifier, for example OpenAI Moderation. |
| Display Name | A human-readable label, for example Content Safety (OpenAI). |
| Adapter Type | The guardrail adapter type, for example OpenAI Moderation. |
| Hosting | Hosting model for the provider. Default: Cloud API. |
| Endpoint URL | The provider’s API endpoint URL. |
| Model | The model used for evaluation, for example text-moderation-latest. |
| Authentication | Toggle on to enable authentication. Use an Auth Profile for providers requiring credentials — raw API keys are not accepted. |
| Default Category | The default content category this provider evaluates, for example content_safety. |
| Default Threshold | Sensitivity threshold from 0.0 to 1.0. Default: 0.7. Lower values flag more content; higher values flag only high-confidence violations. |
Retry configuration:
| Field | Description |
|---|---|
| Max Retries | Retry attempts if the provider call fails. Default: 3. |
| Backoff Strategy | Interval in milliseconds between retries. Default: 1000. |
| Enabled | Toggle on to enable this provider for evaluation. |
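The retry settings can be read as a fixed-interval retry loop. The sketch below uses the documented defaults (3 retries, 1000 ms) and assumes that only connection-level failures are retried; `call_with_retries` and its parameters are hypothetical names.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    call: Callable[[], T],
    max_retries: int = 3,
    backoff_ms: int = 1000,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call the provider, retrying failed attempts after a fixed interval."""
    for attempt in range(max_retries + 1):  # one initial try plus the retries
        try:
            return call()
        except ConnectionError:  # assumption: which errors count as retryable
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            sleep(backoff_ms / 1000)
    raise RuntimeError("unreachable")
```

The `sleep` parameter is injectable so the loop can be exercised in tests without actually waiting.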
Provider health monitoring:
The platform periodically checks provider health. After repeated failures, the provider's circuit breaker opens and the platform stops sending requests to that provider. Once the reset timeout elapses, the breaker allows a test request through to check whether the provider has recovered.
When a provider’s circuit breaker is open, the platform follows the configured fail mode:
- Fail-open — Content is delivered without guardrail evaluation. Violations may go undetected.
- Fail-closed — Content is blocked until the provider recovers. Safer, but may interrupt service.
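The circuit breaker and fail modes described above can be sketched as follows. The failure threshold, reset timeout, class and function names, and the `fail_open` / `fail_closed` identifiers are all hypothetical. Treating a failed call the same way as an open breaker is also an assumption of this sketch, not documented behavior.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Opens after repeated failures; allows a test request after a timeout."""

    def __init__(
        self,
        failure_threshold: int = 3,    # hypothetical value
        reset_timeout: float = 30.0,   # hypothetical value, in seconds
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def allow_request(self) -> bool:
        if self.opened_at is None:
            return True  # breaker is closed
        # Open: only let a request through once the reset timeout has elapsed.
        return self.clock() - self.opened_at >= self.reset_timeout

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()


def may_deliver(
    content: str,
    breaker: CircuitBreaker,
    evaluate: Callable[[str], bool],
    fail_mode: str,
) -> bool:
    """Return True if the content may be delivered to the end user."""
    if not breaker.allow_request():
        return fail_mode == "fail_open"  # breaker open: apply the fail mode
    try:
        allowed = evaluate(content)
    except ConnectionError:
        breaker.record_failure()
        return fail_mode == "fail_open"  # assumption: same handling as open
    breaker.record_success()
    return allowed
```

With `fail_closed`, an outage blocks all content until a test request succeeds; with `fail_open`, content flows through unevaluated during the same window.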
Guardrail Policies
Guardrail policies define the rules applied to agent inputs and outputs for a specific project. Select the target project from the Project dropdown in the Guardrail Policies section.
Content safety categories:
| Category | Description |
|---|---|
| Hate speech | Content promoting hatred or violence against groups based on protected characteristics. |
| Sexual content | Sexually explicit material inappropriate for the agent’s use case. |
| Violence | Graphic descriptions of violence or instructions for causing harm. |
| Self-harm | Content that encourages or instructs self-harm. |
| PII detection | Personally identifiable information in outputs: names, emails, phone numbers, SSNs. |
| Profanity | Offensive language and slurs. |
| Prompt injection | Attempts to override agent instructions through crafted inputs. |
| Off-topic | Content outside the agent’s intended domain. |
| Hallucination risk | Responses that may contain fabricated information (requires knowledge base context). |
Each category has a configurable threshold from 0.0 to 1.0. Lower thresholds flag more content; higher thresholds flag only high-confidence violations.
Create a policy:
- Select the target project from the Project dropdown.
- Click Create policy.
- Enter a policy name and description.
- Select the categories to evaluate.
- For each category, set:
  - Threshold: sensitivity level (0.0 = flag everything, 1.0 = flag only high-confidence violations).
  - Action: what happens when content is flagged.
    - Block: Prevent the response from reaching the end user. A fallback message is shown instead.
    - Warn: Deliver the response with a warning annotation visible to operators.
    - Log: Record the violation for analytics without affecting the response.
- Set whether the policy evaluates inbound messages, outbound responses, or both.
- Click Save.
Link policies to projects:
Workspace guardrail policies are available to all projects, but each project must activate them:
- Open the project in Studio.
- Go to Project settings > Guardrails.
- Toggle on the workspace guardrail policies to enforce for this project.
- Optionally adjust thresholds at the project level. Project-level thresholds override workspace defaults only when they are more restrictive.
How workspace and project guardrails interact:
- Workspace policies evaluate first. If a workspace guardrail blocks content, the project guardrail is not consulted.
- Project policies add specificity — use them for domain-specific rules, for example blocking financial advice in a customer support agent.
- The most restrictive action wins. If the workspace policy says warn but the project policy says block for the same category, the content is blocked.
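The threshold and "most restrictive action wins" rules can be illustrated with a short sketch. It assumes the provider returns a violation score between 0.0 and 1.0 and that content is flagged when the score meets or exceeds the threshold; the function names and the severity ordering constant are hypothetical.

```python
from typing import Optional

# Actions ordered from least to most restrictive.
SEVERITY = {"log": 0, "warn": 1, "block": 2}


def is_flagged(score: float, threshold: float) -> bool:
    """Assumption: content is flagged when its score reaches the threshold."""
    return score >= threshold


def effective_action(
    workspace_action: Optional[str],
    project_action: Optional[str],
) -> Optional[str]:
    """Return the most restrictive of the actions triggered for a category."""
    triggered = [a for a in (workspace_action, project_action) if a is not None]
    if not triggered:
        return None  # neither policy flagged the content
    return max(triggered, key=lambda action: SEVERITY[action])
```

For instance, `effective_action("warn", "block")` yields `"block"`, matching the example in the list above.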
Auth Profiles
Auth profiles store authentication configurations that agent tools use to call external services securely. Profiles are managed centrally at the workspace level and reused across all projects.
- Navigation: Settings > AI Configuration > Auth Profiles
- Required role: Owner or Admin
The page is organized into two tabs: All Profiles and Integrations.
All Profiles
The All Profiles tab lists all auth profiles configured for the workspace. Use the toolbar to search by profile name, filter by authentication type, or filter by status (active or inactive).
Add a profile:
Click Add Profile. Two profile types are available:
| Type | Description |
|---|---|
| Custom Profile | Configure a custom authentication scheme with manually specified credentials and settings. |
| Integration Profile | Configure authentication for a supported third-party integration. |
Supported auth types:
| Type | Description |
|---|---|
| OAuth 2.0 | Authorization code, client credentials, and implicit flows. |
| API key | Header or query parameter injection. |
| Bearer token | Static or dynamically refreshed bearer tokens. |
| Basic auth | Username and password combinations. |
| Custom | Arbitrary header or body injection for non-standard auth schemes. |
Auth profile credentials are encrypted at rest and never exposed in logs or agent responses.
Guardrail providers that require credentials must reference an auth profile. Raw API keys are not accepted directly in the guardrail provider configuration.
Integrations
The Integrations tab lists auth profiles configured for supported third-party integrations. Create these using the Integration Profile option from the Add Profile menu.
Using Auth Profiles
Once created, reference an auth profile by name in tool configurations and guardrail provider settings. The platform resolves the profile at runtime and injects the correct credentials without exposing them in agent definitions, logs, or traces.
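Conceptually, runtime injection turns a stored profile into request headers or query parameters. The sketch below shows that idea for three of the supported auth types. The profile dictionary shape and the `build_auth` function are hypothetical, since the platform's internal profile format is not documented here. The `Authorization` header formats for Bearer and Basic auth are the standard HTTP ones.

```python
import base64
from typing import Dict, Tuple


def build_auth(profile: Dict[str, str]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (headers, query_params) to inject for a hypothetical profile."""
    headers: Dict[str, str] = {}
    query: Dict[str, str] = {}
    kind = profile["type"]
    if kind == "bearer":
        headers["Authorization"] = f"Bearer {profile['token']}"
    elif kind == "basic":
        raw = f"{profile['username']}:{profile['password']}".encode("utf-8")
        headers["Authorization"] = "Basic " + base64.b64encode(raw).decode("ascii")
    elif kind == "api_key":
        # API keys can be injected as a header or as a query parameter.
        target = query if profile.get("in") == "query" else headers
        target[profile["name"]] = profile["key"]
    else:
        raise ValueError(f"unsupported auth type: {kind}")
    return headers, query
```

Because the tool configuration only carries the profile name, the secret values appear solely in the outgoing request and never in the agent definition.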