Model Hub centralizes AI model management in AI for Process. Use it to connect external provider models, deploy open-source models, fine-tune models on your data, and expose models as API endpoints for use in Prompts and Workflows.
| Model Type | Description | Best For |
| --- | --- | --- |
| External Models | Commercial models from OpenAI, Anthropic, Google, Azure, Cohere, and Amazon Bedrock. Connect via guided setup or custom API endpoint. | Production workloads requiring proven reliability. |
| Open-Source Models | 30+ curated models plus any Hugging Face text generation model. Deploy platform-hosted models, import from Hugging Face, or upload model files. | Cost control, customization, data privacy. |
| Fine-Tuned Models | Custom models trained on your enterprise data. Train and deploy models for specific use cases. | Domain-specific tasks, consistent outputs. |

External Models

External Models lets you connect to models hosted outside AI for Process. Once connected, models are available in Prompt Studio, Workflows, Model Traces & Analytics, Audit Logs, and Billing. You can connect external models in two ways:
  • Easy Integration — Guided setup for OpenAI, Anthropic, Google, Cohere, or Amazon Bedrock.
  • API Integration — Custom model connection using an endpoint URL, authentication, and request-response configuration.
For the full list of supported providers and model variants, see Supported Models. To add an external model, go to Models > External Models > Add a model and follow the setup wizard for Easy Integration or API Integration. For complete setup steps, see External Models.
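An API Integration connection combines the three pieces named above: an endpoint URL, authentication, and a request-response configuration. As a rough illustration only (the field names below are hypothetical; the real configuration is entered in the setup wizard, not authored as JSON), such a connection might look like this:

```python
import json

# Hypothetical sketch of an API Integration connection definition.
# All field names are illustrative assumptions, not the platform schema.
connection = {
    "name": "my-custom-llm",
    "endpoint_url": "https://llm.example.com/v1/generate",
    "auth": {"type": "bearer_token", "token": "<YOUR_TOKEN>"},
    "request": {
        # {{prompt}} marks where the user prompt is substituted.
        "body": {"prompt": "{{prompt}}", "max_tokens": 256},
    },
    "response": {
        # Path to the generated text inside the provider's JSON response.
        "output_path": "choices.0.text",
    },
}

print(json.dumps(connection["request"]["body"]))
```

The key idea is the mapping: the platform substitutes the prompt into the request body you define and extracts the completion from the path you specify in the response.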

Open-Source Models

Open-Source Models lets you deploy platform-hosted models, import models from Hugging Face, or upload model files from your local machine. Platform-hosted models can be optimized before deployment using CTranslate2 or vLLM. Three ways to add a model:
  • Deploy a platform-hosted model — Select from 30+ supported open-source models and optionally apply optimization.
  • Deploy from Hugging Face — Import any public or private Hugging Face model using a connected account.
  • Import from local files — Upload a base model or adapter model as a .zip file.
For the full list of platform-hosted models, see Supported Models. For complete deployment steps, see Open-Source Models.

Fine-Tuned Models

Fine-Tuned Models lets you train a custom model on your data and deploy it for use in Prompts and Workflows. The fine-tuning process covers:
  1. General details — name, description, tags.
  2. Base model — select a platform-hosted model or import from Hugging Face.
  3. Fine-tuning configuration — type (Full fine-tune, LoRA, or QLoRA), epochs, batch size, learning rate.
  4. Datasets — training, evaluation, and optional test dataset (JSONL, CSV, or JSON).
  5. Hardware — select GPU configuration.
  6. Weights & Biases integration (optional) — real-time monitoring of training metrics.
Fine-tuning types supported by model size:
| Base model parameters | Supported fine-tuning types |
| --- | --- |
| < 1B | Full fine-tune, LoRA, QLoRA |
| ≥ 1B and < 5B | LoRA, QLoRA |
| ≥ 5B and ≤ 8B | QLoRA |
For complete fine-tuning steps, see Fine-Tuned Models.
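Training and evaluation datasets can be supplied as JSONL, CSV, or JSON. A common JSONL layout is one prompt-completion pair per line; the field names below are illustrative assumptions, since the exact schema depends on your fine-tuning setup:

```python
import json

# Illustrative prompt/completion records; the field names are an
# assumption, not a documented platform schema.
records = [
    {"prompt": "Summarize: The invoice is overdue by 30 days.",
     "completion": "Invoice overdue; 30 days past due."},
    {"prompt": "Classify sentiment: Great support experience!",
     "completion": "positive"},
]

# Serialize to JSONL: one JSON object per line.
jsonl = "\n".join(json.dumps(r, ensure_ascii=False) for r in records)

# Validate that every line parses back to a dict with both fields.
parsed = [json.loads(line) for line in jsonl.splitlines()]
assert all({"prompt", "completion"} <= r.keys() for r in parsed)
```

Validating each line parses independently is worthwhile: a single malformed line is the most common reason a JSONL dataset upload fails.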

Model Parameters

When deploying open-source or fine-tuned models, configure the following generation parameters:
| Parameter | Description |
| --- | --- |
| Temperature | Controls output randomness. Higher values produce more varied responses. |
| Maximum length | Maximum number of tokens to generate. |
| Top p | Alternative to temperature; restricts sampling to the top cumulative probability mass. |
| Top k | Restricts sampling to the k highest-probability tokens. |
| Stop sequences | Sequences at which the model stops generating. |
| Inference batch size | Number of concurrent requests processed per batch. |
| Min / Max replicas | Scaling boundaries for the deployed model. |
| Scale up / down delay | Wait time before auto-scaling up or down. |
For external models added via API Integration, you define the request body and map variables directly in the configuration.
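When calling a deployed model's endpoint directly, these generation parameters typically travel in the request body. A minimal sketch, assuming common OpenAI-style field names (which may differ from the platform's exact request schema):

```python
def build_generation_config(temperature=0.7, max_tokens=256, top_p=0.9,
                            top_k=40, stop=None):
    """Assemble a generation-parameter payload; names are illustrative."""
    config = {
        "temperature": temperature,   # higher => more varied output
        "max_tokens": max_tokens,     # maximum tokens to generate
        "top_p": top_p,               # nucleus sampling probability mass
        "top_k": top_k,               # sample only from the top-k tokens
    }
    if stop:
        config["stop"] = stop         # sequences that end generation
    return config

payload = build_generation_config(temperature=0.2, stop=["###"])
```

Lower temperature with an explicit stop sequence, as in the usage line, is a typical configuration for deterministic, parseable outputs.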

Tool Calling Support

In the Text-to-Text node, models can autonomously call external workflows during execution; this capability is called Workflow Calling. You can attach up to three workflows to an AI node, and the model decides, based on the request context, whether to respond directly or invoke a workflow. Workflow Calling is available only for models that support it: when a compatible model is selected in the AI node, the Workflow calling available tab appears in the Properties panel. To configure Workflow Calling:
  • Set Exit node execution after to define the maximum number of model calls before the node exits to the failure path.
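Conceptually, the Exit node execution after limit bounds how many times the model may be invoked before the node takes the failure path. A toy sketch of that control flow (not platform code; names are invented for illustration):

```python
def run_ai_node(model_call, max_calls):
    """Toy sketch of an AI node with a model-call limit.

    model_call() returns either ("answer", text) or ("workflow", name).
    If the model keeps requesting workflows past the limit, the node
    exits to the failure path.
    """
    for _ in range(max_calls):
        kind, value = model_call()
        if kind == "answer":
            return ("success", value)
        # A workflow was invoked; its result feeds the next model call.
    return ("failure", "exceeded model-call limit")

# A model that always requests a workflow never produces an answer:
always_workflow = lambda: ("workflow", "lookup_order")
result = run_ai_node(always_workflow, max_calls=3)
```

The limit protects against a model that loops on workflow invocations without ever producing a final answer.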

Model Selection Guide

Use this guide to choose the right model type for your use case:
| Use case | Recommended model type |
| --- | --- |
| Use a commercial model quickly | External Model — Easy Integration |
| Connect a custom or proprietary API | External Model — API Integration |
| Privacy-sensitive or on-premises use | Open-Source — platform-hosted |
| Specialized domain tasks | Fine-Tuned Model |
| Fine-tuning on limited resources | QLoRA (parameter-efficient) |
| High-throughput large model inference | Open-Source with vLLM optimization |
| Low-latency small model inference | Open-Source with CTranslate2 |

Structured Output

Certain open-source models can return responses as structured JSON using the response_format parameter. This makes outputs predictable and easy to parse downstream. You can use structured output in two ways:
  • API calls — Add the response_format parameter directly to your model endpoint request.
  • Workflow builder — Define the JSON schema in the Text-to-Text node. AI for Process attaches it automatically.
Requirements:
  • Supported on v2/chat/completions endpoints only. Older v1/completions endpoints do not support structured output.
  • Not supported for models optimized with CTranslate2, fine-tuned models, Hugging Face imports, or locally imported models.
  • Supported schema data types: string, number, boolean, integer, object, array, enum, and anyOf.
If a model supports both tool calls and JSON Schema, tool calls take precedence and the schema is ignored.
For the list of models that support structured output, see Supported Models for Structured Output.
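In an API call, the response_format parameter carries a JSON Schema describing the expected output. A hedged sketch of such a request body, assuming the common OpenAI-style json_schema wrapper (the exact wrapper fields on your endpoint may differ):

```python
# Illustrative request body for a v2/chat/completions call with
# structured output; the wrapper field layout is an assumption.
request_body = {
    "messages": [{"role": "user", "content": "Extract the invoice fields."}],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "invoice",
            "schema": {
                "type": "object",
                "properties": {
                    "invoice_id": {"type": "string"},
                    "total": {"type": "number"},
                    "paid": {"type": "boolean"},
                },
                "required": ["invoice_id", "total"],
            },
        },
    },
}

# Only the documented schema types should appear in properties.
SUPPORTED = {"string", "number", "boolean", "integer", "object", "array"}
props = request_body["response_format"]["json_schema"]["schema"]["properties"]
assert all(p["type"] in SUPPORTED for p in props.values())
```

In the Workflow builder, you define only the inner schema in the Text-to-Text node; the platform attaches the wrapper for you.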

Model Endpoint and API Keys

After deploying an open-source or fine-tuned model, AI for Process generates an API endpoint for external inference. The endpoint is available in three formats (cURL, Python, and Node.js), and you receive an email notification when it is ready. To generate an API key:
  1. Open the deployed model and click the API Keys tab.
  2. Click Create a new API key, enter a name, and click Generate key.
  3. Click Copy and close to save the key.
API keys are scoped per deployment. For external models, each connection can have its own API key, and usage is tracked independently per connection.
Model endpoint timeout: Set a timeout between 30 and 180 seconds (default: 60 seconds). If a request does not complete within the limit, the endpoint returns a timeout error.
Timeout precedence: Workflow timeout > Node timeout > Model timeout.
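Calling the generated endpoint from outside the platform amounts to an authenticated HTTP POST. A standard-library sketch follows; the URL, header scheme, and body fields are placeholders, so copy the real values from the cURL/Python/Node.js snippets on the model's endpoint page:

```python
import json
import urllib.request

def make_inference_request(endpoint_url, api_key, prompt, timeout=60):
    """Build (not send) a POST request for a deployed model endpoint.

    URL, header scheme, and body shape are placeholder assumptions;
    the real ones come from the model's generated endpoint snippets.
    The timeout must fall in the documented 30 to 180 second range.
    """
    if not 30 <= timeout <= 180:
        raise ValueError("timeout must be between 30 and 180 seconds")
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",  # placeholder scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = make_inference_request("https://example.com/v2/chat/completions",
                             "MY_KEY", "Hello")
# Send with: urllib.request.urlopen(req, timeout=60)
```

Keeping the client-side timeout aligned with the model-endpoint timeout avoids the client giving up while the endpoint is still within its configured limit.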

Monitoring

Track model usage and performance from two locations:
  • Workflow Monitor — Go to Workflows > select a deployed workflow > Workflow runs. Use the Model runs tab to view AI node executions, response times (P90, P99), failure rates, and model-level metrics.
  • Model Traces & Analytics — Available for external models connected via Easy Integration or API Integration. Tracks token usage, latency, and billing per connection.
Use the Weights & Biases integration to monitor fine-tuning metrics such as training loss, validation loss, and hardware utilization in real time.