Model Hub centralizes AI model management in AI for Process. Use it to connect external provider models, deploy open-source models, fine-tune models on your data, and expose models as API endpoints for use in Prompts and Workflows. Model Hub supports three model types:
| Model Type | Description | Best For |
|---|---|---|
| External Models | Commercial models from OpenAI, Anthropic, Google, Azure, Cohere, and Amazon Bedrock. Connect via guided setup or custom API endpoint. | Production workloads requiring proven reliability. |
| Open-Source Models | 30+ curated models plus any Hugging Face text generation model. Deploy platform-hosted models, import from Hugging Face, or upload model files. | Cost control, customization, data privacy. |
| Fine-Tuned Models | Custom models trained on your enterprise data. Train and deploy models for specific use cases. | Domain-specific tasks, consistent outputs. |
External Models
External Models lets you connect models hosted outside AI for Process. Once connected, models are available in Prompt Studio, Workflows, Model Traces & Analytics, Audit Logs, and Billing. You can connect external models in two ways:
- Easy Integration — Guided setup for OpenAI, Anthropic, Google, Cohere, or Amazon Bedrock.
- API Integration — Custom model connection using an endpoint URL, authentication, and request-response configuration.
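Before configuring API Integration, it can help to confirm that the endpoint URL and credentials work outside the platform. A minimal sketch in Python; the URL, header name, and request body below are placeholders, so substitute whatever your model actually expects:

```python
import requests

# Hypothetical values; replace with your model's real endpoint and auth scheme.
ENDPOINT_URL = "https://models.example.com/v1/generate"
API_KEY = "YOUR_API_KEY"

response = requests.post(
    ENDPOINT_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"prompt": "Say hello.", "max_tokens": 32},  # body shape varies by provider
    timeout=30,
)
response.raise_for_status()
# Inspect the response shape before mapping it in the request-response configuration.
print(response.json())
```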
Open-Source Models
Open-Source Models lets you deploy platform-hosted models, import models from Hugging Face, or upload model files from your local machine. Platform-hosted models can be optimized before deployment using CTranslate2 or vLLM. There are three ways to add a model:
- Deploy a platform-hosted model — Select from 30+ supported open-source models and optionally apply optimization.
- Deploy from Hugging Face — Import any public or private Hugging Face model using a connected account.
- Import from local files — Upload a base model or adapter model as a `.zip` file.
Fine-Tuned Models
Fine-Tuned Models lets you train a custom model on your data and deploy it for use in Prompts and Workflows. The fine-tuning process covers:
- General details — name, description, tags.
- Base model — select a platform-hosted model or import from Hugging Face.
- Fine-tuning configuration — type (Full fine-tune, LoRA, or QLoRA), epochs, batch size, learning rate.
- Datasets — training, evaluation, and optional test dataset (JSONL, CSV, or JSON); see the dataset sketch after this list.
- Hardware — select GPU configuration.
- Weights & Biases integration (optional) — real-time monitoring of training metrics.
| Base model size (parameters) | Supported fine-tuning types |
|---|---|
| < 1B | Full fine-tune, LoRA, QLoRA |
| ≥ 1B and < 5B | LoRA, QLoRA |
| ≥ 5B and ≤ 8B | QLoRA |
Model Parameters
When deploying open-source or fine-tuned models, configure the following generation parameters:

| Parameter | Description |
|---|---|
| Temperature | Controls output randomness. Higher values produce more varied responses. |
| Maximum length | Maximum number of tokens to generate. |
| Top p | Alternative to temperature; restricts sampling to top probability mass. |
| Top k | Restricts sampling to the top k highest-probability tokens. |
| Stop sequences | Sequences at which the model stops generating. |
| Inference batch size | Number of concurrent requests to process per batch. |
| Min / Max replicas | Scaling boundaries for the deployed model. |
| Scale up / down delay | Wait time before auto-scaling up or down. |
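As an illustration, the per-request parameters in the table map naturally onto a request payload. The key names below are assumptions that mirror the table; the generated code sample for your deployment shows the exact names. Batch size, replicas, and scale delays are deployment-time settings, not request parameters:

```python
# Hypothetical parameter keys; check your deployment's generated
# cURL/Python sample for the exact names.
generation_params = {
    "temperature": 0.7,     # higher values produce more varied output
    "max_tokens": 512,      # maximum number of tokens to generate
    "top_p": 0.9,           # restrict sampling to the top probability mass
    "top_k": 40,            # sample only from the 40 most likely tokens
    "stop": ["\n\nUser:"],  # stop generating at this sequence
}
```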
Tool Calling Support
In the Text-to-Text node, models can autonomously call external workflows during execution — this is called Workflow Calling. You can attach up to three workflows to an AI node. The model determines whether to respond directly or invoke a workflow based on the request context. Workflow calling is only available for models that support it. When a compatible model is selected in the AI node, the Workflow calling available tab appears in the Properties panel.

To configure Workflow Calling:
- Set Exit node execution after to define the maximum number of model calls before exiting to the failure path.
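To illustrate the exit behavior, here is a minimal sketch of the decision loop, assuming a response shape in which the model either returns text or requests a workflow invocation. The helpers and response fields are hypothetical stand-ins for illustration, not the platform's API:

```python
def call_model(messages):
    """Hypothetical stand-in for the AI node's model call. A real model
    decides between answering directly and requesting a workflow."""
    return {"content": "Done.", "workflow_call": None}

def run_workflow(call):
    """Hypothetical stand-in for one of the attached workflows."""
    return f"workflow result for {call}"

def run_ai_node(user_message, max_model_calls=3):
    """Sketch of Workflow Calling: the model answers directly or requests a
    workflow; after max_model_calls the node exits to the failure path."""
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_model_calls):        # "Exit node execution after"
        reply = call_model(messages)
        if reply["workflow_call"] is None:  # model chose to answer directly
            return reply["content"]
        result = run_workflow(reply["workflow_call"])  # invoke attached workflow
        messages.append({"role": "tool", "content": result})
    raise RuntimeError("Max model calls reached; exiting to failure path")

print(run_ai_node("What is the order status?"))
```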
Model Selection Guide
Use this guide to choose the right model type for your use case:

| Use case | Recommended model type |
|---|---|
| Use a commercial model quickly | External Model — Easy Integration |
| Connect a custom or proprietary API | External Model — API Integration |
| Privacy-sensitive or on-premises use | Open-Source — platform-hosted |
| Specialized domain tasks | Fine-Tuned Model |
| Fine-tuning on limited resources | QLoRA (parameter-efficient) |
| High-throughput large model inference | Open-Source with vLLM optimization |
| Low-latency small model inference | Open-Source with CTranslate2 |
Structured Output
Certain open-source models can return responses as structured JSON using the `response_format` parameter. This makes outputs predictable and easy to parse downstream.
You can use structured output in two ways:
- API calls — Add the `response_format` parameter directly to your model endpoint request.
- Workflow builder — Define the JSON schema in the Text-to-Text node. AI for Process attaches it automatically.
Limitations:
- Supported on `v2/chat/completions` endpoints only. Older `v1/completions` endpoints do not support structured output.
- Not supported for models optimized with CTranslate2, fine-tuned models, Hugging Face imports, or locally imported models.
- Supported schema data types: `string`, `number`, `boolean`, `integer`, `object`, `array`, `enum`, and `anyOf`.
If a model supports both tool calls and JSON Schema, tool calls take precedence and the schema is ignored.
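A minimal sketch of requesting structured output over the API. The endpoint URL and key are placeholders, and the `response_format` payload assumes an OpenAI-style shape; verify the exact structure against the code sample generated for your deployment:

```python
import requests

# Placeholders; copy the real values from your deployed model.
ENDPOINT = "https://your-deployment.example.com/v2/chat/completions"
API_KEY = "YOUR_API_KEY"

# Uses only supported schema types (string, number, enum, ...).
schema = {
    "type": "object",
    "properties": {
        "sentiment": {"type": "string", "enum": ["positive", "negative", "neutral"]},
        "confidence": {"type": "number"},
    },
    "required": ["sentiment", "confidence"],
}

resp = requests.post(
    ENDPOINT,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "messages": [{"role": "user", "content": "I love this product!"}],
        # Assumed OpenAI-style shape; check your endpoint's generated sample.
        "response_format": {"type": "json_schema", "json_schema": {"schema": schema}},
    },
    timeout=30,
)
print(resp.json())  # response content should parse as JSON matching the schema
```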
Model Endpoint and API Keys
After deploying an open-source or fine-tuned model, AI for Process generates an API endpoint for external inferencing. The endpoint is available in three formats (cURL, Python, and Node.js). You receive an email notification when the endpoint is ready. To generate an API key:
- Open the deployed model and click the API Keys tab.
- Click Create a new API key, enter a name, and click Generate key.
- Click Copy and close to save the key.
Timeout precedence: Workflow timeout > Node timeout > Model timeout.
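Once you have a key, calling the endpoint from Python looks roughly like the sketch below. The URL, header name, and body shape are placeholders; treat the Python snippet generated for your deployment as the source of truth:

```python
import os
import requests

# Placeholder; copy the real endpoint URL from the deployed model page.
ENDPOINT = "https://your-deployment.example.com/v2/chat/completions"
API_KEY = os.environ["MODEL_HUB_API_KEY"]  # keep keys out of source code

resp = requests.post(
    ENDPOINT,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"messages": [{"role": "user", "content": "Summarize our refund policy."}]},
    timeout=30,  # client-side limit; workflow/node/model timeouts still apply server-side
)
resp.raise_for_status()
print(resp.json())
```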
Monitoring
Track model usage and performance from two locations:
- Workflow Monitor — Go to Workflows > select a deployed workflow > Workflow runs. Use the Model runs tab to view AI node executions, response times (P90, P99), failure rates, and model-level metrics.
- Model Traces & Analytics — Available for external models connected via Easy Integration or API Integration. Tracks token usage, latency, and billing per connection.