Prompts and LLM Configuration

Use LlmModel and Prompt to control which model your agent uses, how it generates responses, and what instructions it follows.

Prerequisites

  • AgenticAI Core SDK installed and configured.
  • A valid connection configured for your LLM provider (OpenAI, Anthropic, or Azure OpenAI).

Configure the LLM model

Basic configuration

from agenticai_core.designtime.models.llm_model import LlmModel, LlmModelConfig

llm = LlmModel(
    model="gpt-4o",
    provider="Open AI",
    connection_name="Default Connection",
    max_timeout="60 Secs",
    max_iterations="25",
    modelConfig=LlmModelConfig(
        temperature=0.7,
        max_tokens=1600,
        top_p=1.0,
        frequency_penalty=0.0,
        presence_penalty=0.0
    )
)

Builder pattern

Use LlmModelBuilder and LlmModelConfigBuilder for a fluent configuration style:

from agenticai_core.designtime.models.llm_model import (
    LlmModelBuilder, LlmModelConfigBuilder
)

# Build config
config_dict = LlmModelConfigBuilder() \
    .set_temperature(0.7) \
    .set_max_tokens(1600) \
    .set_top_p(0.9) \
    .build()

config = LlmModelConfig(**config_dict)

# Build model
llm_dict = LlmModelBuilder() \
    .set_model("gpt-4o") \
    .set_provider("Open AI") \
    .set_connection_name("Default") \
    .set_model_config(config) \
    .build()

llm = LlmModel(**llm_dict)
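
As the ** unpacking above suggests, build() returns a plain dict, so you can adjust fields before constructing the final object:

# Override one field of the built dict before constructing the config.
# (Assumes build() returns a plain dict, as the unpacking above implies.)
config = LlmModelConfig(**{**config_dict, "temperature": 0.2})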

Supported providers

OpenAI

llm = LlmModel(
    model="gpt-4o",
    provider="Open AI",
    connection_name="OpenAI Connection",
    modelConfig=LlmModelConfig(
        temperature=0.7,
        max_tokens=1600,
        frequency_penalty=0.0,
        presence_penalty=0.0,
        top_p=1.0
    )
)

Anthropic (Claude)

llm = LlmModel(
    model="claude-3-5-sonnet-20240620",
    provider="Anthropic",
    connection_name="Anthropic Connection",
    modelConfig=LlmModelConfig(
        temperature=1.0,
        max_tokens=1024,
        top_p=0.7,
        top_k=5  # Anthropic-specific
    )
)

Azure OpenAI

llm = LlmModel(
    model="gpt-4",
    provider="Azure OpenAI",
    connection_name="Azure Connection",
    modelConfig=LlmModelConfig(
        temperature=0.8,
        max_tokens=2048
    )
)

LLM parameters

Temperature (0.0–2.0)

Controls output randomness. Lower values produce more predictable responses; higher values produce more varied ones.
Range | Behavior | Use for
--- | --- | ---
0.0–0.3 | Deterministic, focused | Factual queries, data extraction
0.4–0.7 | Balanced | General-purpose agents
0.8–1.5 | Creative, diverse | Brainstorming, content generation
1.6–2.0 | Highly random | Experimental use cases

# Factual task
config = LlmModelConfig(temperature=0.1)

# Balanced
config = LlmModelConfig(temperature=0.7)

# Creative
config = LlmModelConfig(temperature=1.2)

Max tokens

Sets the maximum number of tokens the model generates per response.
Response type | Recommended range
--- | ---
Short answers | 500–1000
Detailed responses | 1000–2000
Long-form content | 2000–4000

config = LlmModelConfig(
    max_tokens=1600  # Moderate response length
)

Top P (0.0–1.0)

Nucleus sampling parameter: controls the pool of tokens the model samples from.

  • 0.1–0.5: Focused, less diverse sampling.
  • 0.6–0.9: Balanced diversity.
  • 0.95–1.0: Maximum diversity.

config = LlmModelConfig(top_p=0.9)

Penalties (−2.0 to 2.0)

Reduce repetition in responses:

  • frequency_penalty: Penalizes tokens in proportion to how often they have already appeared in the output.
  • presence_penalty: Penalizes tokens that have appeared at all, encouraging the model to introduce new topics.

config = LlmModelConfig(
    frequency_penalty=0.5,  # Penalize frequent tokens
    presence_penalty=0.3    # Encourage topic diversity
)

Configure prompts

System prompt

Sets the base role for the agent:

from agenticai_core.designtime.models.prompt import Prompt

prompt = Prompt(
    system="You are a helpful assistant."
)

Custom prompt

Provides detailed instructions and context beyond the system role:

prompt = Prompt(
    system="You are a helpful assistant.",
    custom="""You are an intelligent banking assistant designed to help
    customers manage their financial needs efficiently and securely.

    ## Your Capabilities
    - Check account balances
    - Process transactions
    - Answer banking policy questions
    - Provide loan information

    ## Customer Context
    You have access to:
    {{memory.accountInfo.accounts}}

    Use this information for quick responses.
    """
)

Instructions

Pass structured rules as a list. Use instructions for compliance, tone, and handling guidelines, especially in sensitive domains:

prompt = Prompt(
    system="You are a banking assistant.",
    custom="Help customers with account management.",
    instructions=[
        """### Security Protocols
        - Never ask for passwords, PINs, or CVV numbers
        - If request seems suspicious, politely decline""",

        """### Speaking Style
        - Use natural, conversational language
        - Keep responses concise
        - Provide key information first""",

        """### Handling Requests
        1. Greet the customer warmly
        2. Identify their need
        3. Execute the request efficiently
        4. Summarize and ask if anything else needed"""
    ]
)

Security guidance: Always include a security instruction block for apps that handle sensitive data:

instructions=[
    """### Security
    - Never ask for passwords, PINs, CVV, or OTPs
    - Verify unusual requests
    - Escalate suspicious activity"""
]

Voice agent guidance: For voice or audio agents, add a speaking style instruction:

instructions=[
    """### Speaking Style
    - Use natural, conversational language
    - Avoid markdown formatting
    - Speak numbers clearly
    - Use pauses with commas
    - Keep responses concise"""
]

Template variables

Prompts support runtime variable substitution using {{variable}} syntax:

Variable | Description
--- | ---
{{app_name}} | Application name.
{{app_description}} | Application description.
{{agent_name}} | Current agent name.
{{memory.store.field}} | Access memory store data.
{{session_id}} | Current session identifier.

prompt = Prompt(
    custom="""You are acting as {{agent_name}} for the application "{{app_name}}".

    Application Description:
    {{app_description}}

    Customer Account Information:
    {{memory.accountInfo.accounts}}

    Use the above context to provide quick, accurate responses.
    """
)

Orchestrator prompts

For supervisor or orchestrator agents, define routing rules in the custom prompt:

supervisor_prompt = Prompt(
    system="You are a helpful assistant.",
    custom="""You are an AI Supervisor for "{{app_name}}".

    ### Your Team
    You manage multiple workers:
    - BillingAgent: Handles payments and billing
    - SupportAgent: General customer support
    - TechnicalAgent: Technical issues

    ### Routing Rules
    1. **Small-talk**: Route to user with friendly response
    2. **Direct Routing**: Match requests to worker expertise
    3. **Follow-up**: Route responses to same worker
    4. **Route to user**: When unrelated or complete
    5. **Multi-Intent**: Break into sequential requests
    """
)

Task-specific configurations

Match your LlmModelConfig to the nature of the agent's task.

Factual tasks: use a low temperature for consistent, accurate responses:

LlmModelConfig(
    temperature=0.1,  # Low for consistency
    max_tokens=800
)

Creative tasks: use a higher temperature for varied output:

LlmModelConfig(
    temperature=1.0,  # Higher for creativity
    max_tokens=2000
)

Balanced (general-purpose):

LlmModelConfig(
    temperature=0.7,
    max_tokens=1600,
    top_p=0.9
)
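
Putting it together, a minimal sketch that pairs a balanced model with a structured prompt (how these attach to an agent depends on your agent definition API and is not shown here):

from agenticai_core.designtime.models.llm_model import LlmModel, LlmModelConfig
from agenticai_core.designtime.models.prompt import Prompt

# Balanced, general-purpose model settings
llm = LlmModel(
    model="gpt-4o",
    provider="Open AI",
    connection_name="Default Connection",
    modelConfig=LlmModelConfig(temperature=0.7, max_tokens=1600, top_p=0.9)
)

# Base role plus task context and one structured rule block
prompt = Prompt(
    system="You are a helpful assistant.",
    custom='You are acting as {{agent_name}} for "{{app_name}}".',
    instructions=["### Speaking Style\n- Keep responses concise"]
)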

Optimization tips

Cost

  • Use smaller models for simple, repetitive tasks.
  • Set max_tokens to the minimum needed for the expected response length.
  • Set max_iterations to limit unnecessary tool calls.
  • Configure reasonable timeouts to avoid runaway sessions.
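
A cost-lean configuration reflecting these tips might look like the following sketch; the specific values are illustrative, not recommendations:

llm = LlmModel(
    model="gpt-4o-mini",      # a smaller, cheaper model (illustrative choice)
    provider="Open AI",
    connection_name="Default Connection",
    max_timeout="30 Secs",    # keep sessions from running away
    max_iterations="10",      # limit unnecessary tool calls
    modelConfig=LlmModelConfig(
        max_tokens=500        # minimum needed for short answers
    )
)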

Quality

  • Use the latest model versions for your provider.
  • Increase max_tokens when detailed responses are required.
  • Lower temperature for tasks that require consistency.
  • Increase max_iterations for complex multi-step workflows.
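
A quality-leaning counterpart for complex, multi-step workflows; again, the values are illustrative:

llm = LlmModel(
    model="gpt-4o",           # latest model version from your provider
    provider="Open AI",
    connection_name="Default Connection",
    max_iterations="50",      # headroom for multi-step workflows
    modelConfig=LlmModelConfig(
        temperature=0.2,      # low for consistency
        max_tokens=3000       # room for detailed responses
    )
)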