NLP training ensures your assistant accurately identifies user intent. The platform uses multiple engines, each suited to different scenarios: Machine Learning (ML), Fundamental Meaning (FM), Knowledge Graph (KG), Traits, and Ranking & Resolver.

NLP Preprocessing

Before intent detection, every utterance is preprocessed:
| Step | Description |
| --- | --- |
| Tokenization | Split the utterance into sentences, then words. Uses the TreeBank Tokenizer for English. |
| toLower() | Convert to lowercase (not for German). ML and KG engines only. |
| Stop word removal | Remove low-signal words. Uses a language-specific list; optional, disabled by default. |
| Stemming | Reduce each word to its stem (e.g., “Running” → “run”). Output may not be a real word. |
| Lemmatization | Reduce each word to its base dictionary form (e.g., “housing” → “house”). |
| N-grams | Combine co-occurring words for context (e.g., “New York City” as a tri-gram). |
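
The preprocessing steps above can be sketched in plain Python. This is a minimal illustration, not the platform's implementation: the regex tokenizer stands in for a TreeBank tokenizer, `STOP_WORDS` is a hypothetical list, `naive_stem` is a toy suffix stripper, and all function names are assumptions made for this example. Lemmatization is left out because it needs a dictionary.

```python
import re

# Hypothetical stop-word list for illustration; real lists are language-specific.
STOP_WORDS = {"a", "an", "the", "is", "am", "to", "in", "of", "i", "my"}


def tokenize(utterance):
    """Split an utterance into sentences, then words (simple regex stand-in)."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", utterance.strip()) if s]
    return [re.findall(r"[A-Za-z0-9']+", s) for s in sentences]


def naive_stem(word):
    """Strip a few common suffixes; the result may not be a real word."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            stem = word[: -len(suffix)]
            # Collapse a doubled final consonant, e.g. "runn" -> "run".
            if stem[-1] == stem[-2] and stem[-1] not in "aeiouls":
                stem = stem[:-1]
            return stem
    return word


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def preprocess(utterance, lowercase=True, remove_stop_words=False, n=2):
    """Apply the steps per sentence. Stop-word removal is off by default."""
    result = []
    for words in tokenize(utterance):
        if lowercase:  # the table notes lowercasing is skipped for German
            words = [w.lower() for w in words]
        if remove_stop_words:
            words = [w for w in words if w.lower() not in STOP_WORDS]
        result.append({
            "tokens": words,
            "stems": [naive_stem(w) for w in words],
            "ngrams": ngrams(words, n),
        })
    return result
```

Calling `preprocess("Book a flight to New York City!", remove_stop_words=True, n=3)` yields one sentence whose n-grams include `"new york city"`, mirroring the tri-gram example in the table.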

Scoping Your Assistant

Before training, define your assistant’s scope:
  1. Define the problem: what the assistant must accomplish; align with business analysts (BAs) and developers.
  2. List intents: identify key results for each; focus on user needs.
  3. Sketch example conversations: user utterances and responses; include edge cases and follow-ups.
  4. Brainstorm alternate utterances: include idioms and slang for each intent.

Choosing an Engine

| Engine | Best For |
| --- | --- |
| ML | Large corpus; diverse utterances; flexible and auto-learning. Recommended as the primary training method. |
| KG | Query-type intents; document-based answers; many intents with limited alternate utterances. |
| FM | Idiomatic or command-like sentences; acceptable tolerance for false positives. |
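
The guidance in this table can be read as a simple rule of thumb. The sketch below encodes it as a helper for planning purposes only; `IntentProfile` and `suggest_engines` are hypothetical names, not part of the platform, and the boolean flags are a simplification of the criteria listed above.

```python
from dataclasses import dataclass


@dataclass
class IntentProfile:
    """Hypothetical description of an intent set, mirroring the table's criteria."""
    large_varied_corpus: bool = False
    query_type: bool = False
    document_based_answers: bool = False
    many_intents_few_utterances: bool = False
    command_like: bool = False
    tolerates_false_positives: bool = False


def suggest_engines(profile):
    """Return engine names suggested by the table, falling back to ML."""
    engines = []
    if profile.large_varied_corpus:
        engines.append("ML")
    if (profile.query_type or profile.document_based_answers
            or profile.many_intents_few_utterances):
        engines.append("KG")
    # FM is only suggested when false positives are acceptable.
    if profile.command_like and profile.tolerates_false_positives:
        engines.append("FM")
    # ML is recommended as the primary training method, so use it as the default.
    return engines or ["ML"]
```

For example, `suggest_engines(IntentProfile(query_type=True))` returns `["KG"]`, while an empty profile falls back to `["ML"]`.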

NLP Configuration in the Platform

Go to Automation > Natural Language:
| Section | Purpose |
| --- | --- |
| Training | Add ML utterances, synonyms, concepts, and patterns. |
| Engine Tuning | Set recognition confidence levels and thresholds. |
| Advanced Settings | Configure auto-training settings and negative intent patterns. |

NLP Version 3 (default for new VAs from v10.0):
  • Improved Traits Engine accuracy.
  • Transformer and KAEN models for English; Transformer for other languages.
  • Enables Zero-shot and Few-shot ML models.
  • As of January 21, 2024, all existing VAs are on Version 3.
For per-engine training and configuration: