Evaluation Metrics let supervisors define and monitor performance indicators for measuring the quality of agent-customer interactions. The system supports eight measurement types, each designed for specific evaluation needs using AI-driven analysis or rule-based methods.

Key Benefits

| Benefit | Description |
| --- | --- |
| AI-powered intelligence | GenAI-based adherence reduces dependency on large training datasets |
| Comprehensive coverage | Eight measurement types address diverse evaluation scenarios |
| Multilingual support | Enhanced support across languages and interactions |
| Automated QA | Reduces manual review workload through intelligent analysis |
| Real-time validation | API integration ensures data accuracy and compliance |
| Flexible configuration | Static and dynamic evaluation options |

Access Evaluation Metrics

Navigate to Quality AI > CONFIGURE > Evaluation Forms > Evaluation Metrics. The dashboard shows:

| Column | Description |
| --- | --- |
| Name | Metric name |
| Metric Type | Measurement type |
| Evaluation Forms | Associated evaluation forms |
| Ellipsis icon | Edit and delete options |
| Search | Quick search to find metrics |
| New Evaluation Metrics | Option to create new metrics |

Create a New Evaluation Metric

  1. Select the Evaluation Metrics tab.
  2. Select + New Evaluation Metric.
  3. Choose and configure your measurement type.

Key Configuration Options

| Option | Description |
| --- | --- |
| Metric name | Descriptive identifier for future reference |
| Language | Multi-language support configuration |
| Evaluation question | Reference prompt for audits and interaction reviews |
| Adherence type | Static (universal) or Dynamic (trigger-based) detection |

Detection Methods

| Feature | GenAI-Based | Deterministic |
| --- | --- | --- |
| Mechanism | LLM contextual understanding | Semantic similarity matching |
| Training | Zero-shot prompts | Sample utterance training |
| Flexibility | High contextual adaptation | Precise pattern recognition |
| Setup | Description-based | Utterance-based |
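As a rough illustration of the deterministic path, the sketch below scores an agent utterance against trained sample utterances by cosine similarity and accepts it once any sample clears a configurable threshold. The function names and the toy three-dimensional vectors are hypothetical stand-ins for a real embedding model, not the product's implementation.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def matches_sample_utterances(utterance_vec, sample_vecs, threshold=0.8):
    """Deterministic detection: the utterance counts as adherent if its
    embedding is close enough to any trained sample utterance."""
    return any(cosine_similarity(utterance_vec, s) >= threshold
               for s in sample_vecs)

# Toy 3-dimensional "embeddings" standing in for a real embedding model.
samples = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]
print(matches_sample_utterances([0.85, 0.15, 0.05], samples))  # True
print(matches_sample_utterances([0.0, 0.1, 0.9], samples))     # False
```

Raising or lowering `threshold` is what the "flexible thresholds" option amounts to: stricter values demand closer matches to the trained samples.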

Measurement Types

By Question

Evaluates adherence to specific questions asked or answered during interactions. Key features:
  • Static Adherence: applies to all conversations
  • Dynamic Adherence: conditional evaluation triggered by specific events
  • GenAI Detection: contextual understanding with no training samples required
  • Deterministic Detection: semantic matching against predefined patterns
  • Flexible thresholds: set different similarity scores per use case
Common use cases: Script adherence, greeting compliance, policy verification, response quality. For full configuration details, see By Question.

By Speech

Analyzes speech characteristics during voice interactions. Key features:
  • Crosstalk: detects overlapping speech with configurable thresholds
  • Dead Air: monitors silence periods (configurable duration)
  • Speaking Rate: tracks Words Per Minute (WPM)
Use cases: Voice quality, conversation flow analysis, speaking pace optimization. For full configuration details, see By Speech.
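The speech measures above reduce to simple arithmetic over timestamped turns. In this sketch, an assumed `words_per_minute` helper derives WPM from a transcript and its duration, and `dead_air_segments` flags silent gaps between turns that exceed a configurable limit; both names and the turn format are illustrative, not the product's API.

```python
def words_per_minute(transcript, duration_seconds):
    """Speaking rate in WPM over an utterance of known duration."""
    word_count = len(transcript.split())
    return word_count * 60.0 / duration_seconds

def dead_air_segments(turns, max_silence=5.0):
    """Flag silent gaps longer than max_silence seconds between
    consecutive turns.  Each turn is (start_time, end_time) in seconds."""
    gaps = []
    for (_, prev_end), (next_start, _) in zip(turns, turns[1:]):
        if next_start - prev_end > max_silence:
            gaps.append((prev_end, next_start))
    return gaps

print(words_per_minute("thank you for calling how can I help", 4.0))  # 120.0
print(dead_air_segments([(0.0, 4.0), (12.5, 15.0), (15.5, 20.0)]))    # [(4.0, 12.5)]
```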

By Value

Verifies customer-specific information shared by an agent against trusted data sources. Key features:
  • API integration: real-time verification with CRM and external systems
  • Business rules engine: five rule types (first/last value, negotiated, strict matching, custom)
  • Compliance tracking: detects deviations from expected values
  • Audit trails: logs validation results for supervisory review
Use cases: Pricing accuracy, interest rate verification, account balance confirmation, compliance validation. For full configuration details, see By Value.
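To make the rule types concrete, here is a minimal, hypothetical sketch of how first/last/strict matching might decide whether the values an agent quoted agree with a trusted source; the `verify_quoted_value` helper is an assumption for illustration, not the product's rules engine.

```python
def verify_quoted_value(quoted, expected, rule="strict"):
    """Apply a simplified business rule to the values an agent quoted.

    rule="strict": every quoted value must match the trusted source.
    rule="first" / rule="last": only that occurrence must match, e.g.
    when a later corrected value supersedes an earlier misstatement.
    """
    if not quoted:
        return False
    if rule == "strict":
        return all(v == expected for v in quoted)
    if rule == "first":
        return quoted[0] == expected
    if rule == "last":
        return quoted[-1] == expected
    raise ValueError(f"unknown rule: {rule}")

# The agent misquoted the interest rate first, then corrected it.
quoted_rates = ["5.2%", "4.9%"]
print(verify_quoted_value(quoted_rates, "4.9%", rule="strict"))  # False
print(verify_quoted_value(quoted_rates, "4.9%", rule="last"))    # True
```

In a real deployment the `expected` value would come from a live CRM or API lookup rather than a constant.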

By Dialog Task

Assesses completion and quality of specific tasks or workflows within a conversation. Key features:
  • Dialog agent selection: choose which dialog agent to evaluate
  • Evaluation scope: entire conversation or time-bound segment
  • Time parameters: configurable in seconds (voice) or message count (chat)
Use cases: Workflow adherence, task completion verification, dialog flow optimization. For full configuration details, see By Dialog Task.

By Playbook Adherence

Measures how well interactions follow predefined playbooks or procedures. Key features:
  • Entire Playbook: assesses adherence across all playbook components
  • Specific Steps: targets evaluation at specific stages or steps
  • Percentage thresholds: define minimum adherence levels required
Use cases: Process compliance, procedure adherence, enforcement of standards. For full configuration details, see By Playbook Adherence.
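A percentage threshold of this kind reduces to simple arithmetic; the sketch below, with a hypothetical `playbook_adherence` helper, computes the share of playbook steps followed and compares it with a configurable minimum.

```python
def playbook_adherence(steps_followed, total_steps, threshold_pct=80):
    """Percentage of playbook steps followed vs. a minimum adherence level.
    Returns the percentage and whether it meets the threshold."""
    pct = 100.0 * steps_followed / total_steps
    return pct, pct >= threshold_pct

print(playbook_adherence(7, 10))  # (70.0, False) — below the default 80% bar
```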

By AI Agent

Uses AI Agents for sophisticated, multistep evaluations with autonomous decision-making. Key features:
  • Complex analysis: multi-step reasoning across conversation elements
  • Domain expertise: supports specialized evaluation contexts (compliance, technical support)
  • Contextual understanding: nuanced evaluation requiring full conversation context
  • Advanced decision-making: goes beyond pattern matching for judgment calls
Use cases: Complex compliance assessments, technical troubleshooting evaluation, sophisticated quality analysis. For full configuration details, see By AI Agent.

By Manual Evaluation

Manual Evaluation metrics enable QA teams to assess agent performance through human-led reviews, especially in scenarios where automated detection is less reliable. QA managers configure these metrics in the evaluation form, assigning weights in points only. Key features:
  • Human-Driven Assessment: metrics are evaluated exclusively by QA auditors without Auto QA involvement
  • Points-Based Only: available only within points-based evaluation forms to ensure accurate scoring allocation
  • No AI Dependency: independent of GenAI, deterministic detection, triggers, and adherence thresholds
  • Clear Visual Identification: displays distinctly across Audit screens, Conversation Mining, Heatmaps, and Reports with the suffix (Manual Evaluation Metric)
Use cases: Manual Evaluation is ideal for assessing complex soft skills (such as tone, empathy, and negotiation), regulatory scenarios requiring human judgment, dispute handling quality, escalation decisions, and high-risk or edge-case interactions. For full configuration details, see By Manual Evaluation.

By Hold

Evaluates how effectively agents manage customer hold scenarios during voice interactions, ensuring proper communication, timing, and resumption behavior. Key features:
  • Static Adherence: applies consistently to all conversations with hold events
  • Event-driven Evaluation: triggers automatically when hold events occur via telephony integration
  • Multi-instance Detection: evaluates multiple hold events within a single interaction
  • GenAI Detection: contextual, flexible evaluation using LLM-based understanding
  • Deterministic Detection: embedding-based semantic matching against predefined utterances
  • Configurable Sub-criteria: assess hold notification, duration compliance, and call resumption
  • Flexible Thresholds: define similarity scores, hold duration limits, and evaluation windows
  • Weighted Scoring: assigns percentage-based contributions to each sub-criterion
Use cases: Hold etiquette compliance, agent coaching, customer experience improvement, regulatory adherence, and interaction quality monitoring during hold scenarios. For full configuration details, see By Hold.
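Weighted scoring over sub-criteria can be illustrated as follows; the sub-criterion names and the `hold_score` helper are assumptions for the sketch, which sums the weights of the passed checks after confirming the weights total 100%.

```python
def hold_score(results, weights):
    """Weighted score for one hold event.

    results: pass/fail outcome per sub-criterion, e.g.
             {"notification": True, "duration": False, "resumption": True}
    weights: percentage contribution of each sub-criterion (must total 100).
    """
    if sum(weights.values()) != 100:
        raise ValueError("sub-criterion weights must total 100%")
    return sum(weights[name] for name, passed in results.items() if passed)

score = hold_score(
    {"notification": True, "duration": False, "resumption": True},
    {"notification": 30, "duration": 40, "resumption": 30},
)
print(score)  # 60 — the failed duration check forfeits its 40%
```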

Edit or Delete Evaluation Metrics

  1. Search for and select the metric.
  2. Select the three-dot menu (⋮) next to the metric name.
  3. Select Edit to modify or Delete to remove.
  4. For percentage-based metrics, adjust weights so they total 100%.
  5. Select Update to save changes.
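Step 4's constraint that percentage-based weights total 100% is easy to check programmatically; this small sketch (with a hypothetical `validate_weights` helper and made-up metric names) raises if the configured weights do not sum to 100.

```python
def validate_weights(weights, tolerance=1e-9):
    """Check that percentage-based metric weights total exactly 100%."""
    total = sum(weights.values())
    if abs(total - 100) > tolerance:
        raise ValueError(f"weights total {total}%, expected 100%")
    return True

print(validate_weights({"greeting": 20, "script": 50, "closing": 30}))  # True
```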