Support for Third-Party ASR, TTS, and Voice Biometrics

Automatic Speech Recognition (ASR)

Kore.ai supports the following third-party service providers for ASR services:

ASR	On-Prem / Cloud	Languages	Regions	Word Error Rate (WER)	Comments
Google	Cloud	https://cloud.google.com/speech-to-text/docs/speech-to-text-supported-languages	1. https://cloud.google.com/speech-to-text/v2/docs/speech-to-text-supported-languages 2. https://cloud.google.com/speech-to-text/v2/docs/locations	4-9%	1. Good for short utterances such as ‘yes’ and ‘no’. 2. Good for numeric and alphanumeric inputs (for example, IDs and SSNs). 3. Supports class tokens, so the output format can be controlled to some extent. 4. Hints and Hint-hosts are supported. 5. Extensive language support.
Deepgram	Cloud and On-Prem	https://developers.deepgram.com/docs/models-languages-overview	Supports all regions across the globe	3.44%	1. Hints are supported. 2. Custom language models can be prepared with the help of the Deepgram technical team. 3. The smart formatting feature detects the type of input (such as numbers and dates) and returns the transcription in the corresponding format.
Azure	Cloud and On-Prem	https://learn.microsoft.com/en-us/azure/ai-services/speech-service/language-support?tabs=stt	https://learn.microsoft.com/en-us/azure/ai-services/speech-service/regions?tabs=geographies	5-10%	1. Preferred ASR provider; it should be the default for new accounts. 2. Low WER and considerable flexibility with custom models. 3. Hints are supported. 4. Extensive language support. 5. Custom language models can be created and deployed on a self-service basis through the Azure portal.
Nvidia Riva (Nvidia)	On-Prem	https://docs.nvidia.com/deeplearning/riva/user-guide/docs/asr/asr-overview.html	–	6.67%
AmiVoice ASR (Advanced Media Inc.)	Cloud	https://docs.amivoice.com/en/amivoice-api/manual/supported-languages/	Although the API is accessible globally, physical processing and data storage for the cloud platform are primarily based in Japan.	N/A
Amazon Transcribe	Cloud	https://docs.aws.amazon.com/transcribe/latest/dg/supported-languages.html	https://docs.aws.amazon.com/general/latest/gr/transcribe.html	2.60%
Gnani.AI	Cloud and On-Prem	https://gnani-ai.github.io/API-service/	Gnani.ai ASR can be deployed in a region of the customer’s choice with a private cloud or on-premises deployment.	2%
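
The WER column above is easier to interpret with the metric's standard definition in mind: the number of word substitutions, deletions, and insertions needed to turn the ASR transcript into the reference transcript, divided by the number of words in the reference. The following is a minimal sketch of that calculation, not the method any listed vendor used to produce its figure. The function name `word_error_rate` and the sample sentences are illustrative assumptions, and the sketch splits on whitespace without normalizing case or punctuation, which real evaluations usually do.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words.

    Hypothetical helper for illustration only; it tokenizes on whitespace
    and applies no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the word-level edit distance between the reference
    # words processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from transcript
                curr[j - 1] + 1,     # insertion: extra word in transcript
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    reference = "my account number is one two three"
    hypothesis = "my account number is one too three"
    print(f"WER: {word_error_rate(reference, hypothesis):.2%}")
```

Because WER depends heavily on the test audio, language, and vocabulary, figures from different vendors are only comparable when measured on the same data set.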

Text to Speech (TTS)

Kore.ai supports the following third-party service providers for TTS services:

TTS	On-Prem / Cloud	Languages	Regions	Comments
Google	Cloud	https://cloud.google.com/text-to-speech/docs/list-voices-and-types	Google Cloud TTS is a cloud-based service, so it operates within Google Cloud’s global infrastructure. (https://cloud.google.com/about/locations)
Azure	Cloud and On-Prem	https://learn.microsoft.com/en-us/azure/ai-services/speech-service/language-support?tabs=stt	https://learn.microsoft.com/en-us/azure/ai-services/speech-service/regions?tabs=geographies	1. Extensive language support. 2. A large number of voices. 3. Custom voices can be prepared through the portal. 4. SSML is supported (limited to the tags that Azure supports).
OpenAI TTS	Cloud	https://platform.openai.com/docs/guides/text-to-speech/supported-languages		1. Human-like voices. 2. Limited number of voices.
ElevenLabs	Cloud	https://elevenlabs.io/docs/capabilities/text-to-speech		1. Human-like voices. 2. Speed, temperature, and stability can be controlled through call control parameters. 3. Voice cloning is possible with voice samples ranging from 30 seconds to 1 minute.
AWS	Cloud	https://docs.aws.amazon.com/polly/latest/dg/supported-languages.html	https://docs.aws.amazon.com/general/latest/gr/pol.html
Gnani.AI	Cloud and On-Prem	https://gnani-ai.github.io/API-service/
PlayHT	Cloud and On-Prem	https://play.ht/faq/
Deepgram	Cloud and On-Prem	https://developers.deepgram.com/docs/language		1. Limited number of languages. 2. Human-like voices.
Nvidia Riva TTS	On-Prem

Voice Biometrics

Kore.ai supports the following third-party service providers for voice biometrics:

Voice Biometric Vendor	Voice Biometric Engine	On-Prem / Cloud	Comments
ID R&D	ID Voice	–	–

