Best Practices and Points to Remember
Generative Answers
Generative answers provide the best user experience and accuracy, especially for multilingual use cases.
For Generative Answers, an LLM with a 16k context window and the top 10 chunks is recommended.
Edit the Generative AI prompt to suit your use case, for example by describing the use case and defining how you want the LLM to answer (“Answer like a helpful assistant”, “Answer like a customer service representative”, etc.). You can also describe the profile of the person asking the question to further improve the LLM response.
Avoid asking the LLM to make logical deductions or perform mathematical calculations.
Response time for answers depends on the number of business rules, the total content indexed, the chunk size, the number of chunks sent to the LLM, the prompt, the LLM used, etc. Typically, you can expect 5-10 seconds per query in our cloud.
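The chunk and prompt recommendations above can be sketched in code. This is a minimal illustration only: `Chunk`, `select_top_chunks`, `build_prompt`, and the prompt wording are hypothetical names and text chosen for this example, not part of any product API, and the relevance score is assumed to be "higher is better".

```python
from dataclasses import dataclass
from typing import List

# Hypothetical constant: number of chunks sent to the LLM,
# following the "top 10 chunks" recommendation.
TOP_K = 10


@dataclass
class Chunk:
    text: str
    score: float  # assumed retrieval relevance score, higher is better


def select_top_chunks(chunks: List[Chunk], k: int = TOP_K) -> List[Chunk]:
    """Return the k highest-scoring chunks, best first."""
    return sorted(chunks, key=lambda c: c.score, reverse=True)[:k]


def build_prompt(use_case: str, answer_style: str, user_profile: str,
                 chunks: List[Chunk], question: str) -> str:
    """Assemble a prompt that states the use case, the desired answer
    style, and the profile of the person asking."""
    context = "\n\n".join(
        f"[{i}] {c.text}" for i, c in enumerate(chunks, start=1)
    )
    return (
        f"Use case: {use_case}\n"
        f"Answer like {answer_style}.\n"
        f"The person asking is {user_profile}.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

For example, `build_prompt("IT help desk", "a helpful assistant", "a new employee", select_top_chunks(chunks), "How do I reset my password?")` yields a prompt that names the use case, the answer style, and the asker's profile before the context. Sending fewer chunks or a shorter prompt is one of the levers that affects response time.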
Extractive Answers
The chunk size is not customizable and depends on the format of the document.
The extraction model has known limitations and does not work with all document formats.
The extracted content in the chunks is presented to the user as is, as the answer.
Implement manual review or validation processes to ensure the correctness of the answers.
Because of the way chunks are generated for extractive answers, they may carry little or no surrounding context.
The quality of extractive answers heavily depends on the quality and relevance of the source text.
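One possible way to support the manual review recommended above is to triage extractive answers before they reach a reviewer. The sketch below is an assumption-laden illustration, not product behavior: `needs_manual_review`, the `MIN_WORDS` threshold, and the small stop-word list are all hypothetical. It flags chunks that are very short (and so may lack context) or that share no meaningful words with the question.

```python
import re

# Assumed values for illustration; tune them for your own content.
MIN_WORDS = 8
STOP_WORDS = {
    "a", "an", "the", "is", "are", "do", "i", "how", "what",
    "to", "of", "and", "my", "your", "on", "in",
}


def _content_words(text: str) -> set:
    """Lowercase alphanumeric tokens with common stop words removed."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOP_WORDS


def needs_manual_review(question: str, chunk_text: str,
                        min_words: int = MIN_WORDS) -> bool:
    """Flag an extractive answer for human review.

    Heuristics (assumptions, not product behavior):
    - the chunk is very short, so it may lack context
    - the chunk shares no content words with the question
    """
    if len(chunk_text.split()) < min_words:
        return True
    return not (_content_words(question) & _content_words(chunk_text))
```

Answers that are flagged can be routed to a reviewer, while the rest can be spot-checked. Heuristics like these only narrow down what to review; they do not establish that an answer is correct.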
When both models are enabled
Two sets of chunks are generated from the source content, one for each model, following that model's chunking strategy.
When a generative answer is presented to the user, only the chunks generated by the Generative model (Plain text extraction model) are used for the answer.
When an extractive answer is presented to the user, only the chunks generated by the Extractive model (Rule-based extraction model) are used for the answer.
The precedence of the models can be selected by the user in Answer Snippets.
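The precedence setting can be pictured with a small sketch. It assumes that precedence means "try the models in the chosen order and use the first one that produced an answer"; the fallback behavior, `AnswerModel`, and `pick_answer` are hypothetical and only illustrate the idea.

```python
from enum import Enum
from typing import Dict, List, Optional, Tuple


class AnswerModel(Enum):
    GENERATIVE = "generative"  # answers built from plain text extraction chunks
    EXTRACTIVE = "extractive"  # answers built from rule-based extraction chunks


def pick_answer(
    answers: Dict[AnswerModel, Optional[str]],
    precedence: List[AnswerModel],
) -> Optional[Tuple[AnswerModel, str]]:
    """Return (model, answer) for the first model in `precedence`
    that produced a non-empty answer, or None if none did."""
    for model in precedence:
        answer = answers.get(model)
        if answer:
            return model, answer
    return None
```

With `precedence = [AnswerModel.GENERATIVE, AnswerModel.EXTRACTIVE]`, a generative answer is used when one exists, and the extractive answer is used otherwise. Each answer still draws only on its own model's chunk set.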