A snippet is a brief, highly relevant text excerpt or a paraphrased answer drawn from the data ingested in “Sources”, displayed at the top of the search results. It provides a quick answer to a user’s query, eliminating the need to sift through the search results. Answer snippets are textual answers accompanied by a link to the URL from which the snippet was extracted, that is, the source of truth.
Answer Snippet Functionality
When answer snippets are enabled for a SearchAssist application, SearchAssist extracts chunks from the ingested data using a rule-based chunking technique and token count. These chunks are then used to answer search queries using the two models discussed below. For more information on chunks and how they are extracted from the source data, refer to the Chunk Browser.
SearchAssist offers two models to provide answer snippets.
- Extractive Model: The chunk most relevant to the user query is displayed to the user. An extractive snippet is, therefore, an answer extracted from the source data and displayed as-is.
- Generative Model: In this model, SearchAssist uses third-party large language models (LLMs) to paraphrase the answer snippets. The top five chunks most relevant to the user query are sent to the third-party LLM, and the paraphrased answer received from the LLM is displayed as the answer to the user. The snippet can be derived from one or more sources of ingested data and is paraphrased into a cohesive, relevant answer.
To configure answer snippets, go to the Answer Snippets page under the Indices tab.
Answer snippets are generated on the basis of the models enabled and the priority assigned to the models. For example, if only the extractive model is enabled for an application, it will display the snippets generated from the source data.
You can enable one or both models to display answer snippets. Click the Train button after enabling Answer Snippets to generate the chunks from the ingested data. When both models are enabled, the model with the highest priority is used first to generate snippets. If it does not produce a snippet, the model next in priority is used. The priority is determined by the position of the models in the list on the left: the model at the top of the list has the highest priority. For example, in the case shown below, the Extractive model is set to a higher priority.
To change the priority of the models, use the drag (dots) icon to reorder the models as required.
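Conceptually, the fallback between models behaves like the following sketch (illustrative Python only, not SearchAssist's actual implementation; get_snippet is a hypothetical helper that returns None when a model yields no qualifying snippet):

```python
# Illustrative sketch of the priority fallback between the two snippet models.
# This is not SearchAssist's actual implementation; get_snippet() is a
# hypothetical helper that returns None when a model yields no qualifying snippet.

def answer_snippet(query, models_in_priority_order):
    for model in models_in_priority_order:   # highest priority first
        snippet = model.get_snippet(query)
        if snippet is not None:
            return snippet                   # first qualifying snippet wins
    return None                              # no model produced a snippet
```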
Extractive Model Configuration
In this model, chunks are extracted from the ingested data at design time, and relevant snippets are displayed when a user query matches the related data. Each extracted chunk has a title and a content field, stored in the index fields snippet_title and snippet_content, respectively. These fields are used to calculate a score that measures how well a snippet matches the user query. You can fine-tune the weightage given to these fields and the similarity score to choose the most suitable chunk to display.
To enable the extractive model of answer snippets, use the slider at the top of the page.
To use this model, configure the Similarity Score and Weightage for the snippets-related index fields.
The Similarity Score is the minimum expected score of the match between the user query and an extracted answer snippet. It defines how closely a snippet must match the user query to qualify as an answer. The higher the value, the closer the required match.
The Weights section can be used to assign a weightage to the snippet_title and snippet_content fields. These weights are factored into the similarity score calculated between the user query and a snippet. For example, if you assign a weightage of 9 to snippet_title and a weightage of 5 to snippet_content, a snippet whose title matches the user query well is more likely to be returned as the answer snippet.
snippet_title – Field used to save the title of an extracted snippet.
snippet_content – Field used to save the content of an extracted snippet.
If this similarity score calculated using the weights of the title and content fields is greater than or equal to the similarity threshold set above, the snippet qualifies as an answer snippet to be displayed to the user. If none of the snippets meet the similarity threshold, no snippet will be displayed.
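SearchAssist's exact scoring formula is internal; the following sketch only illustrates, assuming a simple weighted average of per-field match scores, how the field weights and the Similarity Score threshold interact:

```python
# Hypothetical sketch: assumes per-field match scores between 0 and 1 are
# already computed for snippet_title and snippet_content. The real scoring
# inside SearchAssist may differ.

def qualifies_as_answer(title_score: float, content_score: float,
                        title_weight: float, content_weight: float,
                        similarity_threshold: float) -> bool:
    # Weighted combination of the two field scores.
    combined = (title_score * title_weight + content_score * content_weight) / (
        title_weight + content_weight
    )
    # The snippet qualifies only if it meets the configured threshold.
    return combined >= similarity_threshold

# With snippet_title weighted 9 and snippet_content weighted 5, a strong title
# match dominates: (0.9*9 + 0.3*5) / 14 ≈ 0.69, which passes a 0.6 threshold.
print(qualifies_as_answer(0.9, 0.3, 9, 5, 0.6))  # True
```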
Generative Model Configuration
This model leverages the power of large language models to generate an answer snippet from the most relevant chunks. Currently, SearchAssist supports using OpenAI and Azure OpenAI LLMs to generate the answer snippets.
To enable the Generative Model for answer snippets, the application must be integrated with one or more of the supported LLMs.
For integration, go to the Integrations page under the Manage tab. Refer to this documentation for step-by-step instructions.
After you have successfully integrated one or more of the supported third-party LLMs, you can configure them to be used for generating Answer Snippets. Go to the Answer Snippet configuration page and select the model to be enabled from the drop-down menu, which lists all the configured LLMs.
To enable or disable the selected model, use the slider button. If an enabled integration is removed from the Integrations page, it is automatically disabled for answer snippets.
Set the following configuration properties for the selected integration.
Temperature: This field controls the level of randomness and creativity in answer generation. A lower value means the generated text will be more focused and consistent, whereas a higher value produces more creative and varied text. The value can range from 0.1 to 1.0 and can be incremented or decremented in steps of 0.1. The default value is 0.5.
Number of Chunks: This is the number of chunks of data sent to the LLM for answer generation. The appropriate value can vary depending on the type of input data sent to the LLM engine. This field accepts values from 1 to 5, with 5 as the default. A higher value implies better coverage and more precise answers; at the same time, processing more chunks (including making multiple API calls) can introduce additional latency and increase cost.
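As a rough illustration of how these two settings feed into answer generation, the sketch below sends the top-N chunks and the user query to an OpenAI chat model with the configured temperature. This is only an assumption-laden sketch, not SearchAssist's pipeline; the model name and the ranked_chunks input are hypothetical.

```python
# Minimal illustrative sketch, not SearchAssist's implementation.
# "Number of Chunks" caps how much context is sent; "Temperature" is passed
# directly to the chat completion call.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def generate_snippet(query: str, ranked_chunks: list[str],
                     number_of_chunks: int = 5, temperature: float = 0.5) -> str:
    context = "\n\n".join(ranked_chunks[:number_of_chunks])  # top-N chunks only
    messages = [
        {"role": "system",
         "content": "Answer strictly from the provided content; say 'I don't know' otherwise."},
        {"role": "user",
         "content": f"Content:\n{context}\n\nQuery: {query}"},
    ]
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",    # hypothetical model choice for illustration
        messages=messages,
        temperature=temperature,  # 0.1 (focused) to 1.0 (creative)
    )
    return response.choices[0].message.content
```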
Prompt Editor: The Prompt Editor allows you to tailor the prompts used for answer generation by the large language model (LLM). A prompt is an instruction sent to the LLM that guides it to return the desired response. This feature enables you to customize prompts for specific use cases, ensuring more relevant and domain-specific answers. You can define your own set of instructions, provide use-case-specific few-shot examples, and specify requirements for the response, such as tone, answer length, and verbiage.
The prompt editor comes with two built-in prompts: a Default Prompt that provides answers adhering to the information in the sources and a Multilingual Prompt that can be used to get answers from the content in multiple languages. You can edit the built-in prompts or write a new prompt.
To add a new prompt, click the Create Prompt button, enter a name and description of the prompt, and click Create. This will create a new prompt that you can edit and use for answer generation.
To edit an existing prompt, click the edit icon in the widget.
Guidelines for Prompt Customization
Prompt Structure
It is recommended that the prompt be defined as a set of messages, where each message has two properties: role and content. The role can take one of three values: ‘system’, ‘user’, or ‘assistant’. The content contains the text of the message for that role. Assigning roles to the messages establishes context and guides the model’s understanding of the inputs and the expected output. Click the info icon to view the guidelines for the prompt format.
Dynamic variables
SearchAssist provides two dynamic variables that provide contextual information to the prompt.
- chunks – Chunks selected by SearchAssist to be sent to the LLM model for answer generation.
- query – User query for which the answer is to be generated.
To use these variables in the prompt, enclose them in double curly braces. Note that it is mandatory to use both of these variables in the prompt for it to be considered a valid prompt.
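For illustration, a minimal prompt using both variables might look like the following (a hypothetical example, not one of the built-in prompts):

```json
[
  {
    "role": "system",
    "content": "You are an assistant that answers strictly from the provided content. Content: {{chunks}}"
  },
  {
    "role": "user",
    "content": "Using only the content above, answer the following query: {{query}}"
  }
]
```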
Points to remember
- Both system variables must be present in the prompt.
- The system message should define the answer format so that the generated answers include the chunk_id associated with them. To do so, add an instruction like this to the prompt: “Your response should adhere to the following format: ‘Some relevant answer[chunk_id]. Another relevant data[chunk_id].’”
Best Practices for Writing Prompts
- Add clear and detailed instructions for a precise answer.
- Provide examples.
- Specify the approximate length of the response.
Sample Answer Prompt
[ { "role": "system", "content": "You are an AI system responsible for generating answers and references based on user-provided context. The user will provide context, and your task is to answer the user's query at the end. Your response should adhere to the following format: **'Some relevant answer[chunk_id] Another relevant data[chunk_id]'**. In this format, you must strictly include the relevant answer or information followed by the chunk_id, which serves as a reference to the source of the data within the provided context. Importantly, place only the correct chunk_ids within square brackets. These chunk_ids must be located exclusively at the end of each content and indicated explicitly with a key 'chunk id.' Do not include any other text, words, or characters within square brackets. Your responses should also be properly formatted with all necessary special characters like new lines, tabs, and bullets, as required for clarity and presentation. If there are multiple answers present in the provided context, you should include all of them in your response. You should only provide an answer if you can extract the information directly from the content provided by the user. If you have partial information, you should still provide a partial answer. Always send relevant and correct chunk_ids with the answer fragments. You must not fabricate or create chunk_ids; they should accurately reference the source of each piece of information. If you cannot find the answer to the user's query within the provided content, your response should be 'I don't know.'.GENERATE ANSWERS AND REFERENCES EXCLUSIVELY BASED ON THE CONTENT PROVIDED BY THE USER. IF A QUERY LACKS INFORMATION IN THE CONTEXT, YOU MUST RESPOND WITH 'I don't know' WITHOUT EXCEPTIONS. Please generate the response in the same language as the user's query and context. To summarize, your task is to generate well-formatted responses, including special characters like new lines, tabs, and bullets when necessary, and to provide all relevant answers from the provided context while ensuring accuracy and correctness. Each answer fragment should be accompanied by the appropriate chunk_id, and you should never create chunk_ids. Answer in the same language as the user's query and context. }, { "role": "user", "content": "From the content that I have provided, solve the query : ' {{query}} ' 'EXCLUDE ANY irrelevant information, such as 'Note:' from the response. NEVER FABRICATE ANY CHUNK_ID." } ] |
Multilingual Support for Generative Snippets
SearchAssist supports Generative Snippets in multiple languages. This implies that it generates answers in the language of the query. For example, the answer would also be generated in Spanish if the user query is in Spanish. To enable Multilingual support, take the following steps.
- Ensure that the specific language is added and enabled under Index Language. Note that while SearchAssist supports multiple languages, Generative snippets are currently supported only in English, Spanish, and German.
- Add appropriate instructions in the prompt. You can select the built-in Multilingual Prompt which already has the required instructions to support multilingual answers.
Simulation and Testing
SearchAssist also provides a simulator to test the snippets engine and find the most suitable configuration or model for your business needs. It provides an answer snippet flow analysis both as text and in JSON format. You can enable one or both models, use the simulator to assess how well the results answer user queries, and choose the most appropriate model.
To start the simulator, click the Simulate button on the top right corner of the page.
Enter a sample query and click Test. Depending on the model enabled for the answer snippet and the priority set, the simulator displays the results from the models as shown in the sample below.
The model marked as the Presented answer is the one set to the higher priority; its snippet is the one displayed to the user on the search results page. The simulator also shows the time each model takes to find the snippets and, for the extractive model, the overall similarity score of the snippet.
Response when the extractive model is set to a higher priority.
Response when the generative model is set to a higher priority.
The answer snippets flow analysis or simulation results are also displayed in a developer-friendly JSON format. This format provides additional information about the snippets and will be further enhanced in the future. Click JSON view to view the information in JSON format.
The response in the JSON format is shown below.
Answer Snippet Support
The following table lists the answer snippet models supported for data from different types of sources.
| Sources | File Type | Extractive Model | Generative Model |
|---|---|---|---|
| Web | Static HTML | Yes | Yes |
| File | pdf | Yes | Yes |
| | docx | No | Yes |
| | ppt | No | Yes |
| | xlsx | No | No |
| | txt | No | Yes |
| | png | No | No |
| | jpeg | No | No |
| | jpg | No | No |
| FAQ | | Yes | Yes |
| Structured Data | json | Yes | Yes. Note: Only the fields “title” and “Content”, if available, are used for generating snippets. |
| Connectors | | | |
| Confluence Server | Knowledge Articles | Yes | Yes |
| Confluence Cloud | Knowledge Articles | Yes | Yes |
| Service Now | Knowledge Articles | Yes | Yes |
| Sharepoint | Pages/Articles (aspx) | Yes | Yes |
| | docx | No | Yes |
| | xlsx | No | No |
| | pptx | No | Yes |
| | pdf | Yes | Yes |
| | doc | No | Yes |
| | xls | No | No |
| | ppt | No | Yes |
| | html | Yes | Yes |
| | txt | No | Yes |
| | csv | No | No |
| | sql | No | Yes |
| | xhtml | Yes | Yes |
| Zendesk | Tickets/json | No | No |
| | Knowledge Articles | Yes | Yes |
| Google Drive | docx | No | Yes |
| | doc | No | Yes |
| | xlsx | No | No |
| | ppt | No | Yes |
| | pptx | No | Yes |
| | pdf | No | Yes |
| | txt | No | Yes |
| | html | No | Yes |
| Azure Storage | txt | No | Yes |
| | pdf | Yes | Yes |
| | rtf | No | Yes |
| | msword | No | Yes |
| | docx | No | Yes |
| | doc | No | Yes |
| | xlsx | No | No |
| | xls | No | No |
| | ppt | No | Yes |
| | pptx | No | Yes |
| Salesforce | Knowledge articles | No (Extractive snippets are supported if there is ‘html’ content.) | Yes |
| Oracle Knowledge | Knowledge articles | Yes | Yes |
| Dropbox | docx | No | Yes |
| | doc | No | Yes |
| | xlsx | No | No |
| | ppt | No | Yes |
| | pptx | No | Yes |
| | pdf | Yes | Yes |
| | txt | No | Yes |
| | html | Yes | Yes |