This guide covers the configuration of retrieval strategies, answer generation, and search results in Search AI. These settings determine how content is retrieved from your index and how responses are delivered to users. Navigate to Responses > Retrieval Strategies to access these settings.

Retrieval Strategies

Configure the chunk retrieval strategy and corresponding thresholds for finding relevant content.

Retrieval Methods

Search AI supports two retrieval methods. The choice depends on the nature of your content, the type of queries your users ask, and the precision required.
Strategy | Description | Best For
Vector Retrieval | Uses cosine similarity between the query vector and chunk vectors. Scores range from 0 (no match) to 1 (complete match). | Semantic similarity matching, contextual queries
Hybrid Retrieval | Combines keyword-based matching with vector-based scoring: keyword matching captures exact terms and text patterns, while vector scoring handles semantic meaning, leveraging the strengths of both approaches. | Balanced precision and recall; recommended when content has both structured terminology and natural language
How to choose
  • Use Vector Retrieval when queries are conversational or conceptual and exact keyword matches are less important.
  • Use Hybrid Retrieval (default) when content contains specific terminology, product names, or structured data where keyword matching adds precision on top of semantic search.
Hybrid Retrieval is the default retrieval strategy.
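The two scoring approaches can be sketched as follows. The cosine similarity formula matches the 0-to-1 vector scoring described above; the `hybrid_score` blend and its `alpha` weight are purely illustrative, since the product's actual combination method is not documented here.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between a query vector and a chunk vector.
    # For non-negative embeddings this lands in [0, 1]: 0 = no match,
    # 1 = complete match, as described in the table above.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def hybrid_score(vector_score, keyword_score, alpha=0.5):
    # Hypothetical linear blend of the two signals; the real hybrid
    # retrieval algorithm may combine them differently.
    return alpha * vector_score + (1 - alpha) * keyword_score
```

With `alpha=0.5`, a chunk that scores well on either signal still ranks reasonably, which is why hybrid retrieval balances precision and recall.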

Qualification Criteria

Parameter | Description | Range | Default
Similarity Score Threshold | Minimum similarity score for a chunk to qualify; chunks below this score are discarded. Higher values require closer matches. | 0-100 | 20
Proximity Threshold | How closely retrieved chunks must rank relative to the highest-ranking chunk; chunks beyond this threshold are discarded. Lower values require chunks to rank closer to the top result. | 0-50 | 20
Top K Chunks | Maximum number of qualified chunks sent to the LLM as context for answer generation. | - | 20
Token Budget for Chunks | Maximum tokens allocated for chunks sent to the LLM. The combined total of chunk tokens, prompt, query, conversation context, and expected response must stay within the LLM's context window. | 1-1,000,000 | 20,000
When setting the Token Budget for Chunks, account for your LLM’s total context window size minus the tokens used by the prompt, query, and expected response. See Token Management for guidance.
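A minimal sketch of how the four qualification criteria could interact, assuming each chunk carries a 0-100 relevance score and a token count. The field names and the proximity calculation (score gap from the top-ranked chunk) are illustrative; the product's internal logic is not documented here.

```python
def qualify_chunks(chunks, similarity_threshold=20, proximity_threshold=20,
                   top_k=20, token_budget=20_000):
    # Defaults mirror the documented defaults above.
    ranked = sorted(chunks, key=lambda c: c["score"], reverse=True)
    top_score = ranked[0]["score"] if ranked else 0
    qualified, used_tokens = [], 0
    for chunk in ranked:
        # Discard chunks below the similarity score threshold.
        if chunk["score"] < similarity_threshold:
            continue
        # Discard chunks too far from the highest-ranking chunk
        # (proximity approximated here as a score gap).
        if top_score - chunk["score"] > proximity_threshold:
            continue
        # Stop once the token budget or Top K limit is reached.
        if used_tokens + chunk["tokens"] > token_budget:
            break
        qualified.append(chunk)
        used_tokens += chunk["tokens"]
        if len(qualified) == top_k:
            break
    return qualified
```

Under these defaults, a chunk scoring 60 would be dropped when the top chunk scores 90 (gap of 30 exceeds the proximity threshold of 20), even though 60 clears the similarity threshold.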

Default Configuration Summary:

  • Retrieval Mechanism: Hybrid Retrieval
  • Similarity Score Threshold: 20
  • Proximity Threshold: 20
  • Top K Chunks: 20
  • Token Budget for Chunks: 20,000

Answer Generation

Configure how responses are composed and delivered to users. Navigate to Responses > Answer Configuration to access these settings.

Answer Components

Component | Description
Answer Text | The generated response addressing the user's question
Snippet Reference | Link to the source as a citation for further reading

Answer Types

Type | Description | Configuration
Extractive | The top retrieved chunk is presented as-is, without text changes | Configure Response Length (tokens)
Generative | Top chunks are sent to the configured LLM, which generates a paraphrased answer | Requires LLM integration and Answer Generation enabled in GenAI Tools

Generative Answer Configuration

Chunk Settings:
Token Budget for Chunks: Specifies the total tokens that can be included in chunks sent to the LLM. Default: 20,000. Maximum: 1,000,000. To calculate the right value, subtract the tokens used by the prompt, instructions, and expected response from the LLM's maximum context window; the remainder is the maximum token budget for chunks. Example: for a 4,096-token context window, if the prompt uses 500 tokens and the response uses 500 tokens, up to 3,096 tokens remain for chunks. At 500 tokens per chunk, that is a maximum of 6 chunks. To limit the LLM to 3 chunks, set the budget to 1,500.
Enable Document Level Processing: When enabled, Search AI sends full documents to the LLM instead of individual chunks. This is useful when relevant information is distributed across multiple chunks and sending only a few may result in incomplete answers. Search AI identifies and sends the complete documents associated with the most relevant chunks, up to the defined token budget.
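The budget arithmetic from the worked example above can be written out directly (the numbers come from the example; the helper names are illustrative):

```python
def chunk_token_budget(context_window, prompt_tokens, response_tokens):
    # Tokens left over for chunks after the prompt and expected
    # response are accounted for.
    return context_window - prompt_tokens - response_tokens

def max_chunks(budget, tokens_per_chunk):
    # Whole chunks that fit within the budget.
    return budget // tokens_per_chunk

budget = chunk_token_budget(4096, 500, 500)   # 3,096 tokens remain
print(budget, max_chunks(budget, 500))        # at 500 tokens/chunk: 6 chunks
```

To cap the context at 3 chunks of 500 tokens, set the budget to 1,500, since `max_chunks(1500, 500)` is 3.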
Setting | Description | Default | Max
Token Budget for Chunks | Total tokens for chunks sent to the LLM | 20,000 | 1,000,000
Enable Document Level Processing | Send full documents instead of individual chunks for richer context | Disabled | -
Token Budget for Documents | Maximum tokens when sending full documents | 50,000 | 100,000
Chunk Order Options: The order of data chunks can affect the context and, in turn, the results of a user query. The choice of chunk order should align with the goals of the task and the nature of the data being processed.
Order | Description | Use Case
Most to Least Relevant | Highest relevance first, then decreasing | Standard prioritization
Least to Most Relevant | Lowest relevance first, with the most relevant chunk at the end, immediately before the query | When recency in the context matters
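The two orderings can be sketched as a simple sort over scored chunks (the `(score, text)` pair representation is hypothetical):

```python
def order_chunks(chunks, order="most_to_least"):
    # chunks: list of (relevance_score, text) pairs.
    ranked = sorted(chunks, key=lambda c: c[0], reverse=True)
    if order == "least_to_most":
        # The most relevant chunk ends up last, adjacent to the query,
        # which can help models that weight the end of the context
        # more heavily.
        ranked.reverse()
    return ranked
```

With `least_to_most`, the strongest evidence sits closest to the query in the prompt, which is the "recency in context" case from the table above.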
LLM Configuration:
Setting | Description
Select Generative Model | Choose from configured LLM models
Answer Prompt | Select the prompt template for answer generation
Temperature | Controls randomness (lower = more deterministic, higher = more creative)
Response Length | Expected answer length in tokens
Feedback Configuration: Enable the feedback mechanism to allow users to rate answers. When enabled, the Web SDK automatically includes feedback options for users. Feedback data appears in Answer Insights analytics.

Response Streaming

Enable real-time token-by-token response delivery for Web/Mobile SDK channels, reducing perceived latency for longer answers.
Streaming is configured via prompt settings and is currently supported only for OpenAI and Azure OpenAI models. Not available for API-based responses.
See Enable Response Streaming.
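The effect of streaming can be illustrated with a simple generator: the client renders each piece as it arrives rather than waiting for the full answer. This is purely a simulation; actual streaming is delivered through the model provider's streaming API and the Web/Mobile SDK.

```python
def stream_tokens(answer):
    # Simulated token-by-token delivery: each word is yielded as soon
    # as it is "generated", so the UI can render partial output and
    # reduce perceived latency for long answers.
    for token in answer.split():
        yield token

pieces = []
for token in stream_tokens("Streaming reduces perceived latency"):
    pieces.append(token)  # a real client would append to the rendered answer here
```

The full answer is identical either way; streaming only changes when the user starts seeing it.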

Search Results

Search results display a ranked list of documents or chunks by relevance, presenting each with a title and a snippet. Navigate to Responses > Search Results to enable and configure this feature. Unlike answers, which provide a single, focused response to a query, search results are more useful when broader information is needed.

When to Use Search Results vs Answers

Use Case | Recommended
Direct, specific questions | Answers
Broad topic exploration | Search Results
Complex queries requiring comparisons | Search Results
Debugging/troubleshooting with multiple sources | Search Results

How Search Results Are Generated

When enabled, Search AI processes the user’s query, retrieves the most relevant chunks from the index, organizes them by their corresponding documents, and presents them with relevant metadata for each chunk.
When both search results and extractive answers are enabled, the top search result matches the answer. To avoid redundancy, the highest-ranking result is omitted from the search results list — results start from the next most relevant entry.
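The redundancy rule above can be sketched as a one-line slice (the list-of-chunks representation is illustrative):

```python
def search_results_list(ranked_chunks, extractive_answers_enabled):
    # When extractive answers are enabled, the top-ranked chunk is
    # already shown as the answer, so the search results list starts
    # from the next most relevant entry to avoid duplication.
    if extractive_answers_enabled:
        return ranked_chunks[1:]
    return ranked_chunks
```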

Configuration Settings

Setting | Description | Range | Default
Number of Search Results | Maximum chunks displayed | 1-100 | 20

Filters (Facets)

Filters enable users to narrow results based on specific criteria, useful for large result sets.
Search Results are currently accessible via the Search API only. See the Search API reference for details.
Filter Types:
Type | Description
Static | Fixed, predefined filter values
Dynamic | Values derived from the search results
Filter UI Options:
UI Type | Availability | Selection
Tabs | Static filters only | Single value, string fields only
Single Select | Dynamic filters | One value at a time
Multi Select | Dynamic filters | Multiple values concurrently
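The static/dynamic distinction can be sketched as follows, assuming each result carries metadata fields (field names here are illustrative). A static filter uses a fixed list; a dynamic filter derives its values from the result set itself:

```python
from collections import Counter

def dynamic_facet_values(results, field):
    # Dynamic filter: collect the distinct values of `field` that
    # actually appear in the search results, with their counts.
    counts = Counter(r[field] for r in results if field in r)
    return counts.most_common()

# Static filter: a fixed, predefined value list, independent of results.
STATIC_CONTENT_TYPES = ["FAQ", "Guide", "Release Notes"]
```

This also reflects the rule that a filter applies only when the results contain content for the specified field: a dynamic facet with no matching values is simply empty.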

Creating Filters

  1. Provide a unique Filter Name.
  2. Select the Filter Type (Static or Dynamic).
  3. Choose the Field for filtering.
  4. Select the Filter UI style.
Filter Rules:
  • Only one tab-style filter can be enabled at a time.
  • Only string fields can be used with tab-style UI.
  • Two filters cannot use the same field concurrently; only one filter per field can be enabled at a time.
  • A filter applies only if the search results contain content for the specified field.

Default Filters

Every application includes default filters that can be updated, deleted, or disabled as needed.

Search Results Access

Currently available via Search API only.