Retrieval Strategies

A retrieval strategy defines the algorithm or process used to retrieve the most relevant information from the ingested content. SearchAssist supports the following retrieval strategies.

Vector Search Retriever

A vector search retriever is a component that retrieves relevant documents or items based on their semantic similarity to a given query. This retriever uses vector embeddings to represent both the query and the documents in a high-dimensional vector space. The similarity between the query vector and each document vector is then calculated using a similarity measure, such as cosine similarity, and the documents are ranked by that score.
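
As an illustration of the ranking step described above, here is a minimal sketch of cosine-similarity ranking. The three-dimensional vectors and the document names are made-up toy values, not real embeddings, and the function names are hypothetical; SearchAssist's internal implementation is not documented here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy embeddings (hypothetical values chosen for illustration only).
docs = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.2],
    "returns-faq": [0.8, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]

ranking = rank_by_similarity(query, docs)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

In this toy data the two documents whose vectors point in roughly the same direction as the query rank above the unrelated one, regardless of whether they share any literal words with the query.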

When to Use

  • When the indexed data uses general vocabulary that an embedding model can interpret easily
  • When high-quality embeddings are available (e.g., OpenAI, Cohere)
  • For multilingual search
  • For complex queries: vector search is effective at handling queries that use synonyms or different forms of the same concept.

Hybrid Search Retriever

Hybrid search works by fusing the results of keyword-based search and vector search and then re-ranking them (RAG Fusion). This approach uses the strengths of each method to offset the weaknesses of the other. For instance, keyword search excels at precise matching and at handling low-frequency vocabulary, while vector search is good at capturing semantic similarity.

You can adjust the relative weight of vector search and keyword search with the alpha parameter: alpha = 0 corresponds to keyword search, alpha = 1 corresponds to vector search, and alpha = 0.5 (the default) gives equal weight to keyword and vector search.
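
To make the role of alpha concrete, the sketch below fuses two ranked lists with a weighted form of reciprocal rank fusion, a common rank-fusion technique. This is an assumption for illustration: the exact formula SearchAssist uses is not stated here, and the function name `fuse`, the constant `k = 60`, and the chunk IDs are all hypothetical.

```python
def fuse(keyword_ranked, vector_ranked, alpha=0.5, k=60):
    """Fuse two ranked lists of chunk IDs into one ordering.

    alpha weights the vector list and (1 - alpha) weights the keyword list,
    so alpha = 0 reduces to the keyword order and alpha = 1 to the vector order.
    k is a smoothing constant (assumed value, not taken from the product).
    """
    scores = {}
    for rank, chunk_id in enumerate(keyword_ranked, start=1):
        scores[chunk_id] = scores.get(chunk_id, 0.0) + (1 - alpha) / (k + rank)
    for rank, chunk_id in enumerate(vector_ranked, start=1):
        scores[chunk_id] = scores.get(chunk_id, 0.0) + alpha / (k + rank)
    # Highest fused score first; ties keep first-seen order.
    return sorted(scores, key=lambda chunk_id: scores[chunk_id], reverse=True)


keyword_hits = ["c1", "c2", "c3"]  # best keyword match first
vector_hits = ["c3", "c1", "c4"]   # best vector match first

print(fuse(keyword_hits, vector_hits, alpha=0.0))
print(fuse(keyword_hits, vector_hits, alpha=0.5))
print(fuse(keyword_hits, vector_hits, alpha=1.0))
```

With equal weighting, chunks that appear near the top of both lists move ahead of chunks that appear in only one, which is the balancing behavior hybrid search aims for.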

When to Use

  • When you need to balance precision and relevance: Hybrid search combines the precision of keyword search with the contextual understanding of vector search. This is particularly useful in applications where results must be relevant to the query and must also match the exact terms used in it.
  • When dealing with domain-specific language or jargon: Keyword search excels at matching specific terms or industry jargon, which can be essential in certain domains. Hybrid search helps ensure that these specific terms are not missed, even if the semantic meaning of the query is different.

Limitations

  • With Hybrid Search Retriever, the scores produced by the RAG Fusion algorithm are used only to re-rank the chunks. The final score of each chunk is the vector score returned for it; if the chunk is not part of the vector search results, the final score is its keyword match score.
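
The limitation above can be sketched as follows. The function name, chunk IDs, and score values are hypothetical; the point is only that the fused order decides position while the reported score falls back from the vector score to the keyword score.

```python
def assign_final_scores(fused_order, vector_scores, keyword_scores):
    """Keep the fused ordering, but report the vector score for each chunk,
    falling back to the keyword score when the chunk had no vector hit."""
    results = []
    for chunk_id in fused_order:
        if chunk_id in vector_scores:
            score = vector_scores[chunk_id]
        else:
            score = keyword_scores[chunk_id]
        results.append((chunk_id, score))
    return results


# Hypothetical inputs: an order produced by the fusion step plus raw scores.
fused_order = ["c1", "c3", "c2"]
vector_scores = {"c1": 0.82, "c3": 0.91}           # c2 had no vector hit
keyword_scores = {"c1": 7.4, "c2": 5.1, "c3": 3.3}

final = assign_final_scores(fused_order, vector_scores, keyword_scores)
print(final)
```

One consequence, under these assumptions, is that the reported scores need not decrease down the list and may sit on different scales, so they are better read per chunk than compared across chunks.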

DocSearchRetriever

Vector search and hybrid search both look across all the chunks extracted from the source. As the data grows, the chunk lookup scope grows with it, which may lead to false positives in some cases. To limit this scope, the doc search retriever first finds the source documents relevant to the query and then restricts the chunk lookup to only those matched documents.

The doc search retriever is available in the following combinations:

  • DocSearchRetriever – Doc search followed by Hybrid Search RAG Fusion
  • DocSearchRetrieverHybridLegacy – Doc search followed by Hybrid Search Legacy
  • DocSearchRetrieverPureVector – Doc search followed by pure vector search
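
The two-stage idea behind these retrievers can be sketched as below. This is a simplified illustration, not the product's implementation: the function names, the `top_docs` and `top_chunks` parameters, and the term-overlap scorer (standing in for the keyword, hybrid, or vector scoring of the second stage) are all assumptions.

```python
def term_overlap(query, text):
    """Toy relevance score: number of distinct query terms found in the text."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def doc_search_retrieve(query, doc_index, chunks, score_doc, score_chunk,
                        top_docs=1, top_chunks=3):
    """Stage 1 picks the best-matching documents; stage 2 ranks only their chunks."""
    ranked_docs = sorted(doc_index,
                         key=lambda doc_id: score_doc(query, doc_index[doc_id]),
                         reverse=True)
    matched = set(ranked_docs[:top_docs])
    # Chunks from unmatched documents never enter the candidate set.
    candidates = [c for c in chunks if c["doc_id"] in matched]
    candidates.sort(key=lambda c: score_chunk(query, c["text"]), reverse=True)
    return candidates[:top_chunks]


# Hypothetical document summaries and chunks.
doc_index = {
    "hr-handbook": "leave policy vacation sick leave",
    "it-guide": "vpn setup password reset",
    "travel": "travel expense policy",
}
chunks = [
    {"doc_id": "hr-handbook", "text": "Employees accrue vacation leave monthly"},
    {"doc_id": "hr-handbook", "text": "Sick leave requires a doctor note"},
    {"doc_id": "it-guide", "text": "Reset your password from the leave portal"},
    {"doc_id": "travel", "text": "Submit travel expense reports within 30 days"},
]

results = doc_search_retrieve("vacation leave policy", doc_index, chunks,
                              term_overlap, term_overlap)
for chunk in results:
    print(chunk["doc_id"], "|", chunk["text"])
```

In this toy data the it-guide chunk mentions "leave" and would be a candidate in a search across all chunks, but it is excluded because its document did not match in the first stage.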

When to Use

  • Large-scale document search
  • When other retrievers give poor accuracy
  • When search results have good accuracy
