The Dev Tools section in Search AI provides advanced configuration options and developer utilities to optimize search performance, extend functionality, and integrate custom solutions.

Navigation: Dev Tools menu in Search AI

Advanced Configuration

Advanced Configurations allow you to fine-tune retrieval and answer optimization settings for specific requirements.

Accessing Advanced Configuration

  1. Navigate to Dev Tools > Advanced Configurations
  2. Search for the configuration you want to modify
  3. Select or provide the appropriate values

Available Configurations

| Configuration | Description | Use Case |
|---|---|---|
| Re-Rank Chunks | Enable the reranking feature and select the Re-Ranker model | Improve result relevance by reordering chunks based on semantic similarity |
| Re-Rank Chunk Fields | Select fields used to rerank chunks | Customize which chunk attributes influence reranking |
| Maximum Re-Rank Chunks | Set the maximum number of chunks sent for reranking | Balance performance vs. quality by limiting reranking scope |
| Enable Exact KNN Matching | Enable Exact K-Nearest Neighbors matching | Improve precision for vector similarity searches |
| Single-Use URLs for Uploaded Documents | Enable secure, temporary access to uploaded documents | Enhance security for sensitive document access |

Configuration Details

Re-Ranking

Re-ranking improves search quality by applying a secondary model that reorders the initially retrieved chunks based on deeper semantic analysis. Search AI supports the following re-rankers:
  • Cross Encoder Re-Ranker - Uses the cross-encoder/ms-marco-MiniLM-L-6-v2 model. It is lightweight, fast, and best suited to English-language content.
  • BGE Re-Ranker - Uses the BAAI/bge-reranker-v2-m3 model. It is a lightweight re-ranking model with multilingual capabilities.
  • MixedBread Re-Ranker - Uses the mixedbread-ai/mxbai-rerank-large-v1 model, which is resource-intensive and has higher latency but delivers the highest accuracy.
This feature does not require retraining the application.
| Setting | Purpose |
|---|---|
| Re-Rank Chunks | Enable/disable reranking and select the model |
| Re-Rank Chunk Fields | Define which fields (title, content, metadata) to use for reranking. By default, Chunk Title, Chunk Text, and Record Title are used. Note that selecting different fields changes the results the re-ranker produces. |
| Maximum Re-Rank Chunks | Maximum number of chunks sent for reranking (Values: 5 - 20, default: 20) |

Increasing the number of fields or chunks used for reranking can increase latency because of the added computation, data retrieval, and processing load. If answer generation and re-ranking are both enabled, the overall latency includes both the latency of the re-ranker and of the LLM.
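The flow above can be sketched in a few lines. This is a conceptual illustration, not the Search AI implementation: `cross_encoder_score` is a hypothetical stub standing in for a real cross-encoder such as cross-encoder/ms-marco-MiniLM-L-6-v2, and the `max_rerank_chunks` parameter mirrors the Maximum Re-Rank Chunks setting.

```python
def cross_encoder_score(query: str, chunk_text: str) -> float:
    """Hypothetical relevance scorer; a real deployment would call a
    cross-encoder model with the (query, chunk) pair."""
    query_terms = set(query.lower().split())
    chunk_terms = set(chunk_text.lower().split())
    return len(query_terms & chunk_terms) / max(len(query_terms), 1)

def rerank(query, chunks, max_rerank_chunks=20):
    """Re-score only the top `max_rerank_chunks` retrieved chunks and
    reorder them; chunks beyond the limit keep their original order."""
    head, tail = chunks[:max_rerank_chunks], chunks[max_rerank_chunks:]
    rescored = sorted(head,
                      key=lambda c: cross_encoder_score(query, c),
                      reverse=True)
    return rescored + tail

results = rerank("reset my password",
                 ["billing and invoices",
                  "how to reset a forgotten password",
                  "password policy overview"],
                 max_rerank_chunks=2)
```

Capping `max_rerank_chunks` is exactly the performance/quality trade-off the setting describes: only the head of the retrieval ranking pays the cost of the secondary model.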

KNN Matching

KNN stands for K-Nearest Neighbors. In RAG applications, KNN matching retrieves the most relevant information from indexed data based on semantic similarity, i.e., finding the closest chunks matching a given query. There are two types of KNN matching methods:
  1. Exact KNN - Finds the true nearest neighbors by comparing the query with every vector in the indexed content. This guarantees higher accuracy and precision but is computationally expensive and can affect performance.
  2. Approximate KNN - Uses different techniques to find the nearest neighbors quickly rather than comparing with every vector. This method may provide a close match rather than the best one, but it’s faster and more scalable for large datasets.
Exact KNN matching provides more precise vector similarity searches than approximate methods.

| Consideration | Impact |
|---|---|
| Accuracy | Higher precision in finding similar vectors |
| Performance | May increase query latency for large datasets |
| Use Case | Recommended when accuracy is prioritized over speed |

Enabling Exact KNN Matching can introduce latency and thereby increase the average response time.
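The exact variant can be sketched directly, assuming cosine similarity over a small in-memory index (vector values and IDs are made up for illustration). The key point is the exhaustive scan: every indexed vector is scored, which is where the extra latency comes from on large indexes.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def exact_knn(query_vec, index, k=2):
    """Exact KNN: score the query against EVERY indexed vector, then keep
    the k best. Accurate, but O(n) per query; approximate methods trade
    some accuracy for speed by skipping most of these comparisons."""
    scored = sorted(index.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

index = {
    "chunk-a": [0.9, 0.1, 0.0],
    "chunk-b": [0.1, 0.9, 0.1],
    "chunk-c": [0.8, 0.2, 0.1],
}
top = exact_knn([1.0, 0.0, 0.0], index, k=2)  # → ["chunk-a", "chunk-c"]
```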

Document Security

When a file is ingested into Search AI for indexing, a signed URL is automatically generated for the uploaded document on the server. These URLs are used as references or citations when search results or answers are derived from the corresponding document. The signed URL provides secure, temporary access to the document. It is valid for a single use or 5 minutes, whichever comes first. This ensures controlled access and prevents unauthorized sharing.
This applies only to uploaded documents. References for data from connectors or web pages include a direct link to the corresponding web page or third-party application.
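The single-use, time-limited behavior described above can be approximated with an HMAC-signed URL. This is a minimal sketch of the general pattern, not Search AI's actual signing scheme: the secret key, the in-memory `used_tokens` store, and the URL layout are all assumptions for illustration.

```python
import hashlib
import hmac
import time

SECRET = b"server-side-secret"   # hypothetical signing key
used_tokens = set()              # hypothetical single-use store

def sign_url(doc_path: str, ttl_seconds: int = 300) -> str:
    """Issue a URL valid for one use or ttl_seconds (5 minutes by
    default), whichever comes first."""
    expires = int(time.time()) + ttl_seconds
    payload = f"{doc_path}|{expires}".encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return f"{doc_path}?expires={expires}&sig={sig}"

def validate(url: str) -> bool:
    doc_path, query = url.split("?", 1)
    params = dict(p.split("=", 1) for p in query.split("&"))
    payload = f"{doc_path}|{params['expires']}".encode()
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, params["sig"]):
        return False  # tampered signature
    if int(params["expires"]) < time.time():
        return False  # past the expiry window
    if params["sig"] in used_tokens:
        return False  # already used once
    used_tokens.add(params["sig"])
    return True

url = sign_url("/files/report.pdf")
first = validate(url)    # True: first use within the window
second = validate(url)   # False: single-use token already consumed
```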

Encoding Model

An encoding model defines how text is split into tokens before being processed. This configuration defines the encoding model used for token counting across indexing, extraction, and runtime operations in Search AI. The tokenizer determines:
  • How many tokens a piece of text occupies — directly affecting whether content fits within model context windows
  • Where content gets split — affecting the semantic coherence of chunks
  • How batches are sized — affecting throughput and cost during indexing
Different tokenizers produce different token counts for the same input text. Using a tokenizer that doesn’t match the underlying model leads to inaccurate budgeting, suboptimal chunking, and potential truncation errors.
By default, apps created in v11.23.1 or later use the GPT-4o (o200k_base) encoding model. For apps created before this version, update the Encoding Model setting and manually retrain the app.
Supported Models

| Model | Encoding | Notes |
|---|---|---|
| GPT-4o | o200k_base | Recommended |
| GPT-4 / GPT-3.5 | cl100k_base | |
| GPT-2 | gpt2 | Legacy default |
This change applies across the Search AI application, including:
  • Extraction and Document Processing — used when splitting content into chunks (text/CSV, HTML, Markdown, and AI Vision content; conversation chunking for Slack, Teams, and other message-based connectors)
  • Enrichment and Vector Generation — used for embedding batch sizing and LLM-stage document truncation
  • Answer Generation — used for token counting during search and response generation (token budget calculation, top-k chunk selection, response limits)
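The token-budget calculation and top-k chunk selection mentioned above can be sketched as a greedy packing step. This is an illustration of the concept only: `count_tokens` here is a whitespace stand-in, whereas a real deployment would count tokens with the configured encoding (e.g. o200k_base via a tokenizer library), and the window sizes are made-up numbers.

```python
def count_tokens(text: str) -> int:
    """Stand-in tokenizer: real systems use the configured encoding
    model; word count is only a rough approximation."""
    return len(text.split())

def select_chunks(chunks, context_window, reserved_for_answer):
    """Greedily pack top-ranked chunks into the prompt until the token
    budget (context window minus the answer reservation) is exhausted."""
    budget = context_window - reserved_for_answer
    selected, used = [], 0
    for chunk in chunks:  # chunks assumed pre-sorted by relevance
        cost = count_tokens(chunk)
        if used + cost > budget:
            break
        selected.append(chunk)
        used += cost
    return selected

chosen = select_chunks(
    ["alpha beta gamma", "delta epsilon", "zeta eta theta iota"],
    context_window=8,
    reserved_for_answer=2,
)  # budget of 6 "tokens" fits only the first two chunks
```

This is why a mismatched tokenizer matters: if `count_tokens` under-counts relative to the model's real encoding, the packed prompt can overflow the context window and get truncated.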

Enable Click Tracking

The Enable Click Tracking configuration allows tracking of user interactions with links in Search AI responses. When enabled, all chunk and citation URLs in answers are automatically wrapped with tracking links. Each click is recorded and mapped to the corresponding query and result, providing insights into user engagement. Upon click, the following data is captured:
| Field | Description |
|---|---|
| Query ID | The query that produced the response containing the clicked link |
| Citation ID | The specific citation the user clicked |
| Chunk ID | The content chunk that the citation references |
| User ID | The user who performed the click |
| Timestamp | Date and time of the click event |
| Source Type | The type of source the citation points to |
  • Tracking is handled server-side, so no client-side instrumentation or code changes are required
  • Works consistently across web, mobile, and API integrations
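URL wrapping of this kind usually means rewriting each citation link to pass through a redirect endpoint that logs the click before forwarding. A conceptual sketch, assuming a hypothetical `/track` endpoint (the endpoint URL and parameter names are illustrative, not the Search AI wire format):

```python
from urllib.parse import urlencode

TRACK_ENDPOINT = "https://search.example.com/track"  # hypothetical endpoint

def wrap_with_tracking(citation_url, query_id, citation_id, chunk_id, user_id):
    """Rewrite a citation URL so the click passes through a tracking
    redirect that can record query/citation/chunk/user before forwarding
    the browser to the original target."""
    params = urlencode({
        "query_id": query_id,
        "citation_id": citation_id,
        "chunk_id": chunk_id,
        "user_id": user_id,
        "target": citation_url,  # original destination, URL-encoded
    })
    return f"{TRACK_ENDPOINT}?{params}"

tracked = wrap_with_tracking(
    "https://docs.example.com/page", "q1", "c1", "ch1", "u1"
)
```

Because the rewrite happens when the response is assembled on the server, the client renders an ordinary link, which is why no client-side instrumentation is needed.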
Exporting Analytics

The captured click data feeds into the analytics pipeline and can be exported via the public API.
  • Use the Answer Analytics API to download complete click analytics data
  • Use the Click Analytics API for aggregated insights such as:
    • Time-series trends
    • Average click statistics (clicks per search, chunks clicked, total searches with clicks)
    • Average click position
    • Click position distribution
    • Top queries with no clicks
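As an illustration of what an aggregate such as "average click position" means, here is a minimal sketch over exported click records. The record shape is hypothetical (a 1-based `position` field giving the rank of the clicked citation in the response); the real export schemas are defined by the Answer Analytics and Click Analytics APIs.

```python
def average_click_position(click_events):
    """Average 1-based rank of clicked results across exported click
    records; lower values mean users find answers nearer the top."""
    positions = [e["position"] for e in click_events]
    return sum(positions) / len(positions) if positions else None

events = [
    {"query_id": "q1", "position": 1},
    {"query_id": "q2", "position": 3},
    {"query_id": "q3", "position": 2},
]
avg = average_click_position(events)  # → 2.0
```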

Toolkit - Developer Utilities

The Toolkit provides SDKs and utilities for content processing, data extraction, performance evaluation, and custom connector development.

Available Tools

Evaluation Tools

| Tool | Purpose | Key Features |
|---|---|---|
| RAG Evaluator | Evaluate RAG system performance | Measures search quality using RAGAS and CEQA frameworks; API integration; flexible results storage |

RAG Evaluator on GitHub

Integration Tools

| Tool | Purpose | Key Features |
|---|---|---|
| Custom Connector SDK | Build custom data source integrations | Standardized data ingestion; metadata integrity; optimized for enterprise RAG applications |

Custom Connector SDK on GitHub

Extraction Utilities

| Tool | Purpose | Key Features |
|---|---|---|
| HTML to Structured Data Extractor | Extract content from HTML sources | Identifies tables of contents; preserves heading-content relationships; outputs JSON |
| Adobe Extraction Utility | Extract content from PDFs | Preserves original layout and structure; intelligent document parsing |
| Azure Extraction Utility | Extract from Azure-hosted documents | Uses Azure AI Document Intelligence; automatic content structuring |
| Google Document AI | Batch process documents from cloud storage | Automates extraction from unstructured/semi-structured documents |
| Salesforce Custom Extraction Utility | Extract from Salesforce Knowledge Base | Retains hierarchy and relationship structure |
| Utility | GitHub Link |
|---|---|
| HTML to Structured Data | View Repository |
| Adobe Extraction | View Repository |
| Azure Extraction | View Repository |
| Google Document AI | View Repository |
| Salesforce Extraction | View Repository |

Model Optimization

| Tool | Purpose | Key Features |
|---|---|---|
| Fine-Tune Embedding Utility | Fine-tune embedding models | Uses domain-specific documents; compares pre/post fine-tuning performance |

Fine-Tune Embedding Utility on GitHub

Quick Reference

Advanced Configuration Summary

| Category | Configurations |
|---|---|
| Retrieval Optimization | Re-Rank Chunks, Re-Rank Chunk Fields, Maximum Re-Rank Chunks |
| Vector Search | Enable Exact KNN Matching |
| Security | Single-Use URLs for Uploaded Documents |

Toolkit Categories

| Category | Tools |
|---|---|
| Evaluation | RAG Evaluator |
| Integration | Custom Connector SDK |
| Extraction | HTML, Adobe, Azure, Google Document AI, Salesforce utilities |
| Optimization | Fine-Tune Embedding Utility |

When to Use Each Tool

| Scenario | Recommended Tool |
|---|---|
| Measure answer quality | RAG Evaluator |
| Connect a custom data source | Custom Connector SDK |
| Ingest HTML documentation | HTML to Structured Data Extractor |
| Process PDF documents | Adobe or Azure Extraction Utility |
| Batch process cloud documents | Google Document AI |
| Extract Salesforce knowledge | Salesforce Custom Extraction Utility |
| Improve domain-specific search | Fine-Tune Embedding Utility |