Manage Content of a Web Page

Once you add web content to the application, it needs to be kept up to date because the content on websites is rarely static. You can schedule periodic web crawling and edit the crawler configuration to keep the indexed content in sync with the data on the website.

Schedule Web Crawling

The scheduler lets you set up a job that re-crawls the configured website periodically. To schedule a web crawling job, follow these steps:

  1. On the Indices page, click Content on the left pane.
  2. On the Content list view page, select the respective source from the list.
  3. On the source dialog box, click the Configuration tab.
  4. On the Configuration tab, turn on the Schedule toggle.
  5. Set the Date, Time, and Frequency.
  6. Turn on the Crawl Everything toggle to crawl all domains.
  7. To crawl only selected domains, turn off the Crawl Everything toggle.
  8. When you turn off the Crawl Everything toggle, the Allow List toggle is turned on automatically. Enter the URLs to allow in the Allow URLs field.
  9. To block URLs instead, turn off the Allow List toggle.
  10. When you turn off the Allow List toggle, the Block List toggle is turned on automatically. Enter the URLs to block in the Block URLs field (see the sketch after these steps for how the filters combine).
  11. Select the Crawl Settings:
  • JavaScript-rendered
  • Use Cookies
  • Respect robots.txt
  12. Click Save.
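
Together, the three toggles act as a URL filter: Crawl Everything admits every URL, the Allow List admits only matching URLs, and the Block List admits everything except matching URLs. The Python sketch below illustrates that behavior; the mode names and the prefix-matching rule are assumptions for illustration only and may differ from SearchAssist's actual matching logic.

    # Illustrative sketch of the toggle behavior described above.
    # The mode names and prefix matching are assumptions, not
    # SearchAssist's actual implementation.
    def should_crawl(url: str, mode: str, url_list: list[str]) -> bool:
        """Return True if url would be crawled under the given toggle state.

        mode: "everything" (Crawl Everything on), "allow" (Allow List on),
              or "block" (Block List on).
        """
        if mode == "everything":
            return True
        matched = any(url.startswith(entry) for entry in url_list)
        # Allow List keeps only matches; Block List drops matches.
        return matched if mode == "allow" else not matched

    # Example: allow only the /docs section of a site.
    allow = ["https://example.com/docs"]
    print(should_crawl("https://example.com/docs/intro", "allow", allow))   # True
    print(should_crawl("https://example.com/blog/post-1", "allow", allow))  # False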

Edit Crawler Configuration

To edit a web crawling source, follow these steps:

  1. On the Indices page, click Content on the left pane.
  2. On the Content list view page, select the respective source from the list.
  3. On the source dialog box, make the required changes.
  4. Click Save.
