Workbench is a SearchAssist tool that converts ingested content into documents ready for indexing. It processes the content in a series of stages known as the Index Pipeline. Each stage performs a specific set of data transformations before passing the content on to the next stage in the pipeline, and each stage has its own stage-specific configuration. You can arrange the stages in any order that suits your business requirements.
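The staged processing described above can be pictured as an ordered list of functions, each taking a document and returning a transformed one. The following is a minimal conceptual sketch, not SearchAssist code: the names `strip_title`, `add_title_length`, and `run_pipeline` are hypothetical and exist only for illustration.

```python
from typing import Any, Callable, Dict, List

Document = Dict[str, Any]
Stage = Callable[[Document], Document]


def strip_title(doc: Document) -> Document:
    """Hypothetical stage: trim surrounding whitespace from the title field."""
    return {**doc, "title": doc["title"].strip()}


def add_title_length(doc: Document) -> Document:
    """Hypothetical stage: store the current title length in a new field."""
    return {**doc, "title_length": len(doc["title"])}


def run_pipeline(doc: Document, stages: List[Stage]) -> Document:
    """Pass the document through each stage in the given order."""
    for stage in stages:
        doc = stage(doc)
    return doc


ingested = {"title": "  Reset your password  "}

# Same stages, two different orders.
indexed = run_pipeline(ingested, [strip_title, add_title_length])
reordered = run_pipeline(ingested, [add_title_length, strip_title])
```

Because each stage receives the output of the one before it, the order matters: in `indexed` the length is computed on the trimmed title, while in `reordered` it is computed before trimming.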
To configure the Workbench and add Index Pipeline stages for content processing, go to the Workbench page under the Indices tab.
SearchAssist supports the following Index Pipeline stages.
- Field Mapping maps fields in an indexing pipeline document to a target field, sets values, copies values, removes fields, and more.
- Entity Extraction uses NLP techniques to identify named entities from the source field.
- Traits Extraction extracts specific attributes that search users might express in their conversations.
- Custom Script lets you enter custom scripts to perform field mapping tasks such as deleting or renaming fields.
- Keyword Extraction automatically detects important words stored in a field.
- Exclude Document drops all documents that match the specified condition.
- Semantic Meaning interprets the meaning of words, signs, and sentence structure. This stage currently supports only web page-related sources.
- Snippets Extraction helps you identify relevant snippets from the ingested data.
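To make two of the stages listed above more concrete, here is a rough sketch of what Field Mapping and Exclude Document do to a document, written as plain Python functions. It is an assumption-laden illustration rather than the product's configuration format: the field names (`body`, `h1`, `tracking_id`, `status`) and both function names are hypothetical.

```python
from typing import Any, Dict, List, Optional

Document = Dict[str, Any]


def field_mapping(doc: Document) -> Document:
    """Hypothetical Field Mapping: set, copy, rename, and remove fields."""
    out = dict(doc)                   # work on a copy of the document
    out["source"] = "web"             # set a value
    out["summary"] = out["body"]      # copy a value to a target field
    out["heading"] = out.pop("h1")    # rename a field
    out.pop("tracking_id", None)      # remove a field if present
    return out


def exclude_document(doc: Document) -> Optional[Document]:
    """Hypothetical Exclude Document: drop documents matching a condition."""
    return None if doc.get("status") == "draft" else doc


docs: List[Document] = [
    {"h1": "Pricing", "body": "Plans start at $5.", "tracking_id": "x1", "status": "live"},
    {"h1": "Roadmap", "body": "Coming soon.", "status": "draft"},
]

# Drop excluded documents first, then map the fields of the rest.
kept = [field_mapping(d) for d in docs if exclude_document(d) is not None]
```

Only the first document survives; the second matches the exclusion condition and never reaches the mapping step.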
SearchAssist allows you to develop a custom pipeline corresponding to an Index configuration to suit your business requirements.
Each indexing stage has properties such as stage type, stage name, and conditions that select the documents to transform, along with the change to perform. Whenever you change any stage, train the system before testing so that indexing uses the latest configuration. You can also test an individual stage of the pipeline by temporarily making the other stages inactive.
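The stage properties and the practice of isolating one stage can be sketched as follows. This is a simplified model under stated assumptions, not the SearchAssist data model: the `StageConfig` class, its field names, and `run_stages` are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Document = Dict[str, Any]


@dataclass
class StageConfig:
    """Hypothetical stage record: type, name, condition, change, and status."""
    stage_type: str
    name: str
    condition: Callable[[Document], bool]      # which documents to transform
    transform: Callable[[Document], Document]  # the change to perform
    active: bool = True


def run_stages(doc: Document, stages: List[StageConfig]) -> Document:
    """Apply each active stage whose condition matches the document."""
    for stage in stages:
        if stage.active and stage.condition(doc):
            doc = stage.transform(doc)
    return doc


stages = [
    StageConfig(
        "field_mapping", "Tag FAQ pages",
        condition=lambda d: d.get("type") == "faq",
        transform=lambda d: {**d, "category": "help"},
    ),
    StageConfig(
        "custom_script", "Uppercase title",
        condition=lambda d: True,
        transform=lambda d: {**d, "title": d["title"].upper()},
    ),
]

doc = {"type": "faq", "title": "billing"}
full = run_stages(doc, stages)

# Test the second stage on its own by deactivating the first.
stages[0].active = False
isolated = run_stages(doc, stages)
```

With the first stage inactive, `isolated` shows only the effect of the second stage, which makes it easier to tell which stage produced a given change.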