1. Introduction

Search functions are just one of many features that customers expect from an online presence. They must be intuitive to use and deliver relevant results.

SmartSearch meets these requirements with a high-performance search solution that can be used on extensive websites. It offers high hit quality and a comfortable search experience, and thereby helps retain customers on the website.

SmartSearch Connect makes it possible to combine the functionalities of SmartSearch and FirstSpirit. With very little effort, a website created with FirstSpirit can be equipped with a high-performance search. Changes are directly reflected in the search results.

The goal of SmartSearch Connect is to create a simple and fast link between SmartSearch and FirstSpirit. Care has been taken to keep the installation and configuration effort as low as possible. To keep the module simple, compatibility with every FirstSpirit project was deliberately not pursued. For example, projects that use fragments are not yet supported.

2. Installation and configuration

To use the functionalities of the SmartSearch Connect module, it is necessary to install and configure different components. The steps required for this are explained in the following subchapters.

2.1. Project component configuration

Deploying SmartSearch Connect requires a project-specific configuration. This is done via the project component, which must be added to the project in use. To do this, open the ServerManager and select Project Properties → Project Components.

Figure 1. Selection for configuring the project component

The main panel shows a list of all existing project components. After clicking Add, select the SmartSearch ConnectProjectApp and confirm your selection with OK. The project component is then added to the list in the main panel and must be configured (see figure Selection for configuring the project component). To do so, select the entry in the list and open the corresponding configuration dialog via Configure (see figure Project component dialog).

Figure 2. Project component dialog

First, a name can be configured here for the DataGenerator and PreparedSearch that are generated automatically in SmartSearch. If the field is left empty, the current project name is used.

Via the Initialize in SmartSearch! button below, the DataGenerator and PreparedSearch configured above can be created at any time ("initialization"). Here, too, the currently stored project name is used if the field above is empty. The result of the call can be found in the server log, because the initialization is triggered on the server.

The project configuration must specify the URL Creator that is also used in the generation that creates the website. In addition to the URL Creator, the prefix used in the generation must be specified. This is the only way to ensure that the URLs of the search hits refer to the corresponding pages.

In the Media types list, you can select which media are transferred to SmartSearch by SmartSearch Connect. The list shows only file extensions of textually processable files that are present in the project. It may therefore have to be extended over the lifetime of a project.

By default, SmartSearch delivers the URLs of the original resolutions for images. If you want to use URLs of other resolutions, you can select them in the Image resolutions list. SmartSearch Connect does not ensure that the images are accessible from the outside. This must be ensured in the project.

Only media referenced in FirstSpirit are processed. This ensures that only media that can be reached from outside are processed.

The Image resolutions list can be used to select resolutions for which URLs are to be generated for images. The URL to the original resolution is always generated. If further resolutions are to be used, for example for a preview in the search results, then this resolution must be selected in the list. The use of remote media is also possible. For this, the resolutions selected in the list must also be present in the remote media project. The prefix for remote media URLs is pulled from the remote project configuration (ServerManager for the project in which the media will be included).

By default, the documents in SmartSearch contain only the metadata of the corresponding element in FirstSpirit. If the metadata is to be inherited, the Inherit metadata checkbox must be selected.

This setting does not affect the mechanism for excluding elements from the SmartSearch index. There, inheritance within the structure takes effect even without this configuration.

SmartSearch provides two versions of its API. They differ mainly in how hierarchies from FirstSpirit are processed.

Under Configure API Request the use of an extra output channel can be configured. A more detailed description of this can be found in the SmartSearch output channel chapter.

2.2. SmartSearch object template set

SmartSearch Connect was designed to require as few steps as possible to process content from a FirstSpirit project in SmartSearch. This makes it possible to quickly see the first results in the index. For project-specific customizations, the Groovy script of the data generator in SmartSearch can be used. If this is not sufficient, SmartSearch Connect offers the possibility to pass a SmartSearchObject to SmartSearch by using a template set. For this purpose, a template set must be created in the corresponding FirstSpirit project. It must be named SmaSeJSON and have JSON as the target file extension. The template set can be designed using the usual template syntax. The output must be valid JSON that meets the SmartSearchObject requirements.

In the configuration of the SmartSearch ConnectProjectApp it can be set how the SmartSearch object should be used. There are three options:

* Use only FormData - Here the SmartSearchObject is not used.
* Use FormData and SmartSearchObject - Here both the SmartSearch object and the FormData information are considered.
* Use only SmartSearchObject - Here only the SmartSearchObject is considered.

The information in lines 2-13 of the following example JSON (fsType through metaFormData) is always sent to SmartSearch (including metadata). The information below page is only submitted if FormData is selected for submission. For datasets, the situation is slightly different: there, the FormData is located directly at the root level.
Example JSON structure

          
{
   "fsType":"PageRef",
   "name":"smart_thermostats",
   "displayName":"Thermostate Übersicht",
   "identifier":"d8c8bd3f-3268-44aa-aa38-91c192b3ce8c",
   "uid":"smart_thermostats",
   "uidType":"SITESTORE_LEAF",
   "fsTitle":"Thermostate Übersicht",
   "fsToJsonVersion":"1.2",
   "fsUrl":"https://www.smartliving.com/Produkte/Thermostate-Übersicht.html",
   "metaFormData":{

   },
   "page":{
      "fsType":"Page",
      "name":"prod_overview",
      "displayName":"Produktübersicht",
      "identifier":"c8efd7ab-4744-412f-b58c-7273b2855790",
      "translated":true,
      "uid":"prod_overview",
      "uidType":"PAGESTORE",

      "template":{
         "fsType":"PageTemplate",
         "name":"Standard",
         "displayName":"Standard",
         "identifier":"1b6428fc-f157-44ce-a3b9-ceb8800a2476",
         "uid":"standard",
         "uidType":"TEMPLATESTORE"
      },
      "formData":{
         "pt_add_sections":{
            "fsType":"CMS_INPUT_TOGGLE",
            "name":"pt_add_sections",
            "value":true
         },
         "pt_create_section":{
            "fsType":"FS_BUTTON",
            "name":"pt_create_section",
            "value":{

            }
         },
         .
         .
         .
   },
   "fsGenericApiVersion":"v1",
   "smartSearchDocument":{
      "title":"Produktübersicht"
   }
}

2.2.1. SmartSearchObject Requirements

Via the SmartSearchObject it is possible to write content into a SmartSearch document. Content in the SmartSearchObject overwrites content that comes, for example, from the FormData. The following example shows the template set of a page template. In this case, the title of the SmartSearch document would be overwritten with the title from the page template.

Example of the template set of a page template
{
    "title":"$CMS_VALUE(pt_title)$"
    $CMS_VALUE(#global.getPage().getBodyByName("content"))$
}

For SmartSearch to be able to process the JSON, it must consist of one-dimensional strings and string arrays only. Nesting is not possible.
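
The flatness requirement can be checked before a template set goes live. The following is a minimal sketch in Python, not part of SmartSearch Connect: the function name find_flatness_violations is hypothetical, and the check covers only the rule stated above (string values and arrays of strings at the top level).

```python
import json


def find_flatness_violations(raw_json: str) -> list[str]:
    """Return the keys whose values are neither a string nor a list of strings.

    Hypothetical helper for checking template output; it only mirrors the
    rule described in this chapter.
    """
    document = json.loads(raw_json)
    if not isinstance(document, dict):
        raise ValueError("top-level JSON value must be an object")

    violations = []
    for key, value in document.items():
        is_string = isinstance(value, str)
        is_string_list = isinstance(value, list) and all(
            isinstance(item, str) for item in value
        )
        if not (is_string or is_string_list):
            violations.append(key)
    return violations
```

An empty result means the rendered template output satisfies the rule. Any returned key points to a nested object, a number, or a mixed array that would have to be flattened in the template.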

SmartSearch Connect processes records differently than known from FirstSpirit. The records are sent independently without being included in a page template. Therefore, the JSON of a record may look different in the preview than the JSON sent to SmartSearch.

2.3. Exclude elements from SmartSearch index

By default, SmartSearch Connect writes all elements to the SmartSearch index. To allow editors to prevent this directly in FirstSpirit, the project must be extended.

2.3.1. Hide pages

To hide pages, an input component must be added to the metadata. It must be a CMS_INPUT_TOGGLE named md_smartsearch_noindex.

Example of the input component in the metadata
<CMS_INPUT_TOGGLE name="md_smartsearch_noindex" type="radio" hFill="yes">
    <LANGINFOS>
        <LANGINFO lang="*" label="Hide"/>
    </LANGINFOS>
    <ON>
        <LANGINFO lang="*" label="On" description="Hide this site from search index."/>
    </ON>
</CMS_INPUT_TOGGLE>

If the input component is added to the metadata as described above, it is also displayed in the page store, for example. However, it is only evaluated in the site store. It therefore makes sense to use a rule to display it only in the site store.

Example of the rule to create metadata only in the site store.
<RULE>
        <WITH>
                <EQUAL>
                        <PROPERTY name="STORETYPE" source="#global"/>
                        <TEXT>sitestore</TEXT>
                </EQUAL>
        </WITH>
        <DO>
                <PROPERTY name="VISIBLE" source="md_smartsearch_noindex"/>
        </DO>
</RULE>
Hiding pages also works via folders. Please note that changes to folders only take effect after a full generation or a change to an affected page.

2.3.2. Hide datasets

Because datasets do not have metadata, an additional column of type boolean must be added to the corresponding tables in the database schema. The column must be named smartsearch_noindex. To edit the column's data, an input component must be added to the table template.

Pages and datasets that already exist in the index but are to be hidden by setting the corresponding field are only deleted from the index after a full generation. Eventing does not take effect at this point.

2.3.3. Removal of data via SmaSeJSON templating

The default handling of FirstSpirit content produces field names and field values depending on the structure and naming of the elements. In some cases it may be necessary to remove this automatically generated content, e.g. if the content is not supposed to be searched or even indexed due to legal or privacy concerns. If the SmaSeJSON template is active, field values can be cleared from the automatically generated data by passing an empty string in the corresponding JSON.

In the following example, the output of the SmaSeJSON template ensures that any automatically generated values for the field FS_pt_supersecret are not indexed:

{
    "FS_pt_supersecret" : ""
}
Some fields and field values are generated and updated on the SmartSearch side after the entity has been sent by the FirstSpirit module, depending on the project setup. Changes to these values in the SmaSeJSON template may therefore not be reflected in the indexed data. If changes are not reflected, it is advised to check the GroovyScript enhancers on the corresponding data generators as well as the SmartSearch reference regarding fields and field values.
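
The effect described in this chapter can be pictured as an overlay of the SmaSeJSON output on the automatically generated fields. The following Python sketch only illustrates that described behavior and is not the module's implementation. The function name apply_smartsearch_object and the sample values are hypothetical.

```python
def apply_smartsearch_object(generated: dict, overrides: dict) -> dict:
    """Overlay SmaSeJSON values on automatically generated fields.

    Illustration only: an empty string removes the field, any other
    value replaces the generated one.
    """
    result = dict(generated)  # leave the input untouched
    for field, value in overrides.items():
        if value == "":
            result.pop(field, None)  # cleared, so nothing is indexed
        else:
            result[field] = value
    return result
```

With generated fields {"FS_pt_supersecret": "internal", "title": "Product overview"} and the override {"FS_pt_supersecret": ""}, only the title field remains in the result.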

3. Full generation

The initial filling of SmartSearch requires a full generation. For this purpose, a new job is created in the job administration of the project configuration, and the action SmartSearch Connect push is added to it.

Figure 3. Selection of the full generation action

Executing this job creates a new data generator and a new PreparedSearch in SmartSearch (named after the FirstSpirit project with the prefix FS_).

4. Partial generation

To keep SmartSearch up to date after a partial generation, an action of type script must be added to the job. This script must be placed after the actual generation and have the following content:

#! executable-class
com.espirit.smartsearch.initialdeployment.PushPreviousGenerationExecutable
Figure 4. Script for the partial generation

5. Eventing

To keep the contents in SmartSearch as up to date as possible, there is eventing in addition to generation. Here, changes (creation, deletion, modification of elements) are sent to SmartSearch directly after the elements have been released.

For editorial changes in ContentCreator to trigger the event-driven transfer of data to SmartSearch, the workflow used must meet certain requirements. For example, dependent objects must be released by the workflow together with the object itself (example: when a page change is released, the dependent page references must be released as well). This behavior can be achieved, for example, by installing the BasicWorkflows module.

6. Media

When processing media through SmartSearch Connect, the procedure differs for images and files. Therefore, the following chapters describe the respective procedure in more detail.

6.1. Images

Neither SmartSearch Connect nor SmartSearch processes images, so no information is extracted from them. Images are mostly needed as previews for search results, for which a certain resolution is usually provided. The resolutions for which URLs are required can be set in the project component configuration.

The default fields in SmartSearch include the thumbnail field. This is not filled by SmartSearch Connect. For a thumbnail it is therefore necessary to select the appropriate resolution in the configuration and to use the appropriate resolution in the search hits.

6.1.1. Remote Media

For remote media, the resolution from the remote project must also be available in the local project so that it can be selected in the configuration. The configuration for the remote project is used to create the URL.

6.2. Files

SmartSearch processes files in formats such as PDF and DOCX. The files themselves are not sent from SmartSearch Connect to SmartSearch. Instead, SmartSearch Connect sends the corresponding URL, and SmartSearch downloads the files for processing. It is therefore mandatory that the files are accessible to SmartSearch.

If eventing is used, processing of a file may be triggered before the file is reachable by SmartSearch. A regular full deployment is therefore necessary for files.
In a remote project, the following must be considered when using files: events are generated for files only when the files are created or edited in the remote project. In a full generation, on the other hand, only the files from the source project are taken into account. It is therefore recommended to configure SmartSearch Connect for the remote project as well and to merge the resulting data generator with that of the source project via a PreparedSearch.

7. What happens in SmartSearch?

If a FirstSpirit project with the example name "Demo" sends data to SmartSearch for the first time, the elements required for use are automatically created there:

A DataGenerator of type "API" with the name "FS_Demo" (FS_<project name>):

Figure 5. Automatically generated DataGenerator

A PreparedSearch with the name "FS_Demo":

Figure 6. Automatically generated PreparedSearch

The corresponding DataGenerator is already preconfigured at this location:

Figure 7. Automatically generated link between PreparedSearch and DataGenerator

7.1. DataGenerator

The DataGenerators generated by the module are permanently set to the status "API Ready", except for the short moments when data is received. If data reception is very fast, the status change is not visible in the cockpit. In the log, the reception of data is only visible at log level DEBUG. A corresponding entry for logback-spring.xml looks like this:

<logger name="de.arithnea.haupia.datageneration.crawler.api.reactive" level="DEBUG"/>

DataGenerators of type "API" cannot be started and stopped because they are permanently waiting for data.

Data transmitted to the DataGenerator is transferred to the Solr index after 30 seconds. For test purposes, this period can be reduced, e.g. to one second, using the environment variable API_CRAWLER_COMMIT_WITHIN on the SmartSearch system. This is not intended for production use:

export API_CRAWLER_COMMIT_WITHIN=1000

7.2. PreparedSearch

The preconfigured PreparedSearch can be retrieved via REST immediately after receipt of the data. Facets, for example, can be maintained on it as usual. The facet fsType contains the type of the entity exported from FirstSpirit:

Figure 8. Facets language and fsType

A potentially important configuration for using the PreparedSearch is, for example, adding the field link to the output, so that results returned by the interface include the URLs generated by FirstSpirit:

Figure 9. Added field link
The list of field names per DataGenerator in the cockpit is cached. It may take a few minutes for the cache to be populated after the first data is transferred to SmartSearch.

7.3. Data processing

As described in the chapter Project component configuration, the API version to be used can be selected in the configuration of the ProjectApp under Configure API Request. The selected version affects how the data transferred from FirstSpirit is processed by SmartSearch. In both versions, textual content is combined in the field content. The difference lies, among other things, in the processing of input components that can be used as facets, such as date fields and dropdowns.

The main difference between the two API versions is the way the structure from FirstSpirit is converted to a flat SmartSearch document. This is described in the following chapters.

7.3.1. API Version V1

For API version V1, the fields in the document are named according to the pattern FS_L*_NAME. FS_ is the prefix indicating that the element comes from FirstSpirit. L*_ represents the depth of the element in the FirstSpirit structure, where * is a numeric value. NAME is the name of the input component. Accordingly, the input component st_title on the 5th level is stored in FS_L5_st_title. The levels are, for example, the page reference, the page, and a paragraph attached to it.
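
The V1 naming pattern can be expressed as a small Python sketch. The function name v1_field_name is hypothetical, and the text does not state at which number the level count starts, so the depth is passed in as given.

```python
def v1_field_name(depth: int, component_name: str) -> str:
    """Build a V1-style field name following the pattern FS_L*_NAME.

    Hypothetical helper: depth is the level of the element in the
    FirstSpirit structure, component_name is the input component's name.
    """
    if depth < 0:
        raise ValueError("depth must not be negative")
    return f"FS_L{depth}_{component_name}"
```

For the example from the text, v1_field_name(5, "st_title") yields FS_L5_st_title. Predicting the correct depth still requires knowledge of the template structure in the project.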

7.3.2. API Version V2

In API version V2, the field names in the document also start with FS_, followed by the name of the input component. To represent the depth of the element, the names of the input components are concatenated where necessary, with ___ as the separator. For example, the field FS_pt_slider___st_description contains the description nested below a slider. If the name is not unique, e.g. because the same paragraph is included multiple times, the contents are stored as a multi-value field, if the field type allows it. This processing is more intuitive than in API version V1, where the depth is difficult to predict without knowing the exact template structure in the project and thus the JSON generated in the background.
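
The V2 naming and the multi-value behavior can be sketched in Python as follows. This is an illustration of the description above, not the SmartSearch implementation: the names v2_field_name and flatten_v2 are hypothetical, and the sketch ignores the condition that the field type must allow multiple values.

```python
from collections import defaultdict

SEPARATOR = "___"  # separator between nested component names, as described above


def v2_field_name(path: list[str]) -> str:
    """Join the nested input component names into one V2-style field name."""
    return "FS_" + SEPARATOR.join(path)


def flatten_v2(entries: list[tuple[list[str], str]]) -> dict:
    """Collect (path, value) pairs into a flat document.

    Values that map to the same field name are gathered into a list,
    mirroring the multi-value storage described in the text.
    """
    collected = defaultdict(list)
    for path, value in entries:
        collected[v2_field_name(path)].append(value)
    return {
        name: values[0] if len(values) == 1 else values
        for name, values in collected.items()
    }
```

For example, v2_field_name(["pt_slider", "st_description"]) yields FS_pt_slider___st_description. Two descriptions below the same slider end up as a list under that single field name.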

It is recommended to use the API version V2. The focus of further development will be on this version.

8. Known problems and limitations

The following sections list known issues for this version.

8.1. FirstSpirit-project name

Currently, the project name in FirstSpirit should not contain any spaces if possible. While the export to SmartSearch works with a space character, unpredictable side effects may occur, since spaces are not allowed in DataGenerator and PreparedSearch names. In the future, this will be changed to use the FirstSpirit project ID, with a display name used in SmartSearch Connect accordingly.
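
A project name can be checked against this limitation before the first export. The following Python sketch assumes the default naming scheme FS_<project name> described in the chapter What happens in SmartSearch?. The function name smartsearch_names is hypothetical.

```python
def smartsearch_names(project_name: str) -> tuple[str, bool]:
    """Return the expected default name in SmartSearch and a whitespace flag.

    Hypothetical helper: the first element is the name following the
    scheme FS_<project name>, the second is True if the project name
    contains whitespace and may therefore cause side effects.
    """
    name = f"FS_{project_name}"
    has_whitespace = any(character.isspace() for character in project_name)
    return name, has_whitespace
```

For the project "Demo" the result is ("FS_Demo", False). A project named "Smart Living" would be flagged, because the derived name contains a space.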

8.2. Renaming projects in FirstSpirit

So that the data generators and PreparedSearches automatically created in SmartSearch by the FirstSpirit module can be recognized, they follow the naming scheme "FS_<project name>" described above.

As a result, the FirstSpirit project must not be renamed, because SmartSearch cannot track this change. Data transferred after a renaming would result in the creation of a new data generator and a new PreparedSearch, and thus a new "index".

As described above, this will be changed in the future to use the project ID, which will then make renaming possible.

8.3. Filter for media

The module configuration defines which media types are processed by SmartSearch. This function has a limitation if a medium has different media types in different languages. In that case, media whose type is not selected in the configuration are also transferred if the medium in another language has a type that is to be transferred.

8.4. Use of media

Media that are included via CMS_INPUT_IMAGEMAP are not taken into account. The same applies to media that are included via FS_INDEX and Media Data Access Plugin (Media DAP).

8.5. Subsequent addition of the noindex flag in the database schema

As described in the chapter Hide datasets, SmartSearch Connect makes it possible to exclude datasets from the index by means of a flag. If this flag is added with a default value only after the first generation, it takes effect only after the affected datasets have been changed. If many datasets already exist, it is recommended to edit them by script.

SmartSearch Connect is a product of e-Spirit AG, Dortmund, Germany. Only a license agreed upon with e-Spirit AG is valid with respect to the user for using the module.

10. Help

The Technical Support of the e-Spirit AG provides expert technical support covering any topic related to the FirstSpirit™ product. You can get and find more help concerning relevant topics in our community.

11. Disclaimer

This document is provided for information purposes only. e-Spirit may change the contents hereof without notice. This document is not warranted to be error-free, nor subject to any other warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of merchantability or fitness for a particular purpose. e-Spirit specifically disclaims any liability with respect to this document and no contractual obligations are formed either directly or indirectly by this document. The technologies, functionality, services, and processes described herein are subject to change without notice.