1. Introduction

Search functions are just one of many features that customers expect from an online presence. They must be intuitive to use and deliver relevant results.

SmartSearch bundles these requirements into a high-performance search solution that is suitable for extensive websites. It offers both high hit quality and a comfortable search experience, which helps to retain customers on the website.

SmartSearch Connect makes it possible to combine the functionality of SmartSearch and FirstSpirit. With very little effort, a website created with FirstSpirit can be equipped with a high-performance search. Changes are directly reflected in the search results.

The goal of SmartSearch Connect is to create a simple and fast link between SmartSearch and FirstSpirit. Care has been taken to keep the installation and configuration effort as low as possible. To keep the module simple, compatibility with every FirstSpirit project was deliberately not a goal. For example, projects that use fragments are not yet supported.

2. Installation and configuration

To use the functionalities of the SmartSearch Connect module, it is necessary to install and configure different components. The steps required for this are explained in the following subchapters.

2.1. Project component configuration

Deploying SmartSearch Connect requires a project-specific configuration. This is done via the project component, which must be added to the project in use. To do this, open the ServerManager and select Project Properties → Project Components.

Figure 1. Selection for configuring the project component

The main panel shows a list of all existing project components. After clicking Add, select the SmartSearch ConnectProjectApp and confirm your selection with OK. The project component is then added to the list in the main panel and must be configured (see figure Selection for configuring the project component). To do so, select the entry in the list and open the corresponding configuration dialog via Configure (see figure Project component dialog).

Figure 2. Project component dialog

First, a name can be configured here for the DataGenerator and PreparedSearch that are generated automatically in SmartSearch. If the field is left empty, the current project name is used.

The Initialize in SmartSearch! button below creates the DataGenerator and PreparedSearch configured above and can be used at any time ("initialization"). If the field above is empty, the currently stored project name is used here as well. The result of the call can be found in the server log, because the initialization is triggered on the server.

The project configuration must specify the same URL Creator that is used in the generation that creates the website. In addition to the URL Creator, the prefix used in the generation must be specified. This is the only way to ensure that the URLs of the search hits refer to the corresponding pages.

In the Media types list, you can select which media should be transferred to SmartSearch using SmartSearch Connect. In this list, only file extensions of textually processable files that are present in the project are displayed. This list may have to be extended during the use of a project.

By default, SmartSearch delivers the URLs of the original resolutions for images. If you want to use URLs of other resolutions, you can select them in the Image resolutions list. SmartSearch Connect does not ensure that the images are accessible from the outside. This must be ensured in the project.

Only media referenced in FirstSpirit are processed. This ensures that only media that can be reached from outside are processed.

The Image resolutions list can be used to select resolutions for which URLs are to be generated for images. The URL to the original resolution is always generated. If further resolutions are to be used, for example for a preview in the search results, then this resolution must be selected in the list. The use of remote media is also possible. For this, the resolutions selected in the list must also be present in the remote media project. The prefix for remote media URLs is pulled from the remote project configuration (ServerManager for the project in which the media will be included).

By default, the documents in SmartSearch contain only the metadata of the corresponding element in FirstSpirit. If the metadata is to be inherited, the Inherit metadata checkbox must be selected.

This setting does not affect the mechanism for excluding elements from the SmartSearch index. For exclusion, inheritance in the structure takes effect even without this configuration.

The SmartSearch API is available in two versions. They differ mainly in how hierarchies from FirstSpirit are processed.

Under Configure API Request the use of an extra output channel can be configured. A more detailed description of this can be found in the SmartSearch output channel chapter.

2.2. SmartSearch object template set

SmartSearch Connect was designed to require as few steps as possible to process content from a FirstSpirit project in SmartSearch, so that the first results appear in the index quickly. For project-specific customizations, the Groovy script of the data generator in SmartSearch can be used. If this is not sufficient, SmartSearch Connect can pass a SmartSearchObject to SmartSearch by means of a template set. For this purpose, a template set must be created in the corresponding FirstSpirit project. It must be named SmaSeJSON and have JSON as the target file extension. The template set can be written using the usual template syntax. Its output must be valid JSON that meets the SmartSearchObject requirements.

In the configuration of the SmartSearch ConnectProjectApp it can be set how the SmartSearch object should be used. There are three options:

* Use only FormData - Here the SmartSearchObject is not used.
* Use FormData and SmartSearchObject - Here both the SmartSearch object and the FormData information are considered.
* Use only SmartSearchObject - Here only the SmartSearchObject is considered.

The information in lines 2-13 of the example JSON is always sent to SmartSearch (including metadata). The information below page is only submitted if the FormData information is selected for submission. For datasets, the situation is slightly different: here, FormData is directly at the root level.

Example JSON structure

   "displayName":"Thermostate Übersicht",
   "fsTitle":"Thermostate Übersicht",




2.2.1. SmartSearchObject Requirements

The SmartSearchObject makes it possible to write content into a SmartSearch document. Content in the SmartSearchObject overwrites content from other sources, for example the FormData. The following example shows the template set of a page template. In this case, the title of the SmartSearch document would be overwritten with the title from the page template.

Example of the template set of a page template

For SmartSearch to be able to process the JSON, it must consist of one-dimensional strings and string arrays only. Nesting is not possible.
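The flatness requirement above can be checked before a template set is rolled out. The following is a minimal sketch in Python, not part of SmartSearch Connect: the function name `find_invalid_fields` and the sample field names (`title`, `tags`, `teaser`) are assumptions made for illustration. It reports every field whose value is neither a string nor an array of strings.

```python
import json


def find_invalid_fields(raw_json: str) -> list[str]:
    """Return names of fields that are neither a string nor a list of strings."""
    document = json.loads(raw_json)
    if not isinstance(document, dict):
        raise ValueError("the top-level JSON value must be an object")
    invalid = []
    for name, value in document.items():
        is_string = isinstance(value, str)
        is_string_list = isinstance(value, list) and all(
            isinstance(item, str) for item in value
        )
        if not (is_string or is_string_list):
            invalid.append(name)
    return invalid


# Hypothetical template output: "teaser" is a nested object and gets reported.
sample = '{"title": "Thermostats", "tags": ["heating", "smart"], "teaser": {"text": "..."}}'
print(find_invalid_fields(sample))
```

Such a check could be run against the preview output of the SmaSeJSON template set. An empty result means that the output satisfies the structural requirement described above.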

SmartSearch Connect processes datasets differently from what is familiar in FirstSpirit. The datasets are sent independently, without being included in a page template. Therefore, the JSON of a dataset may look different in the preview than the JSON sent to SmartSearch.

2.3. Exclude elements from SmartSearch index

By default, SmartSearch Connect writes all elements to the SmartSearch index. To make it possible to prevent this directly in FirstSpirit, the project must be extended.

2.3.1. Hide pages

To hide pages, an input component must be added to the metadata. It must be a CMS_INPUT_TOGGLE with the name md_smartsearch_noindex.

Example of the input component in the metadata
<CMS_INPUT_TOGGLE name="md_smartsearch_noindex" type="radio" hFill="yes">
        <LANGINFO lang="*" label="Hide"/>
        <LANGINFO lang="*" label="On" description="Hide this site from search index."/>

If the metadata are included as described above, they are also displayed in the page store, for example. However, the setting is only evaluated in the site store. Therefore, it makes sense to use a rule to display the metadata only in the site store.

Example of the rule to create metadata only in the site store.
                        <PROPERTY name="STORETYPE" source="#global"/>
                <PROPERTY name="VISIBLE" source="md_smartsearch_noindex"/>
Hiding pages also works via folders. Please note that changes to folders only take effect after a full generation or a change to an affected page.

2.3.2. Hide datasets

Because datasets do not have metadata, an additional column of type boolean must be added to the corresponding tables in the database schema. The column must be named smartsearch_noindex. To edit the column's data, an input component must be added to the table template.

Pages and datasets that already exist in the index but are to be hidden by setting the corresponding field are only deleted from the index after a full generation. Eventing does not take effect at this point.

2.3.3. Removal of data via SmaSeJSON templating

The default handling of FirstSpirit content produces field names and field values depending on the structure and naming of the elements. In some cases it might be necessary to remove this automatically generated content, e.g. if the content is not supposed to be searched or even indexed due to legal or privacy concerns. If the SmaSeJSON template is active, field values can be cleared from the automatically generated data by passing an empty string in the corresponding JSON.

When the SmaSeJSON template produces the following line, any automatically generated values for the field FS_pt_supersecret are not indexed:

    "FS_pt_supersecret" : ""
Some fields and field values are generated and updated on the SmartSearch side after the entity has been sent by the FirstSpirit module, depending on the project setup. Changes to these values in the SmaSeJSON template may not be reflected in the indexed data. If changes are not reflected, check the Groovy script enhancers on the corresponding data generators as well as the SmartSearch reference regarding fields and field values.
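The effect of the SmaSeJSON values on the automatically generated data can be pictured with a small model. The following Python sketch only illustrates the behavior described in this chapter (values from the template overwrite generated values, and an empty string clears a field). It is not the module's implementation, and the function name `apply_smase_json` as well as the field `FS_pt_headline` are assumptions made for illustration.

```python
def apply_smase_json(generated: dict, smase_json: dict) -> dict:
    """Model of the documented effect; not the module's actual code."""
    merged = dict(generated)  # leave the generated data untouched
    for name, value in smase_json.items():
        if value == "":
            # An empty string clears the automatically generated value.
            merged.pop(name, None)
        else:
            # Any other value overwrites the generated one.
            merged[name] = value
    return merged


# Hypothetical generated data and template output.
generated = {"FS_pt_supersecret": "internal note", "FS_pt_headline": "Thermostats"}
smase_json = {"FS_pt_supersecret": ""}
print(apply_smase_json(generated, smase_json))
```

In this model, only `FS_pt_headline` remains in the result. Values that SmartSearch generates afterwards on its own side are outside the scope of the sketch.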

3. Full generation

The initial filling of SmartSearch requires a full generation. For this purpose, a new job is created in the job administration of the project configuration, and the SmartSearch Connect push action is added to its actions.

Figure 3. Selection of the full generation action

Executing this job creates a new data generator and a new PreparedSearch in SmartSearch (named after the FirstSpirit project, with the prefix FS_).

After a successful full generation, all documents that were already in the index before the full generation and were not updated by it are deleted. Thus, only current documents are left in the index.

4. Partial generation

To keep SmartSearch up to date after a partial generation it is necessary to add a script type action to the job. This script must be included after the actual generation and have the following content:

#! executable-class
Figure 4. Script for the partial generation

5. Eventing

To keep the contents in SmartSearch as up to date as possible, there is eventing in addition to generation. Here, changes (creation, deletion, modification of elements) are sent to SmartSearch directly after the elements have been released.

In order to trigger event-driven transfer of data in the direction of SmartSearch by editorial adjustments in ContentCreator, certain requirements must be met by the workflow used. For example, dependent objects must be released by the workflow as soon as an object is released (example: when a page change is released, the dependent page references must be released). To achieve this behavior, the BasicWorkflows module can be installed, for example.

6. Media

When processing media through SmartSearch Connect, the procedure differs for images and files. Therefore, the following chapters describe the respective procedure in more detail.

6.1. Images

Neither SmartSearch Connect nor SmartSearch processes images, so no information is extracted from them. Images are mostly needed as previews for search results, and usually a certain resolution is provided for this purpose. The resolutions for which URLs are generated can be set in the configuration of the project component (see Project component configuration).

The default fields in SmartSearch include the thumbnail field. This is not filled by SmartSearch Connect. For a thumbnail it is therefore necessary to select the appropriate resolution in the configuration and to use the appropriate resolution in the search hits.

6.1.1. Remote Media

For remote media, the resolution from the remote project must also be available in the local project so that it can be selected in the configuration. The configuration for the remote project is used to create the URL.

6.2. Files

SmartSearch processes files in formats such as PDF and DOCX. The files themselves are not sent from SmartSearch Connect to SmartSearch. Instead, SmartSearch Connect sends the corresponding URL, and SmartSearch downloads the files for processing. It is therefore mandatory that the files are accessible to SmartSearch.

If eventing is used, the processing of a file may be triggered before the file can be reached by SmartSearch. Thus, regular full deployment is necessary for files.

When files are used in a remote project, the following must be considered: events are generated for files only when the files are created or edited in the remote project. A full generation, on the other hand, only takes the files from the source project into account. In this case it is recommended to configure SmartSearch Connect for the remote project as well and to merge the resulting data generator with the source project via a PreparedSearch.

7. Datasets

SmartSearch Connect transfers datasets to SmartSearch via the preview page registered for the corresponding template. If there are multiple templates for a dataset type, the template with the most input fields is selected automatically. The datasets are then transferred to the index regardless of any exclusion of the preview page (see Hide pages). The background is that datasets can also be included in the search in a single-page application, e.g. in interaction with the CaaS. If certain datasets should not be transferred to the index, this can be prevented by means of Hide datasets or Removal of data via SmaSeJSON templating. In addition, we recommend always excluding content projection pages from indexing using Hide pages. Otherwise, datasets will be in the index multiple times. Also, content projections that serve as overview pages are usually not relevant for search.

For technical reasons it is necessary that the template belonging to a dataset has a preview page specified. This implies a usage of the dataset. However, the corresponding page should be excluded from the search index using Hide pages.

8. What happens in SmartSearch?

If a FirstSpirit project with the example name "Demo" sends data to SmartSearch for the first time, the elements required for use are created there automatically:

A DataGenerator of type "API" with the name "FS_Demo" (FS_<project name>):

Figure 5. Automatically generated DataGenerator

A PreparedSearch with the name "FS_Demo":

Figure 6. Automatically generated PreparedSearch

The corresponding DataGenerator is already preconfigured at this location:

Figure 7. Automatically generated link between PreparedSearch and DataGenerator

8.1. DataGenerator

The DataGenerators generated by the module are permanently set to the status "API Ready", except for the short moments when data is received. If the data is received very quickly, the status change is not visible in the cockpit. In the log, the reception of data is only visible at log level DEBUG. A corresponding entry for logback-spring.xml looks like this:

<logger name="de.arithnea.haupia.datageneration.crawler.api.reactive" level="DEBUG"/>

DataGenerators of type "API" cannot be started and stopped because they are permanently waiting for data.

Data transmitted to the DataGenerator is transferred to the Solr index after 30 seconds. For test purposes, this period can be reduced to e.g. one second using the environment variable API_CRAWLER_COMMIT_WITHIN on the SmartSearch system. This is not intended for production use.


8.2. PreparedSearch

The preconfigured PreparedSearch can be retrieved via REST immediately after receipt of the data. Facets, for example, can be maintained on it as usual. The facet fsType contains the type of the entity exported from FirstSpirit:

Figure 8. Facets language and fsType

A configuration that may be important when using the PreparedSearch is, for example, adding the field link to the output, so that results returned by the interface include the URLs generated by FirstSpirit:

Figure 9. Added field link
The list of field names per DataGenerator in the cockpit is cached. It may take a few minutes for the cache to be populated after the first data is transferred to SmartSearch.

8.3. Data processing

As described in the chapter Project component configuration, the API version to be used can be selected in the configuration of the ProjectApp under Configure API Request. The selected version affects how the data transferred from FirstSpirit is processed by SmartSearch. In both versions, textual content is combined in the field content. The difference lies, among other things, in the processing of input components that can be used as facets, for example date fields and dropdowns.

The way the structure from FirstSpirit is converted to a flat SmartSearch document is the main difference between the two API versions and is described in the following sections.

8.3.1. API Version V1

For API version V1, the fields in the document are named according to the pattern FS_L*_NAME. FS_ is the prefix that indicates that the element comes from FirstSpirit. L*_ represents the depth of the element in the FirstSpirit structure, where * is a numeric value. NAME is the name of the input component. Accordingly, an input component st_title on the 5th level is stored as FS_L5_st_title. The levels are, for example, the page reference, the page and a section attached to it.
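The V1 naming pattern can be expressed as a pair of small helper functions. This is a minimal Python sketch based only on the pattern FS_L*_NAME described above. The function names `v1_field_name` and `parse_v1_field_name` are assumptions made for illustration and are not part of SmartSearch or SmartSearch Connect.

```python
import re

# FS_ prefix, L<depth>_ level marker, then the input component name.
V1_PATTERN = re.compile(r"FS_L(\d+)_(.+)")


def v1_field_name(depth: int, component_name: str) -> str:
    """Build a V1 field name from a depth and an input component name."""
    if depth < 1:
        raise ValueError("depth must be a positive number")
    return f"FS_L{depth}_{component_name}"


def parse_v1_field_name(field_name: str) -> tuple[int, str]:
    """Split a V1 field name into its depth and input component name."""
    match = V1_PATTERN.fullmatch(field_name)
    if match is None:
        raise ValueError(f"not a V1 field name: {field_name!r}")
    return int(match.group(1)), match.group(2)


print(v1_field_name(5, "st_title"))
print(parse_v1_field_name("FS_L5_st_title"))
```

A parser like this can help when searching the field list of a DataGenerator for all levels at which a given input component occurs.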

8.3.2. API Version V2

In API version V2, the fields in the document also start with FS_. This is followed by the name of the input component. To represent the depth of the element, the names of the input components are combined if necessary, with ___ as the separator. The field FS_pt_slider___st_description, for example, contains the description that is nested below a slider. If the name is not unique, e.g. because the same section is used multiple times, the contents are stored as multiple values if the field type allows it. This processing is more intuitive than in API version V1, where the depth is difficult to predict without knowing the exact template structure in the project and thus the JSON generated in the background.
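The V2 naming can be sketched in the same way. The following Python example is a simplified illustration of the rules described above (prefix FS_, component names joined with ___, repeated names collected as multiple values). It ignores field types, and the names `v2_field_name` and `flatten_v2` as well as the sample values are assumptions made for illustration.

```python
SEPARATOR = "___"


def v2_field_name(path: list[str]) -> str:
    """Join the input component names along the nesting path."""
    return "FS_" + SEPARATOR.join(path)


def flatten_v2(entries: list[tuple[list[str], str]]) -> dict[str, list[str]]:
    """Collect values per field name; repeated names become multi-value fields."""
    document: dict[str, list[str]] = {}
    for path, value in entries:
        document.setdefault(v2_field_name(path), []).append(value)
    return document


# Hypothetical content: the same slider section is used twice on a page.
entries = [
    (["pt_slider", "st_description"], "First slide"),
    (["pt_slider", "st_description"], "Second slide"),
    (["st_title"], "Thermostats"),
]
print(flatten_v2(entries))
```

In contrast to V1, the resulting field names do not depend on how deep a section is placed in the structure, only on the names of the input components involved.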

It is recommended to use the API version V2. The focus of further development will be on this version.

9. Use cases

The sections of this chapter list some use cases and best practices.

9.1. Headless scenario in combination with CaaS

When implementing a search solution for headless portals, the handling of item URLs is an important aspect that requires special attention. The prerequisite for all subsequent steps and actions is the implementation of Content-as-a-Service (CaaS) with the appropriate module in the corresponding project.

The difference between using SmartSearch in a headless scenario and a page search is that the search result cannot return a unique URL. In this case, using the FirstSpirit UUID is the preferred approach. It is already included in the data from SmartSearch and the CaaS. This UUID can be used to retrieve the record in the CaaS that matches the search hit. How these records are presented to searchers is up to the frontend, especially if, as in the classic view, selecting a search result should lead to a specific page where the record is included.

In the project configuration, a different 'URLCreator' must first be selected for the project.

Most 'URLCreators' (e.g. 'Default URLs') create static URLs containing the entire web address, while the 'CaaS Connect Url Factory' creates CaaS paths that refer to the position of the content within the CaaS platform.

If content is transferred to SmartSearch with 'CaaS Connect Url Factory' enabled, it will contain the appropriate URLs in the 'link' field in the search index to retrieve the data in CaaS.

If static URLs (such as those generated by the 'URLCreator' 'Default URLs') are to be available in the search index, you can alternatively proceed as explained below.

To query a PageRef at CaaS, the following URL is required (cf. documentation):


While these are indexed into the 'link' field when using the 'CaaS Connect Url Factory', they must be assembled from various sources of information when using other 'URLCreators'.

The part 'https://<tenant>-caas-api.e-spirit.cloud/<tenant>/<project_uuid>.release.content/' can be taken from the CaaS module configuration. To do this, open the menu item 'Project components' in the project configuration and then open the component 'CaaS Connect Project App'. The required part can be found there in the field 'CaaS REST API endpoints' → 'Release Collection API'.

The '<page_ref_uuid>' part corresponds to the UID from the FirstSpirit context (see documentation). This is always available in SmartSearch as the 'identifier' value if the data was transferred via SmartSearch Connect. In the JSON which is transferred by SmartSearch Connect this is present in the following form:

  "fsType": "PageRef",

If a PageRef is released and thus transferred to SmartSearch, the value 'identifier' is available in the data of the corresponding data generator.

The locale which is required in the part '<lang_country>' is not included in the JSON, so it must be added to the transmitted data accordingly.

For this, the SmaSeJSON template must be configured and enabled as explained in the SmartSearch object template set section. In the template set of the corresponding elements, here a 'PageRef', the locale can be added as follows:


If a PageRef is now released and thus transferred to SmartSearch, the value 'locale' is available in the data of the corresponding data generator.

With the information from the project configuration and the data transferred to SmartSearch the PageRef and all related information stored in the CaaS can be queried.
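How the three parts could be combined in a frontend is sketched below in Python. The sketch rests on an assumption that must be verified against the CaaS documentation: that the document URL is formed as the Release Collection API URL, followed by the identifier and the locale separated by a dot. The function name `caas_document_url`, the `pattern` parameter and all sample values are hypothetical.

```python
def caas_document_url(
    release_collection_url: str,
    identifier: str,
    locale: str,
    # ASSUMPTION: adjust this pattern to the URL format given in the CaaS documentation.
    pattern: str = "{base}/{identifier}.{locale}",
) -> str:
    """Assemble a CaaS URL from the project configuration and a search hit."""
    base = release_collection_url.rstrip("/")  # avoid a double slash
    return pattern.format(base=base, identifier=identifier, locale=locale)


# Hypothetical values: the base URL comes from the CaaS Connect Project App,
# identifier and locale come from the fields of a search hit.
url = caas_document_url(
    "https://mytenant-caas-api.e-spirit.cloud/mytenant/1234.release.content/",
    "abcd-ef01",
    "en_GB",
)
print(url)
```

Keeping the pattern as a parameter makes the assumption explicit and allows it to be corrected in one place.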

For this purpose, a PreparedSearch can be created in SmartSearch, which makes the 'identifier' and 'link' fields available in a search result. Combined with the information from the FirstSpirit project configuration, content can thus be retrieved from the CaaS APIs using a search result instead of a static URL.

10. Known problems and limitations

The following sections list known issues for this version.

10.1. FirstSpirit-project name

Currently, the project name in FirstSpirit should not contain any spaces if possible. While the export to SmartSearch with a space character works, there may be some unpredictable side effects, since spaces are not allowed for DataGenerator and PreparedSearch names.

10.2. Renaming projects in FirstSpirit

To make the data generators and PreparedSearches automatically created by the FirstSpirit module in SmartSearch recognizable, they follow the naming scheme "FS_<project name>" described above.

As a result, the FirstSpirit project must not be renamed, as SmartSearch cannot follow this customization. For SmartSearch, data transferred afterwards would result in the creation of a new data generator and a new PreparedSearch, and thus a new "index".

10.3. Rename resolutions

In the configuration of the project component, the resolutions processed by SmartSearch Connect can be selected. If the display name of a resolution is renamed in FirstSpirit, the corresponding selection in the configuration is lost and must be set again.

10.4. Filter for media

In the configuration of the project component, the media types to be processed by SmartSearch can be maintained. This function has a limitation if a medium has different media types in different languages. In this case, media whose type is not selected in the configuration are also transferred if the medium in another language is of a type that is to be transferred.

10.5. Use of media

Media that are included via CMS_INPUT_IMAGEMAP are not taken into account. The same applies to media that are included via FS_INDEX and Media Data Access Plugin (Media DAP).

10.6. Media without processable textual content

If media are transferred to SmartSearch, SmartSearch may not be able to extract any text from them. An example of such a medium would be a PDF file containing only images. In this case, they would not be indexed because media processing writes the content of the medium to the 'content' field, which must not be empty. This can be remedied by ensuring that content is present in the 'content' field of all media using a Groovy Script Enhancer. This can be done, for example, by copying a standard text or the content of another field into the 'content' field if the 'content' field is empty.
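The fallback logic described above can be outlined as follows. The actual enhancer would be written as a Groovy script in SmartSearch. The Python sketch below only illustrates the decision logic on a plain dictionary and does not use the enhancer API. The function name `ensure_content`, the fallback field `title` and the default text are assumptions made for illustration.

```python
DEFAULT_TEXT = "No extractable text"  # hypothetical standard text


def ensure_content(document: dict, fallback_field: str = "title") -> dict:
    """Return a copy of the document whose 'content' field is never empty."""
    result = dict(document)
    content = result.get("content")
    if isinstance(content, str) and content.strip():
        return result  # text was extracted, nothing to do
    fallback = result.get(fallback_field)
    if isinstance(fallback, str) and fallback.strip():
        result["content"] = fallback  # copy the content of another field
    else:
        result["content"] = DEFAULT_TEXT  # fall back to a standard text
    return result


# Hypothetical document for a PDF that contains only images.
print(ensure_content({"title": "Floor plan", "content": ""}))
```

Copying another field is usually preferable to a fixed standard text, because the document then remains findable by a meaningful term.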

10.7. Subsequent addition of the noindex flag in the database schema

As described in the chapter Hide datasets, SmartSearch Connect offers the possibility to exclude datasets from the index by means of a flag. If this flag is added only after the first generation with a default value, then this takes effect only after the first changes to the affected datasets. If there are already many records it is recommended to edit them by script.

10.8. Structure of documents in SmartSearch index

The structure of the documents in the SmartSearch index depends strongly on the structure of the project, so no general statement about it is possible. However, especially during development, it can be very helpful to understand the structure better. Because it is based on the JSON generated by FirstSpirit, the following scripts can be used to inspect the structure of that JSON. To use them in SiteArchitect, right-click the page reference or dataset and select Run Script → Developer Scripts (public) → beanshellconsole. The corresponding script can then be executed in the console that appears.

For page references, the script looks like this:

// Obtain the rendering agent from the script context.
renderingAgent = context.requireSpecialist(de.espirit.firstspirit.agency.RenderingAgent.TYPE);
// Template source: sets the JSON rendering options and outputs the JSON of the current node.
templateSource = "$CMS_SET(#global.json.dataRenderDepth, 1)$$CMS_SET(#global.json.formatVersion, 1.2)$$CMS_SET(#global.json.sectionTemplateRendering, false)$$CMS_SET(#global.json.useDefaultHtmlTemplateProvider, true)$$CMS_SET(#global.json.metaDataRendering, true)$$CMS_SET(#global.json.resolveDynamicContent, true)$$CMS_VALUE(json(#global.node))$";
// Render with e, the element on which the script was started, as link root.
renderObject = renderingAgent.createRenderer(templateSource).linkRoot(e);
jsonOutput = renderObject.render();

For datasets, the script looks like this:

import de.espirit.firstspirit.access.store.Store;

// Obtain the rendering agent from the script context.
renderingAgent = context.requireSpecialist(de.espirit.firstspirit.agency.RenderingAgent.TYPE);
// Template source: sets the JSON rendering options and outputs the JSON of idProvider.
templateSource = "$CMS_SET(#global.json.dataRenderDepth, 1)$$CMS_SET(#global.json.formatVersion, 1.2)$$CMS_SET(#global.json.sectionTemplateRendering, false)$$CMS_SET(#global.json.useDefaultHtmlTemplateProvider, true)$$CMS_SET(#global.json.metaDataRendering, true)$$CMS_SET(#global.json.resolveDynamicContent, true)$$CMS_VALUE(json(idProvider))$";
// Use the site store of the project as link root.
renderObject = renderingAgent.createRenderer(templateSource).linkRoot(e.getProject().userService.getStore(Store.Type.SITESTORE, true));
// Pass e, the element on which the script was started, to the template as idProvider.
renderObject.additionalContext("idProvider", e);
jsonOutput = renderObject.render();

Some fields, such as the URL, links to images and the title, are only added by the module and are therefore not present in the output of the scripts.

SmartSearch Connect is a product of e-Spirit AG, Dortmund, Germany. Only a license agreed upon with e-Spirit AG is valid with respect to the user for using the module.

12. Help

The Technical Support of Crownpeak Technology GmbH provides expert technical support covering any topic related to the FirstSpirit™ product. Further help on relevant topics is available in our community.

13. Disclaimer

This document is provided for information purposes only. Crownpeak Technology GmbH may change the contents hereof without notice. This document is not warranted to be error-free, nor subject to any other warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of merchantability or fitness for a particular purpose. Crownpeak Technology GmbH specifically disclaims any liability with respect to this document and no contractual obligations are formed either directly or indirectly by this document. The technologies, functionality, services, and processes described herein are subject to change without notice.