Please note that "haupia" was renamed to "SmartSearch" in version 2.4.0. Wherever discrepancies or confusion arise, "SmartSearch" should be considered the correct and current term. Older scripts and examples might not reflect this change. We apologize for any inconvenience and hope that the instances requiring correction will be minimal.
1. Introduction
In SmartSearch, the search index is filled by so-called datagenerators. In addition to the provided datagenerators, you can add your own. These are called external datagenerators and are implemented against the external datagenerator REST API.
Two requirements must be met before the external datagenerator REST API can be used. First, an external datagenerator has to be created in the SmartSearch cockpit. It is later identified by its name. Second, a technical user with the right to execute a datageneration has to exist.
All calls to the REST API must authenticate with an Authorization header (see section 4.2 of RFC 7235) of type "Basic". If the header is missing or the credentials are not valid, the REST services return HTTP status code 401 (Unauthorized).
The REST service works exclusively with JSON data, except for the call to the datagenerator status. Apart from that call, the Accept header is expected to be set to "application/json". Calls that send data in JSON format additionally require a Content-Type header with the value "application/json".
Using the API is similar to using a transaction. First, a begin is issued. Then documents are added to the datagenerator. Once all documents have been submitted, the session is concluded by calling commit. The session can be aborted at any time. After the session is over, either by commit or by abort, the resources held on the server are freed.
Before the data is synced, the configured enhancers are applied. These are configured in the SmartSearch cockpit.
Most services allow an additional message to be sent. This message is logged at INFO level and is also sent to the cockpit, where it is shown on the datagenerator list page.
The SmartSearch REST API for external datagenerators has been available since SmartSearch version 2.0.0.59.
The SmartSearch REST API for external datagenerators supersedes the SmartSearch 1 Java frontend API (external datagenerator part).
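The header requirements described in this section can be sketched in Python as follows. This is a minimal illustration using only the standard library. The function names `basic_auth_header` and `json_headers` are hypothetical and not part of SmartSearch, and the credentials are placeholders.

```python
import base64


def basic_auth_header(user: str, password: str) -> str:
    """Return the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def json_headers(user: str, password: str, has_body: bool) -> dict:
    """Headers for the JSON endpoints of the external datagenerator API.

    Content-Type is only added when a JSON body is actually sent.
    """
    headers = {
        "Accept": "application/json",
        "Authorization": basic_auth_header(user, password),
    }
    if has_body:
        headers["Content-Type"] = "application/json"
    return headers


if __name__ == "__main__":
    # Placeholder credentials of the technical user.
    print(json_headers("user", "password", has_body=True))
```

With the placeholder credentials `user` and `password`, the header value is `Basic dXNlcjpwYXNzd29yZA==`, which is the value used in the request examples of this document.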
2. Begin a datageneration
Every datageneration is based on a storage. The storage persists the documents during the datageneration until they are synced with the index. When beginning a datageneration, you can choose between a fresh storage and a storage pre-filled with the documents from the datagenerator's previous run. To begin a datageneration, the following REST call may be used:
- Method: POST
- URL: /rest/api/v1/datagenerator/external/{name}/begin
name | The configured datagenerator name
The body of the REST call may be a JSON object containing the desired storage creation type. The body is optional. If no body is sent, a fresh storage is applied. A full example of a call to the REST service:
POST /rest/api/v1/datagenerator/external/eocpyfmhzr/begin HTTP/1.1
Content-Type: application/json
Accept: application/json
Authorization: Basic dXNlcjpwYXNzd29yZA==
Host: localhost:8181
Content-Length: 64

{
  "message" : "Started DG",
  "creationType" : "COPY_LATEST"
}
The JSON object of the request body is as follows:
Field | Type | Description
creationType | String | OPTIONAL. The storage creation type. May be one of: NEW, COPY_LATEST
message | String | OPTIONAL. Message used for logging and sent to the cockpit.
If the JSON object does not contain the storage creation type, the default (a fresh storage) is applied.
The response has no content.
Only one datageneration may be active at a time. An attempt to start a datageneration while another one is running is answered with response code 412 (PRECONDITION FAILED).
The following table contains the main HTTP response codes:
Code | Status | Description
204 | NO CONTENT | The datageneration was successfully started.
400 | BAD REQUEST | The given JSON data was either an array or not a valid JSON string.
404 | NOT FOUND | A datagenerator with the given name does not exist.
412 | PRECONDITION FAILED | There is already a datageneration running.
415 | UNSUPPORTED MEDIA TYPE | Either the accept header (as well as the content-type header if content is sent) is missing or not set to 'application/json'.
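As a hedged illustration, the begin call could be prepared in Python with `urllib.request` as shown in the following sketch. The function name `build_begin_request` and the `base_url` parameter are assumptions made for this example. The endpoint path, the field names, and the allowed creation types are taken from this section.

```python
import json
import urllib.request
from typing import Optional
from urllib.parse import quote


def build_begin_request(base_url: str, name: str, auth_header: str,
                        creation_type: Optional[str] = None,
                        message: Optional[str] = None) -> urllib.request.Request:
    """Prepare (but do not send) the POST request that begins a datageneration."""
    body = {}
    if creation_type is not None:
        if creation_type not in ("NEW", "COPY_LATEST"):
            raise ValueError("creation_type must be NEW or COPY_LATEST")
        body["creationType"] = creation_type
    if message is not None:
        body["message"] = message

    url = f"{base_url}/rest/api/v1/datagenerator/external/{quote(name, safe='')}/begin"
    headers = {"Accept": "application/json", "Authorization": auth_header}
    data = None
    if body:
        # The body is optional; Content-Type is only needed when one is sent.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


if __name__ == "__main__":
    request = build_begin_request(
        "http://localhost:8181", "mygenerator", "Basic dXNlcjpwYXNzd29yZA==",
        creation_type="COPY_LATEST", message="Started DG")
    print(request.get_method(), request.full_url)
    # Sending requires a running server. urlopen raises urllib.error.HTTPError
    # for error codes such as 404 or 412:
    # with urllib.request.urlopen(request) as response:
    #     print(response.status)
```

Building the request separately from sending it keeps the sketch runnable without a server and makes the 412 case easy to handle in one place around the `urlopen` call.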
3. Add a document
Once a datageneration has been started, one or more documents need to be added. A document is represented as a JSON object whose "data" key holds the document data. The data itself is a JSON object. Its entries are mapped to the document by using each key as the field name and each value as the field value. A value may be given as a simple string or, if there is more than one value, as an array. A single value may also be given as an array with one element, which is equivalent to the simple string. The following example illustrates all of these possibilities:
{
  "uid": "abc123",
  "data": {
    "key1": [
      "val1",
      "val2",
      "val3"
    ],
    "key2": "val4",
    "key3": [ "val5" ]
  }
}
The uid of the document is a required field. The uid is expected to be unique across the datageneration. If a document with the same uid already exists, the old document is replaced.
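The document structure described above can be produced from ordinary Python data, as in the following minimal sketch. The helper `make_document` is hypothetical. Converting all values to strings and collapsing one-element lists to a plain string are choices made for this example (the text above states that both forms are equivalent), not requirements of the API.

```python
import json
from typing import Optional


def make_document(uid: str, fields: dict, message: Optional[str] = None) -> dict:
    """Build the JSON-serializable document object for one index document."""
    if not uid:
        raise ValueError("uid is required")
    data = {}
    for key, value in fields.items():
        if isinstance(value, (list, tuple)):
            values = [str(v) for v in value]
        else:
            values = [str(value)]
        # A single value is sent as a plain string, several values as an array.
        data[key] = values[0] if len(values) == 1 else values
    document = {"uid": uid, "data": data}
    if message is not None:
        document["message"] = message
    return document


if __name__ == "__main__":
    doc = make_document("abc123", {
        "key1": ["val1", "val2", "val3"],
        "key2": "val4",
        "key3": ["val5"],
    })
    print(json.dumps(doc, indent=2))
```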
The REST service is defined as follows:
- Method: PUT
- URL: /rest/api/v1/datagenerator/external/{name}
name | The configured datagenerator name
The body of the REST call is required and contains the document to add.
PUT /rest/api/v1/datagenerator/external/ciozgcbkvh HTTP/1.1
Content-Type: application/json
Accept: application/json
Authorization: Basic dXNlcjpwYXNzd29yZA==
Host: localhost:8181
Content-Length: 101

{
  "message" : "",
  "data" : {
    "toldt" : [ "rvhhf", "wdzun", "perda" ]
  },
  "uid" : "mdzaj"
}
In case of success a status code 204 (NO CONTENT) is returned.
The following table contains the main HTTP response codes:
Code | Status | Description
204 | NO CONTENT | The document was successfully added.
400 | BAD REQUEST | The given JSON data was either an array or not a valid JSON string. This status code is also returned if the uid is missing in the document.
404 | NOT FOUND | The datagenerator name does not exist.
415 | UNSUPPORTED MEDIA TYPE | Either the accept header (as well as the content-type header if content is sent) is missing or not set to 'application/json'.
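The following sketch shows how the PUT request for a single document might be prepared in Python. The function name `build_add_document_request` and the `base_url` parameter are assumptions for illustration. The uid check mirrors the 400 case from the table above so that the mistake is caught before the request is sent.

```python
import json
import urllib.request
from urllib.parse import quote


def build_add_document_request(base_url: str, name: str, auth_header: str,
                               document: dict) -> urllib.request.Request:
    """Prepare (but do not send) the PUT request that adds one document."""
    if not document.get("uid"):
        # The server would answer a missing uid with 400 (BAD REQUEST).
        raise ValueError("document needs a non-empty 'uid'")
    url = f"{base_url}/rest/api/v1/datagenerator/external/{quote(name, safe='')}"
    return urllib.request.Request(
        url,
        data=json.dumps(document).encode("utf-8"),
        headers={
            "Accept": "application/json",
            "Content-Type": "application/json",
            "Authorization": auth_header,
        },
        method="PUT",
    )


if __name__ == "__main__":
    request = build_add_document_request(
        "http://localhost:8181", "mygenerator", "Basic dXNlcjpwYXNzd29yZA==",
        {"uid": "mdzaj", "data": {"toldt": ["rvhhf", "wdzun", "perda"]}})
    print(request.get_method(), request.full_url)
    # Sending requires a running server and a begun datageneration:
    # with urllib.request.urlopen(request) as response:
    #     print(response.status)
```

Each document is sent in its own request. Adding several documents means calling the service once per document.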
4. Commit datageneration
As soon as all documents have been added, the datageneration is committed. During this stage the enhancers iterate over all documents, the storage is synced, and the resources are cleaned up. The REST call itself returns immediately, so its response neither reflects problems that occur during the sync nor indicates when the process has finished.
- Method: POST
- URL: /rest/api/v1/datagenerator/external/{name}/commit
name | The configured datagenerator name
The body of the REST call is optional and may contain the message which should be submitted with the commit.
POST /rest/api/v1/datagenerator/external/vlrqjvjhhn/commit HTTP/1.1
Content-Type: application/json
Accept: application/json
Authorization: Basic dXNlcjpwYXNzd29yZA==
Host: localhost:8181
Content-Length: 29

{
  "message" : "Commit DG"
}
In case of success a status code 204 (NO CONTENT) is returned.
The following table contains the main HTTP response codes:
Code | Status | Description
204 | NO CONTENT | The commit was successfully started.
404 | NOT FOUND | The datagenerator name does not exist.
415 | UNSUPPORTED MEDIA TYPE | Either the accept header (as well as the content-type header if content is sent) is missing or not set to 'application/json'.
5. Abort datageneration
A running datageneration can be aborted at any time. The resources reserved on the server side are then freed. The REST call itself returns immediately, so its response neither reflects problems that occur afterwards nor indicates when the process has finished.
- Method: POST
- URL: /rest/api/v1/datagenerator/external/{name}/abort
name | The configured datagenerator name
The body of the REST call is optional and may contain the message which should be submitted with the abort.
POST /rest/api/v1/datagenerator/external/ysxiplvrtd/abort HTTP/1.1
Content-Type: application/json
Accept: application/json
Authorization: Basic dXNlcjpwYXNzd29yZA==
Host: localhost:8181
Content-Length: 28

{
  "message" : "Abort DG"
}
In case of success a status code 204 (NO CONTENT) is returned.
The following table contains the main HTTP response codes:
Code | Status | Description
204 | NO CONTENT | The abort was successfully started.
404 | NOT FOUND | The datagenerator name does not exist.
415 | UNSUPPORTED MEDIA TYPE | Either the accept header (as well as the content-type header if content is sent) is missing or not set to 'application/json'.
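The transaction-like flow of begin, add, commit, and abort can be sketched as a single Python function. To keep the example runnable without a server, the HTTP transport is passed in as a callable. This `send` hook and the function `run_datageneration` are hypothetical names for this illustration, and the policy of aborting after any failure is one possible design, not a documented requirement.

```python
from typing import Callable, Iterable, Optional

# Hypothetical transport hook: send(method, path, json_body) -> HTTP status code.
Send = Callable[[str, str, Optional[dict]], int]


def run_datageneration(send: Send, name: str, documents: Iterable[dict],
                       creation_type: str = "NEW") -> int:
    """Begin a session, add all documents, then commit; abort on any failure.

    Returns the number of documents that were added.
    """
    base = f"/rest/api/v1/datagenerator/external/{name}"

    status = send("POST", f"{base}/begin", {"creationType": creation_type})
    if status != 204:
        # 412 means that another datageneration is still running.
        raise RuntimeError(f"begin failed with HTTP {status}")

    added = 0
    try:
        for document in documents:
            status = send("PUT", base, document)
            if status != 204:
                raise RuntimeError(
                    f"adding {document.get('uid')!r} failed with HTTP {status}")
            added += 1
        status = send("POST", f"{base}/commit",
                      {"message": f"Committed {added} documents"})
        if status != 204:
            raise RuntimeError(f"commit failed with HTTP {status}")
    except Exception:
        # Free the server-side resources before propagating the error.
        send("POST", f"{base}/abort", {"message": "Aborted after error"})
        raise
    return added


if __name__ == "__main__":
    calls = []

    def fake_send(method, path, body):
        # Stand-in for a real HTTP client; always reports success.
        calls.append((method, path))
        return 204

    run_datageneration(fake_send, "mygenerator",
                       [{"uid": "a", "data": {"title": "A"}}])
    for method, path in calls:
        print(method, path)
```

Because the commit call returns immediately, a 204 from commit only confirms that the commit was started, not that the sync succeeded.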
6. Get datagenerator status
The datagenerator has a specific status on the server. The status is shown, for example, on the datagenerator list page. An external datagenerator may only be started if the current state is IDLE or ERROR. Otherwise the call to begin returns 412 (PRECONDITION FAILED). To get the current state of a datagenerator, the following REST service may be used:
- Method: GET
- URL: /rest/api/v1/datagenerator/external/{name}/status
name | The configured datagenerator name
The response is a string representing the status of the datagenerator. The result content type is text/plain and should be requested accordingly:
GET /rest/api/v1/datagenerator/external/xigtbewetb/status HTTP/1.1
Accept: text/plain
Authorization: Basic dXNlcjpwYXNzd29yZA==
Host: localhost:8181
An example response looks like this:
CRAWLING
In case of success a status code 200 (OK) is returned.
The following table contains the main HTTP response codes:
Code | Status | Description
200 | OK | The request was successful.
404 | NOT FOUND | The datagenerator name does not exist.
415 | UNSUPPORTED MEDIA TYPE | Either the accept header (as well as the content-type header if content is sent) is missing or not set to 'application/json'.
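Since commit and abort return immediately, a client may want to poll the status service until the datagenerator can be started again. The following is a minimal sketch under stated assumptions: the function names `fetch_status` and `wait_until_finished`, the timeout, and the polling interval are hypothetical. It treats IDLE and ERROR as terminal because these are the states in which begin is allowed, and it makes no assumption about which other states exist.

```python
import time
import urllib.request
from typing import Callable


def fetch_status(base_url: str, name: str, auth_header: str) -> str:
    """Read the plain-text status of a datagenerator (needs a running server)."""
    request = urllib.request.Request(
        f"{base_url}/rest/api/v1/datagenerator/external/{name}/status",
        headers={"Accept": "text/plain", "Authorization": auth_header},
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8").strip()


def wait_until_finished(get_status: Callable[[], str],
                        timeout_s: float = 300.0,
                        interval_s: float = 2.0,
                        sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll get_status until it reports IDLE or ERROR, then return that state."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status().strip()
        if status in ("IDLE", "ERROR"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"datagenerator still in state {status}")
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated sequence of status responses instead of a real server.
    simulated = iter(["CRAWLING", "CRAWLING", "IDLE"])
    print(wait_until_finished(lambda: next(simulated), sleep=lambda s: None))
```

Against a real server, `get_status` could be `lambda: fetch_status("http://localhost:8181", "mygenerator", auth_header)`, with the host taken from the request examples in this document.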
7. Legal notices
SmartSearch is a product of Crownpeak Technology GmbH, Dortmund, Germany.
Only a license agreed upon with Crownpeak Technology GmbH is valid with respect to the user for using the module.
8. Help
The Technical Support of Crownpeak Technology GmbH provides expert technical support covering any topic related to the FirstSpirit™ product. You can find more help on relevant topics in our community.
9. Disclaimer
This document is provided for information purposes only. Crownpeak Technology GmbH may change the contents hereof without notice. This document is not warranted to be error-free, nor subject to any other warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of merchantability or fitness for a particular purpose. Crownpeak Technology GmbH specifically disclaims any liability with respect to this document and no contractual obligations are formed either directly or indirectly by this document. The technologies, functionality, services, and processes described herein are subject to change without notice.