1. Introduction
The CaaS platform is the link between FirstSpirit and the customer’s end application. The REST Interface receives information and updates it in the internal persistence layer of the CaaS platform. The customer’s end application in turn updates its data through requests to the REST Interface.
The CaaS platform includes the following components, which are available as docker containers:
REST Interface (caas-rest-api)
The REST Interface is used both for transferring and retrieving data to and from the repository. For this purpose it provides a REST endpoint that can be used by any service. It also supports authentication and authorization.
Between CaaS version 2.11 and 2.13 (inclusive), the authentication and authorization functionality was provided by a separate Security Proxy.
CaaS repository (caas-mongo)
The CaaS repository is not accessible from the Internet and can only be accessed within the platform via the REST Interface. It serves as storage for all project data and internal configuration.
2. Technical requirements
The operation of the CaaS platform has to be realized with Kubernetes.
If you are not in a position to operate, configure, and monitor the cluster infrastructure, and to analyze and resolve its operating problems, we strongly advise against on-premises operation and refer you to our SaaS offering.
Since the CaaS platform is delivered as a Helm artifact, the Helm client must be available.
It is important that Helm is installed in a secure manner. For more information, refer to the Helm Installation Guide.
For system requirements please consult the technical data sheet of the CaaS platform.
3. Installation and configuration
The setup of the CaaS platform for operation with Kubernetes is done by using Helm-Charts. These are part of the delivery and already contain all necessary components.
The following subchapters describe the necessary installation and configuration steps.
3.1. Import of the images
The first step in setting up the CaaS platform for operation with Kubernetes requires the import of the images into your central Docker registry (e.g. Artifactory). The images are contained in the file caas-docker-images-16.18.2.zip in the delivery.
The credentials for cluster access to the repository must be known.
The steps necessary for the import can be found in the documentation of the registry you are using.
3.2. Configuration of the Helm chart
After the import of the images, the Helm chart must be configured. The chart is part of the delivery and contained in the file caas-16.18.2.tgz. A default configuration of the chart is already made in the values.yaml file. Every parameter specified in this values.yaml can be overwritten with a specific value in a manually created custom-values.yaml.
3.2.1. Authentication
All authentication settings for the communication with or within the CaaS platform are specified in the credentials block of the custom-values.yaml. Here you will find usernames and default passwords as well as the CaaS Master API Key. It is strongly recommended to change the default passwords and the CaaS Master API Key.
All selected passwords must be alphanumeric. Otherwise, problems will occur in connection with CaaS.
The CaaS Master API Key is automatically created during the installation of CaaS and thus allows the direct use of the REST Interface.
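The alphanumeric requirement can be met with a small generator. The following is a minimal sketch using only the Python standard library; the function name and the default length of 32 characters are assumptions for illustration and not part of the CaaS delivery.

```python
import secrets
import string

# Only ASCII letters and digits, matching the "alphanumeric" requirement.
ALPHANUMERIC = string.ascii_letters + string.digits


def generate_password(length: int = 32) -> str:
    """Return a random password consisting only of ASCII letters and digits."""
    if length < 1:
        raise ValueError("length must be positive")
    # secrets.choice draws from a cryptographically secure random source.
    return "".join(secrets.choice(ALPHANUMERIC) for _ in range(length))


print(generate_password())
```

The generated value can then be entered in the credentials block of the custom-values.yaml.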
3.2.2. CaaS repository (caas-mongo)
The configuration of the repository includes two parameters:
- storageClass
  The possibility of overwriting parameters from the values.yaml file mainly affects the parameter mongo.persistentVolume.storageClass.
  For performance reasons, we recommend that the underlying MongoDB filesystem is provisioned with XFS.
- clusterKey
  A default configuration is delivered for the authentication key of the Mongo cluster. The key can be defined in the parameter credentials.clusterKey. It is strongly recommended that you use the following command to create a new key for productive operation:
  openssl rand -base64 756
  This value may only be changed during the initial installation. If it is changed at a later time, this can lead to a permanent unavailability of the database, which can only be repaired manually.
3.2.3. Docker registry
An adjustment of the parameters imageRegistry and imageCredentials is necessary to configure the Docker registry in use.
imageRegistry: docker.company.com/e-spirit
imageCredentials:
  username: "username"
  password: "special_password"
  registry: docker.company.com
  enabled: true
3.2.4. Ingress Configurations
Ingress definitions control the incoming traffic to the respective component. However, the definitions contained in the chart are not created in the standard configuration. The parameters restApi.ingress.enabled and restApi.ingressPreview.enabled enable the Ingress configuration for the REST Interface.
The Ingress definitions of the Helm chart assume the NGINX Ingress Controller to be used, since annotations plus the class of this concrete implementation are used. If you are using a different implementation, you must adapt the annotations and the attribute |
The attribute |
restApi:
  ingress:
    enabled: true
    hosts:
      - caas.company.com
  ingressPreview:
    enabled: true
    hosts:
      - caas-preview.company.com
If the setting options are not sufficient for the specific application, the Ingress can also be created independently. In this case the corresponding parameter must be set to the value enabled: false. The following code example provides an orientation for the definition.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  labels:
  name: caas
spec:
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: caas-rest-api
                port:
                  number: 80
      host: caas-rest-api.mydomain.com
  ingressClassName: my-ingress-caas
3.3. Installation of the Helm-Chart
After the configuration of the Helm-chart it has to be installed into the Kubernetes cluster. The installation is done with the following commands, which must be executed in the directory of the Helm-chart.
kubectl create namespace caas
helm install RELEASE_NAME . --namespace=caas --values /path/to/custom-values.yaml
The name of the release can be chosen freely.
If the namespace is to have a different name, you must replace the specifications within the commands accordingly.
If an already existing namespace is to be used, the creation is omitted and the desired namespace must be specified within the installation command.
Since the containers are first downloaded from the used image registry, the installation can take several minutes. Ideally, however, a period of five minutes should not be exceeded before the CaaS platform is operational.
The status of each component can be obtained with the following command:
kubectl get pods --namespace=caas
Once all components have the status Running, the installation is complete.
NAME                             READY   STATUS    RESTARTS   AGE
caas-mongo-0                     2/2     Running   0          4m
caas-mongo-1                     2/2     Running   0          3m
caas-mongo-2                     2/2     Running   0          1m
caas-rest-api-1851714254-13cvn   1/1     Running   0          5m
caas-rest-api-1851714254-13cvn   1/1     Running   0          4m
caas-rest-api-1851714254-xs6c0   1/1     Running   0          4m
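The check for the Running status can also be scripted. The following sketch parses the text output of kubectl get pods; the function name and the sample text are hypothetical, and the sketch only looks at the STATUS column, not at the READY column.

```python
def all_pods_running(kubectl_output: str) -> bool:
    """Return True if every pod row in `kubectl get pods` output has STATUS Running."""
    lines = [line for line in kubectl_output.strip().splitlines() if line.strip()]
    if len(lines) < 2:
        return False  # empty output or header only: no pods are listed yet
    # Locate the STATUS column by name so the order of columns does not matter.
    status_index = lines[0].split().index("STATUS")
    return all(line.split()[status_index] == "Running" for line in lines[1:])


# Hypothetical sample output for illustration.
SAMPLE = """
NAME             READY   STATUS    RESTARTS   AGE
caas-mongo-0     2/2     Running   0          4m
caas-mongo-1     2/2     Running   0          3m
"""
print(all_pods_running(SAMPLE))
```

In practice the input would be the captured output of kubectl get pods --namespace=caas.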
3.4. TLS
The communication of the CaaS platform to the outside world is not encrypted by default. If it is to be protected by TLS, there are two configuration options:
- Using an officially signed certificate
  To use an officially signed certificate, a TLS secret is required, which must be generated first. It must contain the private key tls.key and the certificate tls.crt.
  The steps necessary to generate the TLS secret are described in the Kubernetes Ingress documentation.
- Automated certificate management
  As an alternative to using an officially signed certificate, it is possible to automate the administration using the Cert-Manager. It must be installed within the cluster and takes over the generation, distribution and updating of all required certificates. The configuration of the Cert-Manager allows, for example, the use and automatic renewal of Let’s Encrypt certificates.
  The necessary steps for installation are explained in the Cert-Manager documentation.
3.5. Scaling
In order to be able to quickly process the information transferred to CaaS, the CaaS platform must ensure optimal load distribution at all times. For this reason, the REST Interface and the Mongo database are scalable and already configured to deploy at least three instances at a time for failover. This minimum number of instances is mandatory, especially for the Mongo cluster.
3.5.1. REST Interface
The scaling of the REST Interface is done with the help of a Horizontal Pod Autoscaler. Its activation as well as its configuration must be done in the custom-values.yaml file to overwrite the default values defined in the values.yaml file.
restApi:
  horizontalPodAutoscaler:
    enabled: false
    minReplicas: 3
    maxReplicas: 9
    targetCPUUtilizationPercentage: 50
The Horizontal Pod Autoscaler scales the REST Interface down or up depending on the current CPU load. The parameter targetCPUUtilizationPercentage specifies the percentage value from which scaling should take place. The parameters minReplicas and maxReplicas define the minimum and maximum number of REST Interface instances.
The threshold value for the CPU load should be chosen with care: a wrong configuration can endanger the stability of the system. The official Kubernetes Horizontal Pod Autoscaler documentation as well as the examples listed in it contain further information on the use of a Horizontal Pod Autoscaler.
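To get a feeling for the effect of the threshold, the basic scaling rule from the Kubernetes Horizontal Pod Autoscaler documentation can be evaluated by hand. The following is a simplified sketch of that rule only: it ignores the tolerance band, stabilization windows and pod readiness that the real controller takes into account, and the function name is hypothetical.

```python
import math


def desired_replicas(current_replicas: int,
                     current_cpu_percent: float,
                     target_cpu_percent: float,
                     min_replicas: int = 3,
                     max_replicas: int = 9) -> int:
    """Simplified HPA rule: ceil(current * currentMetric / targetMetric), clamped."""
    raw = math.ceil(current_replicas * current_cpu_percent / target_cpu_percent)
    # The result never leaves the range [minReplicas, maxReplicas].
    return max(min_replicas, min(max_replicas, raw))


# Three instances at 100 % CPU with a 50 % target lead to six instances.
print(desired_replicas(3, 100, 50))
```

With the values shown above, a low threshold lets the REST Interface reach maxReplicas quickly, which is one reason to choose it carefully.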
3.5.2. Mongo database
We distinguish horizontal scaling from vertical scaling here. Horizontal scaling means additional instances that handle the traffic. Vertical scaling means allocating more CPU/RAM to existing instances.
Horizontal scaling
Unlike the REST Interface, the Mongo database can only be scaled horizontally by hand. It cannot be scaled automatically using a Horizontal Pod Autoscaler.
Scaling the Mongo database is done using the replicas parameter. This parameter must be entered in the custom-values.yaml file to override the default value defined in the values.yaml file.
At least three instances are required to run the Mongo Cluster, otherwise there is no The chapter Consider Fault Tolerance of the MongoDB documentation describes how many nodes can explicitly fail, until the determination of a new Further information on scaling and replicating the Mongo database is available in the chapters Replica Set Deployment Architectures and Replica Set Elections. |
mongo:
  replicas: 3
A downscaling of the Mongo database is not possible without direct intervention and requires a manual reduction of the replica set of the Mongo database. The MongoDB documentation describes the necessary steps for this. Such intervention increases the risk of failure and is therefore not recommended.
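The minimum of three instances follows from the majority rule for replica set elections described in the MongoDB documentation. The following sketch computes how many members may fail for a given number of instances; it assumes that all members are voting members, and the function names are hypothetical.

```python
def majority(members: int) -> int:
    """Number of voting members required to elect a primary."""
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """Number of members that may fail while a primary can still be elected."""
    return members - majority(members)


for replicas in (2, 3, 4, 5):
    print(replicas, "instances tolerate", fault_tolerance(replicas), "failure(s)")
```

Note that an even number of instances does not increase the fault tolerance compared to the next lower odd number.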
Vertical scaling
Vertical scaling is done using a Vertical Pod Autoscaler. Vertical Pod Autoscalers are Custom Resources in Kubernetes, so you first need to ensure that your cluster supports them.
After that, you can configure the following parameters in your custom-values.yaml:
mongo:
  verticalPodAutoscaler:
    enabled: false
    apiVersion: autoscaling.k8s.io/v1beta2
    updateMode: Auto
    minAllowed:
      cpu: 100m
      memory: 500Mi
    maxAllowed:
      cpu: 1
      memory: 2000Mi
Applying the configuration
After the configuration changes for the REST Interface or Mongo database, the updated custom-values.yaml file must be applied with the following command.
helm upgrade -i RELEASE_NAME path/to/caas-<VERSIONNUMBER>.tgz --values /path/to/custom-values.yaml
The release name can be determined with the command |
3.6. Monitoring
The CaaS platform is a microservice architecture and therefore consists of different components. In order to be able to monitor its status properly at any time and to be able to react quickly in the event of an error, integration in a cluster-wide monitoring system is absolutely essential for operation with Kubernetes.
The CaaS platform is already preconfigured for monitoring with Prometheus Operator, since this scenario is widely used in the Kubernetes environment. It includes Prometheus ServiceMonitors for collecting metrics, Prometheus Alerts for notification in case of problems and predefined Grafana dashboards for visualizing the metrics.
3.6.1. Requirements
It is essential to set up monitoring and log persistence for the Kubernetes cluster. Without these prerequisites, there are hardly any analysis possibilities in case of a failure and Technical Support lacks important information.
- Metrics
  To install the Prometheus Operator please use the official Helm chart, so that cluster monitoring can be set up based on it. For further information please refer to the corresponding documentation.
  If you are not running a Prometheus Operator, you must turn off the Prometheus ServiceMonitors and Prometheus Alerts.
- Logging
  With Kubernetes it is possible to provide various containers or services in an automated and scalable way. To ensure that the logs remain available in such a dynamic environment even after an instance has been terminated, an infrastructure must be integrated that persists them beforehand.
  Therefore, we recommend the use of a central logging system, such as the Elastic Stack. The Elastic or ELK stack is a collection of open source projects that help to persist, search and analyze log data in real time.
  Here too, you can use an existing Helm chart for the installation.
3.6.2. Prometheus ServiceMonitors
The deployment of the ServiceMonitors provided by the CaaS platform for the REST Interface and the Mongo database is controlled via the custom-values.yaml file of the Helm chart.
The access to the metrics of the REST Interface is secured by API Key and the access to the metrics of the MongoDB by a corresponding MongoDB user. The respective access data is contained in the credentials block of the Please adjust the credentials in your |
Typically, Prometheus is configured to consider only ServiceMonitors with specific labels. The labels can therefore be configured in the custom-values.yaml file and are valid for all ServiceMonitors of the CaaS Helm chart. Furthermore, the parameter scrapeInterval defines how frequently the respective metrics are retrieved.
monitoring:
  prometheus:
    # Prometheus service monitors will be created for enabled metrics. Each Prometheus
    # instance has a configured serviceMonitorSelector property, to be able to control
    # the set of matching service monitors. To allow defining matching labels for CaaS
    # service monitors, the labels can be configured below and will be added to each
    # generated service monitor instance.
    metrics:
      serviceMonitorLabels:
        release: "prometheus-operator"
      mongo:
        enabled: true
        scrapeInterval: "30s"
      caas:
        enabled: true
        scrapeInterval: "30s"
The MongoDB metrics are provided via a sidecar container and retrieved with the help of a separate database user. You can configure the database user in the credentials block of the custom-values.yaml. The sidecar container is delivered with the following standard configuration:
mongo:
  metrics:
    image: mongodb-exporter:0.11.0
    syncTimeout: 1m
3.6.3. Prometheus Alerts
The deployment of the alerts provided by the CaaS platform is controlled via the custom-values.yaml file of the Helm chart.
Prometheus is typically configured to consider only alerts with specific labels. The labels can therefore be configured in the custom-values.yaml file and apply to all alerts in the CaaS Helm chart:
monitoring:
  prometheus:
    alerts:
      # Labels for the PrometheusRule resource
      prometheusRuleLabels:
        app: "prometheus-operator"
        release: "prometheus-operator"
      # Additional Prometheus labels to attach to alerts (or overwrite existing labels)
      additionalAlertLabels: {}
      caas:
        enabled: true
        useAlphaAlerts: false
        # Namespace(s) that should be targeted by the alerts (supports Go template and regular expressions)
        targetNamespace: "{{ .Release.Namespace }}"
3.6.4. Grafana Dashboards
The deployment of the Grafana dashboards provided by the CaaS platform is controlled via the custom-values.yaml file of the Helm chart.
Typically, the Grafana sidecar container is configured to consider only ConfigMaps with specific labels and in a defined namespace. The labels of the ConfigMap and the namespace in which it is deployed can therefore be configured in the custom-values.yaml file:
monitoring:
  grafana:
    dashboards:
      enabled: true
      # Namespace that the ConfigMap resource will be created in (supports Go template and regular expressions)
      configmapNamespace: "{{ .Release.Namespace }}"
      # Additional labels to attach to the ConfigMap resource
      configMapLabels: {}
      overviewDashboardsEnabled: false
4. Development Environment
Kubernetes and Helm form the basis of all CaaS platform installations. In case of development environments we recommend installing CaaS platform into a separate namespace on your production cluster or any cluster configured similarly. We do not recommend using local CaaS platform instances, even for development.
If you need a local environment on developer machines you have to create a local Kubernetes cluster to be used. One of the following projects may be used to achieve this:
This list does not claim to be exhaustive. Rather, it is intended to give some examples for which we know that operation is generally possible, although we do not use these projects permanently ourselves.
Each of these projects can be used to manage Kubernetes clusters locally. However, we’re not able to give you support for any of these specific projects. The CaaS platform uses only standard Helm and Kubernetes features and is thus independent of any particular Kubernetes distribution.
Please be sure to configure the following features correctly when using a local Kubernetes cluster:
- Kubernetes Image Pull Secrets to resolve the docker images from your local or company Docker registry
- disabling monitoring features in custom-values.yaml or installing the needed prerequisites
- tweaking the host system’s DNS settings to be able to work with Kubernetes Ingress resources or using local port forwards into the cluster
5. REST Interface
5.1. Authentication
Each request to the REST Interface must be authenticated, otherwise it will be rejected. The various authentication options are explained below.
5.1.1. Authentication with API Keys
Each request to the REST Interface must contain an HTTP Authorization header containing the API Key as Bearer token: Authorization: Bearer <key>. The value of key is expected to be the value of the key attribute of the corresponding API Key.
See the Validation of API Keys section below for more information.
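The header can be set by any HTTP client. The following sketch builds such a request with the Python standard library without sending it; the host name caas.example.com, the path and the key value are placeholders, and the function name is hypothetical.

```python
import urllib.request


def authenticated_request(url: str, api_key: str, method: str = "GET") -> urllib.request.Request:
    """Build a request that carries the API Key as Bearer token."""
    request = urllib.request.Request(url, method=method)
    request.add_header("Authorization", f"Bearer {api_key}")
    return request


# Placeholder URL and key for illustration only.
req = authenticated_request("https://caas.example.com/my-db/my-collection", "my-api-key")
print(req.get_method(), req.full_url, req.get_header("Authorization"))
```

The request would then be sent with urllib.request.urlopen(req) against a reachable REST Interface.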
5.1.2. Authentication with security token
It is possible to generate a short-lived (up to 24 hours) security token for an API Key. The token contains the same permissions as the API Key which it was generated for. There are two ways to generate and use these tokens:
Query Parameter
A GET request authenticated with an API Key to the /_logic/securetoken?tenant=<db> endpoint generates a security token. Such a token can be issued only for one specified database, regardless of whether the API Key has permissions on multiple databases. The parameter &ttl=<lifetime in seconds> is optional. The JSON response contains the security token.
Each request to the REST Interface can optionally be authenticated using a query parameter ?securetoken=<token>.
Cookie
A GET request authenticated with an API Key to the /_logic/securetokencookie?tenant=<db> endpoint generates a security token cookie. Such a cookie can be issued only for one specified database, regardless of whether the API Key has permissions on multiple databases. The parameter &ttl=<lifetime in seconds> is optional. The response includes a set-cookie header with the security token.
All requests that include this cookie get automatically authenticated.
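The two token endpoints differ only in their path. The following sketch assembles the request URL for either variant; the base URL is a placeholder, the function name is hypothetical, and the upper limit of 24 hours for ttl is taken from the description of the token lifetime above.

```python
from typing import Optional
from urllib.parse import urlencode

MAX_TTL_SECONDS = 24 * 60 * 60  # tokens are described as valid for up to 24 hours


def secure_token_url(base_url: str, tenant: str,
                     ttl_seconds: Optional[int] = None,
                     as_cookie: bool = False) -> str:
    """Build the URL that requests a security token (or token cookie) for one database."""
    endpoint = "/_logic/securetokencookie" if as_cookie else "/_logic/securetoken"
    params = {"tenant": tenant}
    if ttl_seconds is not None:
        if not 0 < ttl_seconds <= MAX_TTL_SECONDS:
            raise ValueError("ttl must be between 1 second and 24 hours")
        params["ttl"] = str(ttl_seconds)
    return f"{base_url.rstrip('/')}{endpoint}?{urlencode(params)}"


print(secure_token_url("https://caas.example.com", "my-db", ttl_seconds=3600))
```

The GET request to this URL still has to be authenticated with an API Key, as described above.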
5.2. Query documents and media
The REST Interface can be used to manage and query content in the form of JSON documents over HTTP. They are stored in so-called collections, which are subordinated to databases. The following three-part URL scheme applies:
https://REST-HOST:PORT/<database>/<collection>/<document>
- database
  This part of the URL contains the tenant ID.
- collection
  The collection is composed of the FirstSpirit project UUID and the respective preview or release state.
- document
  In this case, the UUID of the FirstSpirit element is used together with the language locale.
Binary content (media) is an exception in that it is stored in so-called buckets. The associated collections always end with the suffix .files:
https://REST-HOST:PORT/<tenant>/<project>.<release|preview>.files/<document>
Please note that binary content is not transferred to the CaaS buckets in our cloud offering.
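The URL scheme can be expressed as a small helper. This sketch assumes that content collections are named <project>.<release|preview>, i.e. like the media collections shown above but without the .files suffix; the document identifier is passed through unchanged because its exact format is not specified here. All names and values are hypothetical.

```python
from urllib.parse import quote


def collection_name(project: str, state: str, media: bool = False) -> str:
    """Combine project UUID and state; media collections get the .files suffix."""
    if state not in ("preview", "release"):
        raise ValueError("state must be 'preview' or 'release'")
    name = f"{project}.{state}"
    return f"{name}.files" if media else name


def document_url(base_url: str, tenant: str, project: str, state: str,
                 document: str, media: bool = False) -> str:
    """Build the three-part URL <database>/<collection>/<document>."""
    parts = [tenant, collection_name(project, state, media), document]
    # Percent-encode each path segment separately so that '/' cannot leak in.
    return base_url.rstrip("/") + "/" + "/".join(quote(part, safe="") for part in parts)


print(document_url("https://caas.example.com", "my-tenant", "p1", "release", "doc1.en_GB"))
```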
5.2.1. HAL format
The interface returns all results in HAL format. This means that they are not simply raw data, such as traditionally unstructured content in JSON format.
The HAL format offers the advantage of simple but powerful structuring. In addition to the required content, the results contain additional meta-information on the structure of this content.
Example
{
  "_size": 5,
  "_total_pages": 1,
  "_returned": 3,
  "_embedded": { CONTENT }
}
In this example a filtered query was sent. Without knowing the exact content, its structure can be read directly from the meta information. At this point, the REST Interface returns three results from a set of five documents corresponding to the filter criteria and displays them on a single page.
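A client can read this meta information before touching the content. The following sketch extracts the three counters from a response body; the function name is hypothetical, and the structure of _embedded is deliberately left untouched because it depends on the query.

```python
import json


def summarize_page(hal_response: str) -> dict:
    """Extract the paging meta information from a HAL response body."""
    body = json.loads(hal_response)
    return {
        "matching": body.get("_size"),      # documents matching the query
        "returned": body.get("_returned"),  # documents contained in this response
        "pages": body.get("_total_pages"),  # number of pages for this query
    }


# Same counters as in the example above, with an empty placeholder for the content.
print(summarize_page('{"_size": 5, "_total_pages": 1, "_returned": 3, "_embedded": {}}'))
```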
If the element to be requested is a medium, the URL only determines its metadata. The HAL format contains corresponding links that refer to the URL with the actual binary content of the medium. For further information please refer to the documentation.
5.2.2. Page size of queries
The results of the REST Interface are always delivered paginated. To control the page size and requested page, the HTTP query parameters pagesize and page can be used for GET requests. The default value for the pagesize parameter is set to 20 in the CaaS platform and the maximum is set to 100. These values can be changed in your custom-values.yaml in case of an on-premises installation. For more information, see the RESTHeart documentation.
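The following sketch builds a paginated request URL and estimates the number of pages for a result set. It uses the default of 20 and the maximum of 100 stated above and assumes that page numbering starts at 1; the function names and the example URL are hypothetical.

```python
import math
from urllib.parse import urlencode

DEFAULT_PAGESIZE = 20  # default stated for the CaaS platform
MAX_PAGESIZE = 100     # maximum stated for the CaaS platform


def page_url(collection_url: str, page: int = 1, pagesize: int = DEFAULT_PAGESIZE) -> str:
    """Append the page and pagesize query parameters to a collection URL."""
    if page < 1:
        raise ValueError("page numbering is assumed to start at 1")
    if not 1 <= pagesize <= MAX_PAGESIZE:
        raise ValueError(f"pagesize must be between 1 and {MAX_PAGESIZE}")
    return f"{collection_url}?{urlencode({'page': page, 'pagesize': pagesize})}"


def total_pages(document_count: int, pagesize: int = DEFAULT_PAGESIZE) -> int:
    """Number of pages needed to deliver document_count documents."""
    return math.ceil(document_count / pagesize)


print(page_url("https://caas.example.com/my-db/my-collection", page=2, pagesize=50))
print(total_pages(45))
```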
5.2.3. Use of filters
Filters are always used when documents are not to be determined by their ID but by their content. In this way, both single and multiple documents can be retrieved.
For example, the query of all English language documents from the products collection has the following structure:
https://REST-HOST:PORT/Database/products?filter={fs_language: "EN"}
Beyond this example there are further filter possibilities. For more information, see the query documentation.
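Since the filter contains characters such as braces and quotation marks, it should be URL-encoded when the request is assembled programmatically. The following sketch serializes the filter criteria as JSON and encodes them into the filter parameter; the function name and the example URL are hypothetical.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit


def filter_url(collection_url: str, criteria: dict) -> str:
    """Serialize the criteria as JSON and URL-encode them into the filter parameter."""
    encoded = urlencode({"filter": json.dumps(criteria, separators=(",", ":"))})
    return f"{collection_url}?{encoded}"


url = filter_url("https://caas.example.com/my-db/products", {"fs_language": "EN"})
print(url)
# Decoding the query string again yields the original filter document.
print(parse_qs(urlsplit(url).query)["filter"][0])
```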
5.3. Storage of documents
The HTTP methods POST, PUT and PATCH can be used for storing documents. Documents can also be deleted with the DELETE method.
The following excerpt shows the creation of a document my-document within the collection my-collection, which is located in the database my-db.
curl --location --request PUT 'https://REST-HOST:PORT/my-db/my-collection/my-document' \
--header 'Authorization: Bearer my-api-key' \
--header 'Content-Type: application/json' \
--data-raw '{
"data": "some-data"
}'
For more information about saving documents, refer to corresponding sections in the RESTHeart documentation.
When saving documents using the POST, PUT or PATCH methods, the write mode |
5.4. Managing databases and collections
Unlike the storage of documents, the management of databases and collections is limited to the HTTP methods PUT and DELETE.
The following excerpt shows the creation of the database my-db with a PUT request.
curl --location --request PUT 'https://REST-HOST:PORT/my-db' \
--header 'Authorization: Bearer my-api-key'
There are reserved databases that can’t be used for saving content. The reserved database names include |
Further information on database management can be found in the corresponding sections of the RESTHeart documentation.
A collection my-collection can be created in the database my-db with a PUT request as follows.
Managing databases is not supported in our SaaS offering due to access restrictions.
curl --location --request PUT 'https://REST-HOST:PORT/my-db/my-collection' \
--header 'Authorization: Bearer my-api-key'
There are reserved collections that can’t be used for saving content. The reserved collection names include |
For more information on managing collections, see corresponding sections in the RESTHeart documentation.
5.5. Management of API Keys
API Keys, like all other resources in CaaS, can be managed via REST endpoints. In general, it is important to distinguish that API Keys can be managed at two levels: global or local per database. Global API Keys differ from local API Keys by their scope of validity.
When using an API Key for authentication, the CaaS platform always searches the local API Keys first. If no matching API Key is found, the global API Keys are evaluated afterwards.
5.5.1. Global API Keys
Global API Keys are cross-database and are managed in the apikeys collection of the caas_admin database. Unlike local API Keys, they allow permissions to be defined for multiple or even all databases.
5.5.2. Local API Keys
Local API Keys are defined per database and are managed accordingly in the apikeys collection of any database. Unlike global API Keys, local API Keys can only define permissions for resources within the same database.
5.5.3. Authorization Model
The authorization of an API Key is performed on the basis of all of its permission entries, which are defined in the permissions attribute.
The url attribute of a permission is used to check whether access should be granted. The value is compared with the URL path of an incoming request. Which type of comparison is executed depends on the mode of the permission.
There are three different modes:
- PREFIX and REGEX
  With mode PREFIX, the authorization checks whether the url attribute of the permission is a prefix of the URL path of an incoming request.
  The REGEX mode expects a regular expression in the url attribute. Using this mode, the authorization checks whether the regular expression pattern matches the URL path of an incoming request.
  Additionally, for the modes PREFIX and REGEX a general authorization distinction is made depending on the type of the API Key. Global API Keys always check against the entire path of the request, while local API Keys check against the part of the path after the database. For more information regarding global and local API Keys see the example Local and global API Key distinction or chapter Management of API Keys.
- GRAPHQL
  The mode GRAPHQL of a permission authorizes the execution of a specific GraphQL app. During an authorization check the url attribute of the permission must exactly match the URI of a GraphQL app that an incoming request is trying to execute. The URI of a GraphQL app is defined in the descriptor.uri attribute of an app definition. See chapter GraphQL for more information.
The following table includes examples that illustrate the authorization distinction that is made for local and global API Keys when using the permission mode PREFIX or REGEX.
authorization in API Key | type of API Key | request URL path | Allowed |
---|---|---|---|
/ | global | / | yes |
/ | global | /project/ | yes |
/ | global | /project/content/ | yes |
/ | global | /other-project/ | yes |
/ | global | /other-project/content/ | yes |
/project/ | global | / | no |
/project/ | global | /project/ | yes |
/project/ | global | /project/content/ | yes |
/project/ | global | /other-project/ | no |
/project/ | global | /other-project/content/ | no |
/ | local in | / | no |
/ | local in | /project/ | yes |
/ | local in | /project/content/ | yes |
/ | local in | /other-project/ | no |
/ | local in | /other-project/content/ | no |
/content/ | local in | / | no |
/content/ | local in | /project/ | no |
/content/ | local in | /project/content/ | yes |
/content/ | local in | /other-project/ | no |
/content/ | local in | /other-project/content/ | no |
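The distinction between global and local API Keys can be reproduced with a few lines of code. The following is a sketch of the described comparison, not the actual CaaS implementation. It assumes that the local API Keys in the table belong to a database named project, which the rows suggest, and that a REGEX permission has to match the complete path; both points are assumptions, and the function name is hypothetical.

```python
import re
from typing import Optional


def is_allowed(permission_url: str, mode: str, request_path: str,
               key_database: Optional[str] = None) -> bool:
    """Check one permission entry; key_database=None denotes a global API Key."""
    if key_database is None:
        subject = request_path  # global keys compare against the entire path
    else:
        db_prefix = "/" + key_database
        if request_path != db_prefix and not request_path.startswith(db_prefix + "/"):
            return False  # a local key never grants access outside its database
        # Local keys compare against the part of the path after the database.
        subject = request_path[len(db_prefix):] or "/"
    if mode == "PREFIX":
        return subject.startswith(permission_url)
    if mode == "REGEX":
        # Assumption: the pattern must match the whole path.
        return re.fullmatch(permission_url, subject) is not None
    raise ValueError(f"unsupported mode for this sketch: {mode}")


print(is_allowed("/content/", "PREFIX", "/project/content/", key_database="project"))
print(is_allowed("/project/", "PREFIX", "/other-project/"))
```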
5.5.4. REST endpoints
The following endpoints are available for managing API Keys:
GET /<database>/apikeys
POST /<database>/apikeys
Note: the parameters _id and key are mandatory and must have identical values
PUT /<database>/apikeys/{id}
Note: the parameter key must have the same value as the {id} in the URL
DELETE /<database>/apikeys/{id}
To manage API Keys, an API Key can also be used as an authorization method. In this case, the API Key used must have write permission on the corresponding API Keys collection. This is also true for read-only requests and prevents privilege escalation.
The database in the URL depends on the type of API Key: caas_admin for global API Keys, the respective database for local API Keys.
The following snippet shows the example creation of a local API Key.
curl "https://REST-HOST:PORT/<tenant>/apikeys" \
  -H 'Content-Type: application/json' \
  -u '<USER>:<PASSWORD>' \
  -d $'{
  "_id": "1e0909b7-c943-45a5-ae96-79f294249d48",
  "key": "1e0909b7-c943-45a5-ae96-79f294249d48",
  "name": "New-Apikey",
  "description": "Some descriptive text",
  "permissions": [
    {
      "url": "/<collection>",
      "permissionMode": "PREFIX",
      "methods": [
        "GET",
        "PUT",
        "POST",
        "PATCH",
        "DELETE",
        "HEAD",
        "OPTIONS"
      ]
    }
  ]
}'
In this example, a new API Key is created via cURL. Its permissions grant access (via the url attribute) to the specified collection.
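The request body can also be generated. The following sketch builds a payload with the same attributes as in the example above and uses a freshly generated UUID for both _id and key, so that the two values are identical as required. The function name and the default method list are assumptions for illustration.

```python
import json
import uuid

ALL_METHODS = ["GET", "PUT", "POST", "PATCH", "DELETE", "HEAD", "OPTIONS"]


def new_api_key(name: str, description: str, collection: str, methods=None) -> dict:
    """Build the JSON body for creating a local API Key with one PREFIX permission."""
    key = str(uuid.uuid4())
    return {
        "_id": key,  # _id and key must carry identical values
        "key": key,
        "name": name,
        "description": description,
        "permissions": [
            {
                "url": f"/{collection}",
                "permissionMode": "PREFIX",
                "methods": list(methods or ALL_METHODS),
            }
        ],
    }


print(json.dumps(new_api_key("New-Apikey", "Some descriptive text", "my-collection"), indent=2))
```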
To create an API Key with permission mode REGEX simply adjust the above example as follows:
|
The |
5.5.5. Validation of API Keys
Each API Key is validated against a stored JSON schema when created and updated. The JSON schema secures the basic structure of API Keys and can be queried at /<database>/_schemas/apikeys.
Further validations ensure that no two API Keys can be created with the same key. Likewise, an API Key must not contain a URL more than once.
If an API Key does not satisfy the requirements, the corresponding request is rejected with HTTP status 400.
If the JSON schema has not been successfully stored in the database before, requests are answered with HTTP status 500.
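Some of these rules can be checked on the client before a request is sent, which avoids a round trip that ends in HTTP status 400. The following sketch covers only two rules stated in this chapter (identical _id and key, no repeated URL); it does not replace the server-side validation against the JSON schema, and the function name is hypothetical.

```python
from collections import Counter


def precheck_api_key(api_key: dict) -> list:
    """Return a list of problems found in an API Key document (empty if none)."""
    problems = []
    if api_key.get("_id") is None or api_key.get("key") is None:
        problems.append("_id and key are mandatory")
    elif api_key["_id"] != api_key["key"]:
        problems.append("_id and key must have identical values")
    # Count how often each URL occurs across all permission entries.
    url_counts = Counter(p.get("url") for p in api_key.get("permissions", []))
    for url, count in url_counts.items():
        if url is not None and count > 1:
            problems.append(f"url {url!r} is listed {count} times")
    return problems


print(precheck_api_key({"_id": "a", "key": "b", "permissions": []}))
```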
The |
5.6. Indexes for efficient query execution
The runtime of queries with filters can get longer as the number of documents in a collection grows. If it exceeds a certain value, the query is answered by the REST Interface with HTTP status 408. More efficient execution can be achieved by creating an index on the attributes used in the affected filter queries.
For detailed information on database indexes, please refer to the MongoDB documentation.
5.6.1. Predefined indexes
If you have CaaS Connect in use, predefined indexes are already created that support some frequently used filter queries. The exact definitions can be found at https://REST-HOST:PORT/Database/Collection/_indexes/.
5.6.2. Customer-specific indexes
If the predefined indices do not cover your use cases and you observe long response times or even request timeouts, you can create your own indexes. The REST Interface can be used to manage the desired indexes. The procedure is described in the RESTHeart documentation.
Please only create the indexes you need.
5.7. Managing GraphQL apps
It is possible to create, update and delete GraphQL applications with the REST Interface. These applications consist of a definition and a GraphQL schema. The definition must first be created in the reserved collection gql-apps for each desired GraphQL application. The collection is automatically created when databases are created/updated. However, if it does not yet exist in the database, it must be created beforehand with a PUT request.
When editing the GraphQL app definitions (e.g., Create/update app), the permissions of the API Key are validated. An operation can only be performed if the API Key has access to all databases and collections listed in the definition.
The |
Mutations are currently not supported.
5.7.1. Create/update app
To create or update a GraphQL app, a PUT request including the app definition is made to the following URL:
https://REST-HOST:PORT/<tenant>/gql-apps/<tenant>___<name>
Creating the definition will provision an endpoint at the following URL:
https://REST-HOST:PORT/graphql/<tenant>___<name>
For more information on how to execute GraphQL queries, see chapter GraphQL API.
There are two ways to create a GraphQL app:
Using JSON body ("application/json")
The app definition with the sections descriptor, schema and mappings is passed in the request body (content type application/json). The descriptor.uri parameter must match the last path segment of the URL (i.e. <tenant>___<name>) and is not optional in CaaS, contrary to what is stated in the RESTHeart documentation.
An example of such a GraphQL app definition can be found in the GraphQL example application chapter.
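The relationship between the URL and descriptor.uri can be sketched as follows. The helper buildGqlAppRequest is hypothetical. It derives both values from the same tenant and app name so that they cannot diverge. The schema, mappings and collection name in the usage example are placeholders, not a complete definition.

```javascript
// Sketch only: assembles the URL and JSON body for creating or updating a GraphQL app.
function buildGqlAppRequest(baseUrl, tenant, name, schema, mappings) {
  // The last path segment of the URL and descriptor.uri must be identical.
  const uri = `${tenant}___${name}`;
  return {
    url: `${baseUrl}/${tenant}/gql-apps/${uri}`,
    method: "PUT",
    contentType: "application/json",
    body: {
      descriptor: { name, enabled: true, uri },
      schema, // GraphQL schema as a single string
      mappings,
    },
  };
}

// Hypothetical usage with placeholder schema and mappings.
const gqlAppRequest = buildGqlAppRequest(
  "https://REST-HOST:PORT", "mycorp-dev", "products",
  "type Product { name: String } type Query { products: [Product] }",
  {
    Product: { name: "displayName" },
    Query: { products: { db: "mycorp-dev", collection: "my-collection", find: {} } },
  });
console.log(JSON.stringify(gqlAppRequest.body, null, 2));
```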
Using file upload ("multipart/form-data")
The app definition and schema can also be created/updated using a multipart upload (content type multipart/form-data). This allows storing the app definition and schema as separate files and, more importantly, the schema does not need to be JSON encoded when creating/updating the app.
To upload the app definition and schema, each must be present in the request as an individual part:
Part name: app
Contains the app definition as JSON.
Part name: schema
Contains the raw text version of the schema in the GraphQL schema definition language.
curl
curl -i -X PUT \
-H "Authorization: Bearer $API_KEY" \
-F app=@my-app-def.json \
-F schema=@my-schema.gql \
https://REST-HOST:PORT/<tenant>/gql-apps/<tenant>___<name>
For fast feedback cycles during development, this can be combined with a tool that watches for file changes to continuously update the GraphQL app:
Uploading files continuously using fswatch and curl
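As an alternative illustration of the same idea, the following sketch uses Node.js instead of fswatch and curl. It assumes Node.js 18 or later (for the global fetch, FormData and Blob). The function names and file names are hypothetical. The part names app and schema are the ones described above.

```javascript
// Sketch only. Swapped-in technique: Node's fs.watch instead of the fswatch tool.
// Builds the multipart body with the two required parts "app" and "schema".
function buildUploadForm(appDefinitionJson, schemaText) {
  const form = new FormData();
  form.append("app", new Blob([appDefinitionJson], { type: "application/json" }), "my-app-def.json");
  form.append("schema", new Blob([schemaText], { type: "text/plain" }), "my-schema.gql");
  return form;
}

// Reads both files and sends them with a PUT request to the gql-apps URL.
async function uploadGqlApp(url, apiKey, appFile, schemaFile) {
  const fs = require("fs");
  const form = buildUploadForm(
    fs.readFileSync(appFile, "utf8"),
    fs.readFileSync(schemaFile, "utf8"));
  const response = await fetch(url, {
    method: "PUT",
    headers: { Authorization: `Bearer ${apiKey}` },
    body: form,
  });
  console.log(`Upload finished with HTTP status ${response.status}`);
}

// Re-uploads whenever one of the two files changes.
// Note: fs.watch may report several events for a single save.
function watchAndUpload(url, apiKey, appFile, schemaFile) {
  const fs = require("fs");
  for (const file of [appFile, schemaFile]) {
    fs.watch(file, () => {
      uploadGqlApp(url, apiKey, appFile, schemaFile).catch(console.error);
    });
  }
}
```

A call such as watchAndUpload("https://REST-HOST:PORT/<tenant>/gql-apps/<tenant>___<name>", apiKey, "my-app-def.json", "my-schema.gql") would keep the app in sync while the files are being edited.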
5.7.2. Delete app
To delete a GraphQL app, a DELETE request is made to the following URL:
https://REST-HOST:PORT/<tenant>/gql-apps/<tenant>___<name>
5.7.3. Re-synchronization of existing GraphQL apps
Under certain conditions, such as after restoring individual collections from a backup, the previously created GraphQL apps might be out of sync. In such a case, all tenant-specific GraphQL apps must be resynchronized.
To trigger the resynchronization of all existing GraphQL apps of all tenants, send a POST request to the /_logic/sync-gql-apps endpoint.
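The delete and re-synchronization calls can be sketched as simple request descriptors. The function names are hypothetical, and the sketch assumes that both calls are authorized with an API Key sent as a Bearer token, as in the curl examples of this chapter.

```javascript
// Sketch only: describes the DELETE request for a single GraphQL app.
function buildDeleteAppRequest(baseUrl, tenant, name, apiKey) {
  return {
    url: `${baseUrl}/${tenant}/gql-apps/${tenant}___${name}`,
    options: { method: "DELETE", headers: { Authorization: `Bearer ${apiKey}` } },
  };
}

// Sketch only: describes the POST request that re-synchronizes all GraphQL apps.
function buildResyncRequest(baseUrl, apiKey) {
  return {
    url: `${baseUrl}/_logic/sync-gql-apps`,
    options: { method: "POST", headers: { Authorization: `Bearer ${apiKey}` } },
  };
}

const deleteRequest = buildDeleteAppRequest("https://REST-HOST:PORT", "mycorp-dev", "products", "your-api-key");
const resyncRequest = buildResyncRequest("https://REST-HOST:PORT", "your-api-key");
// fetch(resyncRequest.url, resyncRequest.options) would trigger the re-synchronization.
```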
5.8. Push notifications (change streams)
It is often convenient to be notified about changes in the CaaS platform. For this purpose the CaaS platform offers change streams. This feature allows a websocket connection to be established to the CaaS platform, through which events about the various changes are published.
Change streams are created by putting a definition in the metadata of a collection. If you use CaaS Connect, a number of predefined change streams are already created for you. You also have the option to define your own change streams.
The format of the events corresponds to standard MongoDB events.
When working with websockets, we recommend taking into account connection failures that may occur.
You can find an example of using change streams in the browser in the appendix.
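One common way to handle connection failures is to reconnect with an increasing delay. The following sketch shows this idea under stated assumptions: reconnectDelay and connectWithRetry are hypothetical helpers, and createSocket is a caller-supplied function that returns a new WebSocket for the change stream URL. Renewing the temporary secure token before reconnecting is left out for brevity.

```javascript
// Sketch only: exponential backoff, capped at maxMs.
function reconnectDelay(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Opens a socket and schedules a new connection attempt whenever it closes.
function connectWithRetry(createSocket, onMessage, attempt = 0) {
  const socket = createSocket();
  // A successful connection resets the backoff.
  socket.onopen = () => { attempt = 0; };
  socket.onmessage = onMessage;
  socket.onclose = () => {
    setTimeout(
      () => connectWithRetry(createSocket, onMessage, attempt + 1),
      reconnectDelay(attempt));
  };
  return socket;
}
```

With the default values the waiting time grows from one second to at most thirty seconds. Libraries such as the PersistentWebSocket package used in the appendix example take care of reconnecting in a comparable way.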
5.9. Additional information
Additional information regarding the functionality of the REST interface can be found in the official RESTHeart documentation.
6. GraphQL API
Each of the GraphQL applications defined through the management API (see Managing GraphQL apps) provisions a GraphQL API endpoint. This endpoint can be used to fetch data (see Fetch data).
Mutations are currently not supported.
6.1. Authentication and authorization
The authentication and authorization process is the same as that of the REST Interface (see Authentication with API Keys for more information). When executing GraphQL queries, the permissions of the API Key are validated. An operation can only be performed if the API Key has access to all databases and collections listed in the definition.
Alternatively, it is possible to explicitly authorize an API Key to be used for the execution of individual GraphQL endpoints. Such API Keys do not need permissions to access the underlying collections. The Authorization Model chapter describes how to define such explicit permissions.
6.2. Fetch data
The GraphQL API can be queried through HTTP endpoints at:
https://REST-HOST:PORT/graphql/<app-uri>
To query data, send a POST request with JSON to the desired endpoint and specify the query in the request body, for example:
curl
curl -i -X POST \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $API_KEY" \
-d '{"query": "query($lang: [String!]){products(_language: $lang) {name description categories {name} picture {name binaryUrl width height}}}", "variables": {"lang": ["EN"]}}' \
https://REST-HOST:PORT/graphql/<app-uri>
You can find a more elaborate example of using GraphQL in the appendix.
Just like page size limits in REST Interface queries, there is a limit to the number of results in GraphQL queries as well. The default value is 20 and the maximum is 100. Please use pagination if the number of required documents exceeds the maximum allowed result size.
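For applications that query the GraphQL API from code rather than from the command line, the request can be sketched as follows. The helpers buildGraphqlRequest and runGraphqlQuery are hypothetical. The sketch assumes an environment with a global fetch (a browser or Node.js 18 or later). The query and the app URI are taken from the examples in this document.

```javascript
// Sketch only: builds the POST request for a GraphQL endpoint.
function buildGraphqlRequest(baseUrl, appUri, query, variables, apiKey) {
  // "variables" is optional; it is only sent when provided.
  const payload = variables === undefined ? { query } : { query, variables };
  return {
    url: `${baseUrl}/graphql/${appUri}`,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "Authorization": `Bearer ${apiKey}`,
      },
      body: JSON.stringify(payload),
    },
  };
}

// Sends the request and returns the parsed JSON response.
async function runGraphqlQuery(request) {
  const response = await fetch(request.url, request.options);
  if (!response.ok) {
    throw new Error(`GraphQL request failed with HTTP status ${response.status}`);
  }
  return response.json();
}

const productsRequest = buildGraphqlRequest(
  "https://REST-HOST:PORT", "mycorp-dev___products",
  "query($lang: [String!]){products(_language: $lang) {name description}}",
  { lang: ["EN"] }, "your-api-key");
```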
7. Metrics
Metrics are used for monitoring and error analysis of CaaS components during operation and can be accessed via HTTP endpoints. If metrics are available in Prometheus format, corresponding ServiceMonitors are generated for this purpose, see also Prometheus ServiceMonitors.
7.1. REST Interface
Healthcheck
The Healthcheck endpoint provides information about the functionality of the corresponding component in the form of a JSON document. This status is calculated from several checks. If all checks are successful, the JSON response has the HTTP status 200. As soon as at least one check has the value false, the response has HTTP status 500.
The query is made using the URL: http://REST-HOST:PORT/_logic/healthcheck
The functionality of the REST Interface depends on the accessibility of the MongoDB cluster as well as on the existence of a primary node. If the cluster does not have a primary node, it is not possible to perform write operations on the MongoDB.
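A monitoring script could evaluate the Healthcheck endpoint by its HTTP status alone, as sketched below. The function names are hypothetical, and the structure of the JSON document is deliberately not interpreted because it is not specified here. A global fetch (Node.js 18 or later) is assumed.

```javascript
// Sketch only: maps the documented HTTP status codes to a state.
function interpretHealthcheck(httpStatus) {
  if (httpStatus === 200) return "healthy";   // all checks successful
  if (httpStatus === 500) return "unhealthy"; // at least one check is false
  return "unknown";                           // any other status is not described here
}

// Queries the endpoint and returns the state together with the raw JSON details.
async function checkHealth(baseUrl) {
  const response = await fetch(`${baseUrl}/_logic/healthcheck`);
  const details = await response.json().catch(() => null);
  return { state: interpretHealthcheck(response.status), details };
}
```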
HTTP Metrics
Metrics for HTTP requests and responses of the REST Interface can be retrieved as a JSON document or in Prometheus format at the following URL: http://REST-HOST:PORT/_metrics
Further information is available in the RESTHeart documentation.
7.2. MongoDB
The metrics of the MongoDB are provided by a sidecar container. This container accesses the MongoDB metrics with a separate database user and provides them via HTTP.
The metrics can be accessed at the following URL: http://MONGODB-HOST:METRICS-PORT/metrics.
Please note that the MongoDB metrics are delivered via a separate port. This port is not accessible from outside the cluster and therefore not protected by authentication.
8. Maintenance
The transfer of data to CaaS can only work if the individual components work properly. If faults occur or an update is necessary, all CaaS components must therefore be considered. The following subchapters describe the necessary steps of an error analysis in case of a malfunction and the execution of a backup or update.
8.1. Error analysis
CaaS is a distributed system and is based on the interaction of different components. Each of these components can potentially generate errors. Therefore, if a failure occurs while using CaaS, it can have several causes. The basic analysis steps for determining the causes of faults are explained below.
Status of the components
The status of each component of the CaaS platform can be checked using the kubectl get pods --namespace=<namespace> command. If the status of an instance differs from running or ready, it is recommended to start debugging at this point and check the associated log files.
If there are problems with the Mongo database, check whether a primary node is available. The chapter Consider Fault Tolerance of the MongoDB documentation describes how to avoid this and how many nodes can fail before a new primary node can no longer be determined.
Analysis of the logs
In case of problems, the log files are a good starting point for analysis. They make it possible to trace all processes on the systems, so that errors and warnings become apparent.
Current log files of the CaaS components can be viewed using kubectl --namespace=<namespace> logs <pod>, but they only contain events that occurred within the lifetime of the current instance. To be able to analyze the log files after a crash or restart of an instance, we recommend setting up a central logging system.
The log files can only be viewed for the currently running container. For this reason, it is necessary to set up a persistent storage to access the log files of already finished or newly started containers.
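The status check can also be automated. The sketch below evaluates the JSON output of kubectl get pods --namespace=<namespace> -o json and returns the names of all pods that are not running or whose containers are not all ready. The function findUnhealthyPods and the pod names in the sample are hypothetical. Note that pods which completed successfully are reported as well, since their phase is not Running.

```javascript
// Sketch only: expects the parsed output of "kubectl get pods -o json".
function findUnhealthyPods(podList) {
  return podList.items
    .filter((pod) => {
      const statuses = pod.status.containerStatuses || [];
      const allReady = statuses.length > 0 && statuses.every((c) => c.ready);
      return pod.status.phase !== "Running" || !allReady;
    })
    .map((pod) => pod.metadata.name);
}

// Hypothetical sample with one healthy and one failing pod.
const samplePodList = {
  items: [
    {
      metadata: { name: "caas-rest-api-0" },
      status: { phase: "Running", containerStatuses: [{ ready: true }] },
    },
    {
      metadata: { name: "caas-mongo-0" },
      status: { phase: "Running", containerStatuses: [{ ready: true }, { ready: false }] },
    },
  ],
};
const unhealthyPods = findUnhealthyPods(samplePodList);
console.log(unhealthyPods);
```

The logs of each reported pod can then be inspected with kubectl --namespace=<namespace> logs <pod>.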
8.2. Backup
The architecture of CaaS consists of different, independent components that generate and process different information. If there is a need for data backup, this must therefore be done depending on the respective component.
A backup of the information stored in CaaS must be performed using the standard mechanisms of the Mongo database. This can either be done by creating a copy of the underlying files or by using mongodump.
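A scripted backup with mongodump could look like the following sketch. The helper names, the archive file name pattern and the connection string are hypothetical. The options --uri, --archive and --gzip are standard mongodump options. How the database is reached from where the script runs depends on your cluster setup and is not covered here.

```javascript
// Sketch only: derives a dated archive name, e.g. caas-backup-2024-01-31.archive.gz
function backupFileName(date) {
  return `caas-backup-${date.toISOString().slice(0, 10)}.archive.gz`;
}

// Builds the argument list for a compressed single-file dump.
function buildMongodumpArgs(connectionUri, archivePath) {
  return [`--uri=${connectionUri}`, `--archive=${archivePath}`, "--gzip"];
}

// Runs mongodump; requires the mongodump binary on the PATH.
function runBackup(connectionUri, targetDirectory) {
  const { execFileSync } = require("child_process");
  const archivePath = `${targetDirectory}/${backupFileName(new Date())}`;
  execFileSync("mongodump", buildMongodumpArgs(connectionUri, archivePath), { stdio: "inherit" });
  return archivePath;
}
```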
8.3. Update
Operating the CaaS platform with Helm in Kubernetes makes it possible to update to a new version without a new installation.
Before updating the Mongo database, a Backup is strongly recommended.
The helm list --all-namespaces command first returns a list of all installed Helm releases. This list contains both the version and the namespace of the corresponding release.
$ helm list --all-namespaces
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
firstinstance integration 1 2019-12-11 15:51.. DEPLOYED caas-2.10.4 caas-2.10.4
secondinstance staging 1 2019-12-12 09:31.. DEPLOYED caas-2.10.4 caas-2.10.4
To update a release, the following steps must be carried out one after the other:
Transfer the settings
To avoid losing the previous settings, it is necessary to have the custom-values.yaml file with which the initial installation of the Helm chart was carried out.
Adoption of further adjustments
If there are adjustments to files (e.g. in the config directory), these must also be adopted.
Update
After performing the previous steps, the update can be started. It replaces the existing installation with the new version without any downtime. To do this, execute the following command, which starts the process:
helm upgrade RELEASE_NAME caas-16.18.2.tgz --values /path/to/custom-values.yaml
9. Appendix
9.1. Examples
9.1.1. Change stream example
<script type="module">
  import PersistentWebSocket from 'https://cdn.jsdelivr.net/npm/pws@5/dist/index.esm.min.js';

  // Replace this with your API key (needs read access for the preview collection)
  const apiKey = "your-api-key";

  // Replace this with your preview collection url (if not known copy from CaaS Connect Project App)
  // e.g. "https://REST-HOST:PORT/my-tenant-id/f948bb48-4f6b-4a8a-b521-338c9d352f2b.preview.content"
  const previewCollectionUrl = new URL("your-preview-collection-url");
  const pathSegments = previewCollectionUrl.pathname.split("/");
  if (pathSegments.length !== 3) {
    throw new Error(`The format of the provided url '${previewCollectionUrl}' is incorrect and should only contain two path segments`);
  }

  (async function(){
    // Retrieving temporary auth token
    const token = await fetch(new URL(`_logic/securetoken?tenant=${pathSegments[1]}`, previewCollectionUrl.origin).href, {
      headers: {'Authorization': `Bearer ${apiKey}`}
    })
      .then((response) => response.json())
      .then((token) => token.securetoken)
      .catch(console.error);

    // Establishing WebSocket connection to the change stream "crud"
    // ("crud" is the default change stream that the CaaS Connect module provides)
    const wsUrl = `wss://${previewCollectionUrl.host + previewCollectionUrl.pathname}`
      + `/_streams/crud?securetoken=${token}`;
    const pws = new PersistentWebSocket(wsUrl, { pingTimeout: 60000 });

    // Handling change events
    pws.onmessage = event => {
      const {
        documentKey: {_id: documentId},
        operationType: changeType,
      } = JSON.parse(event.data);
      console.log(`Received event for '${documentId}' with change type '${changeType}'`);
    }
  })();
</script>
9.1.2. GraphQL example application
This chapter describes a use case for a GraphQL application as an example. For this we will outline the individual steps that belong to the creation of a GraphQL application and its later use.
Create the GraphQL app definition
In the example scenario, a GraphQL application is created that can be used to query data records located in the CaaS. The data sets used here are the products from the example project of the fictitious company “Smart Living”. Image references and product categories in the data sets are resolved directly.
The entire command to create the GraphQL app definition for this example scenario looks like this.
curl --location --request PUT 'https://REST-HOST:PORT/mycorp-dev/gql-apps/mycorp-dev___products' \
  --header 'Authorization: Bearer <PERMITTED_APIKEY>' \
  --header 'Content-Type: application/json' \
  --data-raw '{
  "descriptor": {
    "name": "products",
    "description": "example app to fetch product relevant information from SLG",
    "enabled": true,
    "uri": "mycorp-dev___products"
  },
  "schema": "type Picture{ name: String! identifier: String! binaryUrl: String! width: Int! height: Int! } type Category{ name: String! identifier: String! } type Product{ name: String! identifier: String! description: String categories: [Category] picture: Picture } type Query{ products(_language: [String!] = [\"DE\", \"EN\"]): [Product] }",
  "mappings": {
    "Category": {
      "name": "displayName",
      "identifier": "_id"
    },
    "Picture": {
      "name": "displayName",
      "identifier": "_id",
      "binaryUrl": "resolutionsMetaData.ORIGINAL.url",
      "width": "resolutionsMetaData.ORIGINAL.width",
      "height": "resolutionsMetaData.ORIGINAL.height"
    },
    "Product": {
      "name": "displayName",
      "identifier": "_id",
      "description": "formData.tt_abstract.value",
      "picture": {
        "db": "mycorp-dev",
        "collection": "d8db6f24-0bf8-4f48-be47-5e41d8d427fd.preview.content",
        "find": {
          "identifier": {
            "$fk": "formData.tt_media.value.0.formData.st_media.value.identifier"
          },
          "locale.identifier": {
            "$fk": "locale.identifier"
          }
        }
      },
      "categories": {
        "db": "mycorp-dev",
        "collection": "d8db6f24-0bf8-4f48-be47-5e41d8d427fd.preview.content",
        "find": {
          "identifier": {
            "$in": {
              "$fk": "formData.tt_categories.value.identifier"
            }
          },
          "locale.identifier": {
            "$fk": "locale.identifier"
          }
        }
      }
    },
    "Query": {
      "products": {
        "db": "mycorp-dev",
        "collection": "d8db6f24-0bf8-4f48-be47-5e41d8d427fd.preview.content",
        "find": {
          "locale.identifier": { "$in": { "$arg": "_language" } },
          "entityType": "product"
        }
      }
    }
  }
}'
When creating a GraphQL app definition with a JSON body, the schema must be specified as a JSON string. For better readability, the schema is shown in formatted form in the following sections.
Schema of the GraphQL app definition
The schema used for the example contains the following definitions.
type Picture {
  name: String!
  identifier: String!
  binaryUrl: String!
  width: Int!
  height: Int!
}

type Category {
  name: String!
  identifier: String!
}

type Product {
  name: String!
  identifier: String!
  description: String
  categories: [Category]
  picture: Picture
}

type Query {
  products(_language: [String!] = ["DE", "EN"]): [Product]
}
Lines 1, 9 and 14 of the schema are the starting points for the type definitions of the objects used in the GraphQL app. In addition, each GraphQL schema contains a Query type (line 22) that defines what data can be queried by a GraphQL app. More details about schemas in GraphQL can be found in the GraphQL documentation.
In this example we will query product records. This is made possible in line 23 of the schema by the type [Product]. The queries should be able to specify the languages for which we want to get products. The specification of the language shall be optional in this example. This requirement is met by specifying an argument called _language, which defaults to ["DE", "EN"].
Mapping of the GraphQL app definition
The GraphQL app definition mapping represents the connection between the schema and the data in the database. Each type described in the schema generally requires an explicit entry, so this part of a GraphQL app definition is usually the longest. There may be situations where the fields in the type should be named exactly like the keys of the data. In this special case, no explicit entry in the mapping is necessary. For details about the mapping in a GraphQL app definition, see the corresponding chapter in the RESTHeart documentation.
The following example is an excerpt from creating a GraphQL app definition and clarifies some use cases.
{
  "Category": {
    "name": "displayName",
    "identifier": "_id"
  },
  "Picture": {
    "name": "displayName",
    "identifier": "_id",
    "binaryUrl": "resolutionsMetaData.ORIGINAL.url",
    "width": "resolutionsMetaData.ORIGINAL.width",
    "height": "resolutionsMetaData.ORIGINAL.height"
  },
  "Product": {
    "name": "displayName",
    "identifier": "_id",
    "description": "formData.tt_abstract.value",
    "picture": {
      "db": "mycorp-dev",
      "collection": "d8db6f24-0bf8-4f48-be47-5e41d8d427fd.preview.content",
      "find": {
        "identifier": {
          "$fk": "formData.tt_media.value.0.formData.st_media.value.identifier"
        },
        "locale.identifier": {
          "$fk": "locale.identifier"
        }
      }
    },
    "categories": {
      "db": "mycorp-dev",
      "collection": "d8db6f24-0bf8-4f48-be47-5e41d8d427fd.preview.content",
      "find": {
        "identifier": {
          "$in": {
            "$fk": "formData.tt_categories.value.identifier"
          }
        },
        "locale.identifier": {
          "$fk": "locale.identifier"
        }
      }
    }
  },
  "Query": {
    "products": {
      "db": "mycorp-dev",
      "collection": "d8db6f24-0bf8-4f48-be47-5e41d8d427fd.preview.content",
      "find": {
        "locale.identifier": { "$in": { "$arg": "_language" } },
        "entityType": "product"
      }
    }
  }
}
The first use case considered is the so-called field to field mapping. In this type of mapping, a field in the type is assigned a corresponding attribute of the data. An example of this can be seen in line 3, where the field Category.name from the schema refers to the attribute displayName from the data.
The second use case is the field to query mapping. Here a field in the type is mapped to the result of a data query. An example of such a mapping can be found in line 45ff: the field Query.products is mapped to the data that is found in the REST Interface under /mycorp-dev/d8db6f24-0bf8-4f48-be47-5e41d8d427fd.preview.content and matches the filters "entityType": "product" and "locale.identifier": { "$in": { "$arg": "_language" } }. This means that exactly those products are queried which are located in the defined source, represent an entity of type “product” and use one of the language abbreviations passed in the “_language” argument.
Another example of a “field to query mapping” can be found starting at line 29. In this mapping definition, the product categories, which are maintained in separate records, are identified using a foreign key relationship. The complete entry in lines 29-42 states that the Product.categories field will list all product categories that are under /mycorp-dev/d8db6f24-0bf8-4f48-be47-5e41d8d427fd.preview.content, whose identifier is stored in the formData.tt_categories.value.identifier field of the product record, and whose locale.identifier exactly matches the locale.identifier of the product record. Since a product can reference multiple categories in the dataset under formData.tt_categories.value.identifier, the key $in is used here.
If multiple filters are specified in a |
Complex mappings with multiple foreign key relationships may result in increased query response times. For more efficient query execution, we recommend using indexes. Configuring batching and caching can also help optimize response times. Details can be found in this documentation.
All operations on GraphQL applications via the REST Interface are mirrored in the background. The Re-synchronization of existing GraphQL apps chapter contains more information on how this mechanism can also be executed manually.
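To make the foreign key mechanism more tangible, the following sketch simulates how the find section of a field to query mapping could be turned into a concrete MongoDB filter for one product record. It is an illustrative simulation only and not the implementation used by RESTHeart. The helpers getPath and resolveFind as well as the sample record are hypothetical.

```javascript
// Reads a dotted path from a document. Arrays are traversed element-wise
// unless the path segment is a numeric index.
function getPath(value, path) {
  return path.split(".").reduce((current, key) => {
    if (current === undefined || current === null) return undefined;
    if (Array.isArray(current) && !/^\d+$/.test(key)) {
      return current.map((item) => (item == null ? undefined : item[key]));
    }
    return current[key];
  }, value);
}

// Replaces {"$fk": path} with the value from the parent record and
// {"$arg": name} with the value of the query argument.
function resolveFind(find, parentDoc, args) {
  if (find === null || typeof find !== "object") return find;
  if (Array.isArray(find)) return find.map((item) => resolveFind(item, parentDoc, args));
  if ("$fk" in find) return getPath(parentDoc, find.$fk);
  if ("$arg" in find) return args[find.$arg];
  return Object.fromEntries(
    Object.entries(find).map(([key, value]) => [key, resolveFind(value, parentDoc, args)]));
}

// Hypothetical product record that references two categories.
const productRecord = {
  locale: { identifier: "EN" },
  formData: { tt_categories: { value: [{ identifier: "cat-1" }, { identifier: "cat-2" }] } },
};
const categoriesFind = {
  "identifier": { "$in": { "$fk": "formData.tt_categories.value.identifier" } },
  "locale.identifier": { "$fk": "locale.identifier" },
};
const resolvedFilter = resolveFind(categoriesFind, productRecord, {});
console.log(JSON.stringify(resolvedFilter));
```

In this simulation the $fk under $in resolves to a list of identifiers, which is why $in is needed when a record references several other records.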
Aggregation and GraphQL
Aggregations can be used to make more complex queries to the CaaS. For example, the number of documents that match a certain condition can be counted. Much more complex aggregations can be created. A list of possible aggregation stages and operators can be found here.
{
  "_id": "example___count",
  "descriptor": {
    "name": "mithras",
    "description": "Query mithras PageRef",
    "enabled": true,
    "uri": "example___count"
  },
  "schema": "type PageRefs {_id: String count: Int} type Query{countPageRefs: [PageRefs]}",
  "mappings": {
    "PageRefs": {
      "count": "count"
    },
    "Query": {
      "countPageRefs": {
        "db": "example",
        "collection": "641154a9-b90c-4b10-a5f7-38677cbb5abc.release.content",
        "stages": [
          { "$match": { "fsType": "PageRef" } },
          { "$count": "count" }
        ]
      }
    }
  }
}
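The effect of the two aggregation stages in this definition can be reproduced on an in-memory list of documents, as in the following sketch. The function countPageRefs and the sample documents are hypothetical. The sketch only mimics the stages and does not query the CaaS.

```javascript
// Sketch only: mimics { "$match": { "fsType": "PageRef" } } followed by { "$count": "count" }.
function countPageRefs(documents) {
  const matched = documents.filter((doc) => doc.fsType === "PageRef"); // $match
  // $count emits a single document, or nothing at all if no document matched.
  return matched.length === 0 ? [] : [{ count: matched.length }];
}

// Hypothetical sample data.
const sampleDocuments = [
  { _id: "a", fsType: "PageRef" },
  { _id: "b", fsType: "Media" },
  { _id: "c", fsType: "PageRef" },
];
const countResult = countPageRefs(sampleDocuments);
console.log(countResult);
```

Against the GraphQL endpoint of this app, a query such as { countPageRefs { count } } would request the same value.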
Using the GraphQL app
Requests can now be made to a GraphQL application using this GraphQL app definition. Through the GraphQL feature of introspection, REST clients supporting GraphQL can query the schema stored in the GraphQL app definition, making it easier for users of a GraphQL app to formulate queries.
curl --location --request POST 'https://REST-HOST:PORT/graphql/mycorp-dev___products' \
--header 'Authorization: Bearer <PERMITTED_APIKEY>' \
--header 'Content-Type: application/json' \
--data-raw '{"query": "query($lang: [String!]){products(_language: $lang) {name description categories {name} picture {name binaryUrl width height}}}", "variables": {"lang": ["DE"]}}'
This example of a GraphQL query shows the use of the example GraphQL app by cURL. The invocation of a GraphQL app is always done through the graphql database followed by the descriptor.uri set in the GraphQL app definition. Through this query, product data is retrieved depending on the variable $lang. The variable is passed as a value to the _language argument defined in the schema. Since a reasonable default value for _language is included in the schema of the GraphQL app definition, the use of variables is optional in this scenario. Further details on arguments and variables can be found in the GraphQL documentation.
9.2. Troubleshooting: Known errors
9.2.1. File upload using PUT request fails
The errors
E11000 duplicate key error collection: [some-file-bucket].chunks index: files_id_1_n_1 dup key
or
error updating the file, the file bucket might have orphaned chunks
indicate the presence of orphaned file chunks in the MongoDB data. The orphaned data can be removed with the following mongo shell script:
// Name of the file bucket to clean up (e.g., my-bucket.files)
var filesBucket = "{YOUR_FILE_BUCKET_NAME}";
var chunksCollection = filesBucket.substring(0, filesBucket.lastIndexOf(".")) + ".chunks";

db[chunksCollection].aggregate([
  // avoid accumulating binary data in memory
  { $unset: "data" },
  {
    $lookup: {
      from: filesBucket,
      localField: "files_id",
      foreignField: "_id",
      as: "fileMetadata",
    }
  },
  { $match: { fileMetadata: { $size: 0 } } }
]).forEach(function (c) {
  db[chunksCollection].deleteOne({ _id: c._id });
  print("Removed orphaned GridFS chunk with id " + c._id);
});
10. Help
The Technical Support of the Crownpeak Technology GmbH provides expert technical support covering any topic related to the FirstSpirit™ product. You can find more help on relevant topics in our community.
11. Disclaimer
This document is provided for information purposes only. Crownpeak Technology GmbH may change the contents hereof without notice. This document is not warranted to be error-free, nor subject to any other warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of merchantability or fitness for a particular purpose. Crownpeak Technology GmbH specifically disclaims any liability with respect to this document and no contractual obligations are formed either directly or indirectly by this document. The technologies, functionality, services, and processes described herein are subject to change without notice.