1. Introduction
The CaaS platform acts as the link between FirstSpirit and the customer’s end application. The REST Interface receives information and updates it in the internal persistence layer of the CaaS platform. Data in the customer’s end application is updated by requests to the REST Interface.
The CaaS platform consists of the following components, which are provided as Docker containers:
- REST Interface (caas-rest-api)
  The REST Interface is used both to transfer data into and to query data from the CaaS repository. For this purpose it provides a REST endpoint that can be used by any service. It also supports authentication and authorization.
  Between CaaS versions 2.11 and 2.13 (inclusive), authentication and authorization were provided by a separate Security Proxy.
- CaaS repository (caas-mongo)
  The CaaS repository is not accessible from the Internet and can only be reached from the REST Interface within the platform. It serves as the storage for all project data and internal configuration.
This document is intended for operators of the CaaS platform and contains information and instructions for operating and technically administering the platform.
A description of the functions and usage options of the REST Interface of the CaaS platform can be found in the separate documentation of the REST Interface.
2. Technical Requirements
The CaaS platform is designed to be operated with Kubernetes.
If you do not feel able to operate, configure, and monitor the cluster infrastructure, and to analyze and resolve operational problems, we strongly advise against on-premises operation and refer you to our SaaS offering.
Since the CaaS platform is delivered as a Helm artifact, Helm must be available as a client.
It is important that Helm is installed securely. Further information can be found in the Helm installation guide.
For system requirements, please refer to the technical datasheet of the CaaS platform.
3. Installation and Configuration
Setting up the CaaS platform for operation with Kubernetes is done using Helm charts. These are included in the delivery and already contain all required components.
The following subchapters describe the necessary installation and configuration steps.
3.1. Importing the Images
The setup of the CaaS platform requires the import of the images into your central Docker registry (e.g. Artifactory) as the first step. The images are included in the delivery in the file caas-docker-images-20.12.4.zip.
The credentials for the cluster’s access to the registry must be known.
Please refer to the documentation of your registry for the necessary steps for the import.
3.2. Helm Chart Configuration
After importing the images, the configuration of the Helm chart is necessary. This is included in the delivery and can be found in the file caas-20.12.4.tgz. A default configuration of the chart is already provided in the values.yaml file. All parameters specified in this values.yaml can be overwritten with specific values in a manually created custom-values.yaml.
3.2.1. Authentication
All authentication settings for communication with or within the CaaS platform are defined in the credentials block of the custom-values.yaml.
This includes usernames, default passwords, and the CaaS Master API Key. It is strongly recommended to change the default passwords and the CaaS Master API Key.
All chosen passwords must be alphanumeric. Otherwise, issues may occur with CaaS.
The CaaS Master API Key is automatically created during the installation of the CaaS platform and thus enables direct use of the REST Interface.
3.2.2. CaaS Repository (caas-mongo)
The configuration of the repository includes two parameters:
- storageClass
  The ability to overwrite parameters from the values.yaml file mainly concerns the parameter mongo.persistentVolume.storageClass.
For performance reasons, we recommend provisioning the underlying file system of MongoDB with XFS.
- clusterKey
  A default configuration for the Mongo cluster authentication key is provided. The key can be set in the parameter credentials.clusterKey. It is strongly recommended to generate a new key for production use with the following command:
  openssl rand -base64 756
This value should only be changed during the initial installation. Changing it later may lead to permanent database unavailability, which can only be repaired manually.
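Because the generated key spans multiple lines, it is convenient to set it as a YAML block scalar in the custom-values.yaml. A minimal sketch:

credentials:
  clusterKey: |
    <paste the output of openssl rand -base64 756 here>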
3.2.3. Docker Registry
Adjusting the parameters imageRegistry and imageCredentials is necessary to configure the Docker registry to be used.
imageRegistry: docker.company.com/e-spirit
imageCredentials:
  username: "username"
  password: "special_password"
  registry: docker.company.com
  enabled: true
3.2.4. Ingress Configurations
Ingress definitions control incoming traffic to the individual components. With the default configuration, the definitions included in the chart are not created. The parameters restApi.ingress.enabled and restApi.ingressPreview.enabled enable the Ingress configuration for the REST Interface.
The Ingress definitions of the Helm chart require the NGINX Ingress Controller, as annotations and the Ingress class of this specific implementation are used. If you use a different implementation, you must adjust the annotations and the Ingress class accordingly.
restApi:
  ingress:
    enabled: true
    hosts:
      - caas.company.com
  ingressPreview:
    enabled: true
    hosts:
      - caas-preview.company.com
If the configuration options are not sufficient for your specific use case, you can create the Ingress yourself. In this case, set the corresponding parameter to enabled: false. The following code example provides guidance for the definition.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: caas
spec:
  ingressClassName: my-ingress-caas
  rules:
    - host: caas-rest-api.mydomain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: caas-rest-api
                port:
                  number: 80
3.3. Helm Chart Installation
After configuring the Helm chart, it must be installed in the Kubernetes cluster. Installation is performed using the following commands, which must be executed in the directory of the Helm chart.
kubectl create namespace caas
helm install RELEASE_NAME . --namespace=caas --values /path/to/custom-values.yaml
The release name can be chosen freely.
If you want to use a different namespace, you must adjust the commands accordingly.
If you want to use an existing namespace, the creation step is omitted and the desired namespace is specified in the installation command.
Since the containers are first downloaded from the used image registry, installation may take a few minutes. Ideally, it should not take more than five minutes before the CaaS platform is ready for use.
The status of the individual components can be retrieved with the following command:
kubectl get pods --namespace=caas
Once all components have the status Running, installation is complete.
NAME                             READY   STATUS    RESTARTS   AGE
caas-mongo-0                     2/2     Running   0          4m
caas-mongo-1                     2/2     Running   0          3m
caas-mongo-2                     2/2     Running   0          1m
caas-rest-api-1851714254-13cvn   1/1     Running   0          5m
caas-rest-api-1851714254-xs6c0   1/1     Running   0          4m
3.4. TLS
Communication from the CaaS platform to the outside is not encrypted by default. If it should be protected by TLS, there are two configuration options:
- Use of an officially signed certificate
  To use an officially signed certificate, a TLS secret is required, which must first be created. It must contain the key tls.key and the certificate tls.crt. The steps required to create the TLS secret are described in the Kubernetes Ingress documentation; a short example follows after this list.
- Automated certificate management
  Alternatively, you can automate certificate management using Cert-Manager. It must be installed in the cluster and takes care of the creation, distribution, and renewal of all required certificates. Cert-Manager can, for example, be configured to use and automatically renew Let’s Encrypt certificates.
  The necessary installation steps are described in the Cert-Manager documentation.
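The following sketch shows how such a TLS secret could be created with kubectl; the secret name caas-tls and the file paths are examples:

kubectl create secret tls caas-tls \
  --cert=path/to/tls.crt \
  --key=path/to/tls.key \
  --namespace=caas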
3.5. Scaling
To quickly process the information transferred to the CaaS, the CaaS platform must always ensure optimal load distribution. For this reason, the REST Interface and the Mongo database are scalable and, in terms of fail-safety, are already configured so that at least three instances are deployed. This minimum number of instances is especially required for the Mongo cluster.
3.5.1. REST Interface
Scaling of the REST Interface is done using Horizontal Pod Autoscalers. This allows the REST Interface to scale up or down depending on the current CPU load.
The parameter targetCPUUtilizationPercentage specifies the percentage at which scaling should occur. The parameters minReplicas and maxReplicas define the minimum and maximum number of possible REST Interface instances.
The CPU load threshold should be chosen carefully: if the percentage is too low, the REST Interface scales up too early under increasing load; if it is too high, scaling may not occur quickly enough. Incorrect configuration can therefore jeopardize system stability. The official Kubernetes Horizontal Pod Autoscaler documentation and the examples listed in it provide further information on using a Horizontal Pod Autoscaler.
Enabling the Horizontal Pod Autoscaler
Enabling and configuring the Horizontal Pod Autoscaler should be done in the custom-values.yaml file to overwrite the default values defined in the values.yaml file.
restApi:
  horizontalPodAutoscaler:
    enabled: false
    minReplicas: 3
    maxReplicas: 9
    targetCPUUtilizationPercentage: 50
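After the Horizontal Pod Autoscaler has been enabled (enabled: true) and rolled out, its current state can be checked with the following command (assuming the caas namespace from the installation chapter):

kubectl get hpa --namespace=caas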
Enabling the Horizontal Pod Autoscaler in the custom-values.yaml file removes the replicas field from the deployment of the REST Interface, which can briefly scale the REST Interface down during the rollout. If you want to avoid this behavior, remove the replica field from the deployment manifest in the Helm history secret before rolling out. Background information on migrating to a Horizontal Pod Autoscaler can be found in the Kubernetes documentation.
3.5.2. Mongo Database
We distinguish between horizontal and vertical scaling. Horizontal scaling means adding additional instances to handle traffic. Vertical scaling means assigning more CPU/RAM to existing instances.
Horizontal scaling
Unlike the REST Interface, horizontal scaling of the Mongo database is only possible manually. It cannot be performed automatically using a Horizontal Pod Autoscaler.
Scaling of the Mongo database is done via the replicas parameter. This must be entered in the custom-values.yaml file to overwrite the default value defined in the values.yaml file.
At least three instances are required for the operation of the Mongo cluster; otherwise, no primary node can be elected.
mongo:
  replicas: 3
Do not scale the StatefulSet directly in Kubernetes. If you do, certain connection URLs will not be correct and the additional instances will not be used properly. Instead, use the custom Helm values.
Scaling down the Mongo database is not possible without direct intervention and requires manual reduction of the replica set of the Mongo database. The MongoDB documentation describes the necessary steps.
Vertical scaling
Vertical scaling is done using Vertical Pod Autoscalers. Vertical Pod Autoscalers are Custom Resources in Kubernetes, so you must first ensure that your cluster supports them.
You can then configure the following parameters in your custom-values.yaml:
mongo:
  verticalPodAutoscaler:
    enabled: false
    apiVersion: autoscaling.k8s.io/v1beta2
    updateMode: Auto
    minAllowed:
      cpu: 100m
      memory: 500Mi
    maxAllowed:
      cpu: 1
      memory: 2000Mi
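Whether your cluster supports the Custom Resource can be checked, for example, by querying its CRD:

kubectl get crd verticalpodautoscalers.autoscaling.k8s.io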
Applying the configuration
After configuration changes for the REST Interface or the Mongo database, the updated custom-values.yaml file must be applied using the following command.
helm upgrade -i RELEASE_NAME path/to/caas-<VERSIONNUMBER>.tgz --values /path/to/custom-values.yaml
The release name can be determined with the command helm list --all-namespaces.
3.6. Monitoring
The CaaS platform is a microservice architecture and therefore consists of different components. To always monitor their status properly and react quickly in case of errors, integration into cluster-wide monitoring is essential for operation with Kubernetes.
The CaaS platform is already preconfigured for monitoring with Prometheus Operator, as this scenario is widespread in the Kubernetes environment. Prometheus ServiceMonitors for collecting metrics, Prometheus alerts for notification in case of problems, and predefined Grafana dashboards for visualizing metrics are included.
3.6.1. Prerequisites
It is essential to set up monitoring and log persistence for the Kubernetes cluster. Without these prerequisites, there are hardly any analysis options in case of errors, and Technical Support lacks important information.
- Metrics
  To install the Prometheus Operator, please use the official Helm chart so that cluster monitoring can be set up on that basis. For further information, please refer to the relevant documentation.
  If you do not operate a Prometheus Operator, you must disable the Prometheus ServiceMonitors and Prometheus alerts.
- Logging
  Kubernetes makes it possible to provision various containers and services automatically and at scale. To ensure that logs persist even after an instance is terminated in such a dynamic environment, an infrastructure that persists them beforehand must be integrated.
  We therefore recommend using a central logging system, such as the Elastic Stack. The Elastic (or ELK) Stack is a collection of open-source projects that help persist, search, and analyze log data in real time.
  For installation, you can also use an existing Helm chart.
3.6.2. Prometheus ServiceMonitors
Deployment of the ServiceMonitors provided by the CaaS platform for the REST Interface and the Mongo database is controlled via the custom-values.yaml file of the Helm chart.
Access to the metrics of the REST Interface is secured by an API Key, and access to the metrics of MongoDB is secured by a corresponding MongoDB user. The respective credentials are included in the credentials block of the values.yaml. For security reasons, please adjust the credentials in your custom-values.yaml.
Typically, Prometheus is configured to only consider ServiceMonitors with certain labels. The labels can therefore be configured in the custom-values.yaml file and apply to all ServiceMonitors of the CaaS Helm chart. In addition, the scrapeInterval parameter allows you to define how often the respective metrics are retrieved.
monitoring:
  prometheus:
    # Prometheus service monitors will be created for enabled metrics. Each Prometheus
    # instance has a configured serviceMonitorSelector property, to be able to control
    # the set of matching service monitors. To allow defining matching labels for CaaS
    # service monitors, the labels can be configured below and will be added to each
    # generated service monitor instance.
    metrics:
      serviceMonitorLabels:
        release: "prometheus-operator"
      mongo:
        enabled: true
        scrapeInterval: "30s"
      caas:
        enabled: true
        scrapeInterval: "30s"
The metrics of MongoDB are provided via a sidecar container and retrieved using a separate database user. You can configure the database user in the credentials block of the custom-values.yaml. The sidecar container is configured with the following default settings:
mongo:
  metrics:
    image: mongodb-exporter:0.11.0
    syncTimeout: 1m
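After the rollout, you can verify that the ServiceMonitors have been created, provided the Prometheus Operator CRDs are installed in the cluster:

kubectl get servicemonitors --namespace=caas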
3.6.3. Prometheus Alerts
Deployment of the alerts provided by the CaaS platform is controlled via the custom-values.yaml file of the Helm chart.
Typically, Prometheus is configured to only consider alerts with certain labels. The labels can therefore be configured in the custom-values.yaml file and apply to all alerts of the CaaS Helm chart.
caas-common:
  monitoring:
    prometheus:
      alerts:
        # Labels for the PrometheusRule resource
        prometheusRuleLabels:
          app: "prometheus-operator"
          release: "prometheus-operator"
        # Additional Prometheus labels to attach to alerts (or overwrite existing labels)
        additionalAlertLabels: {}
        caas:
          enabled: true
          useAlphaAlerts: false
          # Namespace(s) that should be targeted by the alerts (supports Go template and regular expressions)
          targetNamespace: "{{ .Release.Namespace }}"
3.6.4. Grafana Dashboards
Deployment of the Grafana dashboards provided by the CaaS platform is managed via the custom-values.yaml file of the Helm chart.
Typically, the Grafana sidecar container is configured to only consider ConfigMaps with specific labels and in a defined namespace. The labels of the ConfigMap and the namespace in which it is deployed can be configured in the custom-values.yaml file:
caas-common:
  monitoring:
    grafana:
      dashboards:
        enabled: true
        # Namespace that the ConfigMap resource will be created in (supports Go template and regular expressions)
        configmapNamespace: "{{ .Release.Namespace }}"
        # Additional labels to attach to the ConfigMap resource
        configMapLabels: {}
        overviewDashboardsEnabled: false
3.7. REST API Configuration
The REST Interface offers various configuration options that can be set in the custom-values.yaml file of the Helm chart.
3.7.1. Mongo Connection String with DNS Seed List
By default, the REST Interface uses a static connection string with all hostnames in the replica set (Standard Connection String) to connect to the MongoDB database. Optionally, the REST Interface can be configured to use a DNS seed list (SRV connection format).
This is done by setting restApi.mongoSrvConnectionFormat.enabled: true.
The cluster domain used for this can be overridden with the parameter restApi.mongoSrvConnectionFormat.domain.
A seed list is only usable for the MongoDB included in the chart. For connections to an externally set up MongoDB, only the standard connection string is available.
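In the custom-values.yaml, enabling the SRV connection format could look like the following sketch; the domain only needs to be set if it differs from your cluster’s default, and cluster.local is merely an example value:

restApi:
  mongoSrvConnectionFormat:
    enabled: true
    domain: cluster.local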
3.7.2. Metadata in Collection Queries
By default, the REST Interface does not return collection metadata when filters are used in queries.
If metadata should be returned, set restApi.additionalConfigOverrides./noPropertiesInterceptor/enabled: false.
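Expressed in the custom-values.yaml, this setting could look as follows:

restApi:
  additionalConfigOverrides:
    "/noPropertiesInterceptor/enabled": false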
3.7.3. Excluding MongoDB Query Operators in Filter Queries
Certain MongoDB query operators can be excluded from use in filter queries by adding them to a blacklist.
To activate this feature, set restApi.filterOperatorBlacklist.enabled: true and specify the operators in restApi.filterOperatorBlacklist.value: [].
For more information, see the documentation.
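A sketch of such a configuration in the custom-values.yaml; the blacklisted operator $where is only an example:

restApi:
  filterOperatorBlacklist:
    enabled: true
    value: ["$where"]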
3.7.4. Resolving References in Document Queries
When querying documents that contain references to other documents, these references are automatically resolved and the referenced documents are embedded in the result.
The maximum depth of reference resolution can be configured with restApi.additionalConfigOverrides./refResolvingInterceptor/max-depth.
The maximum number of references that can be resolved in a request is set with restApi.additionalConfigOverrides./refResolvingInterceptor/limit.
Reference resolution can be disabled with restApi.additionalConfigOverrides./refResolvingInterceptor/enabled: false.
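Combined in the custom-values.yaml, these settings could look like the following sketch; the values shown are examples:

restApi:
  additionalConfigOverrides:
    "/refResolvingInterceptor/max-depth": 3
    "/refResolvingInterceptor/limit": 100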
4. Development Environment
Kubernetes and Helm form the basis of all installations of the CaaS platform. For development environments, we recommend installing the CaaS platform in a separate namespace on your production cluster or a similarly configured cluster. We advise against using local instances of the CaaS platform, even for development.
If you need a local environment on development machines, you must create a local Kubernetes cluster. You can use one of the following projects, for example:
- minikube
- kind
- k3d
This list is not exhaustive. It is intended to provide some examples that we know generally work, but we do not use these projects permanently ourselves.
Any of these projects can be used to manage Kubernetes clusters locally. However, we cannot provide support for any of these specific projects. The CaaS platform only uses standard features of Helm and Kubernetes and is therefore independent of a specific Kubernetes distribution.
Please ensure the following features are correctly configured when using a local Kubernetes cluster:
- Kubernetes image pull secrets to resolve Docker images from your local or company Docker registry
- Disable monitoring in the custom-values.yaml or install the required prerequisites
- Adjust DNS settings of the host system to work with Kubernetes Ingress resources, or use local port forwarding in the local cluster (see the example below)
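A minimal sketch of such a local port forwarding; the service name caas-rest-api and port 80 are taken from the Ingress example above and may differ in your installation:

kubectl --namespace=caas port-forward service/caas-rest-api 8080:80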
5. Metrics
Metrics are used for monitoring and troubleshooting the CaaS components during operation and are accessible via HTTP endpoints. If metrics are available in Prometheus format, corresponding ServiceMonitors are created (see also Prometheus ServiceMonitors).
5.1. REST Interface
Healthcheck
The healthcheck endpoint provides information about the functionality of the respective component in the form of a JSON document. This status is calculated from several checks. If all checks are successful, the JSON response has HTTP status 200. If at least one check is false, the response has HTTP status 500.
The endpoint is available at: http://REST-HOST:PORT/_logic/healthcheck
The functionality of the REST Interface depends on both the reachability of the MongoDB cluster and the existence of a primary node. If the cluster does not have a primary node, write operations to MongoDB are not possible.
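A quick manual check is possible with curl, for example; depending on your configuration, authentication may additionally be required:

curl -i http://REST-HOST:PORT/_logic/healthcheck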
HTTP Metrics
Performance metrics of the REST Interface can be retrieved in Prometheus format at the following URLs:
- http://REST-HOST:PORT/_logic/metrics
- http://REST-HOST:PORT/_logic/metrics-caas
- http://REST-HOST:PORT/_logic/metrics-jvm
5.2. MongoDB
Metrics for MongoDB are provided via a sidecar container. This container accesses the metrics of MongoDB using a separate database user and provides them via HTTP.
Metrics can be retrieved at: http://MONGODB-HOST:METRICS-PORT/metrics
Please note that MongoDB metrics are delivered via a separate port. This port is not accessible from outside the cluster and is therefore not protected by authentication.
6. Maintenance
Data transfer to the CaaS can only function if all components are working properly. If disruptions occur or updates are necessary, all CaaS components must be considered. The following subchapters describe the necessary steps for troubleshooting in case of a disruption and how to perform backups and updates.
6.1. Troubleshooting
The CaaS is a distributed system based on the interaction of different components. Each of these components can potentially cause errors. If a disruption occurs during the use of the CaaS, various causes may be responsible. The following basic analysis steps explain how to identify the causes of disruptions.
- Component status
  The status of each CaaS platform component can be checked using the command kubectl get pods --namespace=<namespace>. If the status of an instance deviates from Running or ready, it is recommended to start troubleshooting there and check the associated log files.
If there are problems with the MongoDB, check whether a primary node exists (see the command below). The chapter Consider Fault Tolerance in the MongoDB documentation describes how many nodes can explicitly fail before it becomes impossible to elect a new primary.
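The replica set state, including the primary node, can be inspected with the mongo shell, for example via kubectl. The pod name is taken from the listing in the installation chapter; your pod names may differ, the MongoDB container may need to be selected with -c, and authentication parameters may be required:

kubectl --namespace=caas exec -it caas-mongo-0 -- mongo --eval "rs.status()"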
- Log analysis
  In case of problems, log files are a good starting point for analysis. They allow you to track all operations on the systems, making any errors and warnings visible.
  Current log files of the CaaS components can be viewed using kubectl --namespace=<namespace> logs <pod>, but they only include events that occurred during the lifetime of the current instance. To analyze log files after a crash or restart, we recommend setting up a central logging system.
Log files can only be viewed for the currently running container. Therefore, it is necessary to set up persistent storage to access log files from containers that have already stopped or restarted.
6.2. Backup
The architecture of the CaaS consists of various independent components that generate and process different information. If data backup is required, it must be performed depending on the respective component.
A backup of the information stored in the CaaS must be performed using the standard mechanisms of MongoDB. Either a copy of the underlying files can be created or mongodump can be used.
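A sketch of a dump via kubectl, assuming the pod name caas-mongo-0 from the installation chapter; depending on your configuration, the MongoDB container may need to be selected with -c and authentication parameters for mongodump may be required:

kubectl --namespace=caas exec caas-mongo-0 -- mongodump --archive --gzip > caas-backup.archive.gz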
6.3. Update
Operation of the CaaS platform with Helm in Kubernetes allows updating to a new version without requiring a reinstallation.
Before updating the MongoDB database, a backup is strongly recommended.
The helm list --all-namespaces command first returns a list of all already installed Helm charts. This list contains both the version and the namespace of the corresponding release.
$ helm list --all-namespaces
NAME             NAMESPACE     REVISION   UPDATED              STATUS     CHART         APP VERSION
firstinstance    integration   1          2019-12-11 15:51..   DEPLOYED   caas-2.10.4   caas-2.10.4
secondinstance   staging       1          2019-12-12 09:31..   DEPLOYED   caas-2.10.4   caas-2.10.4
To update a release, the following steps must be carried out one after the other:
- Transfer the settings
  To avoid losing the previous settings, the custom-values.yaml file with which the initial installation of the Helm chart was carried out is required.
- Adoption of further adjustments
  If there are adjustments to files (e.g. in the config directory), these must also be adopted.
- Update
  After performing the previous steps, the update can be started. It replaces the existing installation with the new version without any downtime. To do this, execute the following command, which starts the process:
  helm upgrade RELEASE_NAME caas-20.12.4.tgz --values /path/to/custom-values.yaml
7. Appendix
7.1. Troubleshooting: Known Issues
7.1.1. File upload with PUT request fails
The error messages

- E11000 duplicate key error collection: [some-file-bucket].chunks index: files_id_1_n_1 dup key, or
- error updating the file, the file bucket might have orphaned chunks

indicate that orphaned file chunks exist in the MongoDB data. This orphaned data can be deleted using the following mongo shell script:
// Name of the file bucket to clean up (e.g., my-bucket.files)
var filesBucket = "{YOUR_FILE_BUCKET_NAME}";
var chunksCollection = filesBucket.substring(0, filesBucket.lastIndexOf(".")) + ".chunks";
db[chunksCollection].aggregate([
// avoid accumulating binary data in memory
{ $unset: "data" },
{
$lookup: {
from: filesBucket,
localField: "files_id",
foreignField: "_id",
as: "fileMetadata",
}
},
{ $match: { fileMetadata: { $size: 0 } } }
]).forEach(function (c) {
db[chunksCollection].deleteOne({ _id: c._id });
print("Removed orphaned GridFS chunk with id " + c._id);
});
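The script can be executed with the mongo shell, for example as follows; the connection string, database name, and script file name are placeholders:

mongo "mongodb://MONGODB-HOST:27017/<database>" remove-orphaned-chunks.js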
8. Help
The Technical Support of Crownpeak Technology GmbH provides expert technical support on all topics related to the FirstSpirit™ product. You can find further help on relevant topics in our community.
9. Disclaimer
This document is provided for information purposes only. Crownpeak Technology GmbH may change the contents hereof without notice. This document is not warranted to be error-free, nor subject to any other warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of merchantability or fitness for a particular purpose. Crownpeak Technology GmbH specifically disclaims any liability with respect to this document and no contractual obligations are formed either directly or indirectly by this document. The technologies, functionality, services, and processes described herein are subject to change without notice.