Elastic Stack on Kubernetes

Ian Muge
May 12, 2020

Background

During a technical assessment, I was tasked with automating the deployment of a complete, scalable ELK stack on Kubernetes (K8s) to support end-to-end monitoring and logging for cloud-ready applications.

The main aspects of the challenge were:

  • Configurations decoupled from runtime using ConfigMaps
  • Set up readiness and liveness probes
  • Set up resource limits and quotas
  • Jenkins Job triggered by a webhook (Not covered in this article)
  • Automated Tests to ensure the functionality of the stack (Not covered in this article)

Architecture

ELK Stack Architecture setup and data flow

The setup was deployed on Google Kubernetes Engine (GKE) on a standard cluster configured to allow autoscaling. The Beats components (Metricbeat, Auditbeat and Filebeat) were deployed as DaemonSets to collect metrics and logs from the Kubernetes cluster nodes and ship them to Logstash. Logstash was deployed as a Deployment, with a service load-balancing and exposing the Beats ingest port and an API port. Logstash was configured with a pipeline to ingest the data from the Beats daemons and ship it to Elasticsearch. Elasticsearch was deployed as a StatefulSet with 3 initial replicas to ensure a basic healthy cluster. Kibana interfaces with Elasticsearch to provide visualizations and dashboards.

Deployment

We shall be working with version 7.6.2 of the container images.

Base Setup

We define a new namespace and StorageClass to host our stack. We could go further and define quotas for the pods that will be deployed within the namespace, but for now we shall leave them undefined (a sketch of what that could look like follows the base manifests below).

We define a persistent StorageClass to be used for Elasticsearch data, to maintain state through deployments. The StorageClass below is provisioned for GKE; you can look up the definitions for other storage providers at https://kubernetes.io/docs/concepts/storage/ and https://kubernetes.io/docs/concepts/storage/storage-classes/.

apiVersion: v1
kind: Namespace
metadata:
  name: monitoring
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: ssd
  namespace: monitoring
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
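
If we did want to cap what the namespace can consume, a ResourceQuota in the same namespace is one option. The sketch below is purely illustrative; the numbers are not from the repository and would need tuning to the cluster.

apiVersion: v1
kind: ResourceQuota
metadata:
  name: monitoring-quota   # illustrative name
  namespace: monitoring
spec:
  hard:
    pods: "30"
    requests.cpu: "8"
    requests.memory: 16Gi
    limits.cpu: "16"
    limits.memory: 32Gi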

Elasticsearch

We override the default configurations using environment variables. (I attempted to use a ConfigMap and inject it into the config folder, but during cluster bootstrapping the directory becomes read-only and does not allow Elasticsearch to create a number of files required for bootstrapping, like the temporary keystore file.) As this will be an internal Elastic service, we shall not provide an external load-balancing service; we shall only provision an internal one to receive data from Logstash and APM, and to expose endpoints to Kibana.
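
As a rough sketch, the environment-variable overrides take the following shape. The pod and service names here (elasticsearch-0, elasticsearch-svc) and the heap size are assumptions for illustration; the authoritative values are in the repository manifest.

env:
- name: cluster.name
  value: elasticsearch
- name: node.name
  valueFrom:
    fieldRef:
      fieldPath: metadata.name
# Assumed StatefulSet pod and service names; adjust to match the actual manifest
- name: discovery.seed_hosts
  value: "elasticsearch-0.elasticsearch-svc,elasticsearch-1.elasticsearch-svc,elasticsearch-2.elasticsearch-svc"
- name: cluster.initial_master_nodes
  value: "elasticsearch-0,elasticsearch-1,elasticsearch-2"
- name: ES_JAVA_OPTS
  value: "-Xms1g -Xmx1g"   # illustrative heap size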

We roll out Elasticsearch in a StatefulSet so as to maintain data across deployment revisions. Following advice from the Elastic documentation, we shall have 3 replicas to have a green, stable cluster and avoid the split-brain problem — when two master-eligible nodes lose connectivity briefly and each elects itself as master without a proper node consensus.

To try and alleviate the chances of data loss, we instruct the cluster to attempt not to schedule the Elastic pods on the same nodes using podAntiAffinity.

spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values:
              - elasticsearch
          topologyKey: kubernetes.io/hostname

Additionally, we expose port 9200 for API access to the cluster and port 9300 for communication between the Elasticsearch cluster pods. We define liveness and readiness probes to ensure the uptime and health of our cluster, especially during updates, and to allow for a semblance of self-healing. To avoid cascading failures as a result of delayed pods and to allow our cluster to properly bootstrap, we delay the first probe by 30 seconds so the service can fully start up before we test the health of the cluster nodes. Using the previously defined StorageClass, we have a volumeClaimTemplate that defines the storage holding the data, which persists across deployments.
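
A sketch of those two pieces is shown below. The probe endpoints, timings and storage size are illustrative assumptions; the authoritative versions are in the repository manifest.

# In the elasticsearch container spec (assumed probe endpoints and timings):
readinessProbe:
  httpGet:
    path: /_cluster/health?local=true
    port: 9200
  initialDelaySeconds: 30
  periodSeconds: 10
livenessProbe:
  tcpSocket:
    port: 9300
  initialDelaySeconds: 30
  periodSeconds: 10
# At the StatefulSet level, using the "ssd" StorageClass defined earlier:
volumeClaimTemplates:
- metadata:
    name: data
  spec:
    accessModes: ["ReadWriteOnce"]
    storageClassName: ssd
    resources:
      requests:
        storage: 10Gi   # illustrative size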

Due to the nature of the data volume mounts and system requirements, we need to prepare the container for Elasticsearch using initContainers. We modify the permissions of the data folder and tune the kernel settings of the new container, as shown below.

- name: fix-permissions
  image: busybox
  command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
  securityContext:
    privileged: true
  volumeMounts:
  - name: data
    mountPath: /usr/share/elasticsearch/data
- name: increase-vm-max-map
  image: busybox
  command: ["sysctl", "-w", "vm.max_map_count=262144"]
  securityContext:
    privileged: true
- name: increase-fd-ulimit
  image: busybox
  command: ["sh", "-c", "ulimit -n 65536"]
  securityContext:
    privileged: true

The rest of the current manifest is directly as defined on: https://github.com/ianmuge/elk-automation-challenge/blob/master/elasticsearch.yml

We can further extend the availability of the cluster (not a consideration now) by creating a multi-tier structure with specialized data, coordinating and master nodes (doubling as ingest nodes). This is not necessary, but it is provided as an option in https://github.com/ianmuge/elk-automation-challenge/blob/master/elasticsearch-specialized.yml. The data nodes are the work-horses of the entire cluster, so we can assign more resources to them than to the other node types. We could further extend this by adding ML nodes, but haven't done so in our custom manifest. The role of the coordinating nodes should not be overstated, as the data nodes can perform the same function directly without any problem; it is just easier to abstract the routing of requests and the organization of the responses coming from the nodes.
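
In Elasticsearch 7.x the specialization boils down to toggling the node role settings per tier. As a minimal sketch (the specialized manifest in the repository is the reference), a data-only node could be configured along these lines:

# Environment overrides for a data-only node (illustrative)
env:
- name: node.master
  value: "false"
- name: node.data
  value: "true"
- name: node.ingest
  value: "false"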

Logstash

The Logstash setup is more straightforward than Elasticsearch: we deploy it as a K8s Deployment, which allows us to scale on-demand by increasing the number of replicas. We also expose an internal service bound on port 5044 to receive data from the Beats daemons, and port 9600 to expose the Logstash API. The ConfigMap is a bit more curious, though. The "logstash.yml" defines the configuration for Logstash itself, and the pipeline config defines the ingestion and forwarding pipeline, as shown below.

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: logstash
  namespace: monitoring
  labels:
    app: logstash
data:
  logstash.yml: |-
    http.host: "0.0.0.0"
    path.config: /usr/share/logstash/pipeline
    xpack.monitoring.enabled: true
    xpack.monitoring.elasticsearch.hosts: "http://elasticsearch-svc:9200"
    xpack.monitoring.elasticsearch.sniffing: true
  pipeline.conf: |-
    input {
      beats {
        port => 5044
      }
    }
    output {
      elasticsearch {
        hosts => ['http://elasticsearch-svc:9200']
        index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
      }
      stdout {
        codec => rubydebug
      }
    }

N.B:

  1. As Logstash and Elasticsearch are in the same namespace, their endpoints can be referenced by the service name, which is simple and straightforward, but as we shall see in Beats, we can alternatively use the FQDN when referencing from a different namespace.
  2. Given that many Beats services will be streaming to the same Logstash endpoint, we customize the indices they write to based on the Beat name, version and date, as shown in:
index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
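
Relating to note 1 above, the internal Service fronting Logstash (referenced elsewhere as logstash-svc) looks roughly as follows. This is a sketch built from the ports and names used in this article; the full manifest is in the repository.

apiVersion: v1
kind: Service
metadata:
  name: logstash-svc
  namespace: monitoring
  labels:
    app: logstash
spec:
  selector:
    app: logstash
  ports:
  - name: beats
    port: 5044
    targetPort: 5044
    protocol: TCP
  - name: api
    port: 9600
    targetPort: 9600
    protocol: TCP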

APM

Due to the nature of the services offered by APM, such as Real User Monitoring, you can expose the interface publicly if it is to receive data from services hosted in other environments; if it is just capturing data from an application hosted on the same cluster, an internal endpoint is enough. The basic configuration we have for APM is shown below.

apiVersion: v1
kind: ConfigMap
metadata:
  name: apm-server
  namespace: monitoring
  labels:
    app: apm-server
data:
  apm-server.yml: |-
    apm-server.host: "0.0.0.0:8200"
    apm-server.rum.enabled: true
    apm-server.capture_personal_data: true
    apm-server.kibana.path: /kibana
    apm-server.kibana.enabled: true
    apm-server.kibana.host: "http://kibana-svc:5601"
    setup.kibana.host: "http://kibana-svc:5601"
    monitoring.enabled: true
    monitoring.elasticsearch.hosts: "http://elasticsearch-svc:9200"
    output.logstash.hosts: 'logstash-svc:5044'
    # output.elasticsearch.hosts: '${ELASTICSEARCH_HOST:elasticsearch-svc}:${ELASTICSEARCH_PORT:9200}'
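
An application running in the cluster would then point its Elastic APM agent at the internal APM Server service. As a sketch, assuming the service is named apm-server-svc (the service and application names here are illustrative, and the exact variable names vary slightly per agent), the agent can be configured through environment variables in the application's Deployment:

env:
- name: ELASTIC_APM_SERVER_URL
  value: "http://apm-server-svc.monitoring.svc.cluster.local:8200"   # assumed service name
- name: ELASTIC_APM_SERVICE_NAME
  value: "my-app"   # hypothetical application name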

Kibana

The Kibana setup is the easiest, as it can just be deployed as a single node; but to enable some form of HA, we use a Deployment with 2 replicas. To identify the different pods we tag them using the server name and server UUID.

env:
- name: SERVER_UUID
  valueFrom:
    fieldRef:
      fieldPath: metadata.uid
- name: SERVER_NAME
  valueFrom:
    fieldRef:
      fieldPath: metadata.name

Considering that the service is public facing, we expose it using a load balancer, as shown below. Of note: we are forwarding requests on port 5601 to port 5601; we could still bind the public-facing port to whichever port we require, preferably 80, and we could go further and secure it using HTTPS. All query requests are forwarded to the Elasticsearch internal load-balanced API service, as defined in the config map.

apiVersion: v1
kind: Service
metadata:
  name: kibana-lb
  namespace: monitoring
  labels:
    app: kibana
spec:
  type: LoadBalancer
  selector:
    app: kibana
  ports:
  - name: http
    port: 5601
    targetPort: 5601
    protocol: TCP
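
The Kibana ConfigMap mentioned above is not reproduced in full here; at minimum it points Kibana at the internal Elasticsearch service, roughly like this (a sketch — the ConfigMap name and exact settings are assumptions, and the actual file lives in the repository):

apiVersion: v1
kind: ConfigMap
metadata:
  name: kibana   # assumed name
  namespace: monitoring
  labels:
    app: kibana
data:
  kibana.yml: |-
    server.host: "0.0.0.0"
    elasticsearch.hosts: "http://elasticsearch-svc:9200"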

Beats

Beats are lightweight data shippers. Their functionality is rather simple, intuitive and, honestly, pretty cool. They work on a "push" basis, meaning they aggregate the data on the nodes and push it to the listening server (defined as the output in the Beats configuration); the alternative would be a "pull" model, where the aggregation service pulls the data exposed by the nodes, as is the case with Prometheus. Each Beat is specially built to work best with a particular form of data:

  • Filebeat: ship log data
  • Metricbeat: ship node and service metrics
  • Auditbeat: ship security (audit) logs and data.
  • Journalbeat: ship systemd data
  • Packetbeat: ship network data

But I digress….

We deploy Beats as DaemonSets, meaning a replica pod is created on each Kubernetes node and ships data from that node to the ingestion engine, in this case Logstash. A bit of a difference comes about in the case of Metricbeat, as we deploy it both as a Deployment and as a DaemonSet. Most of the Beats configurations are similar with respect to how they handle the data once they collect it; the difference comes in when configuring where and how they get the data. It should be noted that we deploy the daemons in the "kube-system" namespace and forward logs to the "monitoring" namespace.

Using Auditbeat as an example, referenced below:

apiVersion: v1
kind: ConfigMap
metadata:
  name: auditbeat-config
  namespace: kube-system
  labels:
    k8s-app: auditbeat
data:
  auditbeat.yml: |-
    auditbeat.config.modules:
      # Mounted `auditbeat-daemonset-modules` configmap:
      path: ${path.config}/modules.d/*.yml #[1]
      # Reload module configs as they change:
      reload.enabled: false
    processors: #[2]
      - add_cloud_metadata:
      - add_host_metadata:
    output.logstash.hosts: 'logstash-svc.monitoring.svc.cluster.local:5044' #[3]
    setup.kibana.host: "http://kibana-svc.monitoring.svc.cluster.local:5601" #[4]
    setup.dashboards.enabled: true
    # setup.dashboards.index: "auditbeat-*"
    # output.elasticsearch.hosts: ['${ELASTICSEARCH_HOST:elasticsearch-svc.monitoring.svc.cluster.local}:${ELASTICSEARCH_PORT:9200}']

[1] We define where it will load modules from. Modules extend the basic functionality of the Beat; we load their configurations from other ConfigMaps, sampled below.

system.yml: |-
  - module: file_integrity
    paths:
    - /hostfs/bin
    - /hostfs/usr/bin
    - /hostfs/sbin
    - /hostfs/usr/sbin
    - /hostfs/etc
    exclude_files:
    - '(?i)\.sw[nop]$'
    - '~$'
    - '/\.git($|/)'
    scan_at_start: true
    scan_rate_per_sec: 50 MiB
    max_file_size: 100 MiB
    hash_types: [sha1]
    recursive: true

[2] We can add custom processors to pick up and tag specific data using the beat, in this case cloud metadata, host metadata and even Kubernetes node metadata.

[3] We define the output: where the data is shipped to

[4] To simplify our work, we can instruct Beats to preload dashboards into Kibana. This builds beautiful dashboards to visualize the data being shipped with little effort on our part.
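
A related note on the module sample under [1]: the /hostfs paths assume the DaemonSet mounts the node's root filesystem into the pod. The relevant pieces of the pod spec look roughly like this (a sketch; the full definitions are in the repository manifests):

# In the auditbeat container spec:
volumeMounts:
- name: hostfs
  mountPath: /hostfs
  readOnly: true
# In the pod spec:
volumes:
- name: hostfs
  hostPath:
    path: /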

We also note that we create RBAC rules and service accounts for the DaemonSets. This allows access to particular functions on specific resources in the Kubernetes environment, like nodes, namespaces and pods. It ensures that we provide the daemons with only the access required to perform their role while still leaving our environment relatively secure. This is done using a ClusterRole (defines what you have access to in the cluster), a ServiceAccount (defines what/who you are), and a ClusterRoleBinding (defines the mapping between the ServiceAccount and the ClusterRole).
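
For Auditbeat, the trio looks roughly like this. It is a sketch modelled on the standard Beats RBAC pattern (read access to nodes, namespaces and pods); the exact names and rules are in the repository manifests.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: auditbeat
  labels:
    k8s-app: auditbeat
rules:
- apiGroups: [""]
  resources: ["nodes", "namespaces", "pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: auditbeat
  namespace: kube-system
  labels:
    k8s-app: auditbeat
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: auditbeat
subjects:
- kind: ServiceAccount
  name: auditbeat
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: auditbeat
  apiGroup: rbac.authorization.k8s.io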

Specific beat details can be found in the beats folder in the repository.

Bring it together

We bring everything together using Kustomize; it collates and formats all our manifests into one handy, easy-to-deploy manifest. N.B. This is not compulsory, but it simplifies our work, especially when working with a CI/CD pipeline.

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- setup.yml
- elasticsearch.yml
- kibana.yml
- logstash.yml
- apm-server.yml
- beats/metricbeat.yml
- beats/auditbeat.yml
- beats/filebeat.yml

This then allows us to deploy using a single manifest.

kubectl kustomize . > compiled.yml
kubectl apply -f compiled.yml

Conclusion

This is not, in any way, fully exhaustive, but it should get you a complete, stable, highly available ELK cluster running on Kubernetes, ready to receive, index and process any data sent to it. Beats form a baseline for the ingestion of data and help us monitor how the cluster is performing. I have a number of suggestions below; you can add on to these.

Improvements

  • Secure Cluster using X-Pack
  • Add full-scale tests; we can use Kube-Test to enhance the tests.
  • Additional Beats components; you can ship from Windows devices using Winlogbeat or build your own using libbeat.
  • ILM policy to rollup indices
  • Split and specialize the Elastic search nodes

TL;DR

A manifest collection and guide to set up a highly available ELK stack cluster on Kubernetes, ingesting K8s node data and logs from Beats. All available at: https://github.com/ianmuge/elk-automation-challenge
