Note: If you have missed my previous articles on Docker and Kubernetes, you can find them here:
Application deployment models evolution.
Getting started with Docker.
Docker file and images.
Publishing images to Docker Hub and re-using them.
Docker- Find out what's going on.
Docker Networking- Part 1.
Docker Networking- Part 2.
Docker Swarm-Multi-Host container Cluster.
Docker Networking- Part 3 (Overlay Driver).
Introduction to Kubernetes.
Kubernetes- Diving in (Part 1)-Installing Kubernetes multi-node cluster.
Kubernetes-Diving in (Part2)- Services.
Kubernetes- Infrastructure As Code with Yaml (part 1).
Kubernetes- Infrastructure As Code Part 2- Creating PODs with YAML.
Kubernetes Infrastructure-as-Code part 3- Replicasets with YAML.
Kubernetes Infrastructure-as-Code part 4 - Deployments and Services with YAML.
Deploying a microservices APP with Kubernetes.
Kubernetes- Time based scaling of deployments with python client.
Kubernetes Networking - The Flannel network explained.
Kubernetes- Installing and using kubectl top for monitoring nodes and PoDs
Kubernetes Administration- Scheduling
Kubernetes Administration- Storage
Kubernetes Administration- Users
Kubernetes Administration - Network Policies with Calico network plugin
Kubernetes Administration - Managing Kubernetes Clusters with Rancher
Kubernetes Administration - Package Management with Helm
Kubernetes Administration - Monitoring cluster health with Prometheus
In my previous article, I showed you how to install Prometheus and use it to monitor Kubernetes nodes. Monitoring nodes is a fundamental function that any standard Linux monitoring tool can accomplish, and Prometheus is far more capable than this simple task requires. In addition to monitoring nodes, Prometheus can monitor the components of any microservice application.
Microservice Monitoring- the need
To understand why Prometheus is required, let's consider a simple microservice application: the voting app. The app allows users to vote for either "Cats" or "Dogs" and displays the results of the vote on a webpage like this:
Here is the architecture of the application:
Here is a quick flow of how the app works:
- Users vote through the "voting-app" page.
- The votes get saved to the in-memory cache "Redis".
- Periodically, the "worker" service reads from Redis and saves the votes to the PostgreSQL DB.
- The "result-app" reads from the PostgreSQL DB and displays the results on the webpage.
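To make the flow concrete, here is a toy, in-memory model of it. This is only a sketch: the class and method names are made up for illustration, a deque stands in for Redis and a Counter stands in for the PostgreSQL DB. It is not the voting app's real code.

```python
from collections import Counter, deque


class VotingPipeline:
    """Toy model of: vote page -> Redis -> worker -> PostgreSQL -> result page."""

    def __init__(self):
        self.redis_queue = deque()  # stand-in for the Redis cache
        self.db = Counter()         # stand-in for the PostgreSQL DB

    def vote(self, choice):
        """The voting page only writes to the cache."""
        if choice not in ("cats", "dogs"):
            raise ValueError("vote must be 'cats' or 'dogs'")
        self.redis_queue.append(choice)

    def run_worker_once(self):
        """The worker drains the cache into the DB and returns how many votes it moved."""
        moved = 0
        while self.redis_queue:
            self.db[self.redis_queue.popleft()] += 1
            moved += 1
        return moved

    def results(self):
        """The result page only reads from the DB."""
        return dict(self.db)
```

Notice that results() stays empty until run_worker_once() is called. Votes that are cast while the worker is not running never reach the result page.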
Each component of the voting app is a deployment backed by a replicaset, which enables horizontal scaling (by default, just 1 replica of each component is deployed).
root@sathish-vm2:/home/sathish/example-voting-app# kubectl get deployment -n vote
NAME     READY   UP-TO-DATE   AVAILABLE   AGE
db       1/1     1            1           11m
redis    1/1     1            1           11m
result   1/1     1            1           11m
vote     1/1     1            1           11m
worker   1/1     1            1           11m
root@sathish-vm2:/home/sathish/example-voting-app# kubectl get rs -n vote
NAME                DESIRED   CURRENT   READY   AGE
db-684b9b49fd       1         1         1       11m
redis-67db9bd79b    1         1         1       11m
result-86d8966d87   1         1         1       11m
vote-6d4876585f     1         1         1       11m
worker-7cbf9df499   1         1         1       11m
In this architecture, let's assume the "worker" pod crashes and is not able to come up again- this would mean
a) Data is not periodically read from Redis and written to Postgres DB.
b) Result page displays stale/incorrect voting results.
Users will probably complain about the above symptoms, and SysAdmins/Operators/DevOps would have to back-track from the symptoms to isolate the root cause, i.e., the worker pod crash. Imagine the same problem in a complex app with hundreds of microservices, deployed in an environment with many clusters, each with multiple nodes. You begin to get the picture of the troubleshooting pain points from a DevOps/Operator perspective.
What if there was a way to periodically gather "metrics"/"heartbeats" from running Pods and push notifications based on them? These notifications could be aggregated in a data store and presented in a searchable form. Further, alerts could be defined for specific notifications. To tie this back to the example: the worker Pod would probably have experienced "Out Of Memory" errors before it crashed. If an alert had been sent, the operator could have migrated the Pod to another server with more resources.
Note: Kubernetes itself provides several ways to place Pods on suitable nodes: Taints/Tolerations, Node Affinity, Resource Requests, etc. There are also better ways/tools to do this, especially in cloud deployments.
Prometheus Architecture
Here is the Prometheus architecture diagram from the official docs page.
There is a lot going on in the diagram- so let's try to understand the components at a high level:
Prometheus server:
The server component is made up of the following:
a) Time-Series DB (TSDB): This is a data store for time-series data like CPU/memory usage, exceptions, alerts, etc.
b) HTTP server: This serves the Prometheus web UI. The UI queries the Prometheus server with PromQL. Any other tool that supports PromQL, such as Grafana, can be used instead of or in addition to the built-in UI.
c) Retrieval component: Pulls metrics data from services, servers, etc., and saves it in the TSDB.
As noted above, the "retrieval" component pulls data. So there must be an agent on each server or microservice that collects metrics and exposes them for Prometheus to pull. And there is!
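Here is a rough sketch of what one pull cycle looks like. It is not how Prometheus is actually implemented: the function names, the dict used as a "TSDB", and the assumption that every target returns simple "name value" lines (with no trailing timestamps) are all mine. The one real detail I borrowed is that Prometheus records a synthetic "up" sample for every scrape, 1 if the target answered and 0 if it did not.

```python
import time
from collections import defaultdict


def parse_samples(text):
    """Parse 'name value' lines; comment lines starting with '#' are skipped."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.rsplit(None, 1)  # split on the last run of whitespace
        if len(parts) != 2:
            continue                  # not a 'name value' pair, ignore it
        name, value = parts
        samples[name] = float(value)
    return samples


def scrape_once(targets, fetch, tsdb, now=None):
    """Pull metrics from every target once and append them to the store.

    fetch is any callable that takes a target and returns its metrics text
    (in real life, an HTTP GET). tsdb maps (target, metric) to a list of
    (timestamp, value) pairs.
    """
    now = time.time() if now is None else now
    for target in targets:
        try:
            text, up = fetch(target), 1.0
        except OSError:  # connection refused, timeout, ...
            text, up = "", 0.0
        tsdb[(target, "up")].append((now, up))
        for name, value in parse_samples(text).items():
            tsdb[(target, name)].append((now, value))


if __name__ == "__main__":
    store = defaultdict(list)
    # Hypothetical target name and metric, served from memory instead of HTTP.
    scrape_once(["vote:8080"], lambda target: "votes_total 3\n", store, now=0.0)
    print(dict(store))
```

Because the server does the pulling, a target that has crashed shows up right away as up = 0, which is the signal we were missing in the worker example.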
Before getting into the details of what those "agents" are, let's try to understand what can be monitored with Prometheus. Prometheus can be used to monitor Windows and Linux bare-metal servers, web servers like Apache, and application services. What gets measured depends on the target being monitored. For example, on bare-metal servers running Linux the measurements are CPU and memory utilization; for apps they could be errors, exceptions, authentication failures, etc. The measurements used to monitor targets are called metrics.
Metrics
Metrics are text-based monitoring data exposed by targets, in other words, by the stuff that is being monitored. Based on the Prometheus docs, metrics are of the following types:
Counter: A cumulative value that only goes up (or resets to zero on restart). For example, the number of authentication failures in an app.
Gauge: A value that can go up or down, for example, memory usage.
Histogram: Samples observations, such as request durations, and counts them in configurable buckets. For example, how many requests completed within 0.1s, 0.5s, and 1s.
Summary: Similar to a histogram, but it reports quantiles that are calculated on the client side.
Each metric also has a HELP line that explains what the metric is and a TYPE line that states its type.
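To see how the types differ, here is a small hand-rolled sketch of a counter, a gauge, and a histogram, plus functions that print them in the text format Prometheus reads. The class names, metric names, and bucket boundaries are made up for illustration. In a real app you would use an official client library rather than write this yourself.

```python
import math


class Counter:
    """Cumulative value that can only go up."""
    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("a counter can only increase")
        self.value += amount


class Gauge:
    """Value that can be set to anything, so it can go up or down."""
    def __init__(self):
        self.value = 0.0

    def set(self, value):
        self.value = float(value)


class Histogram:
    """Counts observations in cumulative buckets and tracks their sum and count."""
    def __init__(self, buckets):
        self.bounds = sorted(buckets) + [math.inf]
        self.counts = [0] * len(self.bounds)
        self.sum = 0.0
        self.count = 0

    def observe(self, value):
        self.sum += value
        self.count += 1
        for i, bound in enumerate(self.bounds):
            if value <= bound:        # cumulative: every bucket with a large
                self.counts[i] += 1   # enough upper bound counts the sample


def render_simple(name, help_text, metric_type, value):
    """Render a counter or gauge with its HELP and TYPE lines."""
    return "\n".join([
        f"# HELP {name} {help_text}",
        f"# TYPE {name} {metric_type}",
        f"{name} {value:g}",
    ])


def render_histogram(name, help_text, hist):
    """Render a histogram as _bucket, _sum and _count series."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} histogram"]
    for bound, count in zip(hist.bounds, hist.counts):
        le = "+Inf" if bound == math.inf else f"{bound:g}"
        lines.append(f'{name}_bucket{{le="{le}"}} {count}')
    lines.append(f"{name}_sum {hist.sum:g}")
    lines.append(f"{name}_count {hist.count}")
    return "\n".join(lines)


if __name__ == "__main__":
    failures = Counter()
    failures.inc()
    print(render_simple("auth_failures_total", "Failed logins.", "counter", failures.value))
    latency = Histogram([0.1, 0.5, 1])
    for seconds in (0.05, 0.3, 2.0):
        latency.observe(seconds)
    print(render_histogram("request_duration_seconds", "Request latency.", latency))
```

The histogram buckets are cumulative: the le="0.5" bucket also includes everything counted in the le="0.1" bucket, and le="+Inf" always equals the total count.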
The Prometheus server (retrieval component) pulls these metrics from targets (servers, apps, etc.) over HTTP. This leads to the next question: who generates these metrics? Metrics are generated by exporters running on targets.
Exporters
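Some software exposes Prometheus metrics natively (Kubernetes components and etcd, for example), while other services, such as Redis, need a separate exporter running alongside them. Either way, the exporting code does three things:
- Fetches metrics from the target.
- Converts them to a format that Prometheus understands (text-based).
- Exposes them at an HTTP endpoint (typically http://host/metrics) from where Prometheus can retrieve them.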
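A bare-bones exporter can be sketched with nothing but the Python standard library. Treat the following as an illustration only: collect_metrics() returns a hard-coded, made-up metric where a real exporter would query its target, and the function names are my own. In practice you would reach for an existing exporter or an official client library.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def collect_metrics():
    """Steps 1 and 2: fetch values from the target and convert them to text.

    Hard-coded here; a real exporter would ask the service it watches.
    """
    return (
        "# HELP demo_votes_total Votes received (illustrative value).\n"
        "# TYPE demo_votes_total counter\n"
        "demo_votes_total 42\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    """Step 3: expose the text at /metrics."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = collect_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def start_exporter(port=0):
    """Start the exporter in a background thread (port 0 picks a free port)."""
    server = HTTPServer(("127.0.0.1", port), MetricsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_metrics(host, port):
    """Do what a scraper does: a plain HTTP GET on /metrics."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", "/metrics")
        resp = conn.getresponse()
        return resp.status, resp.read().decode("utf-8")
    finally:
        conn.close()


if __name__ == "__main__":
    server = start_exporter()
    print(fetch_metrics("127.0.0.1", server.server_address[1])[1])
    server.shutdown()
    server.server_close()
```

The important point is that the exporter never contacts Prometheus. It only answers when it is asked, so the list of targets to scrape lives in the Prometheus server's configuration, not in the exporter.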
A list of available exporters can be found in the Prometheus documentation. As you will see there, exporters exist for most commonly used services. When developing an app, it is also possible to use a client library in the language of your choice and expose metrics from your app at the metrics endpoint. This way, as an app developer, you can tell the DevOps team which metrics to expect from the different app components.
Alert Manager
Alerting rules are defined on the Prometheus server, and when a configured rule is triggered the alert gets pushed to Alertmanager. Alertmanager in turn can notify users through email, text message, or any other configured channel.
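Real alerting rules are written in the Prometheus rule files as a PromQL expression plus an optional "for" duration. The Python sketch below only imitates that logic so the idea is easier to follow. The function name and the sample numbers are made up; the three states (inactive, pending, firing) are the ones Prometheus uses.

```python
def alert_state(samples, threshold, for_seconds):
    """Return 'inactive', 'pending' or 'firing' for a simple threshold rule.

    samples is a list of (timestamp_seconds, value) pairs sorted by time.
    The rule is breached while value > threshold, and it only fires once
    the breach has lasted at least for_seconds without interruption.
    """
    breach_start = None
    for timestamp, value in samples:
        if value > threshold:
            if breach_start is None:
                breach_start = timestamp  # a new breach begins here
        else:
            breach_start = None           # back to normal, reset the clock
    if breach_start is None:
        return "inactive"
    latest = samples[-1][0]
    return "firing" if latest - breach_start >= for_seconds else "pending"


if __name__ == "__main__":
    # Hypothetical memory usage of the worker Pod, as a fraction of its limit.
    memory = [(0, 0.50), (60, 0.95), (120, 0.96), (180, 0.97)]
    print(alert_state(memory, threshold=0.9, for_seconds=120))
```

The "for" duration is what keeps a short spike from paging someone in the middle of the night. In the worker example, a rule like this on memory usage would have warned the operator before the Pod crashed.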
Data Storage and retrieval
Prometheus stores data as time series (non-relational data) in its TSDB. The data can be queried with the PromQL language. In fact, the Prometheus web UI uses PromQL to query the data.
Grafana is another tool with slick UI/dashboards and PromQL querying capabilities which can be used to query data from the Prometheus server and display results.
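Tools like Grafana get their data by sending PromQL to the Prometheus HTTP API, and you can do the same from a script. Here is a minimal sketch using only the standard library. The /api/v1/query endpoint is the documented one for instant queries; the server address and the job label in the example are assumptions, so replace them with your own.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url, promql):
    """Build the instant-query URL for a PromQL expression."""
    params = urllib.parse.urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{params}"


def instant_query(base_url, promql, timeout=5):
    """Run an instant query and return the list of results.

    Needs a reachable Prometheus server at base_url.
    """
    url = build_query_url(base_url, promql)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        payload = json.load(resp)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload}")
    return payload["data"]["result"]


if __name__ == "__main__":
    # Hypothetical server address and job label.
    print(build_query_url("http://localhost:9090", 'up{job="worker"}'))
```

Calling instant_query("http://localhost:9090", "up") against a running server returns one entry per scraped target, each with its labels and its latest value.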
Where's the catch?
All of the above makes Prometheus a great monitoring tool suitable for most scenarios. However, remember that a Prometheus server is a single point of data aggregation. Hence there are limits to the number of metrics/endpoints it can monitor, based on the disk size, CPU, and memory of the server where Prometheus is installed. The direct way around this is vertical scaling of the server where Prometheus is running, i.e., increasing its CPU, RAM, and disk. The alternatives are to limit the number of endpoints monitored per Prometheus instance or to decrease the number of metrics collected.
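To get a feel for the disk side of this limit, the Prometheus storage docs give a rough rule of thumb: needed disk space = retention time in seconds x ingested samples per second x bytes per sample, with a sample taking roughly 1 to 2 bytes on average. Here is a back-of-the-envelope sketch of that formula. The function name and the example numbers are mine, and the result is an estimate, not a guarantee.

```python
def estimate_disk_bytes(active_series, scrape_interval_s, retention_days,
                        bytes_per_sample=2.0):
    """Rough disk estimate: retention x samples per second x bytes per sample."""
    samples_per_second = active_series / scrape_interval_s
    retention_seconds = retention_days * 24 * 60 * 60
    return retention_seconds * samples_per_second * bytes_per_sample


if __name__ == "__main__":
    # Illustrative inputs: 100,000 series scraped every 15s, kept for 15 days.
    needed = estimate_disk_bytes(100_000, 15, 15)
    print(f"{needed / 1e9:.1f} GB")
```

The formula also shows which knobs you have when the disk fills up: fewer series, a longer scrape interval, or a shorter retention period.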
I hope this was informative. Special thanks to everyone who shared their feedback on my previous article. Till next time, ciao and have a great weekend.