20.12.2024
Optimized Resource Utilization and Cost Control.
Efficient Runtimes with KEDA: Dynamic Autoscaling for Kubernetes Clusters
Kubernetes is powerful, but without optimized operating times it can consume unnecessary resources and incur unnecessary costs. KEDA (Kubernetes Event-Driven Autoscaling) lets you scale workloads dynamically and pause them outside defined operating hours. In this blog post, we show how you can align your cluster with working hours – for more efficiency and reduced hosting costs.
Introduction to Kubernetes Autoscaling
Kubernetes already brings a lot of automation and many possibilities for efficiency optimization: assigning workloads to nodes, readiness and liveness probes that restart hanging containers, and dynamically adding or removing nodes through the Cluster Autoscaler. Dynamically scaling individual workloads based on resource utilization is also possible through horizontal Pod autoscaling. However, not all requirements are covered. What about a cluster that is only needed on workdays during working hours? In this blog post, we examine how you can scale Kubernetes workloads on a time-based schedule using KEDA.
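For comparison, a classic resource-based HorizontalPodAutoscaler might look roughly like this (the Deployment name my-deployment and the 80 % CPU target are placeholder values):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-deployment-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-deployment
  minReplicas: 1
  maxReplicas: 5
  metrics:
  # scale out when average CPU utilization across the pods exceeds 80 %
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80

This works well for load-driven scaling, but it cannot express "scale to 0 every evening", which is where KEDA comes in.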
What is KEDA?
KEDA picks up the idea of Pod autoscaling and extends it: workloads can be scaled not only based on resource utilization, but also based on arbitrary events. These events can come from database queries, the metrics of a message broker, a cron schedule, and much more.
Currently, KEDA supports 71 so-called scalers that you can use as a basis for event-based autoscaling. You only need to create a ScaledObject, a custom resource that KEDA defines via a CustomResourceDefinition. The keda-operator, which is deployed in the cluster, then dynamically creates a corresponding HorizontalPodAutoscaler resource to scale your selected workloads based on the specified events.
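If you want to try this yourself, KEDA can be installed via Helm, for example (the keda namespace is a common convention, not a requirement):

# add the official KEDA chart repository
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
# deploy the keda-operator into its own namespace
helm install keda kedacore/keda --namespace keda --create-namespace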
Cron Scaler to Define Operating Times
To scale the cluster's workloads to 0 outside operating hours, we use the Cron scaler. It allows you to define a cron schedule with start and end times within which the workloads are scaled to a desired number of replicas. Outside the schedule, the workloads are scaled down to the specified minimum.
An exemplary ScaledObject for the described scenario shows how simple the configuration for our use case is:
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: cron-scaledobject
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-deployment
  minReplicaCount: 0
  maxReplicaCount: 3
  cooldownPeriod: 60
  triggers:
  - type: cron
    metadata:
      timezone: Europe/Berlin
      start: 0 6 * * 1-5
      end: 0 20 * * 1-5
      desiredReplicas: "3"
What happens here?
The selected deployment is my-deployment in the default namespace. minReplicaCount is 0, so the deployment is scaled to 0 outside operating hours. The start and end of the operating window are specified with the cron schedules 0 6 * * 1-5 and 0 20 * * 1-5. In other words, from Monday to Friday between 6:00 and 20:00 the deployment is scaled to 3 replicas, as specified by the parameter desiredReplicas. This ensures more efficient resource usage.
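Assuming the manifest above is saved as cron-scaledobject.yaml (the file name is our choice), you can apply it and watch KEDA do its work; current KEDA versions name the generated HPA keda-hpa-<scaledobject-name>:

kubectl apply -f cron-scaledobject.yaml
kubectl get scaledobject cron-scaledobject
# the HorizontalPodAutoscaler created by the keda-operator
kubectl get hpa keda-hpa-cron-scaledobject
# outside the schedule, the deployment should report 0 replicas
kubectl get deployment my-deployment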
What are the cost savings?
The really exciting question is: how large are the efficiency gains, and thus the cost savings, from implementing operating hours? That depends strongly on your setup.
Prerequisite
The cluster autoscaler must be active so that unused nodes can be removed. The amount of savings then depends on how many workloads can be paused outside of operating hours.
Example
- If 10 applications are running in your cluster and only one of them is scaled down outside operating hours, the effect remains small.
- If you can pause 9 of the workloads, the demand for nodes drops significantly, which saves noticeable costs.
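A rough back-of-the-envelope calculation for the schedule above: the operating window covers 14 hours × 5 days = 70 of the 168 hours in a week. Nodes that exist solely for pausable workloads could therefore be removed for 98 of 168 hours, roughly 58 % of the week, an upper bound that shrinks with every workload that has to keep running.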
Conclusion
KEDA makes it very easy to dynamically scale Kubernetes workloads based on time-based events. The installation and the specification of ScaledObjects are uncomplicated and take little time.
Even if the exact cost savings cannot be predicted in general, using KEDA pays off in the long term whenever your workloads follow a time-based pattern.
Frequently Asked Questions
1. What is KEDA and how does it differ from Kubernetes horizontal Pod autoscaler (HPA)?
KEDA (Kubernetes Event-Driven Autoscaling) extends the classic Horizontal Pod Autoscaler (HPA) in Kubernetes. While the HPA scales workloads based on resources like CPU or memory utilization, KEDA enables autoscaling based on external events, including database queries, message broker metrics, or time-controlled triggers. KEDA works as a complement to the HPA by feeding it external metrics, creating more flexible scaling options.
2. What advantages does KEDA offer for the dynamic autoscaling of Kubernetes workloads?
KEDA offers the following advantages:
- Flexibility: Scaling based on external events (e.g. Kafka, Prometheus or Azure Event Hubs).
- Zero Scaling: Workloads can be reduced to 0 pods when no resources are needed.
- Simple Integration: KEDA works seamlessly with HPA and uses existing Kubernetes mechanisms.
- Cost Savings: Through needs-based scaling, unnecessary resources can be avoided.
- Broad Support: With over 70 scalers, KEDA is widely deployable.
3. How does the Cron Scaler of KEDA work for time-based autoscaling?
The Cron Scaler in KEDA enables time-controlled scaling of workloads. You define a Cron schedule with start and end times as well as the desired number of replicas. Outside this time window, workloads are scaled to the specified minimum number (e.g., 0 Pods).
Example: A deployment can be scaled up to 3 Pods from Monday to Friday between 6:00 and 20:00 and reduced to 0 outside these times.
4. Which event sources (Scalers) does KEDA support for autoscaling?
KEDA supports over 70 Scalers, including:
- Message brokers: Kafka, RabbitMQ, Azure Event Hubs.
- Databases: MySQL, PostgreSQL, MongoDB.
- Metric Sources: Prometheus, AWS CloudWatch, Azure Monitor.
- Time-based Triggers: Cron Schedules.
- Others: Redis, GitHub Actions, AWS SQS, and more.
KEDA can also be extended with custom scalers to integrate almost any event source.
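To illustrate, a trigger for the Prometheus scaler could look like the following sketch (server address, query, and threshold are placeholder values):

triggers:
- type: prometheus
  metadata:
    # placeholder address of a Prometheus instance in the cluster
    serverAddress: http://prometheus.monitoring.svc:9090
    # scale out when the request rate exceeds the threshold
    query: sum(rate(http_requests_total{app="my-app"}[2m]))
    threshold: "100"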
5. What does Zero Scaling mean and how does KEDA help?
Zero Scaling means that workloads are completely deactivated by setting their number of Pods to 0. KEDA enables this by using event sources that activate Pods when needed. This drastically reduces resource usage when no events are present and helps save costs while maximizing cluster efficiency.
6. Is KEDA suitable for all Kubernetes applications or are there limitations?
KEDA is suitable for many applications, especially for:
- Event-driven applications (e.g., message broker consumers, database operations).
- Workloads with time-based requirements (e.g., specific business hours).
- Metric-based applications that react to external signals.
Limitations:
- KEDA is designed for applications using scalable architectures (e.g., Deployments, Jobs).
- Workloads that must run continuously benefit less from KEDA, as Zero Scaling is not possible.
- Your cluster needs an active Cluster Autoscaler to remove nodes during Zero Scaling.
Tip: For applications with fixed resource requirements or continuous availability, the standard HPA may be sufficient.