What is HorizontalPodAutoscaler in Kubernetes?

Kubernetes has emerged as a standard for container orchestration, providing a powerful platform for managing and scaling containerized applications.

Scalability is essential in software delivery, and Kubernetes offers built-in features that automatically scale application resources to match demand. As a result, applications can accommodate increased traffic without compromising performance. HorizontalPodAutoscaler (HPA) is one of these features.

This article explains what horizontal pod autoscaling is, how it works, and walks through an example of using HPA in Kubernetes.

HorizontalPodAutoscaler is a scaling-on-demand feature provided by Kubernetes as an alternative to manually scaling individual pods. HPA automatically scales the number of running pods up or down based on resource metrics, such as pod CPU or memory utilization, or on custom metrics, such as client requests per second.

How does HorizontalPodAutoscaler work?

Kubernetes requires the installation of the Metrics Server to enable autoscaling. The Metrics Server gathers the necessary metrics, such as CPU and memory utilization for the nodes and pods in your cluster. Its major function is to provide resource utilization metrics to Kubernetes autoscaler components.

The Metrics Server continuously monitors resource usage across the application workload. The observed metrics are compared against the target metrics declared in the HPA manifest.

If observed utilization rises above the target, the HPA scales the application up to meet demand; if it falls back below the target, the HPA scales it down again.
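Under the hood, the HPA controller computes the desired replica count using the formula documented by Kubernetes: desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). A quick shell sketch of that arithmetic, with made-up numbers for illustration:

```shell
# HPA scaling formula: desired = ceil(current * observed / target)
current_replicas=2
observed_cpu=90   # average CPU utilization across pods, percent (hypothetical)
target_cpu=60     # target utilization from the HPA spec (hypothetical)

# Integer ceiling division: ceil(a/b) == (a + b - 1) / b
desired=$(( (current_replicas * observed_cpu + target_cpu - 1) / target_cpu ))
echo "desired replicas: $desired"   # 2 * 90 / 60 = 3
```

With two pods running at 90 percent average utilization against a 60 percent target, the controller would scale out to three replicas.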

Deploying HPA in Kubernetes

This section is a walkthrough example of how HPA can be set up to automatically scale application pods based on CPU utilization.

We will learn how to:

  • Deploy a Metrics Server on kind.

  • Create HPA for our applications.

  • Test the HPA setup.

1. Deploy the Metrics Server

HPA relies on the Metrics Server to collect and expose resource utilization metrics. To deploy the Metrics Server on kind, run the following command:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.5.0/components.yaml

Note that kind nodes serve the kubelet API with self-signed certificates, so you will typically need to add the --kubelet-insecure-tls flag to the metrics-server container arguments before the deployment becomes ready.

Verify the deployment by running:

kubectl get svc -n kube-system
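Before moving on, you can also confirm that the metrics API is actually serving. Both commands below assume the cluster set up in the steps above:

```shell
# Check that the resource metrics API service is registered and available
kubectl get apiservice v1beta1.metrics.k8s.io

# Once the Metrics Server is ready, resource metrics become queryable
kubectl top nodes
```

If `kubectl top nodes` returns CPU and memory figures for your nodes, the Metrics Server is working.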

2. Create a sample deployment

You need a scaling target to demonstrate HorizontalPodAutoscaler. Create a sample deployment that includes CPU resource requests, since HPA computes utilization relative to what each pod requests.

The manifest below defines an Nginx deployment that runs a container from the latest Nginx image, along with a service that exposes it on port 80.

apiVersion: v1
kind: Service
metadata:
  name: my-nginx-service
  labels:
    app: web
spec:
  ports:
  - port: 80
  selector:
    app: web
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-nginx-deployment
  labels:
    app: web
spec:
  replicas: 1
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
        resources:
          limits:
            cpu: 500m
          requests:
            cpu: 200m
nginx.yaml

Run the following command to apply the deployment:

kubectl apply -f nginx.yaml

3. Configure HPA

Now that the application is up and running, create the autoscaler. You can use a manifest file or the kubectl command.

To autoscale with the kubectl method, run the following command:

kubectl autoscale deploy my-nginx-deployment --min=1 --max=5 --cpu-percent=60

The command creates an autoscaler that maintains between one and five replicas of my-nginx-deployment, scaling to keep average CPU utilization around 60 percent.

To use a manifest file:

  1. Create the manifest "hpa.yaml" with the content below.

  2. Apply the deployment.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-nginx-deployment
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60
hpa.yaml

nano hpa.yaml
kubectl apply -f hpa.yaml

You can verify the HPA deployment by running the command kubectl get hpa.

4. Check if HPA works

To test the autoscaler, you'll need to deploy another pod that will serve as a load generator for the sample application.

  1. Create the manifest "load-generator.yaml" with the content below.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: load-requests
  labels:
    app: load-requests
spec:
  replicas: 1
  selector:
    matchLabels:
      app: load-requests
  template:
    metadata:
      labels:
        app: load-requests
    spec:
      containers:
      - command:
        - "/bin/sh"
        - "-c"
        - "while true; do wget -q -O /dev/null my-nginx-service; done"
        name: load
        image: busybox:1.28
A simple load generator
  2. Run the following command to start the deployment:

kubectl apply -f load-generator.yaml

You can monitor the Nginx deployment CPU usage by running:

kubectl top pods

Wait a minute and check how HPA responds to the load. Run kubectl get hpa again to see the current utilization and replica count.
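To observe the scale-up as it happens, you can also stream status updates. The resource name below matches the HPA created earlier in this walkthrough:

```shell
# Watch the HPA status refresh as load pushes CPU above the 60% target
kubectl get hpa my-nginx-hpa --watch
```

As utilization crosses the target, you should see the replica count climb toward the configured maximum of five.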


Conclusion

By automating the process of scaling pods, HPA ensures that applications stay responsive even when workloads increase. Organizations can improve the performance and efficiency of containerized applications by utilizing Kubernetes' HPA.


Copyright ©2025 Educative, Inc. All rights reserved