Horizontal Pod Autoscaling (HPA)

A Horizontal Pod Autoscaler automatically scales the number of pods in a deployment based on observed CPU utilization or other metrics. It's very common in a Kubernetes environment to run a small number of pods in a deployment, then scale up automatically as CPU usage increases.

Assignment

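First, delete the replicas: 1 line from the testcpu deployment. This gives our new autoscaler full control over the number of pods. If the line stays, every time you re-apply the deployment the pod count is reset to 1, which fights the autoscaler.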

Create a new file called testcpu-hpa.yaml. Add the following YAML to it:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: testcpu-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: x
  minReplicas: x
  maxReplicas: x
  targetCPUUtilizationPercentage: x

Set the following values:

name (under scaleTargetRef): The name of the testcpu deployment
minReplicas: 1
maxReplicas: 4
targetCPUUtilizationPercentage: 50

This HPA will monitor the CPU usage of the pods in the testcpu deployment. Its goal is to scale the number of pods up or down so that the average CPU utilization across all pods stays around 50%, measured as a percentage of each pod's CPU request. As CPU usage increases, it adds pods. As CPU usage decreases, it removes them. If you're interested, the algorithm it uses is described in the Kubernetes documentation on Horizontal Pod Autoscaling.
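To make the scaling rule concrete, here is a rough shell sketch of the core calculation the Kubernetes documentation describes: desired replicas = ceil(current replicas * current utilization / target utilization), clamped between minReplicas and maxReplicas. The function name desired_replicas and its argument order are made up for this example, and the 10% tolerance is the documented default rather than something set in our manifest. The real controller also handles missing metrics, pods that are not ready, and stabilization windows, none of which appear here.

```shell
# Hypothetical helper, not part of kubectl or Kubernetes.
# Arguments:
#   $1 current replica count
#   $2 current average CPU utilization (percent of requested CPU)
#   $3 target utilization (percent)
#   $4 minReplicas
#   $5 maxReplicas
desired_replicas() {
  local current=$1
  local usage=$2
  local target=$3
  local min=$4
  local max=$5
  local diff
  local desired

  # Absolute difference between current and target utilization.
  if [ "$usage" -gt "$target" ]; then
    diff=$((usage - target))
  else
    diff=$((target - usage))
  fi

  if [ $((diff * 10)) -le "$target" ]; then
    # Within 10% of the target (assumed default tolerance): no change.
    desired=$current
  else
    # ceil(current * usage / target) using integer arithmetic.
    desired=$(( (current * usage + target - 1) / target ))
  fi

  # Clamp to the bounds configured on the autoscaler.
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi

  echo "$desired"
}

# One pod running at 100% of its CPU request, target 50%, bounds 1 to 4.
desired_replicas 1 100 50 1 4
```

With the values from this assignment, a single pod at 100% utilization gives 2 desired replicas, while two pods at 200% would ask for 8 and be capped at maxReplicas: 4.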

Apply the HPA with kubectl apply -f testcpu-hpa.yaml, then run the following commands every few seconds to watch the number of pods scale up:

kubectl get pods
kubectl top pods

An HPA is just another Kubernetes resource, so you can also run kubectl get hpa to see the autoscaler's current state, including its current and target CPU utilization and the current number of replicas.