Vertical Pod autoscaling automates setting CPU and memory resource requests and limits for containers within Kubernetes Pods. Vertical Pod autoscaling analyzes historical and current resource usage to provide recommendations, which it can either display or automatically apply by updating Pods. This feature improves stability and cost efficiency by right-sizing resource allocations.
## Before you begin
Before you configure vertical Pod autoscaling, ensure you meet the following prerequisites:

- You have a running bare metal cluster.
- You have `kubectl` access to the cluster.
- Metrics Server is available in the cluster. Bare metal clusters include Metrics Server by default.
## Enable vertical Pod autoscaling
Enable vertical Pod autoscaling on your bare metal cluster by setting a preview annotation and configuring the cluster specification:
1. Add or update the preview annotation on the Cluster custom resource. Edit the Cluster custom resource directly, or modify the cluster configuration file and use `bmctl update`:

   ```yaml
   metadata:
     annotations:
       preview.baremetal.cluster.gke.io/vertical-pod-autoscaler: enable
   ```

2. Modify the `spec` of the Cluster custom resource to include the `verticalPodAutoscaling` field and specify the `enableUpdater` and `enableMemorySaver` modes:

   ```yaml
   apiVersion: baremetal.cluster.gke.io/v1
   kind: Cluster
   metadata:
     name: cluster1
     namespace: cluster-cluster1
     annotations:
       preview.baremetal.cluster.gke.io/vertical-pod-autoscaler: enable
   spec:
     # ... other cluster spec fields
     verticalPodAutoscaling:
       enableUpdater: true # Set to true for automated updates
       enableMemorySaver: true # Set to true to reduce recommender memory usage
   ```

3. If you modified the cluster configuration file, apply the changes using the following command:

   ```
   bmctl update cluster -c CLUSTER_NAME --kubeconfig KUBECONFIG
   ```

   Replace the following:

   - `CLUSTER_NAME`: the name of your cluster.
   - `KUBECONFIG`: the path of your cluster kubeconfig file.
## Create a VerticalPodAutoscaler custom resource
After enabling vertical Pod autoscaling on your cluster, define a VerticalPodAutoscaler custom resource to target specific workloads:
1. Define a `VerticalPodAutoscaler` resource in the same namespace as the target workload. This custom resource specifies which Pods it targets using `targetRef` and any resource policies:

   ```yaml
   apiVersion: "autoscaling.k8s.io/v1"
   kind: VerticalPodAutoscaler
   metadata:
     name: hamster-vpa
   spec:
     targetRef:
       apiVersion: "apps/v1"
       kind: Deployment
       name: hamster
     resourcePolicy:
       containerPolicies:
         - containerName: '*'
           minAllowed:
             cpu: 100m
             memory: 50Mi
           maxAllowed:
             cpu: 1
             memory: 500Mi
           controlledResources: ["cpu", "memory"]
   ```

2. Apply the `VerticalPodAutoscaler` manifest using the following command:

   ```
   kubectl apply -f VPA_MANIFEST \
       --kubeconfig KUBECONFIG
   ```

   Replace the following:

   - `VPA_MANIFEST`: the path of the `VerticalPodAutoscaler` manifest file.
   - `KUBECONFIG`: the path of the cluster kubeconfig file.
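For reference, a workload matching the `targetRef` in the sample above might look like the following. This is an illustrative sketch only: the Deployment name matches the sample, but the labels, image, and container command are assumptions, not part of this page's configuration.

```yaml
# Hypothetical Deployment targeted by the hamster-vpa sample.
# The image and command are placeholders; substitute your own workload.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hamster
spec:
  replicas: 2              # the updater needs at least two replicas to evict Pods
  selector:
    matchLabels:
      app: hamster
  template:
    metadata:
      labels:
        app: hamster
    spec:
      containers:
        - name: hamster
          image: registry.k8s.io/ubuntu-slim:0.14   # assumed image
          command: ["/bin/sh", "-c", "while true; do timeout 0.5s yes >/dev/null; sleep 0.5s; done"]
          resources:
            requests:
              cpu: 100m    # initial request; vertical Pod autoscaling adjusts this
              memory: 50Mi
```

Because the sample `VerticalPodAutoscaler` uses `containerName: '*'`, its policy applies to every container in these Pods.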
## Understand vertical Pod autoscaling modes
Vertical Pod autoscaling operates in different modes that control how it applies resource recommendations.
### Recommendation mode
In recommendation mode, vertical Pod autoscaling installs the recommender component. This component analyzes resource usage and publishes recommended values for CPU and memory requests and limits in the status section of the VerticalPodAutoscaler custom resources you create.
To view resource requests and limits recommendations, use the following command:
```
kubectl describe vpa VPA_NAME \
    --kubeconfig KUBECONFIG \
    -n CLUSTER_NAMESPACE
```

Replace the following:

- `VPA_NAME`: the name of the `VerticalPodAutoscaler` that's targeting the workloads for which you are considering resource adjustments.
- `KUBECONFIG`: the path of the cluster kubeconfig file.
- `CLUSTER_NAMESPACE`: the namespace of the `VerticalPodAutoscaler` resource, which is the same namespace as the target workload.

The response should contain a `Status` section that's similar to the following sample:

```
Status:
  Conditions:
    Last Transition Time:  2025-08-04T23:53:32Z
    Status:                True
    Type:                  RecommendationProvided
  Recommendation:
    Container Recommendations:
      Container Name:  hamster
      Lower Bound:
        Cpu:     100m
        Memory:  262144k
      Target:
        Cpu:     587m
        Memory:  262144k
      Uncapped Target:
        Cpu:     587m
        Memory:  262144k
      Upper Bound:
        Cpu:     1
        Memory:  500Mi
```

Pods aren't automatically updated in this mode. Use these recommendations to manually update your Pod configurations. This is the default behavior if `enableUpdater` isn't set or is set to `false`.
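In recommendation mode, acting on these numbers means copying the `Target` values into the workload manifest yourself. A minimal sketch, assuming a Deployment and container both named `hamster` as in the earlier sample:

```yaml
# Manually applying the recommended Target values from a Status like the
# one above. The Deployment and container names are from the hamster
# sample; your workload's names will differ.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hamster
spec:
  template:
    spec:
      containers:
        - name: hamster
          resources:
            requests:
              cpu: 587m       # Target.Cpu from the recommendation
              memory: 262144k # Target.Memory from the recommendation
```

Keeping requests between the `Lower Bound` and `Upper Bound` values is generally sensible, because recommendations outside that range are less stable.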
### Automated update mode
When you set `enableUpdater` to `true`, bare metal lifecycle controllers deploy the vertical Pod autoscaling updater and admission controller components in addition to the recommender. The updater monitors for Pods whose current resource requests deviate significantly from the recommendations.
The update policy in the `VerticalPodAutoscaler` resource specifies how the updater applies the recommendations. By default, the update mode is `Auto`, which means the updater assigns updated resource settings on Pod creation. The following `VerticalPodAutoscaler` sample shows how to set the update mode to `Initial`:

```yaml
apiVersion: "autoscaling.k8s.io/v1"
kind: VerticalPodAutoscaler
metadata:
  name: hamster-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: hamster
  updatePolicy:
    updateMode: "Initial"
  ...
```

The updater supports the following five modes:
- `Auto`: The updater evicts the Pod. The admission controller intercepts the creation request for the new Pod and modifies it to use the recommended CPU and memory values provided by the recommender. Updating resources requires recreating the Pod, which can cause disruptions. Use Pod Disruption Budgets, which the updater honors, to manage the eviction process. This mode is equivalent to `Recreate`.
- `Recreate`: The updater evicts Pods and assigns recommended resource requests and limits when the Pod is recreated.
- `InPlaceOrRecreate` (alpha): The updater attempts best-effort in-place updates, but may fall back to recreating the Pod if in-place updates aren't possible. For more information, see the in-place Pod resize documentation.
- `Initial`: The updater only assigns resource requests on Pod creation and never changes them later.
- `Off`: The updater doesn't automatically change the resource requirements of the Pods. The recommendations are calculated and can be inspected in the `VerticalPodAutoscaler` object.
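Because the updater honors Pod Disruption Budgets in the eviction-based modes (`Auto` and `Recreate`), you can cap how many Pods are disrupted at once with a PDB. A minimal sketch, assuming the target Deployment's Pods carry an `app: hamster` label (a label this page's samples don't show):

```yaml
# Hypothetical PDB limiting updater-driven evictions for the hamster
# workload; the selector label is an assumption and must match your
# Deployment's Pod template labels.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: hamster-pdb
spec:
  minAvailable: 1        # keep at least one Pod running during evictions
  selector:
    matchLabels:
      app: hamster       # assumed label
```

With this PDB in place, the updater evicts Pods one at a time rather than taking the whole workload down while resource values are refreshed.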
For more information about the `VerticalPodAutoscaler` custom resource, use `kubectl` to retrieve the `verticalpodautoscalers.autoscaling.k8s.io` custom resource definition, which is installed on clusters at version 1.33.0 or later.
The following sample shows how resource recommendations might appear in the `Status` section for the `hamster` container. The sample also shows a Pod eviction event, which occurs when the updater evicts a Pod before the recommended resource configuration is assigned to the recreated Pod:
```
Spec:
  Resource Policy:
    Container Policies:
      Container Name:  *
      Controlled Resources:
        cpu
        memory
      Max Allowed:
        Cpu:     1
        Memory:  500Mi
      Min Allowed:
        Cpu:     100m
        Memory:  50Mi
  Target Ref:
    API Version:  apps/v1
    Kind:         Deployment
    Name:         hamster
  Update Policy:
    Update Mode:  Auto
Status:
  Conditions:
    Last Transition Time:  2025-08-04T23:53:32Z
    Status:                True
    Type:                  RecommendationProvided
  Recommendation:
    Container Recommendations:
      Container Name:  hamster
      Lower Bound:
        Cpu:     100m
        Memory:  262144k
      Target:
        Cpu:     587m
        Memory:  262144k
      Uncapped Target:
        Cpu:     587m
        Memory:  262144k
      Upper Bound:
        Cpu:     1
        Memory:  500Mi
Events:
  Type    Reason      Age   From         Message
  ----    ------      ----  ----         -------
  Normal  EvictedPod  49s   vpa-updater  VPA Updater evicted Pod hamster-7cb59fb657-lkrk4 to apply resource recommendation.
```

### Memory saver mode
Memory saver mode reduces the memory footprint of the vertical Pod autoscaling recommender component. When you set `enableMemorySaver` to `true`, the recommender only tracks and computes aggregations for Pods that have a matching `VerticalPodAutoscaler` custom resource.

The trade-off is that when you create a new `VerticalPodAutoscaler` custom resource for an existing workload, the recommender takes some time (up to 24 hours) to gather sufficient history to provide accurate recommendations. This mode is `false` by default for most cluster types, but defaults to `true` for edge clusters.
## Disable vertical Pod autoscaling
Disable vertical Pod autoscaling by removing its custom resources and configuration from your cluster:

1. Delete any `VerticalPodAutoscaler` custom resources you have created.
2. Modify the Cluster custom resource and remove the entire `verticalPodAutoscaling` section from the `spec`. You can edit the Cluster custom resource directly, or modify the cluster configuration file and use `bmctl update`.
3. Remove the `preview.baremetal.cluster.gke.io/vertical-pod-autoscaler` annotation from the Cluster custom resource.
## Limitations
Consider the following limitations when using vertical Pod autoscaling:
- Vertical Pod autoscaling isn't ready for use with JVM-based workloads due to limited visibility into actual memory usage of the workload.
- The updater requires a minimum of two Pod replicas for Deployments to replace Pods with revised resource values.
- The updater doesn't quickly update Pods that are crash-looping due to Out-Of-Memory (OOM) errors.
- The `InPlaceOrRecreate` update policy for Pods is an alpha feature within vertical Pod autoscaling. It attempts best-effort in-place updates, but may fall back to recreating the Pod if in-place updates aren't possible.
## What's next
- Explore Pod Disruption Budgets.