Commit b5b760f

Merge pull request #7548 from adrianmoisey/move_around_docs

Move all VPA docs into ./docs

2 parents 4ec3336 + 1621f41

File tree: 11 files changed (+569 −538 lines)

vertical-pod-autoscaler/README.md (+13 −412)

Large diffs are not rendered by default.
New file (+135 lines):

@@ -0,0 +1,135 @@

# Components

## Contents

- [Components](#components)
  - [Introduction](#introduction)
  - [Recommender](#recommender)
    - [Running](#running-the-recommender)
    - [Implementation](#implementation-of-the-recommender)
  - [Updater](#updater)
    - [Current implementation](#current-implementation)
    - [Missing Parts](#missing-parts)
  - [Admission Controller](#admission-controller)
    - [Running](#running-the-admission-controller)
    - [Implementation](#implementation-of-the-admission-controller)

## Introduction

The VPA project consists of 3 components:

- [Recommender](#recommender) - monitors the current and past resource consumption and, based on it, provides recommended values for the containers' CPU and memory requests.
- [Updater](#updater) - checks which of the managed pods have correct resources set and, if not, kills them so that they can be recreated by their controllers with the updated requests.
- [Admission Controller](#admission-controller) - sets the correct resource requests on new pods (either just created or recreated by their controller due to the Updater's activity).

More on the architecture can be found [HERE](https://github.com/kubernetes/design-proposals-archive/blob/main/autoscaling/vertical-pod-autoscaler.md).

## Recommender

The recommender is the core binary of the Vertical Pod Autoscaler system. It computes the recommended resource requests for pods based on historical and current usage of the resources. The current recommendations are put in the status of the VPA resource, where they can be inspected.

### Running the recommender

- In order to have historical data pulled in by the recommender, install Prometheus in your cluster and pass its address through a flag.
- Create the RBAC configuration from `../deploy/vpa-rbac.yaml`.
- Create a deployment with the recommender pod from `../deploy/recommender-deployment.yaml`.
- The recommender will start running and pushing its recommendations to VPA object statuses.

### Implementation of the recommender

The recommender is based on a model of the cluster that it builds in its memory. The model contains Kubernetes resources: *Pods* and *VerticalPodAutoscalers*, with their configuration (e.g. labels) as well as other information, e.g. usage data for each container.

After starting the binary, the recommender reads the history of running pods and their usage from Prometheus into the model. It then runs in a loop and at each step performs the following actions:

- update the model with recent information on resources (using listers based on watch),
- update the model with fresh usage samples from the Metrics API,
- compute a new recommendation for each VPA,
- put any changed recommendations into the VPA resources.

## Updater

The Updater component of the Vertical Pod Autoscaler is described in the [Vertical Pod Autoscaler - design proposal](https://github.com/kubernetes/community/pull/338).

The Updater runs in a Kubernetes cluster and decides which pods should be restarted based on the resource allocation recommendations calculated by the Recommender. If a pod should be updated, the Updater will try to evict it. It respects the pod disruption budget by using the Eviction API to evict pods. The Updater does not perform the actual resource update; it relies on the Vertical Pod Autoscaler admission plugin to update pod resources when the pod is recreated after eviction.

### Current implementation

The Updater runs in a loop. Each iteration performs the following:

- Fetching the Vertical Pod Autoscaler configuration using a lister implementation.
- Fetching live pod information with its current resource allocation.
- For each group of replicated pods, calculating whether a pod update is required and how many replicas can be evicted. The Updater will always allow eviction of at least one pod in a replica set. The maximum ratio of evicted replicas is specified by a flag.
- Evicting pods if the recommended resources differ significantly from the actual resource allocation. The threshold for evicting pods is specified by the recommended min/max values from the VPA resource. The priority of evictions within a set of replicated pods is proportional to the sum of the percentage changes in resources (i.e. a pod with a recommended 15% memory increase and 15% CPU decrease will be evicted before a pod with a 20% memory increase and no change in CPU).
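The priority rule above can be sketched as follows (an illustrative sketch, not the Updater's actual Go code; the helper name is hypothetical):

```python
def eviction_priority(changes_percent):
    """Sum of absolute percentage changes across resources.

    Illustrates the rule described above: the bigger the total relative
    change between recommendation and current allocation, the sooner the
    pod is evicted.
    """
    return sum(abs(c) for c in changes_percent)

# Pod A: +15% memory, -15% CPU  -> priority 30
# Pod B: +20% memory, no CPU change -> priority 20
pod_a = eviction_priority([15, -15])
pod_b = eviction_priority([20, 0])
assert pod_a > pod_b  # Pod A is evicted before Pod B
```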
### Missing parts

- A Recommendation API for fetching data from the Vertical Pod Autoscaler Recommender.

## Admission-controller

This is a binary that registers itself as a Mutating Admission Webhook and, because of that, is on the path of creating all pods. For each pod creation, it will get a request from the apiserver and will either decide that there is no matching VPA configuration or find the corresponding one and use the current recommendation to set resource requests in the pod.

### Running the admission-controller

1. Make sure your API server supports Mutating Webhooks. Its `--admission-control` flag should have `MutatingAdmissionWebhook` as one of the values on the list, and its `--runtime-config` flag should include `admissionregistration.k8s.io/v1beta1=true`. To change those flags, ssh to your API server instance, edit `/etc/kubernetes/manifests/kube-apiserver.manifest`, and restart the kubelet to pick up the changes: ```sudo systemctl restart kubelet.service```
1. Generate certs by running `bash gencerts.sh`. This will use kubectl to create a secret in your cluster with the certs.
1. Create the RBAC configuration for the admission controller pod by running `kubectl create -f ../deploy/admission-controller-rbac.yaml`.
1. Create the pod: `kubectl create -f ../deploy/admission-controller-deployment.yaml`. The first thing the admission controller will do is register itself with the apiserver as a Webhook Admission Controller and start changing resource requirements for pods on their creation and updates.
1. You can specify a path for it to register as a part of the installation process by setting `--register-by-url=true` and passing `--webhook-address` and `--webhook-port`.
1. You can specify a minimum TLS version with `--min-tls-version`; acceptable values are `tls1_2` (default) or `tls1_3`.
1. You can also specify a comma- or colon-separated list of ciphers for the server to use with `--tls-ciphers` if `--min-tls-version` is set to `tls1_2`.
1. You can specify a comma-separated list to set webhook labels with `--webhook-labels`, for example: `key1:value1,key2:value2`.
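The `--webhook-labels` format can be illustrated with a short parser sketch (hypothetical helper; the actual flag parsing happens inside the admission controller):

```python
def parse_webhook_labels(value):
    """Parse a 'key1:value1,key2:value2' string into a label dict."""
    labels = {}
    for pair in value.split(","):
        key, _, val = pair.partition(":")
        labels[key] = val
    return labels

labels = parse_webhook_labels("key1:value1,key2:value2")
assert labels == {"key1": "value1", "key2": "value2"}
```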
### Implementation of the Admission Controller

All VPA configurations in the cluster are watched with a lister. In the context of pod creation, there is an incoming HTTPS request from the apiserver. The logic to serve that request involves finding the appropriate VPA, retrieving the current recommendation from it, and encoding the recommendation as a JSON patch to the Pod resource.
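As a rough sketch of that JSON-patch step (illustrative only; the real admission controller builds its patches in Go as part of the AdmissionReview response, and the exact patch structure may differ):

```python
import json

def build_resource_patch(container_index, cpu, memory):
    """Build one JSON-patch operation that sets a container's resource
    requests from a VPA recommendation (hypothetical helper)."""
    return {
        "op": "add",
        "path": f"/spec/containers/{container_index}/resources/requests",
        "value": {"cpu": cpu, "memory": memory},
    }

# A patch for the first container, using a recommendation of 1 CPU / 2 GiB.
patch = [build_resource_patch(0, "1000m", "2Gi")]
patch_json = json.dumps(patch)
```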
New file (+110 lines):

@@ -0,0 +1,110 @@

# Examples

## Contents

- [Examples](#examples)
  - [Keeping limit proportional to request](#keeping-limit-proportional-to-request)
  - [Capping to Limit Range](#capping-to-limit-range)
  - [Resource Policy Overriding Limit Range](#resource-policy-overriding-limit-range)
  - [Starting multiple recommenders](#starting-multiple-recommenders)
  - [Custom memory bump-up after OOMKill](#custom-memory-bump-up-after-oomkill)
  - [Using CPU management with static policy](#using-cpu-management-with-static-policy)
  - [Controlling eviction behavior based on scaling direction and resource](#controlling-eviction-behavior-based-on-scaling-direction-and-resource)
  - [Limiting which namespaces are used](#limiting-which-namespaces-are-used)
  - [Setting the webhook failurePolicy](#setting-the-webhook-failurepolicy)

## Keeping limit proportional to request

The container template specifies a resource request of 500 milli CPU and 1 GB of RAM. The template also specifies a resource limit of 2 GB RAM. The VPA recommendation is 1000 milli CPU and 2 GB of RAM. When VPA applies the recommendation, it will also set the memory limit to 4 GB.
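The arithmetic above can be checked with a small sketch (illustrative, not VPA code):

```python
def proportional_limit(template_request, template_limit, recommended_request):
    """Scale the limit so the template's limit/request ratio is preserved."""
    ratio = template_limit / template_request
    return recommended_request * ratio

GB = 1024 ** 3
# Template: 1 GB request, 2 GB limit (2:1 ratio); recommendation: 2 GB request.
# The applied limit becomes 4 GB.
assert proportional_limit(1 * GB, 2 * GB, 2 * GB) == 4 * GB
```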
## Capping to Limit Range

The container template specifies a resource request of 500 milli CPU and 1 GB of RAM. The template also specifies a resource limit of 2 GB RAM. A limit range sets a maximum limit of 3 GB RAM per container. The VPA recommendation is 1000 milli CPU and 2 GB of RAM. When VPA applies the recommendation, it will set the memory limit to 3 GB (to keep it within the allowed limit range) and the memory request to 1.5 GB (to maintain the 2:1 limit/request ratio from the template).
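The capping behaviour can be sketched like this (illustrative only):

```python
def cap_to_limit_range(recommended_request, ratio, max_limit):
    """Cap the proportional limit at the LimitRange maximum, then scale the
    request back down so the template's limit/request ratio is preserved."""
    limit = recommended_request * ratio
    if limit > max_limit:
        limit = max_limit
        recommended_request = limit / ratio
    return recommended_request, limit

GB = 1024 ** 3
# Recommendation 2 GB, template ratio 2:1, LimitRange max 3 GB:
# the limit is capped to 3 GB and the request scaled to 1.5 GB.
request, limit = cap_to_limit_range(2 * GB, 2.0, 3 * GB)
assert limit == 3 * GB
assert request == 1.5 * GB
```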
## Resource Policy Overriding Limit Range

The container template specifies a resource request of 500 milli CPU and 1 GB of RAM. The template also specifies a resource limit of 2 GB RAM. A limit range sets a maximum limit of 3 GB RAM per container. The VPA's Container Resource Policy requires VPA to set the container's request to at least 750 milli CPU and 2 GB RAM. The VPA recommendation is 1000 milli CPU and 2 GB of RAM. When applying the recommendation, VPA will set the RAM request to 2 GB (following the resource policy) and the RAM limit to 4 GB (to maintain the 2:1 limit/request ratio from the template).
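Continuing the sketch from the previous examples: when capping to the limit range would push the request below the resource-policy minimum, the resource policy wins, even if that places the limit outside the limit range (illustrative only, not VPA code):

```python
def apply_recommendation(recommended_request, ratio, max_limit, policy_min_request):
    """Apply a recommendation with both a LimitRange cap and a resource-policy
    minimum; the policy takes precedence over the LimitRange on conflict."""
    # First try to cap to the LimitRange, keeping the template ratio.
    limit = min(recommended_request * ratio, max_limit)
    request = limit / ratio
    if request < policy_min_request:
        # Conflict: follow the resource policy and accept a limit
        # outside the LimitRange.
        request = policy_min_request
        limit = request * ratio
    return request, limit

GB = 1024 ** 3
# Recommendation 2 GB, ratio 2:1, LimitRange max 3 GB, policy minimum 2 GB:
# request stays at 2 GB and the limit becomes 4 GB, exceeding the LimitRange.
request, limit = apply_recommendation(2 * GB, 2.0, 3 * GB, policy_min_request=2 * GB)
assert request == 2 * GB and limit == 4 * GB
```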
## Starting multiple recommenders

It is possible to start one or more extra recommenders in order to use different percentiles on different workload profiles. For example, you could have 3 profiles: [frugal](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/deploy/recommender-deployment-low.yaml), [standard](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/deploy/recommender-deployment.yaml) and [performance](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/deploy/recommender-deployment-high.yaml), which use different values of TargetCPUPercentile (50, 90 and 95) to calculate their recommendations.

Please note the usage of the following arguments to override the default names and percentiles:

- --recommender-name=performance
- --target-cpu-percentile=0.95

You can then choose which recommender to use by setting `recommenders` inside the `VerticalPodAutoscaler` spec.
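For example, a VPA object selecting the `performance` recommender might look like this (a sketch; the target name `my-app` is hypothetical):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  recommenders:
    - name: performance
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
```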
## Custom memory bump-up after OOMKill

After an OOMKill event is observed, VPA increases the memory recommendation based on the memory usage observed in the event, according to this formula: `recommendation = max(memory-usage-in-oomkill-event + oom-min-bump-up-bytes, memory-usage-in-oomkill-event * oom-bump-up-ratio)`.
You can configure the minimum bump-up as well as the multiplier by specifying startup arguments for the recommender:

- `oom-bump-up-ratio` specifies the memory bump-up ratio when an OOM occurs; the default is `1.2`, which means memory will be increased by 20% after an OOMKill event.
- `oom-min-bump-up-bytes` specifies the minimal increase of memory after observing an OOM; defaults to `100 * 1024 * 1024` (=100MiB).

Usage in the recommender deployment:

```yaml
containers:
  - name: recommender
    args:
      - --oom-bump-up-ratio=2.0
      - --oom-min-bump-up-bytes=524288000
```
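The bump-up arithmetic can be sketched as follows (illustrative; it mirrors the formula above, not the recommender's actual Go code):

```python
def oom_bump_up(memory_usage_bytes, ratio=1.2, min_bump_bytes=100 * 1024 * 1024):
    """New memory recommendation after an OOMKill: at least `ratio` times the
    observed usage, and at least `min_bump_bytes` above it."""
    return max(memory_usage_bytes + min_bump_bytes, memory_usage_bytes * ratio)

GIB = 1024 ** 3
# Large usage: the 20% ratio dominates the 100 MiB minimum bump.
assert oom_bump_up(4 * GIB) == 4 * GIB * 1.2
# Small usage (100 MiB): the minimum bump dominates, doubling the value.
assert oom_bump_up(100 * 1024 * 1024) == 200 * 1024 * 1024
```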
## Using CPU management with static policy

If you are using the [CPU management with static policy](https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/#static-policy) for some containers, you probably want the CPU recommendation to be an integer. A dedicated recommendation post-processor can round up the CPU recommendation. Recommendation capping still applies after the round-up.
To activate this feature, pass the flag `--cpu-integer-post-processor-enabled` when you start the recommender.
The post-processor only acts on containers having a specific configuration. This configuration consists of an annotation on your VPA object for each impacted container.
The annotation format is the following:

```yaml
vpa-post-processor.kubernetes.io/{containerName}_integerCPU=true
```
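The round-up itself is straightforward (an illustrative sketch, not the recommender's code): a millicore recommendation is rounded up to a whole number of cores.

```python
import math

def round_up_to_whole_cores(millicores):
    """Round a CPU recommendation (in millicores) up to a whole core."""
    return math.ceil(millicores / 1000) * 1000

assert round_up_to_whole_cores(1300) == 2000  # 1.3 cores -> 2 cores
assert round_up_to_whole_cores(2000) == 2000  # already a whole core
```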
## Controlling eviction behavior based on scaling direction and resource

To limit disruptions caused by evictions, you can put additional constraints on the Updater's eviction behavior by specifying `.updatePolicy.EvictionRequirements` in the VPA spec. An `EvictionRequirement` contains a resource and a `ChangeRequirement`, which is evaluated by comparing a new recommendation against the currently set resources for a container.

Here is an example configuration which allows evictions only when CPU or memory get scaled up, but not when they are both scaled down:

```yaml
updatePolicy:
  evictionRequirements:
    - resources: ["cpu", "memory"]
      changeRequirement: TargetHigherThanRequests
```

Note that this doesn't prevent scaling down entirely, as Pods may get recreated for different reasons, resulting in a new recommendation being applied. See [the original AEP](https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler/enhancements/4831-control-eviction-behavior) for more context and usage information.

## Limiting which namespaces are used

By default the VPA will run against all namespaces. You can limit that behaviour by setting one of the following options:

1. `ignored-vpa-object-namespaces` - a comma-separated list of namespaces to ignore
1. `vpa-object-namespace` - a single namespace to monitor

These options are mutually exclusive and cannot be used together.
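For example, restricting a VPA component to a single namespace might look like this in its deployment manifest (a sketch; the namespace `team-a` is hypothetical):

```yaml
containers:
  - name: recommender
    args:
      - --vpa-object-namespace=team-a
```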
## Setting the webhook failurePolicy

It is possible to set the failurePolicy of the webhook to `Fail` by passing `--webhook-failure-policy-fail=true` to the VPA admission controller.
Please use this option with caution, as it may be possible to break Pod creation if there is a failure with the VPA.
Use it in conjunction with `--ignored-vpa-object-namespaces=kube-system` or `--vpa-object-namespace` to reduce the risk.

vertical-pod-autoscaler/FAQ.md → vertical-pod-autoscaler/docs/faq.md (+6 −6)

````diff
@@ -2,7 +2,7 @@
 
 ## Contents
 
-- [VPA restarts my pods but does not modify CPU or memory settings. Why?](#vpa-restarts-my-pods-but-does-not-modify-CPU-or-memory-settings)
+- [VPA restarts my pods but does not modify CPU or memory settings. Why?](#vpa-restarts-my-pods-but-does-not-modify-cpu-or-memory-settings)
 - [How can I apply VPA to my Custom Resource?](#how-can-i-apply-vpa-to-my-custom-resource)
 - [How can I use Prometheus as a history provider for the VPA recommender?](#how-can-i-use-prometheus-as-a-history-provider-for-the-vpa-recommender)
 - [I get recommendations for my single pod replicaSet, but they are not applied. Why?](#i-get-recommendations-for-my-single-pod-replicaset-but-they-are-not-applied)
@@ -135,7 +135,7 @@ spec:
 - --v=4
 - --storage=prometheus
 - --prometheus-address=http://prometheus.default.svc.cluster.local:9090
-```
+```
 
 In this example, Prometheus is running in the default namespace.
 
@@ -148,9 +148,9 @@ Here you should see the flags that you set for the VPA recommender and you shoul
 
 This means that the VPA recommender is now using Prometheus as the history provider.
 
-### I get recommendations for my single pod replicaSet but they are not applied
+### I get recommendations for my single pod replicaset but they are not applied
 
-By default, the [`--min-replicas`](pkg/updater/main.go#L56) flag on the updater is set to 2. To change this, you can supply the arg in the [deploys/updater-deployment.yaml](deploy/updater-deployment.yaml) file:
+By default, the [`--min-replicas`](https://github.com/kubernetes/autoscaler/tree/master/pkg/updater/main.go#L44) flag on the updater is set to 2. To change this, you can supply the arg in the [deploys/updater-deployment.yaml](https://github.com/kubernetes/autoscaler/tree/master/deploy/updater-deployment.yaml) file:
 
 ```yaml
 spec:
@@ -179,7 +179,7 @@ election with the `--leader-elect=true` parameter.
 The following startup parameters are supported for VPA recommender:
 
 Name | Type | Description | Default
-|-|-|-|-|
+-|-|-|-
 `recommendation-margin-fraction` | Float64 | Fraction of usage added as the safety margin to the recommended request | 0.15
 `pod-recommendation-min-cpu-millicores` | Float64 | Minimum CPU recommendation for a pod | 25
 `pod-recommendation-min-memory-mb` | Float64 | Minimum memory recommendation for a pod | 250
@@ -230,7 +230,7 @@ Name | Type | Description | Default
 The following startup parameters are supported for VPA updater:
 
 Name | Type | Description | Default
-|-|-|-|-|
+-|-|-|-
 `pod-update-threshold` | Float64 | Ignore updates that have priority lower than the value of this flag | 0.1
 `in-recommendation-bounds-eviction-lifetime-threshold` | Duration | Pods that live for at least that long can be evicted even if their request is within the [MinRecommended...MaxRecommended] range | time.Hour*12
 `evict-after-oom-threshold` | Duration | Evict pod that has OOMed in less than evict-after-oom-threshold since start. | 10*time.Minute
````
New file (+18 lines):

@@ -0,0 +1,18 @@

# Features

## Contents

- [Limits control](#limits-control)

## Limits control

When setting limits, VPA will conform to
[resource policies](https://github.com/kubernetes/autoscaler/blob/vertical-pod-autoscaler-1.2.1/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1/types.go#L95-L103).
It will maintain the limit-to-request ratio specified for all containers.

VPA will try to cap recommendations between the min and max of
[limit ranges](https://kubernetes.io/docs/concepts/policy/limit-range/). If a limit range conflicts
with the VPA resource policy, VPA will follow the VPA policy (and set values outside the limit
range).

To disable VPA recommendations for an individual container, set `mode` to `"Off"` in `containerPolicies`.
