Resource Quota Per VolumeAttributesClass #50082

Open
wants to merge 1 commit into
base: dev-1.33
252 changes: 252 additions & 0 deletions content/en/docs/concepts/policy/resource-quotas.md
@@ -228,6 +228,7 @@ Resources specified on the quota outside of the allowed set results in a validat
| `NotBestEffort` | Match pods that do not have best effort quality of service. |
| `PriorityClass` | Match pods that references the specified [priority class](/docs/concepts/scheduling-eviction/pod-priority-preemption). |
| `CrossNamespacePodAffinity` | Match pods that have cross-namespace pod [(anti)affinity terms](/docs/concepts/scheduling-eviction/assign-pod-node). |
| `VolumeAttributesClass` | Match PersistentVolumeClaims that reference the specified [volume attributes class](/docs/concepts/storage/volume-attributes-classes). |

The `BestEffort` scope restricts a quota to tracking the following resource:

@@ -459,6 +460,257 @@ With the above configuration, pods can use `namespaces` and `namespaceSelector`
if the namespace where they are created have a resource quota object with
`CrossNamespacePodAffinity` scope and a hard limit greater than or equal to the number of pods using those fields.

### Resource Quota Per VolumeAttributesClass

{{< feature-state feature_gate_name="VolumeAttributesClass" >}}

A PersistentVolumeClaim (PVC) can be created with a specific [volume attributes class](/docs/concepts/storage/volume-attributes-classes/), and that class can be changed after creation. You can limit the storage resources that PVCs consume for each volume attributes class by using the `scopeSelector` field in the quota spec.

A PVC references its associated volume attributes classes through the following fields:

* `spec.volumeAttributesClassName`
* `status.currentVolumeAttributesClassName`
* `status.modifyVolumeStatus.targetVolumeAttributesClassName`

A quota is matched and consumed only if `scopeSelector` in the quota spec selects the PVC.
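The matching rule can be pictured with a small model. This is only a sketch: it assumes, based on the walkthrough later in this section, that a PVC is selected when any of the three fields names a class listed in the selector. The helper names `vac_names` and `scope_selects` are hypothetical and are not part of Kubernetes.

```python
def vac_names(pvc: dict) -> set:
    """Collect every volume attributes class name the PVC references."""
    spec = pvc.get("spec", {})
    status = pvc.get("status", {})
    candidates = [
        spec.get("volumeAttributesClassName"),
        status.get("currentVolumeAttributesClassName"),
        status.get("modifyVolumeStatus", {}).get("targetVolumeAttributesClassName"),
    ]
    # Unset fields are ignored.
    return {name for name in candidates if name}


def scope_selects(pvc: dict, values: list) -> bool:
    """Model an `In` expression on the VolumeAttributesClass scope (assumed semantics)."""
    return bool(vac_names(pvc) & set(values))


# A freshly created PVC that only sets spec.volumeAttributesClassName.
new_pvc = {"spec": {"volumeAttributesClassName": "gold"}}
print(scope_selects(new_pvc, ["gold"]))    # True
print(scope_selects(new_pvc, ["silver"]))  # False
```

Under this model, a PVC that references no volume attributes class is never selected by an `In` expression.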

When the quota is scoped for the volume attributes class using the `scopeSelector` field, the quota object is restricted to track only the following resources:

* `persistentvolumeclaims`
* `requests.storage`
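Both tracked resources act as limits at the same time: a new PVC has to fit under the object count and under the total requested storage. The sketch below illustrates that arithmetic with the limits of the `pvcs-gold` quota from the example in this section (10 PVCs, 10Gi). The function `admits` is hypothetical and only models the check; it is not the admission code itself.

```python
GI = 1024 ** 3  # bytes in one gibibyte


def admits(used_count, used_bytes, hard_count, hard_bytes, request_bytes):
    """Return True if one more PVC of request_bytes fits under both hard limits."""
    fits_count = used_count + 1 <= hard_count
    fits_storage = used_bytes + request_bytes <= hard_bytes
    return fits_count and fits_storage


# Empty quota, 2Gi request: fits.
print(admits(0, 0, 10, 10 * GI, 2 * GI))       # True
# 2Gi already used, 9Gi request: 11Gi would exceed the 10Gi limit.
print(admits(1, 2 * GI, 10, 10 * GI, 9 * GI))  # False
```

A request is rejected as soon as either limit would be exceeded, even if the other still has room.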

This example creates quota objects and matches them with PVCs that use specific volume attributes classes. The example works as follows:

- PVCs in the cluster have at least one of the three volume attributes classes, "gold", "silver", "copper".
- One quota object is created for each volume attributes class.

Save the following YAML to a file `quota-vac.yaml`.

{{% code_sample file="policy/quota-vac.yaml" %}}

Apply the YAML using `kubectl create`.

```shell
kubectl create -f ./quota-vac.yaml
```

```
resourcequota/pvcs-gold created
resourcequota/pvcs-silver created
resourcequota/pvcs-copper created
```

Verify that `Used` quota is `0` using `kubectl describe quota`.

```shell
kubectl describe quota
```

```
Name:                   pvcs-gold
Namespace:              default
Resource                Used  Hard
--------                ----  ----
persistentvolumeclaims  0     10
requests.storage        0     10Gi


Name:                   pvcs-silver
Namespace:              default
Resource                Used  Hard
--------                ----  ----
persistentvolumeclaims  0     10
requests.storage        0     20Gi


Name:                   pvcs-copper
Namespace:              default
Resource                Used  Hard
--------                ----  ----
persistentvolumeclaims  0     10
requests.storage        0     30Gi
```

Create a PVC with the volume attributes class "gold". Save the following YAML to a file `gold-vac-pvc.yaml`.

{{% code_sample file="policy/gold-vac-pvc.yaml" %}}

Apply it with `kubectl create`.

```shell
kubectl create -f ./gold-vac-pvc.yaml
```

Verify that the `Used` values for the "gold" volume attributes class quota, `pvcs-gold`, have changed and that the other two quotas are unchanged.

```shell
kubectl describe quota
```

```
Name:                   pvcs-gold
Namespace:              default
Resource                Used  Hard
--------                ----  ----
persistentvolumeclaims  1     10
requests.storage        2Gi   10Gi


Name:                   pvcs-silver
Namespace:              default
Resource                Used  Hard
--------                ----  ----
persistentvolumeclaims  0     10
requests.storage        0     20Gi


Name:                   pvcs-copper
Namespace:              default
Resource                Used  Hard
--------                ----  ----
persistentvolumeclaims  0     10
requests.storage        0     30Gi
```

Once the PVC is bound, you can modify its desired volume attributes class. Change it to "silver" with `kubectl patch`.

```shell
kubectl patch pvc gold-vac-pvc --type='merge' -p '{"spec":{"volumeAttributesClassName":"silver"}}'
```

Verify that the `Used` values for the "silver" volume attributes class quota, `pvcs-silver`, have changed and that `pvcs-copper` is unchanged. Depending on the PVC's status, `pvcs-gold` is either unchanged or released.

```shell
kubectl describe quota
```

```
Name:                   pvcs-gold
Namespace:              default
Resource                Used  Hard
--------                ----  ----
persistentvolumeclaims  1     10
requests.storage        2Gi   10Gi


Name:                   pvcs-silver
Namespace:              default
Resource                Used  Hard
--------                ----  ----
persistentvolumeclaims  1     10
requests.storage        2Gi   20Gi


Name:                   pvcs-copper
Namespace:              default
Resource                Used  Hard
--------                ----  ----
persistentvolumeclaims  0     10
requests.storage        0     30Gi
```

Next, change it to "copper" with `kubectl patch`.

```shell
kubectl patch pvc gold-vac-pvc --type='merge' -p '{"spec":{"volumeAttributesClassName":"copper"}}'
```

Verify that the `Used` values for the "copper" volume attributes class quota, `pvcs-copper`, have changed. Depending on the PVC's status, `pvcs-silver` and `pvcs-gold` are either unchanged or released.

```shell
kubectl describe quota
```

```
Name:                   pvcs-gold
Namespace:              default
Resource                Used  Hard
--------                ----  ----
persistentvolumeclaims  1     10
requests.storage        2Gi   10Gi


Name:                   pvcs-silver
Namespace:              default
Resource                Used  Hard
--------                ----  ----
persistentvolumeclaims  1     10
requests.storage        2Gi   20Gi


Name:                   pvcs-copper
Namespace:              default
Resource                Used  Hard
--------                ----  ----
persistentvolumeclaims  1     10
requests.storage        2Gi   30Gi
```

Print the manifest of the PVC using the following command:

```shell
kubectl get pvc gold-vac-pvc -o yaml
```

It might show the following output:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gold-vac-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
  storageClassName: default
  volumeAttributesClassName: copper
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 2Gi
  currentVolumeAttributesClassName: gold
  phase: Bound
  modifyVolumeStatus:
    status: InProgress
    targetVolumeAttributesClassName: silver
  storageClassName: default
```
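The manifest above explains why all three quotas report usage at this point: the PVC references "copper" in its spec, "gold" as its current class, and "silver" as the in-progress target. The following sketch models that bookkeeping. It assumes that a quota is charged whenever its class appears in any of the three fields, and that after the modification completes only "copper" remains referenced. `QUOTAS` and `charged_quotas` are hypothetical names used for illustration.

```python
# Quota name -> volume attributes class it selects, as in quota-vac.yaml.
QUOTAS = {"pvcs-gold": "gold", "pvcs-silver": "silver", "pvcs-copper": "copper"}


def charged_quotas(spec_vac, current_vac=None, target_vac=None):
    """Return the quotas whose class appears in any of the three PVC fields."""
    referenced = {name for name in (spec_vac, current_vac, target_vac) if name}
    return sorted(quota for quota, vac in QUOTAS.items() if vac in referenced)


# State shown in the manifest: spec=copper, current=gold, target=silver.
print(charged_quotas("copper", current_vac="gold", target_vac="silver"))
# ['pvcs-copper', 'pvcs-gold', 'pvcs-silver']

# Assumed state once the modification has finished: only copper is referenced.
print(charged_quotas("copper", current_vac="copper"))
# ['pvcs-copper']
```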

Wait a moment for the volume modification to complete, then verify the quota again.

```shell
kubectl describe quota
```

```
Name:                   pvcs-gold
Namespace:              default
Resource                Used  Hard
--------                ----  ----
persistentvolumeclaims  0     10
requests.storage        0     10Gi


Name:                   pvcs-silver
Namespace:              default
Resource                Used  Hard
--------                ----  ----
persistentvolumeclaims  0     10
requests.storage        0     20Gi


Name:                   pvcs-copper
Namespace:              default
Resource                Used  Hard
--------                ----  ----
persistentvolumeclaims  1     10
requests.storage        2Gi   30Gi
```

## Requests compared to Limits {#requests-vs-limits}

When allocating compute resources, each container may specify a request and a limit value for either CPU or memory.
12 changes: 12 additions & 0 deletions content/en/examples/policy/gold-vac-pvc.yaml
@@ -0,0 +1,12 @@
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gold-vac-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
  storageClassName:  # change this to the name of the storage class you want to use
  volumeAttributesClassName: gold
42 changes: 42 additions & 0 deletions content/en/examples/policy/quota-vac.yaml
@@ -0,0 +1,42 @@
apiVersion: v1
kind: List
items:
- apiVersion: v1
  kind: ResourceQuota
  metadata:
    name: pvcs-gold
  spec:
    hard:
      requests.storage: "10Gi"
      persistentvolumeclaims: "10"
    scopeSelector:
      matchExpressions:
      - operator: In
        scopeName: VolumeAttributesClass
        values: ["gold"]
- apiVersion: v1
  kind: ResourceQuota
  metadata:
    name: pvcs-silver
  spec:
    hard:
      requests.storage: "20Gi"
Member

But this quota handling doesn't do anything to prevent creation of more than 20Gi volumes that use VAC silver. Why would we mention it? I am confused.

Member Author

It is an arbitrary value with no special meaning. The quota is used in the example above. Do you mean that it would be better to use the same hard setting for all of these quota objects? I'm okay with changing it if you think that is better.

@carlory (Member Author), Mar 14, 2025

When you change the desired VAC and the capacity of an existing PVC, the change is rejected if the new capacity exceeds the quota. I didn't show this case in the example above. The example is meant to show users which quotas change, and which do not, when the PVC is updated.

@gnufied (Member), Mar 24, 2025

So, this looks kinda confusing. If you see scope selector being used here - https://kubernetes.io/docs/concepts/policy/resource-quotas/#resource-quota-per-priorityclass , for example:

apiVersion: v1
kind: List
items:
- apiVersion: v1
  kind: ResourceQuota
  metadata:
    name: pods-high
  spec:
    hard:
      cpu: "1000"
      memory: "200Gi"
      pods: "10"
    scopeSelector:
      matchExpressions:
      - operator: In
        scopeName: PriorityClass
        values: ["high"]

To me, this reads like - this quota will apply to all mentioned fields (cpu, memory and count) of pods in "high" priority class, not just cpu or memory.

I understand that - we chose not to apply quota based on capacity just yet for VACs, but if within same scope selector, capacity is applied differently from count, then that smells like user experience issue (and a bad one at that). cc @msau42 @deads2k @sunnylovestiramisu @xing-yang

Contributor

The question here is that maybe we should block configuring capacity and VAC together in one ResourceQuota? VAC works with Scope but the existing capacity quota does not work with VAC Scope?

Member

So, I was wrong, and maybe something was lost in communication, but the quota as implemented does work for both counting and capacity, and it is scoped to the specified scopeSelector.

For example, given this ResourceQuota:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: silver-pvcs
  namespace: vim1
spec:
  hard:
    count/persistentvolumeclaims: "3"
    requests.storage: 15Gi
  scopeSelector:
    matchExpressions:
    - operator: In
      scopeName: VolumeAttributesClass
      values:
      - silver
status:
  hard:
    count/persistentvolumeclaims: "3"
    requests.storage: 15Gi
  used:
    count/persistentvolumeclaims: "2"
    requests.storage: 13Gi

I could confirm that, even though "count" capacity is still available, if I try to create a PVC that exceeds the remaining 2Gi, I get a quota-related error:

cat csi-pvc.yaml|sed 's/csi-pvc/csi-pvc-4/g'|sed 's/1Gi/4Gi/g'|kubectl create -f -
Error from server (Forbidden): error when creating "STDIN": persistentvolumeclaims "csi-pvc-4-silver" is forbidden: exceeded quota: silver-pvcs, requested: requests.storage=4Gi, used: requests.storage=13Gi, limited: requests.storage=15Gi

I can, however, create PVCs larger than the remaining capacity as long as they don't use the specified VAC. So this is working as expected.

So, tl;dr: quota handling works for both counting and capacity.

Contributor

What happens if we specify Pod Scope and VolumeAttributesClass Scope in one ResourceQuota? Are they both respected?

@carlory (Member Author), Mar 26, 2025

They don't share any standard resource name, so creating such a quota object should fail. I can't foresee whether they will share a resource name in the future (in my mind, they should not). In the quota PR, I didn't add validation for the case where a quota object has both of the above scopes but no resource names. Should I enhance the validation for this case? cc @deads2k

Contributor

Why are we adding "." in the key requests.storage?
This will break JSON path parsing.

@carlory (Member Author), Mar 26, 2025

@tengqm This pattern was introduced in kubernetes/kubernetes@55e3824 and kubernetes/kubernetes@09bac89. Related proposal was here: #19761

      persistentvolumeclaims: "10"
    scopeSelector:
      matchExpressions:
      - operator: In
        scopeName: VolumeAttributesClass
        values: ["silver"]
- apiVersion: v1
  kind: ResourceQuota
  metadata:
    name: pvcs-copper
  spec:
    hard:
      requests.storage: "30Gi"
      persistentvolumeclaims: "10"
    scopeSelector:
      matchExpressions:
      - operator: In
        scopeName: VolumeAttributesClass
        values: ["copper"]