Prioritized Alternatives in Device Requests #128586

mortent · 2024-11-05T18:16:53Z

What type of PR is this?

/kind feature
/kind api-change
/kind deprecation

What this PR does / why we need it:

This implements https://github.com/kubernetes/enhancements/tree/master/keps/sig-scheduling/4816-dra-prioritized-list

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

DRA support for a "one-of" prioritized list of selection criteria to satisfy a device request in a resource claim.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

[KEP] https://github.com/kubernetes/enhancements/tree/master/keps/sig-scheduling/4816-dra-prioritized-list
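
To illustrate the "one-of" prioritized list from the release note above, here is a minimal sketch of walking the alternatives in priority order and taking the first one that can be satisfied. The `Request` and `SubRequest` types and the `pickFirstAvailable` helper are simplified, hypothetical stand-ins invented for this example; they are not the API types or the allocator logic from this PR, which handles far more (selectors, allocation modes, interaction between requests).

```go
package main

import (
	"errors"
	"fmt"
)

// SubRequest is a hypothetical, simplified stand-in for one alternative in a
// request's firstAvailable list. It is not the real API type.
type SubRequest struct {
	Name            string
	DeviceClassName string
	Count           int
}

// Request is a hypothetical stand-in for a device request that carries a
// prioritized list of alternatives.
type Request struct {
	Name           string
	FirstAvailable []SubRequest
}

var errNoAlternative = errors.New("no alternative can be satisfied")

// pickFirstAvailable walks the alternatives in priority order and returns the
// first one whose device class still has enough free devices.
func pickFirstAvailable(req Request, free map[string]int) (SubRequest, error) {
	for _, sub := range req.FirstAvailable {
		if free[sub.DeviceClassName] >= sub.Count {
			return sub, nil
		}
	}
	return SubRequest{}, fmt.Errorf("request %s: %w", req.Name, errNoAlternative)
}

func main() {
	req := Request{
		Name: "gpu",
		FirstAvailable: []SubRequest{
			{Name: "large", DeviceClassName: "big-gpu", Count: 1},
			{Name: "small", DeviceClassName: "small-gpu", Count: 2},
		},
	}
	// No big GPUs are free, so the lower-priority alternative is chosen.
	free := map[string]int{"big-gpu": 0, "small-gpu": 4}
	sub, err := pickFirstAvailable(req, free)
	fmt.Println(sub.Name, err)
}
```

The order of the list is the priority: a later entry is only considered when every earlier entry cannot be satisfied.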

johnbelamaric

super cursory quick look...

pkg/apis/resource/types.go

johnbelamaric · 2024-11-07T18:42:25Z

/assign

dims · 2025-03-05T03:22:15Z

also hold for enough folks to chime in.

/hold

dom4ha

/lgtm from the scheduler perspective

dom4ha · 2025-03-04T08:35:55Z

staging/src/k8s.io/dynamic-resource-allocation/structured/allocator.go

-		slices:             slices,
-		celCache:           celCache,
+		adminAccessEnabled:     adminAccessEnabled,
+		prioritizedListEnabled: prioritizedListEnabled,


Have you considered creating a features structure similar to resourceclaim.Features?

Yeah, I am actually making this change in the PR for Partitionable Devices. I suggest we address this in that PR so we can get this one merged.
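
As a rough sketch of the suggestion above, the individual boolean fields could be grouped into one struct that is passed to the constructor. The `Features` struct, its field names, and `NewAllocator` below are assumptions made for illustration; they are not the shape that the Partitionable Devices PR ends up using.

```go
package main

import "fmt"

// Features is a hypothetical grouping of the feature-dependent switches that
// the allocator currently receives as separate booleans.
type Features struct {
	AdminAccess     bool
	PrioritizedList bool
}

// Allocator is a pared-down stand-in for the real allocator; only the
// feature handling is modeled here.
type Allocator struct {
	features Features
}

// NewAllocator takes one struct instead of a growing list of bool parameters,
// so adding a feature does not change the constructor signature.
func NewAllocator(features Features) *Allocator {
	return &Allocator{features: features}
}

func main() {
	a := NewAllocator(Features{PrioritizedList: true})
	fmt.Printf("%+v\n", a.features)
}
```

Named fields also make call sites self-describing, which positional booleans are not.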

dom4ha · 2025-03-05T14:26:12Z

/cc @sanposhiho @alculquicondor

dom4ha · 2025-03-05T17:39:34Z

/cc @macsko @sanposhiho for sig-scheduling approval

k8s-ci-robot · 2025-03-05T17:39:38Z

@dom4ha: GitHub didn't allow me to request PR reviews from the following users: for, approval.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @macsko @sanposhiho for sig-scheduling approval

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

sanposhiho · 2025-03-06T06:23:50Z

test/integration/scheduler_perf/dra.go

Where's the integration test for this feature? This file was changed to enable the feature, but don't we need a new test case specifically for it?

I'm not sure how useful such a test would be. As agreed upon in kubernetes/enhancements#5077, not everything needs all kinds of tests.

There is an E2E test, which covers the full flow (API, scheduler, etc.).

That might be true for some features, but this feature involves several components. Why don't we need an integration test for it?

Which additional code paths or scenarios would an integration test cover that isn't already covered?

How do we make sure the full scenario works from one component to others (or even more specifically, from one struct to others within the same component)? How do we make sure one component's behavior/output matches the expectations from others?
The integration test is not about test coverage: covering every code path with unit tests does not mean you can skip the integration tests.

The firstAvailable is mostly internal to the plugin. It doesn't change how the plugin interacts with the scheduler (but see below). We still need to test with an apiserver to ensure that the fields flow properly from user to apiserver to scheduler to plugin, which is covered by the E2E test.

The only reason that I can think of for an integration test is the "feature disabled, field is set, PreFilter returns error". That is better tested through an integration test. But IMHO it's not important.

That shouldn't be in scheduler_perf. We can probably add it to test/integration/scheduler/plugins/plugins_test.go.

The happy path could be in scheduler_perf.

Morten is taking some time off right now. To unblock this PR, let me see whether I can do something and address #128586 (comment).

> The only reason that I can think of for an integration test is the "feature disabled, field is set, PreFilter returns error". That is better tested through an integration test. But IMHO it's not important.

Unfortunately that is exactly the scenario that currently cannot be handled in an integration test: apiserver and scheduler have to be brought up in the same process and share the global default feature gate. Without modifying both components it's impossible to bring up the apiserver with the feature enabled and the scheduler without it. It's not supported to change the feature gate during a test run (explicitly prevented by featuregatetesting.SetFeatureGateDuringTest). Perhaps some custom code could do it.

I think this falls under "version skew testing" which is not required for alpha. @sanposhiho: okay to do it later?

Instead, I added some test cases to scheduler_perf and to test/integration/dra, see #130622.

> okay to do it later?

🙆‍♂️

> Instead, I added some test cases to scheduler_perf and to test/integration/dra, see #130622.

Thanks.
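
The constraint discussed in this thread, that components started in one test process share a single global feature gate, can be pictured with a toy model. Everything below (`featureGate`, `defaultGate`, `component`, `start`, and the gate name string) is a hypothetical stand-in written for this sketch; it does not use or reproduce the real feature gate packages.

```go
package main

import "fmt"

// featureGate is a toy stand-in for a process-wide feature gate registry.
type featureGate struct {
	enabled map[string]bool
}

// Enabled reports whether the named feature is switched on.
func (g *featureGate) Enabled(name string) bool { return g.enabled[name] }

// defaultGate plays the role of the single global gate that every component
// in the same process consults.
var defaultGate = &featureGate{enabled: map[string]bool{}}

// component records the feature state it observed when it was started.
type component struct {
	name            string
	prioritizedList bool
}

// start reads the shared gate; there is no per-component override.
func start(name string) component {
	return component{name: name, prioritizedList: defaultGate.Enabled("DRAPrioritizedList")}
}

func main() {
	defaultGate.enabled["DRAPrioritizedList"] = true
	apiserver := start("apiserver")
	scheduler := start("scheduler")
	// Both components see the same value, so "enabled in the apiserver,
	// disabled in the scheduler" cannot be expressed in this setup.
	fmt.Println(apiserver.prioritizedList, scheduler.prioritizedList)
}
```

In this model the skewed configuration would require either a per-component gate or flipping the global gate between the two `start` calls, which is the kind of mid-run change the comment above says is not supported.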

sanposhiho · 2025-03-06T06:32:37Z

pkg/scheduler/framework/plugins/dynamicresources/dynamicresources.go

+						return nil, status
+					}
+				} else {
+					for _, subRequest := range request.FirstAvailable {


do we need to feature-gate this path?

Yes. As in the allocator it should refuse to schedule pods which depend on firstAvailable.

This then raises the question: does the allocator then still need that feature gate check? It should never get called for claims involving firstAvailable when the feature is off.

In other words, move the check up one level?

Perhaps it's better to do it in both places. The allocator might also be used outside of the scheduler.

This is the validation path only, so the only consequence of not having the feature gate here is that PreFilter won't fail with the complaint that request.DeviceClassName is missing (when firstAvailable was present). The later attempt to use firstAvailable should still fail in the allocator with a different error (most likely in Filter).

True. But it's still better to fail earlier. I've implemented that, just need to finish the integration test updates.

Done in #130622.
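
The outcome of this thread, checking the feature gate both during early validation and again in the allocator, might look roughly like the sketch below. The `request` type and the `checkGate`, `preFilter`, and `allocate` functions are hypothetical simplifications for illustration; they are not the code from this PR or from #130622.

```go
package main

import (
	"errors"
	"fmt"
)

// request is a hypothetical, simplified stand-in for a device request.
type request struct {
	Name            string
	DeviceClassName string
	FirstAvailable  []string // names of alternatives; the real type is richer
}

var errPrioritizedListDisabled = errors.New("prioritized list feature is disabled")

// checkGate rejects requests that use firstAvailable while the feature is off.
func checkGate(req request, enabled bool) error {
	if len(req.FirstAvailable) > 0 && !enabled {
		return fmt.Errorf("request %s: %w", req.Name, errPrioritizedListDisabled)
	}
	return nil
}

// preFilter models the early validation step: it fails fast on the gate and
// otherwise requires a device class name when no alternatives are given.
func preFilter(reqs []request, enabled bool) error {
	for _, req := range reqs {
		if err := checkGate(req, enabled); err != nil {
			return err
		}
		if len(req.FirstAvailable) == 0 && req.DeviceClassName == "" {
			return fmt.Errorf("request %s: device class name is missing", req.Name)
		}
	}
	return nil
}

// allocate models the allocator, which repeats the gate check because it may
// be called from outside the scheduler. Device selection itself is omitted.
func allocate(reqs []request, enabled bool) error {
	for _, req := range reqs {
		if err := checkGate(req, enabled); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	reqs := []request{{Name: "gpu", FirstAvailable: []string{"large", "small"}}}
	fmt.Println("preFilter, gate off:", preFilter(reqs, false))
	fmt.Println("preFilter, gate on: ", preFilter(reqs, true))
	fmt.Println("allocate, gate off: ", allocate(reqs, false))
}
```

Sharing one helper keeps the two call sites consistent while still letting the earlier stage report the problem first.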

sanposhiho · 2025-03-06T06:35:43Z

staging/src/k8s.io/dynamic-resource-allocation/structured/allocator.go

-				return nil, fmt.Errorf("claim %s, request %s: admin access is requested, but the feature is disabled", klog.KObj(claim), request.Name)
+			// Error out if the prioritizedList feature is not enabled and the request
+			// has subrequests. This is to avoid surprising behavior for users.
+			if !a.prioritizedListEnabled && hasSubRequests {


What if users follow the enable -> disable path? That is, a user created a claim with FirstAvailable while the feature gate was enabled, and the feature gate was then disabled for some reason. It looks like they must update all claims to remove FirstAvailable before they can disable the gate.
Is there any existing behavior that FirstAvailable requests can fall back to?

> Is there any existing behavior that FirstAvailable requests can fall back to?

No. All that Kubernetes can do is to not behave badly. Ignoring the sub-request and scheduling the pod without the devices that it needs is worse than refusing to schedule with an explanation why.

As you said, the user then needs to take action and either define their workload differently or accept that the cluster cannot run it.

Then, is there any way to make it visible to users? (instead of logs)

Refusal to deal with the request would cause PreFilter to fail and thus would get reported to the user as a scheduling failure.

But only assuming the comment #128586 (comment) is addressed. Without it the validation will pass in PreFilter and a pod will most likely remain unschedulable (fail in Filter), which might be acceptable as well.

#130622 adds that, so with that PR it'll fail already in PreFilter.

pohly

I don't know whether @mortent is available to take my commits from #130622 into his branch for this PR. I therefore prefer merging that as a follow-up.

@sanposhiho: are you okay with that? I think you can do the formal LGTM.

I believe everything else has been reviewed (@dom4ha in #128586 (review) for scheduler, Tim in #128586 (review) for the API). It also looks good to me.

pohly · 2025-03-06T19:01:24Z

pkg/scheduler/framework/plugins/dynamicresources/dynamicresources.go

+						return nil, status
+					}
+				} else {
+					for _, subRequest := range request.FirstAvailable {


Done in #130622.

pohly · 2025-03-06T19:02:43Z

staging/src/k8s.io/dynamic-resource-allocation/structured/allocator.go

-				return nil, fmt.Errorf("claim %s, request %s: admin access is requested, but the feature is disabled", klog.KObj(claim), request.Name)
+			// Error out if the prioritizedList feature is not enabled and the request
+			// has subrequests. This is to avoid surprising behavior for users.
+			if !a.prioritizedListEnabled && hasSubRequests {


#130622 adds that, so with that PR it'll fail already in PreFilter.

pohly · 2025-03-06T19:09:00Z

test/integration/scheduler_perf/dra.go

> The only reason that I can think of for an integration test is the "feature disabled, field is set, PreFilter returns error". That is better tested through an integration test. But IMHO it's not important.

Unfortunately that is exactly the scenario that currently cannot be handled in an integration test: apiserver and scheduler have to be brought up in the same process and share the global default feature gate. Without modifying both components it's impossible to bring up the apiserver with the feature enabled and the scheduler without it. It's not supported to change the feature gate during a test run (explicitly prevented by featuregatetesting.SetFeatureGateDuringTest). Perhaps some custom code could do it.

I think this falls under "version skew testing" which is not required for alpha. @sanposhiho: okay to do it later?

Instead, I added some test cases to scheduler_perf and to test/integration/dra, see #130622.

thockin · 2025-03-06T19:14:27Z

/approve

k8s-ci-robot · 2025-03-06T19:14:44Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dims, johnbelamaric, mortent, thockin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~api/OWNERS~~ [thockin]
~~cmd/kube-controller-manager/OWNERS~~ [dims,thockin]
~~pkg/apis/OWNERS~~ [thockin]
~~pkg/controller/resourceclaim/OWNERS~~ [dims,thockin]
~~pkg/features/OWNERS~~ [dims,thockin]
~~pkg/generated/openapi/OWNERS~~ [dims,thockin]
~~pkg/quota/v1/OWNERS~~ [dims,thockin]
~~pkg/registry/OWNERS~~ [dims,thockin]
~~pkg/scheduler/OWNERS~~ [dims,thockin]
~~staging/src/k8s.io/api/OWNERS~~ [thockin]
~~staging/src/k8s.io/client-go/applyconfigurations/OWNERS~~ [dims,thockin]
~~staging/src/k8s.io/dynamic-resource-allocation/OWNERS~~ [dims,thockin]
~~test/OWNERS~~ [dims,thockin]
~~test/featuregates_linter/test_data/OWNERS~~ [dims,thockin]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

sanposhiho

/lgtm
/approve

sig-scheduling.

note: Some of my comments aren't addressed within this PR, but #130622 will address those. Also, those points are, either way, not super critical ones that we must include in the same PR.

k8s-ci-robot · 2025-03-07T02:57:48Z

LGTM label has been added.

Git tree hash: 8942038fe5779e85d2db34003055e21edc7d9c1e

johnbelamaric · 2025-03-07T03:00:54Z

/hold cancel

pohly · 2025-03-07T07:46:36Z

> Some of my comments aren't addressed within this PR, but #130622 will address those. Also, those points are, either way, not super critical ones that we must include in the same PR.

Thanks @sanposhiho for your review! I've rebased #130622, let's continue there.

k8s-ci-robot requested review from apelisse and bart0sh November 5, 2024 18:17

johnbelamaric reviewed Nov 5, 2024

pkg/apis/resource/types.go (resolved)

pkg/apis/resource/types.go (outdated, resolved)

mortent force-pushed the DRAPrioritizedList branch from 31aaf8f to 19ab559 Compare November 5, 2024 23:53

k8s-ci-robot added and removed the needs-rebase label (indicates a PR cannot be merged because it has merge conflicts with HEAD) Nov 5, 2024

k8s-ci-robot assigned johnbelamaric Nov 7, 2024

mortent force-pushed the DRAPrioritizedList branch from 19ab559 to e49122d Compare November 8, 2024 02:24

k8s-ci-robot added the size/XXL label (denotes a PR that changes 1000+ lines, ignoring generated files) and removed the size/XL label (denotes a PR that changes 500-999 lines, ignoring generated files) Nov 8, 2024

mortent force-pushed the DRAPrioritizedList branch from e49122d to fa17f45 Compare November 8, 2024 03:11

k8s-ci-robot added the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) Mar 5, 2025

dom4ha reviewed Mar 5, 2025

k8s-ci-robot requested review from alculquicondor and sanposhiho March 5, 2025 14:26

k8s-ci-robot requested a review from macsko March 5, 2025 17:39

cici37 pushed a commit to cici37/kubernetes that referenced this pull request Mar 5, 2025

Commits from kubernetes#128586

3bc083e

sanposhiho reviewed Mar 6, 2025

pohly mentioned this pull request Mar 6, 2025

DRA: Prioritized Alternatives in Device Requests, II #130622

Merged

pohly reviewed Mar 6, 2025

k8s-ci-robot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) Mar 6, 2025

sanposhiho approved these changes Mar 7, 2025

k8s-ci-robot assigned sanposhiho Mar 7, 2025

k8s-ci-robot added the lgtm label ("Looks good to me", indicates that a PR is ready to be merged) Mar 7, 2025

k8s-ci-robot removed the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) Mar 7, 2025

k8s-ci-robot merged commit 9d45ea8 into kubernetes:master Mar 7, 2025
19 of 20 checks passed

github-project-automation bot moved this from PRs Waiting on Author to Done in SIG Node CI/Test Board Mar 7, 2025

github-project-automation bot moved this from Tracked to Done in [sig-release] Bug Triage Mar 7, 2025

github-project-automation bot moved this from In Progress to Done in SIG Apps Mar 7, 2025

This was referenced Mar 10, 2025

[Flaking Test] UT TestRoundTripTypes for DeviceRequest related #130674

Closed

fix a flake of TestRoundTripTypes: for FirstAvailable[].AllocationMode #130675

Merged

johnbelamaric moved this from 👀 In review to ✅ Done in SIG Node: Dynamic Resource Allocation Mar 11, 2025
