Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prioritized Alternatives in Device Requests #128586

Merged
merged 10 commits into from
Mar 7, 2025

Conversation

mortent
Copy link
Member

@mortent mortent commented Nov 5, 2024

What type of PR is this?

/kind feature
/kind api-change
/kind deprecation

What this PR does / why we need it:

This implements https://github.com/kubernetes/enhancements/tree/master/keps/sig-scheduling/4816-dra-prioritized-list

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

DRA support for a "one-of" prioritized list of selection criteria to satisfy a device request in a resource claim.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/feature Categorizes issue or PR as related to a new feature. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. area/code-generation area/test sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/testing Categorizes an issue or PR as relevant to SIG Testing. wg/device-management Categorizes an issue or PR as relevant to WG Device Management. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Nov 5, 2024
Copy link
Member

@johnbelamaric johnbelamaric left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super cursory quick look...

@mortent mortent force-pushed the DRAPrioritizedList branch from 31aaf8f to 19ab559 Compare November 5, 2024 23:53
@k8s-ci-robot k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Nov 5, 2024
@johnbelamaric
Copy link
Member

/assign

@mortent mortent force-pushed the DRAPrioritizedList branch from 19ab559 to e49122d Compare November 8, 2024 02:24
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Nov 8, 2024
@mortent mortent force-pushed the DRAPrioritizedList branch from e49122d to fa17f45 Compare November 8, 2024 03:11
@dims
Copy link
Member

dims commented Mar 5, 2025

also hold for enough folks to chime in.

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 5, 2025
Copy link
Member

@dom4ha dom4ha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm from the scheduler perspective

slices: slices,
celCache: celCache,
adminAccessEnabled: adminAccessEnabled,
prioritizedListEnabled: prioritizedListEnabled,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered creating features structure similar to the resourceclaim.Features?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I am actually making this change in the PR for Partitionable Devices. I suggest we address this in that PR so we can get this one merged.

@dom4ha
Copy link
Member

dom4ha commented Mar 5, 2025

/cc @sanposhiho @alculquicondor

@dom4ha
Copy link
Member

dom4ha commented Mar 5, 2025

/cc @macsko @sanposhiho for sig-scheduling approval

@k8s-ci-robot k8s-ci-robot requested a review from macsko March 5, 2025 17:39
@k8s-ci-robot
Copy link
Contributor

@dom4ha: GitHub didn't allow me to request PR reviews from the following users: for, approval.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @macsko @sanposhiho for sig-scheduling approval

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

cici37 pushed a commit to cici37/kubernetes that referenced this pull request Mar 5, 2025
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where's the integration test for this feature? This file got the change to enable it, but don't we need any new test case specifically?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how useful such a test would be. As agreed upon in kubernetes/enhancements#5077, not everything needs all kinds of tests.

There is an E2E test, which covers the full flow (API, scheduler, etc.).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be Yes for some features, but this feature could be achieved with different components. Why don't we need an integration test for it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which additional code paths or scenarios would an integration test cover that isn't already covered?

Copy link
Member

@sanposhiho sanposhiho Mar 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we make sure the full scenario works from one component to others (or even more specifically, from one struct to others within the same component)? How do we make sure one component's behavior/output matches the expectations from others?
The integration test is not for the test coverage. It's not that once unit tests cover all code path, you can skip the integration tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The firstAvailable is mostly internal to the plugin. It doesn't change how the plugin interacts with the scheduler (but see below). We still need to test with an apiserver to ensure that the fields flow properly from user to apiserver to scheduler to plugin, which is covered by the E2E test.

The only reason that I can think of for an integration test is the "feature disabled, field is set, PreFilter returns error". That is better tested through an integration test. But IMHO it's not important.

That shouldn't be in scheduler_perf. We can probably add it to test/integration/scheduler/plugins/plugins_test.go.

The happy path could be in scheduler_perf.

Morten is taking some time off right now. To unblock this PR, let me see whether I can do something and address #128586 (comment).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only reason that I can think of for an integration test is the "feature disabled, field is set, PreFilter returns error". That is better tested through an integration test. But IMHO it's not important.

Unfortunately that is exactly the scenario that currently cannot be handled in an integration test: apiserver and scheduler have to be brought up in the same process and share the global default feature gate. Without modifying both components it's impossible to bring up the apiserver with the feature enabled and the scheduler without it. It's not supported to change the feature gate during a test run (explicitly prevented by featuregatetesting.SetFeatureGateDuringTest). Perhaps some custom code could do it.

I think this falls under "version skew testing" which is not required for alpha. @sanposhiho: okay to do it later?

Instead, I added some test cases to scheduler_perf and to test/integration/dra, see #130622.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay to do it later?

🙆‍♂️

Instead, I added some test cases to scheduler_perf and to test/integration/dra, see #130622.

Thanks.

return nil, status
}
} else {
for _, subRequest := range request.FirstAvailable {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to feature-gate this path?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. As in the allocator it should refuse to schedule pods which depend on firstAvailable.

This then raises the question: does the allocator then still need that feature gate check? It should never get called for claims involving firstAvailable when the feature is off.

In other words, move the check up one level?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps it's better to do it in both places. The allocator might also be used outside of the scheduler.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the validation path only, so the only consequence of not having the feature-gate here is that PreFilter won't fail with the complain that request.DeviceClassName is missing (when firstAvailable was present) in PreFilter. The further attempt to use firstAvailable should still fail in allocator with a different error (most likely in Filter).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. But it's still better to fail earlier. I've implemented that, just need to finish the integration test updates.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in #130622.

return nil, fmt.Errorf("claim %s, request %s: admin access is requested, but the feature is disabled", klog.KObj(claim), request.Name)
// Error out if the prioritizedList feature is not enabled and the request
// has subrequests. This is to avoid surprising behavior for users.
if !a.prioritizedListEnabled && hasSubRequests {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if users followed enable -> disable path? i.e., a user created it with FirstAvailable with the feature gate enabled, and then disable the feature gate for some reason. It looks like they must update all claims not to have FirstAvailable when they need to disable the gate.
Is there any existing behavior that FirstAvailable requests can fall back to?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any existing behavior that FirstAvailable requests can fall back to?

No. All that Kubernetes can do is to not behave badly. Ignoring the sub-request and scheduling the pod without the devices that it needs is worse than refusing to schedule with an explanation why.

As you said, the user then needs to take action and either define their workload differently or accept that the cluster cannot run it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then, is there any way to make it visible to users? (instead of logs)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refusal to deal with the request would cause PreFilter to fail and thus would get reported to the user as a scheduling failure.

Copy link
Member

@dom4ha dom4ha Mar 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But only assuming the comment #128586 (comment) is addressed. Without it the validation will pass in PreFilter and a pod will most likely remain unschedulable (fail in Filter), which might be acceptable as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#130622 adds that, so with that PR it'll fail already in PreFilter.

Copy link
Contributor

@pohly pohly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know whether @mortent is available to take my commits from #130622 into his branch for this PR. I therefore prefer merging that as a follow-up.

@sanposhiho: are you okay with that? I think you can do the formal LGTM.

I believe everything else has been reviewed (@dom4ha in #128586 (review) for scheduler, Tim in #128586 (review) for the API). It also looks good to me.

return nil, status
}
} else {
for _, subRequest := range request.FirstAvailable {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in #130622.

return nil, fmt.Errorf("claim %s, request %s: admin access is requested, but the feature is disabled", klog.KObj(claim), request.Name)
// Error out if the prioritizedList feature is not enabled and the request
// has subrequests. This is to avoid surprising behavior for users.
if !a.prioritizedListEnabled && hasSubRequests {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#130622 adds that, so with that PR it'll fail already in PreFilter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only reason that I can think of for an integration test is the "feature disabled, field is set, PreFilter returns error". That is better tested through an integration test. But IMHO it's not important.

Unfortunately that is exactly the scenario that currently cannot be handled in an integration test: apiserver and scheduler have to be brought up in the same process and share the global default feature gate. Without modifying both components it's impossible to bring up the apiserver with the feature enabled and the scheduler without it. It's not supported to change the feature gate during a test run (explicitly prevented by featuregatetesting.SetFeatureGateDuringTest). Perhaps some custom code could do it.

I think this falls under "version skew testing" which is not required for alpha. @sanposhiho: okay to do it later?

Instead, I added some test cases to scheduler_perf and to test/integration/dra, see #130622.

@thockin
Copy link
Member

thockin commented Mar 6, 2025

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dims, johnbelamaric, mortent, thockin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 6, 2025
Copy link
Member

@sanposhiho sanposhiho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

sig-scheduling.

note: Some of my comments aren't addressed within this PR, but #130622 will address those. Also, those points are, either way, not super critical ones that we must include in the same PR.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 7, 2025
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 8942038fe5779e85d2db34003055e21edc7d9c1e

@johnbelamaric
Copy link
Member

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 7, 2025
@k8s-ci-robot k8s-ci-robot merged commit 9d45ea8 into kubernetes:master Mar 7, 2025
19 of 20 checks passed
@github-project-automation github-project-automation bot moved this from PRs Waiting on Author to Done in SIG Node CI/Test Board Mar 7, 2025
@github-project-automation github-project-automation bot moved this from Tracked to Done in [sig-release] Bug Triage Mar 7, 2025
@github-project-automation github-project-automation bot moved this from In Progress to Done in SIG Apps Mar 7, 2025
@pohly
Copy link
Contributor

pohly commented Mar 7, 2025

Some of my comments aren't addressed within this PR, but #130622 will address those. Also, those points are, either way, not super critical ones that we must include in the same PR.

Thanks @sanposhiho for your review! I've rebased #130622, let's continue there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-review Categorizes an issue or PR as actively needing an API review. approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver area/code-generation area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/etcd Categorizes an issue or PR as relevant to SIG Etcd. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on. wg/device-management Categorizes an issue or PR as relevant to WG Device Management.
Projects
Status: Assigned
Archived in project
Archived in project
Archived in project
Development

Successfully merging this pull request may close these issues.