Support Cascade Delete When Removing Karpenter from my Cluster #1040
Comments
Building a coalition of supporters for this idea takes effort, but may pay off really well.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules: after 90 days of inactivity the issue is marked stale, after 30 more days it is marked rotten, and after 30 more days of inactivity it is closed. You can mark this issue as fresh with /remove-lifecycle stale, close it with /close, or offer to help out with issue triage. Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules: after 90 days of inactivity the issue is marked stale, after 30 more days it is marked rotten, and after 30 more days of inactivity it is closed. You can mark this issue as fresh with /remove-lifecycle rotten, close it with /close, or offer to help out with issue triage. Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues according to the following rules: after 90 days of inactivity the issue is marked stale, after 30 more days it is marked rotten, and after 30 more days of inactivity it is closed. You can reopen this issue with /reopen, mark it as fresh with /remove-lifecycle rotten, or offer to help out with issue triage. Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned". In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/reopen
@jonathan-innis: Reopened this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/remove-lifecycle rotten
/triage accepted
Discussed this in WG today: the consensus was that folks, in general, still want the ability to have graceful termination of their nodes -- so they don't want Karpenter to always do a forceful termination of all the nodes on their behalf. There are currently workarounds with the TerminationGracePeriod implementation that would allow users to start the teardown of Karpenter's CRDs, have the NodeClaims start to be terminated, and then have a user or automation annotate all of the nodes with the termination timestamp. In the case that you want forceful termination, you could mark the timestamp to be the current time, and then everything should start forcefully removing itself, with the instances that were launched by Karpenter torn down as well.
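As a rough illustration of that workaround (not project tooling), here is a minimal client-go sketch that stamps every Karpenter-managed node with a termination timestamp set to "now". It assumes the nodes carry the `karpenter.sh/nodepool` label and that the deadline is conveyed via a node annotation; the annotation key used here is an assumption for illustration only, so check the docs for the key your Karpenter version actually honors.

```go
// Sketch only: force-expire the drain deadline on all Karpenter-managed nodes
// by stamping a termination-timestamp annotation with the current time.
package main

import (
	"context"
	"fmt"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

// Assumed annotation key -- verify against your Karpenter version.
const terminationAnnotation = "karpenter.sh/nodeclaim-termination-timestamp"

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	// Only nodes launched by Karpenter carry the nodepool label.
	nodes, err := client.CoreV1().Nodes().List(context.TODO(), metav1.ListOptions{
		LabelSelector: "karpenter.sh/nodepool",
	})
	if err != nil {
		panic(err)
	}

	now := time.Now().UTC().Format(time.RFC3339)
	for _, node := range nodes.Items {
		patch := fmt.Sprintf(`{"metadata":{"annotations":{%q:%q}}}`, terminationAnnotation, now)
		if _, err := client.CoreV1().Nodes().Patch(
			context.TODO(), node.Name, types.MergePatchType, []byte(patch), metav1.PatchOptions{},
		); err != nil {
			panic(err)
		}
		fmt.Printf("annotated %s\n", node.Name)
	}
}
```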
Description
What problem are you trying to solve?
I'd like to be able to configure cascading delete behavior for Karpenter so that I can set values on NodePool deletion or on CRD deletion that convey to Karpenter that I want a more expedited termination of my nodes rather than waiting for all nodes to fully drain.
Right now, it's possible for nodes to hang due to stuck pods or fully blocking PDBs under our graceful drain logic. Because a NodePool deletion or CRD deletion causes all the nodes to gracefully drain, it's also possible for these deletion operations to hang, halting the whole process. Ideally, a user could send through something like `--grace-period` when they are deleting a resource, and Karpenter could reason about how to pass that down to all resources that the deletion cascades to. Minimally, we should allow CRD deletions to get unblocked so that cluster operators can uninstall Karpenter from clusters without being blocked by graceful node drains that may hang.
An initial implementation of this was tried in #466, and there was some discussion in the community about enabling the ability to pass `gracePeriod` through to CRs in the same way that you can pass it through to pods today, affecting the `deletionTimestamp` for a CR and allowing controller authors to build custom logic around this `gracePeriod` concept.
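To make the ask concrete, here is a hedged sketch (an illustration, not an agreed design) of how a client already passes `gracePeriodSeconds` when deleting a pod, next to the same `DeleteOptions` being sent with a NodePool deletion. The NodePool group/version shown is illustrative and depends on the Karpenter release; today the grace period only has an effect for pods, and the ask is for controllers like Karpenter to be able to observe and honor it on their CRs.

```go
// Sketch only: the same DeleteOptions shape used for pods today, sent with a
// CR deletion. For pods the kubelet honors gracePeriodSeconds; for custom
// resources it currently has no effect -- the ask is to let controllers see
// and act on it (e.g. to bound how long node drains may block a cascade).
package main

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}

	grace := int64(30)
	opts := metav1.DeleteOptions{GracePeriodSeconds: &grace}

	// Pods: graceful deletion is implemented, so this bounds the pod's shutdown.
	clientset := kubernetes.NewForConfigOrDie(config)
	if err := clientset.CoreV1().Pods("default").Delete(context.TODO(), "my-pod", opts); err != nil {
		panic(err)
	}

	// NodePool CR: the options are accepted by the API server, but nothing
	// propagates the grace period to the controller today. (Group/version are
	// illustrative here.)
	dyn := dynamic.NewForConfigOrDie(config)
	nodePools := schema.GroupVersionResource{Group: "karpenter.sh", Version: "v1beta1", Resource: "nodepools"}
	if err := dyn.Resource(nodePools).Delete(context.TODO(), "default", opts); err != nil {
		panic(err)
	}
}
```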