Karpenter taints nodes as un-schedulable too early causing them to be unusable and scaled down #1421
Comments
This issue is currently awaiting triage. If Karpenter contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Pod nomination only excludes a node from disruption for that reconciliation cycle. If by the time the next cycle rolls around the node is no longer nominated, it will be considered as a candidate. Without anything to show otherwise, I assume that is what's happening here. If you're able to provide a full set of Karpenter logs, that would be useful in determining exactly what is happening.

As for tainting, Karpenter will only taint a node once it has decided to make a disruption decision. The only reason it may bail from this decision and un-taint the node later is if we failed to launch or register the replacement node. All #1180 did was reduce the gap in time between validation and tainting the node; if anything, it made it more likely we wouldn't taint the node in the first place.

Your reproduction steps are to schedule pods and then delete them, which is followed by Karpenter consolidating the nodes, correct? This sounds like expected behavior to me: if there are no longer any pods to schedule against that node, we're going to remove it.
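For context on the tainting described above, here is a minimal sketch of what the disruption taint looks like on a node's spec once Karpenter has committed to a disruption decision. The key and value below are assumed from the v1beta1 API naming used by v0.37.x, not taken from this issue's node dumps:

```yaml
# Hedged illustration (assumed v1beta1 taint naming, not this issue's cluster):
# the NoSchedule taint Karpenter applies to a node it has decided to disrupt.
# While this taint is present, new pods will not schedule onto the node.
spec:
  taints:
    - key: karpenter.sh/disruption
      value: disrupting
      effect: NoSchedule
```

Comparing the timestamp at which this taint appears with the DisruptionBlocked event on the same node is one way to confirm the ordering being discussed here.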
@jmdeal I think the details matter here. If you look at the timing of the events on the node, you can see that the node is almost immediately tainted after it is created.

If we zoom back a little, the nodes are provisioned in response to pods scheduling on them. If a node is provisioned and immediately killed, this looks like either:
1. the initial estimation was wrong and we didn't need the node in the first place, or
2. there is some issue (of a race-condition nature) that is causing the node to go down even though it is needed.

Based on the scaling behaviour I have observed here (spinning up another node at the exact time the current node is being scaled down), the second issue is more likely to be the case. It might be alleviated once this PR is released, so folks will be able to make it happen less often, but I think the underlying issue might be something more serious.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Description
Karpenter taints nodes as unschedulable too early, causing them to be unusable and scaled down, especially when pods are being added and removed.

Please note the time differences between the DisruptionBlocked and NodeNotSchedulable events below. If a node is picked for a pod, it logically should not be tainted.

The issue might have been introduced in PR #1180.
Observed Behavior:
Karpenter taints the node too early (as unschedulable) and then won't deprovision it since it is a candidate for a pod.
Sample events for node 1:

Node 2:

Karpenter logs for the second node:
Expected Behavior:
Karpenter should not taint a node as unschedulable if a pod is going to be scheduled on it. This is causing unnecessary scaling issues.
Reproduction Steps (Please include YAML):
Create a NodePool with consolidation enabled. Scheduling pods and then removing them causes nodes to be created and then tainted immediately; a hedged sketch of such a NodePool is shown below.
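The original NodePool manifest is not included in the report, so the following is only a minimal sketch of a v1beta1 NodePool (the API version used by Karpenter v0.37.x) with consolidation enabled; the name, nodeClassRef, and requirements are illustrative assumptions:

```yaml
# Hedged sketch only: a minimal v1beta1 NodePool with consolidation enabled.
# The name, nodeClassRef, and requirements are illustrative assumptions,
# not the reporter's actual configuration.
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: example
spec:
  disruption:
    # Consolidation is the path that applies the disruption taint discussed
    # in this issue once Karpenter decides a node is no longer needed.
    consolidationPolicy: WhenUnderutilized
  template:
    spec:
      nodeClassRef:
        name: default            # assumed EC2NodeClass name
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
```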
Versions:
- Karpenter: v0.37.0
- Kubernetes (kubectl version):