Add a detailed metric for deprovisioning eligible nodes #695
Labels
deprovisioning: Issues related to node deprovisioning
help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
kind/feature: Categorizes issue or PR as related to a new feature.
metrics-audit
Tell us about your request
Currently Karpenter exposes
karpenter_deprovisioning_eligible_machines
as a total count of nodes that are eligible for deprovisioning, broken down by deprovisioner type (e.g. consolidation/emptiness).
This is good: I can easily tell that a cluster has nodes that could be deprovisioned.
What's missing is
a) Which nodes?
b) What's blocking deprovisioning (if anything)?
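To make the ask concrete, here is a rough sketch of what a per-node series could look like in Prometheus text exposition format. The metric name `karpenter_deprovisioning_eligible_node_info` and the `blocked_by` label are made up for illustration; Karpenter does not expose them today, and the real shape would be up to the maintainers.

```python
from dataclasses import dataclass

# Hypothetical metric name, used only for illustration.
# This is NOT a metric Karpenter exposes today.
METRIC = "karpenter_deprovisioning_eligible_node_info"


@dataclass
class NodeEligibility:
    node: str           # node name (answers "which nodes?")
    deprovisioner: str  # e.g. "consolidation" or "emptiness"
    blocked_by: str     # hypothetical reason label; "" means nothing is blocking


def render(samples):
    """Render one gauge sample per node in Prometheus text exposition format.

    Assumes label values contain no quotes, backslashes, or newlines,
    so no escaping is performed.
    """
    lines = [f"# TYPE {METRIC} gauge"]
    for s in samples:
        labels = (
            f'node="{s.node}",'
            f'deprovisioner="{s.deprovisioner}",'
            f'blocked_by="{s.blocked_by}"'
        )
        lines.append(f"{METRIC}{{{labels}}} 1")
    return "\n".join(lines)
```

With something like this, a dashboard could group by `blocked_by` to show how many nodes are held back for each reason, and an alert could ignore the reasons I don't care about.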
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
Ideally I want a dashboard and/or alert that says "Hey, there are X nodes in your cluster that could be deprovisioned but aren't because of Y."
I can choose to ignore it if that's because they are spot nodes, or because replacing them would take more nodes, etc.
But I could also easily see that if I manually migrate a do-not-evict pod, I can free up a node to be consolidated.
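For the do-not-evict case specifically, a small script can already list the pods that pin each node. This is only a sketch of a stopgap, not the requested feature: it assumes the pod list comes from `kubectl get pods --all-namespaces -o json` and that pods are marked with the `karpenter.sh/do-not-evict: "true"` annotation. The function name is my own.

```python
from collections import defaultdict

# Pod annotation assumed to mark pods that must not be evicted.
DO_NOT_EVICT = "karpenter.sh/do-not-evict"


def blocking_pods_by_node(pod_list):
    """Map node name -> ["namespace/name", ...] for do-not-evict pods.

    `pod_list` is the parsed JSON of `kubectl get pods --all-namespaces -o json`.
    Pods that are not yet scheduled (no spec.nodeName) are skipped.
    """
    blocked = defaultdict(list)
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        annotations = meta.get("annotations") or {}
        node = pod.get("spec", {}).get("nodeName")
        if node and annotations.get(DO_NOT_EVICT) == "true":
            blocked[node].append(f'{meta.get("namespace")}/{meta.get("name")}')
    return dict(blocked)
```

To use it, parse the kubectl output with `json.load(sys.stdin)` and pass the result in. It only covers one blocking reason, which is part of why a metric from Karpenter itself would be better.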
Are you currently working around this issue?
Manually going through every node (with some educated guessing based on node resource consumption) and running
kubectl describe
to see why that node is or is not consolidatable.
Additional Context
A label on the node would be helpful too, or instead. My preference would be for a metric, but if you could at least do
kubectl get nodes -l karpenter.sh/deprovisioning-eligible
that would be an improvement!
Attachments
No response