[question] Inconsistent computation of production and non-production resources in the load_aware plugin of the koord-scheduler #2317
Open · ditingdapeng opened this issue on Jan 9, 2025 · 4 comments
What happened:
When using Koordinator's load-aware scheduler plugin, I found that when hotspot issues occur, many nodes exceeding the utilization threshold still reach the scoring phase before non-production pods are bound to nodes. The expected behavior is that highly utilized nodes are filtered out during the filter phase.
With this concern in mind, I reviewed the filter code and found an inconsistency in how production and non-production resources are calculated within the filter. As a result, node resources are underestimated when scheduling non-production pods. Specific references can be found below.
I am unsure whether this discrepancy is an intentional part of the design, and I look forward to your response!
What you expected to happen:
The code works as follows: when calculating the total node resources, it subtracts the estimated usage of abnormal (in-flight) pods, whereas when calculating production resources, it adds that same estimated usage.
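If I read the code correctly, the two paths can be sketched like this. This is a minimal Python sketch with illustrative names and numbers; the actual plugin is written in Go, and `allocatable`, `node_usage`, and `estimated_in_flight` are my placeholders, not the real identifiers:

```python
# Minimal sketch of the asymmetry described above. All names and numbers
# are illustrative placeholders, not the actual koord-scheduler code.

def free_on_total_path(allocatable, node_usage, estimated_in_flight):
    # Total-node path: the estimated usage of in-flight pods is
    # subtracted from the node's remaining resources.
    return allocatable - node_usage - estimated_in_flight

def used_on_prod_path(prod_pod_usage, estimated_in_flight):
    # Production path: the same estimated usage is added on top of
    # the measured production pod usage.
    return prod_pod_usage + estimated_in_flight

# Example: 100 units allocatable, 60 reported node usage (40 of which is
# production pods), plus 10 units of estimated in-flight usage.
print(free_on_total_path(100, 60, 10))  # 30 left on the total path
print(used_on_prod_path(40, 10))        # 50 counted on the prod path
```

On this reading, the in-flight estimate shrinks the free capacity on one path and inflates the used amount on the other, which is what made me suspect non-production resources are underestimated.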
I also noticed the same issue while using the load_aware plugin in Koordinator. Specifically, the inconsistency in resource computation between production and non-production workloads has been a point of confusion for me as well.
As you mentioned, I would have expected nodes exceeding the threshold to be filtered out during the filter phase, but it seems they are still considered during the scoring phase for non-production pods. This behavior can sometimes lead to unexpected scheduling results.
I’m also curious if this is an intentional design decision or if there might be room for improvement in the computation logic. Looking forward to insights from the maintainers or contributors on this matter!
@ditingdapeng Please note that nodeUsage >= sum(podUsage) because of basic node-level overhead. The assignedPodEstimatedUsed term is mainly for in-flight pods, including both abnormal pods that are not reported in the NodeMetric status and normal pods that were just assigned and do not yet have a valid pod metric. So there is no certain underestimation for non-Prod pods when comparing the given formula terms. In any case, it is still an interesting topic. How about joining the bi-weekly community meeting to discuss it together?
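To illustrate the point with made-up numbers (a Python sketch; the values and variable names are hypothetical, and only nodeUsage, sum(podUsage), and assignedPodEstimatedUsed correspond to terms discussed above):

```python
# Illustrative numbers only; real values come from the NodeMetric status.
pod_usage_sum = 60       # sum of per-pod usage reported in NodeMetric
overhead = 8             # node-level overhead (kubelet, runtime, system)
node_usage = pod_usage_sum + overhead   # hence nodeUsage >= sum(podUsage)

# assignedPodEstimatedUsed covers in-flight pods: abnormal pods missing
# from NodeMetric plus just-assigned pods without a valid pod metric yet.
assigned_pod_estimated_used = 10

allocatable = 100
free_total = allocatable - node_usage - assigned_pod_estimated_used

# The estimate stands in for real usage that NodeMetric has not observed
# yet, so subtracting it acts as a conservative placeholder rather than
# a certain underestimation of what non-Prod pods can use.
print(node_usage)   # 68
print(free_total)   # 22
```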
@ditingdapeng As we discussed at the meeting, you can check the real values inside NodeMetric to verify whether the scheduling result is as expected. We can go into more detail here.
code link
code link
Environment: