I would like to work on a few such operators this week and post updates if I find anything interesting. I might also open another issue to add scripts that run these per-operator benchmarks and produce suitable output.
Is your feature request related to a problem or challenge?
DataFusion has benchmarks that measure the performance of the system as a whole, but so far there has been little work on benchmarking and analyzing individual operators.
We can currently monitor DataFusion's performance metrics, but if a single operator is responsible for slow performance, finding that operator can be difficult. I discussed this briefly on Slack and in #5504.
Describe the solution you'd like
Following the existing SortPreservingMerge benchmark (https://github.com/apache/datafusion/blob/main/datafusion/core/benches/spm.rs), we could build a set of benchmarks for other operators. We could start with a few, such as Filter, joins like HashJoin, Projection, and so on.
Much more could be added over time, but for now I would like to hear suggestions from the community.
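To make the idea concrete, here is a rough, std-only sketch of what a per-operator timing harness could look like. It does not use DataFusion or Criterion APIs: `Batch`, `filter_op`, `project_op`, `make_batches`, and `bench_operator` are all hypothetical names invented for illustration, and the toy operators stand in for real physical operators. The point is only the shape: build the input once, then time one operator in isolation.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a batch of rows (NOT an Arrow RecordBatch).
type Batch = Vec<i64>;

/// Toy filter "operator": keeps the values that satisfy the predicate.
fn filter_op(input: &[Batch], pred: impl Fn(i64) -> bool) -> Vec<Batch> {
    input
        .iter()
        .map(|b| b.iter().copied().filter(|v| pred(*v)).collect())
        .collect()
}

/// Toy projection "operator": applies a function to every value.
fn project_op(input: &[Batch], f: impl Fn(i64) -> i64) -> Vec<Batch> {
    input
        .iter()
        .map(|b| b.iter().map(|v| f(*v)).collect())
        .collect()
}

/// Build deterministic input once, outside the timed region.
fn make_batches(num_batches: usize, batch_size: usize) -> Vec<Batch> {
    (0..num_batches)
        .map(|b| (0..batch_size).map(|i| (b * batch_size + i) as i64).collect())
        .collect()
}

/// Run `op` `iters` times (iters must be > 0) and return the mean time per
/// iteration together with the number of rows produced by the last run.
fn bench_operator<F>(name: &str, iters: u32, mut op: F) -> (Duration, usize)
where
    F: FnMut() -> Vec<Batch>,
{
    let mut total = Duration::ZERO;
    let mut rows = 0;
    for _ in 0..iters {
        let start = Instant::now();
        // black_box keeps the compiler from optimizing the work away.
        let out = std::hint::black_box(op());
        total += start.elapsed();
        rows = out.iter().map(|b| b.len()).sum();
    }
    let mean = total / iters;
    println!("{name}: {mean:?} per iteration, {rows} rows out");
    (mean, rows)
}

fn main() {
    let input = make_batches(8, 1024);
    bench_operator("filter", 20, || filter_op(&input, |v| v % 2 == 0));
    bench_operator("projection", 20, || project_op(&input, |v| v + 1));
}
```

A real version would presumably live next to spm.rs, use the same benchmarking setup as that file, and drive the actual operators over Arrow batches; the hand-rolled timer here is only a placeholder for that.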
Describe alternatives you've considered
No response
Additional context
I will keep updating this issue as more information comes in. This would be a fairly large project in itself; perhaps we could branch out into further tickets to gradually add benchmarks for each operator.