Is your feature request related to a problem or challenge? Please describe what you are trying to do.
The current implementation of filter statistics discards all column statistics, which makes it really hard for downstream cost estimators to work (e.g. if the parent node is a hash join, it needs the child's column boundaries to estimate its own result; without them it just gives up).
Describe the solution you'd like
There are certain cases where we can know a particular filter's effect on the resulting table (e.g. `a > 25` on `a=[0, 100]; b=[50, 60]` would mean `a=[25, 100]` (different), `b=[50, 60]` (same)). For simple (and relatively common) expressions like the above, we should be able to derive the new boundaries for the columns used in the predicate and pass them further along the statistics estimation chain.
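To make the idea concrete, here is a minimal sketch of the narrowing step for a single column compared against a literal. `Bounds`, `Cmp`, `narrow` and `filter_bounds` are hypothetical names made up for this illustration and are not existing DataFusion types or APIs. It also assumes integer columns and treats `>` / `<` conservatively as closed bounds, which matches the `a=[25, 100]` result in the example above.

```rust
/// Hypothetical closed interval [min, max] for one column (not a DataFusion type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Bounds {
    pub min: i64,
    pub max: i64,
}

/// Hypothetical predicate shape: a column compared against a literal.
#[derive(Debug, Clone, Copy)]
pub enum Cmp {
    Gt(i64),
    Lt(i64),
}

/// Narrow one column's bounds under `column <cmp> literal`.
/// Returns `None` when the predicate cannot match any value in the interval.
pub fn narrow(bounds: Bounds, cmp: Cmp) -> Option<Bounds> {
    let narrowed = match cmp {
        // `col > v` raises the lower bound (kept closed, i.e. conservative).
        Cmp::Gt(v) => Bounds { min: bounds.min.max(v), max: bounds.max },
        // `col < v` lowers the upper bound (kept closed, i.e. conservative).
        Cmp::Lt(v) => Bounds { min: bounds.min, max: bounds.max.min(v) },
    };
    if narrowed.min <= narrowed.max {
        Some(narrowed)
    } else {
        None
    }
}

/// Apply the predicate to the named column and leave every other
/// column's bounds unchanged.
pub fn filter_bounds(
    columns: &[(&str, Bounds)],
    column: &str,
    cmp: Cmp,
) -> Option<Vec<(String, Bounds)>> {
    columns
        .iter()
        .map(|(name, b)| {
            let new_b = if *name == column { narrow(*b, cmp)? } else { *b };
            Some((name.to_string(), new_b))
        })
        .collect()
}
```

Calling `filter_bounds` with `a=[0, 100]; b=[50, 60]` and `Cmp::Gt(25)` on `a` narrows `a` to `[25, 100]` and leaves `b` at `[50, 60]`. A predicate that excludes the whole interval (such as `a > 150`) yields `None`, which a caller could interpret as an empty result.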
Describe alternatives you've considered
None
Additional context
Related to #3929