Enrich filter statistics predictions with estimated column boundaries #4518

isidentical · 2022-12-05T20:47:00Z

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
The current implementation of the filter statistics discards all column statistics, which makes it hard for downstream cost estimators to do their work (e.g. if the parent node is a hash join, it needs the child's column boundaries to estimate its own result; otherwise it just gives up).

Describe the solution you'd like
There are certain cases where we can know a particular filter's effect on the resulting table (e.g. a > 25 on a table with a=[0, 100]; b=[50, 60] would give a=[25, 100] (changed), b=[50, 60] (unchanged)). For simple (and relatively common) expressions like the one above, we should be able to derive the new boundaries for the columns used in the predicate and propagate them further along the statistics estimation chain.
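
To make the example above concrete, here is a minimal sketch of the narrowing step for a single `column > literal` predicate. It is not DataFusion's actual API: `Bounds` and `apply_gt` are hypothetical names made up for illustration, values are assumed to be plain `i64`, and the literal is kept as the new lower bound (an inclusive estimate, matching the a=[25, 100] result above).

```rust
use std::collections::HashMap;

/// Hypothetical closed interval [min, max] describing a column's known value range.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Bounds {
    min: i64,
    max: i64,
}

/// Hypothetical helper: estimate the column boundaries after applying the
/// predicate `column > literal`.
///
/// Only the column used in the predicate is narrowed; every other column keeps
/// its original boundaries. Returns `None` when the predicate cannot match any
/// row according to the known boundaries.
fn apply_gt(
    columns: &HashMap<String, Bounds>,
    column: &str,
    literal: i64,
) -> Option<HashMap<String, Bounds>> {
    let mut narrowed = columns.clone();
    if let Some(bounds) = narrowed.get_mut(column) {
        // No value in [min, max] is strictly greater than the literal.
        if literal >= bounds.max {
            return None;
        }
        // Raise the lower bound; leave it alone if it is already above the literal.
        bounds.min = bounds.min.max(literal);
    }
    // A column without known boundaries is left as-is.
    Some(narrowed)
}
```

Calling `apply_gt` with a=[0, 100]; b=[50, 60], column `a` and literal 25 yields a=[25, 100] with b unchanged, which is what a parent node such as a hash join could then use for its own estimate.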

Describe alternatives you've considered
None

Additional context
Related to #3929


isidentical added the enhancement New feature or request label Dec 5, 2022

isidentical mentioned this issue Dec 5, 2022

Enrich filter statistics with known column boundaries #4519

Merged

alamb closed this as completed in #4519 Dec 10, 2022
