Remove recompute_schema usage from optimizer #14357

Open
findepi opened this issue Jan 29, 2025 · 0 comments
Labels
optimizer Optimizer rules

The basic assumption that for a given operator we can recompute its schema from inputs' schema is unsound.

  • metadata: for plans constructed from SQL, metadata will usually be empty, but an application can attach additional metadata to a schema or field. The metadata can be assigned on the relational operator (its schema or one of its fields) and may not be derivable from the inputs.
  • field qualification: a plan node may retain field qualification from its inputs, erase it, or reassign it. At optimization time, we cannot simply assume one way or the other.
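To make the two points above concrete, here is a minimal toy sketch. It does not use DataFusion's real types: `Field`, `application_node_schema`, and `recompute_from_input` are hypothetical names invented for illustration. It assumes a node whose author erased the qualifiers and attached metadata, and shows that a naive recomputation from the input cannot reproduce that schema.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for a qualified field (hypothetical, not DataFusion's type).
#[derive(Debug, Clone, PartialEq)]
struct Field {
    qualifier: Option<String>,
    name: String,
    metadata: BTreeMap<String, String>,
}

/// Convenience constructor for a field with no metadata.
fn field(qualifier: Option<&str>, name: &str) -> Field {
    Field {
        qualifier: qualifier.map(str::to_string),
        name: name.to_string(),
        metadata: BTreeMap::new(),
    }
}

/// The schema an application chose for its node: qualifiers erased and
/// metadata attached. Neither decision is visible in the input schema.
fn application_node_schema(input: &[Field]) -> Vec<Field> {
    input
        .iter()
        .map(|f| {
            let mut out = f.clone();
            out.qualifier = None;
            out.metadata.insert("origin".to_string(), "app".to_string());
            out
        })
        .collect()
}

/// A naive recomputation that only looks at the input schema and
/// passes each field through unchanged.
fn recompute_from_input(input: &[Field]) -> Vec<Field> {
    input.to_vec()
}
```

With an input of two fields qualified by `t`, the recomputed schema keeps the `t` qualifier and has empty metadata, so it differs from the schema the application actually assigned to the node.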

Uses of recompute_schema within the optimizer should be replaced with explicit node schema updates.
For example, when pruning a node's inputs with RequiredIndices, the node's own schema should be pruned in the same way rather than recomputed from scratch.
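A rough sketch of what such an explicit update could look like follows. It uses toy types rather than DataFusion's: `NodeField`, `NodeSchema`, and `prune_schema` are hypothetical, and a plain `&[usize]` slice stands in for RequiredIndices. The point is only that the existing schema is narrowed in place, so qualifiers and metadata survive untouched.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for a qualified field (hypothetical, not DataFusion's type).
#[derive(Debug, Clone, PartialEq)]
struct NodeField {
    qualifier: Option<String>,
    name: String,
    metadata: BTreeMap<String, String>,
}

/// Toy stand-in for a node schema with schema-level metadata.
#[derive(Debug, Clone, PartialEq)]
struct NodeSchema {
    fields: Vec<NodeField>,
    metadata: BTreeMap<String, String>,
}

/// Keep only the fields at the `required` positions, in the order given.
/// Each kept field retains its qualifier and metadata, and the
/// schema-level metadata is copied over as-is.
/// Returns `None` if any index is out of range.
fn prune_schema(schema: &NodeSchema, required: &[usize]) -> Option<NodeSchema> {
    let fields = required
        .iter()
        .map(|&i| schema.fields.get(i).cloned())
        .collect::<Option<Vec<_>>>()?;
    Some(NodeSchema {
        fields,
        metadata: schema.metadata.clone(),
    })
}
```

Because the pruned schema is derived from the node's current schema and not from its inputs, nothing has to be guessed about how qualifiers or metadata were originally assigned.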

Usage of recompute_schema within the analyzer is left for a separate issue.
