Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve round-robin repartitioning #6047

Closed
wants to merge 4 commits into from
Closed

Conversation

Dandandan
Copy link
Contributor

@Dandandan Dandandan commented Apr 18, 2023

Which issue does this PR close?

Closes #6043

Rationale for this change

See issue

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added the core Core DataFusion crate label Apr 18, 2023
@andygrove
Copy link
Member

I ran a quick test with q1.

master

Query 1 executed in: 2.431816676s
Query 1 executed in: 2.485038908s
Query 1 executed in: 2.228489379s

this PR

Query 1 executed in: 2.190762139s
Query 1 executed in: 2.071724741s
Query 1 executed in: 2.071821053s

🚀

@Dandandan
Copy link
Contributor Author

Dandandan commented Apr 18, 2023

@andygrove could you repeat it with the current version? PR was with an outdated.

It looks like main branch doesn't have RepartitionExec anymore, creating the target partitions is done inside ParquetExec now so I don't expect it to change running the query normally.

@andygrove
Copy link
Member

master

https://github.com/apache/arrow-datafusion?branch=main#cf278704

Query 1 executed in: 2.345591833s
Query 1 executed in: 2.646317406s
Query 1 executed in: 2.605519797s
Query 1 executed in: 2.42015894s
Query 1 executed in: 2.363104105s
Query 1 executed in: 2.282647509s
Query 1 executed in: 2.483410922s
Query 1 executed in: 2.318904738s
Query 1 executed in: 2.218033255s
Query 1 executed in: 2.201664783s

this PR

Query 1 executed in: 2.792964707s
Query 1 executed in: 2.952075901s
Query 1 executed in: 2.74711636s
Query 1 executed in: 2.620084054s
Query 1 executed in: 2.698429628s
Query 1 executed in: 2.764418208s
Query 1 executed in: 2.636262451s
Query 1 executed in: 2.589391766s
Query 1 executed in: 2.374242298s
Query 1 executed in: 2.291481349s

@alamb
Copy link
Contributor

alamb commented Apr 26, 2023

marking as draft as CI is not passing and this PR doesn't seem in need of active review

@alamb alamb marked this pull request as draft April 26, 2023 10:58
@Dandandan
Copy link
Contributor Author

Thanks @alamb I will revisit when I have time (needs to have a unit test and failing test updated).

Copy link

Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale PR has not had any activity for some time label May 12, 2024
@github-actions github-actions bot closed this May 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate Stale PR has not had any activity for some time
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve RoundRobin RepartitionExec
3 participants