Overhead of pytest-xdist is often greater than time saved for parallelization #44

Open
alex opened this issue Feb 1, 2016 · 11 comments

Comments

@alex

alex commented Feb 1, 2016

(I'm pretty sure this is a known issue, but I didn't see an open issue for it, and wanted there to be a canonical place to track it)

Example:

(cryptography) ~/p/cryptography (master) $ py.test tests/test_x509*.py
====================================================== test session starts =======================================================
platform darwin -- Python 2.7.10[pypy-4.0.1-final], pytest-2.8.7, py-1.4.31, pluggy-0.3.1
rootdir: /Users/alex_gaynor/projects/cryptography, inifile: tox.ini
plugins: hypothesis-2.0.0, xdist-1.14
collected 492 items

tests/test_x509.py .........................................................................................................................................................................................
tests/test_x509_crlbuilder.py ............................
tests/test_x509_ext.py .......................................................................................................................................................................................................................................................................
tests/test_x509_revokedcertbuilder.py ................

=================================================== 492 passed in 3.60 seconds ===================================================
(cryptography) ~/p/cryptography (master) $ py.test -n 2 tests/test_x509*.py
====================================================== test session starts =======================================================
platform darwin -- Python 2.7.10[pypy-4.0.1-final], pytest-2.8.7, py-1.4.31, pluggy-0.3.1
rootdir: /Users/alex_gaynor/projects/cryptography, inifile: tox.ini
plugins: hypothesis-2.0.0, xdist-1.14
gw0 [492] / gw1 [492]
scheduling tests via LoadScheduling
............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
=================================================== 492 passed in 5.00 seconds ===================================================
(cryptography) ~/p/cryptography (master) $ py.test -n 2 tests/test_x509*.py
====================================================== test session starts =======================================================
platform darwin -- Python 2.7.10[pypy-4.0.1-final], pytest-2.8.7, py-1.4.31, pluggy-0.3.1
rootdir: /Users/alex_gaynor/projects/cryptography, inifile: tox.ini
plugins: hypothesis-2.0.0, xdist-1.14
gw0 [492] / gw1 [492]
scheduling tests via LoadScheduling
............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
=================================================== 492 passed in 4.62 seconds ===================================================
(cryptography) ~/p/cryptography (master) $ py.test -n 4 tests/test_x509*.py
====================================================== test session starts =======================================================
platform darwin -- Python 2.7.10[pypy-4.0.1-final], pytest-2.8.7, py-1.4.31, pluggy-0.3.1
rootdir: /Users/alex_gaynor/projects/cryptography, inifile: tox.ini
plugins: hypothesis-2.0.0, xdist-1.14
gw0 [492] / gw1 [492] / gw2 [492] / gw3 [492]
scheduling tests via LoadScheduling
............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
=================================================== 492 passed in 6.23 seconds ===================================================
(cryptography) ~/p/cryptography (master) $ py.test -n 4 tests/test_x509*.py
====================================================== test session starts =======================================================
platform darwin -- Python 2.7.10[pypy-4.0.1-final], pytest-2.8.7, py-1.4.31, pluggy-0.3.1
rootdir: /Users/alex_gaynor/projects/cryptography, inifile: tox.ini
plugins: hypothesis-2.0.0, xdist-1.14
gw0 [492] / gw1 [492] / gw2 [492] / gw3 [492]
scheduling tests via LoadScheduling
............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
=================================================== 492 passed in 6.73 seconds ===================================================
(cryptography) ~/p/cryptography (master) $ py.test -n 4 tests/test_x509*.py
====================================================== test session starts =======================================================
platform darwin -- Python 2.7.10[pypy-4.0.1-final], pytest-2.8.7, py-1.4.31, pluggy-0.3.1
rootdir: /Users/alex_gaynor/projects/cryptography, inifile: tox.ini
plugins: hypothesis-2.0.0, xdist-1.14
gw0 [492] / gw1 [492] / gw2 [492] / gw3 [492]
scheduling tests via LoadScheduling
............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
=================================================== 492 passed in 6.53 seconds ===================================================
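
To put the timings above side by side, here is a small sketch that computes how much slower each parallel configuration was relative to the serial run. The numbers are copied from the output above; the `slowdown` helper is a name introduced here for illustration only.

```python
from statistics import mean

# Wall-clock times in seconds, copied from the runs above.
serial = 3.60
runs = {2: [5.00, 4.62], 4: [6.23, 6.73, 6.53]}


def slowdown(parallel_times, serial_time):
    """Return mean parallel time divided by serial time (>1 means slower)."""
    return mean(parallel_times) / serial_time


for workers, times in runs.items():
    print(f"-n {workers}: {slowdown(times, serial):.2f}x the serial time")
```

On these few samples, `-n 2` comes out at roughly 1.3x the serial time and `-n 4` at roughly 1.8x, so adding workers made this particular run slower, not faster.
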
@nonatomiclabs

I think that for tests this quick, this is normal because of the scheduling overhead.
And I guess it's not really an issue as it takes almost no time anyway.

@RonnyPfannschmidt
Member

@alex we did not actually track that detail

But we do have some different solutions in mind for different reasons

For example file-level granularity at collection time, and scheduling approaches with different kinds of latency in mind

The main blocker for this is turning the scheduler into a set of explicit state machines

@nicoddemus
Member

Hmm, file-level granularity at collection time seems simple to implement, and would eliminate the overhead of all nodes collecting all tests... not sure how big a boost it would be for such a small test suite anyway. FWIW we do see linear gains as we add CPUs when test suites start taking more than 2-3 minutes or so.
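
One way to picture why short suites lose and long suites gain is a toy model in which each parallel run pays a fixed cost (worker startup plus every worker collecting every test) and only the test execution itself is divided among workers. This is a simplification assumed here for illustration, not a measurement of xdist; the function names and the 3-second overhead are hypothetical.

```python
def parallel_time(run_seconds, workers, fixed_overhead_seconds):
    """Toy model: a fixed overhead plus the run time split evenly across workers."""
    return fixed_overhead_seconds + run_seconds / workers


def speedup(run_seconds, workers, fixed_overhead_seconds):
    """Serial run time divided by the modelled parallel time (<1 means slower)."""
    return run_seconds / parallel_time(run_seconds, workers, fixed_overhead_seconds)


# Hypothetical overhead of 3 seconds, chosen only to illustrate the shape.
OVERHEAD = 3.0

for run_seconds in (3.6, 30.0, 180.0):
    for workers in (2, 4):
        print(
            f"run={run_seconds:>5.1f}s workers={workers}: "
            f"speedup {speedup(run_seconds, workers, OVERHEAD):.2f}x"
        )
```

Under this model a suite of a few seconds gets slower with more workers, while a suite of a few minutes approaches a speedup equal to the worker count, which matches the shape of what is described above.
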

Not sure if @RonnyPfannschmidt wants to tackle this before the internal refactoring though.

@vladu

vladu commented Mar 1, 2018

Is there any work currently being done to implement the "distributed collection", as described above, to avoid every node collecting every test?

@nicoddemus
Member

@vladu I don't think anybody has taken time to work on this, but if anybody wants to start a PR we would be glad to help guide them. 👍

@vladu

vladu commented Mar 1, 2018

I might take a stab at it. I've poked around the code a little bit, and I have some ideas how this might be accomplished, but any suggestions from the experts on a clean solution would be appreciated.

@RonnyPfannschmidt
Member

@vladu currently there is not even groundwork for this. I propose some kind of brainstorming to get a rough idea of starting points for experimentation; we might need a major internal refactoring in the beginning

@nicoddemus
Member

@RonnyPfannschmidt good idea.

One solution I have thought of (borrowed from some other system which I don't recall right now) is that each worker can infer which tests it should collect based on its id and the total number of workers. This can be accomplished easily today because each worker knows its own id and knows how many workers there are in total (based on the PYTEST_XDIST_WORKER_COUNT env variable), so it can implement pytest_ignore_collect to ignore every path except the Nth path that is given to it (assuming all workers receive the paths in the same order).

For example, here's a list of tests in a suite and which worker collects that file (with 3 workers):

tests/test_1.py  # gw0
tests/test_2.py  # gw1
tests/test_3.py  # gw2
tests/test_4.py  # gw0
tests/test_5.py  # gw1
tests/test_6.py  # gw2
...

And so on.

With that working, the master node can then just say to all workers: "hey, run all the tests you have collected" and that's it.

This is of course a quick draft, just throwing this here to see what you guys think.
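
The round-robin assignment described above can be sketched as a pure function, so it can be checked without running pytest. The helper names `worker_index`, `files_for_worker`, and `current_worker` are hypothetical and introduced here only for illustration; the sketch assumes worker ids of the form `gw0`, `gw1`, ... and that every worker sees the paths in the same order. The environment variable `PYTEST_XDIST_WORKER` is assumed to hold the worker id, alongside the `PYTEST_XDIST_WORKER_COUNT` variable mentioned above.

```python
import os


def worker_index(worker_id):
    """Turn an xdist-style worker id such as "gw2" into the integer 2."""
    if not worker_id.startswith("gw"):
        raise ValueError(f"unexpected worker id: {worker_id!r}")
    return int(worker_id[2:])


def files_for_worker(paths, worker_id, worker_count):
    """Return the paths this worker would keep under round-robin assignment.

    The i-th path (in the order given) belongs to worker i % worker_count.
    """
    index = worker_index(worker_id)
    return [p for i, p in enumerate(paths) if i % worker_count == index]


def current_worker():
    """Read (index, count) from the environment, or None outside an xdist worker."""
    worker_id = os.environ.get("PYTEST_XDIST_WORKER")
    count = os.environ.get("PYTEST_XDIST_WORKER_COUNT")
    if worker_id is None or count is None:
        return None
    return worker_index(worker_id), int(count)


paths = [f"tests/test_{n}.py" for n in range(1, 7)]
for wid in ("gw0", "gw1", "gw2"):
    print(wid, files_for_worker(paths, wid, 3))
```

A collection-ignore hook in `conftest.py` could call something like `files_for_worker` to decide which paths to skip. Note that, as far as I know, xdist currently expects every worker to report the same collected tests, so this is only a sketch of the assignment logic and not a drop-in solution.
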

@RonnyPfannschmidt
Member

@nicoddemus that's one of the ideas I had collected under the banner "pytest-bigtest for xdist"

@jdahlin

jdahlin commented Aug 20, 2020

Another idea would be to move collection to the master node and have the scheduling pass in tests to the worker nodes. That would also make it possible to solve #586.

@RonnyPfannschmidt
Member

All nodes will have to recollect in any case; nodes are not designed to be network-transferable
