Question: Is `ThreadPoolExecutor` a known incompatible library? #1102

schuylermartin45 · 2024-12-16T22:10:59Z

Describe the bug
Not sure if this is a bug yet, as much as a sanity check/question. I recently started using the ThreadPoolExecutor context from concurrent.futures to parallelize network requests.

I've had tests running using pyfakefs for about a month or so now. As soon as I added the thread pool, some tests started taking a significant amount of time. I'm talking about from going from milliseconds to 30 seconds.

I thought it was a deadlock at first but the tests always eventually resolve. There is also no such slow down outside of the testing environment. Although I haven't ruled everything out, if I disable pyfakefs, the issue disappears.

I assume this is caused for much the same reasons why subprocess and multiprocessing have known compatibility issues. I mean we are talking threads and not processes, but they probably make similar low-level calls.

So here's my question: is concurrent.futures another library with known compatibility issues? Should it be added to the documentation I linked to above? Does the same go for the older ThreadPool library?

And is there a general recommendation for working around this kind of issue? I don't really want to disable the thread pool just for the tests and I also need pyfakefs to mock the file system.

Let me know if you would like more context or details. All of this is in an open conda project so I can grab some GitHub Action logs if that would be useful. Thanks!

The text was updated successfully, but these errors were encountered:

mrbean-bremen · 2024-12-17T06:40:50Z

Interesting... I haven't seen this yet, probably because nobody has used ThreadPoolExecutor or concurrent.futures with pyfakefs, but would guess that it is not compatible given that multithreading also has problems. pyfakefs is generally more suited to "standard" unit tests as opposed to integration tests in the sense of the complexity of the environment.

The behavior is strange though. So far, the problems usually appear as a faked file function trying to access a real file or vice verse, resulting in an exception, mostly FileNotFound or similar.
I have no idea what could cause that slowdown though, but I will see if I can find something out later. I guess it is unrealistic to ask for a simple reproducible example...

schuylermartin45 · 2024-12-17T15:33:40Z

If I find time this week, I could try to isolate this to an streamlined example. I guess this is further complicated by also using:

pytest w/ xdist to parallelize the tests
Python's tempfile library
The requests library (but this is mocked out in the problematic tests)
The click library's test CLI invocation interface
So far in my testing, those components don't appear to interfere to cause the slow down.

If I don't find time before the holidays, I'll at least point you to the project code:

Thread pool usage
The test code (namely the cases that use the gsm-amzn2-aarch64_version_bump.yaml and libprotobuf_version_bump.yaml files)
Recent GHA run Note that the pytest-cov phase takes a minute to complete. This used to be on the order of a few seconds. I will point out that I haven't addressed the warnings related to weakref yet.

~~On my current working branch, I've added what should have been a simple test that now takes 7 minutes to complete.~~
Edit: The 7 minute test was a complete goof on my part that I conflated with the other issue.

Maybe I have some battle scars from my IoT days, but I suppose I could have some of these problematic tests write to actual disk. PITA I could probably work around for local testing, but shouldn't be an issue on the GitHub Action runners.

mrbean-bremen · 2024-12-17T17:48:12Z

Thank you! I'm also not sure when I will tackle this - I've a couple of other things in my queue, and there are the holidays of course.
As for the used libraries:
xdist should not be a problem, each process will have it's own fake filesystem. This makes it more difficult to debug, of course, but I guess that problem will persist if running without xdist.
tempfile should not be a problem, it is mostly patched like any other module (with the excption of some additional patching done under Posix to handle a default argument that is a file system function).
request would probably be a problem, but not if it is mocked out, of course. Not sure about click - I do remember some problem with it, have to check that.

If you are doing more integrated tests, sometimes you can't get around using the real filesystem. A good compromise is to use a RAM disk as file system, if that is possible, that would cleaning it up easy.

Anyway I'm interested to understand the problem, even if cannot be fixed easily - it sounds a bit wierd.

schuylermartin45 · 2024-12-17T17:58:19Z

I've had some issues with mixing click and pathlib with pyfakefs. With how click uses decorators, you can't guarantee Path object will be constructed after pyfakefs mocks all the calls. We have a few notes in the code base about this and have had to use string file paths with the click decorators as a work around.

I've thought about using a RAM disk or setting up a local container to run these in, but that's a lot of overhead and maintenance I'd like to try to avoid. The more steps I add to my setup process, the less likely folks will be willing to try to fork and contribute to the project. I'm sure I could find a way to streamline that, but the appeal to me to use pyfakefs is how easy it is to setup/use once it is installed.

Total props for maintaining this project btw, I've really enjoyed using it so far.

- also fix some minor documentation issues - see pytest-dev#1102

mrbean-bremen · 2025-03-13T19:55:48Z

As the real discussion happened in the other repo, I'm copying the relevant comments over from there, and will turn this into a discussion issue.

So far I only found that it has probably to do with the fact that the packages in question have more than one URL (and SHA), which seem to be handled in parallel. pyfakefs does nothing to be thread-safe (though that being Python, it shouldn't need to, at least in normal Python code), and there seems to be some timing problem here.

Originally posted by @mrbean-bremen in #265

I'd have to double check, but I think the parallelism mechanisms I'm using bypass the global lock to perform better with I/O bound ops.

So I think a resource contention issue makes sense.
...
I think requests releases the lock via sockets.

Originally posted by @schuylermartin45 in #265

Ok, I had another look after what I've learned, and I'm pretty sure that the problem is an artifact of the requests mock - specifically, that it accesses the same dummy file with write access from different threads.
If I add a lock in the mock call, the problem goes away:
from threading import Lock

_lock = Lock()
...

    match endpoint:
        case endpoint if endpoint in default_artifact_set:
            _lock.acquire()
            try:
                return MockHttpStreamResponse(200, f"archive_files/dummy_project_01.tar.gz")
            finally:
                _lock.release()

Originally posted by @mrbean-bremen in #265

Awesome! I think that all makes sense to me. I might chew on this a bit and see if there's maybe another approach. Something about putting a lock in test code feels a little side-effecty or "not fixing an underlying issue" to me. But again I need to think about it some more.

Originally posted by @schuylermartin45 in #265

Yes, the underlying issue here is probably that pyfakefs does not lock files on writing. This is OS-dependent: under Windows, you cannot open a file for writing twice, as it is done here, so the behavior is as expected, but under Linux, as far as I know there is a write lock, but not a lock on the underlying stream - though I'm not sure how much of this behavior is guaranteed, would need to look it up. Anyway, pyfakefs is not designed for multi-threading, which is something I'm reluctant to change (it may open a whole new can of worms), but which should at least be better documented (EDIT: on it).

What I meant above was that your problem only appears only because of the mock behavior, the real code would not behave this way - at least this is what I understood.

Anyway, thanks for working with me here! It is always very helpful for me to understand the problems that appear in real code working with pyfakefs.

Originally posted by @mrbean-bremen in #265

schuylermartin45 mentioned this issue Dec 16, 2024

test_bump_recipe_cli() for multi-sourced recipes dramatically slows tests down (caused by retrying on mocked HTTP calls?) conda-incubator/conda-recipe-manager#265

Open

mrbean-bremen added question limitation and removed question labels Mar 12, 2025

mrbean-bremen added a commit to mrbean-bremen/pyfakefs that referenced this issue Mar 13, 2025

Document problems with multi-threading

e47c273

- also fix some minor documentation issues - see pytest-dev#1102

pytest-dev locked and limited conversation to collaborators Mar 13, 2025

mrbean-bremen converted this issue into discussion #1140 Mar 13, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

This issue was moved to a discussion.

Question: Is `ThreadPoolExecutor` a known incompatible library? #1102

Question: Is `ThreadPoolExecutor` a known incompatible library? #1102

schuylermartin45 commented Dec 16, 2024

mrbean-bremen commented Dec 17, 2024

schuylermartin45 commented Dec 17, 2024 •

edited

Loading

mrbean-bremen commented Dec 17, 2024

schuylermartin45 commented Dec 17, 2024

mrbean-bremen commented Mar 13, 2025

This issue was moved to a discussion.

This issue was moved to a discussion.

Question: Is ThreadPoolExecutor a known incompatible library? #1102

Question: Is ThreadPoolExecutor a known incompatible library? #1102

Comments

schuylermartin45 commented Dec 16, 2024

mrbean-bremen commented Dec 17, 2024

schuylermartin45 commented Dec 17, 2024 • edited Loading

mrbean-bremen commented Dec 17, 2024

schuylermartin45 commented Dec 17, 2024

mrbean-bremen commented Mar 13, 2025

This issue was moved to a discussion.

Question: Is `ThreadPoolExecutor` a known incompatible library? #1102

Question: Is `ThreadPoolExecutor` a known incompatible library? #1102

schuylermartin45 commented Dec 17, 2024 •

edited

Loading