Deadlock when trying to open same zarr file with multiple processes #2868
-
Hello everyone, I have a question about a somewhat specific multiprocessing problem. Long story short, I run into deadlocks when trying to open the same zarr file with multiple processes. Here is some minimal code:

```python
import multiprocessing as mp

import zarr


def worker(i):
    print(f"Started worker {i}")
    z = zarr.open("data.zarr", mode="r+")
    print(f"Opened store for {i} | {dict(z.attrs)}")
    a = z.attrs["done"]
    a.append(i)
    z.attrs["done"] = a


def main():
    z = zarr.create(
        shape=(10, 10),
        chunks=(5, 5),
        store="data.zarr",
        overwrite=True,
    )
    z.attrs["done"] = []
    p1 = mp.Process(target=worker, args=(1,))
    p2 = mp.Process(target=worker, args=(2,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    z = zarr.open("data.zarr", mode="r")
    print(z.attrs["done"])


if __name__ == "__main__":
    main()
```

This outputs

```
Started worker 1
Started worker 2
```

and then stops doing anything. The reason for this seems to be the

Is this intentional behavior? What are possible workarounds or better solutions / approaches?

**About my concrete use case**

I want to build a pipeline capable of both multiprocessing and multithreading (the user should be able to choose; ideally this also makes the pipeline compatible with e.g. dask or ray) which uses the same underlying datacube as auxiliary data. Maybe this helps to understand my vision step-by-step:
This approach works very well in a non-multiprocessed version, and I was already able to get it running with multiple threads. However, since threads in Python are only useful for (network) IO-bound tasks and not for compute-bound tasks, I also want to be able to use multiprocessing. I currently plan to turn this functionality into a library; of course I will share it when it's ready. :)
-
Zarr by itself is not capable of providing safe concurrent modification of metadata from multiple uncoordinated processes, as in your example. There are inevitable race conditions and deadlocks. It's up to the user's code to avoid these situations.
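For illustration, here is a minimal sketch of one such coordination scheme (my own sketch, not part of zarr itself, and assuming all workers are spawned from a single parent): share one `multiprocessing.Lock` and hold it across the entire read-modify-write of the attribute, so no two processes ever touch the metadata at the same time.

```python
import multiprocessing as mp

import zarr


def worker(i, lock):
    # Hold the lock across the whole read-modify-write so only one
    # process opens and updates the store's metadata at a time.
    with lock:
        z = zarr.open("data.zarr", mode="r+")
        done = z.attrs["done"]
        done.append(i)
        z.attrs["done"] = done


def main():
    z = zarr.create(shape=(10, 10), chunks=(5, 5), store="data.zarr", overwrite=True)
    z.attrs["done"] = []
    lock = mp.Lock()  # one lock shared by every worker process
    procs = [mp.Process(target=worker, args=(i, lock)) for i in (1, 2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(zarr.open("data.zarr", mode="r").attrs["done"])


if __name__ == "__main__":
    main()
```

Note that this only covers workers launched from one parent; fully independent processes would need some external coordination mechanism, which is where the suggestion below comes in.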
I would highly recommend exploring Icechunk for this scenario. Icechunk augments Zarr with a transactional storage engine. With Icechunk as your store, each process can commit its changes in a safe way via an ACID transaction.
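To make that concrete, here is a minimal sketch of your repro on top of Icechunk. The repository path and commit messages are illustrative, and the calls (`local_filesystem_storage`, `Repository.create`/`Repository.open`, `writable_session`, `session.store`, `session.commit`) follow the Icechunk Python API as I understand it; check the Icechunk docs for the authoritative details.

```python
import multiprocessing as mp

import icechunk
import zarr


def worker(i):
    # Each process opens the repository and starts its own writable session.
    storage = icechunk.local_filesystem_storage("data.icechunk")
    repo = icechunk.Repository.open(storage)
    session = repo.writable_session("main")
    z = zarr.open(session.store, mode="r+")
    done = z.attrs["done"]
    done.append(i)
    z.attrs["done"] = done
    # commit() is an ACID transaction against the "main" branch; if
    # another process committed first, it raises a conflict that can be
    # rebased and retried instead of deadlocking or corrupting metadata.
    session.commit(f"worker {i} done")


def main():
    storage = icechunk.local_filesystem_storage("data.icechunk")
    repo = icechunk.Repository.create(storage)
    session = repo.writable_session("main")
    z = zarr.create(
        shape=(10, 10),
        chunks=(5, 5),
        store=session.store,
        overwrite=True,
    )
    z.attrs["done"] = []
    session.commit("initialize")

    procs = [mp.Process(target=worker, args=(i,)) for i in (1, 2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()


if __name__ == "__main__":
    main()
```

Because each process writes through its own session and publishes its changes with an explicit commit, the failure mode becomes a well-defined conflict you can handle, rather than a race on the underlying files.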