I have been getting this error frequently.
It happened regularly in AWS Batch containers, in a batch processing pipeline that downloads images from S3 and processes them with rasterio.
Running the same container locally, I couldn't reproduce it. I have tried both fork and spawn as start methods (see the sketch below).
(Almost) the same code with dask didn't cause these issues.
Now I wanted to compare dask vs mpire in a benchmark, and I ran into the same issue!
I think there may be a bug in mpire.
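For context, this is roughly how I select the start method in the pipeline. It is a minimal sketch with a stand-in worker function, not the actual rasterio/S3 code:

```python
from mpire import WorkerPool

def process_image(path):
    # Stand-in for the real download + rasterio processing
    return len(path)

if __name__ == "__main__":
    paths = [f"image_{i}.tif" for i in range(100)]
    # start_method can be 'fork' or 'spawn'; the crash occurred with both
    with WorkerPool(n_jobs=4, start_method="spawn") as pool:
        results = pool.map(process_image, paths)
```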
This is the benchmark:
https://github.com/sybrenjansen/multiprocessing_benchmarks
I ran it with Python 3.12 on Linux.
❯ python benchmark_mpire.py
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1738766355.878362 1560537 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1738766355.882581 1560537 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
=====
MPIRE
=====
Setting up benchmark 1 ...
Benchmark #1 started
Running with 1 job
Numerical computation (MPIRE) took 59.898664474487305 seconds.
Numerical computation (MPIRE) took 60.894070625305176 seconds.
Numerical computation (MPIRE) took 60.4944064617157 seconds.
Numerical computation (MPIRE) took 60.799951791763306 seconds.
Numerical computation (MPIRE) took 60.616430044174194 seconds.
Running with 2 jobs
Numerical computation (MPIRE) took 41.30088138580322 seconds.
Numerical computation (MPIRE) took 35.43703866004944 seconds.
Numerical computation (MPIRE) took 32.693188190460205 seconds.
Numerical computation (MPIRE) took 32.87881588935852 seconds.
Numerical computation (MPIRE) took 31.104052543640137 seconds.
Running with 4 jobs
Numerical computation (MPIRE) took 18.82200002670288 seconds.
Numerical computation (MPIRE) took 17.475537300109863 seconds.
Numerical computation (MPIRE) took 16.806360006332397 seconds.
Numerical computation (MPIRE) took 16.586058139801025 seconds.
Numerical computation (MPIRE) took 17.376100063323975 seconds.
Results:
- Numerical computation:
- 1 jobs, runtime mean: 60.54070467948914, std: 0.34990571231388484, total mean: 60.54070467948914, std: 0.34990571231388484
- 2 jobs, runtime mean: 34.6827953338623, std: 3.5885435587666694, total mean: 34.6827953338623, std: 3.5885435587666694
- 4 jobs, runtime mean: 17.41321110725403, std: 0.7800510616255131, total mean: 17.41321110725403, std: 0.7800510616255131
Setting up benchmark 2 ...
Creating benchmark #2 documents data ...
Benchmark #2 started
Running with 1 job
Traceback (most recent call last):
File "/home/thomas/multiprocessing_benchmarks/benchmark_mpire.py", line 155, in <module>
main()
File "/home/thomas/multiprocessing_benchmarks/benchmark_mpire.py", line 142, in main
run_trials(partial(benchmark_2, documents), "Stateful computation",
File "/home/thomas/multiprocessing_benchmarks/util.py", line 146, in run_trials
benchmark_function(n_jobs)
File "/home/thomas/multiprocessing_benchmarks/benchmark_mpire.py", line 65, in benchmark_2
pool.map_unordered(streaming_actor.add_document,
File "/home/thomas/multiprocessing_benchmarks/venv/lib/python3.12/site-packages/mpire/pool.py", line 534, in map_unordered
return list(self.imap_unordered(func, iterable_of_args, iterable_len, max_tasks_active, chunk_size,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/thomas/multiprocessing_benchmarks/venv/lib/python3.12/site-packages/mpire/pool.py", line 788, in imap_unordered
self._handle_exception()
File "/home/thomas/multiprocessing_benchmarks/venv/lib/python3.12/site-packages/mpire/pool.py", line 926, in _handle_exception
raise exception
RuntimeError: Worker-0 died unexpectedly
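Based on the traceback, the failing call in benchmark_2 boils down to passing a bound method of a stateful object to map_unordered. Here is a minimal sketch of that call shape (StreamingActor and add_document are stand-ins, not the benchmark's actual code):

```python
from mpire import WorkerPool

class StreamingActor:
    def __init__(self):
        self.word_counts = {}

    def add_document(self, document):
        # Stand-in for the real per-document work
        for word in document.split():
            self.word_counts[word] = self.word_counts.get(word, 0) + 1
        return len(document)

if __name__ == "__main__":
    documents = ["some text " * 100 for _ in range(1000)]
    streaming_actor = StreamingActor()
    with WorkerPool(n_jobs=1) as pool:
        # In the benchmark, the equivalent call is where
        # "Worker-0 died unexpectedly" is raised for me
        pool.map_unordered(streaming_actor.add_document, documents)
```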