Description
General
- Operating System: Windows 11 Professional 22H2 22621.2715
- Python version: 3.10
- Pandas version: 2.0.3
- Pandarallel version: 1.6.5
Acknowledgement
- My issue is NOT present when using pandas alone (without pandarallel)
- If I am on Windows, I read the Troubleshooting page before writing a new bug report
Bug description
Pandarallel can hang without raising any error when it is configured to use all physical cores while some of them are occupied by other background tasks.
Observed behavior
My CPU is an Intel(R) Core(TM) i7-14700K, which has 20 physical cores according to psutil.cpu_count(logical=False). When I try to use all of those cores in Pandarallel, it can hang without raising any error. If I turn on progress_bar, I can see that a few of the bars do not move at all.
I am fairly sure that nothing is wrong with my code: after I reboot the system and re-run the code (without doing anything else and with no other tasks in the background), it works as expected.
I think this problem is similar to #183 and #226.
Expected behavior
Ideally, Pandarallel would dispatch work to whichever cores are available in real time, perhaps the way joblib does.
I compared against my code using joblib with 20 workers while other background tasks were running on my Win11 machine, and it works. It may be slower than truly using all 20 cores, but at least it does not hang without raising an error.
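For reference, the joblib comparison could look roughly like the sketch below. My actual code is not included in this report, so square and run_with_joblib are hypothetical stand-ins for the real workload; only Parallel, delayed and n_jobs come from joblib itself.

```python
def square(x):
    """Hypothetical stand-in for the real per-row computation."""
    return x * x


def run_with_joblib(values, n_jobs=20):
    """Run square over values with joblib, using n_jobs worker processes."""
    # Imported lazily so that square stays usable without joblib installed.
    from joblib import Parallel, delayed

    # Parallel returns the results as a list, in the same order as the inputs.
    return Parallel(n_jobs=n_jobs)(delayed(square)(v) for v in values)
```

Calling run_with_joblib(range(1000), n_jobs=20) corresponds to the 20-worker setup described above.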
A less ideal alternative would be to raise an error so that users know their cores are occupied.
For now I work around it by leaving 4 cores out (nb_workers=20-4). This works well with my code, although it is a bit slower.
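The workaround can be sketched as follows. The helper names (physical_core_count, workers_with_headroom, initialize_with_headroom) are made up for illustration, and reserving 4 cores is simply the value that happened to work on my machine, not a recommended setting.

```python
import os


def physical_core_count():
    """Return the number of physical cores, with a rough fallback."""
    try:
        import psutil  # third-party; same call as used earlier in this report
        count = psutil.cpu_count(logical=False)  # may be None on some systems
    except ImportError:
        count = None
    # os.cpu_count() reports logical CPUs, so it is only an approximation.
    return count or os.cpu_count() or 1


def workers_with_headroom(total_cores, reserve=4):
    """Leave `reserve` cores free for background tasks, keeping at least one worker."""
    return max(1, total_cores - reserve)


def initialize_with_headroom(reserve=4):
    """Initialize pandarallel with fewer workers than physical cores."""
    # Imported lazily so the helpers above can be used without pandarallel.
    from pandarallel import pandarallel

    nb_workers = workers_with_headroom(physical_core_count(), reserve)
    pandarallel.initialize(nb_workers=nb_workers, progress_bar=True)
    return nb_workers
```

On my 20-core machine, workers_with_headroom(20) gives 16, which matches nb_workers=20-4.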
Minimal but working code sample to ease bug fix for pandarallel team
Sorry, I am not a good developer. All I can do is describe this issue.
Pandarallel is an awesome package after all. Thank you very much.