Open
Description
I am adding reticulate support to the EMD computation so that it can be called directly from R.
- Refactored the code into gene_distance_calculate.py
- Added progress bar using tqdm
- Added parameters for setting the number of processes and maxIterations
- Use concurrent.futures.ThreadPoolExecutor. This lets the threads share the data instead of each copying it (or reading it multiple times, which causes a large memory footprint on systems with many CPUs). This also seems to fix issues on macOS, where starting other processes did not work well
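A minimal sketch of the thread-based approach, to illustrate why no copying is needed. The names `pair_distance`, `compute_distances`, and `num_processes` are hypothetical and are not taken from gene_distance_calculate.py; the distance function is a simple stand-in, not the actual EMD. Note that threads only give a real speed-up when the heavy work releases the GIL (for example, inside native code), which is assumed here.

```python
from concurrent.futures import ThreadPoolExecutor


def pair_distance(data, i, j):
    """Hypothetical stand-in for the EMD between rows i and j
    (here: sum of absolute differences)."""
    return sum(abs(a - b) for a, b in zip(data[i], data[j]))


def compute_distances(data, pairs, num_processes=4):
    """Compute a distance for every (i, j) pair using a thread pool.

    All worker threads read the same `data` object, so nothing is
    pickled or copied per worker, unlike a process pool.
    """
    pairs = list(pairs)
    with ThreadPoolExecutor(max_workers=num_processes) as pool:
        # pool.map preserves the order of `pairs` in its results.
        results = pool.map(lambda p: pair_distance(data, p[0], p[1]), pairs)
        return dict(zip(pairs, results))
```

A progress bar could be added by wrapping the `pool.map(...)` iterator in `tqdm(..., total=len(pairs))`, assuming tqdm is installed.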
Other ideas
- Use multiprocessing.shared_memory.SharedMemory. An alternative to ThreadPoolExecutor, but more cumbersome.
- Optimize the chunksize of pool.imap (did not change performance significantly: 48 min on 8 CPUs without it vs. 46 min with it)
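For comparison, a rough sketch of what the shared-memory alternative involves, which may show why it is more cumbersome. The helpers `publish` and `read_sum` are hypothetical; in a real setup `read_sum` would run in a worker process that receives only the block name and the element count. Here the data is a flat list of float64 values as a simplifying assumption.

```python
from array import array
from multiprocessing import shared_memory


def publish(values):
    """Copy a list of floats into a new shared-memory block.

    Returns the block and the number of elements; the caller owns the
    block and must call close() and unlink() when done.
    """
    payload = array("d", values).tobytes()
    shm = shared_memory.SharedMemory(create=True, size=len(payload))
    shm.buf[:len(payload)] = payload
    return shm, len(values)


def read_sum(name, count):
    """Attach to an existing block by name and sum its float64 contents
    without copying the data."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        # The block may be larger than requested, so slice explicitly.
        view = shm.buf[:count * 8].cast("d")
        try:
            return sum(view)
        finally:
            view.release()  # must be released before shm.close()
    finally:
        shm.close()
```

The explicit size bookkeeping, view release, and close/unlink lifecycle are the parts that a thread pool avoids entirely.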