RoyYu0509/MPI-SGD

Parallelized Model Training using NumPy and MPI.

  • Works with any number of processes.
  • Distributes data evenly across all processes.
  • Computes gradients and losses in parallel (see the sketch after this list).
  • Logs the experiment results.
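
The parallel step follows a standard pattern: each process computes a gradient on its own shard of the data, the per-process gradients are averaged with an MPI reduction, and every rank applies the same update. Below is a minimal sketch of that pattern using mpi4py and NumPy; the linear model, variable names, and learning rate are illustrative assumptions, not this repo's actual code.

# Minimal sketch of the parallel gradient step (hypothetical names;
# the repo's actual model and training loop may differ).
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

rng = np.random.default_rng(rank)
X_local = rng.standard_normal((1000, 4))  # this rank's data shard
y_local = rng.standard_normal(1000)
w = np.zeros(4)                           # model parameters, identical on all ranks

for step in range(10):
    # Each process computes the gradient on its own shard.
    err = X_local @ w - y_local
    local_grad = X_local.T @ err / len(y_local)
    # Sum gradients across all processes, then average.
    grad = np.empty_like(local_grad)
    comm.Allreduce(local_grad, grad, op=MPI.SUM)
    grad /= size
    w -= 0.01 * grad  # every rank applies the same update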

IMPORTANT:

YOU NEED TO PREPARE THE FILE nytaxi2022.csv ON YOUR OWN.

Move it to the project folder so that the scripts can read it.
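
The scripts read the CSV in chunks (the {READ IN DATA CHUNK SIZE} argument in the commands below) and split the rows evenly across processes. Here is a hypothetical sketch of that load-and-distribute step; the actual loader in this repo may differ.

# Hypothetical sketch of chunked reading and even distribution.
import numpy as np
import pandas as pd
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

CHUNK_SIZE = 40000  # corresponds to the {READ IN DATA CHUNK SIZE} argument

if rank == 0:
    # Read the CSV in chunks to bound peak memory, then concatenate.
    chunks = pd.read_csv("nytaxi2022.csv", chunksize=CHUNK_SIZE)
    data = pd.concat(chunks).to_numpy()
    # Split the rows as evenly as possible, one piece per process.
    pieces = np.array_split(data, size)
else:
    pieces = None

local_data = comm.scatter(pieces, root=0)  # each rank receives its shard
print(f"rank {rank}: {local_data.shape[0]} rows")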

Try it out!

Clone this repo, install the packages, and run the following commands on your local machine:

Set up environment

conda create -n mpipy_mpich -c conda-forge python=3.11 mpich mpi4py numpy pandas scikit-learn matplotlib
 
conda activate mpipy_mpich

Run Experiment

Use the following command to run an experiment with the hyperparameters of interest.

mpiexec -np {NUMBER OF PROCESSES} python -u -m experiments "{LIST OF ACTIVATION FUNCTION NAMES}" "{LIST OF BATCH SIZES}" {NUMBER OF PROCESSES} {READ IN DATA CHUNK SIZE}
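
The quoted list arguments suggest the experiments module parses Python-literal strings from sys.argv. A hypothetical sketch of that parsing (the repo's actual argument handling may differ):

# Hypothetical sketch of how the four positional arguments might be parsed.
import ast
import sys

act_funcs = ast.literal_eval(sys.argv[1])    # e.g. "['relu','sigmoid','tanh']"
batch_sizes = ast.literal_eval(sys.argv[2])  # e.g. "[480, 960, 1440]"
n_procs = int(sys.argv[3])                   # should match the -np value
chunk_size = int(sys.argv[4])                # rows per CSV read chunk

print(act_funcs, batch_sizes, n_procs, chunk_size)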

Run sub-experiments with one activation function

mpiexec -np 1 python -u -m experiments "['relu']" "[1440]" 1 40000
mpiexec -np 2 python -u -m experiments "['relu']" "[1440]" 2 40000

Run the full experiments on 1, 2, and 3 processes

mpiexec -np 1 python -u -m experiments "['relu','sigmoid','tanh']" "[480, 960, 1440, 1920, 2400]" 1 400000
mpiexec -np 2 python -u -m experiments "['relu','sigmoid','tanh']" "[480, 960, 1440, 1920, 2400]" 2 400000
mpiexec -np 3 python -u -m experiments "['relu','sigmoid','tanh']" "[480, 960, 1440, 1920, 2400]" 3 400000

Note:

  • Use MPICH.
  • Use Python 3.11.
  • Training and evaluation were carried out on a laptop with an Apple M4 Pro processor and 24 GB of memory.
  • GPT interactions for the MPI development are all labeled; search for 'GPT' to see where we received help.

About

A parallelized Stochastic Gradient Descent optimizer implemented with mpi4py for multi-machine model training.
