This project is an AI-powered MRI tumor classification system that uses a pre-trained deep learning model to detect and classify tumors from MRI scans. It includes a web-based front-end for image upload and real-time predictions powered by a ResNet50 backend.
| Timesheet | Slack channel | Project report |
|---|---|---|
- Timesheet: Link your timesheet (pinned in your project's Slack channel) where you track, per student, the time spent and tasks completed for this project.
- Slack channel: Link your private Slack project channel.
- Project report: Link your Overleaf project report document.
Watch a quick overview of how NAVI works:
🔗 Watch on YouTube
A minimal example to showcase your work
All project files are available in the Git repository.
This project has been tested on CSIL workstations and on Ubuntu 20.04 EC2 instances using the following exact versions:
- Python: 3.10
- TensorFlow: 2.10.0
- boto3: 1.26.10
- scikit-learn: 1.2.2
- matplotlib: 3.6.2
- Git installed on your system.
- Python 3.10 installed.
- MRI Model file downloaded through the SFU OneDrive link:
Due to GitHub's file size limitations, the trained Keras model (mri_model.keras) is hosted externally.
Click here to download mri_model.keras
After downloading, place the file in the Web/ folder of the project before running the app.
- AWS CLI installed on your machine (system-level installation).
On Ubuntu, install with:
sudo apt update && sudo apt install awscli
On macOS, you can install via Homebrew:
brew install awscli
Clone the repository, set up a virtual environment, and install the required Python packages.
# Clone the repository
git clone https://github.com/sfu-cmpt340/2025_1_project_07.git
cd 2025_1_project_07
# Create and activate a Python virtual environment
python3 -m venv venv
source venv/bin/activate
# Install project dependencies using pip
pip install -r requirements.txt
The requirements.txt file includes the exact versions:
numpy==1.23.5
tensorflow-macos==2.10.0
boto3==1.26.10
scikit-learn==1.2.2
matplotlib==3.6.2
flask
pillow
datetime
werkzeug
reportlab
This ensures that the environment is set up reproducibly, exactly as tested on CSIL workstations and Ubuntu 20.04 EC2 instances.
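After installing, you can quickly confirm that the pinned versions are in place. This is a throwaway check, not part of the repository:
# Print installed versions; they should match the pins in requirements.txt.
import boto3
import matplotlib
import sklearn
import tensorflow as tf
print("tensorflow  :", tf.__version__)         # expected 2.10.0
print("boto3       :", boto3.__version__)      # expected 1.26.10
print("scikit-learn:", sklearn.__version__)    # expected 1.2.2
print("matplotlib  :", matplotlib.__version__) # expected 3.6.2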
You can now proceed with running the rest of the project (e.g., data upload, training scripts, evaluation) using the commands provided in the subsequent sections of the guide.
All source code, training scripts (e.g., mri_training.py, evaluate.py), and configuration files are available in the repository:
https://github.com/sfu-cmpt340/2025_1_project_07
- Installation on Your Local Machine
Open your terminal and run the following commands:
git clone https://github.com/sfu-cmpt340/2025_1_project_07.git
cd 2025_1_project_07
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
<a name="data"></a>
- Uploading Data to S3
Create an S3 bucket with any name (we used cmpt-340-rownak-merged) and the following structure:
merged_output/Training/
merged_output/Testing/
Each of these folders contains one subfolder per class (e.g., merged-glioma, merged-meningioma, etc.).
From your local machine, run:
aws s3 cp "/path/to/local/Training" s3://cmpt-340-rownak-merged/merged_output/Training/ --recursive
To improve model robustness and enhance the diversity of the training dataset, we provide the merge_execute_duplicate.py script. This script performs the following tasks:
- Data Cleaning: Identifies and removes duplicate or near-duplicate MRI images from the dataset.
- Data Merging: Combines cleaned and augmented data into a unified dataset, which can then be used to train the model.
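The cleaning logic itself lives in merge_execute_duplicate.py; the snippet below is only a simplified sketch of the idea, flagging byte-identical images by content hash (true near-duplicate detection would also need a perceptual comparison):
# Simplified duplicate finder: group image files by content hash and report repeats.
import hashlib
from collections import defaultdict
from pathlib import Path

def find_duplicates(root):
    groups = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            groups[hashlib.md5(path.read_bytes()).hexdigest()].append(path)
    return {h: paths for h, paths in groups.items() if len(paths) > 1}

# Example: print any duplicate groups found under the local training data
# for paths in find_duplicates("merged_output/Training").values(): print(paths)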
We then used relabeling.py to relabel the training data with prefixes such as "tr-gl" (training glioma) and "tr-mn" (training meningioma).
Next, we fixed the folder hierarchy by moving those four class folders into a main "Training" folder with move_to_training.py.
We initially misplaced the no_tumor data, so an additional script, fixing_no_tumor.py, was run to move the no_tumor dataset into the Training folder correctly.
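For reference, the prefixing step performed by relabeling.py looks roughly like the sketch below; the exact prefix map and file handling in the actual script may differ:
# Prepend a class prefix such as "tr-gl" (training glioma) to each image filename.
from pathlib import Path

PREFIXES = {"merged-glioma": "tr-gl", "merged-meningioma": "tr-mn"}  # extend for the other classes

def relabel(class_dir, prefix):
    for path in Path(class_dir).iterdir():
        if path.is_file() and not path.name.startswith(prefix):
            path.rename(path.with_name(f"{prefix}-{path.name}"))

# Example: relabel("merged_output/Training/merged-glioma", PREFIXES["merged-glioma"])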
aws s3 cp "/path/to/local/Testing" s3://cmpt-340-rownak-merged/merged_output/Testing/ --recursive
(Ensure the AWS CLI is installed and configured on your local machine.)
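As an optional sanity check after uploading, you can count the objects under each prefix with boto3 (bucket name shown for our setup; substitute your own):
# List how many objects landed under each expected prefix in the S3 bucket.
import boto3

BUCKET = "cmpt-340-rownak-merged"
s3 = boto3.client("s3")
paginator = s3.get_paginator("list_objects_v2")
for prefix in ("merged_output/Training/", "merged_output/Testing/"):
    count = sum(len(page.get("Contents", [])) for page in paginator.paginate(Bucket=BUCKET, Prefix=prefix))
    print(f"{prefix}: {count} objects")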
- Running the Training Script on an EC2 Instance
3.1. Launch an EC2 Instance
Use the AWS Management Console to launch an Ubuntu 20.04 LTS instance.
For GPU support, choose an instance type such as g4dn.xlarge, which is what we used to speed up training.
Attach an IAM role with proper S3 permissions (or configure AWS credentials manually).
Use your existing key pair (e.g., cmpt-340-rename.pem) or create a new one while creating the EC2 instance.
3.2. Connect via SSH
ssh -i "/path/to/cmpt-340-rename.pem" ubuntu@<EC2-Public-IP>
3.3. Set Up the Environment on EC2
### Update packages and install essentials
sudo apt update
sudo apt install -y python3-pip python3-venv unzip awscli git
git clone https://github.com/sfu-cmpt340/2025_1_project_07.git
cd 2025_1_project_07
python3.10 -m venv venv
source venv/bin/activate
pip install --upgrade pip setuptools wheel
pip install tensorflow
pip install flask pillow reportlab
python3 app.py
3.4. Run the Training Script
The training script (mri_training.py) downloads data from S3, trains the model, saves it as mri_model.keras, and uploads the outputs back to S3.
python mri_training.py
Monitor the output for any errors. Once complete, verify the model file is saved:
ls -lh mri_model.keras
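For orientation, the overall flow inside mri_training.py is roughly as sketched below. The input size, number of classes, and hyperparameters here are assumptions for illustration; refer to the script itself for the exact values:
# Rough outline of the training flow: build a ResNet50-based classifier,
# train it on the merged data, save it, and upload the result to S3.
import boto3
import tensorflow as tf

base = tf.keras.applications.ResNet50(weights="imagenet", include_top=False,
                                      input_shape=(224, 224, 3))  # input size assumed
base.trainable = False
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(4, activation="softmax"),  # number of classes assumed
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=...)  # datasets built from merged_output/Training
model.save("mri_model.keras")
boto3.client("s3").upload_file("mri_model.keras", "cmpt-340-rownak-merged",
                               "model_output/mri_model.keras")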
- Downloading the Model for Your Backend
Once the model is uploaded to S3, you can download it to your local machine (or your backend server) using the AWS CLI:
aws s3 cp s3://cmpt-340-rownak-merged/model_output/mri_model.keras .
ls -lh mri_model.keras
Then, in your website's backend, load the model with:
import tensorflow as tf
model = tf.keras.models.load_model('mri_model.keras')
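Once loaded, the backend can classify an uploaded scan. The sketch below assumes a 224x224 input, simple 0-1 scaling, and an illustrative class ordering; the real app.py may preprocess and label differently:
# Minimal prediction helper: preprocess one image and return the top class.
import numpy as np
import tensorflow as tf
from PIL import Image

model = tf.keras.models.load_model("mri_model.keras")
CLASS_NAMES = ["glioma", "meningioma", "no_tumor", "pituitary"]  # assumed ordering

def predict(image_path):
    img = Image.open(image_path).convert("RGB").resize((224, 224))  # input size assumed
    x = np.asarray(img, dtype="float32")[None, ...] / 255.0         # add batch dim, scale to [0, 1]
    probs = model.predict(x)[0]
    return CLASS_NAMES[int(np.argmax(probs))], float(probs.max())

# Example: label, confidence = predict("uploads/scan.png")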
- Reproduction of Evaluation Results
To reproduce the evaluation results (e.g., confusion matrix and classification report):
# Create a temporary directory for the dataset
mkdir tmp && cd tmp
# Download the dataset (replace with actual URL)
wget https://yourstorageisourbusiness.com/dataset.zip
unzip dataset.zip
# Return to the project directory and activate the environment
cd ../
source venv/bin/activate
# Run the evaluation script (adjust parameters as needed)
python evaluate.py --epochs=10 --data=tmp/dataset
Output (such as reports and the trained model) will be saved in the designated output directories and optionally uploaded back to S3.
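As a reference for what evaluate.py reports, a confusion matrix and classification report can be produced with scikit-learn as follows (toy labels shown; the script computes these from the real test predictions):
# Illustrative only: building the two evaluation artifacts from predicted labels.
from sklearn.metrics import classification_report, confusion_matrix

y_true = [0, 1, 2, 2, 1, 0]  # true class indices (toy example)
y_pred = [0, 1, 2, 1, 1, 0]  # model's predicted class indices (toy example)
print(confusion_matrix(y_true, y_pred))
print(classification_report(y_true, y_pred, target_names=["glioma", "meningioma", "no_tumor"]))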
- Use git
- Do NOT use history re-editing (rebase)
- Commit messages should be informative:
- No: 'this should fix it', 'bump' commit messages
- Yes: 'Resolve invalid API call in updating X'
- Do NOT include IDE folders (.idea), or hidden files. Update your .gitignore where needed.
- Do NOT use the repository to upload data
- Use VSCode or a similarly powerful IDE
- Use Copilot for free
- Sign up for GitHub Education
