Parallel Image Processing Using Different Parallel Computing Paradigms

Project Overview

This project implements a parallel image processing system that applies a sequence of five filters—grayscale conversion, Gaussian blur, edge detection, image sharpening, and brightness adjustment—to a large dataset of 10,000 images. Processing thousands of images sequentially is inefficient and does not utilize modern multi-core processors effectively. By leveraging parallel computing, the system distributes the workload across multiple workers to significantly reduce execution time and maximize CPU utilization.
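
As a rough illustration, the five-filter pipeline could be expressed with Pillow as in the sketch below. This is a minimal sketch, assuming Pillow is used; the function name and parameter values are illustrative and may differ from the actual implementation in src/filters.py.

# Minimal sketch of the five-filter pipeline (illustrative, not the exact code in src/filters.py).
from PIL import Image, ImageFilter, ImageEnhance, ImageOps

def apply_pipeline(image: Image.Image) -> Image.Image:
    image = ImageOps.grayscale(image)                          # grayscale conversion
    image = image.filter(ImageFilter.GaussianBlur(radius=2))   # Gaussian blur
    image = image.filter(ImageFilter.FIND_EDGES)               # edge detection
    image = image.filter(ImageFilter.SHARPEN)                  # image sharpening
    image = ImageEnhance.Brightness(image).enhance(1.2)        # brightness adjustment
    return image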

The project compares three parallel computing approaches using Python's standard libraries: multiprocessing (process-based parallelism), concurrent.futures with processes, and concurrent.futures with threads. This comparison evaluates the performance characteristics and trade-offs between different parallelization paradigms, demonstrating how to effectively tackle compute-intensive workloads in Python.

System Architecture & Parallelization Strategy

The system follows a data parallelism strategy where the same image processing pipeline is applied concurrently to different images. Input images are loaded from category-based folders, distributed across multiple workers (processes or threads), and processed independently. Each worker executes the complete filter pipeline and saves results to the output directory.

Key Design Principles:

  • Data Partitioning: The dataset (not the algorithm) is partitioned across workers, with each image processed independently.
  • Independent Execution: Workers operate without shared state, shared memory access, or synchronization, minimizing communication overhead and eliminating race conditions.
  • Scalability: The architecture scales effectively with the number of available CPU cores since each worker processes images independently.
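
To make the data-partitioning and independent-execution points concrete, the sketch below shows a hypothetical per-image worker and a helper that builds one task per image from the category folders. The names (collect_tasks, process_image) and the apply_pipeline function from the sketch above are assumptions, not the exact code in src/.

# Sketch of dataset partitioning and an independent per-image worker.
# Each task is self-contained, so workers need no shared state or locks.
import os
from PIL import Image

def collect_tasks(input_dir, output_dir):
    # One (input_path, output_path) pair per image, grouped by category folder.
    tasks = []
    for category in sorted(os.listdir(input_dir)):
        category_dir = os.path.join(input_dir, category)
        if not os.path.isdir(category_dir):
            continue
        for name in os.listdir(category_dir):
            tasks.append((os.path.join(category_dir, name),
                          os.path.join(output_dir, category, name)))
    return tasks

def process_image(task):
    # Runs the full filter pipeline on a single image and saves the result.
    input_path, output_path = task
    with Image.open(input_path) as img:
        result = apply_pipeline(img)
    os.makedirs(os.path.dirname(output_path), exist_ok=True)
    result.save(output_path)
    return output_path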

Parallel Computing Paradigms Used

Multiprocessing Module

Python's multiprocessing module provides process-based parallelism by creating separate Python processes, each with its own interpreter and memory space. This approach completely bypasses Python's Global Interpreter Lock (GIL), enabling true parallel execution on multi-core processors.

Characteristics:

  • Creates independent processes for each worker
  • Each process has isolated memory space
  • Ideal for CPU-bound tasks (like image filtering)
  • Higher overhead due to process creation and inter-process communication
  • No GIL contention

Implementation in this project: Uses Pool to spawn worker processes and imap_unordered() for efficient task distribution without order guarantees.
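
A minimal sketch of that pattern, reusing the hypothetical process_image worker and collect_tasks helper from above (worker count, chunksize, and paths are illustrative):

# Sketch: distribute tasks across worker processes with Pool.imap_unordered.
import multiprocessing as mp

def run_multiprocessing(tasks, workers=4):
    with mp.Pool(processes=workers) as pool:
        # imap_unordered yields each result as soon as any worker finishes,
        # without preserving submission order.
        for _ in pool.imap_unordered(process_image, tasks, chunksize=32):
            pass  # results are written to disk inside the worker

if __name__ == "__main__":
    # Illustrative paths, relative to src/.
    run_multiprocessing(collect_tasks("../data/10000_raw_images", "../output"))

Note that on platforms that spawn rather than fork worker processes (e.g. Windows and macOS), the worker function must be importable at module level and the entry point guarded by if __name__ == "__main__":, as above.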

Concurrent.Futures

The concurrent.futures module provides a high-level executor-based interface for running tasks asynchronously. It abstracts the complexity of thread/process management and allows the same code to switch between different execution backends.

ProcessPoolExecutor

ProcessPoolExecutor uses a pool of worker processes, similar to multiprocessing but with a higher-level API. It provides the same true parallelism as the multiprocessing module by bypassing the GIL.

Characteristics:

  • Process-based parallelism with a simple executor interface
  • Suitable for CPU-bound workloads
  • Automatic resource management with context manager support
  • Similar performance to multiprocessing module
  • Easier to use and more Pythonic
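
Expressed through the executor interface, the same workload might look like the sketch below (a sketch assuming executor.map is used with the hypothetical process_image worker; the actual script may use submit() and as_completed() instead):

# Sketch: process-based executor version of the same pattern.
from concurrent.futures import ProcessPoolExecutor

def run_futures_process(tasks, workers=4):
    with ProcessPoolExecutor(max_workers=workers) as executor:
        # executor.map distributes tasks to worker processes and yields
        # results lazily; chunksize batches tasks to reduce IPC overhead.
        for _ in executor.map(process_image, tasks, chunksize=32):
            pass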

ThreadPoolExecutor

ThreadPoolExecutor uses a pool of worker threads operating within the same Python process. While threads share memory efficiently, Python's Global Interpreter Lock (GIL) prevents true parallel execution of Python bytecode on CPU-bound tasks.

Characteristics:

  • Thread-based parallelism within a single process
  • Efficient for I/O-bound tasks (network requests, file operations)
  • Limited effectiveness for CPU-bound tasks due to GIL contention
  • Lower memory overhead compared to processes
  • Simpler state sharing between threads

Note: For this image processing project, ThreadPoolExecutor is less efficient than process-based approaches for CPU-intensive filter operations, but is included for comparison purposes.
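
Because ThreadPoolExecutor and ProcessPoolExecutor share the same Executor interface, the thread-based variant is essentially a one-line change from the process-based sketch above (again assuming the hypothetical process_image worker):

# Sketch: thread-based variant; only the executor class changes.
from concurrent.futures import ThreadPoolExecutor

def run_futures_thread(tasks, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as executor:
        for _ in executor.map(process_image, tasks):
            pass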

Dataset

This project uses a subset of the Food-101 Dataset from Kaggle. The dataset contains 10 food categories with 1,000 images per category, for a total of 10,000 images.

Folder Structure

parallel-image-processing/
├── src/
│   ├── filters.py                          # Image filter implementations (grayscale, blur, edge, sharpen, brightness)
│   ├── image_processor_multiprocessing.py  # Parallel image processing using multiprocessing (process-based)
│   ├── image_processor_futures_process.py  # Parallel image processing using concurrent.futures ProcessPoolExecutor
│   └── image_processor_futures_thread.py   # Parallel image processing using concurrent.futures ThreadPoolExecutor
├── data/                                   # Input image dataset (organized by food categories)
├── output/                                 # Output images after applying the full processing pipeline
├── requirements.txt                        # Python dependencies required for the project
└── README.md

Installation & Setup

1. Get the Source Code and Dataset

  1. Clone this repository to your local machine.
  2. In the project root directory, create a folder named data.
  3. Download the dataset (the Food-101 subset from Kaggle described above).
  4. Unzip the dataset and place all category folders inside the data directory.

After setup, the dataset structure should look as follows:

data/
└── 10000_raw_images/
    ├── category_1/
    │   ├── image1.jpg
    │   ├── image2.jpg
    │   └── ...
    ├── category_2/
    │   ├── image3.jpg
    │   ├── image4.jpg
    │   └── ...
    └── ...

2. Create a Virtual Machine on GCP

  1. Navigate to Google Cloud Console → Compute Engine → VM Instances.
  2. Click Create Instance.
  3. Configure the virtual machine with the following settings:
    • Name: image-processing-vm
    • Region: Singapore
    • Zone: Any available zone
    • Machine Series: E2
    • Machine Type: e2-standard-16 (16 vCPUs, 64 GB memory)
    • Operating System: Default Linux image
  4. Click Create to launch the virtual machine.

3. Upload Source Code and Dataset to the VM

3.1 Install Google Cloud SDK

  1. Download and install the Google Cloud SDK on your local machine.
  2. Initialize the SDK and authenticate your account using:
    gcloud init
    
  3. Set the correct project and zone when prompted.

3.2 Upload Files to the VM

  1. Open a terminal in the directory that contains this repository folder.
  2. Use the gcloud compute scp command to upload the project repository from your local machine to the VM:
    gcloud compute scp --recurse ./parallel-image-processing \
    YOUR_USERNAME@image-processing-vm:/home/YOUR_USERNAME \
    --zone=YOUR_ZONE
    
  3. Verify that the files are successfully uploaded by connecting to the VM and listing the directory contents:
    ls ~/parallel-image-processing
    

After completion, the source code and dataset will be available on the VM and ready for execution.

How to Run

  1. Connect to the VM using SSH from the GCP console.
  2. Update the package list and install required system packages:
    sudo apt update
    sudo apt install -y python3-pip python3-venv
    
  3. Navigate to the project directory:
    cd ~/parallel-image-processing
    
  4. Create a Python virtual environment:
    python3 -m venv venv
    
  5. Activate the virtual environment:
    source venv/bin/activate
    
  6. Install all required Python dependencies using the provided requirements.txt file:
    pip install -r requirements.txt
    
  7. Navigate to the source code directory:
    cd src
    
  8. Run one of the image processing programs (image_processor_multiprocessing.py, image_processor_futures_process.py, or image_processor_futures_thread.py) with the desired number of worker processes, for example:
    python image_processor_multiprocessing.py --workers 4
    
  9. Alternatively, you can run a single shell script with ./benchmark.sh | tee results.txt, which runs all experiments sequentially with different worker counts (1, 2, 4, 8, and 16). The output is shown in the terminal and saved to a file named results.txt at the same time.

Performance Evaluation

Performance is evaluated by measuring the total execution time for processing all 10,000 images. Each parallel implementation is tested with varying numbers of workers (1, 2, 4, 8, and 16) to observe how execution time changes with different levels of parallelism.
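
A minimal sketch of how one such measurement could be taken, assuming the hypothetical run_* helpers sketched earlier (the actual scripts may time and report results differently):

# Sketch: wall-clock timing for one worker-count configuration.
import time

def benchmark(run_fn, tasks, workers):
    start = time.perf_counter()
    run_fn(tasks, workers=workers)
    elapsed = time.perf_counter() - start
    print(f"Number of workers   : {workers}")
    print(f"Time taken (seconds): {elapsed:.2f}")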

Sample Output:

================================ Multiprocessing Module ================================
Processing 10000 images using 4 processes

--- Execution Summary ---
Number of processes : 4
Time taken (seconds): 23.45
