A CUDA/HIP-accelerated video photo-mosaic generator achieving up to 184× speedup, enabling per-frame mosaic rendering for full-length videos.

phantom0174/video_mosaic

Video Mosaic

🔧 Developers

📜 Overview

🎬 Demo: https://www.youtube.com/watch?v=Dh6Jhp5Qp9U

This repository is the public release of the final project for the NTHU Parallel Programming course.
It is designed to significantly accelerate the original photo-mosaic implementation, making it feasible to generate photo mosaics for every frame of a video within a reasonable amount of time.

Additional details can be found in the project slides.

📁 Project Structure

.
├── src/common.h          # shared configuration
├── src/opt_turbo.cu      # CUDA version (migrated from HIP version)
├── src/opt_turbo.cpp     # HIP version
├── src/make_cache.cpp    # preprocess tile library
├── tools/sewing.txt      # ffmpeg command for stitching frames
├── tools/process_pdf.py  # preprocess Epstein PDF files into images
└── bad_frames.zip        # Bad Apple frame DB

🔢 Performance

Matching Setting

  • target: Bad Apple (6571 frames)
  • tile library: Bad Apple frames
  • resolution: 40x40
  • output scale: 3.0 (1440x1080)
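
This README does not describe the matching metric or data layout used by opt_turbo. Purely as orientation for what a "core matching" step in a photo mosaic generally involves, the sketch below picks, for one target cell, the library tile with the smallest sum of squared differences. The `Patch` type, the SSD metric, and the function names are all assumptions made for illustration and are not taken from the project sources.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical layout: each tile and each target cell is a flat vector of
// pixel values with the same length. The real project may store data differently.
using Patch = std::vector<std::uint8_t>;

// Sum of squared differences between two equally sized patches.
std::uint64_t ssd(const Patch& a, const Patch& b) {
    std::uint64_t total = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const std::int64_t d =
            static_cast<std::int64_t>(a[i]) - static_cast<std::int64_t>(b[i]);
        total += static_cast<std::uint64_t>(d * d);
    }
    return total;
}

// Index of the library tile closest to `cell`; the first tile wins on ties.
// Returns 0 for an empty library, so callers should check for that case.
std::size_t best_tile(const Patch& cell, const std::vector<Patch>& library) {
    std::size_t best = 0;
    std::uint64_t best_cost = std::numeric_limits<std::uint64_t>::max();
    for (std::size_t t = 0; t < library.size(); ++t) {
        const std::uint64_t cost = ssd(cell, library[t]);
        if (cost < best_cost) {
            best_cost = cost;
            best = t;
        }
    }
    return best;
}
```

Every cell of every frame can be matched independently of the others, which is the kind of workload that maps naturally onto GPU threads. How the actual kernels partition the work is not stated here.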

Results

version       system spec                      core matching time  equivalent speedup  comment
photo-mosaic  TR 7960X w/ 2x 5070 Ti 16G       ~10 min             baseline
HIP           EPYC 7742 w/ 2x MI100            33 s                x18.2               I/O bound (CPU)
CUDA (opt)    Ultra 7 265K w/ 1x 5060 Ti 16G   6.5 s               x184.6
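
The README does not define "equivalent speedup". One reading that reproduces both reported figures is the plain time ratio against the roughly 10-minute baseline, scaled by the ratio of GPU counts (the baseline and HIP runs used two GPUs, the CUDA run one). The helper below is a sketch under that assumption; `equivalent_speedup` is a hypothetical name, and the GPU-count normalization is inferred from the numbers rather than stated by the authors.

```cpp
// Speedup versus a baseline, scaled by the ratio of GPU counts.
// ASSUMPTION: the per-GPU normalization is inferred from the table values,
// not a definition given in the README.
double equivalent_speedup(double baseline_seconds, int baseline_gpus,
                          double seconds, int gpus) {
    const double time_ratio = baseline_seconds / seconds;
    const double gpu_ratio = static_cast<double>(baseline_gpus) / gpus;
    return time_ratio * gpu_ratio;
}
```

With the baseline taken as 600 s on 2 GPUs, `equivalent_speedup(600.0, 2, 33.0, 2)` gives about 18.2 and `equivalent_speedup(600.0, 2, 6.5, 1)` gives about 184.6. The raw time ratio for the CUDA run alone would be about 92.3.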

🧱 Requirements

CUDA Platform (Primary)

  • Host OS: Windows 11
  • Linux Environment: WSL2
  • Kernel: 6.6.87.2-microsoft-standard-WSL2
  • Architecture: x86_64
  • GPU: NVIDIA RTX 5060 Ti 16GB
  • NVIDIA Driver: supports CUDA 13.1
  • CUDA Runtime: 13.1

HIP Platform

  • OS: Debian GNU/Linux
  • Kernel: 6.1.0-39-amd64
  • Architecture: x86_64
  • GPU: AMD MI100
  • ROCm version: unavailable (machine currently offline)

☘️ Usage

The frame-extraction script for target videos (get_frames.py) can be found in the photo-mosaic project.

  1. Get the essential tools:

    run get_ffmpeg.sh and get_img_turbo.sh

  2. Tune matching parameters in common.h
  3. Build the cache generator:

    make make-cache

  4. Build the executable:

    make opt-turbo    # HIP build
    make opt-cuda     # CUDA build

  5. Generate the tile cache:

    run make-cache

  6. Run the mosaic generator:

    run opt-turbo/opt-cuda

  7. Stitch frames into a video using the command in sewing.txt
  8. Done.
