🧠 3D Image Reconstruction from 2D Views using Vision Transformer (ViT)

A full-stack deep learning web application that reconstructs 3D objects from 2D views using a Vision Transformer (ViT)-based model. It combines a deep learning inference backend, an interactive web UI, and real-time 3D visualization.

🔍 Overview

This system takes 2D views of an object as input, processes them through a transformer-based deep learning model, and outputs a 3D model that can be visualized in the browser and downloaded in .OBJ format.
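To make the .OBJ output step concrete, here is a minimal sketch of how a triangle mesh could be serialized to Wavefront OBJ text. The function name `mesh_to_obj` and the vertex/face layout are assumptions for illustration; the repository's own export code may be organized differently.

```python
def mesh_to_obj(vertices, faces):
    """Serialize a triangle mesh to Wavefront OBJ text (hypothetical helper).

    vertices: list of (x, y, z) floats
    faces: list of (i, j, k) zero-based vertex indices
    """
    lines = []
    for x, y, z in vertices:
        lines.append(f"v {x:.6f} {y:.6f} {z:.6f}")
    for face in faces:
        # OBJ face indices are 1-based, so shift each index by one.
        lines.append("f " + " ".join(str(i + 1) for i in face))
    return "\n".join(lines) + "\n"


# A single triangle as a toy example.
triangle_obj = mesh_to_obj(
    vertices=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    faces=[(0, 1, 2)],
)
print(triangle_obj)
```

The resulting text can be written to a file with an .obj extension and opened in any viewer that reads OBJ files.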

🚀 Tech Stack

  • Frontend: React.js
  • Backend: Python Flask (REST API)
  • Model: Vision Transformer (ViT) for multi-view 3D reconstruction
  • 3D Visualization: Three.js (WebGL)

🔧 Key Features

  • 🎯 Transformer-based 3D Reconstruction: Uses ViT with self-attention for accurate feature extraction and integration across views.
  • πŸ–ΌοΈ Multi-View Input: Upload multiple 2D images captured from various angles.
  • 🧠 Deep Learning Inference API: Flask backend performs inference and generates the 3D model.
  • 🌐 Interactive Web Interface: Intuitive React-based frontend for uploading images, visualizing the output, and downloading .OBJ files.
  • 🧊 3D Viewer Integration: Embedded real-time 3D preview of reconstructed objects using Three.js.
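As a rough illustration of how multi-view input can be fed to a ViT-style model, the sketch below splits each view into fixed-size patches and concatenates the patches of all views into one token sequence. This is the standard ViT patching idea in plain Python, not the project's actual preprocessing; `image_to_patches`, `views_to_tokens`, and the toy 4x4 views are assumptions.

```python
def image_to_patches(image, patch_size):
    """Split a 2D grid (list of rows) into flattened square patches, row-major."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [
                image[r][c]
                for r in range(top, top + patch_size)
                for c in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


def views_to_tokens(views, patch_size):
    """Concatenate the patch tokens of several views into one sequence."""
    tokens = []
    for view in views:
        tokens.extend(image_to_patches(view, patch_size))
    return tokens


# Two hypothetical 4x4 grayscale views, split into 2x2 patches.
view_a = [[r * 4 + c for c in range(4)] for r in range(4)]
view_b = [[0] * 4 for _ in range(4)]
tokens = views_to_tokens([view_a, view_b], patch_size=2)
print(len(tokens), tokens[0])
```

In a full ViT, each flattened patch would additionally pass through a learned linear projection and receive a position embedding before the transformer layers; those steps are omitted here.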

📦 Project Structure

├── static/                   # Style and JavaScript files
│   ├── css/
│   │   └── style.css
│   └── js/
│       └── main.js
├── templates/                # HTML template file
│   └── Index.html
├── app.py                    # Flask API
├── generation_code.ipynb     # Model generation code
└── requirements.txt

🛠️ Setup Instructions

1. Clone the Repository

git clone https://github.com/RatneshKJaiswal/Imagin3D
cd Imagin3D

2. Backend Setup

python -m venv venv
source venv/bin/activate  # Linux/macOS
venv\Scripts\activate     # Windows (use instead of the line above)
pip install -r requirements.txt
python app.py

🖼️ Example Usage

  1. Upload multiple 2D views of an object
  2. Preview the reconstructed 3D model on the web page, rendered with Three.js
  3. Download the model as an .OBJ file
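After downloading, a quick sanity check of the .OBJ file can catch an empty or malformed export. The following is a minimal sketch that counts vertices and faces and verifies that face indices point at existing vertices. `summarize_obj` is a hypothetical helper, and it does not handle OBJ's negative (relative) indices.

```python
def summarize_obj(obj_text):
    """Count vertices and faces in OBJ text and check that face indices are valid."""
    vertex_count = 0
    faces = []
    for raw_line in obj_text.splitlines():
        parts = raw_line.split()
        if not parts or parts[0].startswith("#"):
            continue  # skip blank lines and comments
        if parts[0] == "v":
            vertex_count += 1
        elif parts[0] == "f":
            # A face entry may look like "3", "3/1", or "3/1/2"; the vertex
            # index is the part before the first slash and is 1-based.
            faces.append([int(entry.split("/")[0]) for entry in parts[1:]])
    indices_ok = all(1 <= i <= vertex_count for face in faces for i in face)
    return {"vertices": vertex_count, "faces": len(faces), "indices_ok": indices_ok}


# Toy file contents standing in for a downloaded model.
sample = "# toy mesh\nv 0 0 0\nv 1 0 0\nv 0 1 0\nf 1 2 3\n"
summary = summarize_obj(sample)
print(summary)
```

For a real file, the text can be read with `open(path).read()` and passed to the same function.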

🧠 Model Details

The ViT-based model leverages self-attention to learn spatial dependencies across image views, enabling robust feature fusion and accurate 3D reconstruction.
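The self-attention step can be sketched in a few lines of plain Python. The example below is a simplified, single-head version with identity query/key/value projections (an assumption made to keep it short; a real ViT learns these projections), and the three toy tokens stand in for patch features from different views.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections."""
    dim = len(tokens[0])
    outputs, weights = [], []
    for query in tokens:
        # Similarity of this token to every token, scaled by sqrt(dim).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        row = softmax(scores)
        weights.append(row)
        # Each output is a weighted average of all token vectors.
        outputs.append(
            [sum(w * value[d] for w, value in zip(row, tokens)) for d in range(dim)]
        )
    return outputs, weights


# Hypothetical 2-dimensional features for three tokens.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused, attn = self_attention(tokens)
print(attn[0])
```

Each row of `attn` sums to 1, so every output token is a convex combination of all input tokens. This is what lets information from one view influence the features of another.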

📄 License

This project is licensed under the MIT License.

🤝 Acknowledgments

  • Vision Transformer (ViT) research community
  • JavaScript and Flask documentation
  • Three.js for 3D visualization

🎓 Representation Diagrams

Fig.1. Web Page Sample and Preview Video

Fig.2. Training loss and accuracy graph

Fig.3. Project Management Timeline

Fig.4. ER Diagram

Fig.5. Data Flow Diagram
