🧠 3D Image Reconstruction from 2D Views using Vision Transformer (ViT)

A full-stack deep learning web application that reconstructs 3D objects from 2D views using a Vision Transformer (ViT)-based model. The project combines a deep learning inference backend, an interactive web UI, and real-time 3D visualization.

🔍 Overview

The system takes 2D images as input, each captured from a single viewpoint, processes them through a transformer-based deep learning model, and outputs a 3D model that can be visualized and downloaded in .OBJ format.
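The final step of the pipeline, exporting the result as .OBJ, can be illustrated with a minimal sketch. It assumes the model output has already been converted to a triangle mesh (a list of vertices and a list of zero-based face indices); the function name `mesh_to_obj` is hypothetical and is not taken from this repository.

```python
from typing import Sequence, Tuple

Vertex = Tuple[float, float, float]
Face = Tuple[int, int, int]  # zero-based vertex indices (assumed convention)


def mesh_to_obj(vertices: Sequence[Vertex], faces: Sequence[Face]) -> str:
    """Serialize a triangle mesh to Wavefront OBJ text (hypothetical helper)."""
    lines = []
    for x, y, z in vertices:
        # Each "v" record holds one vertex position.
        lines.append(f"v {x:.6f} {y:.6f} {z:.6f}")
    for a, b, c in faces:
        # OBJ face indices are 1-based, so shift the zero-based indices.
        lines.append(f"f {a + 1} {b + 1} {c + 1}")
    return "\n".join(lines) + "\n"


# A single triangle stands in for a reconstructed mesh.
triangle_obj = mesh_to_obj(
    vertices=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    faces=[(0, 1, 2)],
)
print(triangle_obj)
```

The resulting text can be saved with a .obj extension and opened in most 3D viewers.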

🚀 Tech Stack

Frontend: React.js
Backend: Python Flask (REST API)
Model: Vision Transformer (ViT) for multi-view 3D reconstruction
3D Visualization: Three.js (WebGL)

🔧 Key Features

🎯 Transformer-based 3D Reconstruction: Uses ViT with self-attention for accurate feature extraction and integration across views.
🖼️ Multi-View Input: Upload multiple 2D images captured from various angles.
🧠 Deep Learning Inference API: Flask backend performs inference and generates the 3D model.
🌐 Interactive Web Interface: Intuitive React-based frontend for uploading images, visualizing the output, and downloading .OBJ files.
🧊 3D Viewer Integration: Embedded real-time 3D preview of reconstructed objects using Three.js.
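As an illustration of the multi-view input step, the sketch below filters a batch of uploaded filenames down to supported image types before inference. The accepted extensions and the helper name `filter_view_filenames` are assumptions for this example; the actual backend may validate uploads differently.

```python
from pathlib import PurePath
from typing import Iterable, List

# Assumed set of accepted image types; the real backend may differ.
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def filter_view_filenames(filenames: Iterable[str]) -> List[str]:
    """Keep only filenames with an allowed image extension, preserving order."""
    accepted = []
    for name in filenames:
        # Compare extensions case-insensitively (".JPG" matches ".jpg").
        suffix = PurePath(name).suffix.lower()
        if suffix in ALLOWED_EXTENSIONS:
            accepted.append(name)
    return accepted


views = filter_view_filenames(["front.png", "side.JPG", "notes.txt", "top.jpeg"])
print(views)
```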

📦 Project Structure

├── static/                   # Style and JavaScript files
│   ├── css/
│   │   └── style.css
│   └── js/
│       └── main.js
├── templates/                # HTML template file
│   └── Index.html
├── app.py                    # Flask API
├── generation_code.ipynb     # Model generation code
└── requirements.txt

🛠️ Setup Instructions

1. Clone the Repository

git clone https://github.com/RatneshKJaiswal/Imagin3D
cd Imagin3D

2. Backend Setup

# Run from the repository root, where app.py and requirements.txt are located
python -m venv venv
source venv/bin/activate  # Linux/macOS
venv\Scripts\activate     # Windows (use instead of the line above)
pip install -r requirements.txt
python app.py

🖼️ Example Usage

Upload multiple 2D views of an object
Preview the reconstructed 3D model on the webpage using Three.js
Download the model as an .OBJ file
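After downloading, a quick sanity check on the .OBJ file is to count its vertices and faces. The following is a minimal sketch in plain Python; the helper `summarize_obj` and the sample text are hypothetical and only stand in for a real downloaded file.

```python
def summarize_obj(obj_text: str) -> dict:
    """Count vertex ("v") and face ("f") records in Wavefront OBJ text."""
    counts = {"vertices": 0, "faces": 0}
    for line in obj_text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if parts[0] == "v":
            counts["vertices"] += 1
        elif parts[0] == "f":
            counts["faces"] += 1
        # Other records (comments, "vn", "vt", ...) are ignored.
    return counts


# Hypothetical file contents; replace with the text of a downloaded model.
sample = """# example mesh
v 0 0 0
v 1 0 0
v 0 1 0
vn 0 0 1
f 1 2 3
"""
summary = summarize_obj(sample)
print(summary)
```

For a real file, read its contents with `open(path).read()` and pass the text to the helper.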

🧠 Model Details

The ViT-based model leverages self-attention to learn spatial dependencies across image views, enabling robust feature fusion and accurate 3D reconstruction.
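To make the self-attention idea concrete, here is a minimal sketch of scaled dot-product self-attention over a handful of token vectors. It is a simplification under stated assumptions: queries, keys, and values are the tokens themselves (identity projections), there is a single head, and the token values are made up. A real ViT uses learned projections, multiple heads, and patch embeddings, and this is not the code from generation_code.ipynb.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens: List[Vector]) -> List[Vector]:
    """Scaled dot-product self-attention with identity Q/K/V (illustrative only)."""
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    outputs = []
    for query in tokens:
        # Similarity of this token to every token, scaled by sqrt(dim).
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        weights = softmax(scores)
        # Each output is a weighted average of all token vectors.
        mixed = [
            sum(w * value[i] for w, value in zip(weights, tokens))
            for i in range(dim)
        ]
        outputs.append(mixed)
    return outputs


# Three made-up 2D tokens, e.g. features from three image patches or views.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = self_attention(tokens)
print(fused)
```

Because every output row mixes information from all input tokens, features from different patches or views can influence one another, which is the fusion behavior described above.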

📄 License

This project is licensed under the MIT License.

🤝 Acknowledgments

Vision Transformer (ViT) research community
JavaScript and Flask documentation
Three.js for 3D visualization

🎓 Representation Diagrams

Fig.1. Web Page Sample and Preview Video

Screen.Recording.2025-05-28.153811.new.mp4

Fig.2. Training loss and accuracy graph

Fig.3. Project Management Timeline

Fig.4. ER Diagram

Fig.5. Data Flow Diagram
