Image_Generator_SD is a web application that allows you to generate or modify images using Stable Diffusion models through a user-friendly Gradio interface. This project combines three core functionalities:
----> Web-UI Access
- Txt2Img: Generate images from scratch using text prompts.
- Img2Img: Transform an existing image guided by text prompts.
- Inpainting: Modify or fill specific regions of an image using masks and text prompts.
Below is a personal demonstration video showcasing how to use and interact with the application:
Gradio is an open-source Python library that enables developers to create customizable web interfaces for machine learning models with minimal code. Founded in 2019 by Abubakar Abid and colleagues, Gradio was designed to make machine learning models more accessible and interactive for users without requiring specialized software or expertise.
In 2021, Gradio was acquired by Hugging Face.
Gradio interfaces are practical for developers and data scientists during the development and testing phases and are also highly valuable for showcasing models to stakeholders, clients, or users. By providing an interactive and user-friendly interface, Gradio allows non-technical users to quickly understand and interact with the underlying machine learning models, fostering collaboration and feedback.
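To illustrate the "minimal code" point, here is a small self-contained Gradio sketch (not taken from this repo): a plain Python function wrapped in a `gr.Interface` with text input and output.

```python
# A minimal Gradio demo of the kind described above: any Python
# function plus an input/output spec becomes a web interface.

def greet(name: str) -> str:
    return f"Hello, {name}!"

if __name__ == "__main__":
    import gradio as gr
    # fn is the function to expose; inputs/outputs describe the widgets.
    demo = gr.Interface(fn=greet, inputs="text", outputs="text")
    demo.launch()  # serves at http://127.0.0.1:7860 by default
```

The same pattern scales up: this project's tabs wrap the Txt2Img, Img2Img, and Inpainting functions in richer components (image editors, sliders) instead of plain text boxes.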
Stable Diffusion is a deep learning text-to-image model released in 2022, leveraging diffusion technology and latent space for efficient processing. This significantly reduces hardware requirements, allowing it to run on consumer GPUs.
This model has inspired major open-source projects like ControlNet, which enables fine control over image generation using depth maps, pose estimation, and edge detection. ComfyUI offers a node-based, visual workflow to build complex Stable Diffusion pipelines without coding. AnimateDiff brings AI-driven animations to life by applying Stable Diffusion consistently across frames.
This project is organized around three major classes to handle different image generation workflows: Txt2Img, Img2Img, and Inpainting. Each class is located in model.py and is utilized within different Gradio interfaces in app.py.
Code Explanation
- Location: `class Txt2Img` in `model.py`
- Purpose: Generate images from textual prompts (positive and optionally negative).
- Key Steps:
  - Load a `StableDiffusionPipeline` (default: `"CompVis/stable-diffusion-v1-4"`).
  - Move the pipeline to the available device (CUDA if available).
  - Call `txt2img()` with the following parameters:
    - `pos_prompt` (required)
    - `neg_prompt` (optional negative prompt)
    - `guidance_scale`
    - `steps` (number of inference steps)
    - `width` and `height` (image dimensions)
In app.py, the function generate_img_from_txt(...) orchestrates this process and returns the generated image to the user interface.
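The steps above can be sketched as a minimal wrapper class. This is an assumed structure for illustration (the real `class Txt2Img` in `model.py` may differ); the model id and parameter names follow the description above, and the heavy imports are kept inside the load step so the sketch can be read without downloading weights.

```python
# Hypothetical sketch of a Txt2Img wrapper around diffusers'
# StableDiffusionPipeline; not the repo's actual model.py.

class Txt2Img:
    def __init__(self, model_id="CompVis/stable-diffusion-v1-4"):
        self.model_id = model_id
        self.pipe = None  # loaded lazily on first use

    def _load(self):
        # Local imports: loading pulls in torch/diffusers and downloads weights.
        import torch
        from diffusers import StableDiffusionPipeline
        device = "cuda" if torch.cuda.is_available() else "cpu"
        self.pipe = StableDiffusionPipeline.from_pretrained(self.model_id).to(device)

    def txt2img(self, pos_prompt, neg_prompt="", guidance_scale=7.5,
                steps=50, width=512, height=512):
        if self.pipe is None:
            self._load()
        result = self.pipe(
            prompt=pos_prompt,
            negative_prompt=neg_prompt or None,
            guidance_scale=guidance_scale,
            num_inference_steps=steps,
            width=width,
            height=height,
        )
        return result.images[0]  # a PIL image
```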
Code Explanation
- Location: `class Img2Img` in `model.py`
- Purpose: Transform an existing image based on new prompts and parameters.
- Key Steps:
  - Load a `StableDiffusionImg2ImgPipeline` (default: `"runwayml/stable-diffusion-v1-5"`).
  - Resize the input image if necessary (via `resize_image()` in `imageProcess.py`).
  - Call `img2img()` with parameters:
    - `image` (the original image file path)
    - `pos_prompt` & `neg_prompt`
    - `strength` (how strongly the prompt influences the final image)
    - `guidance_scale`
    - `steps` (number of inference steps)
In app.py, the function generate_img_from_img(...) is triggered upon user interaction in the Img2Img tab.
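The resize step matters because Stable Diffusion pipelines expect dimensions that are multiples of 8. The helper below is a plausible sketch of what `resize_image()` in `imageProcess.py` might do, not the repo's actual implementation: cap the longest side and snap both dimensions down to multiples of 8.

```python
# Hypothetical resize helper; the real resize_image() in
# imageProcess.py may use different limits or rounding.
from PIL import Image

def resize_image(img, max_side=768):
    # Scale down so the longest side is <= max_side, then snap both
    # dimensions to multiples of 8 (required by Stable Diffusion).
    w, h = img.size
    scale = min(1.0, max_side / max(w, h))
    w = max(8, int(w * scale) // 8 * 8)
    h = max(8, int(h * scale) // 8 * 8)
    return img.resize((w, h))
```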
Code Explanation
- Location: `class Inpainting` in `model.py`
- Purpose: Fill or modify specific regions of an image using a mask.
- Key Steps:
  - Load a `StableDiffusionInpaintPipeline` (default: `"runwayml/stable-diffusion-inpainting"`).
  - Convert and prepare the mask from the user's edits. This involves converting the alpha channel to a binary mask.
  - Call `inpainting()` with parameters:
    - `image` (the original image)
    - `mask` (the area to modify)
    - `pos_prompt` & `neg_prompt`
    - `guidance_scale`
    - `steps` (inference steps)
    - `strength` (influence of the prompt on changes)
In app.py, the function generate_image_from_paint(...) processes the edited image and mask from the Gradio ImageEditor component, then performs the inpainting.
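The alpha-to-mask conversion described above can be sketched as follows. This is an assumed implementation (the repo's actual helper is not shown): pixels the user painted in the `ImageEditor` carry a nonzero alpha, and the inpainting pipeline expects a black/white mask where white marks the region to repaint.

```python
# Hypothetical mask-preparation helper; the real code in this
# project may threshold or post-process differently.
import numpy as np
from PIL import Image

def mask_from_alpha(edited_rgba, threshold=0):
    # Extract the alpha channel from the user's edited layer and
    # binarize it: painted pixels (alpha > threshold) become white (255),
    # untouched pixels become black (0).
    alpha = np.array(edited_rgba.convert("RGBA"))[:, :, 3]
    binary = (alpha > threshold).astype(np.uint8) * 255
    return Image.fromarray(binary, mode="L")
```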
To run this application on a Linux or Windows environment, follow these steps:
Linux:

```
git clone https://github.com/YourUsername/Image_Generator_SD.git
cd Image_Generator_SD
./setup-linux.sh
```

Windows:

```
git clone https://github.com/YourUsername/Image_Generator_SD.git
cd Image_Generator_SD
.\setup-windows.bat
```

Once installed, open your browser and go to http://127.0.0.1:7860
