AI Illustrator

AI Illustrator automatically generates consistent, high-quality illustrations for stories, using Google's Gemini models for both text analysis and image generation. It reads a story text file, analyzes it to determine the visual style, characters, and locations, and then generates a sequence of cinematic illustrations.

✨ Features

Automatic Style Detection: Analyzes the story text to determine the most appropriate art style and generates consistent illustrations based on that style.
Character Consistency:
- Extracts character descriptions and generates reference character images (Full Body, 16:9).
- Maintains a persistent catalog of characters in output/data.json to ensure the same character looks consistent throughout the story.
- Uses reference images (multimodal generation) to keep character appearance stable across different scenes.
Location Consistency:
- Generates and caches location reference images (16:9 cinematic shots).
- Maintains a location catalog in output/data.json to reuse settings.
Cinematic Scene Generation:
- Splits the story into logical scenes.
- Generates a single, cohesive cinematic frame for each scene (16:9 aspect ratio).
- Enforces strict negative constraints to prevent comic-book layouts, text, or split screens.
- Uses full-body character references to maintain consistency across scenes.
Docker Support: Fully containerized for easy deployment and execution.
Comprehensive Testing: Includes a full suite of unit and integration-like tests using pytest.
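
The catalog reuse described under Character Consistency can be pictured with a small sketch. This is not the project's actual code: the function name `get_or_create_character` and the case-insensitive match on `original_name` are assumptions made for illustration, and only the field names come from the data.json layout documented further down.

```python
def get_or_create_character(catalog: list[dict], original_name: str, description: str) -> dict:
    """Return the existing catalog entry for a character, or add a new one.

    Assumption: characters are matched case-insensitively on original_name.
    The real tool may use a different matching rule.
    """
    cleaned = original_name.strip()
    key = cleaned.lower()
    for entry in catalog:
        if entry["original_name"].strip().lower() == key:
            return entry  # reuse, so later scenes share the same reference

    entry = {
        # Next free id: one more than the largest id seen so far.
        "id": max((e["id"] for e in catalog), default=0) + 1,
        "name": cleaned,
        "original_name": cleaned,
        "description": description,
    }
    catalog.append(entry)
    return entry
```

Because a repeated lookup returns the stored entry rather than a fresh one, every scene that mentions the character would point at the same description and reference image.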

🛠️ Prerequisites

Python 3.10+ (if running locally)
Docker & Docker Compose (recommended for isolation)
Google Cloud API Key with access to Gemini models (including image generation capabilities).

🚀 Installation & Setup

1. Clone the Repository

git clone <repository-url>
cd ai-illustrator

2. Configure Environment

Copy the example environment file and add your API key.

cp .env.example .env

Open .env and set your variables:

GEMINI_API_KEY=your_api_key_here
TEXT_MODEL_NAME=gemini-3-pro-preview # or compatible
IMAGE_MODEL_NAME=gemini-3-pro-image-preview # or specific imagen model
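
As a rough illustration of how these variables might be consumed, the sketch below reads them from the environment. The `Settings` class and `load_settings` function are hypothetical names, and using the example model names as fallback defaults is an assumption; the project's own app/config.py may behave differently.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    gemini_api_key: str
    text_model_name: str
    image_model_name: str


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping of environment variables.

    Defaults to os.environ; passing a plain dict makes this easy to test.
    """
    env = os.environ if env is None else env
    api_key = env.get("GEMINI_API_KEY", "").strip()
    if not api_key:
        # Fail early with a clear message instead of on the first API call.
        raise RuntimeError("GEMINI_API_KEY is not set; add it to your .env file")
    return Settings(
        gemini_api_key=api_key,
        # Fallback values are assumptions taken from the example above.
        text_model_name=env.get("TEXT_MODEL_NAME", "gemini-3-pro-preview"),
        image_model_name=env.get("IMAGE_MODEL_NAME", "gemini-3-pro-image-preview"),
    )
```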

3. Running with Docker (Recommended)

Build the Docker image:

docker-compose build

Run the generator:

Place your story text file in the data/ directory (e.g., data/my_story.txt).
Execute the container:
```
docker-compose run app --text-file data/my_story.txt --output-dir output/my_project_name
```
Note: The output directory will be populated with the results on your host machine.

4. Running Locally

Create a virtual environment:

python3 -m venv venv
source venv/bin/activate

Install dependencies:

pip install -r requirements.txt

Run the application:

python main.py --text-file data/my_story.txt --output-dir output/my_project_name

💡 Usage

Command Line Arguments

--text-file: (Required) Path to the input text file containing the story.
--output-dir: Directory to save generated assets and illustrations (default: output).
--style-prompt: Optional prompt to guide the initial style detection (e.g., "Cyberpunk anime", "Oil painting").
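
For readers who want to see how such flags are typically declared, here is a minimal sketch using Python's standard argparse module. It assumes main.py uses argparse, which the README does not confirm; the `build_parser` name and the help strings are illustrative.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the three documented flags (a sketch, not the real main.py)."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Generate illustrations for a story text file.",
    )
    parser.add_argument("--text-file", required=True,
                        help="Path to the input text file containing the story.")
    parser.add_argument("--output-dir", default="output",
                        help="Directory for generated assets and illustrations.")
    parser.add_argument("--style-prompt", default=None,
                        help='Optional style hint, e.g. "Oil painting".')
    return parser


if __name__ == "__main__":
    # Dashes in flag names become underscores in the parsed namespace.
    args = build_parser().parse_args()
    print(args.text_file, args.output_dir, args.style_prompt)
```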

Output Structure

The tool writes its results to an organized, shallow output directory (one level of subfolders):

output/
├── characters/             # Character assets
│   └── 1_character_name.jpeg
├── locations/              # Location assets
│   └── 1_location_name.jpeg
├── illustrations/          # Final Scene Illustrations
│   └── 1_sunny_park_scene.jpeg
├── data.json               # Unified manifest (Style, Characters, Locations, Illustrations)
└── style_templates/        # Generated style base images
    ├── bg_fullbody.jpg                # 16:9 solid background for characters
    ├── style_reference_fullbody.jpg   # 16:9 character style reference
    └── bg_location_16_9.jpg           # 16:9 neutral background for locations
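
The file names in the tree follow an `<id>_<name>.jpeg` pattern. A minimal sketch of how such a path could be derived is shown below; `slugify` and `asset_path` are hypothetical helpers, and the exact normalization the tool applies to names is an assumption.

```python
import re
from pathlib import Path


def slugify(name: str) -> str:
    """Lowercase the name and collapse runs of non-alphanumerics to '_'."""
    slug = re.sub(r"[^a-z0-9]+", "_", name.lower())
    return slug.strip("_")


def asset_path(output_dir: str, kind: str, asset_id: int, name: str) -> Path:
    """Build e.g. output/characters/1_character_name.jpeg.

    `kind` is one of the documented subfolders:
    "characters", "locations", or "illustrations".
    """
    return Path(output_dir) / kind / f"{asset_id}_{slugify(name)}.jpeg"
```

For example, `asset_path("output", "characters", 1, "Character Name")` yields the character path shown in the tree above.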

`data.json` Structure

The data.json file serves as the central manifest for the project.

{
  "style_prompt": "Description of the visual style...",
  "characters": [
    {
      "id": 1,
      "name": "Character Name",
      "original_name": "Original Name from Text",
      "description": "Visual description...",
      "full_body_path": "output/characters/1_character_name.jpeg",
      "generation_prompt": "Full generation prompt used..."
    }
  ],
  "locations": [
    {
      "id": 1,
      "name": "Location Name",
      "original_name": "Original Name from Text",
      "description": "Visual description...",
      "reference_image_path": "output/locations/1_location_name.jpeg",
      "generation_prompt": "Full generation prompt used..."
    }
  ],
  "illustrations": [
    {
      "scene_id": 1,
      "story_segment": "Original text of the scene...",
      "name": "sunny_park_scene",
      "location": {
        "id": 1,
        "name": "Location Name"
      },
      "characters": [
        {
          "id": 1,
          "name": "Character Name",
          "full_body_path": "output/characters/1_character_name.jpeg"
        }
      ],
      "illustration_path": "output/illustrations/1_sunny_park_scene.jpeg",
      "generation_prompt": "Full generation prompt used..."
    }
  ]
}
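
Since the manifest is plain JSON, downstream scripts can read it with the standard library. The sketch below relies only on the keys shown in the structure above; the function names `load_manifest` and `summarize_scenes` are illustrative and not part of the project.

```python
import json
from pathlib import Path


def load_manifest(path) -> dict:
    """Read data.json into a dictionary."""
    with open(Path(path), encoding="utf-8") as fh:
        return json.load(fh)


def summarize_scenes(manifest: dict) -> list[str]:
    """Return one summary line per illustration entry."""
    lines = []
    for scene in manifest.get("illustrations", []):
        cast = ", ".join(c["name"] for c in scene.get("characters", []))
        location = scene.get("location", {}).get("name", "unknown")
        lines.append(f'{scene["scene_id"]}: {scene["name"]} @ {location} [{cast}]')
    return lines


if __name__ == "__main__":
    for line in summarize_scenes(load_manifest("output/data.json")):
        print(line)
```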

🧪 Development & Testing

This project uses pytest for testing. The test suite covers models, configuration, asset management, and the AI client wrapper.

To run tests:

# Activate your virtual environment first
source venv/bin/activate

# Run all tests
pytest tests

# Run with verbose output
pytest -v tests

Mocking

The tests use unittest.mock and pytest-mock to simulate Google GenAI API responses and filesystem operations, ensuring that tests are fast and do not consume API quota.
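
A test in this style might look roughly like the sketch below. The `detect_style` helper and the `generate_text` method are hypothetical stand-ins, not names taken from the project; the point is only to show a `MagicMock` replacing the API client so no network call is made.

```python
from unittest.mock import MagicMock


def detect_style(client, story_text: str) -> str:
    """Hypothetical helper: ask the client for a style prompt and tidy it."""
    response = client.generate_text(f"Describe the art style for: {story_text}")
    return response.strip()


def test_detect_style_uses_mocked_client():
    # The mock stands in for the real AI client, so no API quota is used.
    client = MagicMock()
    client.generate_text.return_value = "  Watercolor, soft light  "

    assert detect_style(client, "Once upon a time") == "Watercolor, soft light"
    # Verify that exactly one request would have been sent.
    client.generate_text.assert_called_once()
```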

🏗️ Project Structure

main.py: Entry point and orchestration logic.
app/: Core package.
- config.py: Configuration and environment management.
- core/: Key logic modules.
  - ai_client.py: Wrapper for Google GenAI SDK.
  - analyzer.py: Story analysis (Scene/Character/Location extraction).
  - asset_manager.py: Manages creation and cataloging of reference assets.
  - illustrator.py: Generates the final scene illustrations.
  - models.py: Pydantic data models.
tests/: Test suite.

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.
