Bridge the gap between handmade quality and professional online presentation.
The Immersive Craft AI Platform empowers household artists to showcase their work professionally. This repository houses the backend microservices ecosystem, currently featuring the Text Service (Live) and the Image Service (In Development).
The platform is built on a scalable microservices architecture, separating lightweight CPU tasks from heavy GPU processing.
Text Service (Live)
- Status: Production Ready
- Role: The "AI Storyteller"
- Tech: FastAPI, Async I/O, Google Gemini (Multimodal)
- Function: Transforms raw images and simple seller notes into evocative, SEO-ready product descriptions in about a second (the sample response in this README reports a latency of 1250.5 ms).
Image Service (In Development)
- Status: Construction in Progress
- Role: The "AI Photographer"
- Tech: Python, Celery, Redis, Kubernetes (GKE), KEDA
- Function: Asynchronous GPU pipeline for professional image correction:
  - Perspective Correction
  - Shadow Removal & Lighting Enhancement
  - AI Super-Resolution (Upscaling)
  - Background Manipulation
  - Intelligent Cropping
  - Interactive 3D Model
The platform uses a Strategy Pattern for flexible AI providers and an asynchronous event-driven architecture for handling heavy workloads.
A central Factory determines which AI Provider to instantiate based on runtime configuration, ensuring a unified interface (AIModelProvider) regardless of the underlying model.
graph LR
Client[Client App] -->|POST Request| API[FastAPI Endpoint]
API -->|Get Provider| Factory[Provider Factory]
Factory -->|Read Config| Settings[Settings / .env]
Factory -->|Instantiate| Strategy{Strategy Selection}
Strategy -->|Default| Gemini[Gemini Provider]
Strategy -->|Debug| Mock[Mock Provider]
Gemini -->|Async Call| Google[Google Gemini API]
Mock -->|Simulate| Local[Local Response]
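The flow in the diagram can be sketched in a few lines of Python. This is a minimal illustration, not the code in app/core/factory.py: the `get_provider` function, the settings dictionary layout, and the `generate_content` signature are assumptions made for the example, and the Gemini call itself is left out.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Optional


class AIModelProvider(ABC):
    """Unified interface that every provider strategy implements."""

    @abstractmethod
    async def generate_content(self, image_url: str, seller_inputs: dict) -> dict:
        ...


class MockProvider(AIModelProvider):
    """Returns a canned local response; no network call is made."""

    async def generate_content(self, image_url: str, seller_inputs: dict) -> dict:
        name = seller_inputs.get("item_name", "item")
        return {"title": f"Mock title for {name}", "provider": "mock"}


class GeminiProvider(AIModelProvider):
    """Placeholder: a real provider would await the Google Gemini API here."""

    def __init__(self, api_key: str, model_name: str) -> None:
        self.api_key = api_key
        self.model_name = model_name

    async def generate_content(self, image_url: str, seller_inputs: dict) -> dict:
        raise NotImplementedError("network call omitted in this sketch")


def get_provider(settings: dict, override: Optional[str] = None) -> AIModelProvider:
    """Choose a strategy from runtime configuration (hypothetical settings layout)."""
    if settings.get("mock_mode"):
        return MockProvider()
    name = override or settings.get("default_provider", "gemini")
    if name == "mock":
        return MockProvider()
    if name == "gemini":
        cfg = settings["provider_settings"]["gemini"]
        return GeminiProvider(cfg["api_key"], cfg["model_name"])
    raise ValueError(f"Unknown provider: {name}")


if __name__ == "__main__":
    demo_settings = {"default_provider": "gemini", "mock_mode": True}
    provider = get_provider(demo_settings)
    print(asyncio.run(provider.generate_content(
        "https://example.com/product-image.jpg", {"item_name": "Clay Pot"}
    )))
```

Because callers only see `AIModelProvider`, switching between the mock and a real backend is a configuration change rather than a code change.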
Image Service: an asynchronous GPU pipeline for professional image correction. Work in progress (stay tuned).
The Text Service is a high-performance microservice that uses multimodal AI to transform raw images and simple seller notes into evocative, SEO-ready product descriptions.
- Multimodal AI: Analyzes visual cues (images) combined with text inputs using Google Gemini.
- Provider Factory Pattern: Flexible architecture allowing hot-swapping of AI backends (Gemini, OpenAI, etc.) via configuration.
- High Performance: Fully asynchronous (non-blocking) I/O using httpx and FastAPI.
- Robust Configuration: Type-safe settings management using Pydantic Settings with nested environment variable support.
- Developer Friendly: Built-in Mock Mode for zero-cost testing and rapid UI development.
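To make the non-blocking I/O point in the list above concrete, here is a small sketch using only asyncio. The `fake_provider_call` and `describe_all` functions are hypothetical; the sleep stands in for awaiting an HTTP response from a provider, which the real service would do through httpx.

```python
import asyncio


async def fake_provider_call(item_name: str, tracker: dict) -> str:
    """Simulate one slow, I/O-bound provider request."""
    tracker["in_flight"] += 1
    tracker["peak"] = max(tracker["peak"], tracker["in_flight"])
    await asyncio.sleep(0.05)  # stands in for awaiting a network response
    tracker["in_flight"] -= 1
    return f"Description for {item_name}"


async def describe_all(item_names: list) -> tuple:
    """Run all requests concurrently and report the peak number in flight."""
    tracker = {"in_flight": 0, "peak": 0}
    results = await asyncio.gather(
        *(fake_provider_call(name, tracker) for name in item_names)
    )
    return list(results), tracker["peak"]


if __name__ == "__main__":
    descriptions, peak = asyncio.run(
        describe_all(["Clay Pot", "Jute Bag", "Brass Lamp"])
    )
    print(descriptions, "peak concurrent requests:", peak)
```

While one request is waiting on the network, the event loop is free to start the others, so a single worker can keep several requests in flight at once.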
The Image Service is an asynchronous GPU pipeline for professional image correction. It is a work in progress (stay tuned).
- Python 3.10 or higher
    python3 --version
- uv package manager (recommended) or pip
- A Google API key for Gemini (get one from Google AI Studio)

- Clone the repository (if not already done)
    git clone https://github.com/SuvroBaner/immersive
    cd immersive/services/text-service
- Install packages & dependencies (this will automatically create and manage the virtual environment)
    # Using uv (recommended)
    uv sync
    source .venv/bin/activate       # On macOS/Linux
    # .venv\Scripts\Activate.ps1    # On Windows PowerShell
  Or using standard Python:
    python -m venv .venv
    source .venv/bin/activate
- Configure environment variables
  Create a .env file in services/text-service/:
    touch .env  # or use your preferred editor
  Make a copy of example.env:
    # Create .env file
    cp example.env .env
  Add your configuration:
    # Default provider (gemini, mock)
    DEFAULT_PROVIDER=gemini
    # Enable mock mode for testing (set to true to bypass API calls)
    MOCK_MODE=false
    # Provider-specific settings (Gemini)
    # Note: Use double underscores (__) to nest settings
    PROVIDER_SETTINGS__GEMINI__API_KEY=your_google_api_key_here
    PROVIDER_SETTINGS__GEMINI__MODEL_NAME=gemini-2.5-flash
    # OPENAI (for future use/testing)
    # PROVIDER_SETTINGS__OPENAI__API_KEY=sk-yyy
    # PROVIDER_SETTINGS__OPENAI__MODEL_NAME=gpt-4o-mini
  Important: The nested configuration format (PROVIDER_SETTINGS__GEMINI__API_KEY) is required. This allows Pydantic to properly map environment variables to the nested ProviderConfig structure.
- Run the Server
    # From services/text-service/
    uvicorn app.main:app --reload --port 8000
  The API will be available at http://127.0.0.1:8000
  - Interactive API Docs: http://127.0.0.1:8000/docs
  - Alternative Docs: http://127.0.0.1:8000/redoc
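As a rough illustration of why the double-underscore format in the configuration step matters, the sketch below mimics the mapping with the standard library only. It is not how the service loads settings: Pydantic Settings handles this itself (typically through its `env_nested_delimiter` option, and whether this repository configures it that way is an assumption). The `parse_nested_env` helper is hypothetical.

```python
def parse_nested_env(env: dict, prefix: str = "PROVIDER_SETTINGS",
                     delimiter: str = "__") -> dict:
    """Map PREFIX__A__B=value entries onto a nested dict {a: {b: value}}."""
    nested: dict = {}
    marker = prefix + delimiter
    for key, value in env.items():
        if not key.startswith(marker):
            continue  # unrelated variable, or wrong delimiter
        parts = key[len(marker):].lower().split(delimiter)
        node = nested
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return nested


if __name__ == "__main__":
    good = {
        "PROVIDER_SETTINGS__GEMINI__API_KEY": "your_google_api_key_here",
        "PROVIDER_SETTINGS__GEMINI__MODEL_NAME": "gemini-2.5-flash",
    }
    bad = {"PROVIDER_SETTINGS_GEMINI_API_KEY": "your_google_api_key_here"}
    print(parse_nested_env(good))  # nested under "gemini"
    print(parse_nested_env(bad))   # single underscores are not recognised
```

With single underscores there is no way to tell where one nesting level ends and a field name such as `api_key` begins, which is why the delimiter has to be distinct.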
Health check endpoint.
Response:
{
"status": "ok",
"service": "text-service"
}
List available AI providers and their configuration status. This is useful for checking whether your .env keys are being loaded correctly (without leaking the actual keys).
Response:
{
"default_provider": "gemini",
"mock_mode": false,
"available_providers": ["gemini", "mock"],
"configured_providers": {
"gemini": {
"model_name": "gemini-2.5-flash",
"api_key_configured": true
}
}
}
Generate product descriptions, titles, and structured marketing content from an image and seller inputs.
Query Parameters:
- provider (optional): Override the default provider (e.g., ?provider=gemini)
Request Body:
{
"image_url": "https://example.com/product-image.jpg",
"seller_inputs": {
"item_name": "Handcrafted Clay Pot",
"materials": "Natural terracotta clay, white paint",
"inspiration": "Made during the rainy season, inspired by my garden",
"category": "Pottery"
},
"config": {
"tone": "evocative",
"language": "en-IN",
"target_platform": "web"
}
}
Response:
{
"generated_content": {
"title": "Handcrafted Clay Pot - Pottery Collection",
"description": "This exquisite clay pot showcases the artistry...",
"product_facts": [
"Handcrafted pottery made with premium natural terracotta clay",
"Unique design inspired by the rainy season garden",
"One-of-a-kind piece, no two items are exactly alike"
],
"blog_snippet_idea": "Discover the story behind this stunning clay pot..."
},
"ai_model_used": "gemini-2.5-flash",
"latency_ms": 1250.5,
"metadata": {
"provider": "gemini"
}
}
Example cURL:
curl -X POST "http://127.0.0.1:8000/v1/content/generate?provider=gemini" \
-H "Content-Type: application/json" \
-d '{
"image_url": "https://images.unsplash.com/photo-1604264726154-26480e76f4e1",
"seller_inputs": {
"item_name": "Clay Pot",
"materials": "Natural terracotta clay, white paint",
"inspiration": "Made this during the rainy season, inspired by my garden",
"category": "Pottery"
},
"config": {
"tone": "evocative",
"language": "en-IN",
"target_platform": "web"
}
}'
The repository is structured to support multiple microservices in a monorepo format.
immersive/
├── README.md
├── LICENSE
└── services/
    └── text-service/
        ├── app/
        │   ├── main.py              # FastAPI application and routes
        │   ├── models.py            # Pydantic request/response models
        │   ├── config.py            # Settings and configuration
        │   └── core/
        │       ├── base.py          # Abstract provider interface
        │       ├── factory.py       # Provider factory pattern
        │       ├── providers/
        │       │   ├── gemini.py    # Google Gemini implementation
        │       │   └── mock.py      # Mock provider for testing
        │       └── prompts/
        │           ├── base_templates.py
        │           └── provider_specific/
        │               └── gemini_templates.py
        ├── pyproject.toml           # Project dependencies
        ├── Dockerfile               # Container configuration
        └── .env                     # Environment variables (create this)
The platform is designed to be extensible. To add a new provider (e.g., Anthropic):
- Create a new provider class: Create app/core/providers/anthropic.py inheriting from AIModelProvider.
- Implement Interface: Implement the generate_content method.
- Register: Add the class to the _providers dict in app/core/factory.py.
- Configure: Add a new ProviderConfig entry to the defaults in app/config.py.
- Add to the .env file: Add environment variables following the PROVIDER_SETTINGS__PROVIDER__KEY pattern.
cd services/text-service
pytest
If the service reports a missing or unconfigured API key, the key isn't being loaded. Ensure:
- Your .env file uses the nested format: PROVIDER_SETTINGS__GEMINI__API_KEY=your_key (double underscore)
- The .env file is in the services/text-service/ directory
- The service has been restarted after adding the key
Add the model name to your .env:
PROVIDER_SETTINGS__GEMINI__MODEL_NAME=gemini-2.5-flash
Check the service logs for detailed error messages. Common issues:
- Invalid or missing API key
- Network issues fetching the image URL
- Provider-specific API errors
- Template/prompt error: Check the logs. This is usually caused by missing double braces {{ }} around literal JSON in prompt templates.
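The double-brace issue in the last bullet is easy to reproduce in isolation. The sketch below assumes the prompt templates are rendered with Python's `str.format`, which is a guess about the implementation; the two template strings and the `render` helper are invented for the example.

```python
# Literal JSON braces are read by str.format as placeholders unless doubled.
BROKEN_TEMPLATE = 'Return JSON like {"title": "..."} for the item: {item_name}'
FIXED_TEMPLATE = 'Return JSON like {{"title": "..."}} for the item: {item_name}'


def render(template: str, **values: str) -> str:
    """Fill a prompt template with the given values."""
    return template.format(**values)


if __name__ == "__main__":
    try:
        render(BROKEN_TEMPLATE, item_name="Clay Pot")
    except (KeyError, ValueError, IndexError) as exc:
        # The single-brace JSON is parsed as a replacement field and fails.
        print("template error:", repr(exc))

    print(render(FIXED_TEMPLATE, item_name="Clay Pot"))
```

Doubling the braces tells the formatter to emit a literal `{` or `}`, so only `{item_name}` is treated as a placeholder.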
Virtual Environment Issues
- Ensure you have run uv sync
- Ensure you have activated the environment with source .venv/bin/activate
We believe in a "Golden Path" for development. Instead of managing complex Kubernetes manifests manually, our engineering team uses Canvas, an internal developer platform that abstracts infrastructure complexity.
- Standardization: Every microservice (Text, Image) is defined by a simple canvas.yaml blueprint.
- Automation: The Canvas Engine handles security hardening, logging sidecars, and scaling automatically.
- Velocity: Developers focus on code, not clusters.
Read the full Canvas Platform Documentation
See LICENSE file for details.
Contributions are welcome! Please feel free to submit a Pull Request.
For questions or issues, please open an issue in the repository.