AI-Powered Medical Image Annotation Tool Built by Team Googol for the Agentic AI App Hackathon
2nd Place Winner - ODSC Agentic AI App Hackathon 2024 (Official Announcement)
MedAnnotator is an LLM-assisted multimodal medical image annotation tool that uses Google Gemini and MedGemma to provide fast, structured, and consistent annotations for medical images (X-rays, CT scans, MRIs).
Key Innovation: Implements a ReAct (Reasoning + Acting) agentic pattern where the system autonomously reasons about medical images, orchestrates specialized tools, and generates standardized JSON outputs.
- Problem: Manual medical image annotation is slow (hours per image), inconsistent, and doesn't scale
- Solution: AI-powered structured annotation in 2-5 seconds
- Impact: Faster radiology workflows, better research datasets, improved patient care
Cloud API Integration (December 2024)
- MedGemma now deployable on Google Cloud Compute Engine
- Automatic fallback: Cloud API → Local HuggingFace model
- Faster processing without local GPU requirements
- See CLOUD_API_INTEGRATION.md for setup
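The automatic fallback listed above can be pictured as an ordered list of backends that are tried in turn. The following is a minimal sketch under stated assumptions, not the project's implementation: `analyze_with_fallback`, `cloud_backend`, and `local_backend` are hypothetical names, and the two backends are stand-ins that make no network or model calls.

```python
from typing import Callable, Sequence

# Hypothetical type: a backend takes image bytes and returns a text analysis.
AnalyzeFn = Callable[[bytes], str]


def analyze_with_fallback(
    image: bytes, backends: Sequence[tuple[str, AnalyzeFn]]
) -> tuple[str, str]:
    """Try each backend in order; return (name, analysis) from the first success."""
    errors: list[str] = []
    for name, analyze in backends:
        try:
            return name, analyze(image)
        except Exception as exc:  # broad on purpose: any failure triggers the fallback
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all MedGemma backends failed: " + "; ".join(errors))


def cloud_backend(image: bytes) -> str:
    # Stand-in for a call to the cloud deployment; here it always fails.
    raise ConnectionError("cloud endpoint unreachable")


def local_backend(image: bytes) -> str:
    # Stand-in for local HuggingFace inference.
    return f"analysis of {len(image)} bytes"


used, result = analyze_with_fallback(
    b"\x89PNG", [("cloud", cloud_backend), ("local", local_backend)]
)
print(used, "->", result)
```

Keeping the backends in an ordered list means a third option could be added without touching the fallback logic, and the collected error messages show why each earlier backend was skipped.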
Two-Tier Agentic Architecture
- Enhanced agent pipeline with retry logic and validation
- Dual database tables for full traceability (annotation_request + annotation)
- Medical chatbot with dataset context and tool calling
- Real-time analytics and flagged image tracking
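To illustrate how two tables can give traceability from a request to its result, here is a small SQLite sketch. Only the table names `annotation_request` and `annotation` come from this README; every column, the `record_annotation` helper, and the file path are assumptions made for the example.

```python
import sqlite3

# Assumed columns; the real schema may differ.
SCHEMA = """
CREATE TABLE annotation_request (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    image_path TEXT NOT NULL,
    status     TEXT NOT NULL DEFAULT 'pending',
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE annotation (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    request_id  INTEGER NOT NULL REFERENCES annotation_request(id),
    result_json TEXT NOT NULL,
    created_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def record_annotation(conn: sqlite3.Connection, image_path: str, result_json: str) -> int:
    """Store a request and its annotation in one transaction; return the request id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO annotation_request (image_path) VALUES (?)", (image_path,)
        )
        request_id = cur.lastrowid
        conn.execute(
            "INSERT INTO annotation (request_id, result_json) VALUES (?, ?)",
            (request_id, result_json),
        )
        conn.execute(
            "UPDATE annotation_request SET status = 'done' WHERE id = ?", (request_id,)
        )
    return request_id


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
rid = record_annotation(conn, "data/sample_images/chest.png", '{"findings": []}')
row = conn.execute(
    "SELECT r.image_path, r.status, a.result_json "
    "FROM annotation a JOIN annotation_request r ON r.id = a.request_id "
    "WHERE r.id = ?",
    (rid,),
).fetchone()
print(row)
```

Because each annotation row points back to its request, a flagged or failed result can be traced to the exact input that produced it.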
Demo & Documentation
- Complete demo video with timestamp navigation (DEMO.md)
- Comprehensive technical documentation
- 2nd Place in ODSC Agentic AI App Hackathon 2024
- Python 3.11+
- Google Gemini API Key (Get one here)
- (Optional but recommended) UV for 10x faster installation
Option 1: With UV (Recommended - 10x faster)
# Install UV
curl -LsSf https://astral.sh/uv/install.sh | sh # macOS/Linux
# or: powershell -c "irm https://astral.sh/uv/install.ps1 | iex" # Windows
# Clone and setup
git clone https://github.com/your-username/googol.git
cd googol
# One-command install
./install.sh
# Or manually
uv sync
# Set up environment
cp .env.example .env
# Edit .env and add your GOOGLE_API_KEY
Option 2: With pip (Traditional)
# Clone the repository
git clone https://github.com/your-username/googol.git
cd googol
# Install dependencies
pip install -r requirements.txt
# Set up environment
cp .env.example .env
# Edit .env and add your GOOGLE_API_KEY
New to UV? See .claude/UV_SETUP.md for a complete guide!
With Scripts (Auto-detects UV or Python):
# Terminal 1 - Backend
chmod +x run_backend.sh run_frontend.sh
./run_backend.sh
# Terminal 2 - Frontend
./run_frontend.sh
With UV Directly (No activation needed!):
# Terminal 1 - Backend
uv run python -m src.api.main
# Terminal 2 - Frontend
uv run streamlit run src/ui/app.py
With Traditional Python:
# Activate venv first
source .venv/bin/activate # macOS/Linux
.venv\Scripts\activate # Windows
# Terminal 1 - Backend
python -m src.api.main
# Terminal 2 - Frontend
streamlit run src/ui/app.py
Access:
- Frontend: http://localhost:8501
- Backend API: http://localhost:8000/docs
# Build and run with Docker Compose
docker-compose up --build
# Or build manually
docker build -t medannotator .
docker run -p 8000:8000 --env-file .env medannotator
- ReAct Pattern: Multi-step reasoning (Plan → Act → Observe → Structure)
- Tool Orchestration: Automatic MedGemma → Gemini pipeline
- Autonomous Decision Making: Plans annotation strategy independently
- Error Recovery: Graceful fallbacks and comprehensive logging
- Structured Output: Consistent JSON schema enforcement
- Medical image upload (JPG, PNG)
- AI-powered image analysis (2-5 second processing)
- Structured JSON annotation output
- Editable results with confidence scores
- Downloadable annotations
- Human-in-the-loop design
- FastAPI async backend
- Streamlit interactive frontend
- Pydantic data validation
- Comprehensive error handling
- Full logging and observability
- Docker containerization
- CI/CD pipeline
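The structured-output and data-validation features above come down to parsing the model's reply into typed objects and rejecting anything that does not fit. The project uses Pydantic for this; the sketch below uses only standard-library dataclasses so it runs without dependencies. The field names follow the sample output shown later in this README, while the 0 to 1 range check on `confidence_score` and the `parse_annotation` helper are assumptions.

```python
import json
from dataclasses import dataclass


@dataclass
class Finding:
    label: str
    location: str
    severity: str


@dataclass
class Annotation:
    patient_id: str
    findings: list[Finding]
    confidence_score: float
    generated_by: str
    additional_notes: str = ""

    def __post_init__(self) -> None:
        # Assumed rule: confidence is a probability-like value in [0, 1].
        if not 0.0 <= self.confidence_score <= 1.0:
            raise ValueError("confidence_score must be between 0 and 1")


def parse_annotation(raw: str) -> Annotation:
    """Parse a JSON string; missing or unexpected keys raise KeyError/TypeError."""
    data = json.loads(raw)
    findings = [Finding(**item) for item in data.pop("findings")]
    return Annotation(findings=findings, **data)


raw = json.dumps({
    "patient_id": "P-12345",
    "findings": [
        {"label": "Pneumothorax", "location": "Right lung apex", "severity": "Small"}
    ],
    "confidence_score": 0.85,
    "generated_by": "MedGemma/Gemini-2.0-Flash",
})
annotation = parse_annotation(raw)
print(annotation.findings[0].label, annotation.confidence_score)
```

Failing loudly on a malformed reply is what lets an agent retry the structuring step instead of passing bad data to the UI.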
┌─────────────────────────────────────────────────────────────────┐
│                     Streamlit Frontend (UI)                     │
│              Image Upload │ Results Display │ Edit              │
└────────────────────────┬────────────────────────────────────────┘
                         │ HTTP/REST API
                         ▼
┌─────────────────────────────────────────────────────────────────┐
│                      FastAPI Backend (API)                      │
│             /annotate endpoint │ Request Validation             │
└────────────────────────┬────────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────────┐
│                  GeminiAnnotationAgent (ReAct)                  │
│     Reason → Act (MedGemma) → Observe → Structure (Gemini)      │
└─────────────┬────────────────────────────────┬──────────────────┘
              │                                │
              ▼                                ▼
     ┌──────────────────┐           ┌──────────────────────┐
     │ MedGemma Tool    │           │      Gemini API      │
     │ Medical Analysis │           │   JSON Structuring   │
     └──────────────────┘           └──────────────────────┘
See ARCHITECTURE.md for detailed system design.
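The Reason, Act, Observe, Structure flow in the diagram can be sketched as a single function that records each step. This is an illustration under stated assumptions rather than the code in gemini_agent.py: `annotate`, `AgentTrace`, and the two fake model functions are hypothetical, the plan is hard-coded where a real agent would ask the LLM, and the retry count is arbitrary.

```python
import json
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AgentTrace:
    steps: list[str] = field(default_factory=list)


def annotate(
    image: bytes,
    medgemma: Callable[[bytes], str],
    structure: Callable[[str], str],
    max_retries: int = 2,
) -> tuple[dict, AgentTrace]:
    trace = AgentTrace()
    # Reason: a real agent would let the LLM choose; this sketch uses a fixed plan.
    trace.steps.append("reason: analyze with MedGemma, then structure with Gemini")
    # Act: call the specialist tool.
    analysis = medgemma(image)
    trace.steps.append("act: MedGemma analysis")
    # Observe: check the tool output before continuing.
    if not analysis.strip():
        raise ValueError("empty analysis from MedGemma")
    trace.steps.append(f"observe: received {len(analysis)} characters")
    # Structure: ask for JSON and retry when the reply does not parse.
    for attempt in range(1, max_retries + 2):
        try:
            result = json.loads(structure(analysis))
        except json.JSONDecodeError:
            trace.steps.append(f"structure: invalid JSON on attempt {attempt}")
            continue
        trace.steps.append(f"structure: valid JSON on attempt {attempt}")
        return result, trace
    raise RuntimeError("could not obtain valid JSON")


def fake_medgemma(image: bytes) -> str:
    return "Small pneumothorax at the right lung apex."


# The first reply is deliberately malformed to exercise the retry path.
_replies = iter(["not json", '{"findings": [{"label": "Pneumothorax"}]}'])


def fake_gemini(analysis: str) -> str:
    return next(_replies)


result, trace = annotate(b"image-bytes", fake_medgemma, fake_gemini)
print(result)
print(*trace.steps, sep="\n")
```

Passing the two models in as callables keeps the orchestration testable without API keys or GPU access.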
{
  "patient_id": "P-12345",
  "findings": [
    {
      "label": "Pneumothorax",
      "location": "Right lung apex",
      "severity": "Small"
    },
    {
      "label": "Normal",
      "location": "Cardiac silhouette",
      "severity": "None"
    }
  ],
  "confidence_score": 0.85,
  "generated_by": "MedGemma/Gemini-2.0-Flash",
  "additional_notes": "No other acute abnormalities identified"
}
googol/
├── .github/
│   └── workflows/
│       └── ci.yml              # CI/CD pipeline
├── .claude/                    # Additional documentation
│   ├── PROJECT_SETUP.md        # Detailed setup guide
│   ├── QUICKSTART.md           # 5-minute guide
│   ├── TEAM_TASKS.md           # Task distribution
│   └── DEMO_GUIDE.md           # Demo preparation
├── src/
│   ├── api/                    # FastAPI backend
│   │   └── main.py             # API endpoints
│   ├── agent/                  # Gemini agent (ReAct)
│   │   └── gemini_agent.py     # Orchestration logic
│   ├── tools/                  # Tool integrations
│   │   └── medgemma_tool.py    # MedGemma wrapper
│   ├── ui/                     # Streamlit frontend
│   │   └── app.py              # UI application
│   ├── config.py               # Configuration
│   └── schemas.py              # Data models
├── data/
│   ├── sample_images/          # Test images
│   └── annotations/            # Output annotations
├── logs/                       # Application logs
├── tests/                      # Test suite
├── ARCHITECTURE.md             # System architecture ⭐
├── EXPLANATION.md              # Technical explanation ⭐
├── DEMO.md                     # Demo video link ⭐
├── TEST.sh                     # Smoke test script ⭐
├── Dockerfile                  # Docker configuration ⭐
├── docker-compose.yml          # Docker Compose config
├── requirements.txt            # Python dependencies
├── environment.yml             # Conda environment
└── README.md                   # This file ⭐
⭐ = Required for hackathon submission
Run the smoke test suite:
chmod +x TEST.sh
./TEST.sh
This will verify:
- Python version compatibility
- Required dependencies
- Module imports
- Configuration loading
- Mock tool functionality
- Documentation completeness
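For readers who want to see what the first two checks amount to, here is a rough Python equivalent. It is not the contents of TEST.sh: the helper names are hypothetical, and the module list is an assumption based on the stack this README describes.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 11)  # from the prerequisites in this README


def python_ok(version: tuple[int, ...] = tuple(sys.version_info[:3])) -> bool:
    """True when the given interpreter version meets the minimum."""
    return tuple(version[:2]) >= MIN_PYTHON


def missing_modules(names: list[str]) -> list[str]:
    """Return the top-level modules that cannot be found, without importing them."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed dependency list for illustration.
required = ["fastapi", "streamlit", "pydantic"]
print("python ok:", python_ok())
print("missing:", missing_modules(required))
```

Using `find_spec` checks availability without executing the packages, which keeps the check fast even for heavy dependencies.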
- ARCHITECTURE.md - Complete system architecture with diagrams
- EXPLANATION.md - Technical deep dive and workflows
- DEMO.md - Demo video with detailed timestamps (3:09 minutes)
- CLOUD_API_INTEGRATION.md - Cloud MedGemma API setup and deployment
- .claude/PROJECT_SETUP.md - Detailed setup instructions
- .claude/QUICKSTART.md - 5-minute quick start
- .claude/MEDGEMMA_SETUP.md - MedGemma model configuration
- Production-quality code (900+ lines)
- Comprehensive error handling
- Full logging and observability
- Type safety with Pydantic
- Async API design
- Clear component separation
- Modular, maintainable design
- 2000+ lines of documentation
- ASCII architecture diagrams
- Complete technical explanations
- Gemini 2.0 Flash with JSON mode
- ReAct pattern for agentic behavior
- Multi-model orchestration (Gemini + MedGemma)
- Structured output enforcement
- Tool calling architecture
- Solves real radiology workflow problem
- Improves annotation consistency
- Enables better medical research
- Scalable to thousands of images
- Human-in-the-loop design for safety
Watch the Demo Video (3:09) | Detailed Timestamps & Analysis
Watch our complete walkthrough showing:
- Problem statement and solution overview
- Dataset loading and configuration
- AI chatbot interaction with tool calling
- MedGemma analysis and Gemini validation pipeline
- Real-time structured output generation
- Edge case handling and human-in-the-loop design
- Real-world impact
2nd Place Winners - ODSC Agentic AI App Hackathon 2024
- Rafael Kovashikawa - @kovashikawa
- Ravali Yerrapothu - @ry639a
- Tyrone
- Guilherme - @guirque
Recognition: Official ODSC LinkedIn Announcement
- Python 3.11 - Primary language
- FastAPI - High-performance async web framework
- Streamlit - Interactive web UI
- Pydantic - Data validation and settings
- Google Gemini 2.0 Flash - LLM reasoning and JSON generation
- MedGemma 4B-IT - Medical specialist model
- Deployment: HuggingFace (local) or Google Cloud Compute Engine (cloud)
- Automatic fallback for reliability
- google-generativeai - Gemini SDK
- transformers - HuggingFace model loading
- PyTorch - Deep learning framework
- Google Cloud Compute Engine - Cloud MedGemma deployment
- Docker - Containerization
- GitHub Actions - CI/CD
- Uvicorn - ASGI server
- SQLite - Two-tier database architecture
This tool is for research and educational purposes only.
- NOT FDA approved
- NOT for clinical diagnosis
- Requires physician oversight
- May contain PHI concerns - anonymize data before upload
- MedGemma uses mock data (real integration via Vertex AI possible)
- Stateless design (no annotation history)
- Single-user sessions
- Max image size: 10MB recommended
See EXPLANATION.md for detailed limitations and future enhancements.
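The 10MB recommendation and the supported upload formats (JPG, PNG) could be checked before an image reaches the models. The sketch below is one possible way to do that, assuming the recommendation is treated as a hard limit; `detect_image_type` is a hypothetical helper, not part of the project's API. It inspects the leading signature bytes rather than trusting the file extension.

```python
MAX_BYTES = 10 * 1024 * 1024  # 10 MB, treated here as a hard limit (an assumption)

# Leading signature bytes for each supported format.
SIGNATURES = {
    "png": b"\x89PNG\r\n\x1a\n",
    "jpeg": b"\xff\xd8\xff",
}


def detect_image_type(data: bytes) -> str:
    """Return 'png' or 'jpeg'; raise ValueError for oversized or unsupported data."""
    if len(data) > MAX_BYTES:
        raise ValueError(f"image is {len(data)} bytes; limit is {MAX_BYTES}")
    for kind, signature in SIGNATURES.items():
        if data.startswith(signature):
            return kind
    raise ValueError("unsupported image format: expected JPG or PNG")


print(detect_image_type(b"\x89PNG\r\n\x1a\n" + b"\x00" * 16))
print(detect_image_type(b"\xff\xd8\xff\xe0" + b"\x00" * 16))
```

Checking content rather than the filename catches renamed files, though it does not confirm that the rest of the image decodes correctly.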
- Real MedGemma integration via Vertex AI
- RAG with medical guidelines
- Bounding box visualization
- Annotation history database
- User authentication
- HIPAA compliance
- FDA validation pathway
- Multi-user collaboration
- Batch processing
- Export to DICOM SR / HL7 FHIR
This project is licensed under the MIT License - see the LICENSE file for details.
- Google Gemini Team for the powerful API
- MedGemma researchers for the specialized medical model
- FastAPI and Streamlit communities
- Agentic AI App Hackathon organizers
- GitHub Issues: Report bugs or request features
- Email: rkovashikawa@gmail.com
- Documentation: See .claude/ folder for additional guides
Built with ❤️ using Google Gemini, FastAPI, and Streamlit
Making medical annotation faster, better, and more accessible.