RAG-powered knowledge base for the Birmingham AI community. Named after Carrie A. Tuggle (1858-1924), a Birmingham educator who founded the first orphanage for African-American children in Alabama. Ask questions about past meetups, presentations, and livestreams.
1. Clone and configure

   ```bash
   git clone https://github.com/Birmingham-AI/carrie.git
   cd carrie
   ```

2. Set up environment

   Copy `.env.example` to `.env` and configure:

   ```
   # Required
   OPENAI_API_KEY=sk-...

   # Required for YouTube transcription and vector search
   SUPABASE_URL=https://your-project.supabase.co
   SUPABASE_KEY=your-service-role-key

   # Required for YouTube upload protection
   UPLOAD_API_KEY=your-secret-key
   ```
3. Initialize Supabase database

   Run the SQL script in `backend/SQL/initialize.sql` in your Supabase SQL editor to create the required tables and functions.
4. Start the application

   For Docker:

   ```bash
   docker-compose up -d
   ```

   For Podman:

   ```bash
   python -m podman_compose up -d
   ```

   - Frontend: http://localhost:5174
   - Backend API: http://localhost:8001
   - API docs: http://localhost:8001/docs
5. Use the app

   - Chat: Ask questions at http://localhost:5174
   - Upload: Add YouTube videos at http://localhost:5174/upload
Features:

- AI-powered Q&A: Ask questions about Birmingham AI meetup content
- YouTube transcription: Upload YouTube videos to automatically transcribe and embed them for search
- Vector search: Find relevant content using semantic similarity via Supabase pgvector
- Streaming responses: Real-time AI answers with source citations (see the example after this list)
- Web search integration: Optional web search for additional context
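
To illustrate the streaming Q&A flow, here is a minimal client-side sketch that reads the `/api/ask` stream chunk by chunk. The request field names (`question` and `history`) are assumptions rather than the confirmed schema; the real request model lives in `backend/models/schemas.py` and is documented at http://localhost:8001/docs.

```python
# Minimal sketch of consuming the streaming /api/ask endpoint.
# NOTE: "question" and "history" are assumed field names; check the
# generated API docs for the actual request schema.
import requests

payload = {
    "question": "What was covered at the last Birmingham AI meetup?",
    "history": [],  # prior conversation turns, if accepted by the schema
}

with requests.post("http://localhost:8001/api/ask", json=payload, stream=True) as resp:
    resp.raise_for_status()
    # Print the answer as it streams in rather than waiting for the full body.
    for chunk in resp.iter_content(chunk_size=None, decode_unicode=True):
        if chunk:
            print(chunk, end="", flush=True)
```

Reading the response incrementally is what lets the chat UI render the answer while it is still being generated.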
Project structure:

```
carrie/
├── backend/
│   ├── app.py                      # FastAPI application
│   ├── routes/
│   │   ├── ask.py                  # Q&A streaming endpoint
│   │   └── youtube.py              # YouTube upload endpoints
│   ├── services/
│   │   ├── rag_service.py          # Vector search via Supabase
│   │   └── streaming_agent.py      # OpenAI agent for answers
│   ├── actions/
│   │   ├── transcribe_youtube.py   # YouTube transcription + embedding
│   │   ├── process_slides.py       # PDF slide extraction
│   │   ├── embed.py                # Embedding generation
│   │   └── bundle.py               # Embedding bundling (legacy)
│   ├── clients/
│   │   ├── openai.py               # Async OpenAI client
│   │   └── supabase.py             # Async Supabase client
│   ├── models/
│   │   └── schemas.py              # Pydantic request/response models
│   ├── SQL/
│   │   └── initialize.sql          # Database schema
│   └── requirements.txt
├── frontend/
│   ├── src/
│   │   ├── components/
│   │   │   ├── chat/               # Chat UI components
│   │   │   ├── upload/             # YouTube upload UI
│   │   │   └── error/              # Error boundary
│   │   ├── services/
│   │   │   └── ApiService.ts       # API client
│   │   └── App.tsx                 # Routes and layout
│   └── package.json
├── docker-compose.yml
└── .env
```
API endpoints:

- `POST /api/ask` - Streaming Q&A with conversation history
- `GET /api/search?question=...&top_k=5` - Vector search only (example below)
- `POST /api/youtube/verify-key` - Verify API key (requires `X-API-Key` header)
- `POST /api/youtube/upload` - Start transcription job (requires `X-API-Key` header)
- `GET /api/youtube/status/{job_id}` - Check job status
- `GET /api/youtube/sources` - List processed videos
- `DELETE /api/youtube/sources/{id}` - Delete video and embeddings (requires `X-API-Key` header)
- `GET /` - Health check
Full API documentation: http://localhost:8001/docs
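
As a quick smoke test, the search endpoint can be called directly with its documented `question` and `top_k` query parameters. This is only a sketch: the response shape is not specified here, so it is printed verbatim and should be checked against the generated docs.

```python
# Minimal sketch of calling the vector-search endpoint.
import requests

resp = requests.get(
    "http://localhost:8001/api/search",
    params={"question": "Which talks covered RAG?", "top_k": 5},
)
resp.raise_for_status()
print(resp.json())  # inspect the actual response shape at /docs
```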
Add YouTube videos to the knowledge base via the web UI, the CLI, or the API:
Web UI: Navigate to http://localhost:5174/upload
CLI:
```bash
python -m backend.actions.transcribe_youtube \
  --url "https://www.youtube.com/watch?v=VIDEO_ID" \
  --session "Nov 2024 Birmingham AI Meetup"
```

API:

```bash
curl -X POST http://localhost:8001/api/youtube/upload \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-secret-key" \
  -d '{"url": "VIDEO_URL", "session_info": "Session Name"}'
```

Options (see the Python example below):

- `chunk_size`: Characters per chunk (default: 1000)
- `overlap`: Sentences to overlap between chunks (default: 1)
- `language`: Transcript language code (default: "en")
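
The upload flow can also be scripted end to end. The sketch below assumes the options above are accepted as top-level JSON fields and that the upload response includes a `job_id` to poll; both are assumptions to verify against http://localhost:8001/docs.

```python
# Sketch: start a transcription job and poll its status.
import time
import requests

API = "http://localhost:8001"
HEADERS = {"X-API-Key": "your-secret-key"}

resp = requests.post(
    f"{API}/api/youtube/upload",
    headers=HEADERS,
    json={
        "url": "https://www.youtube.com/watch?v=VIDEO_ID",
        "session_info": "Nov 2024 Birmingham AI Meetup",
        "chunk_size": 1000,  # characters per chunk (assumed top-level field)
        "overlap": 1,        # sentences of overlap between chunks (assumed)
        "language": "en",    # transcript language code (assumed)
    },
)
resp.raise_for_status()
job_id = resp.json()["job_id"]  # assumed response field name

# Poll the documented status endpoint until the job settles.
while True:
    status = requests.get(f"{API}/api/youtube/status/{job_id}").json()
    print(status)
    if status.get("status") in ("completed", "failed"):  # assumed status values
        break
    time.sleep(5)
```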
For local development without Docker:

Backend:

```bash
python -m venv .venv
source .venv/bin/activate
pip install -r backend/requirements.txt
cd backend
uvicorn app:app --reload --host 0.0.0.0 --port 8000
```

Frontend:

```bash
cd frontend
npm install
npm run dev
```

The system uses Supabase with pgvector for vector similarity search; a usage sketch follows the schema summary below:
- `sources`: Tracks uploaded content (YouTube videos, etc.)
  - Columns: `id`, `source_type`, `source_id`, `session_info`, `chunk_count`, `processed_at`
- `embeddings`: Stores text chunks with vector embeddings
  - Columns: `id`, `source_id`, `text`, `timestamp`, `embedding` (`vector(1536)`)
- `match_embeddings()`: RPC function for cosine similarity search
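
To make the schema concrete, here is a rough sketch of a similarity query against it, using the synchronous OpenAI and Supabase clients for brevity (the repo's clients in `backend/clients/` are async, and the application itself goes through `rag_service.py`). The RPC parameter names `query_embedding` and `match_count` are assumptions; the authoritative signature is defined in `backend/SQL/initialize.sql`.

```python
# Sketch of a pgvector similarity search via the match_embeddings() RPC.
import os

from openai import OpenAI
from supabase import create_client

openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment
supabase = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_KEY"])

# Embed the question with the same model used for the stored chunks.
question = "What did the November meetup cover?"
embedding = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input=question,
).data[0].embedding  # 1536-dimensional vector

# Call the RPC; the parameter names here are assumed, see initialize.sql.
result = supabase.rpc(
    "match_embeddings",
    {"query_embedding": embedding, "match_count": 5},
).execute()

for row in result.data:
    print(row["text"][:80])  # "text" is a documented column of the embeddings table
```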
Tech stack:

- Backend: FastAPI, OpenAI Agents SDK, Supabase (async)
- Frontend: React, TypeScript, Tailwind CSS, React Router
- AI: OpenAI GPT-4o-mini, text-embedding-3-small
- Database: Supabase PostgreSQL with pgvector
- Infrastructure: Docker, Docker Compose
