HTN 2025 - AI-Powered Memory Gallery

An image management and memory preservation application that pairs a FastAPI backend with a React frontend. It takes data captured by Snapchat Spectacles and provides AI-powered image analysis, tagging, and search.

🚀 Overview

This application allows users to:

Upload and manage image collections
Automatically generate AI-powered tags and descriptions for images
Search images using natural language queries with AI-powered semantic search
View image locations on interactive maps
Transcribe audio recordings for memory context
Generate embeddings for semantic image search

🏗️ Architecture

Backend (FastAPI)

Framework: FastAPI with async/await support
Database: SQLite with SQLAlchemy ORM (async)
AI Services: Google Gemini API for image analysis and transcription
Image Processing: PIL for image handling
Audio Processing: SoundDevice for recording, Gemini for transcription

Frontend (React + TypeScript)

Framework: React 18 with TypeScript
Build Tool: Vite
Styling: Tailwind CSS with Radix UI components
Maps: Leaflet for interactive map display
State Management: React hooks (useState, useEffect, useMemo)

📁 Project Structure

HTN-2025/
├── backend/                    # FastAPI backend application
│   ├── app/
│   │   ├── models/            # Pydantic models for API
│   │   ├── repository/        # Database access layer
│   │   ├── routers/           # API route handlers
│   │   ├── utils/             # Utility functions
│   │   └── config.py          # Configuration and Supabase setup
│   ├── database/              # Database models and connection
│   ├── images/                # Static image storage
│   └── main.py                # FastAPI application entry point
├── frontend/                  # React frontend application
│   ├── src/
│   │   ├── components/        # React components
│   │   ├── hooks/             # Custom React hooks
│   │   ├── lib/               # Type definitions and utilities
│   │   └── utils/             # API utilities and mock data
│   └── package.json           # Frontend dependencies
└── README.md                  # This file

🛠️ Setup Instructions

Prerequisites

Python 3.11+
Node.js 18+
Google API Key (for Gemini AI services)
Supabase account (optional, for cloud storage)

Backend Setup

Navigate to backend directory:
```
cd backend
```
Install dependencies:
```
uv sync
```
Activate virtual environment:
- Windows (PowerShell): .venv\Scripts\Activate.ps1
- macOS/Linux: source .venv/bin/activate

Set up environment variables: Create a .env file in the root directory:

GOOGLE_API_KEY=your_google_api_key_here
SUPABASE_URL=your_supabase_url
SUPABASE_KEY=your_supabase_key
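
As a rough illustration of how these variables might be consumed, the sketch below reads them from the process environment and treats only GOOGLE_API_KEY as mandatory. The `Settings` class and `load_settings` function are hypothetical names and are not taken from the project's config.py; the sketch also assumes the .env values have already been loaded into the environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the three variables listed above."""
    google_api_key: str
    supabase_url: Optional[str] = None
    supabase_key: Optional[str] = None


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping, failing fast if the Gemini key is missing."""
    api_key = env.get("GOOGLE_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("GOOGLE_API_KEY is required for Gemini AI services")
    # Supabase is optional, so empty or missing values become None.
    return Settings(
        google_api_key=api_key,
        supabase_url=env.get("SUPABASE_URL") or None,
        supabase_key=env.get("SUPABASE_KEY") or None,
    )
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise in tests, for example `load_settings({"GOOGLE_API_KEY": "abc"})`.
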

Run the development server:
```
uv run fastapi dev
```

The backend will be available at http://localhost:8000

Frontend Setup

Navigate to frontend directory:
```
cd frontend
```
Install dependencies:
```
npm install
```
Start development server:
```
npm run dev
```

The frontend will be available at http://localhost:3000

🔧 API Endpoints

Image Management

POST /api/images/ - Create new image record
GET /api/images/ - Get all images (with pagination and filtering)
GET /api/images/{image_id} - Get specific image by ID
PUT /api/images/{image_id} - Update image metadata
DELETE /api/images/{image_id} - Delete image record
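
The endpoints above could be called from a small client along these lines. This is a minimal sketch using only the standard library: the `skip` and `limit` query parameters and the `tags` payload field are assumptions made for illustration, since the exact pagination parameters and request schema are not listed here. The functions only build the requests; sending them requires the backend to be running.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # backend address from the setup section


def list_images_request(skip: int = 0, limit: int = 20) -> Request:
    """Build GET /api/images/ with hypothetical pagination parameters."""
    query = urlencode({"skip": skip, "limit": limit})
    return Request(f"{BASE_URL}/api/images/?{query}", method="GET")


def update_image_request(image_id: str, metadata: dict) -> Request:
    """Build PUT /api/images/{image_id} with a JSON body."""
    body = json.dumps(metadata).encode("utf-8")
    return Request(
        f"{BASE_URL}/api/images/{image_id}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def delete_image_request(image_id: str) -> Request:
    """Build DELETE /api/images/{image_id}."""
    return Request(f"{BASE_URL}/api/images/{image_id}", method="DELETE")
```

Each request can then be sent with `urllib.request.urlopen(request)` once the development server is up.
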

Search & Discovery

GET /api/images/search/by-tags - Search images by tags
GET /api/images/images_by_audio - Search images using audio description

Statistics & Analytics

GET /api/images/stats/counts - Get image statistics (total, tagged, untagged)
GET /api/images/stats/locations - Get image location data for mapping
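
To make the two statistics responses more concrete, here is a sketch of how the counts and location data might be derived from a list of image records. The record fields (`id`, `tags`, `latitude`, `longitude`) and both function names are assumptions for illustration; the actual backend queries the database and may shape its responses differently.

```python
from typing import Any, Dict, List


def count_images(images: List[Dict[str, Any]]) -> Dict[str, int]:
    """Return total, tagged, and untagged counts for a list of records."""
    total = len(images)
    # A record counts as tagged when it has a non-empty "tags" value.
    tagged = sum(1 for img in images if img.get("tags"))
    return {"total": total, "tagged": tagged, "untagged": total - tagged}


def image_locations(images: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only records that carry both coordinates, for map display."""
    return [
        {"id": img["id"], "latitude": img["latitude"], "longitude": img["longitude"]}
        for img in images
        if img.get("latitude") is not None and img.get("longitude") is not None
    ]
```
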

🤖 AI Features

Image Analysis

Automatic Tagging: Uses Google Gemini to generate descriptive tags
Object Detection: Identifies objects and their locations in images
Scene Classification: Categorizes images by scene type (indoor, outdoor, etc.)
Color Analysis: Extracts dominant colors from images
Description Generation: Creates natural language descriptions

Audio Processing

Voice Recording: 3-second audio recording capability

Speech Transcription: Converts audio to text using Gemini
Audio Search: Search images using spoken descriptions

Embeddings & Semantic Search

Multi-modal Embeddings: Generate embeddings for images and text
Semantic Search: Find similar images based on content understanding
Vector Storage: Store embeddings for fast similarity searches
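
Semantic search over stored embeddings is commonly implemented by ranking vectors by cosine similarity to the query embedding. The sketch below shows that general idea in plain Python; it is not the project's actual implementation, the function names are hypothetical, and the tiny two-dimensional vectors stand in for real Gemini embeddings.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def rank_images(
    query: Sequence[float],
    image_embeddings: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k (image_id, score) pairs, best match first."""
    scored = [
        (image_id, cosine_similarity(query, embedding))
        for image_id, embedding in image_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

A linear scan like this is fine for a small gallery; larger collections would typically rely on a vector index rather than comparing against every stored embedding.
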

🎨 Frontend Components

Core Components

MemoryGallery: Main application container
MemoryGrid: Grid layout for image display
MemoryCard: Individual image card with metadata
MemorySearch: Search interface with filtering
Map: Interactive map showing image locations

Hooks

useMemories: Manages image data fetching and state
useMemorySearch: Handles search functionality and filtering

🔒 Security & Configuration

Environment Variables

GOOGLE_API_KEY: Required for Gemini AI services
SUPABASE_URL: Optional cloud storage URL
SUPABASE_KEY: Optional cloud storage API key

CORS Configuration

The backend is configured to accept requests from http://localhost:3000 for development.

📝 License

This project was created for HTN 2025. Please refer to the hackathon guidelines for usage rights.

Repository: AarshShah9/PhotaMems
