
SakD2006/floatchat


FloatChat

Ask ocean data questions. Get answers, not files.

FloatChat is a conversational interface for exploring ARGO oceanographic data. Users ask questions about the datasets in natural language and receive plots, maps, and text summaries in response.

DeepWiki Documentation


Overview

ARGO datasets are powerful but difficult to work with due to their size and format. FloatChat reduces this friction by providing a chat-based interface for data exploration instead of manual data processing.

The project was built to make oceanographic data more accessible to students, researchers, and analysts, without requiring prior experience with NetCDF files or scripting workflows.


Features

  • Natural language querying over ARGO float data using a RAG pipeline
  • Automatic SQL generation from natural language questions
  • Interactive visualizations with Plotly (plots, maps, heatmaps)
  • Semantic search across float metadata and measurements via ChromaDB
  • User authentication with local accounts and Google OAuth
  • Real-time chat interface with message history
  • Session management for persistent conversations
  • Kubernetes-ready with full deployment manifests

Screenshots

  • Hero Section
  • The Problem & Solution
  • Key Features
  • Why It Matters
  • Interactive Prompting Interface


Tech Stack

Frontend (Next.js)

  • Next.js 15 with React 19
  • TypeScript
  • TailwindCSS
  • Plotly.js & React-Plotly for data visualization
  • React Markdown for message formatting
  • Framer Motion for animations

Backend API (Node.js)

  • Express.js
  • TypeScript
  • Passport.js (Local & Google OAuth)
  • PostgreSQL with pg driver
  • Session management with connect-pg-simple

AI/ML Service (Python)

  • FastAPI
  • LangChain for RAG (Retrieval-Augmented Generation)
  • Groq API (Llama 3.3 70B)
  • HuggingFace Embeddings (all-MiniLM-L6-v2)
  • LangChain SQL Agent for database queries

Databases

  • PostgreSQL with PostGIS extension
  • ChromaDB for vector embeddings

Data Processing

  • Xarray for NetCDF files
  • Pandas for data manipulation
  • psycopg2 for database operations

Infrastructure

  • Docker & Docker Compose
  • Kubernetes manifests (k8s/)

Architecture

FloatChat follows a microservices architecture with three main services:

1. Frontend (app/)

  • Next.js application with React components
  • Handles user authentication, chat interface, and data visualization
  • Communicates with the backend API for user management
  • Connects to the AI service for chat functionality

2. Backend API (api/)

  • Express.js REST API
  • Manages user authentication (local + Google OAuth)
  • Handles session management with PostgreSQL
  • Provides authentication endpoints

3. AI Service (ai/)

  • FastAPI-based service
  • Implements RAG (Retrieval-Augmented Generation) pipeline
  • Uses LangChain with Groq's Llama 3.3 70B model
  • Queries PostgreSQL database using natural language
  • Retrieves context from ChromaDB vector store
  • Generates SQL queries and returns structured responses
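
The retrieval and prompt-assembly steps of the AI service can be sketched roughly as follows. This is a minimal, standard-library illustration rather than the project's code: the bag-of-words `embed` function stands in for the all-MiniLM-L6-v2 embeddings, the in-memory `SUMMARIES` list stands in for the ChromaDB collection, and the summaries, function names, and prompt wording are all hypothetical. No LLM is called; the sketch stops at the prompt that would be sent to the model.

```python
import math
import re
from collections import Counter

# Hypothetical profile summaries standing in for documents stored in ChromaDB.
SUMMARIES = [
    "Float 1 profile in the Arabian Sea with temperature and salinity down to 2000 dbar",
    "Float 2 profile in the Bay of Bengal with shallow temperature measurements",
    "Float 3 profile in the Southern Ocean with salinity measurements only",
]


def embed(text: str) -> Counter:
    """Toy bag-of-words vector (assumption: a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k summaries most similar to the question."""
    q = embed(question)
    ranked = sorted(SUMMARIES, key=lambda s: cosine(q, embed(s)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Assemble the text an LLM would receive before generating SQL."""
    lines = ["Context:"] + [f"- {c}" for c in context]
    lines += ["", f"Question: {question}", "Write one SQL SELECT statement that answers it."]
    return "\n".join(lines)


question = "salinity in the Arabian Sea"
context = retrieve(question)
prompt = build_prompt(question, context)
print(prompt)
```

In a real deployment the ranking would come from a vector-store query and the prompt would be passed to the LLM, whose SQL output is then executed against PostgreSQL.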

Data Flow

Figure: RAG & MCP Pipeline Architecture

  1. Data Ingestion: ARGO NetCDF files are downloaded and parsed using Xarray
  2. Data Storage: Extracted measurements, profiles, and float metadata are stored in PostgreSQL with PostGIS
  3. Embedding Generation: Profile summaries are generated and embedded using HuggingFace models
  4. Vector Storage: Embeddings are stored in ChromaDB for semantic search
  5. User Query: User sends natural language query through the Next.js frontend
  6. RAG Pipeline:
    • Query is processed by the LangChain agent
    • Relevant context is retrieved from ChromaDB
    • SQL queries are generated to fetch data from PostgreSQL
    • LLM synthesizes the response
  7. Visualization: Response is rendered with Plotly charts in the frontend
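
For the final visualization step, one plausible shape for the data handed to the frontend is a Plotly figure specification, which is plain JSON. The sketch below builds such a spec for a single temperature profile using only the standard library. The function name `profile_figure` and the sample values are hypothetical, and the response format the project actually uses may differ.

```python
import json


def profile_figure(pressure: list[float], temperature: list[float], title: str) -> dict:
    """Build a Plotly-compatible figure spec for one temperature profile."""
    if len(pressure) != len(temperature):
        raise ValueError("pressure and temperature must have the same length")
    return {
        "data": [
            {
                "type": "scatter",
                "mode": "lines+markers",
                "x": temperature,
                "y": pressure,
                "name": "Temperature",
            }
        ],
        "layout": {
            "title": {"text": title},
            "xaxis": {"title": {"text": "Temperature (°C)"}},
            # Pressure increases with depth, so reverse the axis to put the surface on top.
            "yaxis": {"title": {"text": "Pressure (dbar)"}, "autorange": "reversed"},
        },
    }


# Illustrative values only, not real ARGO measurements.
fig = profile_figure([5.0, 50.0, 200.0], [28.1, 24.3, 14.9], "Hypothetical float profile")
payload = json.dumps(fig)  # JSON string that an API response could carry
```

On the React side, the `data` and `layout` keys of such a payload map directly onto the `data` and `layout` props of the react-plotly `Plot` component.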

Example Database Schema

CREATE TABLE floats (
    float_id INTEGER PRIMARY KEY,
    wmo_number VARCHAR(20),
    data_center VARCHAR(50),
    platform_type VARCHAR(100),
    created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE profiles (
    profile_id SERIAL PRIMARY KEY,
    float_id INTEGER REFERENCES floats(float_id),
    cycle_number INTEGER,
    profile_date TIMESTAMPTZ,
    latitude DECIMAL(10,7),
    longitude DECIMAL(10,7),
    location GEOGRAPHY(POINT, 4326),
    created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE measurements (
    measurement_id SERIAL PRIMARY KEY,
    profile_id INTEGER REFERENCES profiles(profile_id),
    pressure DECIMAL(10,3),
    temperature DECIMAL(10,4),
    salinity DECIMAL(10,4),
    created_at TIMESTAMPTZ DEFAULT NOW()
);
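
To show how the three tables relate, here is a small sketch that loads one hypothetical profile and runs the kind of aggregate query a natural language question might translate to. It uses an in-memory SQLite database as a stand-in so that it runs without a server. That substitution is an assumption made for illustration: the project itself uses PostgreSQL with PostGIS, so the `SERIAL`, `TIMESTAMPTZ`, `DECIMAL`, and `GEOGRAPHY` columns are simplified or omitted here, and all inserted values are placeholders.

```python
import sqlite3

# Simplified stand-in schema: SQLite has no SERIAL, TIMESTAMPTZ, or PostGIS types.
SCHEMA = """
CREATE TABLE floats (float_id INTEGER PRIMARY KEY, wmo_number TEXT);
CREATE TABLE profiles (
    profile_id INTEGER PRIMARY KEY AUTOINCREMENT,
    float_id INTEGER REFERENCES floats(float_id),
    cycle_number INTEGER, latitude REAL, longitude REAL
);
CREATE TABLE measurements (
    measurement_id INTEGER PRIMARY KEY AUTOINCREMENT,
    profile_id INTEGER REFERENCES profiles(profile_id),
    pressure REAL, temperature REAL, salinity REAL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Placeholder values for illustration only, not real ARGO data.
conn.execute("INSERT INTO floats (float_id, wmo_number) VALUES (1, '0000000')")
cur = conn.execute(
    "INSERT INTO profiles (float_id, cycle_number, latitude, longitude) "
    "VALUES (1, 1, 10.0, 65.0)"
)
profile_id = cur.lastrowid

# One row per depth level: (pressure in dbar, temperature, salinity).
levels = [(5.0, 28.0, 35.0), (50.0, 24.0, 35.5), (200.0, 14.0, 35.0)]
conn.executemany(
    "INSERT INTO measurements (profile_id, pressure, temperature, salinity) "
    "VALUES (?, ?, ?, ?)",
    [(profile_id, p, t, s) for p, t, s in levels],
)

# A query that "average temperature above 100 dbar, per float" might produce.
row = conn.execute(
    """
    SELECT f.wmo_number, AVG(m.temperature) AS mean_temp, COUNT(*) AS n_levels
    FROM measurements m
    JOIN profiles p ON p.profile_id = m.profile_id
    JOIN floats f ON f.float_id = p.float_id
    WHERE m.pressure < 100
    GROUP BY f.wmo_number
    """
).fetchone()
print(row)
```

The join path (measurements to profiles to floats) is the same in PostgreSQL. Spatial filters on the `location` column would additionally rely on PostGIS functions, which this stand-in does not cover.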

Project Structure

floatchat/
├── app/ # Next.js frontend application
│ ├── src/
│ │ ├── app/
│ │ ├── components/
│ │ ├── lib/
│ │ └── utils/
│ ├── public/
│ └── Dockerfile
│
├── api/ # Express.js backend API
│ ├── src/
│ │ ├── controllers/
│ │ ├── models/
│ │ ├── routes/
│ │ ├── passport.ts
│ │ └── server.ts
│ └── Dockerfile
│
├── ai/ # Python FastAPI AI service
│ ├── src/
│ │ ├── api/
│ │ ├── core/
│ │ ├── database/
│ │ ├── llm/
│ │ └── schemas/
│ ├── scripts/
│ ├── data/chroma_db/
│ └── Dockerfile
│
├── data_processing/ # ARGO data scraping scripts
│ └── scrape-argo-data.py
│
├── k8s/ # Kubernetes deployment manifests
│ ├── ai/
│ ├── api/
│ ├── app/
│ ├── db/
│ └── ingress.yaml
│
└── images/ # Project screenshots and assets


Project Status

FloatChat was developed as part of the Smart India Hackathon.

The project cleared the internal hackathon at VIT and was shortlisted for the SIH finals from VIT Vellore.

This repository contains the final version submitted during the selection process.


Team

Team BoyOhBuoy - Smart India Hackathon 2025

Team Members

Mentor: Professor Manoov R
