Wellbore Data Agent

Version: 1.0.0
Status: Development (SPE Hackathon Project)

An AI-powered wellbore analysis system leveraging Retrieval-Augmented Generation (RAG) to enable intelligent document analysis, real-time chat interactions, and data extraction from technical wellbore documents.


🎯 Overview

The Wellbore Data Agent is a full-stack application designed to help petroleum engineers and analysts extract valuable insights from technical wellbore documents. By combining modern AI, vector databases, and a responsive web interface, the system enables users to:

  • Upload and process PDF documents containing wellbore data and technical information
  • Query documents using natural language through an intelligent AI agent
  • Extract insights including summaries, tables, and calculated analyses
  • Perform nodal analysis calculations on wellbore data
  • Interact in real time via WebSocket for responsive, streaming conversations

📋 Project Structure

wellbore-data-agent/
├── backend/                      # FastAPI backend service
│   ├── app/
│   │   ├── main.py              # FastAPI application entry point
│   │   ├── agents/              # AI agent logic (LangGraph-based)
│   │   │   ├── agent_graph.py
│   │   │   ├── extraction_graph.py
│   │   │   ├── summarization_graph.py
│   │   │   └── langgraph_agent.py
│   │   ├── api/                 # API endpoints and middleware
│   │   │   ├── routes/          # API route handlers
│   │   │   ├── middleware/      # CORS, error handling
│   │   │   └── deps.py          # Dependency injection
│   │   ├── core/                # Core configuration
│   │   ├── db/                  # Database management
│   │   ├── models/              # Pydantic data models
│   │   ├── rag/                 # RAG pipeline components
│   │   │   ├── chunking.py      # Document chunking strategies
│   │   │   ├── embeddings.py    # Embedding generation
│   │   │   ├── retriever.py     # Document retrieval
│   │   │   └── vector_store_manager.py
│   │   ├── services/            # Business logic services
│   │   │   ├── llm_service.py
│   │   │   ├── document_service.py
│   │   │   └── conversation_service.py
│   │   ├── utils/               # Utility functions
│   │   └── validation/          # Request/response validation
│   ├── data/                    # Data directory
│   │   ├── raw/                 # Raw uploaded documents
│   │   ├── processed/           # Processed documents
│   │   ├── uploads/             # Temporary upload storage
│   │   └── vector_db/           # Chroma vector database
│   ├── scripts/                 # Setup and utility scripts
│   ├── requirements.txt         # Python dependencies
│   └── README.md                # Backend documentation
│
├── frontend/                    # React + Vite frontend application
│   ├── src/
│   │   ├── App.tsx             # Root component
│   │   ├── main.tsx            # Entry point
│   │   ├── routes.tsx          # Router configuration
│   │   ├── components/         # Reusable React components
│   │   ├── pages/              # Page components
│   │   ├── services/           # API client services
│   │   ├── store/              # Redux state management
│   │   ├── types/              # TypeScript type definitions
│   │   ├── layout/             # Layout components
│   │   └── context/            # React context hooks
│   ├── public/                 # Static assets
│   └── package.json            # Node.js dependencies
│
├── docs/                       # Documentation
│   ├── architecture.md
│   ├── api.md
│   ├── agent-workflow.md
│   └── deployment.md
│
├── docker-compose.yml          # Multi-container orchestration
└── README.md                   # This file


πŸ—οΈ Architecture

Backend Stack

  • Framework: FastAPI (Python)
  • AI/ML:
    • LangGraph for agent orchestration
    • LangChain for LLM interactions
    • Ollama for local LLM inference
  • Vector Database: Chroma (persistent vector storage)
  • Embeddings: Sentence Transformers
  • Real-time Communication: WebSocket support via FastAPI
  • Document Processing: PDF extraction using PDFMiner, pdfplumber, PyMuPDF
  • Async Runtime: Uvicorn with async/await support

Frontend Stack

  • Framework: React 19 with TypeScript
  • Build Tool: Vite
  • UI Components: Material-UI (MUI), custom Radix UI components
  • State Management: Redux Toolkit
  • Styling: Tailwind CSS
  • HTTP Client: Axios
  • Real-time: WebSocket integration for live chat
  • Markdown: React-Markdown for rendered content

Infrastructure

  • Containerization: Docker & Docker Compose
  • Communication: Backend (port 8000) ↔ Frontend (port 5173)
  • External: Ollama LLM service (port 11434)

🚀 Key Features

1. Document Management

  • Upload PDFs: Drag-and-drop or file selection interface
  • Automatic Processing: Documents are chunked and embedded into the vector store
  • Metadata Tracking: Tracks page count, word count, chunk count, and upload timestamps
  • Document Retrieval: List all documents with detailed metadata
  • Document Deletion: Remove documents and associated data

2. AI-Powered Chat

  • Three Chat Endpoints:
    • /chat/ - Simple query endpoint
    • /chat/ask - Question-answering with confidence scores and source citations
    • /chat/stream - Streaming responses for real-time interaction
  • WebSocket Interface (/ws/):
    • question: Get answers to queries about documents
    • summarize: Generate document summaries
    • extract_tables: Extract tables based on natural language queries

3. Intelligent Agent System

  • Built on LangGraph for agentic workflows
  • Tool-based architecture with specialized agents:
    • Extraction Agent: Extract structured data from documents
    • Summarization Agent: Generate concise summaries
    • Analysis Agent: Perform calculations and analysis
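The tool-based routing idea can be sketched as a plain registry mapping task names to specialized handlers. The real system wires these up as LangGraph nodes, so the decorator, function names, and placeholder bodies below are illustrative assumptions only:

```python
from typing import Callable, Dict

# Registry of specialized "agents", keyed by task name.
TOOLS: Dict[str, Callable[[str], str]] = {}

def tool(name: str):
    """Register a function as the handler for a named task."""
    def register(fn: Callable[[str], str]) -> Callable[[str], str]:
        TOOLS[name] = fn
        return fn
    return register

@tool("summarize")
def summarization_agent(text: str) -> str:
    # Placeholder: the real agent would prompt the LLM through LangGraph.
    return "summary: " + text[:50]

@tool("extract")
def extraction_agent(text: str) -> str:
    # Placeholder for structured-data extraction.
    return "tables found in: " + text[:30]

def dispatch(task: str, text: str) -> str:
    """Route a request to the matching specialized agent."""
    if task not in TOOLS:
        raise KeyError(f"no agent registered for task {task!r}")
    return TOOLS[task](text)
```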

4. RAG Pipeline

  • Document Chunking: Intelligent splitting with overlap for context preservation
  • Embedding Generation: Dense embeddings using sentence-transformers
  • Vector Search: Semantic similarity search via Chroma
  • Context Retrieval: Top-K document chunk retrieval for LLM context
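The chunking step above can be sketched as a fixed-size sliding window. The real `chunking.py` may instead split on sentence or section boundaries, so treat the sizes and strategy here as assumptions:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into overlapping fixed-size chunks.

    Consecutive chunks share `overlap` characters so that context
    spanning a chunk boundary is preserved for retrieval.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    # Stop once the remaining tail is already covered by the previous chunk.
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]
```

Each chunk overlaps its neighbor, which trades some storage for better recall when a relevant passage straddles a boundary.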

5. Nodal Analysis

  • Calculation framework for wellbore nodal analysis
  • Currently includes mocked calculations with extensible architecture
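Since the calculations are currently mocked, here is one concrete example of the kind of formula such a framework could eventually host: Vogel's classic inflow performance relationship for solution-gas-drive wells. The function name and units are illustrative, not part of the codebase:

```python
def vogel_inflow_rate(q_max: float, p_reservoir: float, p_wf: float) -> float:
    """Vogel's IPR:  q/q_max = 1 - 0.2*(p_wf/p_r) - 0.8*(p_wf/p_r)**2

    q_max:       flow rate at zero bottomhole flowing pressure (stb/d)
    p_reservoir: average reservoir pressure (psia)
    p_wf:        bottomhole flowing pressure (psia)
    """
    if not 0 <= p_wf <= p_reservoir:
        raise ValueError("p_wf must lie between 0 and reservoir pressure")
    ratio = p_wf / p_reservoir
    return q_max * (1 - 0.2 * ratio - 0.8 * ratio ** 2)
```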

6. Health Monitoring

  • /health endpoint to check system status
  • Validates LLM service connectivity
  • Monitors vector store health
  • Reports detailed service status
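A minimal sketch of how such a status report might be assembled, assuming the endpoint aggregates per-service checks into one payload (the field names are assumptions, not the endpoint's confirmed response shape):

```python
def build_health_report(llm_ok: bool, vector_store_ok: bool) -> dict:
    """Assemble a /health-style payload from individual service checks."""
    services = {
        "llm": "ok" if llm_ok else "unavailable",
        "vector_store": "ok" if vector_store_ok else "unavailable",
    }
    return {
        "status": "healthy" if llm_ok and vector_store_ok else "degraded",
        "services": services,
    }
```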

📦 Technology Stack

Python Packages (Backend)

  • LLM & AI: langchain, langgraph, langchain-ollama, ollama
  • Vector DB: chromadb, langchain-chroma
  • Web: fastapi, uvicorn, python-socketio, websockets
  • PDF Processing: pdfplumber, pdfminer.six, PyPDF2, PyMuPDF, camelot
  • ML: torch, transformers, sentence-transformers, scikit-learn
  • Data: pandas, pydantic, sqlalchemy
  • Utilities: python-dotenv, tenacity, httpx

Node Packages (Frontend)

  • React Ecosystem: react, react-dom, react-router-dom
  • State: redux, @reduxjs/toolkit, react-redux
  • UI: @mui/material, tailwindcss, lucide-react, react-icons
  • Utilities: axios, marked, react-markdown, dompurify
  • Forms: react-dropzone (for file uploads)

🔧 Getting Started

Prerequisites

  • Docker & Docker Compose
  • OR
    • Python 3.10+
    • Node.js 18+
    • Ollama (for local LLM inference)

Quick Start (Docker)

  1. Clone the repository:

    git clone <repository-url>
    cd wellbore-data-agent
  2. Start services:

    docker-compose up --build
  3. Access the application at http://localhost:5173 (frontend) and http://localhost:8000 (backend API).

Manual Setup

Backend

  1. Install Python dependencies:

    cd backend
    pip install -r requirements.txt
  2. Configure environment:

    cp .env.example .env
    # Edit .env with your settings:
    # - OLLAMA_BASE_URL (default: http://localhost:11434)
    # - OLLAMA_MODEL (default: llama2)
  3. Start Ollama service (if using local LLM):

    ollama serve
  4. Run the application:

    uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

Frontend

  1. Install Node dependencies:

    cd frontend
    npm install
  2. Start development server:

    npm run dev
  3. Access the app at http://localhost:5173 (Vite's default dev server port).


📊 Data Flow

User Input
    ↓
Frontend (React)
    ↓
WebSocket/HTTP to Backend
    ↓
FastAPI Router
    ↓
LangGraph Agent
    ├→ Document Retrieval (Vector Store)
    ├→ LLM Service (Ollama)
    └→ Tool Execution (Extraction, Summarization, Analysis)
    ↓
Response
    ↓
Frontend Display

📄 License

All Rights Reserved.

This project was originally developed for the SPE (Society of Petroleum Engineers) Hackathon and is still under review. You are welcome to view the code, explore the architecture, and reference the approach for educational or evaluative purposes.

However, reuse, redistribution, or commercial use of the project is not permitted at this time without prior permission from the author.


👥 Team

Developed for the SPE (Society of Petroleum Engineers) Hackathon

