# Autonomous Data Analyst Agent

An LLM-powered autonomous agent that accepts a dataset, plans analysis, writes Python code, executes it in a sandbox, self-corrects on errors, and generates actionable insights — all streamed to a React dashboard in real time.

## Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                        React Dashboard                         │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌───────────────┐  │
│  │CSV Upload│  │  Chat UI │  │Code View │  │Reasoning Trace│  │
│  └────┬─────┘  └────┬─────┘  └──────────┘  └───────────────┘  │
│       │              │             ▲               ▲            │
│       │    SSE Stream│             │               │            │
└───────┼──────────────┼─────────────┼───────────────┼────────────┘
        │              │             │               │
        ▼              ▼             │               │
┌───────────────────────────────────────────────────────────────┐
│                     FastAPI Backend                            │
│                                                               │
│  ┌──────────┐  ┌──────────┐  ┌───────────┐  ┌─────────────┐ │
│  │ Planner  │→ │ Executor │→ │Reflection │→ │ Narrative   │ │
│  │(LLM call)│  │(tool run)│  │(fix loop) │  │ Generator   │ │
│  └──────────┘  └────┬─────┘  └───────────┘  └─────────────┘ │
│                      │                                        │
│  ┌───────────────────┴───────────────────────────────┐       │
│  │                  Memory Manager                    │       │
│  │  • Conversation Memory   • Task / Step Memory      │       │
│  │  • Error Memory          • Context Variables       │       │
│  └────────────────────────────────────────────────────┘       │
│                      │                                        │
│  ┌───────────────────┴───────────────────────────────┐       │
│  │              Tool Registry                         │       │
│  │  • execute_python    (subprocess sandbox)          │       │
│  │  • query_dataframe   (pandas .query())             │       │
│  │  • summarize_findings (insight recording)          │       │
│  └────────────────────────────────────────────────────┘       │
└───────────────────────────────────────────────────────────────┘
```
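Every step the agent takes is pushed to the dashboard over the SSE stream shown above. A minimal sketch of how the backend might frame one event (the `event:`/`data:` wire format is defined by the SSE spec; the event names here are illustrative, not the repo's actual protocol):

```python
import json

def sse_event(event: str, data: dict) -> str:
    """Serialize one Server-Sent Events frame.

    An SSE frame is plain text: an `event:` line naming the event type,
    a `data:` line carrying the payload, and a blank line as terminator.
    """
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"

# Hypothetical frame for a planning step:
frame = sse_event("plan_step", {"index": 1, "description": "Load and profile the CSV"})
```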

## Features

| Feature | Details |
| --- | --- |
| Tool Calling | OpenAI function calling with three tools: Python REPL, DataFrame query, findings recorder |
| Code Execution Sandbox | Subprocess-isolated Python execution with timeout, stdout/stderr capture, and matplotlib plots returned as base64 |
| Planning | The LLM decomposes the user question into 3–7 ordered analysis steps |
| Reflection Loop | On execution errors, the LLM diagnoses → rewrites → retries (up to 3 attempts) |
| Memory Management | Conversation memory, step memory, error memory, and shared context variables |
| Streaming | Real-time SSE stream from backend to frontend; every step is visible as it happens |
| Visual Dashboard | React + Tailwind UI: CSV upload, chat, generated-code viewer, plot gallery, reasoning trace |
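The core of the sandbox feature can be sketched in a few lines of standard-library Python. This is an illustrative reduction, not the repo's actual `sandbox.py` API; the function name, return shape, and limits are assumptions:

```python
import subprocess
import sys

def run_in_sandbox(code: str, timeout: float = 30.0, max_output: int = 8192) -> dict:
    """Run generated code in a separate Python process.

    Process isolation means buggy agent code cannot crash the server,
    the timeout catches infinite loops, and output is truncated so a
    huge stdout cannot overflow the LLM context window.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True, text=True, timeout=timeout,
        )
        return {
            "ok": proc.returncode == 0,
            "stdout": proc.stdout[:max_output],
            "stderr": proc.stderr[:max_output],
        }
    except subprocess.TimeoutExpired:
        return {"ok": False, "stdout": "", "stderr": f"timed out after {timeout}s"}
```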

## Design Decisions

| Decision | Rationale |
| --- | --- |
| Subprocess sandbox over `exec()` | Process isolation prevents agent code from crashing the server |
| SSE over WebSockets | Simpler for unidirectional streaming; works behind most proxies |
| Zustand over Redux | Minimal boilerplate for simple state management |
| JSON mode for the planner | Structured output avoids fragile parsing of free-text plans |
| Context variable serialization | DataFrames round-trip as JSON so they survive across separate subprocess executions |
| Separate reflection module | Clean separation of concerns; the reflection prompt is specialized for debugging |
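Because each tool call runs in a fresh subprocess, shared context has to be rehydrated on every execution. A minimal sketch of the idea, with state restored into the script's globals (names are illustrative; in the real pipeline a DataFrame would be restored via `pandas.read_json` rather than a plain dict):

```python
import json

def build_sandbox_script(user_code: str, context: dict) -> str:
    """Prepend a preamble that restores JSON-serialized context variables
    before the generated code runs, so state appears to persist across
    otherwise-stateless subprocess executions."""
    blob = json.dumps(json.dumps(context))  # JSON text, embedded as a Python string literal
    preamble = (
        "import json\n"
        f"globals().update(json.loads({blob}))\n"
    )
    return preamble + user_code
```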

## Failure Cases & Guard Rails

| Scenario | Handling |
| --- | --- |
| Code times out (>30 s) | Subprocess killed; error reported to the user |
| Infinite loop in agent code | Caught by the execution timeout |
| LLM returns malformed JSON | Fallback default plan / graceful error |
| 3 consecutive fix attempts fail | Reflection gives up and reports partial results |
| Large stdout (>8 KB) | Truncated to prevent context overflow |
| Non-CSV upload | Rejected at the API level with HTTP 400 |
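The "give up after 3 fixes" guard rail boils down to a bounded retry loop: execute, and on failure hand the traceback back to the LLM for a rewrite. A sketch with the LLM call stubbed out as a `fix` callback (all names are illustrative, not the repo's `reflection.py` API):

```python
def run_with_reflection(code, execute, fix, max_attempts=3):
    """Execute `code`; on error, let `fix` (an LLM call in practice)
    rewrite it from the traceback and retry, allowing up to
    `max_attempts` fixes before returning the last failing result."""
    result = execute(code)
    for _ in range(max_attempts):
        if result["ok"]:
            break
        code = fix(code, result["stderr"])
        result = execute(code)
    return result
```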

## Quick Start

### Prerequisites

- Python 3.11+
- Node.js 18+
- OpenAI API key

### Backend

```bash
cd backend
cp .env.example .env       # add your OPENAI_API_KEY
pip install -r requirements.txt
python main.py             # runs on :8000
```

### Frontend

```bash
cd frontend
npm install
npm run dev                # runs on :5173, proxies /api → :8000
```

### Docker

```bash
cp backend/.env.example backend/.env   # add your key
docker-compose up --build
# Frontend → http://localhost:3000
# Backend  → http://localhost:8000
```

## Project Structure

```
autonomous-analyst/
├── backend/
│   ├── agent/
│   │   ├── memory.py          # Conversation + step + error memory
│   │   ├── planner.py         # LLM-based task decomposition
│   │   ├── executor.py        # Tool calling loop (LLM → tool → LLM)
│   │   ├── reflection.py      # Error diagnosis + code rewrite loop
│   │   ├── orchestrator.py    # Main agent loop with SSE streaming
│   │   └── tools.py           # OpenAI function definitions
│   ├── api/
│   │   └── routes.py          # FastAPI endpoints (/upload, /analyse, etc.)
│   ├── core/
│   │   ├── config.py          # Settings (env vars, limits)
│   │   └── sandbox.py         # Subprocess code execution engine
│   └── main.py                # FastAPI entry point
├── frontend/
│   └── src/
│       ├── components/
│       │   ├── CSVUploader.tsx
│       │   ├── ChatInterface.tsx
│       │   ├── CodeViewer.tsx
│       │   ├── PlotViewer.tsx
│       │   └── ReasoningTrace.tsx
│       ├── hooks/useAnalysis.ts
│       ├── store/agentStore.ts
│       └── App.tsx
├── docker/
│   ├── Dockerfile.backend
│   ├── Dockerfile.frontend
│   └── nginx.conf
└── docker-compose.yml
```

## Limitations

- **No persistent storage**: sessions are in-memory; restarting the server clears all state.
- **Single-user sandbox**: subprocess execution is not containerized per request (use a Docker sandbox for production).
- **OpenAI dependency**: currently requires an OpenAI API key. Swap in any OpenAI-compatible endpoint (Ollama, vLLM, etc.) by changing the base URL.
- **No authentication**: add API-key middleware or OAuth for production.
- **CSV only**: Excel, Parquet, and JSON ingestion are not implemented yet.

## Future Improvements

- **Docker-in-Docker sandbox**: run each code execution in an isolated container
- **Multi-model support**: Llama 3, Claude, Gemini via LiteLLM
- **SQL tool**: connect to real databases (PostgreSQL, SQLite)
- **Web search tool**: let the agent fetch external context
- **Persistent sessions**: Redis/PostgreSQL-backed memory
- **Multi-file uploads**: join multiple CSVs
- **Export reports**: PDF/Markdown download of the final analysis
- **Latency benchmarks**: measure and display per-step timings
- **Cost tracking**: display token usage and estimated cost per analysis

## Cost Comparison

| Model | Avg. tokens per analysis | Estimated cost |
| --- | --- | --- |
| GPT-4o | ~8,000–15,000 | $0.04–$0.12 |
| GPT-4o-mini | ~8,000–15,000 | $0.002–$0.005 |
| Llama 3 70B (self-hosted) | ~8,000–15,000 | Infrastructure only |
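The estimates above follow directly from token counts multiplied by per-token rates. A sketch of the arithmetic (the rates used below are purely illustrative; check the provider's current pricing page):

```python
def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  input_rate: float, output_rate: float) -> float:
    """Estimated USD cost for one analysis; rates are dollars per million tokens."""
    return (prompt_tokens * input_rate + completion_tokens * output_rate) / 1_000_000

# e.g. 10k prompt + 2k completion tokens at hypothetical $2.50 / $10.00 per 1M:
cost = estimate_cost(10_000, 2_000, 2.50, 10.00)  # → 0.045
```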

## License

MIT
