Finance Agent

Finance Agent is an equity research platform. Ask questions and get answers drawn from 10-K filings, earnings calls, and news.

Live Platform: www.stratalens.ai

10-K filings agent blog post: Blogpost

Agent System

The core agent system implements Retrieval-Augmented Generation (RAG) with semantic data source routing, research planning, and iterative self-improvement for financial Q&A.

Architecture Overview

                              AGENT PIPELINE
 ═══════════════════════════════════════════════════════════════════════

 ┌──────────┐    ┌───────────────────┐    ┌──────────────────────────┐
 │ Question │───►│ Question Analyzer │───►│  Semantic Data Routing   │
 └──────────┘    │  (LLM via config) │    │                          │
                 │                   │    │  • Earnings Transcripts  │
                 │ Extracts:         │    │  • SEC 10-K Filings      │
                 │ • Tickers         │    │  • Real-Time News        │
                 │ • Time periods    │    │  • Hybrid (multi-source) │
                 │ • Intent          │    └────────────┬─────────────┘
                 └───────────────────┘                 │
                                                       ▼
                 ┌─────────────────────────────────────────────────────┐
                 │              RESEARCH PLANNING                       │
                 │  Agent generates reasoning: "I need to find..."     │
                 └────────────────────────┬────────────────────────────┘
                                          ▼
                 ┌─────────────────────────────────────────────────────┐
                 │                  RETRIEVAL LAYER                     │
                 │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  │
                 │  │  Earnings   │  │  SEC 10-K   │  │   Tavily    │  │
                 │  │ Transcripts │  │  Retrieval  │  │    News     │  │
                 │  │             │  │   Agent     │  │             │  │
                 │  │ Vector DB   │  │ (10-K only) │  │  Live API   │  │
                 │  │ + Hybrid    │  │ Planning +  │  │             │  │
                 │  │   Search    │  │  Iterative  │  │             │  │
                 │  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘  │
                 └─────────┴───────────┬────┴────────────────┴─────────┘
                                       │ ▲
                                       │ │ Re-query with
                                       │ │ follow-up questions
                                       ▼ │
                 ┌─────────────────────────────────────────────────────┐
                 │               ITERATIVE IMPROVEMENT                  │
                 │                                                      │
                 │    ┌──────────┐    ┌──────────┐    ┌──────────┐     │
                 │    │ Generate │───►│ Evaluate │───►│ Iterate? │─────┼───┐
                 │    │  Answer  │    │ Quality  │    │          │     │   │
                 │    └──────────┘    └──────────┘    └──────────┘     │   │
                 │                                         │ NO        │   │ YES
                 └─────────────────────────────────────────┼───────────┘   │
                                                           ▼               │
                                                    ┌─────────────┐        │
                                                    │   ANSWER    │        │
                                                    │ + Citations │        │
                                                    └─────────────┘        │
                                                           ▲               │
                                                           └───────────────┘
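
The generate, evaluate, iterate loop in the diagram can be sketched roughly as follows. This is a minimal illustration and not the project's implementation: `run_iterative_loop`, `LoopResult`, and the `retrieve`, `generate`, and `evaluate` callables are hypothetical names, and the default limits are placeholders rather than the real answer-mode settings.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class LoopResult:
    answer: str
    score: float
    iterations: int


def run_iterative_loop(
    question: str,
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, List[str]], str],
    evaluate: Callable[[str, str], Tuple[float, List[str]]],
    max_iterations: int = 3,          # assumed default, not the project's value
    quality_threshold: float = 0.8,   # assumed default, not the project's value
) -> LoopResult:
    """Generate an answer, score it, and re-query until the score is good enough."""
    context: List[str] = list(retrieve(question))
    answer, score = "", 0.0
    for iteration in range(1, max_iterations + 1):
        answer = generate(question, context)
        score, follow_ups = evaluate(question, answer)
        # Stop when the answer is good enough or there is nothing left to ask.
        if score >= quality_threshold or not follow_ups:
            return LoopResult(answer, score, iteration)
        # Otherwise re-query the retrieval layer with the follow-up questions.
        for follow_up in follow_ups:
            context.extend(retrieve(follow_up))
    return LoopResult(answer, score, max_iterations)
```

Passing the three stages in as callables keeps the loop independent of any particular LLM client or retrieval backend, which also makes it easy to exercise with stubs.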

Key Concepts:

Semantic Routing - Routes to data sources based on question intent, not keywords
Research Planning - Agent explains reasoning before searching ("I need to find...")
Multi-Source RAG - Combines earnings transcripts, SEC 10-K filings, and news
Self-Reflection - Evaluates answer quality and iterates until confident
Answer Modes - Configurable iteration depth (2-10 iterations) and quality thresholds (70-95%)
Search-Optimized Follow-ups - Generates keyword phrases for better RAG retrieval
Parallel Multi-Agent Synthesis - Per-ticker subagents run in parallel; results are synthesized into one unified answer
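
As a rough illustration of the parallel per-ticker pattern listed above, the sketch below fans a question out to one coroutine per ticker and merges the results. The names `answer_per_ticker`, `synthesize`, and the `subagent` callable are hypothetical, and the string-joining synthesis is only a stand-in for whatever synthesis step the project actually uses.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


async def answer_per_ticker(
    question: str,
    tickers: List[str],
    subagent: Callable[[str, str], Awaitable[str]],
) -> Dict[str, str]:
    """Run one subagent per ticker concurrently and collect results by ticker."""
    # asyncio.gather returns results in the same order as its inputs.
    results = await asyncio.gather(*(subagent(question, t) for t in tickers))
    return dict(zip(tickers, results))


def synthesize(per_ticker: Dict[str, str]) -> str:
    """Placeholder synthesis step: join the per-ticker findings into one text."""
    return "\n".join(f"{ticker}: {text}" for ticker, text in per_ticker.items())
```

A caller would typically run `asyncio.run(answer_per_ticker(question, tickers, subagent))` and pass the resulting dictionary to the synthesis step.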

Benchmark: 91% accuracy on FinanceBench (112 10-K questions), ~10s per question, evaluated using LLM-as-a-judge.

Documentation

Document	Description
agent/README.md	Complete agent architecture, pipeline stages, configuration
docs/SEC_AGENT.md	SEC 10-K agent: section routing, table selection, reranking
agent/rag/data_ingestion/README.md	Data ingestion pipelines for transcripts and 10-K filings

Features

Earnings Transcripts (2020-2025) - Word-for-word executive commentary from earnings calls
SEC 10-K Filings (2018-2025) - Official annual reports via specialized retrieval agent (10-Q/8-K coming soon)
Real-Time News - Latest market developments via Tavily search
Financial Screener - Natural language queries over company fundamentals [in development]

Unlike generic LLMs that rely on web content, Finance Agent uses the same authoritative documents that professional analysts depend on.

Tech Stack

Backend: FastAPI, PostgreSQL (pgvector), DuckDB
AI/ML: Cerebras (Qwen-3-235B), OpenAI (fallback), RAG with iterative self-improvement
Search: Hybrid vector (pgvector) + TF-IDF with cross-encoder reranking
Frontend: React + TypeScript, Tailwind CSS
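
To make the hybrid search entry more concrete, here is one common way to blend vector and keyword scores before reranking. It is a sketch under assumptions, not the project's search code: `hybrid_rank`, the min-max normalization, and the `alpha` weight are all illustrative choices, and the cross-encoder reranking step is left out.

```python
from typing import Dict, List, Tuple


def _min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Scale scores to [0, 1]; if all scores are equal, map them to 0.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_rank(
    vector_scores: Dict[str, float],
    keyword_scores: Dict[str, float],
    alpha: float = 0.7,  # assumed weight on the vector side
) -> List[Tuple[str, float]]:
    """Blend normalized vector and keyword scores, best document first."""
    vec, kw = _min_max(vector_scores), _min_max(keyword_scores)
    docs = set(vec) | set(kw)
    fused = {d: alpha * vec.get(d, 0.0) + (1 - alpha) * kw.get(d, 0.0) for d in docs}
    # Sort by descending score, breaking ties by document id for determinism.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))
```

Normalizing each score set first matters because cosine similarities and TF-IDF scores live on different scales; without it, one signal would dominate the weighted sum.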

Project Structure

finance_agent/
├── agent/                  # AI agent & RAG system         → see agent/README.md
│   ├── __init__.py        # Public API: Agent, RAGAgent, create_agent()
│   ├── agent_config.py    # Iteration/quality threshold settings
│   ├── prompts.py         # Centralized LLM prompt templates
│   ├── llm/               # Unified LLM client (OpenAI/Cerebras)  → see agent/llm/README.md
│   ├── rag/               # RAG implementation
│   │   ├── rag_agent.py                          # Main orchestration
│   │   ├── sec_filings_service_smart_parallel.py  # SEC 10-K agent
│   │   ├── response_generator.py   # LLM response & evaluation
│   │   ├── question_analyzer.py    # Semantic routing
│   │   ├── search_engine.py        # Hybrid transcript search
│   │   ├── tavily_service.py       # Real-time news
│   │   ├── earnings_transcript_service.py  # Dedicated earnings transcript retrieval agent
│   │   ├── search_planner.py       # Search plan generation and temporal reference resolution
│   │   ├── rag_flow_context.py     # Flow context dataclass for pipeline state
│   │   └── data_ingestion/         # Data pipeline → see data_ingestion/README.md
│   └── screener/          # Financial screener
├── app/                   # FastAPI application
│   ├── routers/           # API endpoints
│   └── schemas/           # Pydantic models
├── frontend/              # React + TypeScript frontend
├── docs/                  # Documentation
│   └── SEC_AGENT.md       # 10-K agent deep dive

Quick Start

Prerequisites

Python 3.9+
PostgreSQL 12+ with pgvector extension
See Requirements for full dependency list

Installation

# Clone repository
git clone https://github.com/kamathhrishi/stratalensai.git
cd stratalensai

# Install dependencies
pip install -r requirements.txt

# Setup environment variables
cp .env.example .env
# Edit .env with your API keys and database credentials

# Configure environment (see Configuration section below)

Configuration

Before running the application, configure the following in .env:

BASE_URL - Set to your server URL (e.g., http://localhost:8000 for local development, or your production URL when deployed)
RAG_DEBUG_MODE - Set to false for production, true for development debugging
AUTH_DISABLED - Set to true to bypass Clerk auth (dev only), false for production
CLERK_SECRET_KEY / CLERK_PUBLISHABLE_KEY - Required for production authentication (get from Clerk Dashboard)

Frontend env vars (read from root .env via envDir: '../' in vite.config.ts):

VITE_CLERK_PUBLISHABLE_KEY - Same value as CLERK_PUBLISHABLE_KEY (Vite requires VITE_ prefix)
VITE_API_BASE_URL - Leave empty for same-origin requests (default); set to an explicit URL only if backend is on a separate domain
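
The backend settings above could be read and sanity-checked along these lines. This is a sketch, not the application's actual config code: `Settings`, `load_settings`, the accepted truthy strings, and the `http://localhost:8000` fallback are assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(value: Optional[str], default: bool = False) -> bool:
    """Interpret common truthy strings found in .env files."""
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    base_url: str
    rag_debug_mode: bool
    auth_disabled: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from an environment mapping and validate the Clerk keys."""
    settings = Settings(
        base_url=env.get("BASE_URL", "http://localhost:8000"),  # assumed fallback
        rag_debug_mode=_as_bool(env.get("RAG_DEBUG_MODE")),
        auth_disabled=_as_bool(env.get("AUTH_DISABLED")),
    )
    # With auth enabled, both Clerk keys must be present.
    if not settings.auth_disabled:
        required = ("CLERK_SECRET_KEY", "CLERK_PUBLISHABLE_KEY")
        missing = [key for key in required if not env.get(key)]
        if missing:
            raise ValueError(f"Missing Clerk settings: {', '.join(missing)}")
    return settings
```

Failing fast at startup when the Clerk keys are absent avoids discovering the misconfiguration on the first authenticated request.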

# Ingest data (optional - see agent/rag/data_ingestion/README.md)
python agent/rag/data_ingestion/download_transcripts.py
python agent/rag/data_ingestion/ingest_with_structure.py --ticker AAPL --year-start 2020 --year-end 2025

# Run server
python -m uvicorn app.main:app --host 0.0.0.0 --port 8000

Access the application at http://localhost:8000

Requirements

API Keys

Service	Environment Variable	Required
OpenAI	`OPENAI_API_KEY`	Yes
Cerebras	`CEREBRAS_API_KEY`	Yes
API Ninjas	`API_NINJAS_KEY`	Yes
Clerk	`CLERK_SECRET_KEY`, `CLERK_PUBLISHABLE_KEY`	Yes (production)
Tavily	`TAVILY_API_KEY`	Optional
Logfire	`LOGFIRE_TOKEN`	Optional

Database

PostgreSQL with pgvector extension (DATABASE_URL)
Redis (optional, for caching) (REDIS_URL)

Python Dependencies

See requirements.txt for full list.

API Documentation

Swagger UI: http://localhost:8000/docs
ReDoc: http://localhost:8000/redoc

Key Endpoints

POST /message/stream-v2 - Chat with streaming RAG responses
GET /companies/search - Search companies by ticker/name
GET /transcript/{ticker}/{year}/{quarter} - Get specific earnings transcript
POST /screener/query/stream - Natural language financial queries
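
For example, a small client helper for the transcript endpoint might look like the sketch below. The path shape comes from the endpoint list above; everything else is an assumption, including the helper names `transcript_url` and `fetch_transcript`, the use of an integer quarter, a JSON response body, and the absence of auth headers (which a production deployment with Clerk would likely require).

```python
import json
from typing import Any
from urllib.parse import quote
from urllib.request import urlopen


def transcript_url(base_url: str, ticker: str, year: int, quarter: int) -> str:
    """Build the URL for GET /transcript/{ticker}/{year}/{quarter}."""
    safe_ticker = quote(ticker, safe="")
    return f"{base_url.rstrip('/')}/transcript/{safe_ticker}/{year}/{quarter}"


def fetch_transcript(base_url: str, ticker: str, year: int, quarter: int) -> Any:
    """Fetch a transcript, assuming the server replies with JSON and needs no auth."""
    with urlopen(transcript_url(base_url, ticker, year, quarter), timeout=30) as resp:
        return json.load(resp)
```

Against a local server, `fetch_transcript("http://localhost:8000", "AAPL", 2024, 2)` would request `/transcript/AAPL/2024/2`; the Swagger UI at `/docs` is the authoritative source for the real parameters and response schema.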

Data Sources

Data is split between PostgreSQL (embeddings, metadata) and Railway S3 (full filing documents, transcript text). See agent/rag/data_ingestion/README.md for detailed ingestion instructions.

AI Agent Documentation

Document	Description
agent/README.md	Complete agent architecture, pipeline stages, semantic routing, iterative self-improvement
docs/SEC_AGENT.md	SEC 10-K agent: planning-driven retrieval, 91% accuracy on FinanceBench
agent/rag/data_ingestion/README.md	Data ingestion pipelines for transcripts and SEC filings

Development Status

Production (Finance Agent):

Earnings transcript chat with RAG
SEC 10-K filings (2018-2025)
Real-time streaming responses
User authentication

In Development:

Enhanced financial screener
Performance optimizations

Contributing

Contributions welcome! Please open an issue to discuss major changes before submitting PRs.

License

MIT License - see LICENSE file for details

Contact

For questions or access requests: hrishi@stratalens.ai
