Finance Agent is an equity research platform: ask questions and get answers grounded in 10-K filings, earnings calls, and news.
Live Platform: www.stratalens.ai
10-K filings agent blog post: Blogpost
Core agent system implementing Retrieval-Augmented Generation (RAG) with semantic data source routing, research planning, and iterative self-improvement for financial Q&A.
                           AGENT PIPELINE
═════════════════════════════════════════════════════════════════════
┌──────────┐    ┌───────────────────┐    ┌──────────────────────────┐
│ Question │───►│ Question Analyzer │───►│ Semantic Data Routing    │
└──────────┘    │ (LLM via config)  │    │                          │
                │                   │    │ • Earnings Transcripts   │
                │ Extracts:         │    │ • SEC 10-K Filings       │
                │ • Tickers         │    │ • Real-Time News         │
                │ • Time periods    │    │ • Hybrid (multi-source)  │
                │ • Intent          │    └────────────┬─────────────┘
                └───────────────────┘                 │
                                                      ▼
        ┌─────────────────────────────────────────────────────┐
        │                  RESEARCH PLANNING                  │
        │   Agent generates reasoning: "I need to find..."    │
        └─────────────────────┬───────────────────────────────┘
                              ▼
        ┌─────────────────────────────────────────────────────┐
        │                   RETRIEVAL LAYER                   │
        │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  │
        │  │ Earnings    │  │ SEC 10-K    │  │ Tavily      │  │
        │  │ Transcripts │  │ Retrieval   │  │ News        │  │
        │  │             │  │ Agent       │  │             │  │
        │  │ Vector DB   │  │ (10-K only) │  │ Live API    │  │
        │  │ + Hybrid    │  │ Planning +  │  │             │  │
        │  │ Search      │  │ Iterative   │  │             │  │
        │  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘  │
        └─────────┴───────────┬────┴────────────────┴─────────┘
                              │                         ▲
                              │                         │ Re-query with
                              │                         │ follow-up questions
                              ▼                         │
        ┌───────────────────────────────────────────────┼─────┐
        │                ITERATIVE IMPROVEMENT          │     │
        │                                               │     │
        │ ┌──────────┐    ┌──────────┐    ┌──────────┐  │ YES │
        │ │ Generate │───►│ Evaluate │───►│ Iterate? │──┘     │
        │ │ Answer   │    │ Quality  │    │          │        │
        │ └──────────┘    └──────────┘    └────┬─────┘        │
        │                                      │ NO           │
        └──────────────────────────────────────┼──────────────┘
                                               ▼
                                        ┌─────────────┐
                                        │   ANSWER    │
                                        │ + Citations │
                                        └─────────────┘
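The generate, evaluate, iterate loop in the diagram can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's implementation: `AnswerMode`, `run_iterative_loop`, and the three callables are hypothetical names, and the retrieval layer and LLM calls are passed in as plain functions so the control flow stays visible.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class AnswerMode:
    """Hypothetical answer mode: an iteration budget plus a quality threshold."""
    max_iterations: int       # e.g. somewhere between 2 and 10
    quality_threshold: float  # e.g. somewhere between 0.70 and 0.95


def run_iterative_loop(
    question: str,
    retrieve: Callable[[List[str]], List[str]],
    generate: Callable[[str, List[str]], str],
    evaluate: Callable[[str, str], Tuple[float, List[str]]],
    mode: AnswerMode,
) -> Tuple[str, int]:
    """Return (answer, iterations_used).

    retrieve(queries) -> context chunks
    generate(question, chunks) -> draft answer
    evaluate(question, answer) -> (quality in [0, 1], follow-up queries)
    """
    queries = [question]
    chunks: List[str] = []
    answer = ""
    for iteration in range(1, mode.max_iterations + 1):
        chunks.extend(retrieve(queries))      # first pass, then re-queries
        answer = generate(question, chunks)
        quality, follow_ups = evaluate(question, answer)
        if quality >= mode.quality_threshold or not follow_ups:
            return answer, iteration          # "NO" branch: stop iterating
        queries = follow_ups                  # "YES" branch: re-query
    return answer, mode.max_iterations        # budget exhausted
```

Passing the components as callables is only a convenience for the sketch; it makes the loop easy to exercise with stub functions before wiring in a real retriever or model.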
Key Concepts:
- Semantic Routing - Routes to data sources based on question intent, not keywords
- Research Planning - Agent explains reasoning before searching ("I need to find...")
- Multi-Source RAG - Combines earnings transcripts, SEC 10-K filings, and news
- Self-Reflection - Evaluates answer quality and iterates until confident
- Answer Modes - Configurable iteration depth (2-10 iterations) and quality thresholds (70-95%)
- Search-Optimized Follow-ups - Generates keyword phrases for better RAG retrieval
- Parallel Multi-Agent Synthesis - Per-ticker subagents run in parallel; results are synthesized into one unified answer
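The per-ticker fan-out described in the last bullet can be illustrated with `asyncio.gather`. This is a sketch only: `answer_per_ticker`, `synthesize`, and `demo_subagent` are hypothetical names, the subagent is a stub, and the real synthesis step presumably involves an LLM rather than string joining.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


async def answer_per_ticker(
    question: str,
    tickers: List[str],
    subagent: Callable[[str, str], Awaitable[str]],
) -> Dict[str, str]:
    """Run one subagent coroutine per ticker concurrently."""
    # gather() returns results in the same order as the awaitables passed in.
    results = await asyncio.gather(*(subagent(question, t) for t in tickers))
    return dict(zip(tickers, results))


def synthesize(per_ticker: Dict[str, str]) -> str:
    """Placeholder synthesis step: join the per-ticker findings."""
    return "\n".join(f"{ticker}: {text}" for ticker, text in per_ticker.items())


async def demo_subagent(question: str, ticker: str) -> str:
    """Stub subagent; the sleep stands in for retrieval and generation."""
    await asyncio.sleep(0.01)
    return f"findings for {ticker}"
```

A call such as `asyncio.run(answer_per_ticker("Compare margins", ["AAPL", "MSFT"], demo_subagent))` runs both stubs concurrently, so total latency is close to that of the slowest subagent rather than the sum.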
Benchmark: 91% accuracy on FinanceBench (112 10-K questions), ~10s per question, evaluated using LLM-as-a-judge.
| Document | Description |
|---|---|
| agent/README.md | Complete agent architecture, pipeline stages, configuration |
| docs/SEC_AGENT.md | SEC 10-K agent: section routing, table selection, reranking |
| agent/rag/data_ingestion/README.md | Data ingestion pipelines for transcripts and 10-K filings |
- Earnings Transcripts (2020-2025) - Word-for-word executive commentary from earnings calls
- SEC 10-K Filings (2018-2025) - Official annual reports via specialized retrieval agent (10-Q/8-K coming soon)
- Real-Time News - Latest market developments via Tavily search
- Financial Screener - Natural language queries over company fundamentals [in development]
Unlike generic LLMs that rely on web content, Finance Agent uses the same authoritative documents that professional analysts depend on.
- Backend: FastAPI, PostgreSQL (pgvector), DuckDB
- AI/ML: Cerebras (Qwen-3-235B), OpenAI (fallback), RAG with iterative self-improvement
- Search: Hybrid vector (pgvector) + TF-IDF with cross-encoder reranking
- Frontend: React + TypeScript, Tailwind CSS
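To make the hybrid search idea concrete, here is a small in-memory sketch that fuses a vector similarity score with a TF-IDF keyword score. It is an illustration under assumptions, not the project's code: in the actual stack the vector side is handled by pgvector, the function names and the `vector_weight` value are hypothetical, the TF-IDF is hand-rolled for self-containment, and the cross-encoder reranking stage is left out.

```python
import math
from collections import Counter
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def tfidf_scores(query: str, docs: List[str]) -> List[float]:
    """Score each document against the query with a simple TF-IDF model."""
    tokenized = [d.lower().split() for d in docs]
    n = len(docs)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vocab = sorted(idf)

    def vectorize(tokens: List[str]) -> List[float]:
        tf = Counter(tokens)  # terms outside the vocabulary are ignored
        return [tf[t] * idf[t] for t in vocab]

    q_vec = vectorize(query.lower().split())
    return [cosine(q_vec, vectorize(tokens)) for tokens in tokenized]


def hybrid_rank(
    query: str,
    query_embedding: Sequence[float],
    docs: List[str],
    doc_embeddings: List[Sequence[float]],
    vector_weight: float = 0.7,  # hypothetical weight, not taken from the project
) -> List[int]:
    """Return document indices ordered by fused score, best first."""
    keyword = tfidf_scores(query, docs)
    semantic = [cosine(query_embedding, e) for e in doc_embeddings]
    fused = [
        vector_weight * s + (1.0 - vector_weight) * k
        for s, k in zip(semantic, keyword)
    ]
    return sorted(range(len(docs)), key=lambda i: fused[i], reverse=True)
```

A weighted sum is only one way to combine the two signals; rank-based fusion is a common alternative when the score scales differ.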
finance_agent/
├── agent/                                          # AI agent & RAG system → see agent/README.md
│   ├── __init__.py                                 # Public API: Agent, RAGAgent, create_agent()
│   ├── agent_config.py                             # Iteration/quality threshold settings
│   ├── prompts.py                                  # Centralized LLM prompt templates
│   ├── llm/                                        # Unified LLM client (OpenAI/Cerebras) → see agent/llm/README.md
│   ├── rag/                                        # RAG implementation
│   │   ├── rag_agent.py                            # Main orchestration
│   │   ├── sec_filings_service_smart_parallel.py   # SEC 10-K agent
│   │   ├── response_generator.py                   # LLM response & evaluation
│   │   ├── question_analyzer.py                    # Semantic routing
│   │   ├── search_engine.py                        # Hybrid transcript search
│   │   ├── tavily_service.py                       # Real-time news
│   │   ├── earnings_transcript_service.py          # Dedicated earnings transcript retrieval agent
│   │   ├── search_planner.py                       # Search plan generation and temporal reference resolution
│   │   ├── rag_flow_context.py                     # Flow context dataclass for pipeline state
│   │   └── data_ingestion/                         # Data pipeline → see data_ingestion/README.md
│   └── screener/                                   # Financial screener
├── app/                                            # FastAPI application
│   ├── routers/                                    # API endpoints
│   └── schemas/                                    # Pydantic models
├── frontend/                                       # React + TypeScript frontend
├── docs/                                           # Documentation
│   └── SEC_AGENT.md                                # 10-K agent deep dive
- Python 3.9+
- PostgreSQL 12+ with pgvector extension
- See Requirements for full dependency list
# Clone repository
git clone https://github.com/kamathhrishi/stratalensai.git
cd finance_agent
# Install dependencies
pip install -r requirements.txt
# Setup environment variables
cp .env.example .env
# Edit .env with your API keys and database credentials
# Configure environment (see Configuration section below)
Before running the application, configure the following in .env:
- BASE_URL - Set to your server URL (e.g., http://localhost:8000 for local, your production URL for deployed)
- RAG_DEBUG_MODE - Set to false for production, true for development debugging
- AUTH_DISABLED - Set to true to bypass Clerk auth (dev only), false for production
- CLERK_SECRET_KEY / CLERK_PUBLISHABLE_KEY - Required for production authentication (get from Clerk Dashboard)
Frontend env vars (read from root .env via envDir: '../' in vite.config.ts):
- VITE_CLERK_PUBLISHABLE_KEY - Same value as CLERK_PUBLISHABLE_KEY (Vite requires VITE_ prefix)
- VITE_API_BASE_URL - Leave empty for same-origin requests (default); set to an explicit URL only if backend is on a separate domain
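As an illustration of how the backend variables above interact, the following sketch reads them from the environment and enforces the rule that Clerk keys are needed unless auth is disabled. It is not the application's actual settings code: `Settings`, `load_settings`, the set of strings treated as true, and the localhost default for BASE_URL are all assumptions made for the example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(value: Optional[str], default: bool = False) -> bool:
    """Interpret common truthy strings; anything else counts as False."""
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    base_url: str
    rag_debug_mode: bool
    auth_disabled: bool
    clerk_secret_key: Optional[str]
    clerk_publishable_key: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from an environment mapping (hypothetical helper)."""
    settings = Settings(
        base_url=env.get("BASE_URL", "http://localhost:8000"),
        rag_debug_mode=_as_bool(env.get("RAG_DEBUG_MODE")),
        auth_disabled=_as_bool(env.get("AUTH_DISABLED")),
        clerk_secret_key=env.get("CLERK_SECRET_KEY") or None,
        clerk_publishable_key=env.get("CLERK_PUBLISHABLE_KEY") or None,
    )
    # Production auth needs both Clerk keys; dev mode may skip them.
    if not settings.auth_disabled and not (
        settings.clerk_secret_key and settings.clerk_publishable_key
    ):
        raise ValueError("Clerk keys are required unless AUTH_DISABLED=true")
    return settings
```

Failing fast at startup like this surfaces a missing key before the first authenticated request, which tends to be easier to debug than a runtime 401.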
# Ingest data (optional - see agent/rag/data_ingestion/README.md)
python agent/rag/data_ingestion/download_transcripts.py
python agent/rag/data_ingestion/ingest_with_structure.py --ticker AAPL --year-start 2020 --year-end 2025
# Run server
python -m uvicorn app.main:app --host 0.0.0.0 --port 8000
Access the application at http://localhost:8000
| Service | Environment Variable | Required |
|---|---|---|
| OpenAI | OPENAI_API_KEY | Yes |
| Cerebras | CEREBRAS_API_KEY | Yes |
| API Ninjas | API_NINJAS_KEY | Yes |
| Clerk | CLERK_SECRET_KEY, CLERK_PUBLISHABLE_KEY | Yes (production) |
| Tavily | TAVILY_API_KEY | Optional |
| Logfire | LOGFIRE_TOKEN | Optional |
- PostgreSQL with pgvector extension (DATABASE_URL)
- Redis (optional, for caching) (REDIS_URL)
See requirements.txt for full list.
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
- POST /message/stream-v2 - Chat with streaming RAG responses
- GET /companies/search - Search companies by ticker/name
- GET /transcript/{ticker}/{year}/{quarter} - Get specific earnings transcript
- POST /screener/query/stream - Natural language financial queries
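For quick experiments against a local server, the two GET endpoints can be called with the standard library alone. The sketch below is hedged in several places: the helper names are hypothetical, the search parameter name `q` and the format of the quarter value are guesses (the Swagger UI at /docs is the place to confirm them), and `get_json` assumes a JSON response and no auth header, which may only hold when AUTH_DISABLED is true.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # local default from the setup steps


def transcript_url(ticker: str, year: int, quarter: str, base_url: str = BASE_URL) -> str:
    """Build the URL for GET /transcript/{ticker}/{year}/{quarter}."""
    base = base_url.rstrip("/")
    return f"{base}/transcript/{quote(ticker.upper())}/{year}/{quote(quarter)}"


def company_search_url(query: str, base_url: str = BASE_URL) -> str:
    """Build the URL for GET /companies/search.

    The parameter name `q` is an assumption, not taken from the API docs.
    """
    base = base_url.rstrip("/")
    return f"{base}/companies/search?{urlencode({'q': query})}"


def get_json(url: str, timeout: float = 10.0):
    """Fetch a URL and decode the body as JSON (requires a running server)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

With the server running, `get_json(transcript_url("AAPL", 2024, "1"))` would request the transcript path listed above; the streaming POST endpoints need a request body whose schema is not described here, so they are left out.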
Data is split between PostgreSQL (embeddings, metadata) and Railway S3 (full filing documents, transcript text). See agent/rag/data_ingestion/README.md for detailed ingestion instructions.
| Document | Description |
|---|---|
| agent/README.md | Complete agent architecture, pipeline stages, semantic routing, iterative self-improvement |
| docs/SEC_AGENT.md | SEC 10-K agent: planning-driven retrieval, 91% accuracy on FinanceBench |
| agent/rag/data_ingestion/README.md | Data ingestion pipelines for transcripts and SEC filings |
Production (Finance Agent):
- Earnings transcript chat with RAG
- SEC 10-K filings (2018-2025)
- Real-time streaming responses
- User authentication
In Development:
- Enhanced financial screener
- Performance optimizations
Contributions welcome! Please open an issue to discuss major changes before submitting PRs.
MIT License - see LICENSE file for details
For questions or access requests: hrishi@stratalens.ai