An intelligent Formula 1 race analysis application that combines PDF document parsing with real-time OpenF1 API data to reconstruct comprehensive race timelines with interactive visualizations.
F1 Race Intelligence is a Retrieval-Augmented Generation (RAG) system that analyzes F1 race documents (Wikipedia articles, race reports, etc.) and enriches them with live telemetry data from the OpenF1 API. It automatically extracts race events, pit stops, safety car periods, weather changes, overtakes, and more, then presents everything in an interactive timeline visualization.
- PDF Upload & Parsing: Upload race documents and extract key events using LLM-powered analysis
- OpenF1 API Integration: Automatically fetches real telemetry: pit stops, stints, race control messages, position changes, overtakes
- Timeline Reconstruction: Merges PDF-extracted events with API data into a unified, chronological timeline
- Interactive Visualization: Plotly-powered chart showing all events by lap and driver with color-coded event types
- Advanced Filtering: Filter by event type, driver, or evidence source
- 14 Event Type Categories: Safety Car, VSC, Red Flag, Yellow Flag, Pit Stop, Strategy, Weather, Incident, Overtake, Pace, Position, Result, Grid, Info
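To make the OpenF1 integration more concrete, here is a minimal sketch of how query URLs for some of these data types might be built. The base URL and the endpoint names follow the public OpenF1 API; the `build_openf1_url` helper and the example `session_key` value are hypothetical and are not taken from this project's `openf1/api.py`.

```python
from urllib.parse import urlencode

OPENF1_BASE_URL = "https://api.openf1.org/v1"

# Assumed subset of public OpenF1 endpoints relevant to the features above.
ENDPOINTS = {"sessions", "pit", "stints", "race_control", "position", "weather"}


def build_openf1_url(endpoint: str, **filters) -> str:
    """Return a query URL for one OpenF1 endpoint (hypothetical helper)."""
    if endpoint not in ENDPOINTS:
        raise ValueError(f"unknown endpoint: {endpoint!r}")
    # Sort filters so the same query always yields the same URL,
    # which also makes the URL usable as a cache key.
    query = urlencode(sorted(filters.items()))
    return f"{OPENF1_BASE_URL}/{endpoint}" + (f"?{query}" if query else "")


# Example: pit stops for one session (the session_key value is illustrative).
pit_url = build_openf1_url("pit", session_key=9158)
```

The resulting URL can be passed to any HTTP client; the endpoints return JSON lists of records.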
Architecture, from top to bottom:

- External services
  - Ollama server (`localhost:11434`): llama3 model
  - OpenF1 API (`api.openf1.org`): live telemetry
- Application: user interfaces
  - Gradio UI (`ui_gradio.py`)
    - PDF Upload tab
    - Visualization tab (Plotly)
    - Timeline Explorer
    - Raw Data tab
    - Event Details tab
    - 14 event type filters
  - MCP server (`server.py`) + client (`client.py`)
    - FastAPI-based Model Context Protocol server
    - Exposes tools: `ingest_pdf`, `build_timeline`, `query_timeline`
    - Enables AI assistant integration
- App service (`rag/app_service.py`), called by the user interfaces
  - Orchestrates all components
  - Coordinates PDF ingestion
  - Metadata extraction (year, GP, session)
  - JSON serialization
- RAG pipeline, driven by the app service
  1. Ingest (`ingest.py`): PDF text extraction, text chunking
  2. Embed (`embed.py`): sentence embeddings
  3. Store (`store.py`): in-memory vector store
  4. Retrieve (`retrieve.py`): similarity search, Top-K retrieval
  5. LLM (`llm.py`): Ollama interface, event extraction, prompts (`prompts.py`)
  - Agent (`agent.py`): query orchestration
- OpenF1 client (`openf1/api.py`), driven by the app service
  - Sessions lookup & resolution
  - Race control messages (SC, VSC)
  - Pit stops & stint data
  - Position changes tracking
  - Weather data
  - Overtakes detection
  - Starting grid positions
  - Session results
  - Rate limiting & caching
- Timeline builder (`rag/timeline.py`), the final stage
  - Merges PDF events + OpenF1 events
  - Deduplication & conflict resolution
  - Impact analysis scoring
  - Event categorization (14 types)
  - Schemas (`rag/schemas.py`): `TimelineEvent`, `TimelineEventType`, etc.
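The merge and deduplication step performed by the timeline builder can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `rag/timeline.py`: the `Event` dataclass, the `(lap, event_type, driver)` deduplication key, and the rule that OpenF1 data wins over PDF-extracted data on a conflict are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Event:
    """Simplified stand-in for the project's TimelineEvent schema (assumed fields)."""
    lap: int
    event_type: str
    driver: Optional[str]
    description: str
    source: str  # "pdf" or "openf1"


def merge_timelines(pdf_events: Iterable[Event], api_events: Iterable[Event]) -> List[Event]:
    """Merge both sources, drop duplicates, and return events in lap order."""
    merged = {}
    # API events are processed last, so they overwrite PDF events that
    # share the same key (assumed conflict-resolution rule).
    for event in list(pdf_events) + list(api_events):
        key = (event.lap, event.event_type, event.driver)
        merged[key] = event
    return sorted(merged.values(), key=lambda e: (e.lap, e.event_type, e.driver or ""))


pdf = [
    Event(12, "pit_stop", "VER", "Verstappen pits for hard tyres", "pdf"),
    Event(20, "safety_car", None, "Safety car deployed", "pdf"),
]
api = [
    Event(12, "pit_stop", "VER", "Pit stop recorded", "openf1"),
    Event(5, "overtake", "HAM", "Position gained", "openf1"),
]
timeline = merge_timelines(pdf, api)
```

An exact-match key is the simplest possible choice; a real merger would likely need fuzzier matching, for example tolerating a one-lap difference between a race report and the telemetry.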
f1_race_intelligence/
├── ui_gradio.py          # Main Gradio web interface
├── server.py             # FastAPI MCP server
├── client.py             # MCP client
├── requirements.txt      # Python dependencies
├── pytest.ini            # Test configuration
│
├── openf1/               # OpenF1 API client
│   ├── __init__.py
│   └── api.py            # API client with caching & rate limiting
│
├── rag/                  # RAG pipeline components
│   ├── __init__.py
│   ├── app_service.py    # Main orchestration service
│   ├── timeline.py       # Timeline builder & merger
│   ├── schemas.py        # Pydantic models (TimelineEvent, etc.)
│   ├── ingest.py         # PDF parsing & chunking
│   ├── embed.py          # Text embeddings
│   ├── store.py          # Vector storage
│   ├── retrieve.py       # Similarity search
│   ├── llm.py            # Ollama LLM interface
│   ├── prompts.py        # LLM prompt templates
│   └── agent.py          # Agent orchestration
│
├── output/               # Generated outputs
│   ├── race_brief.json
│   └── race_brief.md
│
└── tests/                # Test files
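As a rough illustration of what the chunking and similarity-search modules in `rag/` are responsible for, the sketch below splits text into overlapping word chunks and ranks them against a query. It is deliberately simplified: a toy bag-of-words vector stands in for the Sentence Transformers embeddings the project uses, and the function names, chunk size, and overlap are assumptions rather than the project's actual values.

```python
import math
from collections import Counter


def chunk_text(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into chunks of `size` words, each sharing `overlap` words with the previous one."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    return [" ".join(words[i:i + size]) for i in range(0, max(len(words) - overlap, 1), step)]


def bow_vector(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a sentence embedding)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query: str, chunks: list[str], k: int = 3) -> list[tuple[float, str]]:
    """Return the k chunks most similar to the query, best first."""
    q = bow_vector(query)
    scored = [(cosine(q, bow_vector(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

Swapping `bow_vector` for a real embedding model changes the quality of the ranking but not the overall shape of the retrieval step.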
- Python 3.10+
- Ollama with the `llama3` model. In a separate terminal, run `brew install ollama` (Homebrew), then `ollama pull llama3` and `ollama serve`.
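Once the Ollama server is running, one way to confirm that the model responds is to send a prompt to Ollama's `/api/generate` endpoint. The sketch below uses only the standard library; the helper names are hypothetical, and this project's `rag/llm.py` may talk to Ollama differently.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Build a non-streaming generate request (hypothetical helper)."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "llama3", timeout: float = 60.0) -> str:
    """Send the prompt to a local Ollama server and return the generated text."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

Calling `generate("Say hello")` raises a `urllib.error.URLError` if the server is not reachable, which makes it a quick connectivity check before starting the app.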
- Navigate to the project folder (quote the path, since it contains spaces): `cd "path\to\Text Mining and NLP"`
- Activate the virtual environment: `.\.venv\Scripts\activate`

  If activation is blocked, run once: `Set-ExecutionPolicy -Scope CurrentUser -ExecutionPolicy RemoteSigned`
- Install dependencies (first run only): `python -m pip install -r f1_race_intelligence\requirements.txt`
Start the app with `cd f1_race_intelligence` followed by `python ui_gradio.py`, then open http://localhost:7860 (or the port shown in the terminal).
- Upload PDF: Go to the "Ingest" tab and upload a race document
- Build Timeline: Click "Build Timeline" to extract events and fetch OpenF1 data
- Explore: Use the "Timeline" tab to browse events with filters
- Visualize: Go to the "Visualization" tab to see the interactive chart
- Filter: Use the category filters (Race Control, Strategy, Session Info) to focus on specific event types
| Category | Events |
|---|---|
| Race Control | Safety Car, VSC, Red Flag, Yellow Flag, Incident |
| Strategy | Pit Stop, Stint Change, Pace Update, Overtake, Weather |
| Session Info | Starting Grid, Results, Position, Info |
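The grouping in this table could be modeled with an enum and a lookup, as in the sketch below. The `EventType` members mirror the labels in the table; the names, the `CATEGORIES` mapping, and both helpers are hypothetical, and the project's own `TimelineEventType` in `rag/schemas.py` may be defined differently.

```python
from enum import Enum


class EventType(Enum):
    """The 14 event types, labeled as in the table above (assumed names)."""
    SAFETY_CAR = "Safety Car"
    VSC = "VSC"
    RED_FLAG = "Red Flag"
    YELLOW_FLAG = "Yellow Flag"
    INCIDENT = "Incident"
    PIT_STOP = "Pit Stop"
    STINT_CHANGE = "Stint Change"
    PACE_UPDATE = "Pace Update"
    OVERTAKE = "Overtake"
    WEATHER = "Weather"
    STARTING_GRID = "Starting Grid"
    RESULTS = "Results"
    POSITION = "Position"
    INFO = "Info"


E = EventType
CATEGORIES = {
    "Race Control": {E.SAFETY_CAR, E.VSC, E.RED_FLAG, E.YELLOW_FLAG, E.INCIDENT},
    "Strategy": {E.PIT_STOP, E.STINT_CHANGE, E.PACE_UPDATE, E.OVERTAKE, E.WEATHER},
    "Session Info": {E.STARTING_GRID, E.RESULTS, E.POSITION, E.INFO},
}


def category_of(event_type: EventType) -> str:
    """Return the filter category an event type belongs to."""
    for name, members in CATEGORIES.items():
        if event_type in members:
            return name
    raise KeyError(event_type)


def filter_by_category(event_types, category: str):
    """Keep only the event types that fall under the given category."""
    allowed = CATEGORIES[category]
    return [t for t in event_types if t in allowed]
```

Keeping the category mapping separate from the enum means a UI filter can be added or regrouped without touching the event schema.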
| Layer | Technology | Purpose |
|---|---|---|
| Frontend | Gradio 6.x | Web UI with tabs for upload, timeline, visualization |
| Visualization | Plotly | Interactive timeline charts |
| API Server | FastAPI | MCP (Model Context Protocol) server |
| Data Validation | Pydantic | Schemas for TimelineEvent, EventType, etc. |
| LLM Runtime | Ollama (localhost:11434) | Local LLM inference |
| LLM Model | llama3 | Event extraction & text analysis |
| Embeddings | Sentence Transformers | Text vectorization |
| Vector Store | In-memory | Similarity search & Top-K retrieval |
| PDF Parsing | PyPDF / pdfplumber | Document text extraction & chunking |
| External API | OpenF1 API | Live telemetry, pit stops, race control data |
| Caching | In-memory | Rate limiting & API response caching |
| Language | Python 3.10+ | Core application runtime |
MIT License