Upload NDJSON log files and chat with them using AI. Ask questions like "Show me all errors" or "Find logs related to database connections."
- Docker installed
- Ports 5173 and 8000 should be free
- Pinecone Cloud API key
- Gemini API key
Create `backend/.env` by copying `backend/.env.example`, then replace the placeholder API keys with your actual keys.

- Run `docker compose up --build`
- Go to http://localhost:5173
- Upload an NDJSON log file
- Chat with your logs!
NDJSON (Newline-Delimited JSON) files contain one JSON object per line:

```json
{"timestamp": "2024-01-01T10:00:00Z", "level": "INFO", "message": "Application started"}
{"timestamp": "2024-01-01T10:01:00Z", "level": "ERROR", "message": "Database connection failed"}
```

Example questions:

- "Show me all error messages"
- "Find logs related to database connections"
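The upload step boils down to reading the file line by line and giving each entry an ID. A minimal sketch using only the standard library (`parse_ndjson` is a hypothetical helper for illustration, not the app's actual code):

```python
import json
import uuid

def parse_ndjson(text: str) -> list[dict]:
    """Parse NDJSON text: one JSON object per line, each tagged with a unique UUID."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        entry = json.loads(line)
        entry["id"] = str(uuid.uuid4())  # unique ID per log entry
        entries.append(entry)
    return entries

sample = (
    '{"timestamp": "2024-01-01T10:00:00Z", "level": "INFO", "message": "Application started"}\n'
    '{"timestamp": "2024-01-01T10:01:00Z", "level": "ERROR", "message": "Database connection failed"}\n'
)
logs = parse_ndjson(sample)
```

A line that is not valid JSON will raise `json.JSONDecodeError`; a production parser would likely skip or report such lines instead.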
1. Upload & Parsing: NDJSON log files are parsed line by line; each line becomes a separate log entry with a unique UUID.
2. Vector Embedding: Each log entry is converted to a high-dimensional vector using Hugging Face's `all-MiniLM-L6-v2` model, which captures the semantic meaning of the log content.
3. Vector Storage: Embeddings are stored in the Pinecone Cloud vector database, enabling fast similarity search across all your logs.
4. Query Processing: When you ask a question, it is also converted to a vector, and Pinecone finds the most similar log entries based on semantic similarity.
5. AI Analysis: The retrieved logs are sent to a CrewAI agent powered by the Gemini LLM, which:
   - Analyzes the logs in the context of your question
   - Provides a summary of what the logs reveal
   - Intelligently selects the 10-15 most relevant log entries
   - Returns both the analysis and selected logs for display
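The retrieval steps above reduce to nearest-neighbor search over embedding vectors. Here is a toy illustration of that idea with hand-made 3-dimensional vectors (real `all-MiniLM-L6-v2` embeddings have 384 dimensions, and Pinecone performs this ranking at scale; the vectors and messages below are invented for illustration only):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical embeddings standing in for real model output.
index = {
    "Database connection failed": [0.9, 0.1, 0.0],
    "Application started":        [0.1, 0.9, 0.1],
    "DB pool exhausted":          [0.8, 0.2, 0.1],
}
query_vector = [0.85, 0.15, 0.05]  # stand-in embedding of "database problems"

# Rank stored logs by similarity to the query, most similar first.
ranked = sorted(index, key=lambda msg: cosine_similarity(query_vector, index[msg]),
                reverse=True)
```

Both database-related messages score close to the query while "Application started" ranks last, which is the behavior the semantic search relies on.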
Model Pricing (as of 2025):

- Embedding Model: OpenAI `text-embedding-3-small` - $0.02 per 1M tokens
- LLM Model: Google Gemini 2.0 Flash - $0.30 per 1M input tokens, $2.50 per 1M output tokens
Detailed Cost Calculation:
Scenario: 1M ingested logs + 1,000 queries
- Each query fetches 100 entries from Pinecone
- LLM processes 100 entries per query
- LLM outputs 50 selected logs per query
Sample Log Entry (assuming a typical log like this):

```json
{"attributes":null,"body":"GetSupportedCurrencies successful","fields":{"flags":1,"severity_number":9,"severity_text":"INFO","span_id":"0f82718e85e0313b","trace_id":"ff393823145c862cb9542cc3e3ea2a0e"},"instrumentation_scope":{"name":"currency","version":"1.19.0"},"meta":{"datastream_id":42620911,"ingestion_timestamp":1756854369940440776,"observation_kind":"otellogs","schema_version":"1.0","token_id":"ds1WHkTO4b2vt863hyAT"},"resource_attributes":{"deployment.environment":"otel","k8s.deployment.name":"currency","k8s.namespace.name":"default","k8s.node.name":"gke-otel-demo-default-pool-27217e98-9vqg","k8s.pod.ip":"10.112.10.5","k8s.pod.name":"currency-d666cf966-lrq7q","k8s.pod.start_time":"2025-09-02T04:06:15Z","k8s.pod.uid":"1e62aae5-80e4-49aa-99ae-12e93f74067a","service.name":"currency","service.namespace":"opentelemetry-demo","service.version":"2.0.1","telemetry.sdk.language":"cpp","telemetry.sdk.name":"opentelemetry","telemetry.sdk.version":"1.19.0"},"timestamp":"1756854369922128805"}
```

Token Count Analysis (assumptions for cost calculation):
- Average log tokens: ~200 tokens (assumed for calculation)
- Query text: ~20 tokens (typical user question)
Step 1: Cost for ingesting 1 Million Logs
| Component | Calculation | Tokens | Cost |
|---|---|---|---|
| Log Embedding | 1M logs × 200 tokens | 200M tokens | $4.00 |
Step 2: Cost for 1 Query (LLM Analysis)
| Component | Calculation | Tokens | Cost |
|---|---|---|---|
| Query Embedding | 1 query × 20 tokens | 20 tokens | $0.0000004 |
| LLM Input | Query (20 tokens) + LLM Instructions (~100 tokens) + 100 logs (100 × 200 tokens) | 20,120 tokens | $0.0060 |
| LLM Output | 1 query × 50 logs × 200 tokens | 10,000 tokens | $0.0250 |
| Total per query | | | $0.0310 |
Note: each query runs a CrewAI agent with the Gemini LLM. Input: the user query plus 100 logs retrieved from Pinecone. Output: a summary plus 50 selected logs.
Important: LLM costs are highly dependent on:
- Input volume: How many logs you retrieve from Pinecone similarity search (currently 100 logs)
- Output volume: How many logs you want the LLM to return (currently 50 logs)
- Log complexity: Token count per log affects both input and output costs
- LLM Model: Using Google Gemini 2.0 Flash for analysis (costs vary by model)
Cost for 1,000 Queries
| Component | Calculation | Cost |
|---|---|---|
| Query Embedding | 1,000 queries × $0.0000004 | $0.0004 |
| LLM Input | 1,000 queries × $0.0060 | $6.00 |
| LLM Output | 1,000 queries × $0.0250 | $25.00 |
| Total for 1,000 queries | | $31.00 |
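The arithmetic in the tables above can be reproduced in a few lines. The prices and token counts are the assumptions stated earlier, not measured values:

```python
# Pricing assumptions from the tables above (USD per 1M tokens).
EMBED_PRICE = 0.02        # OpenAI text-embedding-3-small
LLM_INPUT_PRICE = 0.30    # Gemini 2.0 Flash input
LLM_OUTPUT_PRICE = 2.50   # Gemini 2.0 Flash output

TOKENS_PER_LOG = 200      # assumed average log size
QUERY_TOKENS = 20         # typical user question
INSTRUCTION_TOKENS = 100  # approximate agent prompt overhead
LOGS_RETRIEVED = 100      # fetched from Pinecone per query
LOGS_RETURNED = 50        # selected by the LLM per query

# One-time ingestion: embed 1M logs (200M tokens -> $4.00).
ingestion_cost = 1_000_000 * TOKENS_PER_LOG * EMBED_PRICE / 1_000_000

# Per-query cost: query embedding + LLM input + LLM output.
query_embed = QUERY_TOKENS * EMBED_PRICE / 1_000_000
llm_input = ((QUERY_TOKENS + INSTRUCTION_TOKENS
              + LOGS_RETRIEVED * TOKENS_PER_LOG)
             * LLM_INPUT_PRICE / 1_000_000)
llm_output = LOGS_RETURNED * TOKENS_PER_LOG * LLM_OUTPUT_PRICE / 1_000_000
per_query = query_embed + llm_input + llm_output  # ~ $0.0310
```

Scaling `per_query` by 1,000 reproduces the ~$31.00 total; the small difference from the table comes from rounding the per-query components before summing.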
Final Cost Summary:
- One-time ingestion: $4.00 (for 1M logs)
- Per query cost: $0.0310