# Relvy Log Analysis

Upload NDJSON log files and chat with them using AI. Ask questions like "Show me all errors" or "Find logs related to database connections."

## Prerequisites

- Docker installed
- Ports 5173 and 8000 free
- Pinecone Cloud API key
- Gemini API key

## 🚀 Quick Start

### 1. Set Up the Environment

Create `backend/.env` by copying `backend/.env.example`, then replace the placeholder API keys with your actual keys.

### 2. Run the App

```sh
docker compose up --build
```

### 3. Use the App

1. Open http://localhost:5173
2. Upload an NDJSON log file
3. Chat with your logs!

## 📝 Log File Format

NDJSON (Newline-Delimited JSON): one JSON object per line.

```json
{"timestamp": "2024-01-01T10:00:00Z", "level": "INFO", "message": "Application started"}
{"timestamp": "2024-01-01T10:01:00Z", "level": "ERROR", "message": "Database connection failed"}
```
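Parsing this format is deliberately simple: each non-empty line is decoded independently, so one malformed line can be reported with its line number. A minimal sketch (the `id`/`log` field names are illustrative, not the app's actual schema):

```python
import json
import uuid

def parse_ndjson(text: str) -> list[dict]:
    """Parse NDJSON text: one JSON object per non-empty line.

    Each entry gets a unique UUID, mirroring the per-line IDs the
    app assigns at upload time.
    """
    entries = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {lineno} is not valid JSON: {exc}") from exc
        entries.append({"id": str(uuid.uuid4()), "log": record})
    return entries
```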

## 🔍 Example Queries

- "Show me all error messages"
- "Find logs related to database connections"

## 🔧 How It Works

1. **Upload & Parsing**: NDJSON log files are parsed line by line; each line becomes a separate log entry with a unique UUID.
2. **Vector Embedding**: Each log entry is converted to a high-dimensional vector using Hugging Face's all-MiniLM-L6-v2 model, which captures the semantic meaning of the log content.
3. **Vector Storage**: Embeddings are stored in the Pinecone Cloud vector database, enabling fast similarity search across all your logs.
4. **Query Processing**: When you ask a question, it is also converted to a vector, and Pinecone finds the most similar log entries by semantic similarity.
5. **AI Analysis**: The retrieved logs are sent to a CrewAI agent powered by the Gemini LLM, which:
   - analyzes the logs in the context of your question,
   - provides a summary of what the logs reveal,
   - selects the 10-15 most relevant log entries, and
   - returns both the analysis and the selected logs for display.
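Steps 2-4 boil down to nearest-neighbor search over embedding vectors. In the app, all-MiniLM-L6-v2 produces the vectors and Pinecone does the ranking at scale; the sketch below is a self-contained in-memory stand-in (the bag-of-words `embed` is a toy, not the real model) that shows the same mechanics: embed each log, embed the query, rank by cosine similarity.

```python
import math
from collections import Counter

def embed(text: str) -> dict[str, float]:
    """Toy embedding: a bag-of-words term-frequency vector.

    Stand-in for all-MiniLM-L6-v2, which maps text to a dense
    384-dimensional vector; the cosine-ranking step is the same.
    """
    return dict(Counter(text.lower().split()))

def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b.get(t, 0.0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search(query: str, logs: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k logs most similar to the query (Pinecone's job)."""
    q = embed(query)
    ranked = sorted(logs, key=lambda log: cosine(q, embed(log)), reverse=True)
    return ranked[:top_k]
```

With a real model the vectors are dense, so semantically related phrasings ("DB timeout", "database connection failed") land close together even without shared words, which is exactly what the toy version cannot do.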

## 💰 Cost Analysis

**Model Pricing (as of 2025):**

- Embedding model: OpenAI text-embedding-3-small at $0.02 per 1M tokens
- LLM: Google Gemini 2.0 Flash at $0.30 per 1M input tokens, $2.50 per 1M output tokens

**Detailed Cost Calculation**

Scenario: 1M ingested logs + 1,000 queries.

- Each query fetches 100 entries from Pinecone
- The LLM processes those 100 entries per query
- The LLM outputs 50 selected logs per query

**Sample log entry** (a typical entry looks like this):

```json
{"attributes":null,"body":"GetSupportedCurrencies successful","fields":{"flags":1,"severity_number":9,"severity_text":"INFO","span_id":"0f82718e85e0313b","trace_id":"ff393823145c862cb9542cc3e3ea2a0e"},"instrumentation_scope":{"name":"currency","version":"1.19.0"},"meta":{"datastream_id":42620911,"ingestion_timestamp":1756854369940440776,"observation_kind":"otellogs","schema_version":"1.0","token_id":"ds1WHkTO4b2vt863hyAT"},"resource_attributes":{"deployment.environment":"otel","k8s.deployment.name":"currency","k8s.namespace.name":"default","k8s.node.name":"gke-otel-demo-default-pool-27217e98-9vqg","k8s.pod.ip":"10.112.10.5","k8s.pod.name":"currency-d666cf966-lrq7q","k8s.pod.start_time":"2025-09-02T04:06:15Z","k8s.pod.uid":"1e62aae5-80e4-49aa-99ae-12e93f74067a","service.name":"currency","service.namespace":"opentelemetry-demo","service.version":"2.0.1","telemetry.sdk.language":"cpp","telemetry.sdk.name":"opentelemetry","telemetry.sdk.version":"1.19.0"},"timestamp":"1756854369922128805"}
```

**Token count assumptions** (for the cost calculation):

- Average log entry: ~200 tokens
- Query text: ~20 tokens (a typical user question)
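Real token counts depend on the model's tokenizer, but a common rule of thumb is roughly 4 characters per token. A quick sketch for sanity-checking the ~200-token assumption against your own log files:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the common ~4 chars/token heuristic.

    Actual counts depend on the tokenizer; this is only for
    ballpark cost estimates, not billing-accurate numbers.
    """
    return max(1, len(text) // 4)

def estimate_ndjson_tokens(ndjson_text: str) -> int:
    """Total estimated tokens across all log lines in an NDJSON file."""
    return sum(
        estimate_tokens(line) for line in ndjson_text.splitlines() if line.strip()
    )
```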

### Step 1: Ingesting 1 Million Logs

| Component | Calculation | Tokens | Cost |
|---|---|---|---|
| Log embedding | 1M logs × 200 tokens | 200M tokens | $4.00 |

### Step 2: Cost per Query (LLM Analysis)

| Component | Calculation | Tokens | Cost |
|---|---|---|---|
| Query embedding | 1 query × 20 tokens | 20 tokens | $0.0000004 |
| LLM input | query (20) + instructions (~100) + 100 logs × 200 tokens | 20,120 tokens | $0.0060 |
| LLM output | 50 selected logs × 200 tokens | 10,000 tokens | $0.0250 |
| **Total per query** | | | **$0.0310** |

Note: each query runs the CrewAI agent with the Gemini LLM. Input: the user query plus 100 logs retrieved from Pinecone. Output: a summary plus 50 selected logs.

**Important:** LLM costs depend heavily on:

- **Input volume**: how many logs are retrieved from the Pinecone similarity search (currently 100)
- **Output volume**: how many logs the LLM is asked to return (currently 50)
- **Log complexity**: the token count per log affects both input and output costs
- **LLM model**: Gemini 2.0 Flash is used here; costs vary by model

### Cost for 1,000 Queries

| Component | Calculation | Cost |
|---|---|---|
| Query embedding | 1,000 queries × $0.0000004 | $0.0004 |
| LLM input | 1,000 queries × $0.0060 | $6.00 |
| LLM output | 1,000 queries × $0.0250 | $25.00 |
| **Total** | | **$31.00** |

**Final Cost Summary:**

- One-time ingestion: $4.00 (1M logs)
- Per-query cost: $0.0310
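The numbers above follow from a few lines of arithmetic. The sketch below reproduces them from the stated assumptions (prices and token counts as of the tables above), so you can swap in your own retrieval sizes or model prices to re-estimate:

```python
# Prices in dollars per 1M tokens (assumptions from the tables above).
EMBED_PRICE = 0.02    # OpenAI text-embedding-3-small
LLM_IN_PRICE = 0.30   # Gemini 2.0 Flash, input
LLM_OUT_PRICE = 2.50  # Gemini 2.0 Flash, output

LOG_TOKENS = 200      # assumed average tokens per log entry
QUERY_TOKENS = 20     # assumed tokens per user question
PROMPT_TOKENS = 100   # assumed LLM instruction overhead
RETRIEVED = 100       # logs fetched from Pinecone per query
RETURNED = 50         # logs the LLM is asked to return

def cost(tokens: float, price_per_million: float) -> float:
    """Dollar cost of processing `tokens` at a per-1M-token price."""
    return tokens / 1_000_000 * price_per_million

# One-time ingestion: embed every log once.
ingest = cost(1_000_000 * LOG_TOKENS, EMBED_PRICE)

# Per-query cost: embed the query, then one LLM call over the retrieved logs.
query_embed = cost(QUERY_TOKENS, EMBED_PRICE)
llm_input = cost(QUERY_TOKENS + PROMPT_TOKENS + RETRIEVED * LOG_TOKENS, LLM_IN_PRICE)
llm_output = cost(RETURNED * LOG_TOKENS, LLM_OUT_PRICE)
per_query = query_embed + llm_input + llm_output
```

Output volume dominates: at $2.50 per 1M output tokens, the 50 returned logs account for about 80% of each query's cost, so trimming `RETURNED` is the cheapest lever.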
