Skip to content

Ashutosh9993/LOG-RAG

Repository files navigation

LogRAG – RAG-Powered Anomaly Detection from System Logs

LogRAG is a Retrieval-Augmented Generation (RAG)-based framework that helps identify and explain system anomalies and potential cyberattacks by analyzing log data using large language models (LLMs). It allows security teams to ask high-level questions about logs and receive intelligent, context-aware summaries of suspicious behavior.


🛡️ Key Features

  • 🚨 Detects anomalies like:
    • Repeated failed logins
    • Suspicious IP access patterns
    • Port scans, probing attempts, DDoS spikes
    • Abnormal command executions or privilege escalations
  • 🤖 Uses LLMs to explain logs in human-readable form
  • 🔍 Semantic log retrieval based on query context
  • 📂 Supports multi-source logs (e.g., auth logs, system logs, network traffic logs)
  • 🛠 Easily extendable for custom threat patterns

🔧 How It Works

  1. Log Ingestion
    Raw logs are collected from various sources (e.g., syslogs, firewall, SSH, nginx).

  2. Preprocessing
    Logs are parsed and normalized into a consistent structure.

  3. Embedding
    Logs are converted into vector embeddings using sentence transformers or OpenAI embeddings.

  4. Vector Indexing
    Stored in a vector database like FAISS for fast retrieval.

  5. Query + RAG

    • User asks: "Show unusual access patterns in the last 24 hours"
    • Top-k relevant logs are retrieved
    • An LLM (e.g., GPT) summarizes potential anomalies and flags risks

💡 Example Use Cases

  • 🔓 Detect brute-force login attempts
  • 🌐 Spot sudden traffic spikes from unknown IPs
  • 🐚 Find unauthorized shell commands
  • ⚙️ Analyze logs during an incident response

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages