dataelvisliang/SearchGPT

Relevance Search - AI-Powered Search Engine

An intelligent search engine that combines real-time web search with AI-powered answer generation through the OpenRouter API.

🌐 Live Demo | Built on the excellent work of Wilson-ZheLin/SearchGPT

✨ Features

  • 🔍 Real-time Web Search via Serper (Google API)
  • 🤖 AI-Powered Answers using OpenRouter (multiple free models available)
  • 🌐 Beautiful Streamlit Interface with Lottie animations
  • 📚 Semantic Search with ChromaDB vector database
  • 🎯 Smart Document Retrieval using text-embedding-3-small or Gitee BGE-M3
  • 🔗 Source Citations with clickable references
  • 🌍 Multi-language Support (auto-detects Chinese/English)
  • ⚡ Multi-threaded Web Scraping for fast content extraction
  • 💾 Export Results as TXT or JSON
  • 🎨 No LangChain Required - lightweight and fast
  • 📊 Complete Pipeline Tracing - see every step with timing, API calls, and similarity scores
  • 🔎 Full Prompt Visibility - inspect exactly what's sent to the LLM
  • ⏱️ Performance Metrics - track time spent on each pipeline step
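To illustrate the multi-threaded scraping feature listed above, here is a minimal sketch using only the standard library. It is not the code in `src/fetch_web_content.py`; the names `scrape_all` and `fake_fetch` are hypothetical, and the fetch function is injected so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, Optional


def scrape_all(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    max_workers: int = 8,
) -> Dict[str, Optional[str]]:
    """Fetch every URL concurrently; a page that fails maps to None."""
    results: Dict[str, Optional[str]] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its URL so results can be labelled.
        futures = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception:
                # One bad page should not abort the whole batch.
                results[url] = None
    return results


def fake_fetch(url: str) -> str:
    """Hypothetical stand-in for an HTTP request plus HTML-to-text step."""
    if "broken" in url:
        raise ValueError(f"cannot fetch {url}")
    return f"content of {url}"


pages = scrape_all(["https://a.example", "https://broken.example"], fake_fetch)
```

In a real run, `fake_fetch` would be replaced by a function that downloads the page and extracts its text (for example with BeautifulSoup4, which this project lists among its dependencies).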

🚀 Quick Start

Prerequisites

Installation

  1. Clone the repository
git clone <your-repo-url>
cd "Relevance Search"
  2. Install dependencies
pip install -r requirements.txt
  3. Configure API Keys

You can either:

  • Enter them in the Streamlit UI when running the app, OR
  • Save them in src/config/config.yaml:
model_name: x-ai/grok-4.1-fast:free
openrouter_api_key: "your-openrouter-key-here"
serper_api_key: "your-serper-key-here"

Running the Application

Streamlit Web Interface (Recommended):

streamlit run app.py

Command Line:

python src/main.py

🎯 Available Models

All models listed below are free to use via OpenRouter:

  • amazon/nova-2-lite-v1:free - Amazon's Nova 2 Lite model (default)
  • nvidia/nemotron-nano-9b-v2:free - NVIDIA's efficient 9B model
  • qwen/qwen3-4b:free - Alibaba's Qwen3 4B compact model
  • alibaba/tongyi-deepresearch-30b-a3b:free - Alibaba's research-focused 30B model

📁 Project Structure

Relevance Search/
├── app.py                      # Streamlit web interface
├── src/
│   ├── main.py                # CLI entry point
│   ├── fetch_web_content.py   # Web scraping with multi-threading
│   ├── serper_service.py      # Serper API integration
│   ├── retrieval.py           # Vector database & embeddings
│   ├── llm_answer.py          # AI answer generation
│   ├── llm_service.py         # OpenRouter API service
│   ├── text_utils.py          # Text processing utilities
│   └── config/
│       └── config.yaml        # Configuration file
├── requirements.txt           # Python dependencies
└── README.md                  # This file

🔧 Configuration

The src/config/config.yaml file supports:

  • model_name: AI model to use
  • openrouter_api_key: Your OpenRouter API key
  • serper_api_key: Your Serper API key
  • template: Custom prompt template for AI responses
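The four keys above can be modelled as a small typed object once the YAML file has been parsed into a dictionary. This is a sketch, not the project's loader: the class name `SearchConfig` is hypothetical, and treating the first three keys as required and `template` as optional is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class SearchConfig:
    model_name: str
    openrouter_api_key: str
    serper_api_key: str
    template: Optional[str] = None  # assumed optional; not confirmed by the project

    @classmethod
    def from_dict(cls, raw: Dict[str, Any]) -> "SearchConfig":
        """Validate a parsed config mapping and report all missing keys at once."""
        required = ("model_name", "openrouter_api_key", "serper_api_key")
        missing = [key for key in required if not raw.get(key)]
        if missing:
            raise ValueError(f"missing config keys: {', '.join(missing)}")
        return cls(
            model_name=raw["model_name"],
            openrouter_api_key=raw["openrouter_api_key"],
            serper_api_key=raw["serper_api_key"],
            template=raw.get("template"),
        )


config = SearchConfig.from_dict(
    {
        "model_name": "amazon/nova-2-lite-v1:free",
        "openrouter_api_key": "your-openrouter-key-here",
        "serper_api_key": "your-serper-key-here",
    }
)
```

Failing early with a list of missing keys tends to be friendlier than an authentication error surfacing later, mid-pipeline.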

📸 Screenshots

Main Interface

Relevance Search Interface (assets/Relevance Search.png)

Pipeline Trace - Search & Scraping

Pipeline Trace 1

Pipeline Trace - Embeddings & Retrieval

Pipeline Trace 2

Pipeline Trace - Chunks & Similarity Scores

Pipeline Trace 3

Pipeline Trace - Full Prompt & Generation

Pipeline Trace 4

📖 Usage Examples

Via Streamlit UI

  1. Launch the app: streamlit run app.py
  2. Enter your API keys in the sidebar (or use saved keys)
  3. Select your preferred AI model
  4. Choose an AI profile (Researcher, Technical Expert, etc.)
  5. Enter your search query
  6. Click "🚀 Search"

Via Command Line

Edit the query in src/main.py and run:

python src/main.py

🛠️ Key Technologies

  • OpenRouter API - Access to multiple AI models
  • Serper API - Fast Google search results
  • ChromaDB - Vector database for semantic search
  • Streamlit - Modern web interface
  • BeautifulSoup4 - Web scraping
  • text-embedding-3-small - Efficient text embeddings

🔍 How It Works

  1. Search: Query sent to Serper API for real-time web results
  2. Scrape: Multi-threaded extraction of content from top results
  3. Embed: Text chunked and converted to vector embeddings (OpenRouter or Gitee AI)
  4. Retrieve: Semantic search finds most relevant content with similarity scores
  5. Generate: AI model creates comprehensive answer with citations
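Steps 3 to 5 above can be sketched end to end with the standard library. This is a toy illustration under stated assumptions: the bag-of-words `embed` function stands in for a real embedding model such as text-embedding-3-small, a plain list stands in for ChromaDB, and all function names are hypothetical rather than taken from `src/retrieval.py`.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def chunk_text(text: str, size: int = 40, overlap: int = 10) -> List[str]:
    """Split text into overlapping windows of `size` words (size > overlap)."""
    words = text.split()
    if not words:
        return []
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real run would call an embedding API."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Return the k chunks most similar to the query, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(chunk)), chunk) for chunk in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


def build_prompt(query: str, hits: List[Tuple[float, str]]) -> str:
    """Number each retrieved chunk so the model can cite it as [n]."""
    context = "\n".join(
        f"[{i}] {chunk}" for i, (_, chunk) in enumerate(hits, start=1)
    )
    return f"Answer using the numbered sources.\n\n{context}\n\nQuestion: {query}"


chunks = [
    "ChromaDB stores vector embeddings",
    "Streamlit renders the web interface",
    "bananas are yellow",
]
hits = retrieve("vector embeddings database", chunks, k=2)
prompt = build_prompt("vector embeddings database", hits)
```

The similarity scores returned by `retrieve` play the same role as the per-chunk scores the app displays, although real embedding scores will differ from these word-overlap values.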

Pipeline Tracing

Relevance Search provides complete visibility into every step of the RAG pipeline:

  • Step 1 - Search: See all URLs, titles, and snippets returned by Serper
  • Step 2 - Scraping: Track success/failure for each page with content previews
  • Step 3 - Embeddings: View API calls made, timing, and chunks processed
  • Step 4 - Retrieval: Inspect similarity scores for each retrieved chunk (color-coded by relevance)
  • Step 5 - Generation: See the exact prompt sent to the LLM and all context chunks

All steps include timing information to help identify bottlenecks and optimize performance.
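One common way to collect per-step timing like this is a context manager that records the elapsed time for each named step. The sketch below is an assumption about how such tracing could be structured, not the app's actual implementation; `PipelineTrace` is a hypothetical name.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator, List


class PipelineTrace:
    """Collects one record per pipeline step, including elapsed seconds."""

    def __init__(self) -> None:
        self.steps: List[Dict[str, object]] = []

    @contextmanager
    def step(self, name: str) -> Iterator[Dict[str, object]]:
        record: Dict[str, object] = {"name": name}
        start = time.perf_counter()
        try:
            # The caller may attach extra details (URLs, scores, prompt text).
            yield record
        finally:
            # Record the timing even if the step raised an exception.
            record["seconds"] = time.perf_counter() - start
            self.steps.append(record)

    def slowest(self) -> str:
        """Name of the step that took the longest, i.e. the likely bottleneck."""
        return max(self.steps, key=lambda s: s["seconds"])["name"]


trace = PipelineTrace()
with trace.step("search") as rec:
    rec["urls"] = 3
with trace.step("scrape") as rec:
    time.sleep(0.05)  # simulate slow page downloads
```

Calling `trace.slowest()` after a run points at the step worth optimizing first.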

📝 Logging

Logging is built in. The terminal shows detailed logs for:

  • Serper API requests and responses
  • Web scraping progress (per thread)
  • Embedding generation status
  • AI model responses
  • Error diagnostics
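For readers who want similar terminal output in their own scripts, here is a minimal logging setup that includes the thread name, which helps when following per-thread scraping progress. The logger name `relevance_search` and the format string are assumptions, not the project's actual configuration.

```python
import logging


def configure_logging(level: int = logging.INFO) -> logging.Logger:
    """Send timestamped, thread-labelled log lines to the terminal."""
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s [%(threadName)s] %(name)s: %(message)s",
        force=True,  # replace any handlers set earlier (Python 3.8+)
    )
    return logging.getLogger("relevance_search")  # hypothetical logger name


log = configure_logging()
log.info("Serper returned %d results", 10)
log.debug("this line is hidden at INFO level")
```

Passing `logging.DEBUG` to `configure_logging` would surface the more verbose messages as well.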

🤝 Contributing

Contributions are welcome! Feel free to:

  • Report bugs
  • Suggest features
  • Submit pull requests

📄 License

This project is licensed under the MIT License.

🙏 Acknowledgments

This project is built on the foundation of Wilson-ZheLin/SearchGPT. Significant enhancements include:

  • Migration from OpenAI to OpenRouter API
  • Removal of LangChain dependencies
  • Addition of Streamlit web interface
  • Modern ChromaDB integration
  • Comprehensive logging system
  • Updated embedding models

⭐ Star This Repo

If you find this project useful, please give it a star! ⭐
