🧠 DocuSage: Your Documentation Expert

Ask complex questions, get expert answers. DocuSage is an AI-powered chat agent that learns any technical documentation you provide and becomes a specialized assistant, helping you debug code, understand concepts, and build faster.

📖 Description

Have you ever been stuck trying to find a specific piece of information in dense, multi-page documentation? DocuSage solves this by leveraging a powerful AI architecture called Retrieval-Augmented Generation (RAG).

You provide a URL to a documentation website. DocuSage reads, processes, and indexes that content into a specialized knowledge base. You can then ask questions in natural language, and the AI will synthesize answers, grounded in the provided documentation, to help you with your development tasks. The inspiration for this project came when I was stuck on Qiskit errors!

✨ Features

Web Crawler: Scrapes and cleans content from a starting URL, then follows same-domain links to relevant sub-pages.
AI-Powered Search: Uses vector embeddings to search for information by semantic meaning, not just keywords.
Expert Q&A: Leverages Google's Gemini LLM to understand your questions, correct syntax errors, and provide expert-level answers.
Trustworthy & Transparent: The "Show Sources" feature displays the exact text excerpts the AI used to generate its answer.
Streaming Responses: Answers appear word-by-word for a fast, interactive experience.
Modern UI: A clean, dark-themed, and easy-to-use interface built with Gradio.
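
As an illustration of the streaming behaviour listed above: Gradio chat handlers can be written as generator functions that yield progressively longer partial answers. The sketch below shows only that pattern in plain Python. The function name `stream_words` is hypothetical and is not taken from `app.py`, and a real handler would yield tokens coming from the LLM rather than splitting a finished string.

```python
def stream_words(answer):
    """Yield the answer word by word, each time returning all text shown so far.

    Hypothetical helper: a UI that re-renders on every yielded value
    will appear to "type" the answer out.
    """
    shown = []
    for word in answer.split():
        shown.append(word)
        yield " ".join(shown)


if __name__ == "__main__":
    for partial in stream_words("Answers appear word by word"):
        print(partial)
```

Yielding the accumulated text, rather than only the newest word, lets the UI replace the message on each update instead of tracking state itself.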

⚙️ How It Works (The RAG Pipeline)

DocuSage is built on a Retrieval-Augmented Generation (RAG) pipeline:

Scrape & Crawl: The user provides a URL. The application scrapes the content and follows same-domain links to build a comprehensive text corpus.
Chunk: The collected text is broken down into smaller, manageable chunks.
Embed & Index: Each chunk is converted into a numerical representation (an embedding) using a SentenceTransformer model. These embeddings are stored in a ChromaDB vector database.
Retrieve: When a user asks a question, it's also converted into an embedding. The vector database performs a similarity search to find the most relevant text chunks from the documentation.
Generate: The user's question and the retrieved chunks are passed to the Gemini LLM with a specialized prompt. The LLM then generates a high-quality, context-aware answer.
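
The crawl, chunk, and retrieve steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code in `app.py` or `utils.py`: all function names are hypothetical, the chunk sizes are arbitrary, and `toy_embed` is a bag-of-words stand-in for the SentenceTransformer model, with a sorted list standing in for the ChromaDB similarity search.

```python
import math
import re
from collections import Counter
from urllib.parse import urljoin, urlparse


def same_domain_links(base_url, hrefs):
    """Resolve hrefs against base_url and keep unique links on the same host."""
    host = urlparse(base_url).netloc
    seen, kept = set(), []
    for href in hrefs:
        absolute = urljoin(base_url, href).split("#", 1)[0]  # drop fragments
        if urlparse(absolute).netloc == host and absolute not in seen:
            seen.add(absolute)
            kept.append(absolute)
    return kept


def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into character windows that share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop early enough that the last window is not a subset of the previous one.
    last_start = max(len(text) - overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]


def toy_embed(text):
    """Stand-in for a real embedding model: a word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = toy_embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, toy_embed(c)), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    docs = [
        "Install the package with pip.",
        "A quantum circuit applies gates to qubits.",
        "Gradio builds the user interface.",
    ]
    print(retrieve("How do I apply gates to a circuit?", docs, k=1))
```

The overlap between chunks is a common way to avoid cutting a sentence off from its context at a chunk boundary. The retrieved chunks are what would be passed to the LLM alongside the question, and shown to the user as sources.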

🛠️ Tech Stack

Backend: Python
AI & ML:
- LLM: Google Gemini 1.5 Flash
- Embeddings: sentence-transformers
- Vector Database: ChromaDB
- Framework: langchain
Web Scraping: requests, trafilatura, BeautifulSoup4
UI: Gradio

🚀 Setup and Installation

Clone the repository:

git clone https://github.com/Bhavya445/docu-sage.git
cd docu-sage

Create and activate a virtual environment:

python3 -m venv venv
source venv/bin/activate

Install the required libraries:
```
pip install -r requirements.txt
```
Set up your Google AI API Key:
- Get your key from Google AI Studio.
- Set it as an environment variable in your terminal:
```
export GOOGLE_API_KEY='YOUR_API_KEY'
```
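
How the application reads the key is not shown in this README. A minimal sketch, assuming the key is read from the `GOOGLE_API_KEY` environment variable at startup, is to fail early with a clear message when it is missing. The helper name `load_api_key` is hypothetical.

```python
import os


def load_api_key(env=os.environ):
    """Return the Google AI API key, or raise if it has not been exported.

    Hypothetical helper; `env` is injectable so the check can be tested
    without touching the real environment.
    """
    key = env.get("GOOGLE_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GOOGLE_API_KEY is not set. Export it before running app.py."
        )
    return key
```

Checking once at startup surfaces a missing key immediately, rather than at the first question sent to the LLM.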

▶️ How to Run

With your environment set up and the API key exported, start the application:

python app.py

Open your web browser and navigate to the local URL provided in the terminal (usually http://127.0.0.1:7860).
