A production-ready intelligent knowledge management system using LlamaIndex that enables organizations to query, analyze, and extract insights from multiple data sources (documents, databases, APIs, web content) through natural language.
- Multi-Source Data Ingestion: Documents (PDF, Word, Markdown), databases, APIs, web content
- Advanced Query Engines: Sub-question decomposition, SQL generation, router query engine, hybrid search
- Hybrid Search: Vector similarity + keyword search for better retrieval
- Query Analytics: Track query patterns, popular topics, knowledge gaps
- Multi-Tenant Support: Organization-level data isolation
- Real-Time Updates: Incremental indexing, document versioning
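The hybrid search feature blends vector similarity with keyword relevance. As a rough illustration of the idea (not the project's actual implementation), a weighted score-fusion step might look like this, where the function name, `alpha` weight, and score dictionaries are all hypothetical:

```python
def fuse_scores(vector_hits, keyword_hits, alpha=0.7):
    """Blend vector-similarity and keyword (e.g. BM25) scores.

    vector_hits / keyword_hits: dicts mapping doc_id -> score in [0, 1].
    alpha weights the vector side; all names here are illustrative.
    """
    doc_ids = set(vector_hits) | set(keyword_hits)
    fused = {
        doc_id: alpha * vector_hits.get(doc_id, 0.0)
        + (1 - alpha) * keyword_hits.get(doc_id, 0.0)
        for doc_id in doc_ids
    }
    # Highest fused score first
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

ranked = fuse_scores({"doc1": 0.9, "doc2": 0.4}, {"doc2": 0.8, "doc3": 0.5})
```

With the sample scores above, `doc1` ranks first because its strong vector score outweighs `doc2`'s keyword match at the default weighting.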
- LlamaIndex: Data indexing, query engines, RAG
- Gemini: LLM for query understanding and generation
- FastAPI: REST API backend
- Streamlit: Web UI for querying and management
- ChromaDB/Pinecone: Vector storage
- PostgreSQL: Metadata and analytics storage
- Redis: Query caching
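Redis backs query caching: repeated questions are served from a read-through cache keyed on a hash of the query. A dependency-free sketch of that pattern (the real system would use a Redis client with TTLs; the class and method names here are illustrative):

```python
import hashlib
import time

class QueryCache:
    """Read-through cache; an in-memory stand-in for Redis in this sketch."""

    def __init__(self, ttl_seconds=300):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    def _key(self, query: str, org_id: str) -> str:
        # Namespace keys by organization to preserve tenant isolation
        return hashlib.sha256(f"{org_id}:{query}".encode()).hexdigest()

    def get_or_compute(self, query, org_id, compute):
        key = self._key(query, org_id)
        hit = self._store.get(key)
        if hit and hit[0] > time.time():
            return hit[1]  # cache hit: skip the expensive query pipeline
        value = compute(query)
        self._store[key] = (time.time() + self.ttl, value)
        return value

cache = QueryCache()
answer = cache.get_or_compute("What is X?", "org_123", lambda q: f"answer to {q}")
```

Including the organization ID in the cache key matters for the multi-tenant isolation listed above: two tenants asking the same question must never share a cached answer.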
cd enterprise_knowledge_base
pip install -r requirements.txt
Create a .env file in the project root:
# Copy the example
cp .env.example .env
# Edit .env and add your API keys
GEMINI_API_KEY=your_GEMINI_API_KEY_here
GEMINI_MODEL=gemini-3-flash-preview
In one terminal:
cd enterprise_knowledge_base
python -m uvicorn backend.api.main:app --reload --port 8000
The API will be available at http://localhost:8000
In another terminal:
cd enterprise_knowledge_base
streamlit run frontend/app.py --server.port 8501
The UI will be available at http://localhost:8501
- Open the Streamlit UI at http://localhost:8501
- Go to the "Ingest Documents" tab
- Upload a PDF or document
- Go to "Query" tab
- Ask a question about your document
User Query
↓
[Query Router] → Determines query type
↓
├─→ [Document Query Engine] → RAG over documents
├─→ [SQL Query Engine] → Natural language to SQL
├─→ [API Query Engine] → Query external APIs
└─→ [Hybrid Query Engine] → Combines multiple sources
↓
[Response Synthesizer] → Combines results
↓
[Citation Generator] → Adds source citations
↓
Response + Sources
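The router above dispatches each query to a single engine. A keyword-heuristic stand-in can convey the idea; the actual project presumably uses an LLM-based selector (such as LlamaIndex's router query engine), and the matching rules below are purely illustrative:

```python
def route_query(query: str) -> str:
    """Pick a query engine for a user query by simple heuristics.

    Engine names mirror the architecture diagram; the keyword rules
    are illustrative, not the project's real routing logic.
    """
    q = query.lower()
    if any(w in q for w in ("revenue", "top", "count", "average")):
        return "sql"        # natural-language-to-SQL engine
    if any(w in q for w in ("weather", "stock price", "live")):
        return "api"        # external API engine
    if "compare" in q or " and " in q:
        return "hybrid"     # combine multiple sources
    return "document"       # default: RAG over documents

engine = route_query("What are the top 10 customers by revenue?")
```

An LLM-based router replaces these brittle keyword checks with a prompt that asks the model to choose among engine descriptions, which generalizes far better.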
enterprise_knowledge_base/
├── backend/
│ ├── core/ # Core LlamaIndex setup
│ ├── engines/ # Query engines
│ ├── ingestion/ # Data ingestion
│ ├── api/ # FastAPI endpoints
│ ├── models/ # Database models
│ └── utils/ # Utilities
├── frontend/ # Streamlit UI
├── data/ # Data storage
└── tests/ # Tests
from backend.core.knowledge_base import KnowledgeBase
kb = KnowledgeBase()
response = kb.query("What are the key features of our product?")
print(response)

kb.ingest_document("path/to/document.pdf", organization_id="org_123")

response = kb.query_sql("What are the top 10 customers by revenue?")
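The `KnowledgeBase` calls shown above imply an interface roughly like the following stub. The signatures and `QueryResponse` shape are inferred from the examples, not taken from the actual backend/core/knowledge_base.py:

```python
from dataclasses import dataclass, field

@dataclass
class QueryResponse:
    answer: str
    sources: list = field(default_factory=list)  # citations from the synthesizer

class KnowledgeBase:
    """Minimal stub mirroring the usage examples; hypothetical interface."""

    def __init__(self):
        self._docs = {}  # organization_id -> list of ingested document paths

    def ingest_document(self, path: str, organization_id: str) -> None:
        # Real implementation: parse, chunk, embed, and index the file
        self._docs.setdefault(organization_id, []).append(path)

    def query(self, question: str) -> QueryResponse:
        # Real implementation: route to a query engine and synthesize an answer
        return QueryResponse(answer=f"[stub answer for: {question}]")

    def query_sql(self, question: str) -> QueryResponse:
        # Real implementation: translate the question to SQL and execute it
        return QueryResponse(answer=f"[stub SQL result for: {question}]")

kb = KnowledgeBase()
kb.ingest_document("report.pdf", organization_id="org_123")
```

Keeping ingestion keyed by `organization_id` at the interface level is what makes the multi-tenant isolation above enforceable from a single entry point.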
- Enterprise Knowledge Management: Centralized knowledge base for organizations
- Document Q&A: Natural language querying of documents
- Data Integration: Query across multiple data sources
- RAG Applications: Retrieval-augmented generation systems
- Multi-Tenant Knowledge: Organization-level data isolation
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
This project is licensed under the MIT License - see the LICENSE file for details.