tobiashaab/NaiveRAG

NaiveRAG


🚀 Introduction

NaiveRAG is a simple Retrieval-Augmented Generation (RAG) pipeline created for educational purposes.

Future plans include:

  • Support for additional chunking strategies, such as recursive token splitting.
  • CLI args and a runnable bash script.

📦 Installation

1. Create a Virtual Environment

Using conda is recommended:

conda create -n naiverag python=3.12
conda activate naiverag

2. Install the Project

Install in editable mode:

pip install -e .

3. Set Environment Variables

Create a .env file in the project root:

API_KEY=...
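
For readers unfamiliar with .env files, here is a minimal sketch of how a KEY=VALUE file like the one above can be read into the process environment using only the standard library. The helper name `load_env_file` is hypothetical; this README does not say how NaiveRAG loads the file, and the project may well use a dedicated library instead.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines and export them to os.environ.

    Hypothetical helper for illustration; not part of NaiveRAG.
    """
    loaded: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return loaded
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip("\"'")  # drop optional surrounding quotes
        loaded[key] = value
        # Variables already set in the shell take precedence.
        os.environ.setdefault(key, value)
    return loaded


if __name__ == "__main__":
    load_env_file()
    print("API_KEY set:", "API_KEY" in os.environ)
```

Letting existing shell variables win over the file is a design choice made here so that a value exported in the terminal is not silently overwritten.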

4. Configure the Pipeline

Adjust the settings in config.yaml to match your setup.

5. Run the Pipeline

Once everything is set up, run the pipeline with:

python run.py

📚 Documentation

1. Project Structure

The actual RAG pipeline is located in the /rag_pipeline directory, which contains the following subdirectories:

  • /api – Implements wrappers for language model and embedding model APIs
  • /chunking – Contains strategies for chunking documents
  • /db – Contains vector store implementations

Each of these directories includes a base.py file that defines an abstract base class. Concrete implementations live in separate files (e.g., gemini.py for the Gemini API).

The correct implementation is automatically selected based on your settings in config.yaml.
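
The pattern described above (an abstract base class, concrete implementations, and selection driven by configuration) can be sketched roughly as follows. All names here are assumptions made for illustration: `BaseChunker`, `FixedSizeChunker`, `SentenceChunker`, `build_chunker`, and the `chunking`, `strategy`, and `params` config keys are not taken from the NaiveRAG source, and the actual classes and config.yaml schema may differ.

```python
from abc import ABC, abstractmethod


class BaseChunker(ABC):
    """Hypothetical stand-in for an abstract class defined in a base.py file."""

    @abstractmethod
    def chunk(self, text: str) -> list[str]:
        """Split a document into chunks."""


class FixedSizeChunker(BaseChunker):
    """Splits text into consecutive pieces of at most chunk_size characters."""

    def __init__(self, chunk_size: int = 200) -> None:
        if chunk_size <= 0:
            raise ValueError("chunk_size must be positive")
        self.chunk_size = chunk_size

    def chunk(self, text: str) -> list[str]:
        return [
            text[i : i + self.chunk_size]
            for i in range(0, len(text), self.chunk_size)
        ]


class SentenceChunker(BaseChunker):
    """Splits text naively on periods; each chunk keeps its trailing period."""

    def chunk(self, text: str) -> list[str]:
        return [part.strip() + "." for part in text.split(".") if part.strip()]


# Registry mapping a config value to an implementation (names are assumptions).
CHUNKERS: dict[str, type[BaseChunker]] = {
    "fixed_size": FixedSizeChunker,
    "sentence": SentenceChunker,
}


def build_chunker(config: dict) -> BaseChunker:
    """Instantiate the chunker named in an already-parsed config dictionary."""
    section = config["chunking"]
    name = section["strategy"]
    if name not in CHUNKERS:
        raise ValueError(f"Unknown chunking strategy: {name!r}")
    return CHUNKERS[name](**section.get("params", {}))


if __name__ == "__main__":
    config = {"chunking": {"strategy": "fixed_size", "params": {"chunk_size": 4}}}
    print(build_chunker(config).chunk("abcdefghij"))  # ['abcd', 'efgh', 'ij']
```

With a registry like this, supporting a new strategy means adding one class and one dictionary entry, while the calling code depends only on the abstract interface.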

Utility functions (e.g., for loading the config) are located in /util.
