| title | emoji | colorFrom | colorTo | sdk | sdk_version | app_file | pinned |
|---|---|---|---|---|---|---|---|
| Hybrid RAG | 📚 | blue | green | streamlit | 1.49.1 | app.py | false |
A lightweight hybrid RAG prototype combining retrieval and language model generation to answer queries over documents or knowledge sources.
Accessible as a Hugging Face Space: https://huggingface.co/spaces/polojuan/hybridrag
- Hybrid retrieval (semantic + keyword-based) over document corpus
- Connects retrieval output into an LLM prompt for answer generation
- Simple, modular architecture for easy experimentation
- Docker support for reproducible environments
- Python-based, minimal dependencies
- **Document ingestion & preprocessing**
  - Load documents (e.g. PDF, text)
  - Chunk / split into manageable passages
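The chunking step above can be sketched as a simple overlapping character window. The function name and parameters here are illustrative, not the repository's actual splitter:

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into overlapping character windows.

    Overlap keeps sentences that straddle a boundary retrievable
    from at least one chunk.
    """
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```

Real pipelines often split on sentence or token boundaries instead of raw characters, but the overlap idea is the same.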
- **Embedding & indexing**
  - Generate embeddings for chunks (e.g. with sentence-transformers)
  - Store embeddings in a vector index
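A minimal sketch of the embedding-and-index step. The toy hashing embedder stands in for a real model such as sentence-transformers, and brute-force cosine search stands in for a dedicated vector store; all names here are illustrative:

```python
import numpy as np

def embed(texts, dim=64):
    # Stand-in for a real embedding model (e.g. sentence-transformers):
    # hashes tokens into a fixed-size bag-of-words vector.
    vecs = np.zeros((len(texts), dim))
    for i, text in enumerate(texts):
        for tok in text.lower().split():
            vecs[i, hash(tok) % dim] += 1.0
    # L2-normalize so a dot product equals cosine similarity.
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    return vecs / np.clip(norms, 1e-9, None)

class VectorIndex:
    """Brute-force cosine-similarity index over embedded chunks."""

    def __init__(self, chunks):
        self.chunks = chunks
        self.matrix = embed(chunks)

    def search(self, query, k=3):
        scores = self.matrix @ embed([query])[0]
        top = np.argsort(scores)[::-1][:k]
        return [(self.chunks[i], float(scores[i])) for i in top]
```

Swapping in a real embedding model and an ANN index (e.g. FAISS) changes only `embed` and the storage, not the interface.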
- **Retrieval (hybrid)**
  - Semantic (vector) search
  - Keyword / sparse search (optional)
  - Fuse or rerank results
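One common way to fuse the semantic and keyword result lists is Reciprocal Rank Fusion (RRF). The repository may fuse or rerank differently, so treat this as an illustrative sketch:

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion over several ranked lists of doc ids.

    Each list contributes 1 / (k + rank) per document; documents that
    rank well in multiple lists rise to the top. k=60 is the value
    commonly used in the RRF literature.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

RRF needs only ranks, not scores, so it fuses vector and BM25-style results without any score normalization.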
- **Prompt construction & generation**
  - Construct a prompt combining the query and retrieved context
  - Send it to a language model (e.g. via the OpenAI API or a local LLM)
  - Return the answer
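Prompt construction can be as simple as numbering the retrieved passages ahead of the question. The template below is an assumption for illustration, not the app's actual prompt:

```python
def build_prompt(query, contexts):
    """Assemble a grounded-QA prompt from a query and retrieved chunks."""
    context_block = "\n\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(contexts, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the answer is not in the context, say so.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {query}\nAnswer:"
    )
```

The resulting string is what gets sent to the model; numbering the passages also makes it easy to ask the model to cite sources like `[2]`.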
- **(Optional) Postprocessing & filtering**
  - Clean up the output; optionally verify or validate it
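A sketch of what this optional step might look like: trimming whitespace plus a naive check that the answer overlaps the retrieved context. Both the function and the heuristic are illustrative assumptions, not the repository's implementation:

```python
def postprocess(answer, contexts):
    """Trim the model output and flag answers with no context overlap."""
    answer = answer.strip()
    context_text = " ".join(contexts).lower()
    # Naive grounding check: does any non-trivial answer token
    # appear in the retrieved context?
    grounded = any(
        tok in context_text
        for tok in answer.lower().split()
        if len(tok) > 4
    )
    return answer, grounded
```

Stronger validation (entailment models, citation checking) can replace the overlap heuristic without changing the call site.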
- Python 3.8+
- Docker (optional, for a containerized environment)
```bash
git clone https://github.com/palscruz23/rag.git
cd rag

# Install dependencies
pip install -r requirements.txt

# Run the Streamlit app
streamlit run app.py
```
```
├── app.py             # Main entry / API or UI driver
├── Dockerfile         # Docker configuration
├── requirements.txt   # Python dependencies
├── utils/             # Utility modules & helpers
└── README.md          # This documentation
```
Submodules under `utils/` may cover embedding, retrieval, prompt construction, and generation.
This repository is licensed under the MIT License. See the LICENSE file for more details.