RAG System with Ollama & LangChain

πŸ“Œ Project Overview

This repository contains a Retrieval-Augmented Generation (RAG) implementation built with LangChain, ChromaDB, and Ollama.

The project demonstrates how a local LLM can answer user questions by retrieving relevant context from a small set of documents stored in a vector database.

This implementation focuses on core RAG concepts rather than full-scale optimization or sector-specific experimentation.

βš™οΈ Technologies Used

  • LangChain – RAG pipeline and LLM interface
  • ChromaDB – Vector store for similarity search
  • Ollama – Local LLM runtime
  • Ollama Embeddings – Vector embeddings for documents
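
For orientation, these components map onto Python imports roughly as follows (a minimal sketch assuming the langchain-community integration packages; exact module paths vary between LangChain releases):

```python
# Minimal import sketch (assumes the langchain-community integrations;
# module paths differ slightly across LangChain versions).
from langchain_community.embeddings import OllamaEmbeddings  # document/query embeddings via Ollama
from langchain_community.vectorstores import Chroma          # ChromaDB vector store wrapper
from langchain_community.llms import Ollama                  # local LLM served by the Ollama runtime
```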

🧠 How the RAG System Works

  1. A set of local documents is defined in the script
  2. Documents are converted into embeddings using OllamaEmbeddings
  3. Embeddings are stored in ChromaDB
  4. User questions are matched against the vector store
  5. The most relevant document is retrieved
  6. The retrieved context is injected into the LLM prompt
  7. The LLM generates an answer based on the retrieved context (see the sketch below)
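
A condensed sketch of these seven steps is shown below. It assumes the langchain-community wrappers and a locally pulled Ollama model (the model name, document text, and prompt wording are illustrative, not taken from the actual script):

```python
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_community.llms import Ollama

# Steps 1-3: embed the local documents and store them in ChromaDB
documents = ["Zeynep Col has lived in NYC for 10 years."]      # see Example Documents below
embeddings = OllamaEmbeddings(model="llama3")                  # model name is an assumption
vectorstore = Chroma.from_texts(documents, embedding=embeddings)

# Steps 4-5: retrieve the most relevant document for the user's question
question = "How long has Zeynep Col lived in NYC?"
context = vectorstore.similarity_search(question, k=1)[0].page_content

# Steps 6-7: inject the retrieved context into the prompt and generate an answer
prompt = f"Answer the question using only this context:\n{context}\n\nQuestion: {question}"
llm = Ollama(model="llama3")
print(llm.invoke(prompt))
```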

πŸ§ͺ Example Documents

  • Zeynep Col has lived in NYC for 10 years.
  • Zeynep Col is an imaginary LLM engineer in the movie 'The Matrix'.
  • New York City's subway system is the oldest in the world.
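
In the script these documents would be defined as a plain Python list (the variable name here is an assumption):

```python
# The toy knowledge base that gets embedded into ChromaDB.
documents = [
    "Zeynep Col has lived in NYC for 10 years.",
    "Zeynep Col is an imaginary LLM engineer in the movie 'The Matrix'.",
    "New York City's subway system is the oldest in the world.",
]
```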

Runtime Flow & Output

▢️ Runtime Flow

When the script is executed, the following steps occur in order:

  • Available Ollama models are listed from the local Ollama runtime
  • The user selects a model interactively
  • The user enters questions in a continuous loop
  • The system retrieves the most relevant document from ChromaDB
  • The retrieved context is injected into the LLM prompt
  • The LLM generates and prints the final response (see the sketch below)
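
A sketch of this interactive flow follows. It assumes the ollama CLI is on the PATH for listing models and rebuilds the vector store over the example documents; variable names and prompt wording are illustrative. The labeled prints also correspond to the output structure described in the next section.

```python
import subprocess
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_community.llms import Ollama

# List locally installed models (relies on the `ollama` CLI being on the PATH).
subprocess.run(["ollama", "list"], check=True)
model_name = input("Model to use: ").strip()

# Build the vector store over the example documents.
documents = [
    "Zeynep Col has lived in NYC for 10 years.",
    "Zeynep Col is an imaginary LLM engineer in the movie 'The Matrix'.",
    "New York City's subway system is the oldest in the world.",
]
vectorstore = Chroma.from_texts(documents, embedding=OllamaEmbeddings(model=model_name))
llm = Ollama(model=model_name)

# Continuous question loop: retrieve, build the prompt, generate, and print each stage.
while True:
    question = input("\nQuestion (empty to quit): ").strip()
    if not question:
        break
    context = vectorstore.similarity_search(question, k=1)[0].page_content
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    print("Question:", question)
    print("Retrieved context:", context)
    print("Prompt sent to LLM:\n" + prompt)
    print("Response:", llm.invoke(prompt))
```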

πŸ“Š Output Section

During execution, the program prints:

  • The user question
  • The retrieved RAG context
  • The final prompt sent to the LLM
  • The LLM-generated response

This output flow makes it easy to observe how retrieval affects the final answer.

πŸ“Š Output Image

(Screenshot of an example run; the image is included in the repository.)

🏁 Summary

This runtime-focused README documents the interactive behavior and output structure of the RAG system.
It complements the main README by explaining how the system behaves during execution and how retrieved context influences LLM responses.

🀝 Contributing

Contributions are welcome!

πŸ“‘ Contact

For any queries or collaborations, feel free to reach out!

🌐 GitHub: zeynepcol
πŸ‘€ LinkedIn: zeynep-col
