Local RAG Demo

This project is an extension of the Local RAG agent with LLaMA3 tutorial.

This is a demo project showcasing the use of Langchain in a RAG context. It relies on an Ollama server running a Llama 3 LLM.

Building blocks:

  • Ollama serving the local Llama 3 model
  • Langchain orchestrating the RAG pipeline
  • Nomic embedding model for document embeddings
  • Tavily for the web search functionality
  • Streamlit for the web UI
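
As a rough illustration of how these pieces fit together, the sketch below wires the Nomic embeddings, a small vector store and an Ollama-hosted model into a Langchain RAG chain. It is a minimal sketch only, assuming the langchain-ollama, langchain-nomic, langchain-community and langchain-text-splitters packages; the example documents and prompt are made up here and do not necessarily match the actual app.

import os

from langchain_community.vectorstores import SKLearnVectorStore
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_nomic import NomicEmbeddings
from langchain_ollama import ChatOllama
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Local embeddings, same model family as NOMIC_EMBEDDING_MODEL
embeddings = NomicEmbeddings(model="nomic-embed-text-v1.5", inference_mode="local")

# Tiny in-memory corpus, split and indexed in a scikit-learn backed vector store
texts = ["Ollama serves local LLMs over an HTTP API.", "Streamlit renders the chat UI."]
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
store = SKLearnVectorStore.from_documents(splitter.create_documents(texts), embeddings)
retriever = store.as_retriever(search_kwargs={"k": 2})

# Chat model served by the Ollama container (LLM_HOST / LLM_MODEL)
llm = ChatOllama(base_url=os.getenv("LLM_HOST", "http://localhost:11434"),
                 model=os.getenv("LLM_MODEL", "llama3.2:3b-instruct-fp16"))

prompt = ChatPromptTemplate.from_template(
    "Answer the question using only this context:\n{context}\n\nQuestion: {question}"
)
chain = prompt | llm | StrOutputParser()

question = "What serves the local LLM?"
context = "\n\n".join(doc.page_content for doc in retriever.invoke(question))
print(chain.invoke({"context": context, "question": question}))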

Running locally

  • The app requires a running local Ollama server
  • See .vscode/launch.json for running the RAG app locally. The environment variables to set are the same as in Environment variables below. The entrypoint is streamlit run streamlit_app.py (see Running your streamlit app in the Streamlit documentation)

Running in containers

Via compose

docker-compose -f .\Dockerfiles\compose.yaml -p "local-rag-stack" up

Manually

Ollama agent:

Build the image, then run the container on a dedicated network (create it first with docker network create rag-network):

docker build -t llama32-3b -f Dockerfiles/ollama.dockerfile Dockerfiles

docker run -d \
--name ollama \
--gpus=all \
--network=rag-network \
--restart=always \
-p 11434:11434 \
-v ollama:/root/.ollama \
--stop-signal=SIGKILL \
llama32-3b
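
To check that the Ollama container is up before starting the app, a quick probe of its HTTP API can help. This is an illustrative snippet only; it queries the standard /api/tags endpoint, which lists the models the server has pulled.

import json
import urllib.request

# Ask the Ollama server which models it has pulled (default port 11434)
with urllib.request.urlopen("http://localhost:11434/api/tags") as response:
    models = json.load(response)["models"]

print([m["name"] for m in models])  # should include llama3.2:3b-instruct-fp16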

RAG App:

docker build -t local-rag -f Dockerfiles/local_rag.dockerfile .

docker run -d \
--name local-rag \
--gpus=all \
--network=rag-network `# use --network="host" instead if the Ollama server runs directly on the host, at its localhost` \
--restart=always \
--security-opt=label=disable \
-p 8501:8501 \
-e GPU_DEVICE='cuda' \
-e NOMIC_EMBEDDING_MODEL='nomic-embed-text-v1.5' \
-e TAVILY_API_KEY='<Tavily_API_Key_goes_here>' \
-e LLM_HOST='<LLM_host_url_goes_here>' \
-e LLM_MODEL='<Model_name_goes_here>' \
local-rag

The server runs at http://localhost:8501

Environment variables

  • GPU_DEVICE: target device on which to run the Nomic embedding model (cuda in the example above)
  • NOMIC_EMBEDDING_MODEL: the Nomic embedding model version (see Hugging Face)
  • TAVILY_API_KEY: your Tavily API key for the web search functionality
  • LLM_HOST: host and port of the server running your model (http://localhost:11434 by default)
  • LLM_MODEL: name of the model running on the Ollama instance (llama3.2:3b-instruct-fp16 if using the provided ollama.dockerfile)
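
These are plain process environment variables read at startup. The snippet below only illustrates how they are typically consumed; the default values and the TavilySearchResults tool from langchain-community are assumptions and may not match streamlit_app.py exactly. TAVILY_API_KEY must be set in the environment before the web search call will work.

import os
from langchain_community.tools.tavily_search import TavilySearchResults

# The app's configuration comes from the process environment
llm_host = os.getenv("LLM_HOST", "http://localhost:11434")
llm_model = os.getenv("LLM_MODEL", "llama3.2:3b-instruct-fp16")

# TAVILY_API_KEY is read from the environment by the Tavily search tool
web_search = TavilySearchResults(max_results=3)
hits = web_search.invoke("agentic RAG with local models")
print(hits[0]["url"])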
