
🏟️ FC Barcelona HR RAG - Local Files, No Database

A lightweight Retrieval-Augmented Generation system using Markdown files as a knowledge base.

Welcome to the FC Barcelona HR RAG System - a simple, self-contained Retrieval-Augmented Generation (RAG) project that loads Markdown-based HR/player records directly from the filesystem into memory.

No database. No vector DB. No embeddings. Just pure local markdown → dictionary → context injection → LLM.

Perfect for small projects, demo apps, or experimenting with RAG fundamentals.


🚀 Features

✔ Local File-Based Knowledge Store

  • HR/player records stored as .md files
  • Loaded automatically into a Python dict
  • Easy to modify, easy to version-control

✔ Context-Injection RAG

  • User query analyzed for keyword matches
  • Relevant markdown snippets injected into system prompt
  • LLM answers with higher accuracy using only local data

✔ OpenAI API Integration

  • Uses gpt-4.1-nano for fast + cheap responses
  • System prompt built specifically around FC Barcelona employees

✔ Simple Gradio Chat UI

  • Full chat interface
  • Local browser launch
  • Debug logs enabled

✔ Fully Local Knowledge Base

  • No vectors
  • No external DB
  • No third-party storage
  • Everything lives inside data/employees/*.md

📁 Project Structure

/
├── data/
│   └── employees/
│       ├── <employee1>.md
│       ├── <employee2>.md
│       └── ...
│
├── src/
│   └── main.py
│
├── tests/
│   └── test_context.py
│
├── synthetic_data_generator.py
├── requirements.txt
└── README.md

🧠 How It Works

1️⃣ Load all .md files into memory

knowledge = load_markdown_files()

Each employee record becomes a key-value entry:

{name: markdown_content}
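A minimal sketch of what such a loader might look like (the actual function lives in src/main.py; the data/employees path and lowercase file-stem keys are assumptions):

```python
from pathlib import Path

def load_markdown_files(directory: str = "data/employees") -> dict[str, str]:
    """Read every .md file into a {name: markdown_content} dictionary."""
    knowledge = {}
    for path in Path(directory).glob("*.md"):
        # The file stem (e.g. "pedri") becomes the lookup key.
        knowledge[path.stem.lower()] = path.read_text(encoding="utf-8")
    return knowledge
```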

2️⃣ Detect keywords in user query

get_relevant_context(message)

If words from the question match employee names → relevant markdown is returned.
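A hedged sketch of that keyword match, assuming the dictionary keys are lowercase employee names as in the loader sketch above:

```python
def get_relevant_context(message: str) -> list[str]:
    """Return the markdown record of every employee mentioned in the query."""
    message_lower = message.lower()
    return [
        content
        for name, content in knowledge.items()
        if name in message_lower
    ]
```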

3️⃣ Inject context into system prompt

system_message = SYSTEM_PREFIX + additional_context(message)
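For illustration, the injection can be as simple as concatenating the matched records onto a fixed prefix (the SYSTEM_PREFIX wording below is a placeholder, not the repository's actual prompt):

```python
SYSTEM_PREFIX = (
    "You are an HR assistant for FC Barcelona. "
    "Answer only from the employee records provided below.\n\n"
)

def additional_context(message: str) -> str:
    # Join every matching markdown record so the model sees it verbatim.
    return "\n\n".join(get_relevant_context(message))
```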

4️⃣ Chat model replies with improved accuracy

openai.chat.completions.create(...)
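A sketch of that call using the openai Python client (model name taken from this README; chat history handling is simplified and the exact message assembly in src/main.py may differ):

```python
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment / .env

def chat(message: str, history: list) -> str:
    # history is accepted to match Gradio's chat signature but omitted here for brevity.
    system_message = SYSTEM_PREFIX + additional_context(message)
    response = client.chat.completions.create(
        model="gpt-4.1-nano",
        messages=[
            {"role": "system", "content": system_message},
            {"role": "user", "content": message},
        ],
    )
    return response.choices[0].message.content
```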

5️⃣ All wrapped in a clean Gradio UI

gr.ChatInterface(...).launch(inbrowser=True)
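Wiring the chat function from step 4 into the UI is a one-liner (the debug flag is assumed from the "debug logs" feature note above):

```python
import gradio as gr

# Opens the chat UI in the default browser; debug=True surfaces request logs.
gr.ChatInterface(fn=chat).launch(inbrowser=True, debug=True)
```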

▶️ Running the Project

1. Install dependencies

pip install -r requirements.txt

2. Add your OpenAI API key

Create a .env file:

OPENAI_API_KEY=your_key_here

3. Put employee/player files into

data/employees/*.md

4. Run the app

python src/main.py

This opens the Gradio chat UI in your browser.


✍️ Example Query

"What is the salary of Pedri?" "How did João Félix perform in 2022?" "Tell me about the injuries of Ansu Fati."

The system will automatically:
✔ extract keywords
✔ search local markdown files
✔ inject only relevant context
✔ respond like an FC Barcelona HR expert


🛠 Tech Stack

  • Python 3.10+
  • OpenAI API
  • Gradio
  • dotenv
  • Local markdown RAG (no vector DB)
