This is a starter project for developing a GLiNER-based, metadata-filtered RAG research agent using LangGraph in LangSmith Studio.
GLiNER is an efficient model for Named Entity Recognition (NER), classification, and extraction, and it runs well on CPU.

Since using LLMs to filter unstructured data (articles, legal documents, reports, etc.) can be very costly, a GLiNER-based filtered RAG pipeline provides an efficient and robust alternative.
In `ingestor.py`, the data is first chunked into LangChain Documents. These documents are then classified with GLiNER, the predicted labels are stored in each document's metadata, and finally the documents are indexed in the vector database.

At retrieval time, the LLM generates multiple queries (default 3) with corresponding metadata filters (if any), which are then used to retrieve documents from the vector database.
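The ingestion flow above can be sketched as follows. This is a simplified illustration, not the project's actual code: `Document` is a minimal stand-in for a LangChain Document, and `classify` is a stub standing in for a GLiNER classifier call.

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    """Minimal stand-in for a LangChain Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def classify(text: str, labels: list[str]) -> list[str]:
    """Stub standing in for GLiNER: in the real pipeline this would be
    a GLiNER model predicting which candidate labels apply to the text."""
    return [label for label in labels if label in text.lower()]

def ingest(chunks: list[str], labels: list[str]) -> list[Document]:
    """Chunk -> classify -> store labels in metadata."""
    docs = []
    for chunk in chunks:
        doc = Document(page_content=chunk)
        doc.metadata["labels"] = classify(chunk, labels)
        docs.append(doc)
    return docs  # in the real pipeline, these are then indexed in the vector DB

docs = ingest(["Wheat exports rose sharply.", "The court issued a ruling."],
              labels=["wheat", "court"])
print(docs[0].metadata["labels"])  # -> ['wheat']
```

The key point is that classification happens once, at ingestion time, so no LLM call is needed to filter documents later.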
This project has two graphs:

- a "retrieval" graph (`src/retrieval_graph/graph.py`)
- a "researcher" subgraph, part of the retrieval graph (`src/retrieval_graph/researcher_graph/graph.py`)
The retrieval graph manages a chat history and responds based on the fetched documents. Specifically, it:
- Takes a user query as input
- Runs the researcher subgraph, which:
  - first generates a list of queries (default 3) along with metadata filters (if any)
  - then retrieves the relevant documents for all queries and filters in parallel and returns them to the LLM
- Finally, the LLM generates a response based on the retrieved documents and the conversation context.
- Create a `.env` file:

```shell
cp .env.example .env
```

- Set up Qdrant:

```shell
docker run -p 6333:6333 -p 6334:6334 -v $(pwd)/qdrant_storage:/qdrant/storage:z qdrant/qdrant
```
Qdrant is a fast vector database with extensive support for metadata filtering.
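For illustration, a Qdrant payload filter that restricts search to documents labeled `wheat` might look like the fragment below (the `metadata.labels` key is a hypothetical payload path, not necessarily what this project uses):

```json
{
  "must": [
    { "key": "metadata.labels", "match": { "value": "wheat" } }
  ]
}
```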
- Install dependencies:

```shell
uv sync
```

- Ingest documents from `./docs`:

```shell
uv run python ingestor.py
```
The documents in `./docs` are processed versions of the `.sgm` files from the Reuters-21578 text categorization collection. The processing is done by `doc_processor.py`. To regenerate them, run:

```shell
uv run python doc_processor.py
```

Make sure to point it at the right `.sgm` file.
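For context, a Reuters-21578 record can be parsed roughly as sketched below. The sample is an abridged record in the collection's format; the regex approach is an illustration only (the actual `doc_processor.py` may use a proper SGML/HTML parser instead).

```python
import re

# Abridged record in the shape of a Reuters-21578 .sgm entry.
sample = """<REUTERS TOPICS="YES" NEWID="1">
<DATE>26-FEB-1987</DATE>
<TOPICS><D>cocoa</D></TOPICS>
<TEXT>
<TITLE>BAHIA COCOA REVIEW</TITLE>
<BODY>Showers continued throughout the week...</BODY>
</TEXT>
</REUTERS>"""

def parse_record(sgm: str) -> dict:
    """Extract title, body, and topic labels with simple regexes."""
    title = re.search(r"<TITLE>(.*?)</TITLE>", sgm, re.S)
    body = re.search(r"<BODY>(.*?)</BODY>", sgm, re.S)
    topics = re.findall(r"<D>(.*?)</D>", sgm)
    return {
        "title": title.group(1).strip() if title else "",
        "body": body.group(1).strip() if body else "",
        "topics": topics,
    }

record = parse_record(sample)
print(record["title"], record["topics"])  # BAHIA COCOA REVIEW ['cocoa']
```

The `<TOPICS>` labels are what make this corpus a natural fit for metadata-filtered retrieval.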
- Start LangSmith Studio:

```shell
uv run langgraph dev --allow-blocking
```

- Next, open the `retrieval_graph` using the dropdown in the top-left. Ask it questions about the ingested documents to confirm it can fetch the required information!
The default values for `response_model` and `query_model` are shown below:

```yaml
response_model: google_genai/gemini-2.0-flash-lite
query_model: google_genai/gemini-2.0-flash-lite
```

To use Google Gemini's chat models:

- Sign up for a Google AI Studio API key.
- Once you have your API key, add it to your `.env` file:

```shell
GOOGLE_API_KEY=your-api-key
```
The default value for `embedding_model` is shown below:

```yaml
embedding_model: fastembed/BAAI/bge-base-en-v1.5
```

You can customize this retrieval agent template in several ways:
- Modify the embedding model: Change the embedding model used for document indexing and query embedding by updating `embedding_model` in the configuration. Options include various fastembed models.

- Customize the response generation: Modify `response_system_prompt` to change how the agent formulates its responses. This allows you to adjust the agent's personality or add specific instructions for answer generation.

- Change the language model: Update `response_model` in the configuration to use different language models for response generation. Options include various Claude models from Anthropic, as well as models from other providers like Fireworks AI.

- Extend the graph: Add new nodes or modify existing ones in `src/retrieval_graph/graph.py` to introduce additional processing steps or decision points in the agent's workflow.

- Add tools: Implement tools to expand the researcher agent's capabilities beyond simple retrieval.
