
A LangGraph-based RAG that uses GLiNER2 to generate metadata for filtered similarity search.


ExtractRAG: The GLiNER-based Metadata-Filtered RAG


This is a starter project to help you get started with developing a GLiNER-based, metadata-filtered RAG research agent using LangGraph and LangGraph Studio.

  • GLiNER is an efficient model for Named Entity Recognition (NER), classification, and extraction, and it runs well on CPU.

  • Since using LLMs to filter unstructured data (articles, legal documents, reports, etc.) can be very costly, a GLiNER-based filtered RAG pipeline provides an efficient and robust alternative.

  • In ingestor.py, the data is first chunked into LangChain Documents. These documents are then classified using GLiNER, the resulting labels are stored in each document's metadata, and finally the documents are indexed in the vector database.

  • At retrieval time, the LLM generates multiple queries (default 3) with corresponding metadata filters (if any), which are then used to retrieve data from the vector database.
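The ingestion flow described above can be sketched in miniature. This is a hedged illustration, not the project's actual code: the `Document` class stands in for LangChain's, and `classify` is a keyword stub standing in for GLiNER; the real pipeline would then index the documents in Qdrant.

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    # Minimal stand-in for langchain_core.documents.Document
    page_content: str
    metadata: dict = field(default_factory=dict)

def chunk(text: str, size: int = 200) -> list[str]:
    # Naive fixed-size chunking; the real ingestor uses a LangChain splitter
    return [text[i:i + size] for i in range(0, len(text), size)]

def classify(chunk_text: str, labels: list[str]) -> str:
    # Keyword stub standing in for GLiNER's classification step
    for label in labels:
        if label in chunk_text.lower():
            return label
    return labels[0]

def ingest(text: str, labels: list[str]) -> list[Document]:
    docs = []
    for piece in chunk(text):
        doc = Document(page_content=piece)
        doc.metadata["label"] = classify(piece, labels)  # label goes into metadata
        docs.append(doc)
    return docs  # in the real pipeline, these are now indexed in the vector DB
```

The key point is that the label is computed once at ingestion time and stored alongside the chunk, so retrieval can filter on it without any further LLM or model calls.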

Graph view in the LangGraph Studio UI

What it does

This project has two graphs:

  • a "retrieval" graph (src/retrieval_graph/graph.py)
  • a "researcher" subgraph (part of the retrieval graph) (src/retrieval_graph/researcher_graph/graph.py)

The retrieval graph manages a chat history and responds based on the fetched documents. Specifically, it:

  1. Takes a user query as input
  2. Runs the researcher subgraph, which:
  • first generates a list of queries (default 3) along with metadata filters (if any)
  • then retrieves the relevant documents in parallel for all query+filter pairs and returns the documents to the LLM
  3. Finally, the LLM generates a response based on the retrieved documents and the conversation context.
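The steps above can be sketched end to end in plain Python. This is an assumption-laden outline, not the graph's real code: `generate_queries` stubs the LLM step, the toy `STORE` list stands in for the Qdrant index, and keyword overlap stands in for vector similarity.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy corpus standing in for the Qdrant index
STORE = [
    {"text": "wheat harvest grew in March", "metadata": {"label": "grain"}},
    {"text": "crude oil prices fell", "metadata": {"label": "crude"}},
]

def generate_queries(question: str, n: int = 3) -> list[tuple[str, dict]]:
    # Stub for the LLM query-generation step: (query, filters) pairs
    return [(question, {"label": "grain"}) for _ in range(n)]

def retrieve(query: str, filters: dict) -> list[str]:
    # Filtered search: keep documents whose metadata matches every filter,
    # then rank by naive keyword overlap in place of vector similarity
    hits = [
        d for d in STORE
        if all(d["metadata"].get(k) == v for k, v in filters.items())
    ]
    hits.sort(key=lambda d: sum(w in d["text"] for w in query.split()),
              reverse=True)
    return [d["text"] for d in hits]

def research(question: str) -> list[str]:
    pairs = generate_queries(question)
    # Retrieval runs in parallel for all query+filter pairs
    with ThreadPoolExecutor() as pool:
        batches = pool.map(lambda p: retrieve(*p), pairs)
    # Flatten and de-duplicate before the response LLM sees the documents
    docs: list[str] = []
    for batch in batches:
        for d in batch:
            if d not in docs:
                docs.append(d)
    return docs
```

In the actual graph, the parallel fan-out is expressed as LangGraph nodes rather than a thread pool, but the shape of the data flow is the same.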

Getting Started

  1. Create a .env file:
cp .env.example .env
  2. Set up Qdrant:
docker run -p 6333:6333 -p 6334:6334 -v $(pwd)/qdrant_storage:/qdrant/storage:z qdrant/qdrant

Qdrant is a fast vector database with extensive support for metadata filtering.
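For reference, a Qdrant payload filter follows the structure below (this mirrors Qdrant's documented REST filter format; the `label` key is illustrative, standing for whatever metadata field the ingestor stores):

```python
# Qdrant filter matching documents whose payload has label == "grain".
# The equivalent client-side object is qdrant_client.models.Filter
# with a FieldCondition and MatchValue.
qdrant_filter = {
    "must": [
        {"key": "label", "match": {"value": "grain"}}
    ]
}
```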

  3. Install dependencies:
uv sync
  4. Ingest the documents from ./docs:
uv run python ingestor.py

The documents in ./docs are processed versions of the .sgm files of the Reuters-21578 text categorization data collection.

The processing is done using doc_processor.py.

To add more documents, run the processor script:

uv run python doc_processor.py

Make sure to provide the right .sgm file here.

  5. Start LangGraph Studio:
uv run langgraph dev --allow-blocking
  6. Next, open the retrieval_graph using the dropdown in the top-left. Ask it questions about the ingested Reuters documents to confirm it can fetch the required information!

Setup Model

The default values for response_model and query_model are shown below:

response_model: google_genai/gemini-2.0-flash-lite
query_model: google_genai/gemini-2.0-flash-lite

Google AI Studio

To use Google Gemini's chat models:

  1. Sign up for a Google AI Studio API key.
  2. Once you have your API key, add it to your .env file:
GOOGLE_API_KEY=your-api-key

Setup Embedding Model

The default value for embedding_model is shown below:

embedding_model: fastembed/BAAI/bge-base-en-v1.5
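Under the hood, embedding-based retrieval ranks chunks by vector similarity; bge-style embeddings are typically compared with cosine similarity, which can be computed as in this small self-contained sketch:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # cos(theta) = (a . b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

In practice the vector database computes this for you; the formula is shown only to make the ranking criterion concrete.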

How to customize

You can customize this retrieval agent template in several ways:

  1. Modify the embedding model: You can change the embedding model used for document indexing and query embedding by updating the embedding_model in the configuration. Options include various fastembed models.

  2. Customize the response generation: You can modify the response_system_prompt to change how the agent formulates its responses. This allows you to adjust the agent's personality or add specific instructions for answer generation.

  3. Change the language model: Update the response_model in the configuration to use different language models for response generation. Options include various Claude models from Anthropic, as well as models from other providers like Fireworks AI.

  4. Extend the graph: You can add new nodes or modify existing ones in the src/retrieval_graph/graph.py file to introduce additional processing steps or decision points in the agent's workflow.

  5. Add tools: Implement tools to expand the researcher agent's capabilities beyond simple retrieval.
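In LangGraph templates, options like these are typically passed as configurable values at invocation time. The field names below are assumed from the defaults listed earlier in this README, and the commented invocation is a sketch, not this project's confirmed API:

```python
# Hypothetical run configuration overriding the template defaults;
# field names follow the defaults shown earlier in this README.
config = {
    "configurable": {
        "response_model": "google_genai/gemini-2.0-flash-lite",
        "query_model": "google_genai/gemini-2.0-flash-lite",
        "embedding_model": "fastembed/BAAI/bge-base-en-v1.5",
    }
}
# graph.invoke({"messages": [...]}, config=config)  # assumed invocation shape
```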
