Yuumiya/rag-system

RAG-system

Installation Steps:

1. Clone repository

2. Create virtual environment:

  • Create:
    python -m venv .venv
  • Activate (Windows):
    .\.venv\Scripts\activate
  • Activate (macOS/Linux):
    source .venv/bin/activate

3. Install dependencies

  • Using pip:

    pip install -r requirements.txt
  • Using uv:

    uv pip install -r requirements.txt

This installs the required libraries, including:

  • ruff==0.11.4
  • faiss-cpu>=1.7.2
  • numpy>=1.23.0
  • openai>=0.27.0
  • tqdm>=4.62.3
  • pathlib>=1.0.0

Setup

Before sending queries, complete the following steps:

  • Set OPENAI_API_KEY as an environment variable.
  • Load the target repository:
    python scripts/run_load.py <repo_url> [--destination <destination_path>]
    example:
    python scripts/run_load.py https://github.com/viarotel-org/escrcpy
  • Make chunks:
    python scripts/run_chunk.py <repo_path> [--chunk_size <size>] [--overlap <overlap>] [--output <output_path>]
    example:
    python scripts/run_chunk.py data/escrcpy
  • Embed chunks into an index file:
    python scripts/run_embed.py [--chunks_path <chunks_file>] [--index_path <index_file>]
    example:
    python scripts/run_embed.py
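
The --chunk_size and --overlap options suggest a sliding-window split. The sketch below illustrates that idea only; it assumes sizes are counted in characters, and chunk_text is a hypothetical helper, not the code in scripts/run_chunk.py, which may split by tokens or lines instead.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into windows of chunk_size characters.

    Each window shares `overlap` characters with the previous one.
    Assumption: sizes are in characters; the repository may count differently.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=2):
        print(piece)
```

A larger overlap keeps more shared context between neighbouring chunks at the cost of a bigger index.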

How to use

You can now run queries from the command line:

python scripts/run_answer.py <query> [--top_k <k>] [--index_path <index_file>] [--chunk_path <chunk_file>] [--rerank]

example:

python scripts/run_answer.py "How does the SelectDisplay component handle the device options when retrieving display IDs?" --rerank

--rerank: Optional flag. When set, the retrieved chunks are reranked after retrieval; omit it to skip reranking.
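
The retrieve-then-rerank flow can be sketched in two stages. Everything below is an illustrative assumption: the repository's reranker is not described here, so keyword_overlap is a stand-in scorer, the vectors are toy values rather than real embeddings, and none of these function names come from scripts/run_answer.py.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_vec: list[float], chunk_vecs: list[list[float]], top_k: int) -> list[int]:
    """Stage 1: return indices of the top_k chunks by vector similarity."""
    order = sorted(
        range(len(chunk_vecs)),
        key=lambda i: cosine(query_vec, chunk_vecs[i]),
        reverse=True,
    )
    return order[:top_k]


def keyword_overlap(query: str, chunk: str) -> float:
    """Stand-in rerank score: fraction of query words that appear in the chunk."""
    q_words = set(query.lower().split())
    c_words = set(chunk.lower().split())
    return len(q_words & c_words) / len(q_words) if q_words else 0.0


def rerank(query: str, chunks: list[str], candidates: list[int]) -> list[int]:
    """Stage 2: reorder the retrieved candidates with the second scorer."""
    return sorted(candidates, key=lambda i: keyword_overlap(query, chunks[i]), reverse=True)


if __name__ == "__main__":
    chunks = [
        "display ids are parsed here",
        "unrelated build script",
        "device options and display ids",
    ]
    vecs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]  # toy vectors, not real embeddings
    hits = retrieve([1.0, 0.0], vecs, top_k=2)
    print("retrieved:", hits)
    print("reranked:", rerank("device options display ids", chunks, hits))
```

Reranking only reorders the candidates that the first stage returned, so a chunk missed by retrieval cannot be recovered by it.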

Metrics

To evaluate the retriever, run scripts/run_metrics.py:

python scripts/run_metrics.py [--ground_truth_path <gt_file>] [--top_k <k>] [--index_path <index_file>] [--chunks_path <json_file>] [--rerank]

example:

python scripts/run_metrics.py --rerank

Results:

The current pipeline achieves:

  • Recall@10: 0.6471
  • MRR@10: 0.4281
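
For reference, the two metrics can be computed roughly as follows. This is a sketch under common definitions, not the code in scripts/run_metrics.py: it assumes each query has a set of relevant chunk ids, takes Recall@k as the fraction of those found in the top k, and takes MRR@k as the reciprocal rank of the first relevant hit (0 if none appears in the top k). The function names and toy ids are hypothetical.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant ids that appear among the first k retrieved ids."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def reciprocal_rank_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """1 / rank of the first relevant id within the top k, or 0.0 if there is none."""
    for rank, item in enumerate(retrieved[:k], start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def evaluate(runs: list[tuple[list[str], set[str]]], k: int) -> tuple[float, float]:
    """Average both metrics over all (retrieved, relevant) pairs."""
    if not runs:
        return 0.0, 0.0
    recall = sum(recall_at_k(r, rel, k) for r, rel in runs) / len(runs)
    mrr = sum(reciprocal_rank_at_k(r, rel, k) for r, rel in runs) / len(runs)
    return recall, mrr


if __name__ == "__main__":
    # Toy data for illustration only; these are not the repository's results.
    runs = [
        (["a", "b", "c"], {"b"}),
        (["x", "y", "z"], {"q", "x"}),
    ]
    recall, mrr = evaluate(runs, k=2)
    print(f"Recall@2: {recall:.4f}, MRR@2: {mrr:.4f}")
```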
