A multilingual (Hebrew/English) PDF question-answering system with grounded answers and citations.
The application uses embedding models for semantic search. You have two options:
- Automatic online download (default):
  - Models are downloaded automatically from Hugging Face on first run
  - Requires an internet connection
  - Initial startup can be slower
- Local models (recommended):
  - Run the included script: download_models.bat
    This downloads models to the configured local directory (default: D:/models/)
  - Faster startup and no internet dependency
  - To use local models, ensure use_local_models = True in app/core/config.py.
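The switch between the two modes amounts to resolving a model path from the config flag. A minimal sketch, assuming a helper like the one below (the function name, the folder-naming scheme, and the example model id are illustrative, not the app's actual code; the real flag lives in app/core/config.py):

```python
from pathlib import Path

# Hypothetical sketch of how use_local_models could select between a local
# directory and an online download. Assumes download_models.bat stores each
# model in a folder named after its id with "/" replaced by "_".
LOCAL_MODELS_DIR = Path("D:/models")  # the default directory mentioned above

def resolve_model_path(model_id: str, use_local_models: bool) -> str:
    """Return a local folder when configured, else the Hugging Face model id."""
    if use_local_models:
        return str(LOCAL_MODELS_DIR / model_id.replace("/", "_"))
    # Returning the bare model id makes the embedding library download it
    # from Hugging Face on first use.
    return model_id

print(resolve_model_path("intfloat/multilingual-e5-base", use_local_models=False))
```

Passing the resolved string to the embedding library keeps the rest of the pipeline identical in both modes.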
- Setup environment:
  cp .env.example .env  # Edit .env with your OpenAI API key
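At minimum, the copied .env file needs the OpenAI key. A sketch of what that entry looks like (OPENAI_API_KEY is the conventional variable name; .env.example may contain additional settings):

```
# .env -- placeholder value, never commit this file
OPENAI_API_KEY=sk-your-key-here
```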
- Place your PDFs:
  - Put your PDF files in the data/pdfs folder
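A quick way to confirm the application will see your files is to list that folder's contents. A minimal sketch (list_pdfs is a hypothetical helper for checking your setup, not part of the app):

```python
from pathlib import Path

def list_pdfs(pdf_dir: str = "data/pdfs") -> list[str]:
    """Return the sorted PDF filenames under pdf_dir (the app's default folder)."""
    root = Path(pdf_dir)
    if not root.is_dir():
        return []  # folder missing: the app would have nothing to index
    return sorted(p.name for p in root.glob("*.pdf"))

print(list_pdfs())
```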
- Start the application:
  - Run the included batch file: run_app.bat
  - Or manually:
    venv\Scripts\python -m uvicorn app.main:app --host 0.0.0.0 --port 8000
  - Access the UI at: http://localhost:8000
- If you encounter issues:
  - Try the development version: venv\Scripts\python run_dev.py
    Access the development UI at: http://localhost:8001
  - Hebrew text issues: the application supports Hebrew text properly.
  - Connection timeout: for large PDFs, increase the timeout in app/ui/app_ui.py.
  - Slow first run: the first run downloads the language models, which may take time.
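The timeout fix follows the usual pattern of passing a longer deadline to the HTTP call. A generic stdlib sketch (the actual setting lives in app/ui/app_ui.py; the 300-second figure and the fetch helper are illustrative assumptions):

```python
import urllib.request

# Hypothetical sketch of raising an HTTP timeout. Large PDFs can take
# minutes to chunk and embed, so a short timeout makes the client give up
# before the backend finishes.
def fetch(url: str, timeout_seconds: float = 300.0) -> bytes:
    """Fetch a URL, waiting up to timeout_seconds before raising."""
    with urllib.request.urlopen(url, timeout=timeout_seconds) as resp:
        return resp.read()
```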
Docker support is planned for a future update.
Features:
- Multilingual support for Hebrew and English PDFs
- Contextual Q&A with source citations
- Evidence panel showing exact sources
- Language selection for answers