A multilingual (Hebrew/English) PDF question-answering system with grounded answers and citations.
The application uses embedding models for semantic search. You have two options:
- Automatic online download (default):
  - Models are downloaded automatically from Hugging Face on first run
  - Requires an internet connection
  - Initial startup can be slower
- Local models (recommended):
  - Run the included script: download_models.bat
    This downloads models to the configured local directory (default: D:/models/)
  - Faster startup and no internet dependency
  - To use local models, ensure use_local_models = True in app/core/config.py.
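The switch between the two modes amounts to resolving a model path from the config flag. A minimal sketch, assuming a helper like the one below (the function name, the folder-naming scheme, and the example model id are illustrative, not the app's actual code; the real flag lives in app/core/config.py):

```python
from pathlib import Path

# Hypothetical sketch of how use_local_models could select between a local
# directory and an online download. Assumes download_models.bat stores each
# model in a folder named after its id with "/" replaced by "_".
LOCAL_MODELS_DIR = Path("D:/models")  # the default directory mentioned above

def resolve_model_path(model_id: str, use_local_models: bool) -> str:
    """Return a local folder when configured, else the Hugging Face model id."""
    if use_local_models:
        return str(LOCAL_MODELS_DIR / model_id.replace("/", "_"))
    # Returning the bare model id makes the embedding library download it
    # from Hugging Face on first use.
    return model_id

print(resolve_model_path("intfloat/multilingual-e5-base", use_local_models=False))
```

Passing the resolved string to the embedding library keeps the rest of the pipeline identical in both modes.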
- Setup environment:
  cp .env.example .env  # Edit .env with your OpenAI API key
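At minimum, the copied .env file needs the OpenAI key. A sketch of what that entry looks like (OPENAI_API_KEY is the conventional variable name; .env.example may contain additional settings):

```
# .env -- placeholder value, never commit this file
OPENAI_API_KEY=sk-your-key-here
```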
- Place your PDFs:
  - Put your PDF files in the data/pdfs folder
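A quick way to confirm the application will see your files is to list that folder's contents. A minimal sketch (list_pdfs is a hypothetical helper for checking your setup, not part of the app):

```python
from pathlib import Path

def list_pdfs(pdf_dir: str = "data/pdfs") -> list[str]:
    """Return the sorted PDF filenames under pdf_dir (the app's default folder)."""
    root = Path(pdf_dir)
    if not root.is_dir():
        return []  # folder missing: the app would have nothing to index
    return sorted(p.name for p in root.glob("*.pdf"))

print(list_pdfs())
```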
- Start the application:
  - Run the included batch file: run_app.bat
  - Or manually:
    venv\Scripts\python -m uvicorn app.main:app --host 0.0.0.0 --port 8000
  - Access the UI at: http://localhost:8000
- If you encounter issues:
  - Try the development version: venv\Scripts\python run_dev.py
    Access the development UI at: http://localhost:8001
  - Hebrew text issues: the application supports Hebrew text properly.
  - Connection timeout: for large PDFs, increase the timeout in app/ui/app_ui.py.
  - Slow first run: the first run downloads the language models, which may take time.
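The timeout fix follows the usual pattern of passing a longer deadline to the HTTP call. A generic stdlib sketch (the actual setting lives in app/ui/app_ui.py; the 300-second figure and the fetch helper are illustrative assumptions):

```python
import urllib.request

# Hypothetical sketch of raising an HTTP timeout. Large PDFs can take
# minutes to chunk and embed, so a short timeout makes the client give up
# before the backend finishes.
def fetch(url: str, timeout_seconds: float = 300.0) -> bytes:
    """Fetch a URL, waiting up to timeout_seconds before raising."""
    with urllib.request.urlopen(url, timeout=timeout_seconds) as resp:
        return resp.read()
```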
Docker support is planned for a future update.
Features:
- Multilingual support for Hebrew and English PDFs
- Contextual Q&A with source citations
- Evidence panel showing exact sources
- Language selection for answers