Fast Reader is a Streamlit application designed to process and summarize PDF documents efficiently. It leverages OpenAI's language models to extract and summarize sections from textbooks or other structured documents.
- Smart Structure Analysis: Analyzes textbook structure first for more accurate section detection
- Precise Section Extraction: Uses AI and binary search to accurately locate section ranges with validation
- Intelligent Chunking: Smart text segmentation that avoids breaking paragraphs or mathematical expressions
- Hierarchical Summarization: Tree-based approach enables summarizing very long sections without hitting token limits
- Multi-format Support: Handles math expressions (LaTeX), code blocks, and tables in summaries
- Batch Processing: Summarize single sections or entire chapter ranges
- Extracts sections from PDFs using AI and binary search to accurately locate section ranges.
- Summarizes text using OpenAI's language models with support for math, code, and tables.
- Provides a user-friendly interface for uploading PDFs and selecting sections to summarize.
- Supports batch summarization of multiple sections.
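The binary-search step mentioned in the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not the app's actual code: `find_first_page`, `locate_section_range`, and the `page_sections` list are hypothetical names, and the per-page section labels stand in for whatever check (an AI call, in the app) decides whether a page lies at or after a given section. The search only works if that check is monotone across pages.

```python
from typing import Callable, List, Optional, Tuple


def find_first_page(num_pages: int, at_or_after: Callable[[int], bool]) -> int:
    """Return the smallest page index for which at_or_after(page) is True.

    Assumes the predicate is monotone: False for every page before the
    boundary and True from the boundary onward. Returns num_pages when the
    predicate is never True.
    """
    lo, hi = 0, num_pages
    while lo < hi:
        mid = (lo + hi) // 2
        if at_or_after(mid):
            hi = mid          # boundary is at mid or earlier
        else:
            lo = mid + 1      # boundary is strictly after mid
    return lo


def locate_section_range(
    page_sections: List[Tuple[int, int]], target: Tuple[int, int]
) -> Optional[Tuple[int, int]]:
    """Return (first_page, last_page) for target, or None if it is absent.

    page_sections[i] is a hypothetical label such as (1, 2) for "section 1.2"
    on page i, and must be non-decreasing from page to page.
    """
    n = len(page_sections)
    start = find_first_page(n, lambda p: page_sections[p] >= target)
    end = find_first_page(n, lambda p: page_sections[p] > target)
    if start == end:
        return None  # validation: no page carries the requested section
    return start, end - 1


if __name__ == "__main__":
    labels = [(1, 1), (1, 1), (1, 2), (1, 2), (1, 2), (1, 3), (2, 1)]
    print(locate_section_range(labels, (1, 2)))
```

Each lookup probes about log2(n) pages instead of scanning all of them, which matters when every probe is a model call.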
| Feature | Impact |
|---|---|
| Structure analysis first | More accurate section detection across different textbooks |
| Smarter chunking | Avoids breaking paragraphs or math, improves summarization quality |
| Tree summarization | Enables summarizing very long sections without hitting token limits |
| Section range validation | Catches embedded sub-sections you might have missed |
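The chunking and tree summarization rows above can be illustrated with a short sketch. It makes several assumptions that the README does not confirm: chunk size is measured in characters as a stand-in for tokens, paragraphs are separated by blank lines, and `summarize` is a hypothetical callable (in the app this would presumably wrap an OpenAI request) whose output is short enough that a group of `fan_in` summaries fits in one request.

```python
from typing import Callable, List


def chunk_paragraphs(text: str, max_chars: int) -> List[str]:
    """Group whole paragraphs into chunks of at most max_chars characters.

    A single paragraph longer than max_chars is kept whole, never split.
    """
    chunks: List[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            chunks.append(current)   # close the chunk at a paragraph boundary
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def tree_summarize(
    text: str,
    summarize: Callable[[str], str],
    max_chars: int,
    fan_in: int = 4,
) -> str:
    """Summarize each chunk, then summarize groups of summaries until one is left."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2 so each level shrinks")
    level = [summarize(chunk) for chunk in chunk_paragraphs(text, max_chars)]
    if not level:
        return ""
    while len(level) > 1:
        level = [
            summarize("\n\n".join(level[i:i + fan_in]))
            for i in range(0, len(level), fan_in)
        ]
    return level[0]
```

Because every request sees either one chunk or a handful of short summaries, the input size per request stays bounded no matter how long the section is.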
- Python 3.10+
- Streamlit
- PyMuPDF
- OpenAI Python Client
- Clone the repository: `git clone <repository-url>`, then `cd Fast_Reader`.
- Create and activate a virtual environment: `python -m venv myenv`, then `source myenv/bin/activate`. On Windows, activate it with `myenv\Scripts\activate` instead.
- Install the required packages: `pip install -r requirements.txt`
- Set up your OpenAI API key:
  - Create a `.env` file in the root directory.
  - Add your OpenAI API key to the `.env` file: `OPENAI_API_KEY=your_openai_api_key_here`
- Run the Streamlit app: `streamlit run app.py`
- Open your web browser and go to http://localhost:8501 to access the app.
- Upload a PDF document using the file uploader.
- Enter the section range you wish to summarize and click "Process PDF".
- Once sections are extracted, choose to summarize a single section or all sections in the range.
- View and download the generated summaries.
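If the app fails on the first summarization request, a missing or placeholder API key is a likely cause. The following is a small standalone check, written as a sketch: `read_env_file` and `resolve_api_key` are hypothetical helpers, the parser handles only plain `KEY=value` lines, and nothing here is claimed to match how Fast Reader itself loads the `.env` file.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional

# Placeholder value shown in the setup instructions.
PLACEHOLDER = "your_openai_api_key_here"


def read_env_file(path: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_api_key(
    env_path: str = ".env", environ: Optional[Mapping[str, str]] = None
) -> str:
    """Return OPENAI_API_KEY from the environment, falling back to env_path."""
    environ = os.environ if environ is None else environ
    key = environ.get("OPENAI_API_KEY")
    if not key and Path(env_path).is_file():
        key = read_env_file(env_path).get("OPENAI_API_KEY")
    if not key or key == PLACEHOLDER:
        raise RuntimeError(
            "OPENAI_API_KEY is missing or still set to the placeholder value"
        )
    return key


if __name__ == "__main__":
    resolve_api_key()
    print("OPENAI_API_KEY found")
```

Running this from the project root before `streamlit run app.py` surfaces a configuration problem without starting the app.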
We welcome contributions to improve Fast Reader! Please follow these steps:
- Fork the repository.
- Create a new branch for your feature or bug fix: `git checkout -b feature-name`
- Make your changes and commit them with descriptive messages.
- Push your changes to your fork: `git push origin feature-name`
- Open a pull request with a detailed description of your changes.
This project is licensed under the MIT License. See the LICENSE file for more information.
- Prototyped by Cursor.
- Thanks to OpenAI for providing the language models.
- Thanks to the Streamlit community for their support and resources.