kdeepak2001/gemini-pdf-extractor

🤖 AI PDF Extractor with Google Gemini

Streamlit App Python 3.8+ Google Gemini License: MIT

🌐 Live Demo

🚀 Try it now (no installation required):

👉 https://gemini-pdf-extractor-kwsnfw8xpzpn7pax2gavyf.streamlit.app/

Upload a PDF and experience AI-powered document analysis instantly!


Ultra-modern AI-powered PDF extraction tool with 2025 glassmorphism UI design

✨ Key Features

| Feature | Description |
| --- | --- |
| 📄 PDF Text Extraction | Extract text from any PDF with complete metadata |
| 🤖 AI Summarization | Generate intelligent summaries using Gemini 2.5 Flash |
| 🔍 Information Extraction | Auto-identify topics, entities, dates, and key insights |
| 📊 Export Results | Download analysis in JSON/TXT format |
| 🎨 Modern UI | 2025 glassmorphism design with animated gradients |
| 💯 100% Free | Uses Google Gemini's generous free tier (no credit card!) |

🚀 Quick Start

Prerequisites

  • Python 3.8 or higher
  • Google Gemini API key (free)

Installation

1. Clone the repository

    git clone https://github.com/kdeepak2001/gemini-pdf-extractor.git
    cd gemini-pdf-extractor

2. Create a virtual environment

    Windows:
    python -m venv venv
    venv\Scripts\activate

    macOS/Linux:
    python3 -m venv venv
    source venv/bin/activate

3. Install dependencies

    pip install -r requirements.txt

4. Get your free Gemini API key from Google AI Studio

5. Create a .env file in the project root

    echo GOOGLE_API_KEY=your_api_key_here > .env

6. Run the application

    streamlit run app.py

The app will open at http://localhost:8501 🎉


📖 Usage Guide

1️⃣ Upload & Extract Tab

  • Upload your PDF (max 10MB)
  • Click "EXTRACT TEXT" button
  • View document statistics (pages, words, characters)
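
The extraction step and the statistics shown in this tab could be sketched as follows. This is a minimal illustration, not the code in app.py: the names `DocumentStats`, `extract_page_texts`, and `compute_stats` are hypothetical, and only `PdfReader`, `reader.pages`, and `page.extract_text()` come from PyPDF2 itself.

```python
from dataclasses import dataclass
from typing import BinaryIO, List


@dataclass
class DocumentStats:
    """Counts shown after extraction (hypothetical container)."""
    pages: int
    words: int
    characters: int


def extract_page_texts(pdf_file: BinaryIO) -> List[str]:
    """Return the text of each page using PyPDF2's PdfReader."""
    # Imported lazily so compute_stats() can be used without PyPDF2 installed.
    from PyPDF2 import PdfReader

    reader = PdfReader(pdf_file)
    # extract_text() may yield nothing for image-only (scanned) pages.
    return [page.extract_text() or "" for page in reader.pages]


def compute_stats(page_texts: List[str]) -> DocumentStats:
    """Count pages, whitespace-separated words, and characters."""
    full_text = "\n".join(page_texts)
    return DocumentStats(
        pages=len(page_texts),
        words=len(full_text.split()),
        characters=len(full_text),
    )
```

In a Streamlit app, the object returned by `st.file_uploader` is file-like, so it could be passed to `extract_page_texts` directly.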

2️⃣ AI Analysis Tab

  • Click "GENERATE AI SUMMARY" for intelligent summary
  • Click "EXTRACT KEY INFORMATION" for structured data
  • View topics, entities, dates, and takeaways
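
One way the key-information step might work is to ask the model for JSON and parse the reply. The sketch below assumes the google-generativeai package and the model id "gemini-2.5-flash"; the prompt wording, the character cap, and the helper names are illustrative assumptions rather than the project's actual code.

```python
import json
import re
from typing import Any, Dict

MODEL_NAME = "gemini-2.5-flash"  # assumed model id for Gemini 2.5 Flash
MAX_CHARS = 30000  # hypothetical cap on how much text is sent per request


def build_key_info_prompt(text: str, max_chars: int = MAX_CHARS) -> str:
    """Build a prompt asking for topics, entities, dates, and takeaways."""
    return (
        "Extract the main topics, named entities, dates, and key takeaways "
        "from the document below. Respond with a JSON object using the keys "
        '"topics", "entities", "dates", and "takeaways".\n\n'
        + text[:max_chars]
    )


def parse_json_reply(reply: str) -> Dict[str, Any]:
    """Parse a model reply that may be wrapped in a Markdown code fence."""
    # Strip a leading fence (optionally tagged "json") and a trailing fence.
    cleaned = re.sub(r"^`{3}(?:json)?\s*|\s*`{3}$", "", reply.strip())
    return json.loads(cleaned)


def extract_key_info(text: str, api_key: str) -> Dict[str, Any]:
    """Send the prompt to Gemini and return the parsed JSON object."""
    # Imported lazily so the helpers above work without the SDK installed.
    import google.generativeai as genai

    genai.configure(api_key=api_key)
    model = genai.GenerativeModel(MODEL_NAME)
    response = model.generate_content(build_key_info_prompt(text))
    return parse_json_reply(response.text)
```

Models do not always return valid JSON, so a caller would probably want to catch `json.JSONDecodeError` and show the raw reply instead.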

3️⃣ Complete Results Tab

  • View all analysis metrics
  • Download results in JSON or TXT format
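
The JSON and TXT downloads could be produced by two small serializers like the ones below. The function names and the layout of the TXT output are assumptions made for illustration; the real export format may differ.

```python
import json
from typing import Any, Dict


def results_to_json(results: Dict[str, Any]) -> str:
    """Serialize the analysis results as indented, UTF-8 friendly JSON."""
    return json.dumps(results, indent=2, ensure_ascii=False)


def results_to_txt(results: Dict[str, Any]) -> str:
    """Render the results as plain text, one upper-cased heading per key."""
    lines = []
    for key, value in results.items():
        lines.append(key.upper())
        if isinstance(value, (list, tuple)):
            # Lists become bulleted lines.
            lines.extend(f"  - {item}" for item in value)
        else:
            lines.append(f"  {value}")
        lines.append("")  # blank line between sections
    return "\n".join(lines).rstrip() + "\n"
```

Either string can be handed to Streamlit's `st.download_button` through its `data` argument, together with a `file_name` and a `mime` type such as `application/json` or `text/plain`.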

4️⃣ About Tab

  • View features, tech stack, and roadmap
  • Check system configuration

🛠️ Tech Stack

| Technology | Purpose |
| --- | --- |
| Python 3.8+ | Core programming language |
| Google Gemini 2.5 Flash | AI model for summarization & analysis |
| Streamlit | Modern web framework with beautiful UI |
| PyPDF2 | PDF text extraction library |
| python-dotenv | Secure environment configuration |

📂 Project Structure

| File / Folder | Description |
| --- | --- |
| app.py | Main Streamlit application with modern UI |
| config.py | Configuration handling & API validation |
| requirements.txt | Python dependencies list |
| .env | Environment variables (API keys) – ignored in Git |
| .gitignore | Git ignore rules |
| README.md | Project documentation |
| venv/ | Virtual environment – ignored in Git |

File Descriptions

| File | Purpose | Lines | Status |
| --- | --- | --- | --- |
| app.py | Main application with glassmorphism UI, PDF extraction, AI analysis | ~600 | ✅ Production |
| config.py | Configuration class with validation | ~20 | ✅ Production |
| requirements.txt | Dependencies: streamlit, google-generativeai, PyPDF2, python-dotenv | ~5 | ✅ Production |
| .env | API keys (never committed to Git) | ~1 | 🔒 Local only |
| .gitignore | Protects sensitive files (.env, venv/, etc.) | ~10 | ✅ Production |
| README.md | Complete documentation | ~200 | ✅ Production |

🔒 Security Features

✅ API keys stored in .env file (never committed to Git)
✅ Input validation and error handling
✅ Secure file operations with temporary files
✅ Environment variable validation on startup
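
Startup validation of the API key might look like the sketch below. The `Config` and `ConfigError` names are hypothetical and the real config.py may be organized differently; `load_dotenv` is the python-dotenv function that reads a .env file into the process environment.

```python
import os


class ConfigError(RuntimeError):
    """Raised when required configuration is missing or invalid."""


class Config:
    """Hypothetical configuration holder; the real config.py may differ."""

    def __init__(self, environ=None):
        # Accept a mapping for testing; default to the process environment.
        env = os.environ if environ is None else environ
        self.google_api_key = (env.get("GOOGLE_API_KEY") or "").strip()

    def validate(self) -> None:
        """Fail fast with a readable message instead of a late API error."""
        if not self.google_api_key:
            raise ConfigError("GOOGLE_API_KEY is missing. Add it to your .env file.")
        if self.google_api_key == "your_api_key_here":
            raise ConfigError("GOOGLE_API_KEY still holds the placeholder value.")


def load_config() -> Config:
    """Load .env (if python-dotenv is installed), then validate."""
    try:
        from dotenv import load_dotenv
        load_dotenv()
    except ImportError:
        pass  # fall back to variables already present in the environment
    config = Config()
    config.validate()
    return config
```

Calling `load_config()` once at the top of the app means a missing key is reported before any PDF is uploaded.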


📊 Gemini Free Tier Limits

| Feature | Limit |
| --- | --- |
| Requests per minute | 15 |
| Tokens per day | 1,000,000 |
| Requests per day | 1,500 |
| Cost | FREE (no credit card required) |

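These limits are comfortable for prototypes and portfolio projects. Quotas vary by model and change over time, so confirm the current numbers in Google's rate-limit documentation before relying on them.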
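
To stay under a per-minute request limit such as the 15 listed above, an app could throttle its own calls. The following is a minimal sliding-window sketch; the class name and the injectable `clock` parameter are illustrative assumptions, and nothing here is part of the Gemini SDK.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most max_calls within any window of `period` seconds."""

    def __init__(self, max_calls: int = 15, period: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._calls: Deque[float] = deque()  # timestamps of recent calls

    def wait_time(self) -> float:
        """Seconds until the next call is allowed (0.0 if allowed now)."""
        now = self._clock()
        # Drop timestamps that have left the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        if len(self._calls) < self.max_calls:
            return 0.0
        return self.period - (now - self._calls[0])

    def record(self) -> None:
        """Note that a call is being made now."""
        self._calls.append(self._clock())

    def acquire(self) -> None:
        """Block until a call is allowed, then record it."""
        delay = self.wait_time()
        if delay > 0:
            time.sleep(delay)
        self.record()
```

Calling `limiter.acquire()` before each Gemini request keeps a single process under the per-minute cap; it does not track the daily quotas.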


🎨 UI Features (2025 Design Trends)

  • Glassmorphism - Frosted glass effects with backdrop blur
  • 🌈 Animated Gradients - Smooth color transitions
  • 📊 Bento Grid Layout - Modern card-based design
  • 🎯 Micro-interactions - Smooth hover animations
  • 📱 Responsive Design - Works on all devices
  • 💫 Loading States - Visual feedback for all actions

🚀 Upcoming Features

| Feature | Status |
| --- | --- |
| 📊 Advanced Analytics Dashboard | Planned |
| 📁 Batch Processing (Multiple PDFs) | Planned |
| 🔍 Smart Semantic Search | Planned |
| 💬 Chat with PDF (RAG) | Planned |
| 🌍 Multi-language Support (50+ languages) | Planned |
| 🔗 RESTful API | Planned |

🐛 Troubleshooting

API Key Error?

  • Ensure the .env file exists in the project root
  • Check that it contains GOOGLE_API_KEY set to your real key, not the placeholder

PDF Upload Error?

  • Maximum file size: 10MB
  • Ensure valid PDF format
  • Try a different PDF
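
The upload checks above can be expressed as a small validator that runs before extraction. This is a sketch under stated assumptions: the function name is hypothetical, 10MB is taken to mean 10 * 1024 * 1024 bytes, and the format check only looks for the `%PDF-` marker in the first 1024 bytes rather than fully parsing the file.

```python
MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # assumed interpretation of the 10MB limit


def validate_pdf_upload(data: bytes, max_bytes: int = MAX_UPLOAD_BYTES) -> None:
    """Raise ValueError with a user-facing message if the upload is unusable."""
    if not data:
        raise ValueError("The uploaded file is empty.")
    if len(data) > max_bytes:
        raise ValueError(
            f"The file is {len(data)} bytes; the limit is {max_bytes} bytes."
        )
    # Lenient header check: a stricter variant would require the file to
    # start with the marker.
    if b"%PDF-" not in data[:1024]:
        raise ValueError("The file does not look like a PDF (missing %PDF- header).")
```

A file can pass these checks and still fail to parse, so the extraction call itself should also be wrapped in error handling.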

Module Not Found?

  • Ensure virtual environment is activated
  • Run: pip install -r requirements.txt

📝 License

MIT License - feel free to use for personal or commercial projects!


👨‍💻 Author

Your Name
Built for AI Agent role applications | Portfolio Project

Connect with me:


🙏 Acknowledgments

  • Google Gemini AI for providing free API access
  • Streamlit for the amazing web framework
  • PyPDF2 for reliable PDF extraction

📈 Project Status

Status Maintenance

⭐ Star this repo if you find it helpful!


Last updated: October 2025
