Welcome to the Multi-Site PDF Scraper documentation! This index points you to the right document for your role and needs.
- Main README - Project overview and quick start
- Deployment Guide - Production deployment instructions
- Developer Guide - Development environment setup
- Deployment Guide - How to deploy the application
- Common Operations Runbook - Day-to-day usage
- Backend Migration Guide - Switching between backends
- Troubleshooting - Solutions to common issues
- Developer Guide - Architecture and development setup
- Example Scraper Walkthrough - Creating new scrapers
- Backend Developer Guide - Adding new backends
- Error Handling - Exception patterns and logging
- Logging and Error Standards - Logging best practices
- Configuration & Services - Service architecture
- Deployment Guide - Production deployment
- Common Operations Runbook - Operational procedures
- Migration & State Repair - State management
- Secrets Rotation - Security maintenance
- CONTRIBUTING.md - How to contribute
- Developer Guide - Development guidelines
- Example Scraper Walkthrough - Adding scrapers
docs/
├── README.md # This file - documentation index
├── CHANGELOG.md # Version history and release notes
├── TODO.md # Roadmap and future plans
│
├── development/ # Developer guides and references
│ ├── DEVELOPER_GUIDE.md # Core development guide
│ ├── BACKEND_DEVELOPER_GUIDE.md # Backend development
│ ├── EXAMPLE_SCRAPER_WALKTHROUGH.md # Step-by-step scraper creation
│ ├── CONFIG_AND_SERVICES.md # Configuration architecture
│ ├── ERROR_HANDLING.md # Error patterns and exceptions
│ └── LOGGING_AND_ERROR_STANDARDS.md # Logging best practices
│
├── operations/ # Deployment and operations
│ ├── DEPLOYMENT_GUIDE.md # Production deployment
│ ├── RUNBOOK_COMMON_OPERATIONS.md # Day-to-day operations
│ ├── BACKEND_MIGRATION_GUIDE.md # Backend switching guide
│ ├── MIGRATION_AND_STATE_REPAIR.md # State management
│ ├── SECRETS_ROTATION.md # Security procedures
│ └── troubleshooting/ # Troubleshooting guides
│   └── ragflow_scraper_audit.md # RAGFlow debugging
│
├── reference/ # Technical specifications
│ └── METADATA_SCHEMA.md # Document metadata format
│
├── archive/ # Historical documentation
│ ├── README.md # Archive index
│ ├── plans/ # Historical planning docs
│ └── jules/ # Design explorations
│
└── screenshots/ # Application screenshots
  └── current.png
- Developer Guide - Architecture - System overview
- Config & Services - Service layer design
- Backend Developer Guide - Backend architecture
- Config & Services - Configuration system
- Deployment Guide - Environment Variables
- Backend Migration Guide - Backend configuration
- Example Scraper Walkthrough - Creating scrapers
- Developer Guide - Scrapers - Scraper architecture
- Error Handling - Error patterns
- Common Operations Runbook - Daily operations
- Deployment Guide - Deployment procedures
- Migration & State Repair - State management
- Secrets Rotation - Security maintenance
- Backend Developer Guide - Creating backends
- Backend Migration Guide - Using different backends
- Config & Services - Backend integration
- Developer Guide - Testing - Test strategy
- Main README - Running Tests - Test commands
- SECURITY.md - Security policy and reporting
- Secrets Rotation - Credential management
- Deployment Guide - Security - Security best practices
- Common Operations Runbook
- Troubleshooting Directory - Specific issues
- Migration & State Repair - Recovery procedures
When adding or updating documentation:
- Placement: Choose the appropriate directory (operations, development, reference)
- Linking: Update this index when adding new documents
- Format: Use Markdown with clear headings and examples
- Maintenance: Update last-modified dates when making significant changes
- Audience: Write for the intended audience (users, developers, operators)
- Use clear, concise language
- Include code examples where helpful
- Provide both quick reference and detailed explanations
- Link to related documentation
- Keep documents focused on a single topic
- Use consistent formatting and structure
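The "Linking" rule above (update this index when adding new documents) can be checked mechanically. The following is a minimal sketch, not part of the project: the helper name `find_unindexed` and the file `NEW_GUIDE.md` are hypothetical, and it assumes documents are referred to in the index by their `.md` file names.

```python
import re


def find_unindexed(doc_names, index_text):
    """Return the document file names that the index text never mentions."""
    # Collect every token in the index that looks like a Markdown file name.
    mentioned = set(re.findall(r"[\w.-]+\.md", index_text))
    return sorted(name for name in doc_names if name not in mentioned)


# A small excerpt of the directory listing from this index.
index_text = """
├── development/
│ ├── DEVELOPER_GUIDE.md # Core development guide
│ └── ERROR_HANDLING.md # Error patterns and exceptions
"""

# NEW_GUIDE.md is a hypothetical document that was added without updating the index.
docs_on_disk = ["DEVELOPER_GUIDE.md", "ERROR_HANDLING.md", "NEW_GUIDE.md"]

missing = find_unindexed(docs_on_disk, index_text)
print(missing)  # ['NEW_GUIDE.md']
```

In a real checkout, the list of file names could be gathered with something like `[p.name for p in pathlib.Path("docs").rglob("*.md")]` and the index text read from `docs/README.md`; both paths are assumptions based on the directory listing in this index.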
Can't find what you're looking for?
- Search: Use GitHub's search to find keywords
- Issues: Check existing issues for discussions
- Create Issue: Open a new issue if documentation is missing or unclear
- Contribute: Submit a PR to improve documentation
See CHANGELOG.md for recent documentation changes and project updates.
Historical planning documents and implementation notes are preserved in archive/. These may be outdated but are kept for historical context.
Need something specific? Use the navigation above or the search function to find what you need. If documentation is missing or unclear, please open an issue!