Selenium Automated Full-page Screenshot

A robust Python automation tool that captures full-page screenshots of web pages, with special handling for dynamic content and lazy-loaded elements. Built with Selenium WebDriver and optimized for reliability.

Key Features

Reliable Full-page Screenshots
- Handles dynamic content and lazy-loaded images
- Optimized layout calculations for accurate dimensions
- Multiple fallback capture methods
- Smart scroll handling for content loading
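
The scroll handling mentioned above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `scroll_to_load` and its `pause` and `max_rounds` parameters are hypothetical, and `driver` is assumed to be any object exposing Selenium's `execute_script` method.

```python
import time


def scroll_to_load(driver, pause=0.5, max_rounds=20):
    """Scroll to the bottom repeatedly until the page height stops growing.

    Hypothetical helper: `driver` only needs an `execute_script` method,
    as Selenium WebDriver instances have. Returns the last measured height.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for _ in range(max_rounds):
        # Jump to the current bottom so lazy-loaded content gets requested.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # crude fixed wait; an explicit wait may be more robust
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # height is stable, so assume loading has finished
        last_height = new_height
    # Return to the top so a capture starts from the first pixel.
    driver.execute_script("window.scrollTo(0, 0);")
    return last_height
```

The `max_rounds` cap is there so that pages with infinite scroll cannot keep the loop running forever.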

Performance Optimizations
- Headless Chrome with optimized flags
- Network and resource handling optimizations
- Efficient memory management
- Smart timeouts and wait conditions
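
As an illustration of what a headless Chrome setup with tuned flags might look like, here is a hedged sketch. The exact flags this project uses are not listed in this README, so the set below is an assumption made of commonly used Chrome switches; `headless_chrome_flags` and `build_driver` are hypothetical names, and the 60-second timeout is an arbitrary example value.

```python
def headless_chrome_flags(width=1920, height=1080):
    """Return a list of Chrome command-line flags (assumed, not the project's list)."""
    return [
        "--headless=new",            # headless mode in recent Chrome versions
        "--disable-gpu",
        "--no-sandbox",              # often needed in containers; weakens isolation
        "--disable-dev-shm-usage",   # avoid a small /dev/shm in Docker
        "--hide-scrollbars",         # keep scrollbars out of the image
        f"--window-size={width},{height}",
    ]


def build_driver(chrome_path=None):
    """Create a Chrome WebDriver using the flags above (requires selenium)."""
    from selenium import webdriver  # imported lazily so the flag helper has no dependency

    options = webdriver.ChromeOptions()
    for flag in headless_chrome_flags():
        options.add_argument(flag)
    if chrome_path:
        # Point Selenium at a specific Chrome binary, e.g. the CHROME_PATH setting.
        options.binary_location = chrome_path
    driver = webdriver.Chrome(options=options)
    driver.set_page_load_timeout(60)  # fail instead of hanging on slow pages
    return driver
```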

Installation

Prerequisites

Required:
- Python 3.7+
- Google Chrome browser

Clone and Setup

git clone <repository_url>
cd <repository_directory>
pip install -r requirements.txt

Configure environment variables:

Create a .env file in the project's root directory and add the following variables:
```
FOLDER_ID=<your_google_drive_folder_id>
SPREADSHEET_ID=<your_google_sheet_id>
URL_RANGE=<range_containing_urls>  # e.g., 'Sheet1!A2:A'
CHROME_PATH=<optional_path_to_chrome_executable>
COOKIES_PATH=cookies.json # Path to cookies.json
```
Replace the placeholders with your actual values. CHROME_PATH is optional: set it to point the script at a specific Chrome installation. If it is not set, the script attempts to auto-detect Chrome.
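
To show how these variables might be consumed, here is a minimal sketch that reads them from a mapping such as `os.environ`. The `Settings` class and `load_settings` function are hypothetical names, and two details are assumptions: that FOLDER_ID, SPREADSHEET_ID and URL_RANGE are all required, and that COOKIES_PATH falls back to `cookies.json` when unset.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    folder_id: str
    spreadsheet_id: str
    url_range: str
    chrome_path: Optional[str]
    cookies_path: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment-style key/value pairs."""
    required = ("FOLDER_ID", "SPREADSHEET_ID", "URL_RANGE")  # assumed to be mandatory
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise ValueError("Missing required settings: " + ", ".join(missing))
    return Settings(
        folder_id=env["FOLDER_ID"],
        spreadsheet_id=env["SPREADSHEET_ID"],
        url_range=env["URL_RANGE"],
        chrome_path=env.get("CHROME_PATH") or None,  # optional; None means auto-detect
        cookies_path=env.get("COOKIES_PATH", "cookies.json"),  # assumed default
    )
```

The values in `.env` must be loaded into the process environment first, for example with the python-dotenv package's `load_dotenv()`; whether this project uses that package is an assumption.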

Place cookies.json:

If you need to access websites that require login, create a cookies.json file in the project's root directory containing the necessary cookies. The format of this file should be a JSON array of cookie objects.

How to get cookies.json:
1. Install a cookie editor extension in your Chrome browser (e.g., "Cookie Editor").
2. Log in to the website you need to take screenshots of.
3. Open the cookie editor extension.
4. Export the cookies in JSON format and save the file as cookies.json in the project's root directory.
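
Exported cookie files often use field names that differ slightly from what Selenium's `add_cookie` expects. The sketch below shows one way to read the JSON array and normalize it. It assumes an export format with `expirationDate` and lowercase `sameSite` values such as `no_restriction`; the actual format depends on the extension used, and `load_cookies` is a hypothetical helper, not part of this project.

```python
import json

# Map assumed export values to the values WebDriver accepts for sameSite.
_SAME_SITE = {"strict": "Strict", "lax": "Lax", "no_restriction": "None", "none": "None"}


def load_cookies(path):
    """Read a JSON array of cookie objects and keep only WebDriver-style keys."""
    with open(path, encoding="utf-8") as fh:
        raw = json.load(fh)
    if not isinstance(raw, list):
        raise ValueError("cookies.json must contain a JSON array of cookie objects")
    cookies = []
    for item in raw:
        cookie = {"name": item["name"], "value": item["value"]}
        for key in ("domain", "path", "secure", "httpOnly"):
            if key in item:
                cookie[key] = item[key]
        if "expirationDate" in item:
            # WebDriver expects an integer `expiry` in seconds since the epoch.
            cookie["expiry"] = int(item["expirationDate"])
        same_site = item.get("sameSite")
        if isinstance(same_site, str) and same_site.lower() in _SAME_SITE:
            cookie["sameSite"] = _SAME_SITE[same_site.lower()]
        cookies.append(cookie)
    return cookies
```

With Selenium, each resulting dictionary can be passed to `driver.add_cookie(cookie)` after the browser has first navigated to a page on the cookie's domain.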

Place credentials.json:

You need to set up a Google Cloud project to obtain the credentials.json file.

How to get credentials.json:
1. Go to Google Cloud Console.
2. Create a new project or select an existing one.
3. Enable the Google Drive API and Google Sheets API for your project.
4. In the left panel, open "APIs & Services" if you are not already there, then select "Credentials" in the left sidebar.
5. Click "Create Credentials" and choose "Service account".
6. Give your service account a name and click "Create and Continue".
7. Grant your service account the "Editor" role under "Grant this service account access to project (optional)" and click "Continue".
8. Click "Done".
9. Click on the service account email address you just created.
10. Go to the "Keys" tab.
11. Click "Add Key" and choose "Create new key".
12. Select JSON as the key type and click "Create".
13. Download the credentials.json file and place it in the project's root directory.

Share Google Drive Folder and Google Sheet:
- Share the Google Drive folder with the service account email address (found in credentials.json). Give the service account "Editor" access.
- Share the Google Sheet with the service account email address. Give the service account "Editor" access.
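
To find the address to share with, you can read it from the key file rather than opening it by hand. A small sketch, assuming the file is a standard service account key (which stores the address under `client_email`); `service_account_email` is a hypothetical helper name.

```python
import json


def service_account_email(path="credentials.json"):
    """Return the service account's email address from a JSON key file."""
    with open(path, encoding="utf-8") as fh:
        info = json.load(fh)
    if info.get("type") != "service_account":
        raise ValueError("Expected a service account key file")
    return info["client_email"]
```

Running `print(service_account_email())` from the project's root directory prints the address to enter in the sharing dialogs of the Drive folder and the Sheet.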

Project Structure

└── 📁Selenium-Automated-Fullpage-Screenshot
    └── 📁screenshots
    └── 📁utils
        └── __init__.py
        └── gdrive_utils.py
        └── gsheet_utils.py
        └── selenium_utils.py
        └── url_tracker.py
    └── .env
    └── .env.example
    └── .gitignore
    └── ARCHITECTURE.md
    └── cookies.json
    └── credentials.json
    └── error_logs.txt
    └── LICENSE.md
    └── main.py
    └── README.md
    └── requirements.txt
    └── STRUCTURE.md

License

This project is licensed under the MIT License; see the LICENSE.md file for details.
