A robust Python automation tool that captures full-page screenshots of web pages, with special handling for dynamic content and lazy-loaded elements. Built with Selenium WebDriver and optimized for reliability.
Reliable Full-page Screenshots
- Handles dynamic content and lazy-loaded images
- Optimized layout calculations for accurate dimensions
- Multiple fallback capture methods
- Smart scroll handling for content loading
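The scroll handling mentioned above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `scroll_offsets` and `scroll_to_load` are hypothetical names, and the pause length and round limit are assumed values. The `driver` argument is expected to behave like a Selenium WebDriver (only `execute_script` is used).

```python
import time


def scroll_offsets(total_height, viewport_height):
    """Return the y offsets needed to step through a page one viewport at a time."""
    if total_height <= 0 or viewport_height <= 0:
        return [0]
    return list(range(0, total_height, viewport_height))


def scroll_to_load(driver, pause=0.5, max_rounds=20):
    """Scroll through the page until its height stops growing.

    Returns the final document height in pixels. `pause` and `max_rounds`
    are illustrative defaults, not values taken from the project.
    """
    height = driver.execute_script("return document.documentElement.scrollHeight")
    for _ in range(max_rounds):
        viewport = driver.execute_script("return window.innerHeight")
        for y in scroll_offsets(height, viewport):
            driver.execute_script("window.scrollTo(0, arguments[0]);", y)
            time.sleep(pause)  # give lazy-loaded content time to load
        new_height = driver.execute_script(
            "return document.documentElement.scrollHeight"
        )
        if new_height == height:  # nothing new appeared, so the layout is stable
            break
        height = new_height
    driver.execute_script("window.scrollTo(0, 0);")  # back to the top before capture
    return height
```

Repeating the pass until the height is stable is one way to cope with pages that append content as you scroll; the round limit keeps infinite-scroll pages from looping forever.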
Performance Optimizations
- Headless Chrome with optimized flags
- Network and resource handling optimizations
- Efficient memory management
- Smart timeouts and wait conditions
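As an illustration of what "headless Chrome with optimized flags" might look like, here is a hedged sketch. The specific flag set, the window size, and the 60-second timeout are assumptions chosen for the example; the project's real configuration may differ. `build_chrome_flags` and `make_driver` are hypothetical helper names.

```python
def build_chrome_flags(width=1920, height=1080, headless=True):
    """Return a list of Chrome command-line switches (an assumed, typical set)."""
    flags = [
        f"--window-size={width},{height}",
        "--disable-gpu",
        "--no-sandbox",             # often needed in containers; weakens isolation
        "--disable-dev-shm-usage",  # avoids crashes when /dev/shm is small
        "--hide-scrollbars",        # keeps scrollbars out of the screenshot
    ]
    if headless:
        flags.insert(0, "--headless=new")  # requires a recent Chrome (109+)
    return flags


def make_driver(chrome_path=None, page_load_timeout=60):
    """Create a Chrome WebDriver using the flags above."""
    # Imported lazily so build_chrome_flags works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for flag in build_chrome_flags():
        options.add_argument(flag)
    if chrome_path:
        options.binary_location = chrome_path  # use a specific Chrome install
    driver = webdriver.Chrome(options=options)
    driver.set_page_load_timeout(page_load_timeout)
    return driver
```

Keeping the flag list in its own function makes it easy to inspect or adjust without launching a browser.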
Prerequisites
Required:
- Python 3.7+
- Google Chrome browser

Clone and Setup
    git clone <repository_url>
    cd <repository_directory>
    pip install -r requirements.txt

Configure environment variables:
Create a .env file in the project's root directory and add the following variables:

    FOLDER_ID=<your_google_drive_folder_id>
    SPREADSHEET_ID=<your_google_sheet_id>
    URL_RANGE=<range_containing_urls>  # e.g., 'Sheet1!A2:A'
    CHROME_PATH=<optional_path_to_chrome_executable>
    COOKIES_PATH=cookies.json  # Path to cookies.json

Replace the placeholders with your actual values. CHROME_PATH is optional; set it to use a specific Chrome installation. If it is not provided, the script attempts to auto-detect Chrome.

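To make the role of each variable concrete, the following sketch reads them from the process environment and validates them. It assumes the .env file has already been loaded into the environment (for example by a dotenv-style loader) and that FOLDER_ID, SPREADSHEET_ID, and URL_RANGE are mandatory; `Settings` and `load_settings` are hypothetical names, not part of the project.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    folder_id: str
    spreadsheet_id: str
    url_range: str              # A1 notation, e.g. 'Sheet1!A2:A'
    chrome_path: Optional[str]  # None means "auto-detect Chrome"
    cookies_path: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, failing early on gaps."""
    required = ("FOLDER_ID", "SPREADSHEET_ID", "URL_RANGE")  # assumed required set
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise ValueError("Missing required variables: " + ", ".join(missing))
    return Settings(
        folder_id=env["FOLDER_ID"],
        spreadsheet_id=env["SPREADSHEET_ID"],
        url_range=env["URL_RANGE"],
        chrome_path=env.get("CHROME_PATH") or None,
        cookies_path=env.get("COOKIES_PATH") or "cookies.json",  # assumed default
    )
```

Failing at startup with a list of missing names is usually easier to debug than a Google API error later in the run.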
Place cookies.json:
If you need to access websites that require login, create a cookies.json file in the project's root directory containing the necessary cookies. The file should contain a JSON array of cookie objects.

How to get cookies.json:
- Install a cookie editor extension in your Chrome browser (e.g., "Cookie Editor").
- Log in to the website you need to take screenshots of.
- Open the cookie editor extension.
- Export the cookies in JSON format and save the file as cookies.json in the project's root directory.

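Exported cookies often carry fields that Selenium's `add_cookie` does not accept as-is. The sketch below shows one way to normalize them before loading. The export field names handled here (`expirationDate`, `sameSite` values such as `no_restriction`) are assumptions about what a cookie editor extension produces, and `normalize_cookie` and `load_cookies` are hypothetical helpers.

```python
import json

# Map assumed export values to the values WebDriver accepts.
_SAME_SITE = {"strict": "Strict", "lax": "Lax", "none": "None", "no_restriction": "None"}
_ALLOWED = ("name", "value", "domain", "path", "secure", "httpOnly")


def normalize_cookie(raw):
    """Keep only the keys Selenium understands and fix common export quirks."""
    cookie = {key: raw[key] for key in _ALLOWED if key in raw}
    expiry = raw.get("expiry", raw.get("expirationDate"))
    if expiry is not None:
        cookie["expiry"] = int(expiry)  # WebDriver expects whole seconds
    same_site = raw.get("sameSite")
    if isinstance(same_site, str) and same_site.lower() in _SAME_SITE:
        cookie["sameSite"] = _SAME_SITE[same_site.lower()]
    return cookie


def load_cookies(driver, path="cookies.json"):
    """Add every cookie from the JSON array at `path`; return how many were added."""
    with open(path, encoding="utf-8") as fh:
        cookies = json.load(fh)
    if not isinstance(cookies, list):
        raise ValueError("cookies.json must contain a JSON array of cookie objects")
    for raw in cookies:
        driver.add_cookie(normalize_cookie(raw))
    return len(cookies)
```

Note that Selenium only accepts a cookie while the browser is on a page of that cookie's domain, so call `driver.get(...)` for the target site before `load_cookies`, then reload the page.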
Place credentials.json:
You need to set up a Google Cloud project to get the credentials.json file.

How to get credentials.json:
- Go to Google Cloud Console.
- Create a new project or select an existing one.
- Enable the Google Drive API and Google Sheets API for your project.
- If you are not already in the "APIs & Services" section, open it from the left panel, then go to "Credentials" in the left sidebar.
- Click "Create Credentials" and choose "Service account".
- Give your service account a name and click "Create and Continue".
- Grant your service account the "Editor" role under "Grant this service account access to project (optional)" and click "Continue".
- Click "Done".
- Click on the service account email address you just created.
- Go to the "Keys" tab.
- Click "Add Key" and choose "Create new key".
- Select JSON as the key type and click "Create".
- Save the downloaded key file as credentials.json in the project's root directory.

Share Google Drive Folder and Google Sheet:
- Share the Google Drive folder with the service account email address (found in credentials.json). Give the service account "Editor" access.
- Share the Google Sheet with the service account email address. Give the service account "Editor" access.

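The email address to share with is stored in the `client_email` field of the service account key file. A small sketch for printing it, assuming the key is saved as credentials.json (`service_account_email` is a hypothetical helper, not part of the project):

```python
import json


def service_account_email(path="credentials.json"):
    """Return the client_email from a Google service account key file."""
    with open(path, encoding="utf-8") as fh:
        info = json.load(fh)
    if info.get("type") != "service_account":
        raise ValueError(f"{path} does not look like a service account key")
    email = info.get("client_email")
    if not email:
        raise ValueError(f"{path} has no client_email field")
    return email


if __name__ == "__main__":
    # Share the Drive folder and the Sheet with this address.
    print(service_account_email())
```
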
└── 📁Selenium-Automated-Fullpage-Screenshot
    └── 📁screenshots
    └── 📁utils
        └── __init__.py
        └── gdrive_utils.py
        └── gsheet_utils.py
        └── selenium_utils.py
        └── url_tracker.py
    └── .env
    └── .env.example
    └── .gitignore
    └── ARCHITECTURE.md
    └── cookies.json
    └── credentials.json
    └── error_logs.txt
    └── LICENSE.md
    └── main.py
    └── README.md
    └── requirements.txt
    └── STRUCTURE.md

This project is licensed under the MIT License. See LICENSE.md for details.