A Chrome extension that allows users to capture screenshots of web pages and analyze them using Google's Gemini AI. This tool provides both full-page screenshots and the ability to select specific areas of a page for analysis.
- Full-page Screenshots: Capture the entire visible portion of a webpage
- Area Selection: Select and screenshot specific areas of a webpage
- Gemini AI Integration: Send screenshots to Gemini AI for detailed image analysis
- Convenient Side Panel: All controls and previews in an easy-to-use side panel
This project uses GitHub Actions to automate the release process. Run the release command and pick a version bump type:
bun run release # interactive prompt to pick next version
bun run release patch # bump patch version (1.0.0 → 1.0.1)
bun run release minor # bump minor version (1.0.0 → 1.1.0)
bun run release major # bump major version (1.0.0 → 2.0.0)
bun run release 1.2.3 # set an explicit version
This will:
- Update the version in manifest.json
- Commit the changes
- Create and push a new tag (e.g., v1.0.1)
- Trigger the GitHub Actions workflow to create a release
The release will be automatically created on GitHub with the packaged extension attached.
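The release script itself is not shown here, so the following is only a minimal sketch of how the version bump types above could map to a new version and tag. The function names `nextVersion` and `tagFor` are hypothetical and are not taken from this project.

```javascript
// Hypothetical helper: compute the next version from the current one.
// `bump` is "patch", "minor", "major", or an explicit "x.y.z" string.
function nextVersion(current, bump) {
  const semver = /^(\d+)\.(\d+)\.(\d+)$/;
  const match = semver.exec(current);
  if (!match) throw new Error(`Invalid current version: ${current}`);
  const [major, minor, patch] = match.slice(1).map(Number);

  switch (bump) {
    case "major":
      return `${major + 1}.0.0`;
    case "minor":
      return `${major}.${minor + 1}.0`;
    case "patch":
      return `${major}.${minor}.${patch + 1}`;
    default:
      // Anything else must be an explicit version such as "1.2.3".
      if (semver.test(bump)) return bump;
      throw new Error(`Unknown bump type: ${bump}`);
  }
}

// The pushed tag is the version prefixed with "v" (e.g., v1.0.1).
function tagFor(version) {
  return `v${version}`;
}
```

For example, `tagFor(nextVersion("1.0.0", "patch"))` gives the `v1.0.1` tag mentioned above.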
This project was developed with assistance from GitHub Copilot, which contributed approximately 75-80% of the code. The AI helped generate the core functionality, while human input was essential for fine-tuning the user experience and integrating with Chrome extension APIs. The extension demonstrates practical application of Google's Gemini AI API for image analysis within a browser context.
- Clone this repository or download the source code: run git clone <repository-url>, or download and extract the ZIP file
- Open Chrome and navigate to chrome://extensions/
- Enable "Developer mode" by toggling the switch in the top right corner
- Click "Load unpacked" and select the directory containing the extension files
- The extension should now appear in your Chrome toolbar
- Obtain a Gemini API key from Google AI Studio
- In the extension's side panel, click the ⚙️ (Settings) icon
- Enter your API key in the provided field and click "Save"
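As a rough illustration of what the "Save" step might do, the sketch below stores and reads the key through a storage object passed in by the caller. It assumes the extension uses `chrome.storage.local` (which needs the `storage` permission in manifest.json); the entry name `geminiApiKey` and both function names are hypothetical.

```javascript
// Assumption: the key is kept under a hypothetical "geminiApiKey" entry.
const API_KEY_ENTRY = "geminiApiKey";

// `storage` is any object with promise-based get/set methods,
// for example chrome.storage.local in a Manifest V3 extension.
async function saveApiKey(storage, rawKey) {
  const key = String(rawKey).trim();
  if (key === "") throw new Error("API key must not be empty");
  await storage.set({ [API_KEY_ENTRY]: key });
  return key;
}

// Resolves to the stored key, or null if none has been saved yet.
async function loadApiKey(storage) {
  const items = await storage.get(API_KEY_ENTRY);
  return items[API_KEY_ENTRY] ?? null;
}
```

Inside the extension this could be called as `await saveApiKey(chrome.storage.local, inputValue)`, where `inputValue` is the text from the settings field. Passing the storage object in keeps the helpers easy to exercise outside the browser.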
- Click the extension icon in your toolbar to open the side panel
- To take a full-page screenshot, click "Take Screenshot"
- To capture a specific area, click "Select Area" and drag to create a selection
- After capturing a screenshot, click "Send to Gemini AI" to analyze the image
- The AI's analysis will appear below the screenshot in the side panel
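The drag selection has to be turned into a crop rectangle before the captured image can be trimmed. The sketch below shows one way this might be done; it is not taken from the extension's source, and `selectionToCropRect` is a hypothetical name. It assumes the captured image is in device pixels, so CSS coordinates are multiplied by the device pixel ratio.

```javascript
// Turn a drag gesture into a crop rectangle in image pixels.
// `start` and `end` are {x, y} points in CSS pixels relative to the viewport.
// `scale` is the device pixel ratio (window.devicePixelRatio in a page).
function selectionToCropRect(start, end, scale = 1) {
  // The user may drag in any direction, so take the min corner and abs size.
  const left = Math.min(start.x, end.x);
  const top = Math.min(start.y, end.y);
  const width = Math.abs(end.x - start.x);
  const height = Math.abs(end.y - start.y);
  return {
    x: Math.round(left * scale),
    y: Math.round(top * scale),
    width: Math.round(width * scale),
    height: Math.round(height * scale),
  };
}
```

The resulting rectangle could then be passed as the source region to a canvas `drawImage` call to produce the cropped screenshot.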
- JavaScript
- Chrome Extension APIs
- Google Gemini AI API
- Google Chrome browser
- Gemini API key
This extension processes screenshots locally and only sends them to Google's Gemini API when you explicitly click the "Send to Gemini AI" button. Your API key is stored in your browser's local storage and is never sent to any server except Google's API endpoints.
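To make the data flow concrete, here is a hedged sketch of what a request to the Gemini API could look like. It is not the extension's actual code: the function names, the model name `gemini-1.5-flash`, and the prompt are assumptions, and the screenshot is assumed to be a base64 data URL. The API key travels only in a request header to Google's endpoint.

```javascript
// Assumption: placeholder model name; the extension may use a different one.
const GEMINI_MODEL = "gemini-1.5-flash";

// Build the URL and fetch options without sending anything yet.
function buildGeminiRequest(apiKey, screenshotDataUrl, prompt) {
  const match = /^data:(image\/[a-z0-9.+-]+);base64,(.+)$/i.exec(screenshotDataUrl);
  if (!match) throw new Error("Expected a base64 image data URL");
  const [, mimeType, base64Data] = match;
  return {
    url: `https://generativelanguage.googleapis.com/v1beta/models/${GEMINI_MODEL}:generateContent`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json", "x-goog-api-key": apiKey },
      body: JSON.stringify({
        contents: [{
          parts: [
            { text: prompt },
            { inline_data: { mime_type: mimeType, data: base64Data } },
          ],
        }],
      }),
    },
  };
}

// Send the screenshot and return the first text part of the reply.
async function analyzeScreenshot(apiKey, screenshotDataUrl, prompt) {
  const { url, options } = buildGeminiRequest(apiKey, screenshotDataUrl, prompt);
  const response = await fetch(url, options);
  if (!response.ok) throw new Error(`Gemini request failed: ${response.status}`);
  const result = await response.json();
  return result.candidates?.[0]?.content?.parts?.[0]?.text ?? "";
}
```

Separating request construction from the network call means nothing leaves the browser until `analyzeScreenshot` is invoked, which matches the explicit-click behavior described above.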