AmzPy is a lightweight Python library for scraping product information from Amazon. It provides a simple interface for fetching product details such as title, price, currency, and image URL, and it handles Amazon's anti-bot measures automatically using curl_cffi with browser impersonation.

Features:
- Easy-to-use API for scraping Amazon product data
- Supports multiple Amazon domains (.com, .in, .co.uk, etc.)
- Enhanced anti-bot protection using curl_cffi with browser impersonation
- Automatic retries on detection with intelligent delay management
- Support for proxies to distribute requests
- Dynamic configuration options
- Extract color variants, discounts, delivery information, and more
- Clean and typed Python interface
Install using pip:

```bash
pip install amzpy
```

Basic usage:

```python
from amzpy import AmazonScraper
# Create scraper with default settings (amazon.com)
scraper = AmazonScraper()
# Fetch product details
url = "https://www.amazon.com/dp/B0D4J2QDVY"
product = scraper.get_product_details(url)
if product:
print(f"Title: {product['title']}")
print(f"Price: {product['currency']}{product['price']}")
print(f"Brand: {product['brand']}")
print(f"Rating: {product['rating']}")
print(f"Image URL: {product['img_url']}")from amzpy import AmazonScraper
# Create scraper for a specific Amazon domain
scraper = AmazonScraper(country_code="in")
# Search by query - get up to 2 pages of results
products = scraper.search_products(query="wireless earbuds", max_pages=2)
# Display the results
for i, product in enumerate(products[:5], 1):
print(f"{i}. {product['title']} - {product['currency']}{product['price']}")AmzPy offers flexible configuration options that can be set in multiple ways:
# Method 1: Set at initialization
scraper = AmazonScraper(
country_code="in",
impersonate="chrome119",
proxies={"http": "http://user:pass@proxy.example.com:8080"}
)
# Method 2: Using string-based configuration
scraper.config('MAX_RETRIES = 5, REQUEST_TIMEOUT = 30, DELAY_BETWEEN_REQUESTS = (3, 8)')
# Method 3: Using keyword arguments
scraper.config(MAX_RETRIES=4, DEFAULT_IMPERSONATE="safari15")
```

The search functionality can extract rich product data, including:

```python
# Search for products with 5 pages of results
products = scraper.search_products(query="men sneakers size 9", max_pages=5)
# Or search using a pre-constructed URL (e.g., filtered searches)
url = "https://www.amazon.in/s?i=shoes&rh=n%3A1983518031&s=popularity-rank"
products = scraper.search_products(search_url=url, max_pages=3)
# Access comprehensive product data
for product in products:
    # Basic information
    print(f"Title: {product.get('title')}")
    print(f"ASIN: {product.get('asin')}")
    print(f"URL: https://www.amazon.{scraper.country_code}/dp/{product.get('asin')}")
    print(f"Brand: {product.get('brand')}")
    print(f"Price: {product.get('currency')}{product.get('price')}")

    # Discount information
    if 'original_price' in product:
        print(f"Original Price: {product.get('currency')}{product.get('original_price')}")
        print(f"Discount: {product.get('discount_percent')}% off")

    # Ratings and reviews
    print(f"Rating: {product.get('rating')} / 5.0 ({product.get('reviews_count')} reviews)")

    # Color variants
    if 'color_variants' in product:
        print(f"Available in {len(product['color_variants'])} colors")
        for variant in product['color_variants']:
            print(f"  - {variant['name']}: https://www.amazon.{scraper.country_code}/dp/{variant['asin']}")

    # Additional information
    print(f"Prime Eligible: {'Yes' if product.get('prime') else 'No'}")
    if 'delivery_info' in product:
        print(f"Delivery: {product.get('delivery_info')}")
    if 'badge' in product:
        print(f"Badge: {product.get('badge')}")
```
# HTTP/HTTPS proxies
proxies = {
"http": "http://user:pass@proxy.example.com:8080",
"https": "http://user:pass@proxy.example.com:8080"
}
# SOCKS5 proxies
proxies = {
"http": "socks5://user:pass@proxy.example.com:1080",
"https": "socks5://user:pass@proxy.example.com:1080"
}
scraper = AmazonScraper(proxies=proxies)
```
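AmzPy takes one proxy configuration per scraper, so rotating through a pool is something you would handle in your own code. A minimal sketch with a hypothetical pool, using only the documented constructor:

```python
from amzpy import AmazonScraper

# Hypothetical proxy pool for illustration
proxy_pool = [
    {"http": "http://user:pass@proxy1.example.com:8080",
     "https": "http://user:pass@proxy1.example.com:8080"},
    {"http": "http://user:pass@proxy2.example.com:8080",
     "https": "http://user:pass@proxy2.example.com:8080"},
]

for proxies in proxy_pool:
    scraper = AmazonScraper(proxies=proxies)
    product = scraper.get_product_details("https://www.amazon.com/dp/B0D4J2QDVY")
    if product:
        break  # stop at the first proxy that succeeds
```

AmzPy uses curl_cffi's browser impersonation to mimic real browser requests, significantly improving success rates when scraping Amazon:

```python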
# Specify a particular browser to impersonate
scraper = AmazonScraper(impersonate="chrome119") # Chrome 119
scraper = AmazonScraper(impersonate="safari15") # Safari 15
scraper = AmazonScraper(impersonate="firefox115") # Firefox 115These configuration parameters can be adjusted:
| Parameter | Default | Description |
|---|---|---|
| MAX_RETRIES | 3 | Maximum number of retry attempts for failed requests |
| REQUEST_TIMEOUT | 25 | Request timeout in seconds |
| DELAY_BETWEEN_REQUESTS | (2, 5) | Random delay range between requests (min, max) in seconds |
| DEFAULT_IMPERSONATE | 'chrome119' | Default browser to impersonate |
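For example, all four parameters can be adjusted on an existing scraper through `config()` (the values here are arbitrary):

```python
scraper = AmazonScraper()
scraper.config(
    MAX_RETRIES=5,
    REQUEST_TIMEOUT=30,
    DELAY_BETWEEN_REQUESTS=(3, 8),
    DEFAULT_IMPERSONATE="firefox115",
)
```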
Requirements:
- Python 3.6+
- curl_cffi (for enhanced anti-bot protection)
- beautifulsoup4
- lxml (for faster HTML parsing)
- fake_useragent
This project is licensed under the MIT License - see the LICENSE file for details.
Contributions are welcome! Please read CONTRIBUTING.md for details.
