🛒 PLUS Product Analyzer

Read this in Dutch

An extensive scraper and analysis tool for product information from PLUS.nl, including nutritional values, prices, and ingredients.

⚠️ Important: This project is intended for educational purposes. Make sure to respect the terms of use of PLUS.nl and use the scraper responsibly.

🔒 Configuration Required: This is a public repository. All API keys and cookies have been removed. See docs/COOKIES.md for configuration instructions.

Gallery

Example charts: Price Distribution, Brand Analysis, Protein Analysis, Ingredients Wordcloud, Alcohol Efficiency, Category Prices.

🔑 Prerequisites

Before you start, you will need:

Python 3.8+ and pip
CSRF token from PLUS.nl (via browser dev tools)
Valid cookies for API access

📖 Read first: docs/COOKIES.md for full setup instructions.

📁 Project Structure

The project is divided into two main components: the scraper and the data analysis tool.

plusproducten/
├── scraper/                    # 🕷️ Web scraper
│   ├── main.py                # Main scraper script
│   ├── product_scraper.py     # Product detail scraper
│   ├── sitemap_parser.py      # Sitemap parser
│   ├── database.py            # Database management
│   └── data/                  # Scraped data
├── analyze_data.py            # 📊 Data analysis script
├── setup.py                   # 🔧 Automatic setup
└── README.md                  # This documentation
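
To give an idea of what a sitemap parser such as sitemap_parser.py might do, here is a minimal sketch that extracts URLs from a standard sitemap document. It is not the project's actual code: the function name extract_urls, the must_contain filter, and the example.com URLs are assumptions made for illustration. Only the sitemap XML namespace comes from the public sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_urls(xml_text, must_contain=""):
    """Return all <loc> URLs in a sitemap, optionally filtered by substring.

    Hypothetical helper: the real sitemap_parser.py may work differently.
    """
    root = ET.fromstring(xml_text)
    urls = [loc.text.strip() for loc in root.findall(".//sm:loc", NS) if loc.text]
    return [url for url in urls if must_contain in url]


# Made-up sample document, not real PLUS.nl data.
SAMPLE = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/product/a</loc></url>"
    "<url><loc>https://example.com/about</loc></url>"
    "</urlset>"
)

if __name__ == "__main__":
    print(extract_urls(SAMPLE, must_contain="/product/"))
```

Filtering by a path fragment keeps non-product pages out of the scrape queue; the fragment that actually identifies product pages on PLUS.nl is not documented here and would need to be checked.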

🛠️ Installation

Option 1: Automatic Setup (Recommended)

python setup.py

This script automatically installs all requirements and configures the directories.

Option 2: Manual Setup

# Install scraper dependencies
pip install -r scraper/requirements.txt

# Install analysis dependencies
pip install -r requirements_analysis.txt

# Create configuration file
cp scraper/.env.example scraper/.env

⚙️ Configuration

🔑 Required: To use this scraper, you need a CSRF token and cookies from PLUS.nl.

1. Environment Setup

# Copy the template
cp scraper/.env.example scraper/.env

# Edit with your credentials
nano scraper/.env  # or your favorite editor

2. Obtain CSRF Token & Cookies

See docs/COOKIES.md for detailed instructions.
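
As a rough illustration of how the values in scraper/.env could end up in request headers, the sketch below parses KEY=VALUE lines and fails early when a credential is missing. The variable names PLUS_CSRF_TOKEN and PLUS_COOKIES and the X-CSRF-Token header name are assumptions; use the names given in scraper/.env.example and docs/COOKIES.md.

```python
from pathlib import Path

# Hypothetical variable names; check scraper/.env.example for the real ones.
REQUIRED_KEYS = ("PLUS_CSRF_TOKEN", "PLUS_COOKIES")


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, since cookie strings contain "=" too.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env(path="scraper/.env"):
    """Read and parse the .env file at the given path."""
    return parse_env(Path(path).read_text(encoding="utf-8"))


def build_headers(env):
    """Build request headers, raising if a required credential is missing."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise ValueError("Missing in .env: " + ", ".join(missing))
    return {
        "X-CSRF-Token": env["PLUS_CSRF_TOKEN"],  # assumed header name
        "Cookie": env["PLUS_COOKIES"],
    }
```

Checking for missing values up front gives a clearer message than a failed request later on.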

🚀 Usage

Step 1: Scrape Data

cd scraper

# Scrape the first 50 products (for testing)
python main.py --all --limit 50

# Scrape all products (can take a long time!)
python main.py --all

Step 2: Analyze Data

# Run from the project root, where analyze_data.py lives (use cd .. if you are still in scraper/)
# Generate all analyses and visualizations
python analyze_data.py

📊 Data Analysis

The analyze_data.py script generates a series of visualizations and reports in the scraper/data/analysis folder. This includes:

Price distributions
Brand analyses
Nutritional value analyses (proteins, calories, etc.)
Ingredient word clouds

The same folder also contains a generated README.md that summarizes the results.
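
The kind of aggregation behind a price distribution or brand analysis can be sketched with the standard library alone. This is not how analyze_data.py is implemented; the field names brand and price and the sample records are made up for illustration.

```python
import math
from collections import Counter, defaultdict
from statistics import mean


def price_histogram(prices, bin_width=1.0):
    """Count prices per bin; each key is the lower edge of its bin."""
    counts = Counter(math.floor(p / bin_width) * bin_width for p in prices)
    return dict(sorted(counts.items()))


def mean_price_by_brand(products):
    """Average price per brand, rounded to cents."""
    by_brand = defaultdict(list)
    for product in products:
        by_brand[product["brand"]].append(product["price"])
    return {brand: round(mean(prices), 2) for brand, prices in by_brand.items()}


# Invented sample records, not scraped data.
SAMPLE_PRODUCTS = [
    {"brand": "A", "price": 1.0},
    {"brand": "A", "price": 3.0},
    {"brand": "B", "price": 2.5},
]

if __name__ == "__main__":
    print(price_histogram([p["price"] for p in SAMPLE_PRODUCTS]))
    print(mean_price_by_brand(SAMPLE_PRODUCTS))
```

A plotting library could render the resulting dictionaries as charts.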

🛠️ Troubleshooting

Common Problems

Cookie/Authentication Errors: Refresh your cookies and CSRF token.
Database Errors: Run cd scraper && python migrate_db.py.
Analysis Errors: Make sure you have scraped data first.

Debug Mode

cd scraper
python main.py --debug --all --limit 10
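
For readers curious how flags like --all, --limit, and --debug are typically wired up, here is a minimal argparse sketch. The flag names come from the commands above; the help texts, defaults, and the mapping of --debug to the DEBUG log level are assumptions, not the actual contents of main.py.

```python
import argparse
import logging


def build_parser():
    """Command-line parser with the three flags used in this README."""
    parser = argparse.ArgumentParser(description="Product scraper (sketch)")
    parser.add_argument("--all", action="store_true",
                        help="scrape every product found")
    parser.add_argument("--limit", type=int, default=None,
                        help="stop after this many products")
    parser.add_argument("--debug", action="store_true",
                        help="enable verbose logging")
    return parser


def log_level(args):
    """Assumed behavior: --debug switches logging from INFO to DEBUG."""
    return logging.DEBUG if args.debug else logging.INFO


if __name__ == "__main__":
    args = build_parser().parse_args()
    logging.basicConfig(level=log_level(args))
    logging.debug("Parsed arguments: %s", args)
```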

🔒 Privacy & Ethics

Respectful scraping: Built-in delays and rate limiting.
Public data: Only publicly available product information.
Educational purpose: Intended for learning and research.
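
The delays mentioned above can be pictured as a small limiter that enforces a minimum interval between requests. The class below is a sketch under that assumption, not the scraper's actual implementation; the RateLimiter name and the two-second interval in the usage comment are illustrative.

```python
import time


class RateLimiter:
    """Block until at least min_interval seconds have passed since the last call."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the limiter can be tested
        self._sleep = sleep
        self._last = None      # time of the previous call, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Usage sketch: call limiter.wait() before every request.
# limiter = RateLimiter(min_interval=2.0)
# for url in urls:
#     limiter.wait()
#     fetch(url)  # fetch is a placeholder for the real request function
```

Using a monotonic clock avoids surprises when the system time changes during a long scrape.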

🤝 Contributing

Fork the project
Create a feature branch
Commit your changes
Open a Pull Request

📄 License

MIT License - see the LICENSE file for details.

⚠️ Disclaimer

This project is for educational purposes only. Respect the terms of service of PLUS.nl and use the tool responsibly.

For questions or problems: Open an issue on GitHub
