An extensive scraper and analysis tool for product information from PLUS.nl, including nutritional values, prices, and ingredients.
⚠️ Important: This project is intended for educational purposes. Make sure to respect the terms of use of PLUS.nl and use the scraper responsibly.
🔒 Configuration Required: This is a public repository. All API keys and cookies have been removed. See docs/COOKIES.md for configuration instructions.
Example visualizations: Price Distribution, Brand Analysis, Protein Analysis, Ingredients Wordcloud, Alcohol Efficiency, and Category Prices.
- Prerequisites
- Project Structure
- Installation
- Configuration
- Usage
- Data Analysis
- Troubleshooting
- Privacy & Ethics
- Contributing
- License
Before you start, you will need:
- Python 3.8+ and pip
- CSRF token from PLUS.nl (via browser dev tools)
- Valid cookies for API access
📖 Read first: docs/COOKIES.md for full setup instructions.
The project is divided into two main components: the scraper and the data analysis tool.
plusproducten/
├── scraper/ # 🕷️ Web scraper
│ ├── main.py # Main scraper script
│ ├── product_scraper.py # Product detail scraper
│ ├── sitemap_parser.py # Sitemap parser
│ ├── database.py # Database management
│ └── data/ # Scraped data
├── analyze_data.py # 📊 Data analysis script
├── setup.py # 🔧 Automatic setup
└── README.md # This documentation
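The tree above lists a sitemap parser among the scraper components. As a rough illustration of what such a step involves (this is not the project's actual code), the sketch below pulls `<loc>` URLs out of a standard sitemap XML document using only the standard library. The function name `extract_urls` and the example URLs are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_urls(sitemap_xml: str) -> List[str]:
    """Return every <loc> value found in a sitemap XML string (hypothetical helper)."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.iterfind(".//sm:loc", SITEMAP_NS)
        if loc.text  # skip empty <loc/> elements
    ]
```

Passing the namespace map explicitly matters here: sitemap elements live in the sitemaps.org namespace, so a plain search for `loc` would find nothing.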
python setup.py
This script automatically installs all requirements and configures the directories.
# Install scraper dependencies
pip install -r scraper/requirements.txt
# Install analysis dependencies
pip install -r requirements_analysis.txt
# Create configuration file
cp scraper/.env.example scraper/.env
🔑 Required: To use this scraper, you need a CSRF token and cookies from PLUS.nl.
# Copy the template
cp scraper/.env.example scraper/.env
# Edit with your credentials
nano scraper/.env # or your favorite editor
See docs/COOKIES.md for detailed instructions.
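Before a long scrape, it can help to confirm that the .env file actually contains the credentials. The sketch below is one possible check using only the standard library. The key names `PLUS_CSRF_TOKEN` and `PLUS_COOKIES` are placeholders invented for this example; the real names are whatever scraper/.env.example defines.

```python
from typing import Dict, List

# Hypothetical key names; check scraper/.env.example for the real ones.
REQUIRED_KEYS = ("PLUS_CSRF_TOKEN", "PLUS_COOKIES")


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, since cookie values may contain "=".
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: Dict[str, str]) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

A typical use would be to read the file with `pathlib.Path("scraper/.env").read_text()`, pass the text to `parse_env`, and stop early if `missing_keys` returns anything.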
cd scraper
# Scrape the first 50 products (for testing)
python main.py --all --limit 50
# Scrape all products (can take a long time!)
python main.py --all

# Generate all analyses and visualizations (run from the project root)
python analyze_data.py
The analyze_data.py script generates a series of visualizations and reports in the scraper/data/analysis folder. This includes:
- Price distributions
- Brand analyses
- Nutritional value analyses (proteins, calories, etc.)
- Ingredient word clouds
The output is saved in scraper/data/analysis/, including a README.md with the results.
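To give a feel for what a price distribution involves, here is a minimal sketch that buckets prices into fixed-width bins using only the standard library. It is not taken from analyze_data.py; the function name `price_histogram` and the sample prices are assumptions for illustration, and the real script may use a plotting library instead.

```python
from collections import Counter
from typing import Dict, Iterable


def price_histogram(prices: Iterable[float], bin_width: float = 1.0) -> Dict[float, int]:
    """Count prices per bin; each key is the lower edge of its bin."""
    counts = Counter(int(price // bin_width) * bin_width for price in prices)
    return dict(sorted(counts.items()))


# Invented sample prices in euros, not real PLUS.nl data.
sample_prices = [0.99, 1.49, 1.99, 2.50, 12.00]
histogram = price_histogram(sample_prices)
```

With the sample above, the two prices between 1.00 and 2.00 land in the same bin, while the outlier at 12.00 gets a bin of its own.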
- Cookie/Authentication Errors: Refresh your cookies and CSRF token.
- Database Errors: Run `cd scraper && python migrate_db.py`.
- Analysis Errors: Make sure you have scraped data first.
cd scraper
python main.py --debug --all --limit 10
- Respectful scraping: Built-in delays and rate limiting.
- Public data: Only publicly available product information.
- Educational purpose: Intended for learning and research.
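The delays mentioned above can be implemented in several ways. The sketch below shows one simple approach, a minimum gap between successive requests. It is an illustration only and not the scraper's actual mechanism; the class name `MinDelayLimiter` and its parameters are assumptions.

```python
import time


class MinDelayLimiter:
    """Ensure at least `min_delay` seconds pass between successive calls to wait()."""

    def __init__(self, min_delay: float, clock=time.monotonic, sleep=time.sleep):
        self.min_delay = min_delay
        self._clock = clock  # injectable so the limiter can be tested without real sleeping
        self._sleep = sleep
        self._last = None    # time of the previous call, None before the first one

    def wait(self) -> float:
        """Sleep if the previous call was too recent; return the seconds slept."""
        slept = 0.0
        if self._last is not None:
            remaining = self.min_delay - (self._clock() - self._last)
            if remaining > 0:
                self._sleep(remaining)
                slept = remaining
        self._last = self._clock()
        return slept
```

Calling `limiter.wait()` immediately before each HTTP request keeps the request rate at or below one per `min_delay` seconds, regardless of how long the surrounding processing takes.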
- Fork the project
- Create a feature branch
- Commit your changes
- Open a Pull Request
MIT License - see the LICENSE file for details.
This project is for educational purposes only. Respect the terms of service of PLUS.nl and use the tool responsibly.
For questions or problems: Open an issue on GitHub





