Bedetheque Scraper

A toolkit for extracting comic book metadata from the Bedetheque.com portal. It is designed to work as a flexible research wrapper around the site's author, series, album, and magazine data.

✨ Features

Comprehensive Scraping: Detailed extraction of Authors, Series, Albums, and Magazines (Revues).
Relational Mapping: Automatically maps connections between authors and their works.
Async Architecture: Built on httpx and SQLAlchemy's async API for non-blocking HTTP requests and database access.
Throttling: Integrated rate limiting to respect server boundaries.
Clean Models: Rich domain models for easy integration into your own applications.
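
The throttling feature can be pictured with a small asyncio sketch. This is not the library's actual implementation: the `RateLimiter` class and the `timestamps_for` helper are hypothetical names used only for illustration. The sketch assumes the simplest policy, a fixed minimum interval between request starts.

```python
import asyncio
import time


class RateLimiter:
    """Hypothetical limiter: spaces out calls by a fixed minimum interval."""

    def __init__(self, requests_per_second: float) -> None:
        if requests_per_second <= 0:
            raise ValueError("requests_per_second must be positive")
        self._interval = 1.0 / requests_per_second
        self._lock = asyncio.Lock()
        self._next_slot = 0.0  # monotonic time at which the next call may start

    async def wait(self) -> None:
        """Block until the caller is allowed to start its request."""
        async with self._lock:
            now = time.monotonic()
            delay = self._next_slot - now
            if delay > 0:
                await asyncio.sleep(delay)
                now = time.monotonic()
            # Reserve the following slot one interval after this call starts.
            self._next_slot = max(now, self._next_slot) + self._interval


async def timestamps_for(n: int, requests_per_second: float) -> list[float]:
    """Run n throttled no-op 'requests' and return when each one started."""
    limiter = RateLimiter(requests_per_second)
    stamps: list[float] = []

    async def worker() -> None:
        await limiter.wait()
        stamps.append(time.monotonic())  # a real scraper would send the request here

    await asyncio.gather(*(worker() for _ in range(n)))
    return stamps
```

In a scraper, each coroutine would call `await limiter.wait()` immediately before issuing its HTTP request, so concurrent tasks share one request budget.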

🚀 Getting Started

1. Prerequisites

Python 3.11+
Async-compatible database (e.g., PostgreSQL or SQLite)

2. Installation

pip install -r requirements.txt

(Ensure sqlalchemy, asyncpg, httpx, beautifulsoup4, and python-dotenv are installed)

3. Configuration

Create a .env file in the root directory (see .env.example):

DATABASE_URL=postgresql+asyncpg://user:password@host:5432/postgres
REQUESTS_PER_SECOND=2
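
As a rough sketch of how these two variables might be consumed, the snippet below reads them from an environment mapping and validates them. The `Settings` class and `load_settings` function are hypothetical, not part of the project, and the fallback of 2 requests per second is an assumption taken from the example value above. In a real script, calling `load_dotenv()` from python-dotenv beforehand would copy the .env entries into `os.environ`.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the two documented variables."""

    database_url: str
    requests_per_second: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from an environment-like mapping.

    With python-dotenv installed, call load_dotenv() first so that
    the .env file is reflected in os.environ.
    """
    database_url = env.get("DATABASE_URL")
    if not database_url:
        raise RuntimeError("DATABASE_URL is not set")

    # Assumed default: 2 requests per second, matching the example above.
    rps = float(env.get("REQUESTS_PER_SECOND", "2"))
    if rps <= 0:
        raise ValueError("REQUESTS_PER_SECOND must be positive")

    return Settings(database_url=database_url, requests_per_second=rps)
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise in tests without touching the real environment.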

4. Basic Usage

Check out the scripts/examples directory for ready-to-use research scripts:

Search Series: python scripts/examples/example_serie.py
Research Authors: python scripts/examples/example_auteur.py
Browse Magazines: python scripts/examples/example_revue.py

📁 Project Structure

bedetheque/: Core library (Models, Parsers, Scrapers, Repositories).
scripts/: Operational and utility scripts.
scripts/examples/: Reference implementations for researching each entity type.

🛠️ Built With

HTTP Client: HTTPX
HTML Parsing: BeautifulSoup4
Data Layer: SQLAlchemy 2.0 (Async)

Developed with ❤️ for the comic book collector community.
