Euraxess is a European platform that connects researchers with job opportunities across universities, research institutions, and industry. It serves as a central hub for research-related career openings, covering multiple fields and countries. The platform is maintained by the European Commission and supports international mobility and career development for researchers.
This project is built around the goal of extracting and analyzing data from the Euraxess job portal to better understand academic hiring trends, demand across research fields, geographic distribution, and more.
You don’t need Python, Poetry, or any local dependencies.
Simply pull the image from Docker Hub and run it; all outputs will land in a host-mounted folder of your choice.

    docker run --rm -v "$(pwd)/data":/app/eurex_feature_engineering/output/transformed arjunrao123/eurex-stat:latest

What happens:
- The Scrapy spider crawls Euraxess and stores raw listings.
- The processor cleans & enriches the data.
- All CSV outputs appear in `./data` on your machine.
- Docker Hub: https://hub.docker.com/r/arjunrao123/eurex-stat
The container removes itself after finishing (--rm), leaving only the data behind. Happy scraping!
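If you do have a local Python 3 interpreter, a quick and entirely optional way to check what the run produced is to list the CSV files in the mounted folder. The sketch below uses only the standard library. The helper name `summarize_csv_dir` is hypothetical, and the code assumes only that the outputs are comma-separated files with a header row; the actual file names and columns are not specified here, so it discovers them instead of assuming any.

```python
import csv
from pathlib import Path


def summarize_csv_dir(folder):
    """Return {file name: (column names, data row count)} for each CSV in folder."""
    summary = {}
    for path in sorted(Path(folder).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            header = next(reader, [])  # first row is assumed to be the header
            row_count = sum(1 for _ in reader)  # remaining rows are data rows
        summary[path.name] = (header, row_count)
    return summary
```

Calling `summarize_csv_dir("data")` from the directory where you ran the container returns one entry per CSV file, which is usually enough to confirm that the run wrote something.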
This repository is structured around the following main goals:
- Data Collection: Extract job listings from the Euraxess portal on a regular basis.
- Data Analysis (in progress): Apply statistical and exploratory methods to uncover patterns in the academic and research job market across Europe.
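As a rough illustration of the kind of exploratory question the analysis goal points at, the sketch below counts listings per value of a single column, for example per country or research field. The function name `count_by_column`, the column name `country`, and the in-memory sample rows are all assumptions made for illustration; they are not the project's actual schema or data.

```python
import csv
import io
from collections import Counter


def count_by_column(csv_file, column):
    """Count rows per distinct value of `column` in an open CSV file."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        value = (row.get(column) or "").strip()
        counts[value or "(missing)"] += 1  # group empty cells under one label
    return counts


# Tiny in-memory sample with a hypothetical schema, not real Euraxess data.
sample = io.StringIO(
    "title,country\n"
    "Postdoc in Physics,Germany\n"
    "PhD position,France\n"
    "Research Engineer,Germany\n"
    "Lecturer,\n"
)
top = count_by_column(sample, "country").most_common()
print(top)
```

The same function works on a real file by passing `open(path, newline="", encoding="utf-8")` instead of the in-memory sample, once the column names in the generated CSVs are known.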
This repository includes multiple components, each with its own specific function. For technical details such as how the scrapers work or how to run the data pipeline, refer to the individual README.md files provided in the respective directories.