Service for collecting climate data, implementing an ETL pipeline to monitor weather conditions in different Brazilian cities.
The project uses Docker Compose to orchestrate three containers:
- ETL Service: Python pipeline for weather data collection
- REST API: Spring Boot application to expose the data
- PostgreSQL: Database for storage
Prerequisites:
- Docker and Docker Compose installed
- WeatherAPI.com API key (free)
# Clone the repository
git clone <your-repository>
cd climate-data-service
# Copy the environment variables example file
cp env.example .env
# Edit the .env file and add your API key
nano .env
Edit the .env file and add your API key:
API_KEY=your_api_key_here
Get your free key at: https://www.weatherapi.com/
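How the ETL service reads the key is not shown in this README. As a minimal sketch, assuming Docker Compose passes `API_KEY` from `.env` into the container's environment, a startup check could look like this (`load_api_key` is a hypothetical helper, not part of the project):

```python
import os


def load_api_key(env=os.environ):
    """Return the WeatherAPI.com key from the environment, or raise if it is missing."""
    key = env.get("API_KEY", "").strip()
    # Treat the placeholder from env.example as "not configured" (assumption).
    if not key or key == "your_api_key_here":
        raise RuntimeError("API_KEY is not set; add it to the .env file")
    return key
```

Failing early on a missing or placeholder key gives a clearer error than an authentication failure on the first API request.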
# Build and start all services
docker-compose up -d
# View ETL service logs
docker-compose logs -f climate-etl
# View PostgreSQL logs
docker-compose logs -f postgres
- REST API: http://localhost:8081
- Swagger UI: http://localhost:8081/swagger-ui.html
- Health Check: http://localhost:8081/health
- PostgreSQL: localhost:5432
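To confirm from a script that the HTTP endpoints above respond, a small probe using only the Python standard library may be enough. This is a sketch: `is_up` is a hypothetical helper, and it assumes a healthy endpoint answers with HTTP 200.

```python
import urllib.error
import urllib.request

BASE_URL = "http://localhost:8081"


def is_up(url, timeout=3.0):
    """Return True if a GET to `url` answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

For example, `is_up(BASE_URL + "/health")` checks the health endpoint and `is_up(BASE_URL + "/swagger-ui.html")` checks the Swagger UI.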
docker-compose ps
# All services
docker-compose logs -f
# Only ETL
docker-compose logs -f climate-etl
# Connect to PostgreSQL
docker-compose exec postgres psql -U climate_user -d climate_db
# Query data
SELECT * FROM weather ORDER BY created_at DESC LIMIT 10;
# Run useful queries
\i data/queries.sql
# List all records
curl http://localhost:8081/clima
# Search by city
curl http://localhost:8081/clima/Campinas
# Get today's data for a city
curl http://localhost:8081/clima/Campinas/hoje
# API health check
curl http://localhost:8081/health
# Stop all services
docker-compose down
# Stop and remove volumes (warning: deletes data)
docker-compose down -v
# Rebuild containers
docker-compose up --build -d
# Run only the ETL (without scheduling)
docker-compose run --rm climate-etl python etl_project/etl_pipeline.py
# Backup the database
docker-compose exec postgres pg_dump -U climate_user -d climate_db > data/backup_$(date +%Y%m%d_%H%M%S).sql
climate-data-service/
├── etl_project/              # ETL pipeline code
│   ├── config/               # Configurations
│   ├── utils/                # Utilities
│   ├── main.py               # Entry point
│   ├── etl_pipeline.py       # Main pipeline
│   ├── Dockerfile            # ETL app container
│   └── requirements.txt      # Python dependencies
├── api/                      # Spring Boot application
│   ├── src/main/java/        # Java code
│   ├── src/main/resources/   # Configurations
│   ├── src/test/             # Tests
│   ├── Dockerfile            # API container
│   └── pom.xml               # Maven dependencies
├── data/                     # Database files
│   ├── init.sql              # Initialization script
│   ├── backup.sql            # Backup script
│   ├── queries.sql           # Useful queries
│   └── README.md             # Data documentation
├── logs/                     # Application logs
├── docker-compose.yml        # Container orchestration
├── env.example               # Environment variable example
└── .gitignore                # Git ignored files
Edit etl_project/config/config.yaml to change the cities:
cities:
- Campinas
- Limeira
- Americana
- Sao Paulo
- Rio de Janeiro
In etl_project/config/config.yaml, change the interval:
schedule:
  interval:
    value: 30
    unit: seconds
The system automatically collects:
- Temperature (Celsius)
- Wind speed (km/h)
- Relative humidity (%)
- Location (city, region, country)
- Collection timestamp
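The fields listed above could be represented and validated roughly as follows. This is a sketch, not the project's actual model: `WeatherRecord` and `parse_record` are hypothetical names, and the payload keys (`location.name`, `location.region`, `location.country`, `current.temp_c`, `current.wind_kph`, `current.humidity`) are assumed to follow the WeatherAPI.com current-weather response.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class WeatherRecord:
    city: str
    region: str
    country: str
    temperature_c: float   # Celsius
    wind_kph: float        # km/h
    humidity_pct: int      # relative humidity, 0-100
    collected_at: datetime


def parse_record(payload, collected_at=None):
    """Build a WeatherRecord from a WeatherAPI-style payload, rejecting bad values."""
    location, current = payload["location"], payload["current"]
    humidity = int(current["humidity"])
    if not 0 <= humidity <= 100:
        raise ValueError(f"humidity out of range: {humidity}")
    wind = float(current["wind_kph"])
    if wind < 0:
        raise ValueError(f"negative wind speed: {wind}")
    return WeatherRecord(
        city=location["name"],
        region=location["region"],
        country=location["country"],
        temperature_c=float(current["temp_c"]),
        wind_kph=wind,
        humidity_pct=humidity,
        # Default to the current UTC time when no timestamp is supplied.
        collected_at=collected_at or datetime.now(timezone.utc),
    )
```

Validating at parse time keeps out-of-range readings from reaching the database.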
# Check logs
docker-compose logs climate-etl
# Check if PostgreSQL is healthy
docker-compose ps postgres
# Check if environment variables are correct
docker-compose exec climate-etl env | grep POSTGRESQL
- Check if the API key is correct in the .env file
- Confirm the API is working: https://www.weatherapi.com/
Logs are saved in:
- ./logs/ (mounted volume in the container)
- Container console (via docker-compose logs)
- Database credentials are set in docker-compose.yml
- The API key must be kept safe in the .env file
- The .env file should not be committed to the repository
For issues or questions:
- Check the logs: docker-compose logs
- See WeatherAPI.com documentation
- Open an issue in the repository
- ETL with Python ✅
  - Python 3 script consuming weather API
  - Configurable city query
  - Data treatment and validation
  - Periodic scheduling
- PostgreSQL ✅
  - Relational database structure
  - Creation scripts (init.sql)
  - Integrity and normalization
- REST API with Spring Boot ✅
  - GET /clima - List all records
  - GET /clima/{cidade} - Filter by city
  - GET /clima/{cidade}/hoje - Most recent data of the day
- Containerization ✅
  - Python container (ETL)
  - PostgreSQL container
  - Spring Boot container (API)
  - docker-compose.yml for orchestration
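The periodic scheduling mentioned in the list above is driven by the `schedule` block in etl_project/config/config.yaml. The project's own scheduler is not shown in this README; as a rough sketch of the idea, with hypothetical helper names and assuming `minutes` and `hours` are accepted alongside the documented `seconds` unit:

```python
import time

# Only "seconds" appears in the README; the other units are assumptions.
SECONDS_PER_UNIT = {"seconds": 1, "minutes": 60, "hours": 3600}


def interval_seconds(schedule):
    """Convert {'interval': {'value': ..., 'unit': ...}} into a number of seconds."""
    interval = schedule["interval"]
    unit = interval["unit"]
    if unit not in SECONDS_PER_UNIT:
        raise ValueError(f"unsupported unit: {unit!r}")
    value = int(interval["value"])
    if value <= 0:
        raise ValueError("interval value must be positive")
    return value * SECONDS_PER_UNIT[unit]


def run_periodically(job, schedule, max_runs=None, sleep=time.sleep):
    """Call job() repeatedly, sleeping the configured interval between runs."""
    delay = interval_seconds(schedule)
    runs = 0
    while max_runs is None or runs < max_runs:
        job()
        runs += 1
        # Skip the final sleep when a run limit has been reached.
        if max_runs is None or runs < max_runs:
            sleep(delay)
    return runs
```

Passing `sleep` as a parameter lets tests replace the real delay with a stub, and `max_runs=None` keeps the loop running indefinitely, as a long-lived container would.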
- Swagger Documentation: Interactive API interface
- Health Checks: Service health monitoring
- Structured logs: Easier debugging
- Automated tests: Test structure configured
- Multi-stage Docker builds: Image optimization
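Besides curl and the Swagger UI, the three GET endpoints can be called from Python. The sketch below uses only the standard library; `clima_url` and `fetch_json` are hypothetical helpers, and it assumes the API returns JSON and accepts percent-encoded city names such as `Sao%20Paulo`.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "http://localhost:8081"


def clima_url(city=None, today=False, base_url=BASE_URL):
    """Build the URL for /clima, /clima/{cidade} or /clima/{cidade}/hoje."""
    if city is None:
        if today:
            raise ValueError("today=True requires a city")
        return f"{base_url}/clima"
    # Percent-encode the city so names with spaces form a valid path segment.
    url = f"{base_url}/clima/{quote(city, safe='')}"
    return f"{url}/hoje" if today else url


def fetch_json(url, timeout=5.0):
    """GET `url` and decode the body as JSON (assumes a JSON response)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the services running, `fetch_json(clima_url("Campinas", today=True))` would request today's data for Campinas.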
🇧🇷 Read the documentation in Portuguese
See the contribution guidelines.
This project is licensed under the MIT License.