orenIsabella/weather-analytics

🌦️ Weather Analytics Platform

A backend platform for collecting, storing, and analyzing weather data across multiple cities using the Open-Meteo API. It features an ETL pipeline, a RESTful API, PostgreSQL storage, and Kubernetes deployment.


Setup

Installation

  1. Clone the Repository
git clone https://github.com/orenisabella/weather-analytics.git
cd weather-analytics
  2. Create Environment File

Create a .env file in the project root with:

PGHOST=host.docker.internal
PGPORT=5432
PGUSER=weather_user
PGPASSWORD=weather_pass
PGDATABASE=weather_db
  3. Install Dependencies
npm install
  4. Initialize Database (first time only)

Make sure PostgreSQL is running, then run:

npx ts-node-dev src/db/init.ts
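
For illustration, the connection settings from the .env file could be validated at startup along these lines. This is only a sketch: `DbConfig` and `loadDbConfig` are hypothetical names that do not come from the repository, and the project may read these variables differently.

```typescript
// Hypothetical helper: names and shape are illustrative, not taken from the repository.
interface DbConfig {
  host: string;
  port: number;
  user: string;
  password: string;
  database: string;
}

const REQUIRED_KEYS = ["PGHOST", "PGPORT", "PGUSER", "PGPASSWORD", "PGDATABASE"];

function loadDbConfig(env: Record<string, string | undefined>): DbConfig {
  // Fail early with a clear message if any variable from the .env file is missing.
  const missing = REQUIRED_KEYS.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }

  const port = Number(env["PGPORT"]);
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error(`PGPORT must be a positive integer, got "${env["PGPORT"]}"`);
  }

  return {
    host: env["PGHOST"] as string,
    port,
    user: env["PGUSER"] as string,
    password: env["PGPASSWORD"] as string,
    database: env["PGDATABASE"] as string,
  };
}
```

In a Node.js process this would typically be called as `loadDbConfig(process.env)` once the .env file has been loaded.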

Execution

Run ETL Process

npm run etl

Run API Server

npm run start

Server will be available at: http://localhost:3000


Database Schema

weather Table

Stores current or historical weather observations.

| Column | Type | Description |
| --- | --- | --- |
| city | TEXT | Name of the city |
| temperature | REAL | Temperature in Celsius |
| wind_speed | REAL | Wind speed in km/h |
| wind_direction | REAL | Wind direction in degrees |
| timestamp | TIMESTAMP | Time the data was recorded |

Primary Key: (city, timestamp)

alerts Table

Stores alert entries raised when a defined threshold is exceeded (temperature > 35°C or wind speed > 50 km/h).

| Column | Type | Description |
| --- | --- | --- |
| id | SERIAL | Auto-incremented ID (Primary Key) |
| city | TEXT | City where the alert was triggered |
| type | TEXT | Type of alert |
| value | REAL | Value that triggered the alert |
| timestamp | TIMESTAMP | Time the alert was triggered |
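
The threshold rule behind the alerts table can be sketched roughly as follows. The function and interface names are hypothetical, and the "High Wind" label is an assumption (only "High Temperature" appears in this README); the repository's actual alert logic may differ.

```typescript
// Hypothetical types mirroring the weather and alerts columns described above.
interface Observation {
  city: string;
  temperature: number; // Celsius
  windSpeed: number; // km/h
  timestamp: string;
}

interface Alert {
  city: string;
  type: string;
  value: number;
  timestamp: string;
}

// Thresholds as stated in this README.
const TEMP_THRESHOLD_C = 35;
const WIND_THRESHOLD_KMH = 50;

function detectAlerts(obs: Observation): Alert[] {
  const alerts: Alert[] = [];
  if (obs.temperature > TEMP_THRESHOLD_C) {
    alerts.push({
      city: obs.city,
      type: "High Temperature",
      value: obs.temperature,
      timestamp: obs.timestamp,
    });
  }
  if (obs.windSpeed > WIND_THRESHOLD_KMH) {
    alerts.push({
      city: obs.city,
      type: "High Wind", // assumed label, not confirmed by the source
      value: obs.windSpeed,
      timestamp: obs.timestamp,
    });
  }
  return alerts;
}
```

Both comparisons are strict, so a reading of exactly 35°C or 50 km/h does not raise an alert in this sketch.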

API Querying Strategies

/weather

Get weather data for a city in a time range:

GET /weather?city=Tel%20Aviv&from=2025-04-14T00:00:00&to=2025-04-15T00:00:00

/alerts

Get alerts by city, type, and time:

GET /alerts?city=London&type=High%20Temperature&from=2025-04-14T00:00:00
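
As a rough client-side illustration, the /weather and /alerts requests above could be assembled with the standard `URL` API so that city names and timestamps are encoded correctly. `buildQueryUrl` is a hypothetical helper, and the base URL assumes the local server from the Execution section.

```typescript
// Hypothetical helper: builds a request URL, skipping parameters that are not set.
function buildQueryUrl(
  baseUrl: string,
  path: string,
  params: Record<string, string | undefined>
): string {
  const url = new URL(path, baseUrl);
  for (const key of Object.keys(params)) {
    const value = params[key];
    if (value !== undefined) {
      url.searchParams.set(key, value);
    }
  }
  return url.toString();
}

const weatherUrl = buildQueryUrl("http://localhost:3000", "/weather", {
  city: "Tel Aviv",
  from: "2025-04-14T00:00:00",
  to: "2025-04-15T00:00:00",
});

const alertsUrl = buildQueryUrl("http://localhost:3000", "/alerts", {
  city: "London",
  type: "High Temperature",
  from: "2025-04-14T00:00:00",
});
```

`URLSearchParams` encodes spaces as `+` rather than `%20`; standard query-string parsers decode both forms to a space.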

/trends

Used to analyze trends in weather data.

1. avg_temp – Average temperature per timestamp across cities

GET /trends?metric=avg_temp&from=2025-04-10

2. max_wind – Max wind speed per city since date

GET /trends?metric=max_wind&from=2025-04-01

3. daily_avg_temp – Daily average temperature per city

GET /trends?metric=daily_avg_temp&from=2025-04-01&city=Tel%20Aviv

4. daily_max_temp – Daily max temperature per city

GET /trends?metric=daily_max_temp&from=2025-04-01&city=Tel%20Aviv
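
To make the semantics of a metric such as daily_avg_temp concrete, here is an in-memory sketch of the grouping it implies. The service most likely computes this in PostgreSQL; `WeatherRow` and `dailyAvgTemp` are hypothetical names, and the day is taken from the first ten characters of an ISO 8601 timestamp, which is an assumption about the stored format.

```typescript
// Hypothetical row shape: a subset of the weather table columns.
interface WeatherRow {
  city: string;
  temperature: number;
  timestamp: string; // assumed ISO 8601, e.g. "2025-04-01T13:00:00"
}

// Returns the average temperature keyed by "city|YYYY-MM-DD".
function dailyAvgTemp(rows: WeatherRow[]): Map<string, number> {
  const sums = new Map<string, { total: number; count: number }>();
  for (const row of rows) {
    const day = row.timestamp.slice(0, 10);
    const key = `${row.city}|${day}`;
    const entry = sums.get(key) ?? { total: 0, count: 0 };
    entry.total += row.temperature;
    entry.count += 1;
    sums.set(key, entry);
  }

  const averages = new Map<string, number>();
  sums.forEach((entry, key) => {
    averages.set(key, entry.total / entry.count);
  });
  return averages;
}
```

Replacing the running sum with a running maximum gives the daily_max_temp variant.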

Kubernetes Deployment

  1. Start Minikube
minikube start
  2. Deploy Services
kubectl apply -f k8s/postgres-deployment.yaml
kubectl apply -f k8s/postgres-service.yaml
kubectl apply -f k8s/api-deployment.yaml
kubectl apply -f k8s/api-service.yaml
  3. Access the API
kubectl port-forward service/weather-api 3000:3000

Then visit http://localhost:3000


Docker Hub Images

Both services have public images hosted on Docker Hub. You can pull them with:

docker pull orenisabella/weather-api
docker pull orenisabella/weather-etl

fetchWeather vs fetchHourlyWeather

fetchWeather(city: City): WeatherData

  • Fetches the current weather snapshot for a city.
  • Fast, simple, ideal for hourly collection.
  • Returns one object (latest temperature, wind, etc.).

fetchHourlyWeather(city: City, from: Date): WeatherData[]

  • Used to backfill historical data if the ETL missed hours.
  • Loops over hourly data from Open-Meteo.
  • Returns a list of hourly records covering the period from the given from date until now.

The system chooses between them based on whether there's a time gap since the last recorded weather entry.
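
The selection rule described above might look something like the following. This is a sketch under stated assumptions: `chooseFetchStrategy` is a hypothetical name, the one-hour gap threshold is a guess based on the hourly collection cadence, and treating a city with no history as a plain snapshot fetch is also an assumption.

```typescript
type FetchStrategy = "current" | "backfill";

// Assumed threshold: a gap longer than one hour means at least one collection was missed.
const ONE_HOUR_MS = 60 * 60 * 1000;

function chooseFetchStrategy(
  lastRecorded: Date | null,
  now: Date,
  maxGapMs: number = ONE_HOUR_MS
): FetchStrategy {
  // No previous entry: nothing to backfill from, so take a snapshot (assumption).
  if (lastRecorded === null) {
    return "current";
  }
  const gapMs = now.getTime() - lastRecorded.getTime();
  return gapMs > maxGapMs ? "backfill" : "current";
}
```

A "current" result would map to fetchWeather(city), and a "backfill" result to fetchHourlyWeather(city, lastRecorded).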


Thoughts on Querying & Performance

Things I would do to improve scalability and performance:

  • Add indexes on timestamp, city, and type.
  • Consider materialized views for heavy aggregation metrics like daily_avg_temp.
  • Introduce pagination to all endpoints.
  • Use Redis caching for repeated queries.
  • Use async queues or schedulers (e.g., CronJob) for large city lists.
  • Optionally, break alert logic into city-specific configs stored in DB.
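
As one example of the ideas above, pagination could start with a small helper that turns raw query parameters into safe limit and offset values. The parameter names, the default of 100, and the cap of 1000 are all assumptions chosen for illustration; none of them come from the current API.

```typescript
interface Page {
  limit: number;
  offset: number;
}

// Hypothetical helper: parses ?limit=&offset= values and clamps them to sane bounds.
function parsePagination(
  limitParam: string | undefined,
  offsetParam: string | undefined,
  defaultLimit: number = 100,
  maxLimit: number = 1000
): Page {
  const limit = Number.parseInt(limitParam ?? "", 10);
  const offset = Number.parseInt(offsetParam ?? "", 10);
  return {
    // Fall back to the default for missing or invalid values; cap large requests.
    limit: Number.isNaN(limit) || limit <= 0 ? defaultLimit : Math.min(limit, maxLimit),
    // Negative or invalid offsets start from the first row.
    offset: Number.isNaN(offset) || offset < 0 ? 0 : offset,
  };
}
```

The resulting values can be passed as bound parameters to a LIMIT ... OFFSET clause in the underlying query.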
