This project showcases a real-time weather data pipeline that forecasts weather patterns for 15 target cities. Built on AWS cloud services and Snowflake with SQL-based analysis, the pipeline processes, stores, and analyzes hourly weather updates from the OpenWeather API. Hourly data integration keeps the forecasts accurate and timely, making the pipeline a robust solution for understanding and predicting weather trends.
- Programming Language - Python
- Scripting Language - SQL
- AWS
  - Lambda
  - EventBridge
  - DynamoDB
  - S3
  - SQS
- Snowflake
Weather data comes from the OpenWeather API, which provides real-time updates, including city name, temperature, weather condition, and timestamp, for various cities in India.
- API Endpoint:
- API_KEY:
- Usage in the Project: Weather data fetched from the OpenWeather API is processed every hour, stored in DynamoDB and S3, and ultimately used to calculate average temperatures and forecast weather patterns.
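As a rough illustration of the data flow above, an OpenWeather current-weather response can be reduced to the fields the pipeline stores (city name, temperature, weather condition, timestamp). The function name and field choices below are a sketch, not the project's exact code:

```python
from datetime import datetime, timezone

def parse_weather(payload: dict) -> dict:
    """Reduce an OpenWeather /data/2.5/weather response to the
    fields this pipeline stores. OpenWeather reports temperature
    in Kelvin by default; convert to Celsius here."""
    return {
        "city": payload["name"],
        "temperature_c": round(payload["main"]["temp"] - 273.15, 2),
        "weather": payload["weather"][0]["main"],
        # 'dt' is a Unix timestamp (UTC) in the API response
        "timestamp": datetime.fromtimestamp(payload["dt"], tz=timezone.utc).isoformat(),
    }

# Example response fragment (shape follows the OpenWeather docs)
sample = {
    "name": "Mumbai",
    "main": {"temp": 301.15},
    "weather": [{"main": "Clouds", "description": "scattered clouds"}],
    "dt": 1700000000,
}
```

For the sample above, `parse_weather(sample)` yields a flat record with the temperature converted to 28.0 °C, ready to be written to DynamoDB.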
- weather_fetch.py
- Fetches hourly weather data from the OpenWeather API using an API key and stores it in DynamoDB.
- dynamodb_stream.py
- Processes new weather data from DynamoDB Streams and archives it into an S3 bucket for analysis.
- Snowflake_query.sql
- Creates a Snowflake stage that integrates with the S3 bucket, ingests new data into Snowflake tables via Snowpipe, performs basic transformations, and calculates the average hourly temperature and a weather prediction for the 15 target cities.
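The Snowflake side can be sketched along these lines; the stage, pipe, integration, table, and column names are all assumptions, not the project's actual `Snowflake_query.sql`:

```sql
-- External stage over the S3 archive (integration name is hypothetical)
CREATE OR REPLACE STAGE weather_stage
  URL = 's3://weather-archive-bucket/weather/'
  STORAGE_INTEGRATION = s3_int
  FILE_FORMAT = (TYPE = 'JSON');

-- Snowpipe auto-ingests new files landing in the stage
CREATE OR REPLACE PIPE weather_pipe AUTO_INGEST = TRUE AS
  COPY INTO weather_raw (city, ts, weather, temperature)
  FROM (
    SELECT $1:city::STRING,
           TO_TIMESTAMP($1:timestamp::NUMBER),
           $1:weather::STRING,
           $1:temperature::FLOAT
    FROM @weather_stage
  );

-- Average hourly temperature and most frequent condition per city
SELECT city,
       DATE_TRUNC('hour', ts) AS forecast_hour,
       AVG(temperature)       AS avg_temperature,
       MODE(weather)          AS predicted_weather
FROM weather_raw
GROUP BY city, DATE_TRUNC('hour', ts);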
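A minimal sketch of what `weather_fetch.py` might look like. The table name, environment variables, and city list here are assumptions; the OpenWeather endpoint and boto3 calls are standard, but treat this as illustrative rather than the project's exact code:

```python
import json
import os
import urllib.request
from decimal import Decimal

# Hypothetical configuration -- the real project reads its own values
API_KEY = os.environ.get("OPENWEATHER_API_KEY", "YOUR_API_KEY")
TABLE_NAME = os.environ.get("TABLE_NAME", "weather_data")  # assumed table name
CITIES = ["Mumbai", "Delhi", "Bengaluru"]  # the project targets 15 cities

def build_url(city: str) -> str:
    """OpenWeather current-weather endpoint for one city."""
    return ("https://api.openweathermap.org/data/2.5/weather"
            f"?q={city}&appid={API_KEY}&units=metric")

def to_item(payload: dict) -> dict:
    """Shape one API response into a DynamoDB item.
    DynamoDB requires Decimal for numbers, not float."""
    return {
        "city": payload["name"],
        "timestamp": payload["dt"],
        "weather": payload["weather"][0]["main"],
        "temperature": Decimal(str(payload["main"]["temp"])),
    }

def lambda_handler(event, context):
    """Invoked hourly by an EventBridge rule."""
    import boto3  # imported lazily so the module loads without AWS deps
    table = boto3.resource("dynamodb").Table(TABLE_NAME)
    for city in CITIES:
        with urllib.request.urlopen(build_url(city)) as resp:
            payload = json.load(resp)
        table.put_item(Item=to_item(payload))
```

Scheduling the handler hourly is then just an EventBridge rule with a `rate(1 hour)` expression targeting the Lambda function.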
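`dynamodb_stream.py` could follow this shape: deserialize `INSERT` records from the DynamoDB stream and write them to S3 as JSON lines. The bucket name and key layout are assumptions, and the deserializer only covers the attribute types this table uses:

```python
import json

BUCKET = "weather-archive-bucket"  # hypothetical bucket name

def unmarshal(image: dict) -> dict:
    """Convert a DynamoDB Streams NewImage (typed attribute map)
    into a plain dict. Handles only the types this table uses."""
    out = {}
    for key, typed in image.items():
        (dtype, value), = typed.items()
        if dtype == "N":             # numbers arrive as strings
            out[key] = float(value)
        else:                        # "S" strings pass through
            out[key] = value
    return out

def lambda_handler(event, context):
    """Triggered by the DynamoDB stream; archives new rows to S3."""
    import boto3  # lazy import so the module loads without AWS deps
    s3 = boto3.client("s3")
    rows = [
        unmarshal(rec["dynamodb"]["NewImage"])
        for rec in event["Records"]
        if rec["eventName"] == "INSERT"
    ]
    if rows:
        body = "\n".join(json.dumps(r) for r in rows)
        key = f"weather/{rows[0]['timestamp']}.json"  # assumed key scheme
        s3.put_object(Bucket=BUCKET, Key=key, Body=body.encode())
```

Writing one file per batch like this is what gives Snowpipe discrete objects to auto-ingest from the stage.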
- notes: This image shows the DynamoDB table storing the weather data. Major fields include city name, timestamp, weather condition, and temperature.
- notes: S3 bucket containing the archived weather data files. Each file corresponds to an hourly update for the target locations.
- notes: Results of SQL queries in Snowflake showing the forecast date, predicted temperature, and predicted weather for the 15 cities, calculated from real-time weather data.



