🌍 Climate Change Data Analysis & Sea Level Rise Prediction

A Data Science Project using Machine Learning & Exploratory Analysis

📌 Overview

This project explores how several climate indicators (temperature, CO₂ emissions, precipitation, humidity, and wind speed) relate to sea level rise. It combines exploratory data analysis (EDA), feature engineering, outlier treatment, and several machine-learning models to predict sea-level variations and to examine which environmental factors influence them most strongly.

This repository includes a complete end-to-end workflow:

Data loading
Cleaning & feature engineering
Exploratory data analysis
Model training (Linear Regression, Random Forest, Decision Tree, SVR)
Model evaluation
A prediction interface for generating sea-level rise estimates

📁 Dataset

The dataset contains the following climate-related fields:

Column	Description
Date	Daily timestamp
Location	City or locality
Country	Country identifier
Temperature	Temperature (°C)
CO₂ Emissions	Carbon emissions (tons/year)
Sea Level Rise	Rise in sea level (meters)
Precipitation	Precipitation (mm)
Humidity	Air humidity (%)
Wind Speed	Wind speed (m/s)

🔧 Features & Engineering

To support time-series and seasonal analysis, the following features were extracted from the Date column:

year_month (YYYY-MM)
year
month

These engineered features are used later in the model training pipeline, along with one-hot encoding of categorical fields such as Country.
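
As a rough illustration of the date features and one-hot encoding described above, here is a minimal pandas sketch. The column names follow the Dataset table, but the sample values are made up and the notebook's actual code may differ.

```python
import pandas as pd

# Tiny illustrative frame; the values are invented for this example.
df = pd.DataFrame({
    "Date": ["2020-01-15", "2020-02-20", "2021-07-04"],
    "Country": ["Canada", "Brazil", "Canada"],
    "Temperature": [1.5, 27.0, 22.3],
})

# Parse the Date column, then derive the three time features.
df["Date"] = pd.to_datetime(df["Date"])
df["year"] = df["Date"].dt.year
df["month"] = df["Date"].dt.month
df["year_month"] = df["Date"].dt.strftime("%Y-%m")

# One-hot encode Country into indicator columns such as Country_Canada.
encoded = pd.get_dummies(df, columns=["Country"], prefix="Country")
print(encoded.columns.tolist())
```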

🧹 Data Cleaning

Outliers were removed using the Interquartile Range (IQR) method for:

CO₂ Emissions
Sea Level Rise
Temperature

A total of 218 rows were removed, improving model stability and reducing noise.
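
The IQR rule keeps values inside [Q1 - k·IQR, Q3 + k·IQR]. The sketch below assumes the conventional multiplier k = 1.5 and computes all bounds on the unfiltered data. The README does not state either choice, and `remove_iqr_outliers` is a hypothetical helper name, not a function from the notebook.

```python
import pandas as pd

def remove_iqr_outliers(df, columns, k=1.5):
    """Drop rows where any listed column falls outside its IQR fence."""
    mask = pd.Series(True, index=df.index)
    for col in columns:
        q1 = df[col].quantile(0.25)
        q3 = df[col].quantile(0.75)
        iqr = q3 - q1
        # Keep only values inside [q1 - k*iqr, q3 + k*iqr] (inclusive).
        mask &= df[col].between(q1 - k * iqr, q3 + k * iqr)
    return df[mask]

# Made-up sample: the 60.0 temperature is an obvious outlier.
sample = pd.DataFrame({
    "Temperature": [14.0, 15.0, 15.5, 16.0, 14.5, 60.0],
    "Sea Level Rise": [0.01, 0.02, 0.015, 0.03, 0.02, 0.025],
})
cleaned = remove_iqr_outliers(sample, ["Temperature", "Sea Level Rise"])
print(len(sample) - len(cleaned), "row(s) removed")
```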

📊 Exploratory Data Analysis (EDA)

The notebook includes detailed EDA through:

Correlation heatmaps
Scatter plots showing relationships with sea-level rise
Histograms & boxplots for distribution analysis
Seasonal and temporal trend visualizations

These analyses reveal which climate indicators correlate most strongly with sea-level variations.
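
One way to rank indicators by their linear correlation with sea-level rise is sketched below. It uses synthetic data in which temperature drives the target by construction, so the ranking shows only the mechanics and is not a finding from the real dataset.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 200
temperature = rng.normal(15, 5, n)
humidity = rng.uniform(20, 90, n)
# Synthetic target: depends on temperature plus a little noise (assumption).
sea_level = 0.01 * temperature + rng.normal(0, 0.01, n)

df = pd.DataFrame({
    "Temperature": temperature,
    "Humidity": humidity,
    "Sea Level Rise": sea_level,
})

# Pearson correlation matrix; this is the matrix a heatmap would display.
corr = df.corr()

# Rank the other columns by absolute correlation with the target.
ranking = (
    corr["Sea Level Rise"]
    .drop("Sea Level Rise")
    .abs()
    .sort_values(ascending=False)
)
print(ranking)
```

Passing `corr` to `seaborn.heatmap(corr, annot=True)` would draw the matrix as an annotated heatmap.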

🤖 Machine Learning Models Used

A variety of regression models were trained and evaluated:

1. Linear Regression

A baseline model for detecting linear relationships.

2. Random Forest Regressor

An ensemble learning method capable of modeling complex non-linear interactions.

3. Decision Tree Regressor

A simple, interpretable tree-based model using recursive splitting.

4. Support Vector Regression (SVR)

A margin-based model that can capture non-linear relationships through kernel functions; it requires feature scaling to perform well.

Evaluation Metrics

Each model is assessed using:

Mean Squared Error (MSE)
R² Score
Actual vs. Predicted Scatter Plots
Actual vs. Predicted Trend Plots
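
The comparison of the four models on MSE and R² can be sketched as follows. The data is synthetic, and the hyperparameters and the 80/20 split are assumptions for illustration, so the printed scores say nothing about the project's actual results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression problem standing in for the climate features.
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 5))
y = X @ np.array([0.5, -0.2, 0.1, 0.0, 0.3]) + rng.normal(scale=0.1, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "Linear Regression": LinearRegression(),
    "Random Forest": RandomForestRegressor(n_estimators=50, random_state=42),
    "Decision Tree": DecisionTreeRegressor(random_state=42),
    # SVR is wrapped with a scaler because it is sensitive to feature scale.
    "SVR": make_pipeline(StandardScaler(), SVR()),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "mse": mean_squared_error(y_test, pred),
        "r2": r2_score(y_test, pred),
    }
    print(f"{name}: MSE={results[name]['mse']:.4f}, R2={results[name]['r2']:.4f}")
```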

🔮 Prediction System

The notebook includes an interactive prediction system that accepts user input for:

Temperature
CO₂ emissions
Precipitation
Humidity
Wind speed
Year
Month
Country

These inputs go through the same preprocessing pipeline used during training (including scaling and encoding), so the models receive features in the same form they were trained on.
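
A common way to guarantee that user input is transformed exactly like the training data is to bundle preprocessing and the model in a single scikit-learn `Pipeline`. The sketch below assumes that approach. The training rows are made up, the `CO2 Emissions` column spelling is an assumption, and `predict_sea_level` is a hypothetical helper rather than the notebook's interface.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

numeric = ["Temperature", "CO2 Emissions", "Precipitation",
           "Humidity", "Wind Speed", "year", "month"]
categorical = ["Country"]

# Made-up training rows, for illustration only.
train = pd.DataFrame({
    "Temperature": [14.2, 15.1, 16.3, 13.8, 17.0, 15.6],
    "CO2 Emissions": [400.0, 410.0, 420.0, 395.0, 430.0, 415.0],
    "Precipitation": [50.0, 42.0, 60.0, 55.0, 38.0, 47.0],
    "Humidity": [60.0, 65.0, 70.0, 58.0, 72.0, 66.0],
    "Wind Speed": [5.0, 6.5, 4.2, 7.1, 3.9, 5.5],
    "year": [2018, 2019, 2020, 2018, 2021, 2020],
    "month": [1, 4, 7, 10, 6, 12],
    "Country": ["Canada", "Brazil", "Canada", "India", "Brazil", "India"],
})
target = pd.Series([0.01, 0.02, 0.03, 0.00, 0.04, 0.02])

# Scaling and encoding live inside the pipeline, so fit and predict
# always apply identical transformations.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])
model = Pipeline([
    ("preprocess", preprocess),
    ("regressor", RandomForestRegressor(n_estimators=20, random_state=0)),
])
model.fit(train, target)

def predict_sea_level(user_input: dict) -> float:
    """Turn one dict of user inputs into a single prediction."""
    row = pd.DataFrame([user_input])
    return float(model.predict(row)[0])

example = {
    "Temperature": 15.0, "CO2 Emissions": 405.0, "Precipitation": 48.0,
    "Humidity": 63.0, "Wind Speed": 5.2, "year": 2022, "month": 5,
    "Country": "Canada",
}
prediction = predict_sea_level(example)
print(f"Predicted sea level rise: {prediction:.5f}")
```

Setting `handle_unknown="ignore"` lets the encoder accept a country that never appeared in training; it encodes that country as all zeros instead of raising an error.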

Example Output:

Model	Predicted Sea Level Rise
Linear Regression	0.08829
Random Forest	0.12365
Decision Tree	0.53591
SVR	-0.06596

📈 Results Summary

Outlier removal improved distribution smoothness and reduced skewness.
Random Forest performed best overall in accuracy and robustness.
SVR predictions varied significantly, showing sensitivity to scaling and kernel parameters.
Using multiple models provides a more comprehensive understanding of possible sea-level rise outcomes.

🚀 Next Steps

Potential enhancements include:

Hyperparameter tuning (GridSearchCV / RandomizedSearchCV)
Deep learning models (LSTM, GRU) for time-series prediction
GIS or geographical visualization dashboards
Country-level climate trend analytics
Feature importance analysis (SHAP, permutation importance)
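
As a starting point for the feature importance item, permutation importance is available in scikit-learn as `sklearn.inspection.permutation_importance`. The sketch below runs it on synthetic data where the first feature dominates by construction; the feature names are illustrative labels only.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
feature_names = ["Temperature", "Humidity", "Wind Speed"]
X = rng.normal(size=(200, 3))
# Synthetic target dominated by the first feature (assumption).
y = 2.0 * X[:, 0] + 0.1 * X[:, 1] + rng.normal(scale=0.1, size=200)

model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Shuffle each feature in turn and measure how much the score drops.
result = permutation_importance(model, X, y, n_repeats=5, random_state=0)

ranked = sorted(
    zip(feature_names, result.importances_mean),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

In practice the importance would be computed on a held-out set rather than on the training data used here.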

🗂 Project Structure

├── climate_change_data.csv   # Dataset
├── Climate_Analysis.ipynb    # Main notebook (EDA, models, predictions)
└── README.md                 # Project documentation

🧰 Technologies Used

Python 3.10+
pandas
numpy
seaborn / matplotlib
scikit-learn
wordcloud
Jupyter Notebook / Google Colab

🙌 Acknowledgments

This project aims to deepen understanding of the global climate crisis using data-driven methods. Special thanks to the providers of open climate datasets and the open-source community.
