68 changes: 66 additions & 2 deletions README.md
@@ -1,2 +1,66 @@
# oaf-psd-bootcamp
Initial repository for the Open Avenues: Professional Software Development Bootcamp
# Chicago Temperature Analysis (1940-2020)

Python-based statistical analysis of 80 years of temperature data from Chicago, Illinois.

## Overview

This project analyzes long-term climate trends using hourly temperature measurements from the Open-Meteo API. Across eight decades, the decade-level mean and maximum rose by roughly 1°C, while the recorded minimum fell by about 5°C.

## Features

- Automated data retrieval from the Open-Meteo API
- SQLite database storage (Chicago_temp.db)
- Statistical calculations (min, max, mean, standard deviation)
- Annual and decade-level aggregations
- Data visualization with line graphs
- CSV export functionality
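
The storage and retrieval steps can be sketched as follows. This is a minimal illustration, not the project's code: it uses an in-memory database and a handful of made-up readings, and it binds the year as a query parameter rather than formatting it into the SQL string. The table and column names (`hourly_data`, `date`, `temperature_2m`) follow the project; everything else is an assumption.

```python
import sqlite3

import pandas as pd

# Synthetic hourly readings standing in for the downloaded data (assumption).
dates = pd.date_range("1940-12-31 22:00", periods=4, freq=pd.Timedelta(hours=1))
hourly = pd.DataFrame({"date": dates, "temperature_2m": [-3.0, -2.5, -2.0, -1.5]})

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
hourly.to_sql("hourly_data", conn, index=False, if_exists="replace")

# Select one calendar year; the year is passed as a bound parameter.
query = "SELECT * FROM hourly_data WHERE strftime('%Y', date) = ?"
year_1941 = pd.read_sql_query(query, conn, params=("1941",))
conn.close()

print(year_1941)
```

Only the two readings stamped in 1941 come back, which is the same year filter the statistics stage relies on.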

Running `main.py` executes three modules in sequence:
1. `model_builder.py` - Downloads and stores temperature data
2. `control_data.py` - Calculates statistical measures
3. `view_data.py` - Generates visualizations
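
The statistics stage could also be expressed as a single pandas `groupby`. The sketch below is an alternative to the per-year SQL queries used in `control_data.py`, not a copy of it, and the readings are synthetic. The standard deviation uses `ddof=0` so that it matches NumPy's default population formula.

```python
import pandas as pd

# Synthetic hourly readings spanning two decades (illustrative values only).
hourly = pd.DataFrame({
    "date": pd.to_datetime([
        "1949-07-01 12:00", "1949-12-31 23:00",
        "1950-01-01 00:00", "1950-07-01 12:00",
    ]),
    "temperature_2m": [30.0, -10.0, -12.0, 32.0],
})

# Map each timestamp to the first year of its decade, e.g. 1949 -> 1940.
decade = (hourly["date"].dt.year // 10) * 10

decadal = hourly.groupby(decade)["temperature_2m"].agg(
    Max="max",
    Mean="mean",
    Min="min",
    Std=lambda s: s.std(ddof=0),  # population standard deviation
)
decadal.index.name = "Decade"

print(decadal.round(3))
```

Grouping on `hourly["date"].dt.year` instead of the decade key would give the annual table in the same way.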

## Key Findings

Analysis of 80 years of data (1940-2020) shows:

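- Average temperature increased by about 1°C (about 2°F), from 9.2°C in the 1940s to 10.2°C in the 2010s
- Maximum temperature increased by about 1°C (about 2°F), from 35.5°C to 36.9°C
- Minimum temperature decreased by about 5°C (about 10°F), from -27.0°C to -32.4°C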

| Decade | Max | Mean | Min | Std Dev |
|--------|-------|-------|---------|---------|
| 1940 | 35.5 | 9.2 | -27.0 | 11.6 |
| 1950 | 36.1 | 10.0 | -26.4 | 11.5 |
| 1960 | 36.2 | 9.4 | -28.8 | 11.8 |
| 1970 | 35.4 | 9.5 | -26.5 | 11.8 |
| 1980 | 37.0 | 9.8 | -31.8 | 11.7 |
| 1990 | 37.4 | 10.3 | -32.5 | 11.0 |
| 2000 | 36.7 | 10.4 | -28.0 | 11.3 |
| 2010 | 36.9 | 10.2 | -32.4 | 11.3 |

*All temperatures in Celsius*
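
The changes quoted under Key Findings can be checked from the first and last rows of the table. A small sketch of that arithmetic follows; the helper name `delta_c_to_f` is made up for this example. Note that a temperature difference converts to Fahrenheit with the 9/5 factor only, without the 32-degree offset used for absolute readings.

```python
# Decade-level values copied from the table above (degrees Celsius).
first = {"Max": 35.5, "Mean": 9.2, "Min": -27.0}   # 1940s
last = {"Max": 36.9, "Mean": 10.2, "Min": -32.4}   # 2010s


def delta_c_to_f(delta_c: float) -> float:
    """Convert a temperature difference (not a reading) from Celsius to Fahrenheit."""
    return delta_c * 9 / 5


# Change from the 1940s to the 2010s, rounded to the table's precision.
changes = {name: round(last[name] - first[name], 1) for name in first}

for name, delta_c in changes.items():
    print(f"{name}: {delta_c:+.1f} C ({delta_c_to_f(delta_c):+.1f} F)")
```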

## Technologies

- Python 3.x
- SQLite
- Open-Meteo API
- pandas and NumPy for data handling
- matplotlib for visualization

## Project Structure
```
.
├── main.py # Main execution script
├── model_builder.py # Data retrieval and storage
├── control_data.py # Statistical analysis
├── view_data.py # Visualization generation
└── Chicago_temp.db # SQLite database (generated)
```

## Author

Steve Eckardt
October 2023 - December 2023
Professional Software Development Bootcamp, Open Avenues Foundation
This project is for educational and portfolio purposes.
Binary file added final/Chicagos_annual_temp.png
Binary file added final/Chicagos_decadal_temp.png
90 changes: 90 additions & 0 deletions final/control_data.py
@@ -0,0 +1,90 @@
"""
Author: Steve Eckardt <seckardt@pacbell.net>
Project: oa professional software development bootcamp
Program Name: control_data.py
Revised: January 3rd 2024
License: MIT
"""

import sqlite3
import pandas as pd
import numpy as np

sqlite_file = 'Chicago_temp.db'
conn = sqlite3.connect(sqlite_file)

column_names = ['Year', 'Mean', 'Max', 'Min', 'Std']
annual_df = pd.DataFrame(columns=column_names)
annual_df.set_index('Year', inplace=True)

for year_index in range(1940, 2024):
    query = f"SELECT * FROM hourly_data WHERE strftime('%Y', date) = '{year_index}';"
    selected_year_df = pd.read_sql_query(query, conn)
    temp_array = selected_year_df['temperature_2m'].values

    mean_value = round(np.mean(temp_array),3)
    max_value = round(np.max(temp_array),3)
    min_value = round(np.min(temp_array),3)
    std_value = round(np.std(temp_array),3)

    new_row = {
        'Year': year_index,
        'Mean': mean_value,
        'Max': max_value,
        'Min': min_value,
        'Std': std_value
    }

    annual_df.loc[year_index] = new_row

annual_df['Mean+Std'] = annual_df['Mean'] + annual_df['Std']
annual_df['Mean-Std'] = annual_df['Mean'] - annual_df['Std']
new_order = ['Max', 'Mean+Std', 'Mean', 'Mean-Std', 'Min', 'Std']
annual_df = annual_df[new_order]

annual_df.to_sql('annual_data', conn, if_exists='replace')




column_names[0] = 'Decade'
decadal_df = pd.DataFrame(columns=column_names)
decadal_df.set_index('Decade', inplace=True)

for start_year in range(1940, 2011, 10):
    end_year = start_year + 9
    query = f"SELECT * FROM hourly_data WHERE strftime('%Y', date) >= '{start_year}' AND strftime('%Y', date) <= '{end_year}';"
    selected_decade_df = pd.read_sql_query(query, conn)
    temp_array = selected_decade_df['temperature_2m'].values

    mean_value = round(np.mean(temp_array),3)
    max_value = round(np.max(temp_array),3)
    min_value = round(np.min(temp_array),3)
    std_value = round(np.std(temp_array),3)

    new_row = {
        'Decade': start_year,
        'Mean': mean_value,
        'Max': max_value,
        'Min': min_value,
        'Std': std_value
    }

    decadal_df.loc[start_year] = new_row

decadal_df['Mean+Std'] = decadal_df['Mean'] + decadal_df['Std']
decadal_df['Mean-Std'] = decadal_df['Mean'] - decadal_df['Std']
decadal_df = decadal_df[new_order]

decadal_df.to_sql('decadal_data', conn, if_exists='replace')

conn.close()

print()
print(annual_df)
# Keep the index so the Year column is written to the CSV.
annual_df.to_csv('Chicago_annual_temp.csv')

print()
print(decadal_df)
# Keep the index so the Decade column is written to the CSV.
decadal_df.to_csv('Chicago_decadal_temp.csv')

43 changes: 43 additions & 0 deletions final/main.py
@@ -0,0 +1,43 @@
"""
Author: Steve Eckardt <seckardt@pacbell.net>
Project: oa professional software development bootcamp
Program Name: main.py
Revised: January 3rd 2024
License: MIT

The goal of this project is to observe temperature changes in Chicago, Illinois, over the last 80 years.

Steps:
Model builder:
1. Download the hourly measurements from the Open-Meteo API.
2. Create a Chicago_temp.db database and save the temperature data into a table.
3. Save the temperature data into a CSV file and print temperature data to the screen.
Control Data:
4. Calculate the annual min, max, mean, and standard deviation of the data set.
5. Calculate the decadal min, max, mean, and standard deviation of the data set.
6. Save the calculations to new tables in the database and to CSV files, and print them to the screen.
View Data:
7. Plot a line graph of the annual data and the decade data.

The two graphs and a screen capture of the program run are attached.

Observations and Conclusions:
Decade     Max  Mean+Std    Mean  Mean-Std      Min     Std
1940    35.541    20.813   9.215    -2.383  -27.009  11.598
1950    36.126    21.429   9.958    -1.513  -26.374  11.471
1960    36.176    21.195   9.404    -2.387  -28.774  11.791
1970    35.376    21.292   9.466    -2.360  -26.474  11.826
1980    36.976    21.546   9.834    -1.878  -31.824  11.712
1990    37.426    21.302  10.291    -0.720  -32.524  11.011
2000    36.726    21.645  10.375    -0.895  -28.024  11.270
2010    36.876    21.541  10.246    -1.049  -32.437  11.295
Over the past 80 years, the data shows that average and maximum temperatures have increased by about 1° C (about 2° F).
Interestingly, the minimum temperature decreased by about 5° C (about 10° F).
"""
import subprocess
import sys

# Run each stage with the same interpreter that launched this script.
try:
    subprocess.run([sys.executable, 'model_builder.py'], check=True)
    subprocess.run([sys.executable, 'control_data.py'], check=True)
    subprocess.run([sys.executable, 'view_data.py'], check=True)
except subprocess.CalledProcessError as e:
    print(f"Error running the script: {e}")
71 changes: 71 additions & 0 deletions final/model_builder.py
@@ -0,0 +1,71 @@
"""
Author: Steve Eckardt <seckardt@pacbell.net>
Original code by Patrick Zippenfenig Source: https://github.com/open-meteo/sdk
Project: oa professional software development bootcamp
Program Name: model_builder.py
Revised: January 3rd 2024
License: MIT
"""

import openmeteo_requests
import requests_cache
import pandas as pd
import matplotlib.pyplot as plt
from retry_requests import retry
import sqlite3

# Setup the Open-Meteo API client with cache and retry on error
cache_session = requests_cache.CachedSession('.cache', expire_after = -1)
retry_session = retry(cache_session, retries = 5, backoff_factor = 0.2)
openmeteo = openmeteo_requests.Client(session = retry_session)

# Make sure all required weather variables are listed here
# The order of variables in hourly or daily is important to assign them correctly below
url = "https://archive-api.open-meteo.com/v1/archive"
params = {
    "latitude": 41.85003,
    "longitude": -87.65005,  # Chicago, Illinois
    "start_date": "1940-01-01",
    # "start_date": "2023-12-30",
    "end_date": "2023-12-31",
    "hourly": "temperature_2m",
    "timezone": "America/Chicago"  # Chicago's IANA time zone
}
responses = openmeteo.weather_api(url, params=params)

# Process first location. Add a for-loop for multiple locations or weather models
response = responses[0]
print(f"Coordinates {response.Latitude()}°N {response.Longitude()}°E")
print(f"Elevation {response.Elevation()} m asl")
print(f"Timezone {response.Timezone()} {response.TimezoneAbbreviation()}")
print(f"Timezone difference to GMT+0 {response.UtcOffsetSeconds()} s")

# Process hourly data. The order of variables needs to be the same as requested.
hourly = response.Hourly()
hourly_temperature_2m = hourly.Variables(0).ValuesAsNumpy()

hourly_data = {"date": pd.date_range(
    start = pd.to_datetime(hourly.Time(), unit = "s"),
    end = pd.to_datetime(hourly.TimeEnd(), unit = "s"),
    freq = pd.Timedelta(seconds = hourly.Interval()),
    inclusive = "left"
)}
hourly_data["temperature_2m"] = hourly_temperature_2m

hourly_dataframe = pd.DataFrame(data = hourly_data)
print(hourly_dataframe)

first_column_type = hourly_dataframe['date'].dtype
print(f"The data type of the first column is: {first_column_type}")

second_column_type = hourly_dataframe['temperature_2m'].dtype
print(f"The data type of the second column is: {second_column_type}")

# Save the DataFrame to a CSV file
hourly_dataframe.to_csv('Chicago_hourly_temp.csv', index=False)

# create SQLite database file
sqlite_file = 'Chicago_temp.db'
conn = sqlite3.connect(sqlite_file)
hourly_dataframe.to_sql('hourly_data', conn, index=False, if_exists='replace')
conn.close()
50 changes: 50 additions & 0 deletions final/screen_capture.txt
@@ -0,0 +1,50 @@
Q:\NotePad>python main.py
Coordinates 41.8629150390625°E -87.64877319335938°N
Elevation 178.0 m asl
Timezone b'America/Los_Angeles' b'PST'
Timezone difference to GMT+0 -28800 s
date temperature_2m
0 1940-01-01 08:00:00 -14.359000
1 1940-01-01 09:00:00 -14.459001
2 1940-01-01 10:00:00 -13.809000
3 1940-01-01 11:00:00 -14.159000
4 1940-01-01 12:00:00 -14.459001
... ... ...
736339 2024-01-01 03:00:00 0.213000
736340 2024-01-01 04:00:00 0.263000
736341 2024-01-01 05:00:00 0.563000
736342 2024-01-01 06:00:00 0.363000
736343 2024-01-01 07:00:00 0.513000

[736344 rows x 2 columns]
The data type of the first column is: datetime64[ns]
The data type of the second column is: float32

Max Mean+Std Mean Mean-Std Min Std
Year
1940 34.691 19.651 7.666 -4.319 -26.459 11.985
1941 34.491 21.023 9.176 -2.671 -21.709 11.847
1942 32.691 20.861 8.802 -3.257 -27.009 12.059
1943 31.541 19.990 8.263 -3.464 -26.159 11.727
1944 34.691 21.226 9.839 -1.548 -21.059 11.387
... ... ... ... ... ... ...
2019 33.963 20.664 9.461 -1.742 -32.437 11.203
2020 35.363 20.987 10.675 0.363 -22.087 10.312
2021 33.413 22.436 11.235 0.034 -21.887 11.201
2022 37.413 21.752 10.046 -1.660 -24.437 11.706
2023 37.013 20.524 10.986 1.448 -19.437 9.538

[84 rows x 6 columns]

Max Mean+Std Mean Mean-Std Min Std
Decade
1940 35.541 20.813 9.215 -2.383 -27.009 11.598
1950 36.126 21.429 9.958 -1.513 -26.374 11.471
1960 36.176 21.195 9.404 -2.387 -28.774 11.791
1970 35.376 21.292 9.466 -2.360 -26.474 11.826
1980 36.976 21.546 9.834 -1.878 -31.824 11.712
1990 37.426 21.302 10.291 -0.720 -32.524 11.011
2000 36.726 21.645 10.375 -0.895 -28.024 11.270
2010 36.876 21.541 10.246 -1.049 -32.437 11.295

Q:\NotePad>
50 changes: 50 additions & 0 deletions final/view_data.py
@@ -0,0 +1,50 @@
"""
Author: Steve Eckardt <seckardt@pacbell.net>
Project: oa professional software development bootcamp
Program Name: view_data.py
Revised: January 3rd 2024
License: MIT
"""
import sqlite3
import pandas as pd
import matplotlib.pyplot as plt

sqlite_file = 'Chicago_temp.db'
conn = sqlite3.connect(sqlite_file)
query = "SELECT * FROM annual_data"
annual_df = pd.read_sql_query(query, conn)
query = "SELECT * FROM decadal_data"
decadal_df = pd.read_sql_query(query, conn)
conn.close()



annual_df = annual_df.drop('Std', axis=1)
x = annual_df['Year']

plt.xlabel('Year')
plt.ylabel('Temperatures in Celsius')
plt.title('Chicago\'s Annual Temperatures')
plt.plot(x, annual_df['Max'], color='#FF0000', label='Max') # Hot Red
plt.plot(x, annual_df['Mean+Std'], color='#FFA500', label='Mean+Std') # Warm Orange
plt.plot(x, annual_df['Mean'], color='#FFFF00', label='Mean') # Neutral Yellow
plt.plot(x, annual_df['Mean-Std'], color='#0000FF', label='Mean-Std') # Cool Blue
plt.plot(x, annual_df['Min'], color='#800080', label='Min') # Cold Purple
plt.legend( loc='lower left', bbox_to_anchor=(1, 0))
plt.subplots_adjust(right=0.8)
plt.show()


x = decadal_df['Decade']
decadal_df = decadal_df.drop('Std', axis=1)
plt.plot(x, decadal_df['Max'], color='#FF0000', label='Max') # Hot Red
plt.plot(x, decadal_df['Mean+Std'], color='#FFA500', label='Mean+Std') # Warm Orange
plt.plot(x, decadal_df['Mean'], color='#FFFF00', label='Mean') # Neutral Yellow
plt.plot(x, decadal_df['Mean-Std'], color='#0000FF', label='Mean-Std') # Cool Blue
plt.plot(x, decadal_df['Min'], color='#800080', label='Min') # Cold Purple
plt.xlabel('Decade')
plt.ylabel('Temperatures in Celsius')
plt.title('Chicago\'s Decadal Temperatures')
plt.legend(loc='lower left', bbox_to_anchor=(1, 0))
plt.subplots_adjust(right=0.8)
plt.show()