heemonsu · SteveEckardt · Nov 27, 2023 · Nov 27, 2023 · Nov 27, 2023 · Dec 4, 2023
diff --git a/README.md b/README.md
@@ -1,2 +1,66 @@
-# oaf-psd-bootcamp
-Initial repository for the Open Avenues: Professional Software Development Bootcamp
+# Chicago Temperature Analysis (1940-2020)
+
+Python-based statistical analysis of 80 years of temperature data from Chicago, Illinois.
+
+## Overview
+
+This project analyzes long-term climate trends using hourly temperature measurements from the Open-Meteo API. Between the 1940s and the 2010s, the decade-level mean and maximum temperatures rose by about 1°C, while the extreme minimum dropped by about 5°C.
+
+## Features
+
+- Automated data retrieval from Open-Meteo API
+- SQLite database storage (Chicago_temp.db)
+- Statistical calculations (min, max, mean, standard deviation)
+- Annual and decade-level aggregations
+- Data visualization with line graphs
+- CSV export functionality
+
+Running `main.py` executes three modules in sequence:
+1. `model_builder.py` - Downloads and stores temperature data
+2. `control_data.py` - Calculates statistical measures
+3. `view_data.py` - Generates visualizations
+
+## Key Findings
+
+Comparing the 1940s with the 2010s (decade-level values from the table below):
+
+- Mean temperature increased by about 1°C (about 2°F)
+- Maximum temperature increased by about 1°C (about 2°F)
+- Minimum temperature decreased by about 5°C (about 10°F)
+
+| Decade | Max   | Mean  | Min     | Std Dev |
+|--------|-------|-------|---------|---------|
+| 1940   | 35.5  | 9.2   | -27.0   | 11.6    |
+| 1950   | 36.1  | 10.0  | -26.4   | 11.5    |
+| 1960   | 36.2  | 9.4   | -28.8   | 11.8    |
+| 1970   | 35.4  | 9.5   | -26.5   | 11.8    |
+| 1980   | 37.0  | 9.8   | -31.8   | 11.7    |
+| 1990   | 37.4  | 10.3  | -32.5   | 11.0    |
+| 2000   | 36.7  | 10.4  | -28.0   | 11.3    |
+| 2010   | 36.9  | 10.2  | -32.4   | 11.3    |
+
+*All temperatures in Celsius*
+
+## Technologies
+
+- Python 3.x
+- SQLite
+- Open-Meteo API
+- pandas, NumPy, and matplotlib
+
+## Project Structure
+```
+.
+├── main.py              # Main execution script
+├── model_builder.py     # Data retrieval and storage
+├── control_data.py      # Statistical analysis
+├── view_data.py         # Visualization generation
+└── Chicago_temp.db      # SQLite database (generated)
+```
+
+## Author
+
+Steve Eckardt  
+October 2023 - December 2023  
+Professional Software Development Bootcamp, Open Avenues Foundation   
+This project is for educational and portfolio purposes.  
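
Review note on the README's Key Findings: the rounded changes can be re-derived from the decadal table. The sketch below is illustrative only; it copies the 1940 and 2010 rows from the recorded run, and the helper name `delta_c_to_f` is hypothetical. It assumes the findings compare the first decade with the last. A temperature difference converts to Fahrenheit with the 9/5 factor alone, without the 32-degree offset.

```python
# Decadal statistics copied from the recorded run (degrees Celsius).
decade_1940 = {"max": 35.541, "mean": 9.215, "min": -27.009}
decade_2010 = {"max": 36.876, "mean": 10.246, "min": -32.437}


def delta_c_to_f(delta_c):
    """Convert a temperature difference; the 32-degree offset cancels out."""
    return delta_c * 9 / 5


# Change from the 1940s to the 2010s for each statistic.
changes_c = {k: round(decade_2010[k] - decade_1940[k], 3) for k in decade_1940}
changes_f = {k: round(delta_c_to_f(v), 1) for k, v in changes_c.items()}

for key in ("mean", "max", "min"):
    print(f"{key}: {changes_c[key]:+.3f} °C ({changes_f[key]:+.1f} °F)")
```

Under that assumption the mean and maximum rise by roughly 1.0 and 1.3 °C, and the minimum falls by roughly 5.4 °C, which matches the rounded figures in the README.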
diff --git a/final/Chicagos_annual_temp.png b/final/Chicagos_annual_temp.png
diff --git a/final/Chicagos_decadal_temp.png b/final/Chicagos_decadal_temp.png
diff --git a/final/control_data.py b/final/control_data.py
@@ -0,0 +1,90 @@
+"""
+Author: Steve Eckardt <seckardt@pacbell.net>
+Project: oa professional software development bootcamp
+Program Name: control_data.py
+Revised: January 3rd 2024
+License: MIT
+"""
+
+import sqlite3
+import pandas as pd
+import numpy as np
+
+sqlite_file = 'Chicago_temp.db'
+conn = sqlite3.connect(sqlite_file)
+
+column_names = ['Year', 'Mean', 'Max', 'Min', 'Std']
+annual_df = pd.DataFrame(columns=column_names)
+annual_df.set_index('Year', inplace=True)
+
+for year_index in range(1940, 2024):
+    query = f"SELECT * FROM hourly_data WHERE strftime('%Y', date) = '{year_index}';"
+    selected_year_df = pd.read_sql_query(query, conn)
+    temp_array = selected_year_df['temperature_2m'].values
+
+    mean_value = round(np.mean(temp_array),3)
+    max_value = round(np.max(temp_array),3)
+    min_value = round(np.min(temp_array),3)
+    std_value = round(np.std(temp_array),3)
+
+    new_row = {
+        'Year': year_index,
+        'Mean': mean_value,
+        'Max': max_value,
+        'Min': min_value,
+        'Std': std_value
+    }
+
+    annual_df.loc[year_index] = new_row
+
+annual_df['Mean+Std'] = annual_df['Mean'] + annual_df['Std']
+annual_df['Mean-Std'] = annual_df['Mean'] - annual_df['Std']
+new_order = ['Max', 'Mean+Std', 'Mean', 'Mean-Std', 'Min', 'Std']
+annual_df = annual_df[new_order]
+
+annual_df.to_sql('annual_data', conn, if_exists='replace')
+
+
+
+
+column_names[0] = 'Decade'
+decadal_df = pd.DataFrame(columns=column_names)
+decadal_df.set_index('Decade', inplace=True)
+
+for start_year in range(1940, 2011, 10):
+    end_year = start_year + 9
+    query = f"SELECT * FROM hourly_data WHERE strftime('%Y', date) >= '{start_year}' AND strftime('%Y', date) <= '{end_year}';"
+    selected_decade_df = pd.read_sql_query(query, conn)
+    temp_array = selected_decade_df['temperature_2m'].values
+
+    mean_value = round(np.mean(temp_array),3)
+    max_value = round(np.max(temp_array),3)
+    min_value = round(np.min(temp_array),3)
+    std_value = round(np.std(temp_array),3)
+
+    new_row = {
+        'Decade': start_year,
+        'Mean': mean_value,
+        'Max': max_value,
+        'Min': min_value,
+        'Std': std_value
+    }
+
+    decadal_df.loc[start_year] = new_row
+
+decadal_df['Mean+Std'] = decadal_df['Mean'] + decadal_df['Std']
+decadal_df['Mean-Std'] = decadal_df['Mean'] - decadal_df['Std']
+decadal_df = decadal_df[new_order]
+
+decadal_df.to_sql('decadal_data', conn, if_exists='replace')
+
+conn.close()
+
+print()
+print(annual_df)
+annual_df.to_csv('Chicago_annual_temp.csv')  # keep the Year index as a column
+
+print()
+print(decadal_df)
+decadal_df.to_csv('Chicago_decadal_temp.csv')  # keep the Decade index as a column
+
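
Review note on `control_data.py`: the decade query builds its SQL with an f-string. A possible alternative, sketched below with only the standard library, passes the year bounds as bound parameters and computes the same four statistics. This is not the author's code: the function name `decade_stats` and the in-memory sample rows are hypothetical, and `statistics.pstdev` is used on the assumption that it should match the population standard deviation that `np.std` returns by default.

```python
import sqlite3
import statistics


def decade_stats(conn, start_year):
    """Return (max, mean, min, population std) for one decade of hourly rows."""
    rows = conn.execute(
        "SELECT temperature_2m FROM hourly_data "
        "WHERE CAST(strftime('%Y', date) AS INTEGER) BETWEEN ? AND ?",
        (start_year, start_year + 9),
    ).fetchall()
    temps = [row[0] for row in rows]
    return (
        round(max(temps), 3),
        round(statistics.fmean(temps), 3),
        round(min(temps), 3),
        round(statistics.pstdev(temps), 3),
    )


# Tiny made-up table standing in for Chicago_temp.db.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hourly_data (date TEXT, temperature_2m REAL)")
conn.executemany(
    "INSERT INTO hourly_data VALUES (?, ?)",
    [
        ("1940-01-01 08:00:00", -10.0),
        ("1945-07-01 20:00:00", 30.0),
        ("1949-12-31 23:00:00", 10.0),
        ("1950-01-01 00:00:00", 99.0),  # next decade, must be excluded
    ],
)
stats_1940s = decade_stats(conn, 1940)
conn.close()
print(stats_1940s)
```

Bound parameters keep the query text constant across decades, so the same statement can be reused for the annual loop by passing the same year twice.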
diff --git a/final/main.py b/final/main.py
@@ -0,0 +1,43 @@
+"""
+Author: Steve Eckardt <seckardt@pacbell.net>
+Project: oa professional software development bootcamp
+Program Name: main.py
+Revised: January 3rd 2024
+License: MIT
+
+The goal of this project is to observe temperature changes in Chicago, Illinois, over the last 80 years.
+
+Steps:
+Model builder:
+1. Download the hourly measurements from the Open-Meteo API site.
+2. Create a Chicago_temp.db database and save the temperature data into a table.
+3. Save the temperature data into a CSV file and print temperature data to the screen.
+Control Data:
+4. Calculate the data set's annual min, max, mean, and standard deviation.
+5. Calculate the data set's decadal min, max, mean, and standard deviation.
+6. Save the calculations to a new table in the database, a CSV file, and print to the screen.
+View Data:
+7. Plot a line graph of the annual data and the decade data.
+
+The two graphs and a screen capture of the program run are attached.
+
+Observations and Conclusions:
+Decade     Max  Mean+Std    Mean  Mean-Std     Min     Std
+  1940  35.541    20.813   9.215    -2.383 -27.009  11.598
+  1950  36.126    21.429   9.958    -1.513 -26.374  11.471
+  1960  36.176    21.195   9.404    -2.387 -28.774  11.791
+  1970  35.376    21.292   9.466    -2.360 -26.474  11.826
+  1980  36.976    21.546   9.834    -1.878 -31.824  11.712
+  1990  37.426    21.302  10.291    -0.720 -32.524  11.011
+  2000  36.726    21.645  10.375    -0.895 -28.024  11.270
+  2010  36.876    21.541  10.246    -1.049 -32.437  11.295
+Over the past 80 years, the data shows that the average and maximum temperatures have increased by about 1 °C (about 2 °F).
+Interestingly, the minimum temperature decreased by about 5 °C (about 10 °F).
+"""
+import subprocess
+import sys
+try:
+    for script in ("model_builder.py", "control_data.py", "view_data.py"):
+        subprocess.run([sys.executable, script], check=True)  # same interpreter as main.py
+except subprocess.CalledProcessError as e:
+    sys.exit(f"Error running the script: {e}")  # stop with a non-zero exit code
diff --git a/final/model_builder.py b/final/model_builder.py
@@ -0,0 +1,71 @@
+"""
+Author: Steve Eckardt <seckardt@pacbell.net>
+Original code by Patrick Zippenfenig Source: https://github.com/open-meteo/sdk
+Project: oa professional software development bootcamp
+Program Name: model_builder.py
+Revised: January 3rd 2024
+License: MIT
+"""
+
+import openmeteo_requests
+import requests_cache
+import pandas as pd
+import matplotlib.pyplot as plt
+from retry_requests import retry
+import sqlite3
+
+# Set up the Open-Meteo API client with cache and retry on error
+cache_session = requests_cache.CachedSession('.cache', expire_after = -1)
+retry_session = retry(cache_session, retries = 5, backoff_factor = 0.2)
+openmeteo = openmeteo_requests.Client(session = retry_session)
+
+# Make sure all required weather variables are listed here
+# The order of variables in hourly or daily is important to assign them correctly below
+url = "https://archive-api.open-meteo.com/v1/archive"
+params = {
+	"latitude": 41.85003,
+	"longitude": -87.65005,  #Chicago, Illinois
+	"start_date": "1940-01-01",
+#    "start_date": "2023-12-30",
+	"end_date": "2023-12-31",
+	"hourly": "temperature_2m",
+	"timezone": "America/Chicago"  # Chicago is in US Central time
+}
+responses = openmeteo.weather_api(url, params=params)
+
+# Process first location. Add a for-loop for multiple locations or weather models
+response = responses[0]
+print(f"Coordinates {response.Latitude()}°N {response.Longitude()}°E")
+print(f"Elevation {response.Elevation()} m asl")
+print(f"Timezone {response.Timezone()} {response.TimezoneAbbreviation()}")
+print(f"Timezone difference to GMT+0 {response.UtcOffsetSeconds()} s")
+
+# Process hourly data. The order of variables needs to be the same as requested.
+hourly = response.Hourly()
+hourly_temperature_2m = hourly.Variables(0).ValuesAsNumpy()
+
+hourly_data = {"date": pd.date_range(
+	start = pd.to_datetime(hourly.Time(), unit = "s"),
+	end = pd.to_datetime(hourly.TimeEnd(), unit = "s"),
+	freq = pd.Timedelta(seconds = hourly.Interval()),
+	inclusive = "left"
+)}
+hourly_data["temperature_2m"] = hourly_temperature_2m
+
+hourly_dataframe = pd.DataFrame(data = hourly_data)
+print(hourly_dataframe)
+
+first_column_type = hourly_dataframe['date'].dtype
+print(f"The data type of the first column is: {first_column_type}")
+
+second_column_type = hourly_dataframe['temperature_2m'].dtype
+print(f"The data type of the second column is: {second_column_type}")
+
+# Save the DataFrame to a CSV file
+hourly_dataframe.to_csv('Chicago_hourly_temp.csv', index=False)
+
+# create SQLite database file
+sqlite_file = 'Chicago_temp.db'
+conn = sqlite3.connect(sqlite_file)
+hourly_dataframe.to_sql('hourly_data', conn, index=False, if_exists='replace')
+conn.close()
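
Review note on `model_builder.py`: `pd.to_datetime(..., unit="s")` yields naive UTC timestamps, so the `date` column stored in `hourly_data` is in UTC regardless of the requested timezone. The sketch below uses only the standard library and the two values printed in the recorded run (first timestamp and UTC offset) to show how a stored value maps back to local time. The helper name `to_local` is hypothetical, and a fixed offset is an assumption that ignores daylight saving time.

```python
from datetime import datetime, timedelta

# Values taken from the recorded run.
first_utc = datetime(1940, 1, 1, 8, 0, 0)  # first 'date' value
utc_offset_seconds = -28800                # "Timezone difference to GMT+0"


def to_local(naive_utc, offset_seconds):
    """Shift a naive UTC timestamp by a fixed UTC offset in seconds."""
    return naive_utc + timedelta(seconds=offset_seconds)


local_start = to_local(first_utc, utc_offset_seconds)
print(local_start)  # local midnight on the requested start date

# Central Standard Time is UTC-6, i.e. -21600 seconds.
chicago_standard = to_local(first_utc, -21600)
print(chicago_standard)
```

Because the annual and decadal queries group on the stored UTC value, the year boundaries in the statistics are offset from local midnight by the same number of hours.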
diff --git a/final/screen_capture.txt b/final/screen_capture.txt
@@ -0,0 +1,50 @@
+Q:\NotePad>python main.py
+Coordinates 41.8629150390625°E -87.64877319335938°N
+Elevation 178.0 m asl
+Timezone b'America/Los_Angeles' b'PST'
+Timezone difference to GMT+0 -28800 s
+                      date  temperature_2m
+0      1940-01-01 08:00:00      -14.359000
+1      1940-01-01 09:00:00      -14.459001
+2      1940-01-01 10:00:00      -13.809000
+3      1940-01-01 11:00:00      -14.159000
+4      1940-01-01 12:00:00      -14.459001
+...                    ...             ...
+736339 2024-01-01 03:00:00        0.213000
+736340 2024-01-01 04:00:00        0.263000
+736341 2024-01-01 05:00:00        0.563000
+736342 2024-01-01 06:00:00        0.363000
+736343 2024-01-01 07:00:00        0.513000
+
+[736344 rows x 2 columns]
+The data type of the first column is: datetime64[ns]
+The data type of the second column is: float32
+
+         Max  Mean+Std    Mean  Mean-Std     Min     Std
+Year
+1940  34.691    19.651   7.666    -4.319 -26.459  11.985
+1941  34.491    21.023   9.176    -2.671 -21.709  11.847
+1942  32.691    20.861   8.802    -3.257 -27.009  12.059
+1943  31.541    19.990   8.263    -3.464 -26.159  11.727
+1944  34.691    21.226   9.839    -1.548 -21.059  11.387
+...      ...       ...     ...       ...     ...     ...
+2019  33.963    20.664   9.461    -1.742 -32.437  11.203
+2020  35.363    20.987  10.675     0.363 -22.087  10.312
+2021  33.413    22.436  11.235     0.034 -21.887  11.201
+2022  37.413    21.752  10.046    -1.660 -24.437  11.706
+2023  37.013    20.524  10.986     1.448 -19.437   9.538
+
+[84 rows x 6 columns]
+
+           Max  Mean+Std    Mean  Mean-Std     Min     Std
+Decade
+1940    35.541    20.813   9.215    -2.383 -27.009  11.598
+1950    36.126    21.429   9.958    -1.513 -26.374  11.471
+1960    36.176    21.195   9.404    -2.387 -28.774  11.791
+1970    35.376    21.292   9.466    -2.360 -26.474  11.826
+1980    36.976    21.546   9.834    -1.878 -31.824  11.712
+1990    37.426    21.302  10.291    -0.720 -32.524  11.011
+2000    36.726    21.645  10.375    -0.895 -28.024  11.270
+2010    36.876    21.541  10.246    -1.049 -32.437  11.295
+
+Q:\NotePad>
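
Review note on the screen capture: the reported row count can be cross-checked against the requested date range. A minimal sketch, assuming one row per hour from the start of 1940-01-01 through the end of 2023-12-31 with no gaps:

```python
from datetime import datetime

start = datetime(1940, 1, 1)
end_exclusive = datetime(2024, 1, 1)  # day after the requested end_date

# Whole days in the range (leap days included) times 24 hourly rows per day.
expected_rows = (end_exclusive - start).days * 24
print(expected_rows)
```

The result agrees with the `[736344 rows x 2 columns]` line in the capture, which suggests the download contains no missing hours.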
diff --git a/final/view_data.py b/final/view_data.py
@@ -0,0 +1,50 @@
+"""
+Author: Steve Eckardt <seckardt@pacbell.net>
+Project: oa professional software development bootcamp
+Program Name: view_data.py
+Revised: January 3rd 2024
+License: MIT
+"""
+import sqlite3
+import pandas as pd
+import matplotlib.pyplot as plt
+
+sqlite_file = 'Chicago_temp.db'
+conn = sqlite3.connect(sqlite_file)
+query = "SELECT * FROM annual_data"
+annual_df = pd.read_sql_query(query, conn)
+query = "SELECT * FROM decadal_data"
+decadal_df = pd.read_sql_query(query, conn)
+conn.close()
+
+
+
+annual_df = annual_df.drop('Std', axis=1)
+x = annual_df['Year']
+
+plt.xlabel('Year')
+plt.ylabel('Temperatures in Celsius')
+plt.title('Chicago\'s Annual Temperatures')
+plt.plot(x, annual_df['Max'], color='#FF0000', label='Max') # Hot Red
+plt.plot(x, annual_df['Mean+Std'], color='#FFA500', label='Mean+Std') # Warm Orange
+plt.plot(x, annual_df['Mean'], color='#FFFF00', label='Mean') # Neutral Yellow
+plt.plot(x, annual_df['Mean-Std'], color='#0000FF', label='Mean-Std') # Cool Blue
+plt.plot(x, annual_df['Min'], color='#800080', label='Min') # Cold Purple
+plt.legend( loc='lower left', bbox_to_anchor=(1, 0))
+plt.subplots_adjust(right=0.8)
+plt.show()
+
+
+x = decadal_df['Decade']
+decadal_df = decadal_df.drop('Std', axis=1)
+plt.plot(x, decadal_df['Max'], color='#FF0000', label='Max') # Hot Red
+plt.plot(x, decadal_df['Mean+Std'], color='#FFA500', label='Mean+Std') # Warm Orange
+plt.plot(x, decadal_df['Mean'], color='#FFFF00', label='Mean') # Neutral Yellow
+plt.plot(x, decadal_df['Mean-Std'], color='#0000FF', label='Mean-Std') # Cool Blue
+plt.plot(x, decadal_df['Min'], color='#800080', label='Min') # Cold Purple
+plt.xlabel('Decade')
+plt.ylabel('Temperatures in Celsius')
+plt.title('Chicago\'s Decadal Temperatures')
+plt.legend(loc='lower left', bbox_to_anchor=(1, 0))
+plt.subplots_adjust(right=0.8)
+plt.show()