Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
198 changes: 198 additions & 0 deletions Geospatial Data Science Internship/ShivangPandey/TASK1.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,198 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "b567b773",
"metadata": {
"id": "b567b773"
},
"source": [
"# TASK-1 \n",
"\n",
"This task is to provide a JSON output which summarised the number of fires per administrative boundary per year in Telangana. The json output should include the following keys - adm_name, year, fireCount. Please use the same key names, and the boundary names\n",
"\n",
"First, all necessary libraries has to be installed like pandas, rtree, pygeos and pandas.<br>\n",
"pandas: This library can be used for converting a set of data into a dataframe and do necessary operations on them.\n",
"geopandas: This library can be used as pandas for spatial data and for their vizualization <br>\n",
"pygeos, rtree: These are the dependent libraries which are has to be installed for geopandas<br>"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c1ead88d",
"metadata": {
"id": "qnQPg7AsV8ME"
},
"outputs": [],
"source": [
"!pip install pandas\n",
"!pip install rtree\n",
"!pip install pygeos\n",
"!pip install geopandas"
]
},
{
"cell_type": "markdown",
"id": "17a3633b",
"metadata": {},
"source": [
"## Importing Libraries\n",
"\n",
"geopandas and pandas are imported as gpd and pd respectively to use their functions with shortforms"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ba618cf6",
"metadata": {
"id": "qnQPg7AsV8ME"
},
"outputs": [],
"source": [
"import geopandas as gpd\n",
"import pandas as pd"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "qnQPg7AsV8ME",
"metadata": {
"id": "qnQPg7AsV8ME"
},
"outputs": [],
"source": [
"#---------------------THIS PART IS USED TO WORK FROM DRIVE---------------#\n",
"#from google.colab import drive\n",
"#drive.mount('/content/drive/')\n",
"\n",
"#import os\n",
"#os.chdir(r'/content/drive/MyDrive/undp')"
]
},
{
"cell_type": "markdown",
"id": "c18f6930",
"metadata": {},
"source": [
"## Creating DataFrame with given datasets\n",
"\n",
"\n",
"Firstly, we need to download the administrative boundaries of India from [GADM level 3](https://gadm.org/), and further features are extracted for the Telangana state boundary using QGIS. <br>\n",
"<br>\n",
"As fire_points are read as comma-separated values, we have to assign latitude and longitude values to use as a geodataframe. We will create a pandas data frame _'fire_points'_ and extract the longitude and latitude column from it. Further, we will assign geometry to each point using its longitude, latitude values in new geodataframe _'newdf'_.To create a geodataframe (_'pointstodist'_) which can show each fire point with administrative boundaries in which it is present, the administrative boundaries shapefile (dist.shp) and fire points in Telangana boundary are read using geopandas (_'adm_name'_ and _'nwedf'_)and joined together using _'sjoin'_ function of geopandas with the intersection of keys from both GeoDataFrames."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b0acde00",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"executionInfo": {
"elapsed": 3,
"status": "ok",
"timestamp": 1646991744057,
"user": {
"displayName": "Shivang Pandey",
"photoUrl": "https://lh3.googleusercontent.com/a-/AOh14Ghcuw4QK3NL6bdbVpWe3vKu-Tr2lMCm2-IlZkPN=s64",
"userId": "06979323684164181966"
},
"user_tz": -330
},
"id": "b0acde00",
"outputId": "3e99c3a0-3f1a-4622-ba82-6fb70e231989"
},
"outputs": [],
"source": [
"#shapefile of districts of Telangana\n",
"adm_name = gpd.read_file('dist.shp')\n",
"#shapefile of fire data given\n",
"fire_points = pd.read_csv('telangana_fires.csv') \n",
"\n",
"#--------------Geodataframe for fire points--------------#\n",
"newdf = gpd.GeoDataFrame(fire_points, geometry=gpd.points_from_xy(fire_points.longitude, fire_points.latitude), crs=4326)\n",
"\n",
"#--------------- using inner join to merge to geoDataFrames-------------#\n",
"pointstodist = gpd.sjoin(newdf, adm_name, how=\"inner\", op='intersects')"
]
},
{
"cell_type": "markdown",
"id": "9f568225",
"metadata": {},
"source": [
"## Creating JSON File:\n",
"\n",
"The JSON output should include the following keys - adm_name, year, fireCount. For this, we need to rename geodataframe column _'NAME_3'_ as _'adm_name'_. Next, we will fetch years for each fire point from _'acq_date'_ and dump it into a new column named _'year'_. Now we have to count the fire points for each year and administrative boundary. This is done by creating an empty column in the data frame named as _'FireCount'_ and summing fire points grouping by _'adm_name'_ & _'year'_ .\n",
"Now, we will create a data frame that will be containing only three columns of the previous data frame named as _'adm_name'_, _'year'_ and _'fireCount'_ to dump in a JSON file named as _'output1.json'_.<br>\n",
"<br>"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7d134ec4",
"metadata": {
"executionInfo": {
"elapsed": 2,
"status": "ok",
"timestamp": 1646991755052,
"user": {
"displayName": "Shivang Pandey",
"photoUrl": "https://lh3.googleusercontent.com/a-/AOh14Ghcuw4QK3NL6bdbVpWe3vKu-Tr2lMCm2-IlZkPN=s64",
"userId": "06979323684164181966"
},
"user_tz": -330
},
"id": "7d134ec4"
},
"outputs": [],
"source": [
"#renaming column by districts\n",
"pointstodist = pointstodist.rename(columns={\"NAME_3\":\"adm_name\"})\n",
"#getting year from the date\n",
"pointstodist['year'] = pd. DatetimeIndex(pointstodist['acq_date']).year\n",
"#declaring firecount column by 1\n",
"pointstodist['fireCount']=1\n",
"#suming the occurence of fire by grouping district and year\n",
"new_df = pointstodist.groupby(['adm_name','year']).sum() \n",
"new_df = new_df.reset_index()\n",
"\n",
"#creating new Dataframe to dumping json\n",
"json_pd = new_df[['adm_name','year','fireCount']]\n",
"#dumping to json file\n",
"json_pd = json_pd.to_json('output1.json', orient=\"records\", lines=True) "
]
}
],
"metadata": {
"colab": {
"name": "TASK1.ipynb",
"provenance": []
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.7"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Loading