undpindia · shivangpandey31 · Mar 13, 2022
diff --git a/Geospatial Data Science Internship/ShivangPandey/Map3.png b/Geospatial Data Science Internship/ShivangPandey/Map3.png
diff --git a/Geospatial Data Science Internship/ShivangPandey/TASK1.ipynb b/Geospatial Data Science Internship/ShivangPandey/TASK1.ipynb
@@ -0,0 +1,198 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "b567b773",
+   "metadata": {
+    "id": "b567b773"
+   },
+   "source": [
+    "# TASK-1 \n",
+    "\n",
+    "This task is to provide a JSON output which summarised the number of fires per administrative boundary per year in Telangana. The json output should include the following keys - adm_name, year, fireCount. Please use the same key names, and the boundary names\n",
+    "\n",
+    "First, all necessary libraries has to be installed like pandas, rtree, pygeos and pandas.<br>\n",
+    "pandas: This library can be used for converting a set of data into a dataframe and do necessary operations on them.\n",
+    "geopandas: This library can be used as pandas for spatial data and for their vizualization <br>\n",
+    "pygeos, rtree: These are the dependent libraries which are has to be installed for geopandas<br>"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c1ead88d",
+   "metadata": {
+    "id": "qnQPg7AsV8ME"
+   },
+   "outputs": [],
+   "source": [
+    "!pip install pandas\n",
+    "!pip install rtree\n",
+    "!pip install pygeos\n",
+    "!pip install geopandas"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "17a3633b",
+   "metadata": {},
+   "source": [
+    "## Importing Libraries\n",
+    "\n",
+    "geopandas and pandas are imported as gpd and pd respectively to use their functions with shortforms"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ba618cf6",
+   "metadata": {
+    "id": "qnQPg7AsV8ME"
+   },
+   "outputs": [],
+   "source": [
+    "import geopandas as gpd\n",
+    "import pandas as pd"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "qnQPg7AsV8ME",
+   "metadata": {
+    "id": "qnQPg7AsV8ME"
+   },
+   "outputs": [],
+   "source": [
+    "#---------------------THIS PART IS USED TO WORK FROM DRIVE---------------#\n",
+    "#from google.colab import drive\n",
+    "#drive.mount('/content/drive/')\n",
+    "\n",
+    "#import os\n",
+    "#os.chdir(r'/content/drive/MyDrive/undp')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c18f6930",
+   "metadata": {},
+   "source": [
+    "## Creating DataFrame with given datasets\n",
+    "\n",
+    "\n",
+    "Firstly, we need to download the administrative boundaries of India from [GADM level 3](https://gadm.org/), and further features are extracted for the Telangana state boundary using QGIS. <br>\n",
+    "<br>\n",
+    "As fire_points are read as comma-separated values, we have to assign latitude and longitude values to use as a geodataframe. We will create a pandas data frame _'fire_points'_ and extract the longitude and latitude column from it. Further, we will assign geometry to each point using its longitude, latitude values in new geodataframe _'newdf'_.To create a geodataframe (_'pointstodist'_) which can show each fire point with administrative boundaries in which it is present, the administrative boundaries shapefile (dist.shp) and fire points in Telangana boundary are read using geopandas (_'adm_name'_ and _'nwedf'_)and joined together using _'sjoin'_ function of geopandas with the intersection of keys from both GeoDataFrames."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b0acde00",
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "executionInfo": {
+     "elapsed": 3,
+     "status": "ok",
+     "timestamp": 1646991744057,
+     "user": {
+      "displayName": "Shivang Pandey",
+      "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14Ghcuw4QK3NL6bdbVpWe3vKu-Tr2lMCm2-IlZkPN=s64",
+      "userId": "06979323684164181966"
+     },
+     "user_tz": -330
+    },
+    "id": "b0acde00",
+    "outputId": "3e99c3a0-3f1a-4622-ba82-6fb70e231989"
+   },
+   "outputs": [],
+   "source": [
+    "#shapefile of districts of Telangana\n",
+    "adm_name = gpd.read_file('dist.shp')\n",
+    "#shapefile of fire data given\n",
+    "fire_points = pd.read_csv('telangana_fires.csv') \n",
+    "\n",
+    "#--------------Geodataframe for fire points--------------#\n",
+    "newdf = gpd.GeoDataFrame(fire_points, geometry=gpd.points_from_xy(fire_points.longitude, fire_points.latitude), crs=4326)\n",
+    "\n",
+    "#--------------- using inner join to merge to geoDataFrames-------------#\n",
+    "pointstodist = gpd.sjoin(newdf, adm_name, how=\"inner\", op='intersects')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "9f568225",
+   "metadata": {},
+   "source": [
+    "## Creating JSON File:\n",
+    "\n",
+    "The JSON output should include the following keys - adm_name, year, fireCount. For this, we need to rename geodataframe column _'NAME_3'_ as _'adm_name'_. Next, we will fetch years for each fire point from _'acq_date'_ and dump it into a new column named _'year'_. Now we have to count the fire points for each year and administrative boundary. This is done by creating an empty column in the data frame named as _'FireCount'_ and summing fire points grouping by _'adm_name'_ & _'year'_ .\n",
+    "Now, we will create a data frame that will be containing only three columns of the previous data frame named as _'adm_name'_, _'year'_ and _'fireCount'_ to dump in a JSON file named as _'output1.json'_.<br>\n",
+    "<br>"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "7d134ec4",
+   "metadata": {
+    "executionInfo": {
+     "elapsed": 2,
+     "status": "ok",
+     "timestamp": 1646991755052,
+     "user": {
+      "displayName": "Shivang Pandey",
+      "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14Ghcuw4QK3NL6bdbVpWe3vKu-Tr2lMCm2-IlZkPN=s64",
+      "userId": "06979323684164181966"
+     },
+     "user_tz": -330
+    },
+    "id": "7d134ec4"
+   },
+   "outputs": [],
+   "source": [
+    "#renaming column by districts\n",
+    "pointstodist = pointstodist.rename(columns={\"NAME_3\":\"adm_name\"})\n",
+    "#getting year from the date\n",
+    "pointstodist['year'] = pd. DatetimeIndex(pointstodist['acq_date']).year\n",
+    "#declaring firecount column by 1\n",
+    "pointstodist['fireCount']=1\n",
+    "#suming the occurence of fire by grouping district and year\n",
+    "new_df = pointstodist.groupby(['adm_name','year']).sum()  \n",
+    "new_df = new_df.reset_index()\n",
+    "\n",
+    "#creating new Dataframe to dumping json\n",
+    "json_pd = new_df[['adm_name','year','fireCount']]\n",
+    "#dumping to json file\n",
+    "json_pd = json_pd.to_json('output1.json', orient=\"records\", lines=True)     "
+   ]
+  }
+ ],
+ "metadata": {
+  "colab": {
+   "name": "TASK1.ipynb",
+   "provenance": []
+  },
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.9.7"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}