diff --git a/your-code/lab_imbalance.ipynb b/your-code/lab_imbalance.ipynb index dbb15e1..2e52bf9 100644 --- a/your-code/lab_imbalance.ipynb +++ b/your-code/lab_imbalance.ipynb @@ -1,147 +1,526 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Inbalanced Classes\n", - "## In this lab, we are going to explore a case of imbalanced classes. \n", - "\n", - "\n", - "Like we disussed in class, when we have noisy data, if we are not careful, we can end up fitting our model to the noise in the data and not the 'signal'-- the factors that actually determine the outcome. This is called overfitting, and results in good results in training, and in bad results when the model is applied to real data. Similarly, we could have a model that is too simplistic to accurately model the signal. This produces a model that doesnt work well (ever). \n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Note: before doing the first commit, make sure you don't include the large csv file, either by adding it to .gitignore, or by deleting it." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### First, download the data from: https://www.kaggle.com/datasets/chitwanmanchanda/fraudulent-transactions-data?resource=download . Import the dataset and provide some discriptive statistics and plots. What do you think will be the important features in determining the outcome?\n", - "### Note: don't use the entire dataset, use a sample instead, with n=100000 elements, so your computer doesn't freeze." - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "# Your code here" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### What is the distribution of the outcome? " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Your response here" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Clean the dataset. Pre-process it to make it suitable for ML training. Feel free to explore, drop, encode, transform, etc. Whatever you feel will improve the model score." - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "# Your code here\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Run a logisitc regression classifier and evaluate its accuracy." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Your code here" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Now pick a model of your choice and evaluate its accuracy." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Your code here" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Which model worked better and how do you know?" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [], - "source": [ - "# Your response here" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Note: before doing the first commit, make sure you don't include the large csv file, either by adding it to .gitignore, or by deleting it." - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.8" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Inbalanced Classes\n", + "## In this lab, we are going to explore a case of imbalanced classes. \n", + "\n", + "\n", + "Like we disussed in class, when we have noisy data, if we are not careful, we can end up fitting our model to the noise in the data and not the 'signal'-- the factors that actually determine the outcome. This is called overfitting, and results in good results in training, and in bad results when the model is applied to real data. Similarly, we could have a model that is too simplistic to accurately model the signal. This produces a model that doesnt work well (ever). \n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Note: before doing the first commit, make sure you don't include the large csv file, either by adding it to .gitignore, or by deleting it." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### First, download the data from: https://www.kaggle.com/datasets/chitwanmanchanda/fraudulent-transactions-data?resource=download . Import the dataset and provide some discriptive statistics and plots. What do you think will be the important features in determining the outcome?\n", + "### Note: don't use the entire dataset, use a sample instead, with n=100000 elements, so your computer doesn't freeze." + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [], + "source": [ + "\n", + "import pandas as pd\n", + "import numpy as no\n", + "import matplotlib.pyplot as plt\n", + "import seaborn as sns\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 32, + "metadata": {}, + "outputs": [], + "source": [ + "from sklearn.model_selection import train_test_split\n", + "from sklearn.preprocessing import StandardScaler, OneHotEncoder\n", + "from sklearn.compose import ColumnTransformer\n", + "from sklearn.pipeline import Pipeline\n", + "from sklearn.ensemble import RandomForestClassifier\n" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + " step type amount nameOrig oldbalanceOrg newbalanceOrig \\\n", + "6322570 688 CASH_IN 23557.12 C867750533 8059.00 31616.12 \n", + "3621196 274 PAYMENT 6236.13 C601099070 0.00 0.00 \n", + "1226256 133 PAYMENT 33981.87 C279540931 18745.72 0.00 \n", + "2803274 225 CASH_OUT 263006.42 C11675531 20072.00 0.00 \n", + "3201247 249 CASH_OUT 152013.74 C530649214 20765.00 0.00 \n", + "\n", + " nameDest oldbalanceDest newbalanceDest isFraud isFlaggedFraud \n", + "6322570 C1026934669 169508.66 145951.53 0 0 \n", + "3621196 M701283411 0.00 0.00 0 0 \n", + "1226256 M577905776 0.00 0.00 0 0 \n", + "2803274 C529577791 390253.56 653259.98 0 0 \n", + "3201247 C1304175579 252719.19 404732.93 0 0 \n" + ] + } + ], + "source": [ + "\n", + "samples = 100000\n", + "fraud = pd.read_csv('Fraud.csv').sample(n=samples, random_state=1)\n", + "print(fraud.head())" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
stepamountoldbalanceOrgnewbalanceOrigoldbalanceDestnewbalanceDestisFraudisFlaggedFraud
count100000.0000001.000000e+051.000000e+051.000000e+051.000000e+051.000000e+05100000.000000100000.0
mean243.2294301.815534e+058.283566e+058.498642e+051.102745e+061.230714e+060.0012400.0
std142.7118636.168152e+052.875843e+062.912484e+063.531001e+063.836813e+060.0351920.0
min1.0000001.800000e-010.000000e+000.000000e+000.000000e+000.000000e+000.0000000.0
25%155.0000001.344656e+040.000000e+000.000000e+000.000000e+000.000000e+000.0000000.0
50%239.0000007.550210e+041.466800e+040.000000e+001.315374e+052.163863e+050.0000000.0
75%335.0000002.092775e+051.070410e+051.449728e+059.458267e+051.115200e+060.0000000.0
max742.0000004.669816e+074.381886e+074.368662e+073.281945e+083.279981e+081.0000000.0
\n", + "
" + ], + "text/plain": [ + " step amount oldbalanceOrg newbalanceOrig \\\n", + "count 100000.000000 1.000000e+05 1.000000e+05 1.000000e+05 \n", + "mean 243.229430 1.815534e+05 8.283566e+05 8.498642e+05 \n", + "std 142.711863 6.168152e+05 2.875843e+06 2.912484e+06 \n", + "min 1.000000 1.800000e-01 0.000000e+00 0.000000e+00 \n", + "25% 155.000000 1.344656e+04 0.000000e+00 0.000000e+00 \n", + "50% 239.000000 7.550210e+04 1.466800e+04 0.000000e+00 \n", + "75% 335.000000 2.092775e+05 1.070410e+05 1.449728e+05 \n", + "max 742.000000 4.669816e+07 4.381886e+07 4.368662e+07 \n", + "\n", + " oldbalanceDest newbalanceDest isFraud isFlaggedFraud \n", + "count 1.000000e+05 1.000000e+05 100000.000000 100000.0 \n", + "mean 1.102745e+06 1.230714e+06 0.001240 0.0 \n", + "std 3.531001e+06 3.836813e+06 0.035192 0.0 \n", + "min 0.000000e+00 0.000000e+00 0.000000 0.0 \n", + "25% 0.000000e+00 0.000000e+00 0.000000 0.0 \n", + "50% 1.315374e+05 2.163863e+05 0.000000 0.0 \n", + "75% 9.458267e+05 1.115200e+06 0.000000 0.0 \n", + "max 3.281945e+08 3.279981e+08 1.000000 0.0 " + ] + }, + "execution_count": 8, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "fraud.describe()" + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAjYAAAGJCAYAAACZwnkIAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjguMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8g+/7EAAAACXBIWXMAAA9hAAAPYQGoP6dpAAA+HElEQVR4nO3deVwW9f7//+cFyuLC4gKIkpKaK6mpIeaSSVKiHUzPSbNEc6kUU8lKU3Fp8aRp7tpyEk/lSfOTZi4oQmopuaDmknqy45oCmgJCrjDfP/oxPy9BBUTB6XG/3a7bzes9r3nP65oL5MlcM4PNMAxDAAAAFuBQ3A0AAAAUFYINAACwDIINAACwDIINAACwDIINAACwDIINAACwDIINAACwDIINAACwDIINAACwDIINLGHcuHGy2Wx3ZVuPPvqoHn30UfP5+vXrZbPZtGTJkruy/d69e6tGjRp3ZVuFlZGRoX79+snHx0c2m01Dhw4t7pZuS857vH79+gKve+TIEdlsNkVHRxd5Xyhat/M+o+Qg2KDEiY6Ols1mMx8uLi7y9fVVSEiIZsyYofPnzxfJdk6ePKlx48Zp165dRTJfUSrJveXHu+++q+joaL388sv67LPP9Pzzz9+wtkaNGnbv97WPixcv3sWu721z5szJV3jq3bv3Dff3tY/evXvf8Z6LS373Fe5NpYq7AeBGJkyYIH9/f125ckVJSUlav369hg4dqqlTp2r58uV68MEHzdrRo0drxIgRBZr/5MmTGj9+vGrUqKHGjRvne721a9cWaDuFcbPePv74Y2VnZ9/xHm5HfHy8WrRoobFjx+arvnHjxnr11VdzjTs5ORV1a5Y1Z84cVapU6ZaB5MUXX1RwcLD5/PDhw4qKitKAAQPUunVrc7xmzZp3qtVid6N91aZNG124cIGvu3scwQYl1pNPPqlmzZqZz0eOHKn4+Hh16tRJTz31lPbv3y9XV1dJUqlSpVSq1J39cv7jjz9UpkyZYv9Pr3Tp0sW6/fxISUlR/fr1811ftWpVPffcc/muz3kvUHBBQUEKCgoyn2/fvl1RUVEKCgq66XuQmZmpsmXL3o0Wi42Dg4NcXFyKuw3cJj6Kwj3lscce05gxY3T06FF9/vnn5nhe59jExsaqVatW8vDwULly5VSnTh29+eabkv78LL158+aSpD59+piH33MOTz/66KNq2LChEhMT1aZNG5UpU8Zc9/pzbHJkZWXpzTfflI+Pj8qWLaunnnpKx48ft6upUaNGnr9RXzvnrXrL6xybzMxMvfrqq/Lz85Ozs7Pq1Kmj999/X4Zh2NXZbDZFRERo2bJlatiwoZydndWgQQPFxMTkvcOvk5KSor59+8rb21suLi5q1KiRFixYYC7POUfh8OHDWrlypdn7kSNH8jV/Xm72XnzzzTcKDQ2Vr6+vnJ2dVbNmTb311lvKysqymyM/+z3HiRMnFBYWprJly8rLy0vDhg3TpUuXcq1bkDnzcuDAAXXr1k0VKlSQi4uLmjVrpuXLl9vV5Hwsu2nTJkVGRqpy5coqW7asunTpotOnT9v1sm/fPm3YsMHc5/np4UZytrthwwYNHDhQXl5eqlatmiTp6NGjGjhwoOrUqSNXV1dVrFhRf//733O9x/ntXfozXIWEhKhSpUpydXWVv7+/XnjhBbua999/Xy1btlTFihXl6uqqpk2b3vC8ts8//1wPP/ywypQpI09PT7Vp08Y80nqzfXWjc2y++uorNW3aVK6urqpUqZKee+45/fbbb3Y1vXv3Vrly5fTbb78pLCxM5cqVU+XKlTV8+PBcX49ffvmlmjZtqvLly8vNzU0BAQGaPn36Ld8X5A9HbHDPef755/Xmm29q7dq16t+/f541+/btU6dOnfTggw9qwoQJcnZ21qFDh7Rp0yZJUr169TRhwoRch+BbtmxpzvH777/rySefVPfu3fXcc8/J29v7pn298847stlseuONN5SSkqJp06YpODhYu3btMo8s5Ud+eruWYRh66qmn9N1336lv375q3Lix1qxZo9dee02//fabPvjgA7v6H374QV9//bUGDhyo8uXLa8aMGeratauOHTumihUr3rCvCxcu6NFHH9WhQ4cUEREhf39/ffXVV+rdu7dSU1M1ZMgQ1atXT5999pmGDRumatWqmR8vVa5c+aav+cqVKzpz5ozdWJkyZcyjMjd6L6Kjo1WuXDlFRkaqXLlyio+PV1RUlNLT0zV58uSbbvNGr7F9+/Y6duyYXnnlFfn6+uqzzz5TfHx8gee6mX379umRRx5R1apVNWLECJUtW1aLFy9WWFiY/u///k9dunSxqx88eLA8PT01duxYHTlyRNOmTVNERIQWLVokSZo2bZoGDx6scuXKadSoUZJ0y6/X/Bg4cKAqV66sqKgoZWZmSpK2bdumzZs3q3v37qpWrZqOHDmiuXPn6tFHH9XPP/+c60jarXpPSUlRhw4dVLlyZY0YMUIeHh46cuSIvv76a7t5pk+frqeeeko9e/bU5cuX9eWXX+rvf/+7VqxYodDQULNu/PjxGjdunFq2bKkJEybIyclJW7ZsUXx8vDp06FDgfRUdHa0+ffqoefPmmjhxopKTkzV9+nRt2rRJO3fulIeHh1mblZWlkJAQBQYG6v3339e6des0ZcoU1axZUy+//LKkP3/h6tGjh9q3b6/33ntPkrR//35t2rRJQ4YMKeQ7BTsGUMLMnz/fkGRs27bthjXu7u5GkyZNzOdjx441rv1y/uCDDwxJxunTp284x7Zt2wxJxvz583Mta9u2rSHJmDdvXp7L2rZtaz7/7rvvDElG1apVjfT0dHN88eLFhiRj+vTp5lj16tWN8PDwW855s97Cw8ON6tWrm8+XLVtmSDLefvttu7pu3boZNpvNOHTokDkmyXBycrIb++mnnwxJxsyZM3Nt61rTpk0zJBmff/65OXb58mUjKCjIKFeunN1rr169uhEaGnrT+a6tlZTrMXbsWMMwbv5e/PHHH7nGXnzxRaNMmTLGxYsX7baRn/2e8xoXL15sjmVmZhq1atUyJBnfffddgec8fPhwrveyffv2RkBAgF2P2dnZRsuWLY3atWubYznfC8HBwUZ2drY5PmzYMMPR0dFITU01xxo0aGC33fzK62stZ7utWrUyrl69alef1z5PSEgwJBn//ve/C9z70qVLb/n9ntd2L1++bDRs2NB47LHHzLFffvnFcHBwMLp06WJkZWXZ1V/bw432Vc73cs77fPnyZcPLy8to2LChceHCBbNuxYoVhiQjKirKHAsPDzckGRMmTLCbs0mTJkbTpk3N50OGDDHc3Nxy7VcUHT6Kwj2pXLlyN706Kue3qG+++abQJ9o6OzurT58++a7v1auXypcvbz7v1q2bqlSpolWrVhVq+/m1atUqOTo66pVXXrEbf/XVV2UYhlavXm03HhwcbHdi6IMPPig3Nzf973//u+V2fHx81KNHD3OsdOnSeuWVV5SRkaENGzYU+jUEBgYqNjbW7tGrVy9z+Y3ei2uPhJ0/f15nzpxR69at9ccff+jAgQMF7mPVqlWqUqWKunXrZo6VKVNGAwYMKPBcN3L27FnFx8frH//4h9nzmTNn9PvvvyskJES//PJLro85BgwYYPdRa+vWrZWVlaWjR48WWV956d+/vxwdHe3Grt3nV65c0e+//65atWrJw8NDO3bsyDXHrXrP+V5dsWKFrly5csNert3uuXPnlJaWptatW9ttc9myZcrOzlZUVJQcHOx/vBXmdhDbt29XSkqKBg4caHfuTWhoqOrWrauVK1fmWuell16ye966dWu77y0PDw9lZmYqNja2wP0gfwg2uCdlZGTYhYjrPfPMM3rkkUfUr18/eXt7q3v37lq8eHGBQk7VqlULdKJw7dq17Z7bbDbVqlXrts4vyY+jR4/K19c31/6oV6+eufxa9913X645PD09de7cuVtup3bt2rl+YNxoOwVRqVIlBQcH2z3uv/9+c/mN3ot9+/apS5cucnd3l5ubmypXrmyeAJuWllbgPo4ePapatWrl+iFYp06dAs91I4cOHZJhGBozZowqV65s98i5iiwlJcVunevfM09PT0m65Xt2u/z9/XONXbhwQVFRUeb5XJUqVVLlypWVmpqa5z6/Ve9t27ZV165dNX78eFWqVEl/+9vfNH/+/FznNa1YsUItWrSQi4uLKlSooMqVK2vu3Ll22/z111/l4OBQoBPXbybnazqv979u3bq5vuZdXFxyfex6/ffWwIED9cADD+jJJ59UtWrV9MILL+T7HDfkD+fY4J5z4sQJpaWlqVatWjescXV11caNG/Xdd99p5cqViomJ0aJFi/TYY49p7dq1uX4LvdEcRe1GvzVmZWXlq6eicKPtGNedaFyS5PVepKamqm3btnJzc9OECRNUs2ZNubi4aMeOHXrjjTfsQuyd2O+FnTOnr+HDhyskJCTPmuu/tovrPctrvw8ePFjz58/X0KFDFRQUJHd3d9lsNnXv3j3PXxxu1XvOzS1//PFHffvtt1qzZo1eeOEFTZkyRT/++KPKlSun77//Xk899ZTatGmjOXPmqEqVKipdurTmz5+vhQsXFu2Lvg35+Vry8vLSrl27tGbNGq1evVqrV6/W/Pnz1atXL7sT8VF4BBvccz777DNJuuEPhRwODg5q37692rdvr6lTp+rdd9/VqFGj9N133yk4OLjI71T8yy+/2D03DEOHDh2yu9+Op6enUlNTc6179OhRuyMUBemtevXqWrdunc6fP2931Cbno5jq1avne65bbWf37t3Kzs62O2pT1NvJr/Xr1+v333/X119/rTZt2pjjhw8fzlWb3/1evXp17d27V4Zh2L0HBw8eLPSc18tZVrp0abv7ydyuu3Xn7SVLlig8PFxTpkwxxy5evJjnviiIFi1aqEWLFnrnnXe0cOFC9ezZU19++aX69eun//u//5OLi4vWrFkjZ2dnc5358+fbzVGzZk1lZ2fr559/vum9qfK7r3K+pg8ePKjHHnvMbtnBgwcL/TXv5OSkzp07q3PnzsrOztbAgQP14YcfasyYMTf9hQ35w0dRuKfEx8frrbfekr+/v3r27HnDurNnz+Yay/mPLucQd849OW73P+Qc//73v+3O+1myZIlOnTqlJ5980hyrWbOmfvzxR12+fNkcW7FiRa7LwgvSW8eOHZWVlaVZs2bZjX/wwQey2Wx2278dHTt2VFJSknk1iyRdvXpVM2fOVLly5dS2bdsi2U5+5fx2fO1Ri8uXL2vOnDm5avO73zt27KiTJ0/aXUb8xx9/6KOPPir0nNfz8vLSo48+qg8//FCnTp3Ktfz6S6Hzq2zZskX2tXwzjo6OuY4UzZw5M9clzfl17ty5XPNd/73q6Ogom81mt40jR45o2bJlduuFhYXJwcFBEyZMyHX06Npt5HdfNWvWTF5eXpo3b57dR2OrV6/W/v377a7Gyq/ff//d7rmDg4P5y09etxVAwXHEBiXW6tWrdeDAAV29elXJycmKj49XbGysqlevruXLl9/0RloTJkzQxo0bFRoaqurVqyslJUVz5sxRtWrV1KpVK0l//mDy8PDQvHnzVL58eZUtW1aBgYF5nleQHxUqVFCrVq3Up08fJScna9q0aapVq5bdJen9+vXTkiVL9MQTT+gf//iHfv31V33++ee57vJakN46d+6sdu3aadSoUTpy5IgaNWqktWvX6ptvvtHQoUOL7A6yAwYM0IcffqjevXsrMTFRNWrU0JIlS7Rp0yZNmzbtpuc83QktW7aUp6enwsPD9corr8hms+mzzz7L8+OZ/O73/v37a9asWerVq5cSExNVpUoVffbZZ3neDDC/c+Zl9uzZatWqlQICAtS/f3/df//9Sk5OVkJCgk6cOKGffvqpwPujadOmmjt3rt5++23VqlVLXl5euY4yFIVOnTrps88+k7u7u+rXr6+EhAStW7fuprcKuJkFCxZozpw56tKli2rWrKnz58/r448/lpubmzp27Cjpz5N1p06dqieeeELPPvusUlJSNHv2bNWqVUu7d+8256pVq5ZGjRqlt956S61bt9bTTz8tZ2dnbdu2Tb6+vpo4caKk/O+r0qVL67333lOfPn3Utm1b9ejRw7zcu0aNGho2bFiBX2+/fv109uxZPfbYY6pWrZqOHj2qmTNnqnHjxub5arhNxXMxFnBjOZeJ5jycnJwMHx8f4/HHHzemT59ud1lxjusv946LizP+9re/Gb6+voaTk5Ph6+tr9OjRw/jvf/9rt94333xj1K9f3yhVqpTdJa9t27Y1GjRokGd/N7rc+z//+Y8xcuRIw8vLy3B1dTVCQ0ONo0eP5lp/ypQpRtWqVQ1nZ2fjkUceMbZv355rzpv1dv3l3oZhGOfPnzeGDRtm+Pr6GqVLlzZq165tTJ482e4SV8P483LvQYMG5erpRpcuXy85Odno06ePUalSJcPJyckICAjI85L0gl7ufbPam70XmzZtMlq0aGG4uroavr6+xuuvv26sWbMm16XZhpH//X706FHjqaeeMsqUKWNUqlTJGDJkiBETE1PoOfO63NswDOPXX381evXqZfj4+BilS5c2qlatanTq1MlYsmSJWXOjWx9cf1myYRhGUlKSERoaapQvX96QlO9Lv292uXdel2CfO3fO/BooV66cERISYhw4cCDX11B+e9+xY4fRo0cP47777jOcnZ0NLy8vo1OnTsb27dvt1vvXv/5l1K5d23B2djbq1q1rzJ8/P9f3fY5PP/3UaNKkieHs7Gx4enoabdu2NWJjY2+5r/Lar4ZhGIsWLTLnq1ChgtGzZ0/jxIkTdjXh4eFG2bJlc/VyfY9LliwxOnToYHh5eRlOTk7GfffdZ7z44ovGqVOncq2LwrEZRgk+YxAAAKAAOMcGAABYBsEGAABYBsEGAABYBsEGAABYBsEGAABYBsEGAABYBjfou4uys7N18uRJlS9f/q7d/hwAACswDEPnz5+Xr69vrj/Gey2CzV108uRJ+fn5FXcbAADcs44fP65q1ardcDnB5i7KueX88ePH5ebmVszdAABw70hPT5efn98t/3wLweYuyvn4yc3NjWADAEAh3OpUDk4eBgAAlkGwAQAAlkGwAQAAlkGwAQAAlkGwAQAAlkGwAQAAlkGwAQAAllGswWbjxo3q3LmzfH19ZbPZtGzZMrvlhmEoKipKVapUkaurq4KDg/XLL7/Y1Zw9e1Y9e/aUm5ubPDw81LdvX2VkZNjV7N69W61bt5aLi4v8/Pw0adKkXL189dVXqlu3rlxcXBQQEKBVq1YVuBcAAFC8ijXYZGZmqlGjRpo9e3aeyydNmqQZM2Zo3rx52rJli8qWLauQkBBdvHjRrOnZs6f27dun2NhYrVixQhs3btSAAQPM5enp6erQoYOqV6+uxMRETZ48WePGjdNHH31k1mzevFk9evRQ3759tXPnToWFhSksLEx79+4tUC8AAKCYGSWEJGPp0qXm8+zsbMPHx8eYPHmyOZaammo4Ozsb//nPfwzDMIyff/7ZkGRs27bNrFm9erVhs9mM3377zTAMw5gzZ47h6elpXLp0yax54403jDp16pjP//GPfxihoaF2/QQGBhovvvhivnvJj7S0NEOSkZaWlu91AABA/n+GlthzbA4fPqykpCQFBwebY+7u7goMDFRCQoIkKSEhQR4eHmrWrJlZExwcLAcHB23ZssWsadOmjZycnMyakJAQHTx4UOfOnTNrrt1OTk3OdvLTS14uXbqk9PR0uwcAALhzSuzfikpKSpIkeXt72417e3uby5KSkuTl5WW3vFSpUqpQoYJdjb+/f645cpZ5enoqKSnpltu5VS95mThxosaPH3/rF1tEmr7277u2LaC4JE7uVdwtACjBSuwRGysYOXKk0tLSzMfx48eLuyUAACytxAYbHx8fSVJycrLdeHJysrnMx8dHKSkpdsuvXr2qs2fP2tXkNce127hRzbXLb9VLXpydnc2/5M1f9AYA4M4rscHG399fPj4+iouLM8fS09O1ZcsWBQUFSZKCgoKUmpqqxMREsyY+Pl7Z2dkKDAw0azZu3KgrV66YNbGxsapTp448PT3Nmmu3k1OTs5389AIAAIpfsQabjIwM7dq1S7t27ZL050m6u3bt0rFjx2Sz2TR06FC9/fbbWr58ufbs2aNevXrJ19dXYWFhkqR69erpiSeeUP/+/bV161Zt2rRJERER6t69u3x9fSVJzz77rJycnNS3b1/t27dPixYt0vTp0xUZGWn2MWTIEMXExGjKlCk6cOCAxo0bp+3btysiIkKS8tULAAAofsV68vD27dvVrl0783lO2AgPD1d0dLRef/11ZWZmasCAAUpNTVWrVq0UExMjFxcXc50vvvhCERERat++vRwcHNS1a1fNmDHDXO7u7q61a9dq0KBBatq0qSpVqqSoqCi7e920bNlSCxcu1OjRo/Xmm2+qdu3aWrZsmRo2bGjW5KcXAABQvGyGYRjF3cRfRXp6utzd3ZWWlnZHzrfhqij8FXBVFPDXlN+foSX2HBsAAICCItgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLINgAAADLKNHBJisrS2PGjJG/v79cXV1Vs2ZNvfXWWzIMw6wxDENRUVGqUqWKXF1dFRwcrF9++cVunrNnz6pnz55yc3OTh4eH+vbtq4yMDLua3bt3q3Xr1nJxcZGfn58mTZqUq5+vvvpKdevWlYuLiwICArRq1ao788IBAEChlOhg895772nu3LmaNWuW9u/fr/fee0+TJk3SzJkzzZpJkyZpxowZmjdvnrZs2aKyZcsqJCREFy9eNGt69uypffv2KTY2VitWrNDGjRs1YMAAc3l6ero6dOig6tWrKzExUZMnT9a4ceP00UcfmTWbN29Wjx491LdvX+3cuVNhYWEKCwvT3r17787OAAAAt2Qzrj38UcJ06tRJ3t7e+te//mWOde3aVa6urvr8889lGIZ8fX316quvavjw4ZKktLQ0eXt7Kzo6Wt27d9f+/ftVv359bdu2Tc2aNZMkxcTEqGPHjjpx4oR8fX01d+5cjRo1SklJSXJycpIkjRgxQsuWLdOBAwckSc8884wyMzO1YsUKs5cWLVqocePGmjdvXr5eT3p6utzd3ZWWliY3N7ci2UfXavrav4t8TqCkSZzcq7hbAFAM8vsztEQfsWnZsqXi4uL03//+V5L0008/6YcfftCTTz4pSTp8+LCSkpIUHBxsruPu7q7AwEAlJCRIkhISEuTh4WGGGkkKDg6Wg4ODtmzZYta0adPGDDWSFBISooMHD+rcuXNmzbXbyanJ2U5eLl26pPT0dLsHAAC4c0oVdwM3M2LECKWnp6tu3bpydHRUVlaW3nnnHfXs2VOSlJSUJEny9va2W8/b29tclpSUJC8vL7vlpUqVUoUKFexq/P39c82Rs8zT01NJSUk33U5eJk6cqPHjxxf0ZQMAgEIq0UdsFi9erC+++EILFy7Ujh07tGDBAr3//vtasGBBcbeWLyNHjlRaWpr5OH78eHG3BACApZXoIzavvfaaRowYoe7du0uSAgICdPToUU2cOFHh4eHy8fGRJCUnJ6tKlSrmesnJyWrcuLEkycfHRykpKXbzXr16VWfPnjXX9/HxUXJysl1NzvNb1eQsz4uzs7OcnZ0L+rIBAEAhlegjNn/88YccHOxbdHR0VHZ2tiTJ399fPj4+iouLM5enp6dry5YtCgoKkiQFBQUpNTVViYmJZk18fLyys7MVGBho1mzcuFFXrlwxa2JjY1WnTh15enqaNdduJ6cmZzsAAKD4lehg07lzZ73zzjtauXKljhw5oqVLl2rq1Knq0qWLJMlms2no0KF6++23tXz5cu3Zs0e9evWSr6+vwsLCJEn16tXTE088of79+2vr1q3atGmTIiIi1L17d/n6+kqSnn32WTk5Oalv377at2+fFi1apOnTpysyMtLsZciQIYqJidGUKVN04MABjRs3Ttu3b1dERMRd3y8AACBvJfqjqJkzZ2rMmDEaOHCgUlJS5OvrqxdffFFRUVFmzeuvv67MzEwNGDBAqampatWqlWJiYuTi4mLWfPHFF4qIiFD79u3l4OCgrl27asaMGeZyd3d3rV27VoMGDVLTpk1VqVIlRUVF2d3rpmXLllq4cKFGjx6tN998U7Vr19ayZcvUsGHDu7MzAADALZXo+9hYDfexAW4f97EB/poscR8bAACAgiDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyDYAAAAyyjxwea3337Tc889p4oVK8rV1VUBAQHavn27udwwDEVFRalKlSpydXVVcHCwfvnlF7s5zp49q549e8rNzU0eHh7q27evMjIy7Gp2796t1q1by8XFRX5+fpo0aVKuXr766ivVrVtXLi4uCggI0KpVq+7MiwYAAIVSooPNuXPn9Mgjj6h06dJavXq1fv75Z02ZMkWenp5mzaRJkzRjxgzNmzdPW7ZsUdmyZRUSEqKLFy+aNT179tS+ffsUGxurFStWaOPGjRowYIC5PD09XR06dFD16tWVmJioyZMna9y4cfroo4/Mms2bN6tHjx7q27evdu7cqbCwMIWFhWnv3r13Z2cAAIBbshmGYRR3EzcyYsQIbdq0Sd9//32eyw3DkK+vr1599VUNHz5ckpSWliZvb29FR0ere/fu2r9/v+rXr69t27apWbNmkqSYmBh17NhRJ06ckK+vr+bOnatRo0YpKSlJTk5O5raXLVumAwcOSJKeeeYZZWZmasWKFeb2W7RoocaNG2vevHn5ej3p6elyd3dXWlqa3NzcCr1fbqTpa/8u8jmBkiZxcq/ibgFAMcjvz9ASfcRm+fLlatasmf7+97/Ly8tLTZo00ccff2wuP3z4sJKSkhQcHGyOubu7KzAwUAkJCZKkhIQEeXh4mKFGkoKDg+Xg4KAtW7aYNW3atDFDjSSFhITo4MGDOnfunFlz7XZyanK2k5dLly4pPT3d7gEAAO6cEh1s/ve//2nu3LmqXbu21qxZo5dfflmvvPKKFixYIElKSkqSJHl7e9ut5+3tbS5LSkqSl5eX3fJSpUqpQoUKdjV5zXHtNm5Uk7M8LxMnTpS7u7v58PPzK9DrBwAABVOig012drYeeughvfvuu2rSpIkGDBig/v375/ujn+I2cuRIpaWlmY/jx48Xd0sAAFhaiQ42VapUUf369e3G6tWrp2PHjkmSfHx8JEnJycl2NcnJyeYyHx8fpaSk2C2/evWqzp49a1eT1xzXbuNGNTnL8+Ls7Cw3Nze7BwAAuHNKdLB55JFHdPDgQbux//73v6pevbokyd/fXz4+PoqLizOXp6ena8uWLQoKCpIkBQUFKTU1VYmJiWZNfHy8srOzFRgYaNZs3LhRV65cMWtiY2NVp04d8wqsoKAgu+3k1ORsBwAAFL9CBZv7779fv//+e67x1NRU3X///bfdVI5hw4bpxx9/1LvvvqtDhw5p4cKF+uijjzRo0CBJks1m09ChQ/X2229r+fLl2rNnj3r16iVfX1+FhYVJ+vMIzxNPPKH+/ftr69at2rRpkyIiItS9e3f5+vpKkp599lk5OTmpb9++2rdvnxYtWqTp06crMjLS7GXIkCGKiYnRlClTdODAAY0bN07bt29XREREkb1eAABwe0oVZqUjR44oKysr1/ilS5f022+/3XZTOZo3b66lS5dq5MiRmjBhgvz9/TVt2jT17NnTrHn99deVmZmpAQMGKDU1Va1atVJMTIxcXFzMmi+++EIRERFq3769HBwc1LVrV82YMcNc7u7urrVr12rQoEFq2rSpKlWqpKioKLt73bRs2VILFy7U6NGj9eabb6p27dpatmyZGjZsWGSvFwAA3J4C3cdm+fLlkqSwsDAtWLBA7u7u5rKsrCzFxcUpNjY218dH+BP3sQFuH/exAf6a8vsztEBHbHI+3rHZbAoPD7dbVrp0adWoUUNTpkwpeLcAAABFoEDBJjs7W9KfJ+1u27ZNlSpVuiNNAQAAFEahzrE5fPhwUfcBAABw2woVbCQpLi5OcXFxSklJMY/k5Pj0009vuzEAAICCKlSwGT9+vCZMmKBmzZqpSpUqstlsRd0XAABAgRUq2MybN0/R0dF6/vnni7ofAACAQivUDfouX76sli1bFnUvAAAAt6VQwaZfv35auHBhUfcCAABwWwr1UdTFixf10Ucfad26dXrwwQdVunRpu+VTp04tkuYAAAAKolDBZvfu3WrcuLEkae/evXbLOJEYAAAUl0IFm++++66o+wAAALhthTrHBgAAoCQq1BGbdu3a3fQjp/j4+EI3BAAAUFiFCjY559fkuHLlinbt2qW9e/fm+uOYAAAAd0uhgs0HH3yQ5/i4ceOUkZFxWw0BAAAUVpGeY/Pcc8/xd6IAAECxKdJgk5CQIBcXl6KcEgAAIN8K9VHU008/bffcMAydOnVK27dv15gxY4qkMQAAgIIqVLBxd3e3e+7g4KA6depowoQJ6tChQ5E0BgAAUFCFCjbz588v6j4AAABuW6GCTY7ExETt379fktSgQQM1adKkSJoCAAAojEIFm5SUFHXv3l3r16+Xh4eHJCk1NVXt2rXTl19+qcqVKxdljwAAAPlSqKuiBg8erPPnz2vfvn06e/aszp49q7179yo9PV2vvPJKUfcIAACQL4U6YhMTE6N169apXr165lj9+vU1e/ZsTh4GAADFplBHbLKzs1W6dOlc46VLl1Z2dvZtNwUAAFAYhQo2jz32mIYMGaKTJ0+aY7/99puGDRum9u3bF1lzAAAABVGoYDNr1iylp6erRo0aqlmzpmrWrCl/f3+lp6dr5syZRd0jAABAvhTqHBs/Pz/t2LFD69at04EDByRJ9erVU3BwcJE2BwAAUBAFOmITHx+v+vXrKz09XTabTY8//rgGDx6swYMHq3nz5mrQoIG+//77O9UrAADATRUo2EybNk39+/eXm5tbrmXu7u568cUXNXXq1CJrDgAAoCAKFGx++uknPfHEEzdc3qFDByUmJt52UwAAAIVRoGCTnJyc52XeOUqVKqXTp0/fdlMAAACFUaBgU7VqVe3du/eGy3fv3q0qVarcdlMAAACFUaBg07FjR40ZM0YXL17MtezChQsaO3asOnXqVGTNAQAAFESBLvcePXq0vv76az3wwAOKiIhQnTp1JEkHDhzQ7NmzlZWVpVGjRt2RRgEAAG6lQMHG29tbmzdv1ssvv6yRI0fKMAxJks1mU0hIiGbPni1vb+870igAAMCtFPgGfdWrV9eqVat07tw5HTp0SIZhqHbt2vL09LwT/QEAAORboe48LEmenp5q3rx5UfYCAABwWwr1t6IAAABKIoINAACwDIINAACwDIINAACwDIINAACwDIINAACwDIINAACwDIINAACwDIINAACwjHsq2Pzzn/+UzWbT0KFDzbGLFy9q0KBBqlixosqVK6euXbsqOTnZbr1jx44pNDRUZcqUkZeXl1577TVdvXrVrmb9+vV66KGH5OzsrFq1aik6OjrX9mfPnq0aNWrIxcVFgYGB2rp16514mQAAoJDumWCzbds2ffjhh3rwwQftxocNG6Zvv/1WX331lTZs2KCTJ0/q6aefNpdnZWUpNDRUly9f1ubNm7VgwQJFR0crKirKrDl8+LBCQ0PVrl077dq1S0OHDlW/fv20Zs0as2bRokWKjIzU2LFjtWPHDjVq1EghISFKSUm58y8eAADki83I+RPdJVhGRoYeeughzZkzR2+//bYaN26sadOmKS0tTZUrV9bChQvVrVs3SdKBAwdUr149JSQkqEWLFlq9erU6deqkkydPmn95fN68eXrjjTd0+vRpOTk56Y033tDKlSu1d+9ec5vdu3dXamqqYmJiJEmBgYFq3ry5Zs2aJUnKzs6Wn5+fBg8erBEjRuTrdaSnp8vd3V1paWlyc3Mryl0kSWr62r+LfE6gpEmc3Ku4WwBQDPL7M/SeOGIzaNAghYaGKjg42G48MTFRV65csRuvW7eu7rvvPiUkJEiSEhISFBAQYIYaSQoJCVF6err27dtn1lw/d0hIiDnH5cuXlZiYaFfj4OCg4OBgsyYvly5dUnp6ut0DAADcOYX+6953y5dffqkdO3Zo27ZtuZYlJSXJyclJHh4eduPe3t5KSkoya64NNTnLc5bdrCY9PV0XLlzQuXPnlJWVlWfNgQMHbtj7xIkTNX78+Py9UAAAcNtK9BGb48ePa8iQIfriiy/k4uJS3O0U2MiRI5WWlmY+jh8/XtwtAQBgaSU62CQmJiolJUUPPfSQSpUqpVKlSmnDhg2aMWOGSpUqJW9vb12+fFmpqal26yUnJ8vHx0eS5OPjk+sqqZznt6pxc3OTq6urKlWqJEdHxzxrcubIi7Ozs9zc3OweAADgzinRwaZ9+/bas2ePdu3aZT6aNWumnj17mv8uXbq04uLizHUOHjyoY8eOKSgoSJIUFBSkPXv22F29FBsbKzc3N9WvX9+suXaOnJqcOZycnNS0aVO7muzsbMXFxZk1AACg+JXoc2zKly+vhg0b2o2VLVtWFStWNMf79u2ryMhIVahQQW5ubho8eLCCgoLUokULSVKHDh1Uv359Pf/885o0aZKSkpI0evRoDRo0SM7OzpKkl156SbNmzdLrr7+uF154QfHx8Vq8eLFWrlxpbjcyMlLh4eFq1qyZHn74YU2bNk2ZmZnq06fPXdobAADgVkp0sMmPDz74QA4ODuratasuXbqkkJAQzZkzx1zu6OioFStW6OWXX1ZQUJDKli2r8PBwTZgwwazx9/fXypUrNWzYME2fPl3VqlXTJ598opCQELPmmWee0enTpxUVFaWkpCQ1btxYMTExuU4oBgAAxeeeuI+NVXAfG+D2cR8b4K/JUvexAQAAyA+CDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsAyCDQAAsIwSHWwmTpyo5s2bq3z58vLy8lJYWJgOHjxoV3Px4kUNGjRIFStWVLly5dS1a1clJyfb1Rw7dkyhoaEqU6aMvLy89Nprr+nq1at2NevXr9dDDz0kZ2dn1apVS9HR0bn6mT17tmrUqCEXFxcFBgZq69atRf6aAQBA4ZXoYLNhwwYNGjRIP/74o2JjY3XlyhV16NBBmZmZZs2wYcP07bff6quvvtKGDRt08uRJPf300+byrKwshYaG6vLly9q8ebMWLFig6OhoRUVFmTWHDx9WaGio2rVrp127dmno0KHq16+f1qxZY9YsWrRIkZGRGjt2rHbs2KFGjRopJCREKSkpd2dnAACAW7IZhmEUdxP5dfr0aXl5eWnDhg1q06aN0tLSVLlyZS1cuFDdunWTJB04cED16tVTQkKCWrRoodWrV6tTp046efKkvL29JUnz5s3TG2+8odOnT8vJyUlvvPGGVq5cqb1795rb6t69u1JTUxUTEyNJCgwMVPPmzTVr1ixJUnZ2tvz8/DR48GCNGDEiX/2np6fL3d1daWlpcnNzK8pdI0lq+tq/i3xOoKRJnNyruFsAUAzy+zO0RB+xuV5aWpokqUKFCpKkxMREXblyRcHBwWZN3bp1dd999ykhIUGSlJCQoICAADPUSFJISIjS09O1b98+s+baOXJqcua4fPmyEhMT7WocHBwUHBxs1uTl0qVLSk9Pt3sAAIA7554JNtnZ2Ro6dKgeeeQRNWzYUJKUlJQkJycneXh42NV6e3srKSnJrLk21OQsz1l2s5r09HRduHBBZ86cUVZWVp41OXPkZeLEiXJ3dzcffn5+BX/hAAAg3+6ZYDNo0CDt3btXX375ZXG3km8jR45UWlqa+Th+/HhxtwQAgKWVKu4G8iMiIkIrVqzQxo0bVa1aNXPcx8dHly9fVmpqqt1Rm+TkZPn4+Jg111+9lHPV1LU1119JlZycLDc3N7m6usrR0VGOjo551uTMkRdnZ2c5OzsX/AUDAIBCKdFHbAzDUEREhJYuXar4+Hj5+/vbLW/atKlKly6tuLg4c+zgwYM6duyYgoKCJElBQUHas2eP3dVLsbGxcnNzU/369c2aa+fIqcmZw8nJSU2bNrWryc7OVlxcnFkDAACKX4k+YjNo0CAtXLhQ33zzjcqXL2+ez+Lu7i5XV1e5u7urb9++ioyMVIUKFeTm5qbBgwcrKChILVq0kCR16NBB9evX1/PPP69JkyYpKSlJo0eP1qBBg8yjKS+99JJmzZql119/XS+88ILi4+O1ePFirVy50uwlMjJS4eHhatasmR5++GFNmzZNmZmZ6tOnz93fMQAAIE8lOtjMnTtXkvToo4/ajc+fP1+9e/eWJH3wwQdycHBQ165ddenSJYWEhGjOnDlmraOjo1asWKGXX35ZQUFBKlu2rMLDwzVhwgSzxt/fXytXrtSwYcM0ffp0VatWTZ988olCQkLMmmeeeUanT59WVFSUkpKS1LhxY8XExOQ6oRgAABSfe+o+Nvc67mMD3D7uYwP8NVnyPjYAAAA3Q7ABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbABAACWQbApoNmzZ6tGjRpycXFRYGCgtm7dWtwtAQCA/w/BpgAWLVqkyMhIjR07Vjt27FCjRo0UEhKilJSU4m4NAACIYFMgU6dOVf/+/dWnTx/Vr19f8+bNU5kyZfTpp58Wd2sAAEBSqeJu4F5x+fJlJSYmauTIkeaYg4ODgoODlZCQkOc6ly5d0qVLl8znaWlpkqT09PQ70mPWpQt3ZF6gJLlT3z93Q5vR/ynuFoA7buPbPe7IvDnf+4Zh3LSOYJNPZ86cUVZWlry9ve3Gvb29deDAgTzXmThxosaPH59r3M/P7470CPwVuM98qbhbAHATd/p79Pz583J3d7/hcoLNHTRy5EhFRkaaz7Ozs3X27FlVrFhRNputGDtDUUhPT5efn5+OHz8uNze34m4HwHX4HrUWwzB0/vx5+fr63rSOYJNPlSpVkqOjo5KTk+3Gk5OT5ePjk+c6zs7OcnZ2thvz8PC4Uy2imLi5ufGfJlCC8T1qHTc7UpODk4fzycnJSU2bNlVcXJw5lp2drbi4OAUFBRVjZwAAIAdHbAogMjJS4eHhatasmR5++GFNmzZNmZmZ6tOnT3G3BgAARLApkGeeeUanT59WVFSUkpKS1LhxY8XExOQ6oRh/Dc7Ozho7dmyujxsBlAx8j/412YxbXTcFAABwj+AcGwAAYBkEGwAAYBkEGwAAYBkEGwAAYBkEG6CQZs+erRo1asjFxUWBgYHaunVrcbcEQNLGjRvVuXNn+fr6ymazadmyZcXdEu4igg1QCIsWLVJkZKTGjh2rHTt2qFGjRgoJCVFKSkpxtwb85WVmZqpRo0aaPXt2cbeCYsDl3kAhBAYGqnnz5po1a5akP+9C7efnp8GDB2vEiBHF3B2AHDabTUuXLlVYWFhxt4K7hCM2QAFdvnxZiYmJCg4ONsccHBwUHByshISEYuwMAECwAQrozJkzysrKynXHaW9vbyUlJRVTVwAAiWADAAAshGADFFClSpXk6Oio5ORku/Hk5GT5+PgUU1cAAIlgAxSYk5OTmjZtqri4OHMsOztbcXFxCgoKKsbOAAD8dW+gECIjIxUeHq5mzZrp4Ycf1rRp05SZmak+ffoUd2vAX15GRoYOHTpkPj98+LB27dqlChUq6L777ivGznA3cLk3UEizZs3S5MmTlZSUpMaNG2vGjBkKDAws7raAv7z169erXbt2ucbDw8MVHR199xvCXUWwAQAAlsE5NgAAwDIINgAAwDIINgAAwDIINgAAwDIINgAAwDIINgAAwDIINgAAwDIINgAAwDIINgCQhyNHjshms2nXrl35XmfcuHFq3LjxHesJwK0RbAAUqd69eyssLKxQ6+aEiesfzz33XNE2eQ+Jjo6Wh4eH3djFixc1cuRI1apVS2XKlFGjRo0UGxtbPA0CJQx/BBNAibNu3To1aNDAfO7q6pqrxjAMZWVlqVSpv95/YykpKTp+/Lg+/fRTVa1aVZMnT1aXLl2UnJyssmXLFnd7QLHiiA2AO2rJkiUKCAiQq6urKlasqODgYGVmZt50nYoVK8rHx8d8uLu7a/369bLZbFq9erWaNm0qZ2dn/fDDD/r111/1t7/9Td7e3ipXrpyaN2+udevW2c1ns9m0bNkyuzEPDw+7P4i4detWNWnSRC4uLmrWrJl27txpV5/XkZNly5bJZrPd9LV88sknqlevnlxcXFS3bl3NmTPHXJZzhOrrr79Wu3btzKMvCQkJkv78Y459+vRRWlqaefRq3Lhxuu+++/T555+rTZs2qlmzpgYMGKDMzEydO3fupr0AfwUEGwB3zKlTp9SjRw+98MIL2r9/v9avX6+nn35at/O3d0eMGKF//vOf2r9/vx588EFlZGSoY8eOiouL086dO/XEE0+oc+fOOnbsWL7nzMjIUKdOnVS/fn0lJiZq3LhxGj58eKF7zPHFF18oKipK77zzjvbv3693331XY8aM0YIFC+zqRo0apeHDh2vXrl164IEH1KNHD129elUtW7bUtGnT5ObmplOnTunUqVO5+rp06ZJGjhypxx9/XNWqVbvtnoF73V/vGC6Au+bUqVO6evWqnn76aVWvXl2SFBAQcMv1WrZsKQeH///3ru+//97894QJE/T444+bzytUqKBGjRqZz9966y0tXbpUy5cvV0RERL76XLhwobKzs/Wvf/1LLi4uatCggU6cOKGXX345X+vfyNixYzVlyhQ9/fTTkiR/f3/9/PPP+vDDDxUeHm7WDR8+XKGhoZKk8ePHq0GDBjp06JDq1q0rd3d32Ww2+fj45Jr/6tWreuqpp5SRkaHVq1ffVq+AVRBsANwxjRo1Uvv27RUQEKCQkBB16NBB3bp1k6en503XW7RokerVq2c+9/PzMz+eadasmV1tRkaGxo0bp5UrV5pB6sKFCwU6YpNz9MfFxcUcCwoKyvf6ecnMzNSvv/6qvn37qn///ub41atX5e7ublf74IMPmv+uUqWKpD/Po6lbt+5Nt7F06VL98MMPOnHihNzc3G6rX8AqCDYA7hhHR0fFxsZq8+bNWrt2rWbOnKlRo0Zpy5Yt8vf3v+F6fn5+qlWrVp7Lrj85dvjw4YqNjdX777+vWrVqydXVVd26ddPly5fNGpvNluvjrytXrhTotTg4OBRojoyMDEnSxx9/rMDAQLtljo6Ods9Lly5t16skZWdn37KnkydPqnLlyrcMisBfCefYALijbDabHnnkEY0fP147d+6Uk5OTli5dWmTzb9q0Sb1791aXLl0UEBAgHx8fHTlyxK6mcuXKOnXqlPn8l19+0R9//GE+r1evnnbv3q2LFy+aYz/++GOuOc6fP2934vPN7nHj7e0tX19f/e9//1OtWrXsHjcLdddzcnJSVlZWnst69Oihb7/9Nt9zAX8FBBsAd8yWLVv07rvvavv27Tp27Ji+/vprnT592u5jpttVu3Ztff3119q1a5d++uknPfvss7mOdjz22GOaNWuWdu7cqe3bt+ull16yO0ry7LPPymazqX///vr555+1atUqvf/++3ZzBAYGqkyZMnrzzTf166+/auHChXZXVeVl/PjxmjhxombMmKH//ve/2rNnj+bPn6+pU6fm+/XVqFFDGRkZiouL05kzZ+wC2eLFizV06NB8zwX8FRBsANwxbm5u2rhxozp27KgHHnhAo0eP1pQpU/Tkk08W2TamTp0qT09PtWzZUp07d1ZISIgeeughu5opU6bIz89PrVu31rPPPqvhw4erTJky5vJy5crp22+/1Z49e9SkSRONGjVK7733nt0cFSpU0Oeff65Vq1YpICBA//nPfzRu3Lib9tavXz998sknmj9/vgICAtS2bVtFR0cX6IhNy5Yt9dJLL+mZZ55R5cqVNWnSJHPZmTNn9Ouvv+Z7LuCvwGbcznWXAAAAJQhHbAAAgGUQbAAAgGUQbAAAgGUQbAAAgGUQbAAAgGUQbAAAgGUQbAAAgGUQbAAAgGUQbAAAgGUQbAAAgGUQbAAAgGX8P3DKyKerQOM6AAAAAElFTkSuQmCC\n", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "# Plot distribution of 'isFraud'\n", + "\n", + "plt.figure(figsize=(6, 4))\n", + "sns.countplot(data=fraud, x='isFraud')\n", + "plt.title('Distribution of Fraudulent Transactions')\n", + "plt.xlabel('Is Fraudulent?')\n", + "plt.ylabel('Count')\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### What is the distribution of the outcome? " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "\"\"\"The majority of the outcome is zeor meaning that they are not fraudulent.\"\"\"" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Clean the dataset. Pre-process it to make it suitable for ML training. Feel free to explore, drop, encode, transform, etc. Whatever you feel will improve the model score." + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "Index: 100000 entries, 6322570 to 4679267\n", + "Data columns (total 11 columns):\n", + " # Column Non-Null Count Dtype \n", + "--- ------ -------------- ----- \n", + " 0 step 100000 non-null int64 \n", + " 1 type 100000 non-null object \n", + " 2 amount 100000 non-null float64\n", + " 3 nameOrig 100000 non-null object \n", + " 4 oldbalanceOrg 100000 non-null float64\n", + " 5 newbalanceOrig 100000 non-null float64\n", + " 6 nameDest 100000 non-null object \n", + " 7 oldbalanceDest 100000 non-null float64\n", + " 8 newbalanceDest 100000 non-null float64\n", + " 9 isFraud 100000 non-null int64 \n", + " 10 isFlaggedFraud 100000 non-null int64 \n", + "dtypes: float64(5), int64(3), object(3)\n", + "memory usage: 9.2+ MB\n" + ] + } + ], + "source": [ + "fraud.info()\n" + ] + }, + { + "cell_type": "code", + "execution_count": 27, + "metadata": {}, + "outputs": [], + "source": [ + "\n", + "fraud = pd.get_dummies(fraud, columns=['type'], drop_first=True)\n", + "\n", + "fraud.drop(columns=['nameOrig', 'nameDest'], inplace=True)\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 28, + "metadata": {}, + "outputs": [], + "source": [ + "fraud.drop(columns=['step', 'isFlaggedFraud'], inplace=True)" + ] + }, + { + "cell_type": "code", + "execution_count": 31, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "amount 0\n", + "oldbalanceOrg 0\n", + "newbalanceOrig 0\n", + "oldbalanceDest 0\n", + "newbalanceDest 0\n", + "isFraud 0\n", + "type_CASH_OUT 0\n", + "type_DEBIT 0\n", + "type_PAYMENT 0\n", + "type_TRANSFER 0\n", + "dtype: int64\n" + ] + } + ], + "source": [ + "missing_values = fraud.isnull().sum()\n", + "print(missing_values)" + ] + }, + { + "cell_type": "code", + "execution_count": 33, + "metadata": {}, + "outputs": [], + "source": [ + "X = fraud.drop(columns=['isFraud'])\n", + "y = fraud['isFraud']\n", + "\n", + "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)\n", + "\n", + "numeric_features = X.select_dtypes(include=['int64', 'float64']).columns.tolist()\n" + ] + }, + { + "cell_type": "code", + "execution_count": 34, + "metadata": {}, + "outputs": [], + "source": [ + "\n", + "preprocessor = ColumnTransformer(\n", + " transformers=[\n", + " ('num', StandardScaler(), numeric_features)\n", + " ])\n", + "\n", + "pipeline = Pipeline(steps=[('preprocessor', preprocessor)])\n", + "\n", + "X_train_preprocessed = pipeline.fit_transform(X_train)\n", + "X_test_preprocessed = pipeline.transform(X_test)\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Run a logisitc regression classifier and evaluate its accuracy." + ] + }, + { + "cell_type": "code", + "execution_count": 37, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Accuracy: 0.99925\n" + ] + } + ], + "source": [ + "from sklearn.linear_model import LogisticRegression\n", + "from sklearn.metrics import accuracy_score\n", + "\n", + "model = LogisticRegression()\n", + "\n", + "model.fit(X_train_preprocessed, y_train)\n", + "y_pred = model.predict(X_test_preprocessed)\n", + "\n", + "accuracy = accuracy_score(y_test, y_pred)\n", + "print(\"Accuracy:\", accuracy)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Now pick a model of your choice and evaluate its accuracy." + ] + }, + { + "cell_type": "code", + "execution_count": 38, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Random Forest Classifier Accuracy: 0.99945\n" + ] + } + ], + "source": [ + "from sklearn.ensemble import RandomForestClassifier\n", + "from sklearn.metrics import accuracy_score\n", + "\n", + "rf_classifier = RandomForestClassifier(n_estimators=100, random_state=42)\n", + "rf_classifier.fit(X_train_preprocessed, y_train)\n", + "\n", + "y_pred_rf = rf_classifier.predict(X_test_preprocessed)\n", + "\n", + "accuracy_rf = accuracy_score(y_test, y_pred_rf)\n", + "print(\"Random Forest Classifier Accuracy:\", accuracy_rf)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Which model worked better and how do you know?" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + " Random Forest Classifier achieved a slightly higher accuracy score of 0.99945" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Note: before doing the first commit, make sure you don't include the large csv file, either by adding it to .gitignore, or by deleting it." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.13" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +}