diff --git a/Data Intake Report_G2M case study.pdf b/Data Intake Report_G2M case study.pdf
new file mode 100644
index 00000000..0dfa1a3b
Binary files /dev/null and b/Data Intake Report_G2M case study.pdf differ
diff --git a/G2M insight for Cab Investment Case Study.ipynb b/G2M insight for Cab Investment Case Study.ipynb
new file mode 100644
index 00000000..abf5c66c
--- /dev/null
+++ b/G2M insight for Cab Investment Case Study.ipynb
@@ -0,0 +1,1786 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "63b21722",
+ "metadata": {},
+ "source": [
+ "# Go-to-Market (G2M) Insight for a Cab Investment Firm\n",
+ "\n",
+ "## Introduction\n",
+ "\n",
+ "**The Client**\n",
+ "\n",
+ "XYZ is a private firm in the US. Given the remarkable growth of the cab industry over the last few years and the presence of multiple key players in the market, XYZ is planning an investment in the industry. As part of its Go-to-Market (G2M) strategy, it wants to understand the market before making a final decision.\n",
+ "\n",
+ "**Data Set:**\n",
+ "\n",
+ "We have been provided with 4 individual data sets. The data covers the period from 31/01/2016 to 31/12/2018.\n",
+ "\n",
+ "Below is the list of data sets provided for the analysis:\n",
+ "\n",
+ "**Cab_Data.csv** – This file includes transaction details for 2 cab companies\n",
+ "\n",
+ "**Customer_ID.csv** – This is a mapping table that contains a unique identifier linking each customer to their demographic details\n",
+ "\n",
+ "**Transaction_ID.csv** – This is a mapping table that contains the transaction-to-customer mapping and the payment mode\n",
+ "\n",
+ "**City.csv** – This file contains a list of US cities, their population, and the number of cab users"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "71f56ea5",
+ "metadata": {},
+ "source": [
+ "## To decide which company is the better investment opportunity for XYZ, we will try to answer the following questions:\n",
+ "\n",
+ "•\tWhich company has had more profit over the years?\n",
+ "\n",
+ "•\tWhich company has users with better income?\n",
+ "\n",
+ "•\tWhich company has more users by city?\n",
+ "\n",
+ "•\tWhich company has had more rides over the years?\n",
+ "\n",
+ "•\tWhich company tends to retain more customers?"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "4f003b9c",
+ "metadata": {},
+ "source": [
+ "## Exploratory Data Analysis (EDA)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "id": "c346aa0e",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ "<thead>\n",
+ "<tr style=\"text-align: right;\"><th></th><th>Transaction ID</th><th>Date of Travel</th><th>Company</th><th>City</th><th>KM Travelled</th><th>Price Charged</th><th>Cost of Trip</th></tr>\n",
+ "</thead>\n",
+ "<tbody>\n",
+ "<tr><th>0</th><td>10000011</td><td>42377</td><td>Pink Cab</td><td>ATLANTA GA</td><td>30.45</td><td>370.95</td><td>313.635</td></tr>\n",
+ "<tr><th>1</th><td>10000012</td><td>42375</td><td>Pink Cab</td><td>ATLANTA GA</td><td>28.62</td><td>358.52</td><td>334.854</td></tr>\n",
+ "<tr><th>2</th><td>10000013</td><td>42371</td><td>Pink Cab</td><td>ATLANTA GA</td><td>9.04</td><td>125.20</td><td>97.632</td></tr>\n",
+ "<tr><th>3</th><td>10000014</td><td>42376</td><td>Pink Cab</td><td>ATLANTA GA</td><td>33.17</td><td>377.40</td><td>351.602</td></tr>\n",
+ "<tr><th>4</th><td>10000015</td><td>42372</td><td>Pink Cab</td><td>ATLANTA GA</td><td>8.73</td><td>114.62</td><td>97.776</td></tr>\n",
+ "</tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " Transaction ID Date of Travel Company City KM Travelled \\\n",
+ "0 10000011 42377 Pink Cab ATLANTA GA 30.45 \n",
+ "1 10000012 42375 Pink Cab ATLANTA GA 28.62 \n",
+ "2 10000013 42371 Pink Cab ATLANTA GA 9.04 \n",
+ "3 10000014 42376 Pink Cab ATLANTA GA 33.17 \n",
+ "4 10000015 42372 Pink Cab ATLANTA GA 8.73 \n",
+ "\n",
+ " Price Charged Cost of Trip \n",
+ "0 370.95 313.635 \n",
+ "1 358.52 334.854 \n",
+ "2 125.20 97.632 \n",
+ "3 377.40 351.602 \n",
+ "4 114.62 97.776 "
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ "<thead>\n",
+ "<tr style=\"text-align: right;\"><th></th><th>City</th><th>Population</th><th>Users</th></tr>\n",
+ "</thead>\n",
+ "<tbody>\n",
+ "<tr><th>0</th><td>NEW YORK NY</td><td>8,405,837</td><td>302,149</td></tr>\n",
+ "<tr><th>1</th><td>CHICAGO IL</td><td>1,955,130</td><td>164,468</td></tr>\n",
+ "<tr><th>2</th><td>LOS ANGELES CA</td><td>1,595,037</td><td>144,132</td></tr>\n",
+ "<tr><th>3</th><td>MIAMI FL</td><td>1,339,155</td><td>17,675</td></tr>\n",
+ "<tr><th>4</th><td>SILICON VALLEY</td><td>1,177,609</td><td>27,247</td></tr>\n",
+ "</tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " City Population Users\n",
+ "0 NEW YORK NY 8,405,837 302,149 \n",
+ "1 CHICAGO IL 1,955,130 164,468 \n",
+ "2 LOS ANGELES CA 1,595,037 144,132 \n",
+ "3 MIAMI FL 1,339,155 17,675 \n",
+ "4 SILICON VALLEY 1,177,609 27,247 "
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ "<thead>\n",
+ "<tr style=\"text-align: right;\"><th></th><th>Customer ID</th><th>Gender</th><th>Age</th><th>Income (USD/Month)</th></tr>\n",
+ "</thead>\n",
+ "<tbody>\n",
+ "<tr><th>0</th><td>29290</td><td>Male</td><td>28</td><td>10813</td></tr>\n",
+ "<tr><th>1</th><td>27703</td><td>Male</td><td>27</td><td>9237</td></tr>\n",
+ "<tr><th>2</th><td>28712</td><td>Male</td><td>53</td><td>11242</td></tr>\n",
+ "<tr><th>3</th><td>28020</td><td>Male</td><td>23</td><td>23327</td></tr>\n",
+ "<tr><th>4</th><td>27182</td><td>Male</td><td>33</td><td>8536</td></tr>\n",
+ "</tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " Customer ID Gender Age Income (USD/Month)\n",
+ "0 29290 Male 28 10813\n",
+ "1 27703 Male 27 9237\n",
+ "2 28712 Male 53 11242\n",
+ "3 28020 Male 23 23327\n",
+ "4 27182 Male 33 8536"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/plain": [
+ " Company Year Customer ID Number of rides\n",
+ "0 Pink Cab 2016 1 1\n",
+ "1 Pink Cab 2016 2 2\n",
+ "2 Pink Cab 2016 3 2\n",
+ "3 Pink Cab 2016 5 2\n",
+ "4 Pink Cab 2016 6 1\n",
+ "... ... ... ... ...\n",
+ "134895 Yellow Cab 2018 59996 3\n",
+ "134896 Yellow Cab 2018 59997 4\n",
+ "134897 Yellow Cab 2018 59998 2\n",
+ "134898 Yellow Cab 2018 59999 4\n",
+ "134899 Yellow Cab 2018 60000 7\n",
+ "\n",
+ "[134900 rows x 4 columns]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
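+ "text/plain": [
+ "<Figure with 2 Axes: 'Sporadic Customers (5 rides or less)' and 'Loyal Customers (more than 5 rides)'; x-axis 'Year', y-axis 'Number of rides', one line per Company>"
+ ]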
+ },
+ "metadata": {
+ "needs_background": "light"
+ },
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "# Count rides per customer, per company and per year\n",
+ "# (relies on `data` from earlier cells, with 'Date of Travel' already parsed as datetime)\n",
+ "retention_df = (\n",
+ "    data.groupby(['Company', data['Date of Travel'].dt.year, 'Customer ID'])[['Payment_Mode']]\n",
+ "    .count()\n",
+ "    .reset_index()\n",
+ ")\n",
+ "retention_df.columns = ['Company', 'Year', 'Customer ID', 'Number of rides']\n",
+ "display(retention_df)\n",
+ "\n",
+ "# Split customers by yearly ride count: 5 or fewer is sporadic, more than 5 is loyal\n",
+ "sporadic_customers = retention_df[retention_df['Number of rides'] <= 5]\n",
+ "loyal_customers = retention_df[retention_df['Number of rides'] > 5]\n",
+ "\n",
+ "fig, ax = plt.subplots(1, 2)\n",
+ "\n",
+ "# Mean number of rides per customer by year, one line per company\n",
+ "sns.lineplot(x='Year', y='Number of rides', data=sporadic_customers, hue='Company', palette=['pink', 'yellow'], ax=ax[0])\n",
+ "sns.lineplot(x='Year', y='Number of rides', data=loyal_customers, hue='Company', palette=['pink', 'yellow'], ax=ax[1])\n",
+ "ax[0].set_title('Sporadic Customers (5 rides or less)')\n",
+ "ax[1].set_title('Loyal Customers (more than 5 rides)')\n",
+ "# Show only whole years on the x-axis of each panel\n",
+ "ax[0].set_xticks(sporadic_customers['Year'].unique())\n",
+ "ax[1].set_xticks(loyal_customers['Year'].unique())\n",
+ "plt.tight_layout()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "fd9167c8",
+ "metadata": {},
+ "source": [
+ "We categorize customers as sporadic or loyal based on the number of rides they take with each company per year (a customer with 5 rides or fewer in a year is considered sporadic in our analysis). By this measure, Yellow Cab is doing a better job than Pink Cab at customer retention."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "89c5e747",
+ "metadata": {},
+ "source": [
+ "## Recommendation\n",
+ "\n",
+ "Based on the questions analyzed and answered above, we conclude that **Yellow Cab** is a better investment option than Pink Cab.\n",
+ "\n",
+ "**Profit through the years:** Yellow Cab earned **8.3** times the profit of Pink Cab in the period from 2016 to 2018.\n",
+ "\n",
+ "**User income profile:** In each economic group (Poor or near-poor, Lower-middle class, Middle class, Upper-middle class, Rich), Yellow Cab has more users than Pink Cab.\n",
+ "\n",
+ "**Users by city:** In the top 5 cities by number of users, Yellow Cab has a larger presence than Pink Cab.\n",
+ "\n",
+ "**Volume of rides:** Yellow Cab had **3.25** times as many trips as Pink Cab in the period from 2016 to 2018.\n",
+ "\n",
+ "**Customer retention:** We categorize customers as sporadic or loyal based on the number of rides they take with each company per year (a customer with 5 rides or fewer in a year is considered sporadic in our analysis). By this measure, Yellow Cab is doing a better job than Pink Cab at customer retention."
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.9.7"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}