Please fill out:
- Student name: Joseph Husney
- Student pace: full time
- Scheduled project review date/time: 8/06/2020 5:00 pm
- Instructor name: James Irving
- Blog post URL: https://jhusney1.github.io/reasons_to_use_webscraping_over_apis
Microsoft sees all the big companies creating original video content, and they want to get in on the fun. They have decided to create a new movie studio, but the problem is they don't know anything about creating movies. They have hired you to help them better understand the movie industry. Your team is charged with doing data analysis and creating a presentation that explores what types of films are currently doing the best at the box office. You must then translate those findings into actionable insights that the CEO can use when deciding what types of films they should be creating. We will present our findings through three visuals, which should shed some light on which kinds of movies Microsoft should invest in.
import requests
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from pandasql import sqldf
pysqldf = lambda q: sqldf(q, globals())
from tqdm import tqdm
# Allowing pandas to display unlimited info
# pd.set_option('display.max_rows', 1000)
# pd.set_option('display.max_columns', 1000)
Before analyzing the data, it must first be retrieved. Inside the package given to us, there is a file that has all the movie info. We will use this info to get all kinds of details which we can further analyze. We will put this into a pandas dataframe so we can utilize all the tools pandas has to offer.
# Load csv file to dataframe
df = pd.read_csv('zippedData/tmdb_5000_movies.csv')
# Preview data
df.head()
| | budget | genres | homepage | id | keywords | original_language | original_title | overview | popularity | production_companies | production_countries | release_date | revenue | runtime | spoken_languages | status | tagline | title | vote_average | vote_count |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 237000000 | [{"id": 28, "name": "Action"}, {"id": 12, "nam... | http://www.avatarmovie.com/ | 19995 | [{"id": 1463, "name": "culture clash"}, {"id":... | en | Avatar | In the 22nd century, a paraplegic Marine is di... | 150.437577 | [{"name": "Ingenious Film Partners", "id": 289... | [{"iso_3166_1": "US", "name": "United States o... | 2009-12-10 | 2787965087 | 162.0 | [{"iso_639_1": "en", "name": "English"}, {"iso... | Released | Enter the World of Pandora. | Avatar | 7.2 | 11800 |
| 1 | 300000000 | [{"id": 12, "name": "Adventure"}, {"id": 14, "... | http://disney.go.com/disneypictures/pirates/ | 285 | [{"id": 270, "name": "ocean"}, {"id": 726, "na... | en | Pirates of the Caribbean: At World's End | Captain Barbossa, long believed to be dead, ha... | 139.082615 | [{"name": "Walt Disney Pictures", "id": 2}, {"... | [{"iso_3166_1": "US", "name": "United States o... | 2007-05-19 | 961000000 | 169.0 | [{"iso_639_1": "en", "name": "English"}] | Released | At the end of the world, the adventure begins. | Pirates of the Caribbean: At World's End | 6.9 | 4500 |
| 2 | 245000000 | [{"id": 28, "name": "Action"}, {"id": 12, "nam... | http://www.sonypictures.com/movies/spectre/ | 206647 | [{"id": 470, "name": "spy"}, {"id": 818, "name... | en | Spectre | A cryptic message from Bond’s past sends him o... | 107.376788 | [{"name": "Columbia Pictures", "id": 5}, {"nam... | [{"iso_3166_1": "GB", "name": "United Kingdom"... | 2015-10-26 | 880674609 | 148.0 | [{"iso_639_1": "fr", "name": "Fran\u00e7ais"},... | Released | A Plan No One Escapes | Spectre | 6.3 | 4466 |
| 3 | 250000000 | [{"id": 28, "name": "Action"}, {"id": 80, "nam... | http://www.thedarkknightrises.com/ | 49026 | [{"id": 849, "name": "dc comics"}, {"id": 853,... | en | The Dark Knight Rises | Following the death of District Attorney Harve... | 112.312950 | [{"name": "Legendary Pictures", "id": 923}, {"... | [{"iso_3166_1": "US", "name": "United States o... | 2012-07-16 | 1084939099 | 165.0 | [{"iso_639_1": "en", "name": "English"}] | Released | The Legend Ends | The Dark Knight Rises | 7.6 | 9106 |
| 4 | 260000000 | [{"id": 28, "name": "Action"}, {"id": 12, "nam... | http://movies.disney.com/john-carter | 49529 | [{"id": 818, "name": "based on novel"}, {"id":... | en | John Carter | John Carter is a war-weary, former military ca... | 43.926995 | [{"name": "Walt Disney Pictures", "id": 2}] | [{"iso_3166_1": "US", "name": "United States o... | 2012-03-07 | 284139100 | 132.0 | [{"iso_639_1": "en", "name": "English"}] | Released | Lost in our world, found in another. | John Carter | 6.1 | 2124 |
# See how long dataframe is
df.shape
(4803, 20)
Upon inspecting the data, it became clear that there are many movies missing certain important data points. For instance, some movies don't have genre data. Others have a budget and/or revenue of zero. Here we will clean up the data by deleting those rows from the dataframe.
# See if there are any null values for genre
df.isna().sum()
budget 0
genres 0
homepage 3091
id 0
keywords 0
original_language 0
original_title 0
overview 3
popularity 0
production_companies 0
production_countries 0
release_date 1
revenue 0
runtime 2
spoken_languages 0
status 0
tagline 844
title 0
vote_average 0
vote_count 0
dtype: int64
# No null values - this code isn't necessary for now
# df = df[~df['genres'].isna()]
# Find out how many rows we are dropping
indexNames1 = df[ df['budget'] == 0 ].index
indexNames2 = df[ df['revenue'] == 0 ].index
len(indexNames1), len(indexNames2)
(1037, 1427)
# Drop rows
indexNames = df[ df['budget'] == 0 ].index
df.drop(indexNames , inplace=True)
indexNames = df[ df['revenue'] == 0 ].index
df.drop(indexNames , inplace=True)
# See shape after dropping all those rows
df.shape
(3229, 20)
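The two-step drop above can also be written as a single boolean mask. What follows is a minimal sketch on a hypothetical four-row frame (the titles and numbers are made up for illustration and are not taken from the movie file); it keeps only rows where both `budget` and `revenue` are nonzero.

```python
import pandas as pd

# Hypothetical toy data standing in for the real movie file
movies = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "budget": [100, 0, 50, 0],
    "revenue": [300, 20, 0, 0],
})

# Keep a row only if it has both a nonzero budget and a nonzero revenue
has_financials = (movies["budget"] != 0) & (movies["revenue"] != 0)
cleaned = movies[has_financials].copy()

rows_dropped = len(movies) - len(cleaned)
print(f"{rows_dropped} rows dropped, {len(cleaned)} kept")
```

Because a movie can have both a zero budget and a zero revenue, the number of rows removed is smaller than the sum of the two counts: 1037 + 1427 = 2464, while the shape shows 4803 - 3229 = 1574 rows actually dropped.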
Now we will discuss three questions that will shed some light on which movies Microsoft should invest in.
Question 1: What is the domestic average movie profit categorized by genre?
# Check type of genre column in order to manipulate data
type(df['genres'][0])
str
# Convert to list of dictionaries that it originally was.
import ast
df['genres'] = df['genres'].map(ast.literal_eval)
type(df['genres'][0][0])
dict
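The `genres` cells are JSON text, so the standard library's `json.loads` is an alternative parser to `ast.literal_eval`. The sketch below is not the notebook's method, only a comparison on a shortened, illustrative sample string in the same shape as one cell. It also shows a case where the two differ: JSON-only tokens such as `null` are not Python literals.

```python
import ast
import json

# Shortened, illustrative sample in the same shape as one 'genres' cell
raw = '[{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}]'

parsed_json = json.loads(raw)        # parses the text as JSON
parsed_ast = ast.literal_eval(raw)   # parses the text as a Python literal

# Both parsers agree on this sample
names = [genre["name"] for genre in parsed_json]
print(names)

# JSON-only tokens such as null are rejected by literal_eval
try:
    ast.literal_eval('[{"id": 1, "name": null}]')
    literal_eval_handles_null = True
except ValueError:
    literal_eval_handles_null = False
print(literal_eval_handles_null)
```

For this dataset `ast.literal_eval` works as shown in the cell above; `json.loads` would only matter if a cell ever contained `null`, `true`, or `false`.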
def seperate_genres(genre_list):
    """Return the genre names from a list of genre dictionaries."""
    genres = []
    for genre in genre_list:
        genres.append(genre['name'])
    return genres
# Make separate column for genres as a list of genre names
df['genre'] = df['genres'].map(seperate_genres)
# One row per (movie, genre) pair
genre_df = df.explode('genre')
genre_df['profit'] = genre_df['revenue'] - genre_df['budget']
genre_df[['title','budget','revenue', 'profit']].head()
| | title | budget | revenue | profit |
|---|---|---|---|---|
| 0 | Avatar | 237000000 | 2787965087 | 2550965087 |
| 0 | Avatar | 237000000 | 2787965087 | 2550965087 |
| 0 | Avatar | 237000000 | 2787965087 | 2550965087 |
| 0 | Avatar | 237000000 | 2787965087 | 2550965087 |
| 1 | Pirates of the Caribbean: At World's End | 300000000 | 961000000 | 661000000 |
sns.barplot(y='genre', x='profit', data=genre_df, ci=68, palette="Blues_d")
<matplotlib.axes._subplots.AxesSubplot at 0x23be6e34320>
Conclusion: The five genres with the highest average profit are Animation, Adventure, Fantasy, Family, and Science Fiction. We recommend that Microsoft invest in animated movies.
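The bar plot draws the mean profit for each genre, so the same ranking can be read off numerically with a `groupby`. This is a minimal sketch on hypothetical toy data (the titles, genres, and dollar figures are invented for illustration); the column names mirror the ones used above.

```python
import pandas as pd

# Hypothetical toy data; titles, genres, and dollar figures are made up
movies = pd.DataFrame({
    "title": ["A", "B", "C"],
    "genre": [["Animation", "Family"], ["Drama"], ["Animation", "Drama"]],
    "budget": [100, 50, 80],
    "revenue": [500, 60, 280],
})
movies["profit"] = movies["revenue"] - movies["budget"]

# One row per (movie, genre) pair, as in genre_df
per_genre = movies.explode("genre")

# Mean profit per genre, highest first
mean_profit = (
    per_genre.groupby("genre")["profit"].mean().sort_values(ascending=False)
)
print(mean_profit)
```

Note that a movie tagged with several genres contributes its full profit to each of them, so the per-genre averages overlap rather than partition the data.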
Question 2: Is there an optimal runtime (in terms of profit) for movies domestically? If so, what is it?
df_runtime = df.copy()
df_runtime['profit'] = df_runtime['revenue'] - df_runtime['budget']
df_runtime['profit_margin'] = (df_runtime['profit'] / df_runtime['budget'])*100
df_runtime['runtime']
0 162.0
1 169.0
2 148.0
3 165.0
4 132.0
...
4773 92.0
4788 93.0
4792 111.0
4796 77.0
4798 81.0
Name: runtime, Length: 3229, dtype: float64
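As a worked example of the two derived columns, the arithmetic can be checked by hand for a single movie. The sketch below uses the budget and revenue shown for Avatar in the data preview; the variable names are only for illustration.

```python
# Budget and revenue for Avatar, as shown in the data preview
budget = 237_000_000
revenue = 2_787_965_087

profit = revenue - budget
profit_margin = profit / budget * 100  # profit as a percentage of budget

print(profit)
print(round(profit_margin, 1))
```

The profit matches the value listed for Avatar in the tables, and the margin comes out at more than ten times the budget. Since `profit_margin` here divides by budget, it is closer to a return on budget than to the conventional profit margin, which divides by revenue.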
# slice out 50 most profitable movies to look at
df_runtime = df_runtime.sort_values('profit',ascending=False).head(50)
df_runtime[['title', 'profit', 'runtime']]
| | title | profit | runtime |
|---|---|---|---|
| 0 | Avatar | 2550965087 | 162.0 |
| 25 | Titanic | 1645034188 | 194.0 |
| 28 | Jurassic World | 1363528810 | 124.0 |
| 44 | Furious 7 | 1316249360 | 137.0 |
| 16 | The Avengers | 1299557910 | 143.0 |
| 7 | Avengers: Age of Ultron | 1125403694 | 141.0 |
| 124 | Frozen | 1124219009 | 102.0 |
| 546 | Minions | 1082730962 | 91.0 |
| 329 | The Lord of the Rings: The Return of the King | 1024888979 | 201.0 |
| 31 | Iron Man 3 | 1015439994 | 130.0 |
| 52 | Transformers: Dark of the Moon | 928746996 | 154.0 |
| 29 | Skyfall | 908561013 | 143.0 |
| 26 | Captain America: Civil War | 903304495 | 147.0 |
| 506 | Despicable Me 2 | 894761885 | 98.0 |
| 36 | Transformers: Age of Extinction | 881405097 | 165.0 |
| 42 | Toy Story 3 | 866969703 | 103.0 |
| 12 | Pirates of the Caribbean: Dead Man's Chest | 865659812 | 151.0 |
| 675 | Jurassic Park | 857100000 | 127.0 |
| 197 | Harry Potter and the Philosopher's Stone | 851475550 | 152.0 |
| 330 | The Lord of the Rings: The Two Towers | 847287400 | 179.0 |
| 328 | Finding Nemo | 846335536 | 100.0 |
| 3 | The Dark Knight Rises | 834939099 | 165.0 |
| 32 | Alice in Wonderland | 825491110 | 108.0 |
| 65 | The Dark Knight | 819558444 | 152.0 |
| 233 | Star Wars: Episode I - The Phantom Menace | 809317558 | 136.0 |
| 504 | The Secret Life of Pets | 800958308 | 87.0 |
| 348 | Ice Age: Dawn of the Dinosaurs | 796686817 | 94.0 |
| 78 | The Jungle Book | 791550600 | 106.0 |
| 113 | Harry Potter and the Order of the Phoenix | 788212738 | 138.0 |
| 2967 | E.T. the Extra-Terrestrial | 782410554 | 115.0 |
| 325 | Ice Age: Continental Drift | 782244782 | 88.0 |
| 262 | The Lord of the Rings: The Fellowship of the Ring | 778368364 | 178.0 |
| 276 | Harry Potter and the Chamber of Secrets | 776688482 | 161.0 |
| 98 | The Hobbit: An Unexpected Journey | 771103568 | 169.0 |
| 565 | Shrek 2 | 769838758 | 93.0 |
| 2912 | Star Wars | 764398007 | 121.0 |
| 114 | Harry Potter and the Goblet of Fire | 745921036 | 157.0 |
| 494 | The Lion King | 743241776 | 89.0 |
| 507 | Independence Day | 741969268 | 145.0 |
| 229 | Star Wars: Episode III - Revenge of the Sith | 737000000 | 140.0 |
| 788 | Deadpool | 725112979 | 108.0 |
| 183 | The Hunger Games: Catching Fire | 717423452 | 146.0 |
| 172 | The Twilight Saga: Breaking Dawn - Part 2 | 709000000 | 115.0 |
| 22 | The Hobbit: The Desolation of Smaug | 708400000 | 161.0 |
| 19 | The Hobbit: The Battle of the Five Armies | 706019788 | 144.0 |
| 35 | Transformers: Revenge of the Fallen | 686297228 | 150.0 |
| 8 | Harry Potter and the Half-Blood Prince | 683959197 | 153.0 |
| 159 | Spider-Man | 682708551 | 121.0 |
| 77 | Inside Out | 682611174 | 94.0 |
| 17 | Pirates of the Caribbean: On Stranger Tides | 665713802 | 136.0 |
# Pass x and y by keyword; recent seaborn versions no longer accept them positionally
sns.jointplot(x="runtime", y="revenue", data=df_runtime, kind="reg")
<seaborn.axisgrid.JointGrid at 0x23be7183cc0>
Conclusion: As illustrated in this plot, there is a small positive correlation between a movie's runtime and its revenue among the 50 most profitable movies. One thing to note is that a large portion of these 50 movies ran either around 100 minutes or around 125-150 minutes (as illustrated by the bars above the plot). Therefore, Microsoft should make movies that are between 125 and 150 minutes long to mimic the most profitable movies.
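To put a number on a "small positive correlation", a Pearson coefficient can be computed directly. The sketch below is illustrative only: it uses runtime and profit for just the first ten rows of the table above (the plot itself uses revenue and all 50 movies), and the helper `pearson` is a hypothetical name written out by hand rather than a library call.

```python
from math import sqrt

# (runtime in minutes, profit in dollars) for the first ten rows of the table
rows = [
    (162.0, 2550965087), (194.0, 1645034188), (124.0, 1363528810),
    (137.0, 1316249360), (143.0, 1299557910), (141.0, 1125403694),
    (102.0, 1124219009), (91.0, 1082730962), (201.0, 1024888979),
    (130.0, 1015439994),
]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

runtimes = [runtime for runtime, _ in rows]
profits = [profit for _, profit in rows]
r = pearson(runtimes, profits)
print(round(r, 3))
```

With the full dataframe, `df_runtime['runtime'].corr(df_runtime['revenue'])` would give the coefficient that corresponds to the plot.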
Question 3: Which production company(s) are most successful in terms of domestic profit and therefore should be used?
# Ensure that our original dataset is unchanged
print(df.shape)
df.head()
(3229, 21)
| | budget | genres | homepage | id | keywords | original_language | original_title | overview | popularity | production_companies | ... | release_date | revenue | runtime | spoken_languages | status | tagline | title | vote_average | vote_count | genre |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 237000000 | [{'id': 28, 'name': 'Action'}, {'id': 12, 'nam... | http://www.avatarmovie.com/ | 19995 | [{"id": 1463, "name": "culture clash"}, {"id":... | en | Avatar | In the 22nd century, a paraplegic Marine is di... | 150.437577 | [{"name": "Ingenious Film Partners", "id": 289... | ... | 2009-12-10 | 2787965087 | 162.0 | [{"iso_639_1": "en", "name": "English"}, {"iso... | Released | Enter the World of Pandora. | Avatar | 7.2 | 11800 | [Action, Adventure, Fantasy, Science Fiction] |
| 1 | 300000000 | [{'id': 12, 'name': 'Adventure'}, {'id': 14, '... | http://disney.go.com/disneypictures/pirates/ | 285 | [{"id": 270, "name": "ocean"}, {"id": 726, "na... | en | Pirates of the Caribbean: At World's End | Captain Barbossa, long believed to be dead, ha... | 139.082615 | [{"name": "Walt Disney Pictures", "id": 2}, {"... | ... | 2007-05-19 | 961000000 | 169.0 | [{"iso_639_1": "en", "name": "English"}] | Released | At the end of the world, the adventure begins. | Pirates of the Caribbean: At World's End | 6.9 | 4500 | [Adventure, Fantasy, Action] |
| 2 | 245000000 | [{'id': 28, 'name': 'Action'}, {'id': 12, 'nam... | http://www.sonypictures.com/movies/spectre/ | 206647 | [{"id": 470, "name": "spy"}, {"id": 818, "name... | en | Spectre | A cryptic message from Bond’s past sends him o... | 107.376788 | [{"name": "Columbia Pictures", "id": 5}, {"nam... | ... | 2015-10-26 | 880674609 | 148.0 | [{"iso_639_1": "fr", "name": "Fran\u00e7ais"},... | Released | A Plan No One Escapes | Spectre | 6.3 | 4466 | [Action, Adventure, Crime] |
| 3 | 250000000 | [{'id': 28, 'name': 'Action'}, {'id': 80, 'nam... | http://www.thedarkknightrises.com/ | 49026 | [{"id": 849, "name": "dc comics"}, {"id": 853,... | en | The Dark Knight Rises | Following the death of District Attorney Harve... | 112.312950 | [{"name": "Legendary Pictures", "id": 923}, {"... | ... | 2012-07-16 | 1084939099 | 165.0 | [{"iso_639_1": "en", "name": "English"}] | Released | The Legend Ends | The Dark Knight Rises | 7.6 | 9106 | [Action, Crime, Drama, Thriller] |
| 4 | 260000000 | [{'id': 28, 'name': 'Action'}, {'id': 12, 'nam... | http://movies.disney.com/john-carter | 49529 | [{"id": 818, "name": "based on novel"}, {"id":... | en | John Carter | John Carter is a war-weary, former military ca... | 43.926995 | [{"name": "Walt Disney Pictures", "id": 2}] | ... | 2012-03-07 | 284139100 | 132.0 | [{"iso_639_1": "en", "name": "English"}] | Released | Lost in our world, found in another. | John Carter | 6.1 | 2124 | [Action, Adventure, Science Fiction] |
5 rows × 21 columns
type(df['production_companies'][0][0])
str
# Convert to list of dicts
df['production_companies'] = df['production_companies'].map(ast.literal_eval)
type(df['production_companies'][0][0])
dict
list_of_dicts = df['production_companies'][0]
list_of_dicts
[{'name': 'Ingenious Film Partners', 'id': 289},
{'name': 'Twentieth Century Fox Film Corporation', 'id': 306},
{'name': 'Dune Entertainment', 'id': 444},
{'name': 'Lightstorm Entertainment', 'id': 574}]
def seperate_production_companies(company_list):
    """Return the company names from a list of company dictionaries."""
    company_names = []
    for dict_ in company_list:
        company_names.append(dict_['name'])
    return company_names
df['production_companies'] = df['production_companies'].map(seperate_production_companies)
df['production_companies']
0 [Ingenious Film Partners, Twentieth Century Fo...
1 [Walt Disney Pictures, Jerry Bruckheimer Films...
2 [Columbia Pictures, Danjaq, B24]
3 [Legendary Pictures, Warner Bros., DC Entertai...
4 [Walt Disney Pictures]
...
4773 [Miramax Films, View Askew Productions]
4788 [Dreamland Productions]
4792 [Daiei Studios]
4796 [Thinkfilm]
4798 [Columbia Pictures]
Name: production_companies, Length: 3229, dtype: object
production_company_df = df.explode('production_companies')
production_company_df
| | budget | genres | homepage | id | keywords | original_language | original_title | overview | popularity | production_companies | ... | release_date | revenue | runtime | spoken_languages | status | tagline | title | vote_average | vote_count | genre |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 237000000 | [{'id': 28, 'name': 'Action'}, {'id': 12, 'nam... | http://www.avatarmovie.com/ | 19995 | [{"id": 1463, "name": "culture clash"}, {"id":... | en | Avatar | In the 22nd century, a paraplegic Marine is di... | 150.437577 | Ingenious Film Partners | ... | 2009-12-10 | 2787965087 | 162.0 | [{"iso_639_1": "en", "name": "English"}, {"iso... | Released | Enter the World of Pandora. | Avatar | 7.2 | 11800 | [Action, Adventure, Fantasy, Science Fiction] |
| 0 | 237000000 | [{'id': 28, 'name': 'Action'}, {'id': 12, 'nam... | http://www.avatarmovie.com/ | 19995 | [{"id": 1463, "name": "culture clash"}, {"id":... | en | Avatar | In the 22nd century, a paraplegic Marine is di... | 150.437577 | Twentieth Century Fox Film Corporation | ... | 2009-12-10 | 2787965087 | 162.0 | [{"iso_639_1": "en", "name": "English"}, {"iso... | Released | Enter the World of Pandora. | Avatar | 7.2 | 11800 | [Action, Adventure, Fantasy, Science Fiction] |
| 0 | 237000000 | [{'id': 28, 'name': 'Action'}, {'id': 12, 'nam... | http://www.avatarmovie.com/ | 19995 | [{"id": 1463, "name": "culture clash"}, {"id":... | en | Avatar | In the 22nd century, a paraplegic Marine is di... | 150.437577 | Dune Entertainment | ... | 2009-12-10 | 2787965087 | 162.0 | [{"iso_639_1": "en", "name": "English"}, {"iso... | Released | Enter the World of Pandora. | Avatar | 7.2 | 11800 | [Action, Adventure, Fantasy, Science Fiction] |
| 0 | 237000000 | [{'id': 28, 'name': 'Action'}, {'id': 12, 'nam... | http://www.avatarmovie.com/ | 19995 | [{"id": 1463, "name": "culture clash"}, {"id":... | en | Avatar | In the 22nd century, a paraplegic Marine is di... | 150.437577 | Lightstorm Entertainment | ... | 2009-12-10 | 2787965087 | 162.0 | [{"iso_639_1": "en", "name": "English"}, {"iso... | Released | Enter the World of Pandora. | Avatar | 7.2 | 11800 | [Action, Adventure, Fantasy, Science Fiction] |
| 1 | 300000000 | [{'id': 12, 'name': 'Adventure'}, {'id': 14, '... | http://disney.go.com/disneypictures/pirates/ | 285 | [{"id": 270, "name": "ocean"}, {"id": 726, "na... | en | Pirates of the Caribbean: At World's End | Captain Barbossa, long believed to be dead, ha... | 139.082615 | Walt Disney Pictures | ... | 2007-05-19 | 961000000 | 169.0 | [{"iso_639_1": "en", "name": "English"}] | Released | At the end of the world, the adventure begins. | Pirates of the Caribbean: At World's End | 6.9 | 4500 | [Adventure, Fantasy, Action] |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 4773 | 27000 | [{'id': 35, 'name': 'Comedy'}] | http://www.miramax.com/movie/clerks/ | 2292 | [{"id": 1361, "name": "salesclerk"}, {"id": 30... | en | Clerks | Convenience and video store clerks Dante and R... | 19.748658 | View Askew Productions | ... | 1994-09-13 | 3151130 | 92.0 | [{"iso_639_1": "en", "name": "English"}] | Released | Just because they serve you doesn't mean they ... | Clerks | 7.4 | 755 | [Comedy] |
| 4788 | 12000 | [{'id': 27, 'name': 'Horror'}, {'id': 35, 'nam... | NaN | 692 | [{"id": 237, "name": "gay"}, {"id": 900, "name... | en | Pink Flamingos | Notorious Baltimore criminal and underground f... | 4.553644 | Dreamland Productions | ... | 1972-03-12 | 6000000 | 93.0 | [{"iso_639_1": "en", "name": "English"}] | Released | An exercise in poor taste. | Pink Flamingos | 6.2 | 110 | [Horror, Comedy, Crime] |
| 4792 | 20000 | [{'id': 80, 'name': 'Crime'}, {'id': 27, 'name... | NaN | 36095 | [{"id": 233, "name": "japan"}, {"id": 549, "na... | ja | キュア | A wave of gruesome murders is sweeping Tokyo. ... | 0.212443 | Daiei Studios | ... | 1997-11-06 | 99000 | 111.0 | [{"iso_639_1": "ja", "name": "\u65e5\u672c\u8a... | Released | Madness. Terror. Murder. | Cure | 7.4 | 63 | [Crime, Horror, Mystery, Thriller] |
| 4796 | 7000 | [{'id': 878, 'name': 'Science Fiction'}, {'id'... | http://www.primermovie.com | 14337 | [{"id": 1448, "name": "distrust"}, {"id": 2101... | en | Primer | Friends/fledgling entrepreneurs invent a devic... | 23.307949 | Thinkfilm | ... | 2004-10-08 | 424760 | 77.0 | [{"iso_639_1": "en", "name": "English"}] | Released | What happens if it actually works? | Primer | 6.9 | 658 | [Science Fiction, Drama, Thriller] |
| 4798 | 220000 | [{'id': 28, 'name': 'Action'}, {'id': 80, 'nam... | NaN | 9367 | [{"id": 5616, "name": "united states\u2013mexi... | es | El Mariachi | El Mariachi just wants to play his guitar and ... | 14.269792 | Columbia Pictures | ... | 1992-09-04 | 2040920 | 81.0 | [{"iso_639_1": "es", "name": "Espa\u00f1ol"}] | Released | He didn't come looking for trouble, but troubl... | El Mariachi | 6.6 | 238 | [Action, Crime, Thriller] |
10373 rows × 21 columns
production_company_df['production_companies'].value_counts()
Warner Bros. 280
Universal Pictures 273
Paramount Pictures 245
Twentieth Century Fox Film Corporation 201
Columbia Pictures 167
...
Filmtribe 1
Seven Arts 1
Geisler-Roberdeau 1
Lago Film 1
Novo RPI 1
Name: production_companies, Length: 3564, dtype: int64
# Keep only production companies credited on at least 30 movies
production_company_df = production_company_df[production_company_df.groupby('production_companies')['production_companies'].transform('count').ge(30)]
fig_dims = (15,8)
fig, ax = plt.subplots(figsize=fig_dims)
sns.barplot(x = "revenue", y = "production_companies", ax=ax, data=production_company_df)
<matplotlib.axes._subplots.AxesSubplot at 0x23be8a35978>
Conclusion: The top production companies in terms of average revenue are DreamWorks Animation, Legendary Pictures, Amblin Entertainment, Walt Disney Pictures, and Dune Entertainment. Therefore, Microsoft should consider consulting with these companies to help produce their movies.
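The filter used for this question keeps a company only if it appears on enough movies, then compares average revenue. A minimal sketch of that pattern on hypothetical toy data follows; the studio names and revenues are invented, and the threshold is lowered from 30 to 2 so the effect is visible on six rows.

```python
import pandas as pd

# Hypothetical toy data: one row per (movie, company) pair, as after explode
credits = pd.DataFrame({
    "production_companies": [
        "Studio X", "Studio X", "Studio Y",
        "Studio Z", "Studio Z", "Studio Z",
    ],
    "revenue": [100, 300, 900, 50, 70, 60],
})

min_movies = 2  # the analysis above uses 30

# Size of each row's company group, aligned to the original rows
group_size = (
    credits.groupby("production_companies")["production_companies"]
    .transform("count")
)
frequent = credits[group_size.ge(min_movies)]

# Mean revenue per remaining company, highest first
mean_revenue = (
    frequent.groupby("production_companies")["revenue"]
    .mean()
    .sort_values(ascending=False)
)
print(mean_revenue)
```

In this toy case the single high-revenue movie from "Studio Y" is excluded because one credit is too small a sample. A co-produced movie counts toward every company credited on it, so the averages for frequent co-producers overlap.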
Bottom line, it's risky business to get involved with something new without any experience. Microsoft did what it could by gathering the data needed to see how to get started in the movie business.
The analysis addressed the following questions: Question 1: What is the domestic average movie profit categorized by genre? Question 2: Is there an optimal runtime (in terms of profit) for movies domestically? If so, what is it? Question 3: Which production company(s) are most successful in terms of domestic profit and therefore should be used?
After analyzing the data, it seems clear that Microsoft should make animated movies around 150 minutes long using DreamWorks Animation as their production company.
Although Microsoft now has a better idea of which movies are most profitable, there are many other ideas that could be researched to get a clearer picture. One idea would be to analyze the MPAA ratings and see if there is a correlation between certain ratings and profit. Another would be to see if it's worthwhile to hire certain actors and actresses to gain popularity and, in turn, profit. These are just some ideas that I plan to explore in the future.


