---
title: "Codebook for Movies dataset"
output:
  html_document:
    fig_height: 4
    highlight: pygments
    theme: spacelab
---
## Data
The data set comprises 651 randomly sampled movies produced and
released before 2016.

Some of these variables are only there for informational purposes and do
not make any sense to include in a statistical analysis. It is up to you
to decide which variables are meaningful and which should be omitted. For
example, information in the `actor1` through `actor5` variables was used
to determine whether the movie casts an actor or actress who won a best
actor or actress Oscar.

You might also choose to omit certain observations or restructure some of
the variables to make them suitable for answering your research questions.
When you are fitting a model you should also be careful about collinearity,
as some of these variables may be dependent on each other.
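
One rough way to screen for collinearity is to look at pairwise correlations among the numeric variables before fitting a model. The sketch below uses base R only. The data frame `movies_toy`, its values, and the 0.8 cutoff are made up for illustration and are not taken from the actual data set; only the column names follow the codebook.

```r
# Toy data with made-up values (NOT the real movies data);
# column names follow the codebook.
movies_toy <- data.frame(
  imdb_rating    = c(5.5, 7.3, 8.1, 6.0, 4.9, 7.8),
  critics_score  = c(45, 96, 91, 60, 27, 88),
  audience_score = c(52, 89, 92, 58, 35, 85),
  runtime        = c(95, 120, 134, 101, 88, 110)
)

# Pairwise Pearson correlations among the candidate predictors.
cor_mat <- cor(movies_toy)

# List each pair once (upper triangle) whose absolute correlation
# exceeds an assumed cutoff of 0.8.
high <- which(abs(cor_mat) > 0.8 & upper.tri(cor_mat), arr.ind = TRUE)
high_pairs <- data.frame(
  var1 = rownames(cor_mat)[high[, "row"]],
  var2 = colnames(cor_mat)[high[, "col"]],
  r    = cor_mat[high]
)
print(high_pairs)
```

Pairs flagged this way are candidates for closer inspection, not automatic removal; which variable to keep depends on your research question.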
### Codebook
1. `title`: Title of movie
1. `title_type`: Type of movie (Documentary, Feature Film, TV Movie)
1. `genre`: Genre of movie (Action & Adventure, Comedy, Documentary, Drama, Horror, Mystery & Suspense, Other)
1. `runtime`: Runtime of movie (in minutes)
1. `mpaa_rating`: MPAA rating of the movie (G, PG, PG-13, R, Unrated)
1. `studio`: Studio that produced the movie
1. `thtr_rel_year`: Year the movie was released in theaters
1. `thtr_rel_month`: Month the movie was released in theaters
1. `thtr_rel_day`: Day of the month the movie was released in theaters
1. `dvd_rel_year`: Year the movie was released on DVD
1. `dvd_rel_month`: Month the movie was released on DVD
1. `dvd_rel_day`: Day of the month the movie was released on DVD
1. `imdb_rating`: Rating on IMDB
1. `imdb_num_votes`: Number of votes on IMDB
1. `critics_rating`: Categorical variable for critics rating on Rotten Tomatoes (Certified Fresh, Fresh, Rotten)
1. `critics_score`: Critics score on Rotten Tomatoes
1. `audience_rating`: Categorical variable for audience rating on Rotten Tomatoes (Spilled, Upright)
1. `audience_score`: Audience score on Rotten Tomatoes
1. `best_pic_nom`: Whether or not the movie was nominated for a best picture Oscar (no, yes)
1. `best_pic_win`: Whether or not the movie won a best picture Oscar (no, yes)
1. `best_actor_win`: Whether or not one of the main actors in the movie ever won an Oscar (no, yes) -- note that this is not necessarily whether the actor won an Oscar for their role in the given movie
1. `best_actress_win`: Whether or not one of the main actresses in the movie ever won an Oscar (no, yes) -- note that this is not necessarily whether the actress won an Oscar for her role in the given movie
1. `best_dir_win`: Whether or not the director of the movie ever won an Oscar (no, yes) -- note that this is not necessarily whether the director won an Oscar for the given movie
1. `top200_box`: Whether or not the movie is in the Top 200 Box Office list on BoxOfficeMojo (no, yes)
1. `director`: Director of the movie
1. `actor1`: First main actor/actress in the abridged cast of the movie
1. `actor2`: Second main actor/actress in the abridged cast of the movie
1. `actor3`: Third main actor/actress in the abridged cast of the movie
1. `actor4`: Fourth main actor/actress in the abridged cast of the movie
1. `actor5`: Fifth main actor/actress in the abridged cast of the movie
1. `imdb_url`: Link to IMDB page for the movie
1. `rt_url`: Link to Rotten Tomatoes page for the movie
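
Some of the variables listed above may be easier to work with after restructuring. The sketch below shows one possible approach in base R: combining the three theater release fields into a single date, encoding a yes/no indicator as a factor, and dropping informational columns. The data frame `movies_toy`, its rows, the URLs, and the new names `thtr_rel_date`, `info_cols`, and `movies_model` are hypothetical and exist only for this illustration.

```r
# Toy rows with made-up values (NOT the real movies data);
# column names follow the codebook.
movies_toy <- data.frame(
  title          = c("Movie A", "Movie B", "Movie C"),
  thtr_rel_year  = c(1999, 2007, 2013),
  thtr_rel_month = c(6, 11, 1),
  thtr_rel_day   = c(18, 2, 25),
  best_pic_nom   = c("no", "yes", "no"),
  imdb_url       = c("http://example.com/a",
                     "http://example.com/b",
                     "http://example.com/c"),
  stringsAsFactors = FALSE
)

# Combine year, month, and day into one Date column.
movies_toy$thtr_rel_date <- as.Date(ISOdate(
  movies_toy$thtr_rel_year,
  movies_toy$thtr_rel_month,
  movies_toy$thtr_rel_day
))

# Encode the yes/no indicator as a factor with "no" as the reference level.
movies_toy$best_pic_nom <- factor(movies_toy$best_pic_nom,
                                  levels = c("no", "yes"))

# Drop columns that are informational only before modeling.
info_cols <- c("title", "imdb_url")
movies_model <- movies_toy[, !(names(movies_toy) %in% info_cols)]
str(movies_model)
```

Which columns count as informational is a judgment call; the two dropped here are only an example.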