forked from beanumber/sds192-mp2
-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathmp2.Rmd
More file actions
191 lines (118 loc) · 11.5 KB
/
mp2.Rmd
File metadata and controls
191 lines (118 loc) · 11.5 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
---
title: "Mini Project 2"
author: "Audrey Bertin and Eva Gerstle"
date: "10/23/2017"
output: html_document
---
Political campaigns have many expenditures they must account for, including travel cost for candidates and staff, consulting/campaign advising, and of course advertising. For some campaigns, particularly high profile senate seats and presidential elections, these things are quite costly, and candidates have to find a way to raise money so that they can afford to keep their campaign going. A large portion of this money is collected through donations from individual citizens. Citizens can donate to committees, which then spend money on behalf of or against candidates.
For this project, we were interested in determining how donations to campaigns varied by the individual’s occupation and income level. Several specific questions we were interested in were:
• 1) While individuals in higher income jobs are expected across the board to donate a higher total amount to campaigns (e.g. ALL doctors combined will donate more than ALL social workers combined), but how does the median individual donation compare across occupations? Do their donations still appear as significant after considering income level?
• 2) Do individuals across different income levels tend to prefer certain political parties (e.g. do wealthy people as a whole donate to a different proportion of democratic/republican candidates than less-wealthy people), and is there significant variance in partisanship within an income level due to the type of occupation one is in?
To try and answer these questions, we analyzed political donations over 200 dollars made by individuals to committees. We selected 18 common occupations to consider: six that are relatively low income (20,000-60,000 a year), six middle income (60,000-100,000), and six high income (100,000 +). First we looked at the average and median size of the donation by occupation type, regardless of which candidate or party the donation was going to. Next we looked at which party those donations are going to. We looked only at the committees that are associated with one of the two major political parties–Democratic and Republican. Below are the occupations, and their respective financial support for each of the two major political parties.
We placed the occupations within three income brackets (20k-60k, 60k-100k, 100k+) according to each occupation’s average annual salary as recorded by the May 2012 National Occupational Employment and Wage Estimates. This data is collected by the [Bureau of Labor Statistics](https://www.bls.gov/oes/2012/may/oes_nat.htm).
### Loading The Data
```{r, message = FALSE, warning = FALSE}
library(tidyverse)
load("committees.rda")
load("contributions.rda")
load("individuals.rda")
```
### Join committees and individual donations to get a list of all donations to each committee
```{r}
selected_individuals <- individuals %>%
select(occupation, transaction_amt)
donations_to_committees <- committees %>%
select(cmte_id, cmte_name, cmte_party_affiliation) %>%
inner_join(individuals, by = "cmte_id")
#Selecting before joining here in order to speed up the knitting process
```
### Select some jobs to consider for studying
```{r}
jobs <- c("REGISTERED NURSE", "TEACHER", "SOCIAL WORKER", "FARMER", "CONSTRUCTION", "RANCHER", "REAL ESTATE", "ACCOUNTANT", "SOFTWARE ENGINEER", "MARKETING", "POLICE OFFICER", "CONTRACTOR", "CEO", "LAWYER", "EXECUTIVE", "ENGINEER", "DOCTOR", "ANESTHESIOLOGIST", "POLICE OFFICER")
```
### Calculate the size of donations for each of these occupations
```{r}
occupation_donations <- function(job) {
donations_to_committees %>%
filter(occupation == job) %>%
group_by(occupation) %>%
summarize(
total_donated = sum(transaction_amt),
avg_donated = mean(transaction_amt),
median_donation = median(transaction_amt),
num_donations = n())
}
#This function calculates information about all donations given to committees by individuals in any given occupation
donations_by_occupations <- lapply(jobs, FUN = occupation_donations) %>% bind_rows()
donations_by_occupations <- mutate(donations_by_occupations, total_donated_millions = total_donated/1000000)
#The addition of the millions column is just so that the graph we use later will be easier to read.
low_income = c("POLICE OFFICER", "TEACHER", "SOCIAL WORKER", "FARMER", "CONSTRUCTION", "RANCHER") #20K-60K avg income
med_income = c( "REAL ESTATE", "ACCOUNTANT", "SOFTWARE ENGINEER", "MARKETING", "REGISTERED NURSE", "CONTRACTOR") #60K-100K avg income
high_income = c("CEO", "LAWYER", "EXECUTIVE", "ENGINEER", "DOCTOR", "ANESTHESIOLOGIST") #100k+ avg income
donations_by_occupations <- donations_by_occupations %>%
mutate(income_level = ifelse(occupation %in% low_income, "$20K-60K", ifelse(occupation %in% med_income, "$60k-100k", "$100K +"))) #Add a new column that designates the income level of each occupation
donations_by_occupations$income_level <- factor(donations_by_occupations$income_level, levels = c("$20K-60K", "$60k-100k", "$100K +")) #Give the income levels a rank according to value instead of alphabetical order
```
#Plot the total sum of donations given by each of the 18 occupations.
In the plot below, we look at the total amount donated by each occupation in the 2012 election cycle. We found this plot to be useful in displaying the scale of money donated but recognize that different numbers of individuals donated in different occupations. Thus this plot does not show us much about the individual contributions themselves, but does give insight into which occupations have the largest monetary effect on candidate fundraising.
```{r}
total_plot <- ggplot(donations_by_occupations, aes(x= reorder(occupation,total_donated_millions), y=total_donated_millions, fill= income_level)) + geom_bar(stat = "identity") +
theme(axis.text.x = element_text(angle = 90, size = 7, hjust = 1)) + xlab(NULL) +
ylab("Total Donations (In Millions Of Dollars)") + scale_fill_discrete(name= "Income Level", breaks = c("$20K-60K", "$60k-100k", "$100K +"), labels = c("$20K-60K", "$60k-100k", "$100K +")) + ggtitle("Total Donations by Occupation")
total_plot
```
### Plot median donations by occupation
In the graph below we looked at the median individual donation across several occupations. We chose to use median rather than mean because we felt that the data was strongly impacted by outliers (specifically large donators) and we know that median is less affected by outliers.
```{r}
median_plot <- ggplot(donations_by_occupations, aes(x=reorder(occupation,total_donated_millions), y = median_donation, fill = income_level)) +
geom_bar(stat = "identity") +
theme(axis.text.x = element_text(angle = 90, hjust = 1) ) + scale_fill_discrete(name= "Income Level", breaks = c("$20K-60K", "$60k-100k", "$100K +"), labels = c("$20K-60K", "$60k-100k", "$100K +")) +
xlab(NULL) +
ylab("Median Donation (Dollars)")+
ggtitle("Median Individual Contributions by Occupation")
median_plot
#Note that this plot was ordered by total_donated_millions so that the occupations would be in the same order as in the last plot, and it would be clear which occupations had relatively high average donations compared to their total donations.
```
Across the occupations, there is much less variation in the median contribution than in total contributions. Many of the lower-income occupations still donate a significant amount of money to committees, indicating that they are contributing a proportionally higher amount of their income than some higher-income individuals.
### Find the partisanship by occupation
```{r}
committees_by_occupation <- function(job) {
donations_to_committees %>%
filter(occupation == job) %>%
group_by(occupation, cmte_name, cmte_id, cmte_party_affiliation) %>%
summarize(total_donations = sum(transaction_amt), num_donations = n()) %>%
arrange(desc(total_donations))
}
#This function finds all party-affiliated committees that each occupation donated to
all_committees_by_occupation <- lapply(jobs, FUN = committees_by_occupation) %>% bind_rows()
all_committees_by_occupation
occupation_partisanship <- all_committees_by_occupation %>%
group_by(occupation) %>%
summarize(num_democratic = sum(cmte_party_affiliation == "DEM"), num_republican = sum(cmte_party_affiliation == "REP"), favor = ifelse(num_democratic > num_republican, "Dem", "Rep"))
#Here, we find how many of those committees were Dem and Rep to determine which party each occupation favors and the numerical distribution showing that.
occupations_with_partisanship <- donations_by_occupations %>%
inner_join(occupation_partisanship, by = "occupation")
#We join the earlier information about donation size by occupation to donation partisanship so that all the info about each occupation is contained in one place
partisanship_gathered <- occupations_with_partisanship %>%
select(-favor) %>%
rename(D = num_democratic, R = num_republican) %>%
gather(key = "party", value = "num_committees", D:R)
#Since we originally had to create two columns in the mutate function to count the numbers of dem and rep parties, we have to use a gather here to make the data tidy and plottable.
partisanship_gathered$party <- factor(partisanship_gathered$party, levels = c("D", "R"))
```
### Plot the partisanship by occupation
Here we graphed the partisanship of donations by occupation held by donator. We were interested in seeing whether certian jobs were linked with more Republican or Democratic support.
```{r}
ggplot(partisanship_gathered, aes(x = occupation, y = num_committees, fill = party)) +
geom_bar(stat = "identity", position = "Fill") +
geom_hline(yintercept = 0.5)+
facet_wrap(~income_level, scales = "free_x") +
xlab(NULL) +
ylab("Percentage of Committees Supported") +
theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
ggtitle("Individual Donations by Income Level and Occupation") +
scale_fill_manual(values = c("Blue", "Red"),name= "Committee Affiliation", breaks = c("D","R"), labels = c("Democratic", "Republican"))
```
We found that specific occupations tended to be more Democratic in their donations such as Social Workers and Software Engineers. As for tending more Republican (at least in the way they spend their money) we have Constrution Workers, Ranchers, and Anesthesiologists. However, we cannot conclude anything about the partisanship of different income brackets, but if we looked at more occupations perhaps such a conclusion could be drawn.
From this group of 18 occupations, there does not appear to be a significant enough difference in overall partisanship between each income level to determine an overall trend, but there are some interesting jobs that particularly stand out. Social workers are the most democratic occupation by far, followed by software engineers, and anesthesiologists are particularly republican within their income category. It appears that the jobs that are more focused on the hard labor industry (construction and the related contractor, farmer, and rancher) are also more republican than their counterparts within their income level.
Overall, we've seen that 1) The most money given to committees comes from high-income individuals, and so those high-income occupations have the most impact on candidate fundraising, 2) Lower and middle income workers tend to give a median donation that is very close to many high-income workers, meaning that they are actually donating *more* relative to their income level, and 3) There is no clear trend in partisanship across the income levels of these 18 occupations, but hard-labor oriented jobs did tend to be more republican than other jobs within the same income bracket.