---
title: "Auto-mpg"
output:
  html_document:
    df_print: paged
---
```{r}
library(gtsummary)
library(dplyr)
library(ggplot2)
```
```{r}
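# Import the dataset, treating "?" (assumed to mark missing horsepower values) as NA
df <- read.table("auto-mpg.data", na.strings = "?",
                 col.names = c("mpg", "cylinders", "displacement", "horsepower", "weight",
                               "acceleration", "model_year", "origin", "car_name"))
# Make sure horsepower is numeric
df$horsepower <- as.numeric(df$horsepower)
head(df)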
```
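
The import above depends on how missing values are encoded in the file. As a minimal sketch, assuming missing horsepower entries appear as `?` (the three rows and car names below are made up for illustration and are not taken from the dataset), `na.strings` lets `read.table` parse the column as numeric directly:

```r
# Made-up sample rows in the same whitespace-separated layout (hypothetical values)
raw <- '18.0 8 307.0 130.0 3504 12.0 70 1 "car a"
25.0 4 98.0 ? 2046 19.0 71 1 "car b"
21.0 6 200.0 85.0 2587 16.0 70 1 "car c"'

cols <- c("mpg", "cylinders", "displacement", "horsepower", "weight",
          "acceleration", "model_year", "origin", "car_name")

# na.strings = "?" turns the placeholder into NA, so horsepower is read as numeric
demo <- read.table(text = raw, na.strings = "?", col.names = cols)

str(demo$horsepower)          # numeric vector containing one NA
sum(is.na(demo$horsepower))   # number of missing horsepower values
```

Without `na.strings`, the `?` would force the whole column to be read as text, and the later `as.numeric()` call would produce `NA` with a coercion warning.
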
# Problem 1: Table
```{r}
df2 <- df %>% select(-car_name)
# Summarise each variable by origin, with an overall column and statistic labels
df2 %>%
  tbl_summary(by = origin, missing = "no") %>% add_overall() %>% add_stat_label()
```
# Problem 2: Bad Figures
```{r}
ggplot(df, aes(x = as.factor(cylinders), fill = as.factor(cylinders))) +
  geom_bar() + ylab("Number of cars") + xlab("Number of cylinders") + ggtitle("Most common cylinder counts")
```
The problem: the plot is technically correct, but it is not aesthetically pleasing. The fill colors are bright and carry no information, because they only repeat the cylinder count already shown on the x-axis, and the legend they produce is redundant. The background grid is also too prominent.
```{r}
df$model_year <- as.factor(df$model_year)
# Total mpg across all cars in each model year
grp_by_outcome <- df %>%
  group_by(model_year) %>%
  summarise(mpg = sum(mpg))
ggplot(grp_by_outcome, aes(x = model_year, y = mpg)) +
  geom_col() + ggtitle("Total miles per gallon by model year") + theme_void() + theme(legend.position = "none")
```
The problem: `theme_void()` removes the axes, so without an explicit x- or y-axis scale the values represented by the bars cannot be read off the plot.
# Problem 3: Good Figures
```{r}
ggplot(df, aes(x = as.factor(cylinders))) +
  geom_bar() + ylab("Number of cars") + xlab("Number of cylinders") + ggtitle("Most common cylinder counts")
```
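
With only an `x` aesthetic, `geom_bar()` draws one bar per category, and each bar's height is the number of rows in that category. A quick way to sanity-check the bar heights is to tabulate the same variable. The sketch below uses R's built-in `mtcars` data as a stand-in (an assumption made so the example runs without `auto-mpg.data`):

```r
# mtcars ships with base R; its cyl column plays the role of cylinders here
counts <- table(mtcars$cyl)
print(counts)

# The category with the tallest bar in the corresponding bar chart
names(counts)[which.max(counts)]
```

Running `table(df$cylinders)` on the real data should give the same numbers that the bar chart displays.
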
```{r}
# Total mpg across all cars in each model year
grp_by_outcome <- df %>%
  group_by(model_year) %>%
  summarise(mpg = sum(mpg))
ggplot(grp_by_outcome, aes(x = model_year, y = mpg)) +
  geom_col() + ggtitle("Total miles per gallon by model year")
```
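
One caveat about the yearly aggregation: a per-year sum of mpg depends on how many cars were recorded in each year, whereas a per-year mean does not. A minimal base-R sketch with made-up numbers (the `toy` data frame is hypothetical) shows the difference:

```r
# Hypothetical data: every car has the same mpg, but year 71 has more cars
toy <- data.frame(
  model_year = c(70, 70, 71, 71, 71, 71),
  mpg        = c(20, 20, 20, 20, 20, 20)
)

yearly_sum  <- tapply(toy$mpg, toy$model_year, sum)
yearly_mean <- tapply(toy$mpg, toy$model_year, mean)

yearly_sum    # differs between years only because the car counts differ
yearly_mean   # identical across years
```

If the goal is to compare fuel efficiency across years, replacing `sum(mpg)` with `mean(mpg)` in `summarise()` would be one option to consider.
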