---
title: "Control flow in R, the fundamentals"
author: "Ian Dworkin"
date: "`r format(Sys.time(),'%d %b %Y')`"
output:
  html_document:
    toc: yes
    number_sections: yes
    keep_md: yes
editor_options:
  chunk_output_type: console
---


# Worked examples using control flow

```{r}
knitr::opts_chunk$set(eval = FALSE)
```

**NOTE:** The chunk above turns off evaluation of code blocks by default (`eval = FALSE`). I did this to render this activity. When you run the code yourself you may need to change this setting.


## if else

Here is an example of using `if` and `else` to perform a different computation (or in this case, produce different output) depending on the input.

```{r}
p_test <- function(p){
    if (p <= 0.05) print ("yeah!!!!")
        else if (p >= 0.9) print ("high!!!!")
        else print ("somewhere in the middle")
}
```


Try some numbers
```{r}
p_test(0.5)
```


Try a random number from a uniform distribution on $[0,1]$ using `runif`

```{r}
(test_val_1 <- runif(n = 1))
p_test(test_val_1)
```


Often, though, we have not a single number but a vector of numbers that we need to evaluate and make decisions with. Let's try it with 100 random numbers.

```{r}
(test_val_2 <- runif(n = 100))
```


```{r}
p_test(test_val_2)
```


What does it not like?

`if` and `else` are not vectorized: `if` expects a single `TRUE` or `FALSE` as its condition, not a vector of 100 of them.
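
To make the failure concrete, here is a minimal sketch (the names `cond` and `result` are made up for illustration). It wraps the `if` in `tryCatch` because the behaviour depends on your R version: since R 4.2.0 a condition of length greater than one is an error, while earlier versions used only the first element and issued a warning.

```{r}
# Three p-values give a logical vector of length 3, not a single TRUE/FALSE
cond <- c(0.01, 0.5, 0.95) <= 0.05
length(cond)

# Catch whichever condition this version of R signals
result <- tryCatch(
  if (cond) "small" else "not small",
  error   = function(e) "error",
  warning = function(w) "warning"
)
result

# If you really want one decision for the whole vector,
# collapse it to length one first with any() or all()
if (any(cond)) "at least one small value" else "no small values"
```

`any()` and `all()` answer a different question (one decision for the whole vector), so they are not a substitute for an element-by-element decision.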

What are our options?

(`for` loop, `sapply`, `ifelse`)

```{r}
p_test_2 <- function(p){
    ifelse(test = p <= 0.05,
           yes = print("yippee"),
           no = print("bummer")) }
```

```{r}
p_test_2(test_val_2)
```

How many times were we happy or sad?
```{r}
table(p_test_2(test_val_2))
```
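
If all you need is the count, you can skip the labels and sum the logical comparison directly, since `TRUE` counts as 1 and `FALSE` as 0. This is only a sketch: `p_demo` is a hypothetical stand-in for `test_val_2`, and the seed is arbitrary.

```{r}
set.seed(720)                      # arbitrary seed, for reproducibility only
p_demo <- runif(n = 100)           # hypothetical stand-in for test_val_2

n_pass    <- sum(p_demo <= 0.05)   # number of values at or below the threshold
prop_pass <- mean(p_demo <= 0.05)  # the same thing as a proportion

c(n_pass = n_pass, prop_pass = prop_pass)
```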

In genomics we sometimes need to assess exactly this: how many of our genomic features "pass" some threshold?
```{r}
feature_tests <- runif(n = 17300, min = 0, max = 1)
```


Via a `for` loop:
```{r}
# Pre-allocate a character vector, then fill it one element at a time
p_vals_out <- rep(NA_character_, length(feature_tests))

for (i in seq_along(p_vals_out)) {
  p_vals_out[i] <- p_test(feature_tests[i])
}
```
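
One caveat when indexing loops: `1:length(x)` misbehaves when `x` is empty, because `1:0` counts down and gives `c(1, 0)`. `seq_along(x)` gives an empty sequence instead, so the loop body never runs. A small sketch, with made-up names (`empty`, `n_iter_colon`, `n_iter_seq`):

```{r}
empty <- numeric(0)

1:length(empty)     # 1 0
seq_along(empty)    # integer(0)

# Count how many times each style of loop runs on the empty vector
n_iter_colon <- 0
for (i in 1:length(empty)) n_iter_colon <- n_iter_colon + 1

n_iter_seq <- 0
for (i in seq_along(empty)) n_iter_seq <- n_iter_seq + 1

c(colon = n_iter_colon, seq_along = n_iter_seq)  # 2 and 0
```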


Via `sapply`:
```{r}
p_vals_sapply_out <- sapply(feature_tests, p_test)
```

As you can see, we really should rewrite the functions so that they return their result instead of printing it to the screen.


```{r}
p_test_V2 <- function(p){
    if (p <= 0.05) x <- "yeah!!!!"
        else if (p >= 0.9) x <- "high!!!!"
        else x <- "somewhere in the middle"
    return(x)
}

p_test_V2(0.5)
```

Clean output!
```{r}
p_vals_sapply_out <- sapply(feature_tests, p_test_V2)
```
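
`sapply` guesses the type of its result. If you want R to check it for you, `vapply` is a stricter alternative: you state what each call should return, and it stops with an error if a call returns anything else. The sketch below is self-contained, so `classify_p` and `p_demo` are hypothetical stand-ins for `p_test_V2` and `feature_tests`.

```{r}
# Hypothetical stand-in for p_test_V2 so this chunk runs on its own
classify_p <- function(p) {
  if (p <= 0.05) "yeah!!!!"
  else if (p >= 0.9) "high!!!!"
  else "somewhere in the middle"
}

set.seed(720)               # arbitrary seed, for reproducibility only
p_demo <- runif(n = 1000)   # hypothetical stand-in for feature_tests

# FUN.VALUE = character(1): every call must return exactly one string
labels <- vapply(p_demo, classify_p, FUN.VALUE = character(1))

table(labels)
```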


```{r}
p_test_3 <- function(p){
    ifelse(test = p <= 0.05,
           yes = "yippee",
           no =  "bummer") }

p_test_3(feature_tests)
```
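
The `ifelse` version above only has two outcomes. One possible way to recover all three categories of the original `p_test` while staying vectorized is to nest a second `ifelse` in the `no` argument. This is a sketch, and `p_test_4` is a name made up for it.

```{r}
p_test_4 <- function(p) {
  ifelse(test = p <= 0.05,
         yes  = "yeah!!!!",
         no   = ifelse(test = p >= 0.9,
                       yes  = "high!!!!",
                       no   = "somewhere in the middle"))
}

p_test_4(c(0.01, 0.5, 0.95))
# "yeah!!!!" "somewhere in the middle" "high!!!!"
```

Nesting gets hard to read beyond two or three levels, so for many categories a different tool (for example `cut`) may be a better fit.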