---
output: html_notebook
---
# 4.4 For-loops
The formula for the sum of the series $1+2+\cdots+n$ is $n(n+1)/2$. What if we weren't sure that was the right formula? How could we check? Using what we learned about functions, we can create one that computes $S_n$:
```{r}
compute_s_n <- function(n){
  x <- 1:n
  sum(x)
}
```
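As a quick sanity check before looping, the function can be compared with the closed-form formula at a single value. This is a minimal sketch; the variable `k` and the choice of 100 are illustrative and not from the original text.

```{r}
compute_s_n <- function(n){
  x <- 1:n
  sum(x)
}

k <- 100            # illustrative value, chosen arbitrarily
compute_s_n(k)      # sum computed term by term
k*(k+1)/2           # value predicted by the formula
```

Both lines should print the same number (5050 for `k = 100`). A single match does not prove the formula, so checking many values of $n$ is still worthwhile.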
How can we compute $S_n$ for various values of $n$, say $n = 1,\dots,25$? Do we write 25 lines of code calling `compute_s_n`? No, that is what for-loops are for in programming. In this case, we are performing exactly the same task over and over, and the only thing that changes is the value of $n$. For-loops let us define the range that our variable takes (in our example $n = 1,\dots,25$), then change the value and evaluate the expression as we _loop_.
Perhaps the simplest example of a for-loop is this useless piece of code:
```{r}
for(i in 1:5){
  print(i)
}
```
Here is the for-loop we would write for our $S_n$ example:
```{r}
m <- 25
s_n <- vector(length = m) # create an empty vector of length m
for(n in 1:m){
  s_n[n] <- compute_s_n(n)
}
```
In each iteration ($n = 1$, $n = 2$, and so on), we compute $S_n$ and store it in the $n$th entry of `s_n`.
Now we can create a plot to search for a pattern:
```{r}
n <- 1:m
plot(n, s_n)
```
If you noticed that it appears to be a quadratic, you are on the right track because the formula is $n(n+1)/2$, which we can confirm with a table:
```{r}
head(data.frame(s_n = s_n, formula = n*(n+1)/2))
```
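The table shows only the first six rows. One way to check all 25 values at once, sketched here with base R's `all`, is to compare the two vectors element by element. The block rebuilds `s_n` so that it runs on its own.

```{r}
compute_s_n <- function(n){
  x <- 1:n
  sum(x)
}

m <- 25
n <- 1:m
s_n <- vector("numeric", m)
for(i in n){
  s_n[i] <- compute_s_n(i)
}

all(s_n == n*(n+1)/2)   # TRUE only if every entry matches the formula
```

Exact comparison with `==` is reasonable here because every value involved is a small whole number.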
We can also overlay the two results by using the function `lines` to draw a line over the previously plotted points:
```{r}
plot(n, s_n)
lines(n, n*(n+1)/2)
```
# 4.5 Vectorization and functionals
Although for-loops are an important concept to understand, in R we rarely use them. As you learn more R, you will realize that _vectorization_ is preferred over for-loops since it results in shorter and clearer code. We already saw examples in the Vector Arithmetic Section. A _vectorized_ function is a function that applies the same operation to each element of a vector.
```{r}
x <- 1:10
sqrt(x)
y <- 1:10
x*y
```
To make this calculation, there is no need for for-loops. However, not all functions work this way. For instance, the function we just wrote, `compute_s_n`, does not work element-wise since it is expecting a scalar. This piece of code does not run the function on each entry of `n`:
```{r}
n <- 1:25
compute_s_n(n)
```
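For this particular sum, base R also offers a vectorized route: `cumsum` returns the running total of a vector, so entry $n$ of `cumsum(1:25)` is $S_n$. This is an aside rather than the approach developed in this section, and the name `s_cum` is only illustrative.

```{r}
n <- 1:25
s_cum <- cumsum(n)          # running totals: 1, 1+2, 1+2+3, ...
head(s_cum)
all(s_cum == n*(n+1)/2)     # compare with the closed-form formula
```

This shortcut works only because a running sum has a built-in vectorized function. For an arbitrary scalar function, a more general tool is needed.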
_Functionals_ are functions that help us apply the same function to each entry in a vector, matrix, data frame or list. Here we cover the functional that operates on numeric, logical and character vectors: `sapply`.
The function `sapply` lets us apply any function element-wise. Here is how it works:
```{r}
x <- 1:10
sapply(x, sqrt)
```
Each element of `x` is passed on to the function `sqrt` and the result is returned. These results are concatenated. In this case, the result is a vector of the same length as the original `x`. This implies that the for-loop above can be written as follows:
```{r}
n <- 1:25
s_n <- sapply(n, compute_s_n)
plot(n, s_n)
```
Other functionals are `apply`, `lapply`, `tapply`, `mapply`, `vapply`, and `replicate`. We mostly use `sapply`, `apply`, and `replicate` in this book, but we recommend familiarizing yourself with the others as they can be very useful.
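To give a feel for how a few of these differ, here is a small sketch using base R only. `lapply` always returns a list, `vapply` behaves like `sapply` but requires a template describing each result, and `replicate` evaluates an expression a fixed number of times. The inputs and the names `res_list`, `res_vec`, and `res_rep` are arbitrary illustrations.

```{r}
x <- c(1, 4, 9)

res_list <- lapply(x, sqrt)              # list with one element per entry of x
res_vec  <- vapply(x, sqrt, numeric(1))  # numeric vector; each result must be a single number
res_rep  <- replicate(3, sum(1:4))       # evaluates sum(1:4) three times

is.list(res_list)
res_vec
res_rep
```

The template argument makes `vapply` stop with an error when a result has an unexpected type or length, which can help catch mistakes early.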
# 4.6 Exercises
1. What will this conditional expression return? `Not all positives`
```{r}
x <- c(1,2,-3,4)
if(all(x > 0)){
  print("All Positives")
} else {
  print("Not all positives")
}
```
2. Which of the following expressions is always `FALSE` when at least one entry of a logical vector `x` is `TRUE`?
A. `all(x)`
B. `any(x)`
C. `any(!x)`
**D. `all(!x)`**
```{r}
x <- c(TRUE, FALSE, FALSE)
all(x)
any(x)
any(!x)
all(!x)
```
```{r}
y <- c(TRUE, TRUE)
all(y)
any(y)
any(!y)
all(!y)
```
3. The function `nchar` tells you how many characters long each element of a character vector is.
Write a line of code that assigns to the object `new_names` the state abbreviation when the state name is longer than 8 characters.
```{r}
library(dslabs)
data(murders)
new_names <- ifelse(nchar(murders$state)>8, murders$abb, murders$state)
new_names
```
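Because `ifelse` and `nchar` are both vectorized, the same line works on any pair of character vectors of equal length. The sketch below uses made-up toy vectors (`toy_state`, `toy_abb`, and `toy_names` are hypothetical names, not part of `dslabs`) so the logic can be checked by eye.

```{r}
toy_state <- c("Alabama", "North Carolina", "Texas")
toy_abb   <- c("AL", "NC", "TX")

# keep short names; swap in the abbreviation when the name exceeds 8 characters
toy_names <- ifelse(nchar(toy_state) > 8, toy_abb, toy_state)
toy_names
```

Only "North Carolina" has more than 8 characters, so it is the only entry replaced by its abbreviation.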
4. Create a function `sum_n` that for any given value, say $n$, computes the sum of the integers from 1 to n (inclusive). Use the function to determine the sum of integers from 1 to 5,000.
```{r}
sum_n <- function(n){
  x <- 1:n
  sum(x)
}
sum_n(5000)
```
5. Create a function `altman_plot` that takes two arguments, `x` and `y`, and plots the difference against the sum.
```{r}
altman_plot <- function(x, y){
  plot(x + y, y - x) # sum on the horizontal axis, difference on the vertical axis
}
```
6. After running the code below, what is the value of `x`?
```{r}
x <- 3
my_func <- function(y){
  x <- 5
  y + 5
}
x
```
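The code above defines `my_func` but never calls it. A short sketch shows that even calling it leaves the global `x` unchanged, because the assignment `x <- 5` creates a variable local to the function. The argument value 1 and the name `result` are arbitrary.

```{r}
x <- 3
my_func <- function(y){
  x <- 5    # local to the function; the global x is untouched
  y + 5
}
result <- my_func(1)
result   # 6
x        # still 3
```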
7. Write a function `compute_s_n` that for any given $n$ computes the sum $S_n = 1^2 + 2^2 + 3^2 + \dots + n^2$. Report the value of the sum when $n = 10$.
```{r}
compute_s_n <- function(n){
  x <- 1:n
  sum(x^2)
}
compute_s_n(10)
```
8. Define an empty numerical vector `s_n` of size 25 using `s_n <- vector("numeric", 25)` and store in it the results $S_1, S_2, \dots, S_{25}$ using a for-loop.
```{r}
s_n <- vector("numeric", 25)
n <- 1:25
for(i in n){
  s_n[i] <- compute_s_n(i)
}
s_n
```
9. Repeat exercise 8, but this time use `sapply`.
```{r}
s_n <- sapply(n, compute_s_n)
s_n
```
10. Repeat exercise 8, but this time use `map_dbl`.
```{r}
library(purrr) # provides map_dbl
s_n <- map_dbl(n, compute_s_n)
s_n
```
11. Plot $S_n$ versus $n$. Use points defined by $n = 1,...,25$.
```{r}
plot(n, s_n)
```
12. Confirm that the formula for this sum is $S_n = n(n + 1)(2n + 1)/6$.
```{r}
identical(s_n, n*(n+1)*(2*n+1)/6)
```
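`identical` is strict: it returns `TRUE` only when both the values and the storage types match exactly. When comparing computed numbers, a tolerance-based check with `all.equal` is often a safer choice. The sketch below rebuilds `s_n` with an anonymous function so that it runs on its own.

```{r}
n <- 1:25
s_n <- sapply(n, function(k) sum((1:k)^2))   # sums of squares S_1, ..., S_25

isTRUE(all.equal(s_n, n*(n+1)*(2*n+1)/6))    # equal up to a small numerical tolerance
identical(1:3, c(1, 2, 3))                   # FALSE: integer versus double storage
```

The second line illustrates why `identical` can report `FALSE` for vectors that print the same way.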