-
Notifications
You must be signed in to change notification settings - Fork 1
/
00_01_3_IntroToR_practical.qmd
189 lines (131 loc) · 4.71 KB
/
00_01_3_IntroToR_practical.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
---
title: "P01. Introduction to R, part 3"
---
**A. Read in the same data as before**
This file can be downloaded from [here](data/TB_stats.txt).
```{r eval = F}
myTBdata <- read.table("TB_stats.txt", header=TRUE)
```
**B. Plot the mortality in HIV negative against HIV positive** check the plot function help file
```{r eval = F}
?plot
```
`plot` is a generic function, and depending on what type of data you pass the function R will use different sub-functions (you dont need to worry about how it handles this!).
```{r eval = F}
# make the plot
plot(x=myTBdata$HIV_neg_TB_mortality, y=myTBdata$HIV_pos_TB_mortality)
```
**C. Add meaningful axes labels**
```{r eval = F}
plot(x=myTBdata$HIV_neg_TB_mortality,
y=myTBdata$HIV_pos_TB_mortality,
xlab="Mortality in HIV negative people",
ylab="Mortality in HIV positive people")
```
**D. Add a meaningful title**
```{r eval = F}
plot(x=myTBdata$HIV_neg_TB_mortality,
y=myTBdata$HIV_pos_TB_mortality,
xlab="Mortality in HIV negative people",
ylab="Mortality in HIV positive people",
main="Comparison of mortality in HIV negative and positive")
```
**E. Change the colour of the points to red**
```{r eval = F}
plot(x=myTBdata$HIV_neg_TB_mortality,
y=myTBdata$HIV_pos_TB_mortality,
xlab="Mortality in HIV negative people",
ylab="Mortality in HIV positive people",
main="Comparison of mortality in HIV negative and positive",
col="red")
```
**F. It's hard to see the numbers because some are small and some very large** Using a log scale is useful for that.You can either log the values and re-plot, or use the log option in plot()
```{r eval = F}
plot(x=myTBdata$HIV_neg_TB_mortality,
y=myTBdata$HIV_pos_TB_mortality,
xlab="Mortality in HIV negative people",
ylab="Mortality in HIV positive people",
main="Comparison of mortality in HIV negative and positive",
col="red",
log="xy")
```
**G. Now let's make a different kind of plot** Show the distribution of Total_TB_mortality in a histogram and then change the x axis label. Note, same options as before.
```{r eval = F}
hist(myTBdata$Total_TB_mortality)
hist(myTBdata$Total_TB_mortality,
xlab="Number")
```
Now add a meaningful title to the plot
```{r eval = F}
# Add your code here
```
**H. Check what other aspects of the histogram you can change**
```{r eval = F}
?hist
```
Then change the color to "blue" in the last plot
```{r eval = F}
hist(myTBdata$Total_TB_mortality,
xlab="Number",
main="Total TB mortality",
col="blue")
```
**I. Now let's plot a histogram of mortality per 1000**
**Hint:** calculate it as in the previous practical
```{r eval = F}
# Add your code here
```
And add a title, and x axis label
```{r eval = F}
# Add your code here
```
change the color to something different hint: to find more colours, run "colors()" or google "Colors in R"
```{r eval = F}
# Add your code here
```
**J. Now let's show both histograms at the same time** you need to make a call to "par", short for parameters, setting the plot parameter "mfrow" (Multi-Figure ROW-wise) gives 1 row, and 2 columns of plot
```{r eval = F}
par(mfrow=c(1,2))
hist(myTBdata$Total_TB_mortality,
xlab="Number",
main="Total TB mortality",
col="blue")
```
cut and paste your plot code from I. here and run it. Then resize the plot window and see what happens
**K. Export the figure and save it as a PNG with a useful name**
**Hint:** use the Export button in the plot window
**L. R has functions for every kind of plot** for example:
```{r eval = F}
?barplot
?boxplot
?contour
```
and stackoverflow.com has a lot of comments and help on every kind of plot
**Advanced plotting exercises**\
Make a plot where:
- x=Total_TB_mortality and y=HIV_pos_TB_mortality
- both aces are on the on the log scale
- colour the points
- add axis labels and a title
```{r eval = F}
# Add your code here
```
add HIV_neg_TB_mortality on the same y axis, in a different colour.
**Hint:** use points(). see `?points` for information
```{r eval = F}
# Add your code here
```
do you need to change the y axis label? i.e. does it still make sense now that it shows negative and positive mortality?
Answer:
Some of the points no longer fit on the graph. Why is this? You need to alter the y limit (ylim), which is an option of plot. What value will you choose?
**Hint:** the maximum value that the data go to change the ylim of the plot.
```{r eval = F}
# Add your code here
```
The plot now has 2 data sets in different colours, so it needs a legend check the help of legend (there's a lot of options!)
**Hint:** use x="topright" instead of setting the x and y values for location.
**Hint:** use the option "fill" to change the colours
```{r eval = F}
# Add your code here
```
The solutions can be accessed [here](00_01_3_IntroToR_solutions.qmd).