Home
Well, it's absolutely awesome that this exists, and there's a huge quantity of information. The insight I've always had as a student is that the best pedagogical approach is to structure the learning around a real, running example. This way you can branch out to the concepts everyone will need (vectors, data frames, reading data, running analyses ...) but always return to the larger context, so you have an end-to-end research story. This is what none of the books and online guides do, and it's what will work! It would also provide a scaffold for seminars/workshops, as students can apply the techniques they've learnt (or are learning) in the template study to their own research, whatever stage it's at.
I think it's worth introducing the concept of a use case. It's useful as a way to think about problem solving during analysis, and a way of teaching how you get from a problem statement to a good solution pattern. I write this now, as I have a good example of a simple, practical problem I want to solve, and a total inability to match the best pattern in R (despite all of my other language experience, including SQL):
I'm compiling a table to compare meditation studies which use the ANT. (As a psychologist!) I want to know the mean length of time my meditation group spent meditating (over their 4 weeks of training). I have the weekly totals for each participant in a data frame. However, each total is repeated on every row, since I left_joined the df with the corresponding ANT data.
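One way the duplicated-totals problem above might be handled is to collapse the joined data back to one row per participant-week before averaging. This is only a sketch: the data frame `joined` and the columns `participant`, `week`, `minutes` and `rt` are hypothetical names, the values are made up, and only two weeks are shown rather than four to keep it short.

```r
library(dplyr)

# Hypothetical joined data: one row per ANT trial, with each participant's
# weekly meditation total (minutes) repeated on every trial row.
joined <- data.frame(
  participant = c(1, 1, 1, 1, 2, 2, 2, 2),
  week        = c(1, 1, 2, 2, 1, 1, 2, 2),
  minutes     = c(60, 60, 90, 90, 30, 30, 50, 50),
  rt          = c(512, 498, 530, 475, 601, 577, 560, 549)
)

per_participant <- joined %>%
  distinct(participant, week, minutes) %>%  # undo the repetition from the join
  group_by(participant) %>%
  summarise(total_minutes = sum(minutes), .groups = "drop")

# Group mean of the per-participant totals
mean_minutes <- mean(per_participant$total_minutes)
```

The key step is `distinct()`: summing `minutes` straight from the joined data would count each weekly total once per trial row.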
- expand.grid(...) %>% rowwise() %>% do(., foo(...)): it would take you years to find this pattern without a mentor, yet it's something you're likely to need with every data set. The use case: as a psychologist, I want to pre-process the data collected for each participant.
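A minimal sketch of the pattern named above, assuming dplyr is installed. Here `foo()` is a hypothetical stand-in for whatever per-row processing you need, and the grid columns `a` and `b` are invented. Note that `do()` is marked as superseded in recent dplyr releases, so treat this as an illustration of the idea rather than the recommended current idiom.

```r
library(dplyr)

# Hypothetical per-row function: takes one combination of inputs and
# returns a one-row data frame with the result.
foo <- function(a, b) {
  data.frame(a = a, b = b, product = a * b)
}

result <- expand.grid(a = 1:2, b = c(10, 20)) %>%  # every combination of a and b
  rowwise() %>%                                    # treat each row as its own group
  do(foo(.$a, .$b))                                # call foo() once per row, bind results
```

In a real analysis the grid might hold participant IDs and session numbers, with `foo()` reading and cleaning the matching file.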
I know a bit about this, having refactored code into a package. I would be fairly confident explaining the 'whys' and 'hows' of this.
Following on from our conversation about "Table 1", I think it would be nice to have an early session on R Markdown and experimental design. Rmd is essential but easy, and there's a lot of value in turning your proposed design into a template which:
- You can populate with simulation data
- You will ultimately populate with real data
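As a rough illustration of populating such a template with simulation data, the chunk below builds a placeholder two-group design and a "Table 1"-style summary in base R. The group names, sample size, means and standard deviation are all invented for the sketch; in a real Rmd template they would come from your own design and predictions.

```r
set.seed(1)  # fixed seed so the placeholder data are reproducible

n_per_group <- 20  # hypothetical sample size
design <- data.frame(
  participant = seq_len(2 * n_per_group),
  group       = rep(c("control", "meditation"), each = n_per_group)
)

# Placeholder outcome: the means and SD are invented, not taken from any study.
design$score <- rnorm(
  nrow(design),
  mean = ifelse(design$group == "meditation", 90, 100),
  sd   = 15
)

# "Table 1"-style summary; tapply() orders groups alphabetically.
table1 <- data.frame(
  group = c("control", "meditation"),
  n     = as.vector(tapply(design$score, design$group, length)),
  mean  = as.vector(tapply(design$score, design$group, mean)),
  sd    = as.vector(tapply(design$score, design$group, sd))
)
```

When the real data arrive, only the code that creates `design` needs to change; the summary code stays as it is.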
You could do the same thing with ggplot(), so that you have a record of your predictions in graphical form and can compare them with your final data. This is a great way to get a heads up on all of the analyses you need to understand, i.e. it creates a "view" into the complete syllabus, which is going to be of particular interest to you when you're writing up your study.
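A small sketch of recording a prediction graphically, assuming ggplot2 is installed. The data frame `predicted`, its columns and the numbers in it are hypothetical placeholders standing in for whatever pattern your hypotheses imply.

```r
library(ggplot2)

# Invented predicted means for a two-group, pre/post design (not real data).
predicted <- data.frame(
  group = rep(c("control", "meditation"), each = 2),
  time  = factor(rep(c("pre", "post"), times = 2), levels = c("pre", "post")),
  score = c(100, 100, 100, 90)
)

p <- ggplot(predicted, aes(x = time, y = score, colour = group, group = group)) +
  geom_line() +
  geom_point() +
  labs(
    title = "Predicted pattern (placeholder values)",
    x     = "Session",
    y     = "Outcome score"
  )
```

Swapping `predicted` for the observed group means later would give a plot on the same axes, which makes the comparison with the prediction direct.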
This goes beyond the Exeter MSc stats syllabus in that it shows you the best toolchain for doing proper science in practice, i.e. it puts your hypotheses, research diary, protocol etc. in a single, open place (GitHub).