Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

merge #24

Open
wants to merge 18 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
18 commits
Select commit Hold shift + click to select a range
84927a9
Merge remote-tracking branch 'refs/remotes/InseadDataAnalytics/master'
anadeandres Jan 7, 2016
6f841c6
Merge pull request #1 from InseadDataAnalytics/master
anadeandres Jan 7, 2016
2475197
Merge branch 'master' of https://github.com/anadeandres/INSEADAnalytics
anadeandres Jan 7, 2016
29fbfaa
results of the exercise
anadeandres Jan 7, 2016
2f5017f
Merge branch 'master' of https://github.com/anadeandres/INSEADAnalytics
anadeandres Jan 7, 2016
39358b8
Renamed exercise
anadeandres Jan 12, 2016
b30929a
changes
anadeandres Jan 12, 2016
783b4bb
Merge remote-tracking branch 'refs/remotes/origin/master' into Insead…
anadeandres Jan 12, 2016
f3d1341
Merge remote-tracking branch 'refs/remotes/InseadDataAnalytics/master'
anadeandres Jan 12, 2016
e3c12f9
Merge remote-tracking branch 'refs/remotes/InseadDataAnalytics/master'
anadeandres Jan 20, 2016
79400c5
Merge remote-tracking branch 'refs/remotes/origin/master' into Insead…
anadeandres Jan 20, 2016
7f57f89
Exercise 2a
anadeandres Jan 20, 2016
0f8d097
Merge remote-tracking branch 'refs/remotes/origin/master' into Insead…
anadeandres Jan 21, 2016
4445fb5
Merge remote-tracking branch 'refs/remotes/origin/master'
anadeandres Jan 21, 2016
3bcd62e
Merge remote-tracking branch 'refs/remotes/origin/master' into Insead…
anadeandres Jan 21, 2016
c30672a
Merge remote-tracking branch 'upstream/master'
anadeandres Jan 21, 2016
288f97d
Merge remote-tracking branch 'upstream/master'
anadeandres Jan 27, 2016
e3c0641
Session 3-4
anadeandres Jan 27, 2016
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
439 changes: 439 additions & 0 deletions CourseSessions/Sessions23/CopyOfSession2inclass_Ana.Rmd

Large diffs are not rendered by default.

19,472 changes: 19,472 additions & 0 deletions CourseSessions/Sessions23/CopyOfSession2inclass_Ana.html

Large diffs are not rendered by default.

Binary file added Exercises/Exerciseset1/DataSet1.Rdata
Binary file not shown.
19 changes: 18 additions & 1 deletion Exercises/Exerciseset1/ExerciseSet1.Rmd
Original file line number Diff line number Diff line change
@@ -1,7 +1,11 @@

---
title: "Exercise Set 1"
title: "Exercise Set 1 - Solutions Ana de Andres"
author: "T. Evgeniou"
<<<<<<< HEAD
date: "7 Jan 2016"
=======
>>>>>>> refs/remotes/InseadDataAnalytics/master
output: html_document
---

Expand Down Expand Up @@ -85,6 +89,14 @@ pnl_plot(SPY)
3. *(Extra points)* What if we want to also see the returns of another company, say Yahoo!, in the same period? Can you get that data and report the statistics for Yahoo!'s stock, too?

**Your Answers here:**

1. in dataSet1.R, the variable mytickers = c("SPY", "AAPL") - line 9 - includes AAPL, which is the trading name of Apple. THe lines from 13 to 46 download the data for the values on mytickers, which include the whole market and Apple specifically.

2. For this question I created a variable called AAPL = StockReturns[,"AAPL"]. The cumulative return is 384.1, the average daily return is 0.139 and the std deviation is 2.188 (using the same functions as with SPY: round(100*sum(AAPL),1), round(100*mean(AAPL),3) and round(100*sd(AAPL),3))
We can get the same results if we get the data from the second column of Stockreturns, which stores the values of Apple (example for cumulative round(100*sum(StockReturns[,2]),1))

3. We could get all the different companies traded from Yahoo. We would need to add the trading names to the variable mytickers. We would then store the data in additional columns in both StockPrices and StockReturns.

<br>
<br>
<br>
Expand Down Expand Up @@ -126,6 +138,9 @@ We can also transpose the matrix of returns to create a new "horizontal" matrix.
2. Based on the help for the R function *apply* (`help(apply)`), can you create again the portfolio of S&P and Apple and plot the returns in a new figure below?

**Your Answers here:**
1. We use the command dim(transposedData) to get all the dimensions (which returs 2 2771). If we want to get only the numer of rows we use nrow(transposedData) and for the number of columns we use ncol(transposedData), which gives us again 2 and 2771.

2. I assume we are using the new variable transposedData for this question. With this variable, we would tell R to apply the sum over the columns, which is given by a 2 in the Margin argument. The code would then be portfolio2 = apply(transposedData, 2, mean). We can then plot the new variable portfolio2: pnl_plot(portfolio2).
<br>
<br>
<br>
Expand All @@ -144,6 +159,7 @@ This is an important step and will get you to think about the overall process on
2. *(Extra Exercise)* Can you get the returns of a few companies and plot the returns of an equal weighted portfolio with those companies during some period you select?

**Your Answers here:**
1. We should change line 11 in DataSet1.R, which includes the variable startDate = "2005-01-01". We would assign then startDate = "2001-01-01"
<br>
<br>
<br>
Expand Down Expand Up @@ -178,6 +194,7 @@ myData + StockReturns[1:40,])
```

**Your Answers here:**
1. It reutrns a value of 20.
<br>
<br>
<br>
Expand Down
216 changes: 216 additions & 0 deletions Exercises/Exerciseset1/ExerciseSet1.html

Large diffs are not rendered by default.

184 changes: 184 additions & 0 deletions Exercises/Exerciseset1/ExerciseSet1_Ana de Andres.Rmd
Original file line number Diff line number Diff line change
@@ -0,0 +1,184 @@

---
title: "Exercise Set 1 - Solutions Ana de Andres "
author: "T. Evgeniou"
date: "7 Jan 2016"
output: html_document
---


<br>

The purpose of this exercise is to become familiar with:

1. Basic statistics functions in R;
2. Simple matrix operations;
3. Simple data manipulations;
4. The idea of functions as well as some useful customized functions provided.

While doing this exercise we will also see how to generate replicable and customizable reports. For this purpose the exercise uses the R Markdown capabilities (see [Markdown Cheat Sheet](https://www.rstudio.com/wp-content/uploads/2015/02/rmarkdown-cheatsheet.pdf) or a [basic introduction to R Markdown](http://rmarkdown.rstudio.com/authoring_basics.html)). These capabilities allow us to create dynamic reports. For example today's date is `r Sys.Date()` (you need to see the .Rmd to understand that this is *not* a static typed-in date but it changes every time you compile the .Rmd - if the date changed of course).

Before starting, make sure you have pulled the [exercise files](https://github.com/InseadDataAnalytics/INSEADAnalytics/tree/master/Exercises/Exerciseset1) on your github repository (if you pull the course github repository you also get the exercise set files automatically). Moreover, make sure you are in the directory of this exercise. Directory paths may be complicated, and sometimes a frustrating source of problems, so it is recommended that you use these R commands to find out your current working directory and, if needed, set it where you have the main files for the specific exercise/project (there are other ways, but for now just be aware of this path issue):

```{r echo=TRUE, eval=FALSE, tidy=TRUE}
getwd()
setwd("Exercises/Exerciseset1/")
list.files()
```

**Note:** you can always use the `help` command in Rstudio to find out about any R function (e.g. type `help(list.files)` to learn what the R function `list.files` does).

Let's now see the exercise.

**IMPORTANT:** You should answer all questions by simply adding your code/answers in this document through editing the file ExerciseSet1.Rmd and then clicking on the "Knit HTML" button in RStudio. Once done, please post your .Rmd and html files in your github repository.

### Exercise Data

We download daily prices (open, high, low, close, and adjusted close) and volume data of publicly traded companies and markets from the web (e.g. Yahoo! or Google, etc). This is done by sourcing the file data.R as well as some helper functions in herpersSet1.R which also installs a number of R libraries (hence the first time you run this code you will see a lot of red color text indicating the *download* and *installation* process):

```{r eval = TRUE, echo=TRUE, error = FALSE, warning=FALSE,message=FALSE,results='asis'}
source("helpersSet1.R")
source("dataSet1.R")
```

We have `r nrow(StockReturns)` days of data, starting from `r rownames(StockReturns)[1]` until `r tail(rownames(StockReturns),1)`.

### Part I: Statistics of S&P Daily Returns

Here are some basic statistics about the S&P returns:

1. The cumulative returns of the S&P index during this period is `r round(100*sum(StockReturns[,1]),1)`%.
2. The average daily returns of the S&P index during this period is `r round(100*mean(StockReturns[,1]),3)`%;
2. The standard deviation of the daily returns of the S&P index during this period is `r round(100*sd(StockReturns[,1]),3)`%;

Here are returns of the S&P in this period (note the use of the helper function pnl_plot - defined in file helpersSet1.R):

```{r echo=FALSE, comment=NA, warning=FALSE, message=FALSE,results='asis',fig.align='center', fig.height=4,fig.width= 6, fig=TRUE}
SPY = StockReturns[,"SPY"]
pnl_plot(SPY)
```

#### Questions

1. Notice that the code also downloads the returns of Apple during the same period. Can you explain where this is done in the code (including the .R files used)?
2. What are the cumulative, average daily returns, and the standard deviation of the daily returns of Apple in the same period?
3. *(Extra points)* What if we want to also see the returns of another company, say Yahoo!, in the same period? Can you get that data and report the statistics for Yahoo!'s stock, too?

**Your Answers here:**

1. in dataSet1.R, the variable mytickers = c("SPY", "AAPL") - line 9 - includes AAPL, which is the trading name of Apple. THe lines from 13 to 46 download the data for the values on mytickers, which include the whole market and Apple specifically.

2. For this question I created a variable called AAPL = StockReturns[,"AAPL"]. The cumulative return is 384.1, the average daily return is 0.139 and the std deviation is 2.188 (using the same functions as with SPY: round(100*sum(AAPL),1), round(100*mean(AAPL),3) and round(100*sd(AAPL),3))
We can get the same results if we get the data from the second column of Stockreturns, which stores the values of Apple (example for cumulative round(100*sum(StockReturns[,2]),1))

3. We could get all the different companies traded from Yahoo. We would need to add the trading names to the variable mytickers. We would then store the data in additional columns in both StockPrices and StockReturns.

<br>
<br>

### Part II: Simple Matrix Manipulations

For this part of the exercise we will do some basic manipulations of the data. First note that the data are in a so-called matrix format. If you run these commands in RStudio (use help to find out what they do) you will see how matrices work:

```{r eval = FALSE, echo=TRUE}
class(StockReturns)
dim(StockReturns)
nrow(StockReturns)
ncol(StockReturns)
StockReturns[1:4,]
head(StockReturns,5)
tail(StockReturns,5)
```

We will now use an R function for matrices that is extremely useful for analyzing data. It is called *apply*. Check it out using help in R.

For example, we can now quickly estimate the average returns of S&P and Apple (of course this can be done manually, too, but what if we had 500 stocks - e.g. a matrix with 500 columns?) and plot the returns of that 50-50 on S&P and Apple portfolio:

```{r echo=FALSE, comment=NA, warning=FALSE, message=FALSE,results='asis',fig.align='center', fig=TRUE}
portfolio = apply(StockReturns,1,mean)
names(portfolio) <- rownames(StockReturns)
pnl_plot(portfolio)
```


We can also transpose the matrix of returns to create a new "horizontal" matrix. Let's call this matrix (variable name) transposedData. We can do so using this command: `transposedData = t(StockReturns)`.

#### Questions

1. What R commands can you use to get the number of rows and number of columns of the new matrix called transposedData?
2. Based on the help for the R function *apply* (`help(apply)`), can you create again the portfolio of S&P and Apple and plot the returns in a new figure below?

**Your Answers here:**
1. We use the command dim(transposedData) to get all the dimensions (which returs 2 2771). If we want to get only the numer of rows we use nrow(transposedData) and for the number of columns we use ncol(transposedData), which gives us again 2 and 2771.

2. I assume we are using the new variable transposedData for this question. With this variable, we would tell R to apply the sum over the columns, which is given by a 2 in the Margin argument. The code would then be portfolio2 = apply(transposedData, 2, mean). We can then plot the new variable portfolio2: pnl_plot(portfolio2).
<br>
<br>

### Part III: Reproducibility and Customization

This is an important step and will get you to think about the overall process once again.

#### Questions

1. We want to re-do all this analysis with data since 2001-01-01: what change do we need to make in the code (hint: all you need to change is one line - exactly 1 number! - in data.R file), and how can you get the new exercise set with the data since 2001-01-01?
2. *(Extra Exercise)* Can you get the returns of a few companies and plot the returns of an equal weighted portfolio with those companies during some period you select?

**Your Answers here:**
1. We should change line 11 in DataSet1.R, which includes the variable startDate = "2005-01-01". We would assign then startDate = "2001-01-01"
<br>
<br>

### Part IV: Read/Write .CSV files

Finally, one can read and write data in .CSV files. For example, we can save the first 20 days of data for S&P and Apple in a file using the command:

```{r eval = TRUE, echo=TRUE, comment=NA, warning=FALSE, message=FALSE,results='asis'}
write.csv(StockReturns[1:20,c("SPY","AAPL")], file = "twentydays.csv", row.names = TRUE, col.names = TRUE)
```

Do not get surpsised if you see the csv file in your directories suddenly! You can then read the data from the csv file using the read.csv command. For example, this will load the data from the csv file and save it in a new variable that now is called "myData":

```{r eval = TRUE, echo=TRUE, comment=NA, warning=FALSE, message=FALSE,results='asis'}
myData <- read.csv(file = "twentydays.csv", header = TRUE, sep=";")
```

Try it!

#### Questions

1. Once you write and read the data as described above, what happens when you run this command in the console of the RStudio: `sum(myData != StockReturns[1:20,])`
2. *(Extra exercise)* What do you think will happen if you now run this command, and why:

```{r eval = FALSE, echo=TRUE}
myData + StockReturns[1:40,])
```

**Your Answers here:**
1. It reutrns a value of 20.
<br>
<br>

### Extra Question

Can you now load another dataset from some CSV file and report some basic statistics about that data?

<br>

### Creating Interactive Documents

Finally, just for fun, one can add some interactivity in the report using [Shiny](http://rmarkdown.rstudio.com/authoring_shiny.html).All one needs to do is set the eval flag of the code chunk below (see the .Rmd file) to "TRUE", add the line "runtime: shiny" at the very begining of the .Rmd file, make the markdown output to be "html_document", and then press "Run Document".

```{r, eval=FALSE, echo = TRUE}
sliderInput("startdate", "Starting Date:", min = 1, max = length(portfolio),
value = 1)
sliderInput("enddate", "End Date:", min = 1, max = length(portfolio),
value = length(portfolio))

renderPlot({
pnl_plot(portfolio[input$startdate:input$enddate])
})
```

Have fun.

Loading