---
title: "Exercise 1"
author: "Jai Broome"
date: "`r format(Sys.time(), '%d %B, %Y')`"
output:
  html_document:
    toc: true
    number_sections: true
---
```{r dependencies, message = FALSE}
require(ggplot2)
require(dplyr)
require(knitr)
```
# Simple Octave/MATLAB function
I'll skip over this, since anyone wanting to follow along is presumably already familiar with R.
## Submitting Solutions
You won't be able to submit this for credit, but hopefully you'll still find it useful.
# Linear regression with one variable
Here I read in the data, split it into `X` and `y`, and generate starting values for `theta`.
```{r read-transform}
q1data <- read.table("data/ex1data1.txt",
                     sep = ",",
                     col.names = c("population", "profit"))
q1data$ones <- 1  # intercept column for the linear model
q1data <- q1data %>% select(ones, population, profit)
X <- select(q1data, -profit)  # design matrix: intercept and population
y <- q1data$profit            # response vector
theta <- rep(0, times = 2)    # initial parameter values
```
## Plotting the Data
```{r viz-exploration}
g1 <- ggplot(q1data, aes(x = population, y = profit)) +
  geom_point(shape = 4, color = "red", size = 3) +
  xlab("Population of City in 10,000s") +
  ylab("Profit in $10,000s")
g1
```
## Gradient Descent
```{r grad-descent-options}
iterations <- 1500  # number of gradient descent iterations
alpha <- 0.01       # learning rate
```
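For reference, the model is $h_\theta(x) = \theta^T x$, and batch gradient descent minimizes the cost

$$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

by repeatedly applying the simultaneous update

$$\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$$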
Functions are defined in a separate script so they can be reused in other assignments; see `?knitr::read_chunk`.
```{r read_chunks}
read_chunk("ex1/ex1_chunks.R")
```
```{r lin-reg-cost-func}
```
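The body of this chunk is pulled in from `ex1/ex1_chunks.R` at knit time, so it isn't visible in the source here. As a rough sketch of what a vectorized cost function could look like (my assumption, not the actual contents of that file):

```r
# Hypothetical sketch of computeCost; the real definition lives in
# ex1/ex1_chunks.R and may differ.
computeCost <- function(X, y, theta) {
  m <- length(y)                          # number of training examples
  predictions <- as.matrix(X) %*% theta   # h_theta(x) for every example
  sum((predictions - y)^2) / (2 * m)      # squared-error cost J(theta)
}
```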
```{r gradStep}
```
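Again, the chunk is filled in from the external script; a plausible single gradient step (a sketch under the same assumption) might be:

```r
# Hypothetical sketch of gradStep; the real definition lives in
# ex1/ex1_chunks.R and may differ.
gradStep <- function(X, y, theta, alpha) {
  m <- length(y)
  error <- as.matrix(X) %*% theta - y                # h_theta(x) - y for every example
  theta - (alpha / m) * (t(as.matrix(X)) %*% error)  # one simultaneous update
}
```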
Note that in the loop, the updated `theta` values are stored in a temporary variable so that both parameters are updated simultaneously.
```{r gradientDescent}
```
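Judging from how `jhist` is used below, `gradientDescent` returns a data frame with columns `iterations`, `loss`, `theta0`, and `theta1`. A minimal sketch consistent with that (again, an assumption rather than the actual chunk):

```r
# Hypothetical sketch of gradientDescent; the real definition lives in
# ex1/ex1_chunks.R and may differ.
gradientDescent <- function(X, y, theta, alpha, iterations) {
  jhist <- data.frame(iterations = seq_len(iterations),
                      loss = NA_real_, theta0 = NA_real_, theta1 = NA_real_)
  for (i in seq_len(iterations)) {
    theta_new <- gradStep(X, y, theta, alpha)  # temporary variable: simultaneous update
    theta <- theta_new
    jhist[i, c("loss", "theta0", "theta1")] <- c(computeCost(X, y, theta),
                                                 theta[1], theta[2])
  }
  jhist
}
```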
```{r costFunctionCheck}
computeCost(X, y, theta)
```
With `theta` initialized to zeros, the result should be `32.07`.
## Debugging
Pay particular attention to indexing in these exercises: like Matlab, R starts indexing at 1, so the course's indices carry over directly (unlike zero-indexed languages such as Python or C).
```{r implement-descent}
jhist <- gradientDescent(X, y, theta, alpha, iterations)
jhist <- jhist[2:(nrow(jhist) - 1), ]  # drop first and last rows to make the graph more interpretable
```
```{r viz-results}
ggplot(jhist, aes(x = iterations, y = loss)) + geom_line()
g1 + geom_abline(slope = tail(jhist$theta1, n = 1), intercept = tail(jhist$theta0, n = 1))
```
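As a quick sanity check of the fit (not part of the original code; `theta_fit` is a name I'm introducing here), the fitted parameters can be used to predict profit for a given population:

```r
# Hypothetical usage of the fitted parameters from jhist above.
theta_fit <- c(tail(jhist$theta0, n = 1), tail(jhist$theta1, n = 1))
# predicted profit (in $10,000s) for a city of 35,000 people (population = 3.5)
c(1, 3.5) %*% theta_fit
```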
## Visualizing $J(\theta)$
I skipped this section since the course instructors have already written the code in Matlab. See the assignment page for images showing how different values of `theta` yield different costs.
# Optional exercises
These are still incomplete; I plan to return to them after making it through the main course content.