update
mhjensen committed Oct 22, 2023
1 parent cfcb644 commit 569aca6
Showing 9 changed files with 2,054 additions and 1,060 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -125,7 +125,7 @@ _Detailed notes at the link_ https://compphysics.github.io/MachineLearning/doc/L
| Recommended readings | Hastie et al Chapter 3 |
| | Lecture material at https://compphysics.github.io/MLErasmus/doc/web/course.html sessions 3 and 4 |
| | Video of Lecture at https://youtu.be/iqRKUPJr_bY |
| | Handwritten notes at Handwritten notes at https://github.com/CompPhysics/MLErasmus/blob/master/doc/HandwrittenNotes/2023/NotesOct162023.pdf |
| | Handwritten notes at https://github.com/CompPhysics/MLErasmus/blob/master/doc/HandwrittenNotes/2023/NotesOct162023.pdf |
| Monday October 23 | - _Lecture 815am-10am_: Resampling Methods and Bias-Variance tradeoff (MHJ) |
| Recommended readings | Hastie et al chapter 7 |
| | Lecture material at https://compphysics.github.io/MLErasmus/doc/web/course.html session 4 material |
81 changes: 77 additions & 4 deletions doc/pub/day3/html/day3-bs.html
@@ -281,7 +281,11 @@
('Exercise 2: Expectation values for Ridge regression',
2,
None,
'exercise-2-expectation-values-for-ridge-regression')]}
'exercise-2-expectation-values-for-ridge-regression'),
('Exercise 3: Bias-Variance tradeoff',
2,
None,
'exercise-3-bias-variance-tradeoff')]}
end of tocinfo -->

<body>
@@ -405,6 +409,7 @@
<!-- navigation toc: --> <li><a href="#overarching-aims-of-the-exercises-this-week" style="font-size: 80%;">Overarching aims of the exercises this week</a></li>
<!-- navigation toc: --> <li><a href="#exercise-1-expectation-values-for-ordinary-least-squares-expressions" style="font-size: 80%;">Exercise 1: Expectation values for ordinary least squares expressions</a></li>
<!-- navigation toc: --> <li><a href="#exercise-2-expectation-values-for-ridge-regression" style="font-size: 80%;">Exercise 2: Expectation values for Ridge regression</a></li>
<!-- navigation toc: --> <li><a href="#exercise-3-bias-variance-tradeoff" style="font-size: 80%;">Exercise 3: Bias-Variance tradeoff</a></li>

</ul>
</li>
@@ -433,7 +438,7 @@ <h1>Data Analysis and Machine Learning: Ridge and Lasso Regression and Resamplin
</center>
<br>
<center>
<h4>October 15 and 22, 2023</h4>
<h4>October 16 and 23, 2023</h4>
</center> <!-- date -->
<br>

@@ -447,8 +452,8 @@ <h2 id="plans-for-sessions-4-6" class="anchor">Plans for Sessions 4-6 </h2>
<li> More on Ridge and Lasso Regression</li>
<li> Statistics, probability theory and resampling methods</li>
<ul>
<li> <a href="https://youtu.be/" target="_self">Video of Lecture October 15 to be added</a></li>
<li> <a href="https://youtu.be/" target="_self">Video of Lecture October 22 to be added</a></li>
<li> <a href="https://youtu.be/iqRKUPJr_bY" target="_self">Video of Lecture October 16 to be added</a></li>
<li> <a href="https://youtu.be/" target="_self">Video of Lecture October 23 to be added</a></li>
</ul>
</ul>
<!-- !split -->
@@ -3571,6 +3576,74 @@ <h2 id="exercise-2-expectation-values-for-ridge-regression" class="anchor">Exerc

<p>and it is easy to see that if the parameter \( \lambda \) goes to infinity then the variance of the Ridge parameters \( \boldsymbol{\beta} \) goes to zero.</p>

<!-- --- end exercise --- -->

<!-- --- begin exercise --- -->
<h2 id="exercise-3-bias-variance-tradeoff" class="anchor">Exercise 3: Bias-Variance tradeoff </h2>

<p>The aim of this exercise is to derive the equations for the bias-variance tradeoff to be used in project 1, and to test them for a simple function using the bootstrap method. </p>

<p>Consider a
dataset \( \mathcal{L} \) consisting of the data
\( \mathbf{X}_\mathcal{L}=\{(y_j, \boldsymbol{x}_j), j=0\ldots n-1\} \).
</p>

<p>We assume that the true data is generated from a noisy model</p>

$$
\boldsymbol{y}=f(\boldsymbol{x}) + \boldsymbol{\epsilon}.
$$

<p>Here \( \epsilon \) is normally distributed with mean zero and variance \( \sigma^2 \).
</p>

<p>In our derivation of the ordinary least squares method we defined
an approximation to the function \( f \) in terms of the parameters
\( \boldsymbol{\beta} \) and the design matrix \( \boldsymbol{X} \) which embody our model,
that is \( \boldsymbol{\tilde{y}}=\boldsymbol{X}\boldsymbol{\beta} \).
</p>

<p>The parameters \( \boldsymbol{\beta} \) are in turn found by optimizing the mean
squared error via the so-called cost function
</p>

$$
C(\boldsymbol{X},\boldsymbol{\beta}) =\frac{1}{n}\sum_{i=0}^{n-1}(y_i-\tilde{y}_i)^2=\mathbb{E}\left[(\boldsymbol{y}-\boldsymbol{\tilde{y}})^2\right].
$$

<p>Here the expected value \( \mathbb{E} \) denotes the sample mean, that is, the average over the \( n \) data points. </p>

<p>Show that you can rewrite this as a sum of three terms: the variance of the model itself (the so-called variance term), a
term which measures the deviation between the true data and the mean value of the model (the bias term), and finally the variance of the noise.
That is, show that
</p>
$$
\mathbb{E}\left[(\boldsymbol{y}-\boldsymbol{\tilde{y}})^2\right]=\mathrm{Bias}[\tilde{y}]+\mathrm{var}[\tilde{y}]+\sigma^2,
$$

<p>with </p>
$$
\mathrm{Bias}[\tilde{y}]=\mathbb{E}\left[\left(\boldsymbol{y}-\mathbb{E}\left[\boldsymbol{\tilde{y}}\right]\right)^2\right],
$$

<p>and </p>
$$
\mathrm{var}[\tilde{y}]=\mathbb{E}\left[\left(\tilde{\boldsymbol{y}}-\mathbb{E}\left[\boldsymbol{\tilde{y}}\right]\right)^2\right]=\frac{1}{n}\sum_i(\tilde{y}_i-\mathbb{E}\left[\boldsymbol{\tilde{y}}\right])^2.
$$

<p>Explain what the terms mean and discuss their interpretations.</p>
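
<p>One possible starting point for the derivation (a sketch only, using the model assumption above; it is not a prescribed step of the exercise): insert \( \boldsymbol{y}=f(\boldsymbol{x})+\boldsymbol{\epsilon} \) and add and subtract \( \mathbb{E}\left[\boldsymbol{\tilde{y}}\right] \) inside the square,</p>
$$
\mathbb{E}\left[(\boldsymbol{y}-\boldsymbol{\tilde{y}})^2\right]=\mathbb{E}\left[\left(f(\boldsymbol{x})+\boldsymbol{\epsilon}-\mathbb{E}\left[\boldsymbol{\tilde{y}}\right]+\mathbb{E}\left[\boldsymbol{\tilde{y}}\right]-\boldsymbol{\tilde{y}}\right)^2\right],
$$

<p>after which the cross terms can be shown to vanish since \( \boldsymbol{\epsilon} \) has zero mean and \( \mathbb{E}\left[\boldsymbol{\tilde{y}}-\mathbb{E}\left[\boldsymbol{\tilde{y}}\right]\right]=0 \).</p>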

<p>Then perform a bias-variance analysis of a simple one-dimensional function (or other models of your choice) by
studying the MSE as a function of the complexity of your model. Use ordinary least squares only.
</p>

<p>Discuss the bias-variance trade-off as a function
of your model complexity (the degree of the polynomial) and of the number
of data points, and possibly also of the sizes of your training and test sets, using the <b>bootstrap</b> resampling method.
You can follow the code example in the jupyter-book at <a href="https://compphysics.github.io/MachineLearning/doc/LectureNotes/_build/html/chapter3.html#the-bias-variance-tradeoff" target="_self"><tt>https://compphysics.github.io/MachineLearning/doc/LectureNotes/_build/html/chapter3.html#the-bias-variance-tradeoff</tt></a>.
</p>
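
<p>A minimal sketch of such a bias-variance bootstrap study, assuming a simple one-dimensional test function fitted with polynomial OLS and scikit-learn's <tt>resample</tt> (the test function, seed and parameter values below are illustrative choices, not part of the exercise text):</p>

<pre><code>import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures
from sklearn.utils import resample

np.random.seed(2023)
n = 100                 # number of data points
n_bootstraps = 200      # bootstrap resamples per model complexity
maxdegree = 12          # highest polynomial degree

x = np.linspace(-1, 1, n).reshape(-1, 1)
y = np.exp(-x**2) + 1.5*np.exp(-(x - 2)**2) + 0.1*np.random.normal(size=(n, 1))

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)

for degree in range(1, maxdegree + 1):
    poly = PolynomialFeatures(degree=degree)
    X_train = poly.fit_transform(x_train)
    X_test = poly.transform(x_test)
    # predictions on the fixed test set for each bootstrap resample of the training data
    y_pred = np.empty((y_test.shape[0], n_bootstraps))
    for b in range(n_bootstraps):
        X_, y_ = resample(X_train, y_train)
        y_pred[:, b] = LinearRegression(fit_intercept=False).fit(X_, y_).predict(X_test).ravel()
    # sample estimates of the three terms in the decomposition
    error    = np.mean(np.mean((y_test - y_pred)**2, axis=1, keepdims=True))
    bias2    = np.mean((y_test - np.mean(y_pred, axis=1, keepdims=True))**2)
    variance = np.mean(np.var(y_pred, axis=1, keepdims=True))
    print(f"degree={degree:2d}  MSE={error:.4f}  bias^2={bias2:.4f}  var={variance:.4f}")
</code></pre>

<p>Plotting the three printed columns against the polynomial degree should then display the trade-off: the bias term dominates for low degrees, the variance grows with increasing model complexity, and their sum tracks the test MSE.</p>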

<!-- --- end exercise --- -->
<!-- ------------------- end of main content --------------- -->
</div> <!-- end container -->
84 changes: 81 additions & 3 deletions doc/pub/day3/html/day3-reveal.html
@@ -184,7 +184,7 @@ <h1 style="text-align: center;">Data Analysis and Machine Learning: Ridge and La
</center>
<br>
<center>
<h4>October 15 and 22, 2023</h4>
<h4>October 16 and 23, 2023</h4>
</center> <!-- date -->
<br>

@@ -202,9 +202,9 @@ <h2 id="plans-for-sessions-4-6">Plans for Sessions 4-6 </h2>
<p><li> Statistics, probability theory and resampling methods</li>
<ul>

<p><li> <a href="https://youtu.be/" target="_blank">Video of Lecture October 15 to be added</a></li>
<p><li> <a href="https://youtu.be/iqRKUPJr_bY" target="_blank">Video of Lecture October 16 to be added</a></li>

<p><li> <a href="https://youtu.be/" target="_blank">Video of Lecture October 22 to be added</a></li>
<p><li> <a href="https://youtu.be/" target="_blank">Video of Lecture October 23 to be added</a></li>
</ul>
<p>
</ul>
@@ -3667,6 +3667,84 @@ <h2 id="exercise-2-expectation-values-for-ridge-regression">Exercise 2: Expectat

<p>and it is easy to see that if the parameter \( \lambda \) goes to infinity then the variance of the Ridge parameters \( \boldsymbol{\beta} \) goes to zero.</p>

<!-- --- end exercise --- -->

<!-- --- begin exercise --- -->
<h2 id="exercise-3-bias-variance-tradeoff">Exercise 3: Bias-Variance tradeoff </h2>

<p>The aim of this exercise is to derive the equations for the bias-variance tradeoff to be used in project 1, and to test them for a simple function using the bootstrap method. </p>

<p>Consider a
dataset \( \mathcal{L} \) consisting of the data
\( \mathbf{X}_\mathcal{L}=\{(y_j, \boldsymbol{x}_j), j=0\ldots n-1\} \).
</p>

<p>We assume that the true data is generated from a noisy model</p>

<p>&nbsp;<br>
$$
\boldsymbol{y}=f(\boldsymbol{x}) + \boldsymbol{\epsilon}.
$$
<p>&nbsp;<br>

<p>Here \( \epsilon \) is normally distributed with mean zero and variance \( \sigma^2 \).
</p>

<p>In our derivation of the ordinary least squares method we defined
an approximation to the function \( f \) in terms of the parameters
\( \boldsymbol{\beta} \) and the design matrix \( \boldsymbol{X} \) which embody our model,
that is \( \boldsymbol{\tilde{y}}=\boldsymbol{X}\boldsymbol{\beta} \).
</p>

<p>The parameters \( \boldsymbol{\beta} \) are in turn found by optimizing the mean
squared error via the so-called cost function
</p>

<p>&nbsp;<br>
$$
C(\boldsymbol{X},\boldsymbol{\beta}) =\frac{1}{n}\sum_{i=0}^{n-1}(y_i-\tilde{y}_i)^2=\mathbb{E}\left[(\boldsymbol{y}-\boldsymbol{\tilde{y}})^2\right].
$$
<p>&nbsp;<br>

<p>Here the expected value \( \mathbb{E} \) denotes the sample mean, that is, the average over the \( n \) data points. </p>
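
<p>As a small illustration of this sample average (an editorial sketch; the data, design matrix and coefficients below are made up for the example), the cost function can be evaluated directly with NumPy for an ordinary least squares fit:</p>

<pre><code>import numpy as np

rng = np.random.default_rng(42)
n = 50
x = np.linspace(0, 1, n)
y = 2.0 + 3.0*x - 5.0*x**2 + 0.1*rng.standard_normal(n)   # noisy data

X = np.vander(x, N=4, increasing=True)       # design matrix: columns 1, x, x^2, x^3
beta = np.linalg.pinv(X.T @ X) @ X.T @ y     # OLS parameters
ytilde = X @ beta                            # model prediction

# C(X, beta): the mean squared error as an average over the n data points
mse = np.mean((y - ytilde)**2)
print(f"MSE = {mse:.5f}")
</code></pre>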

<p>Show that you can rewrite this as a sum of three terms: the variance of the model itself (the so-called variance term), a
term which measures the deviation between the true data and the mean value of the model (the bias term), and finally the variance of the noise.
That is, show that
</p>
<p>&nbsp;<br>
$$
\mathbb{E}\left[(\boldsymbol{y}-\boldsymbol{\tilde{y}})^2\right]=\mathrm{Bias}[\tilde{y}]+\mathrm{var}[\tilde{y}]+\sigma^2,
$$
<p>&nbsp;<br>

<p>with </p>
<p>&nbsp;<br>
$$
\mathrm{Bias}[\tilde{y}]=\mathbb{E}\left[\left(\boldsymbol{y}-\mathbb{E}\left[\boldsymbol{\tilde{y}}\right]\right)^2\right],
$$
<p>&nbsp;<br>

<p>and </p>
<p>&nbsp;<br>
$$
\mathrm{var}[\tilde{y}]=\mathbb{E}\left[\left(\tilde{\boldsymbol{y}}-\mathbb{E}\left[\boldsymbol{\tilde{y}}\right]\right)^2\right]=\frac{1}{n}\sum_i(\tilde{y}_i-\mathbb{E}\left[\boldsymbol{\tilde{y}}\right])^2.
$$
<p>&nbsp;<br>

<p>Explain what the terms mean and discuss their interpretations.</p>

<p>Then perform a bias-variance analysis of a simple one-dimensional function (or other models of your choice) by
studying the MSE as a function of the complexity of your model. Use ordinary least squares only.
</p>

<p>Discuss the bias-variance trade-off as a function
of your model complexity (the degree of the polynomial) and of the number
of data points, and possibly also of the sizes of your training and test sets, using the <b>bootstrap</b> resampling method.
You can follow the code example in the jupyter-book at <a href="https://compphysics.github.io/MachineLearning/doc/LectureNotes/_build/html/chapter3.html#the-bias-variance-tradeoff" target="_blank"><tt>https://compphysics.github.io/MachineLearning/doc/LectureNotes/_build/html/chapter3.html#the-bias-variance-tradeoff</tt></a>.
</p>

<!-- --- end exercise --- -->
</section>

80 changes: 76 additions & 4 deletions doc/pub/day3/html/day3-solarized.html
@@ -308,7 +308,11 @@
('Exercise 2: Expectation values for Ridge regression',
2,
None,
'exercise-2-expectation-values-for-ridge-regression')]}
'exercise-2-expectation-values-for-ridge-regression'),
('Exercise 3: Bias-Variance tradeoff',
2,
None,
'exercise-3-bias-variance-tradeoff')]}
end of tocinfo -->

<body>
@@ -346,7 +350,7 @@ <h1>Data Analysis and Machine Learning: Ridge and Lasso Regression and Resamplin
</center>
<br>
<center>
<h4>October 15 and 22, 2023</h4>
<h4>October 16 and 23, 2023</h4>
</center> <!-- date -->
<br>

@@ -357,8 +361,8 @@ <h2 id="plans-for-sessions-4-6">Plans for Sessions 4-6 </h2>
<li> More on Ridge and Lasso Regression</li>
<li> Statistics, probability theory and resampling methods</li>
<ul>
<li> <a href="https://youtu.be/" target="_blank">Video of Lecture October 15 to be added</a></li>
<li> <a href="https://youtu.be/" target="_blank">Video of Lecture October 22 to be added</a></li>
<li> <a href="https://youtu.be/iqRKUPJr_bY" target="_blank">Video of Lecture October 16 to be added</a></li>
<li> <a href="https://youtu.be/" target="_blank">Video of Lecture October 23 to be added</a></li>
</ul>
</ul>
<!-- !split --><br><br><br><br><br><br><br><br><br><br>
@@ -3472,6 +3476,74 @@ <h2 id="exercise-2-expectation-values-for-ridge-regression">Exercise 2: Expectat

<p>and it is easy to see that if the parameter \( \lambda \) goes to infinity then the variance of the Ridge parameters \( \boldsymbol{\beta} \) goes to zero.</p>

<!-- --- end exercise --- -->

<!-- --- begin exercise --- -->
<h2 id="exercise-3-bias-variance-tradeoff">Exercise 3: Bias-Variance tradeoff </h2>

<p>The aim of this exercise is to derive the equations for the bias-variance tradeoff to be used in project 1, and to test them for a simple function using the bootstrap method. </p>

<p>Consider a
dataset \( \mathcal{L} \) consisting of the data
\( \mathbf{X}_\mathcal{L}=\{(y_j, \boldsymbol{x}_j), j=0\ldots n-1\} \).
</p>

<p>We assume that the true data is generated from a noisy model</p>

$$
\boldsymbol{y}=f(\boldsymbol{x}) + \boldsymbol{\epsilon}.
$$

<p>Here \( \epsilon \) is normally distributed with mean zero and variance \( \sigma^2 \).
</p>

<p>In our derivation of the ordinary least squares method we defined
an approximation to the function \( f \) in terms of the parameters
\( \boldsymbol{\beta} \) and the design matrix \( \boldsymbol{X} \) which embody our model,
that is \( \boldsymbol{\tilde{y}}=\boldsymbol{X}\boldsymbol{\beta} \).
</p>

<p>The parameters \( \boldsymbol{\beta} \) are in turn found by optimizing the mean
squared error via the so-called cost function
</p>

$$
C(\boldsymbol{X},\boldsymbol{\beta}) =\frac{1}{n}\sum_{i=0}^{n-1}(y_i-\tilde{y}_i)^2=\mathbb{E}\left[(\boldsymbol{y}-\boldsymbol{\tilde{y}})^2\right].
$$

<p>Here the expected value \( \mathbb{E} \) denotes the sample mean, that is, the average over the \( n \) data points. </p>

<p>Show that you can rewrite this as a sum of three terms: the variance of the model itself (the so-called variance term), a
term which measures the deviation between the true data and the mean value of the model (the bias term), and finally the variance of the noise.
That is, show that
</p>
$$
\mathbb{E}\left[(\boldsymbol{y}-\boldsymbol{\tilde{y}})^2\right]=\mathrm{Bias}[\tilde{y}]+\mathrm{var}[\tilde{y}]+\sigma^2,
$$

<p>with </p>
$$
\mathrm{Bias}[\tilde{y}]=\mathbb{E}\left[\left(\boldsymbol{y}-\mathbb{E}\left[\boldsymbol{\tilde{y}}\right]\right)^2\right],
$$

<p>and </p>
$$
\mathrm{var}[\tilde{y}]=\mathbb{E}\left[\left(\tilde{\boldsymbol{y}}-\mathbb{E}\left[\boldsymbol{\tilde{y}}\right]\right)^2\right]=\frac{1}{n}\sum_i(\tilde{y}_i-\mathbb{E}\left[\boldsymbol{\tilde{y}}\right])^2.
$$

<p>Explain what the terms mean and discuss their interpretations.</p>

<p>Then perform a bias-variance analysis of a simple one-dimensional function (or other models of your choice) by
studying the MSE as a function of the complexity of your model. Use ordinary least squares only.
</p>
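
<p>A possible sketch of this complexity scan with ordinary least squares only (the test function, train-test split and degree range are illustrative assumptions, not prescribed by the exercise):</p>

<pre><code>import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(1)
n = 200
x = rng.uniform(-1, 1, size=(n, 1))
y = np.sin(2*np.pi*x) + 0.2*rng.standard_normal((n, 1))   # noisy one-dimensional data

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)

for degree in range(1, 16):
    poly = PolynomialFeatures(degree=degree)
    X_train = poly.fit_transform(x_train)
    X_test = poly.transform(x_test)
    ols = LinearRegression(fit_intercept=False).fit(X_train, y_train)
    mse_train = mean_squared_error(y_train, ols.predict(X_train))
    mse_test = mean_squared_error(y_test, ols.predict(X_test))
    print(f"degree={degree:2d}  train MSE={mse_train:.4f}  test MSE={mse_test:.4f}")
</code></pre>

<p>The training error typically keeps decreasing with the polynomial degree while the test error eventually grows again; this is the qualitative picture that the bias-variance decomposition and the bootstrap analysis make quantitative.</p>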

<p>Discuss the bias-variance trade-off as a function
of your model complexity (the degree of the polynomial) and of the number
of data points, and possibly also of the sizes of your training and test sets, using the <b>bootstrap</b> resampling method.
You can follow the code example in the jupyter-book at <a href="https://compphysics.github.io/MachineLearning/doc/LectureNotes/_build/html/chapter3.html#the-bias-variance-tradeoff" target="_blank"><tt>https://compphysics.github.io/MachineLearning/doc/LectureNotes/_build/html/chapter3.html#the-bias-variance-tradeoff</tt></a>.
</p>

<!-- --- end exercise --- -->
<!-- ------------------- end of main content --------------- -->
<center style="font-size:80%">
