Miguel A. Bessa | [email protected] | Associate Professor
What: This course aims to be an introduction to machine learning from a probabilistic perspective.
Where: This notebook comes from this repository
Reference: Murphy, Kevin P. Probabilistic machine learning: an introduction. MIT Press, 2022. Available online here
How: We follow Murphy's book closely, but the sequence of chapters and sections is different. The intention is to use the notebooks as an introduction to each topic and Murphy's book as a reference.
- If working offline: Go through this notebook and read the book.
- If attending class in person: listen to me (!) but also go through the notebook on your laptop at the same time. Read the book.
- If attending lectures remotely: listen to me (!) via Zoom and (ideally) use two screens, with the notebook open on one screen and the lecture on the other. Read the book.
Folder structure
- The "Lectures" folder contains each lecture in a separate folder "LectureX" where X is the lecture number.
- Each "LectureX" folder contains:
- A jupyter notebook "3dasm_LectureX.ipynb" that you can run locally or in servers like Google Colab.
- A pdf "3dasm_LectureX slides.pdf" with the slides of that lecture.
- A "your_data" folder that you can use to create data or other things in your own computer.
- The preferred way to follow the course is to work directly in the Jupyter notebooks, as they contain additional notes and working code.
Grading
Homeworks 30%, Midterm 30%, and Final Project 40%.
Homeworks are graded on 5 levels: A+ (100%; fully correct), A (90%; minor error), B (75%; significant error), C (60%; mostly incorrect, but the homework was delivered), D (0%; not delivered). In other words, delivering an honest attempt at solving the homework guarantees at least 60% for that homework.
Note
Late Homeworks can receive at most an A (90%).
The lowest Homework grade is dropped when computing the final grade.
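For concreteness, here is a minimal sketch (not part of the course materials) of how a final grade could be computed under these rules, using made-up scores:

```python
# Illustrative grade calculation; all scores below are made up.
homework = [100, 90, 75, 60, 90, 100, 90, 75, 100]  # HW1-HW9, in percent
midterm = 85
final_project = 92

# Drop the lowest homework grade, then average the remaining ones.
homework_avg = (sum(homework) - min(homework)) / (len(homework) - 1)

# Weighted final grade: Homeworks 30%, Midterm 30%, Final Project 40%.
final_grade = 0.30 * homework_avg + 0.30 * midterm + 0.40 * final_project
print(f"Homework average (worst dropped): {homework_avg:.1f}%")
print(f"Final grade: {final_grade:.1f}%")
```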
Course outline
DATE | SUBJECT | Notebook | Slides | Homework | Colab
---|---|---|---|---|---
Wed 9/4 | Introduction: Basics of univariate statistics | Lecture 1 | Slides | HW1 assigned |
Fri 9/6 | Practical tutorial: Handling data with Pandas | Lecture 2 | Slides | |
Mon 9/9 | Bayes' rule: joint & conditional distributions | Lecture 3 | Slides | |
Wed 9/11 | Multivariate statistics: visualization of joint & conditional distributions | Lecture 4 | Slides | HW1 due & HW2 assigned |
Fri 9/13 | Bayesian inference for one hidden rv: Example with Gaussian likelihood and Uniform prior (Part I) | Lecture 5 | Slides | |
Mon 9/16 | Bayesian inference for one hidden rv: Example with Gaussian likelihood and Uniform prior (Part II) | Lecture 6 | Slides | |
Wed 9/18 | Bayesian inference for one hidden rv: Redo example but now with Gaussian prior | Lecture 7 | Slides | HW2 due & HW3 assigned |
Fri 9/20 | Machine Learning without going Bayesian: Point Estimates | Lecture 8 | Slides | |
Mon 9/23 | Linear Regression: Practical tutorial (Part I: noiseless 1D example; underfitting vs. overfitting; interpolation vs. extrapolation) | Lecture 9 | Slides | |
Wed 9/25 | Linear Regression: Practical tutorial (Part II: noiseless vs. noisy datasets; train/test dataset split; multi-dimensional example) | Lecture 10 | Slides | HW3 due & HW4 assigned |
Fri 9/27 | Linear Regression: Linear Least Squares model (Gaussian likelihood, Uniform prior, and Point Estimate) | Lecture 11 | Slides | |
Mon 9/30 | Linear Regression: Ridge, Lasso and Bayesian linear regression models (different likelihoods, priors and posteriors) | Lecture 12 | Slides | |
Wed 10/2 | Gaussian process regression: theory | Lecture 13 | Slides | HW4 due & HW5 assigned |
Fri 10/4 | Gaussian process regression: One dimensional tutorial (Part I: noiseless) | Lecture 14 | Slides | |
Mon 10/7 | Gaussian process regression: One dimensional tutorial (Part II: noisy) | Lecture 14 (continued) | Slides | |
Wed 10/9 | Gaussian process regression: Multidimensional tutorial & importance of dataset scaling | Lecture 15 | Slides | HW5 due & HW6 assigned |
Fri 10/11 | Bayesian model selection & Hyperparameter optimization | Lecture 16 | Slides | |
Mon 10/14 | HOLIDAY 🥹 | | | |
Wed 10/16 | Q&A session | | | HW6 due |
Fri 10/18 | MIDTERM Exam 🦾 | | | |
Mon 10/21 | Framework for Data-Driven Design & Analysis of Structures & Materials: f3dasm | Lecture 17 | Slides | |
Wed 10/23 | f3dasm tutorial: Data-driven process; Sampling methods; Simple model selection example | Lecture 18 | Slides | HW7 assigned |
Fri 10/25 | f3dasm tutorial: General use case (object oriented) | Lecture 19 | Slides | |
Mon 10/28 | Final Project Overview & Assignment 🦾 | Lecture 20 | Slides | Final Project assigned |
Wed 10/30 | Introduction to classification: Tutorial with 3 simple classifiers on Iris dataset | Lecture 21 | Slides | HW7 due & HW8 assigned |
Fri 11/1 | Logistic regression classifier: Classification as a regression task; Bernoulli observation distribution; the sigmoid trick | Lecture 22 | Slides | |
Mon 11/4 | Logistic regression classifier: Point estimate (e.g. MLE) as nonlinear optimization; classification as the mode of the PPD | Lecture 22 (continued) | Slides | |
Wed 11/6 | Gaussian discriminant analysis (GDA) classifier: Classification by a generative model | Lecture 23 | Slides | HW8 due & HW9 assigned |
Fri 11/8 | Optimization: Part I | Lecture 24 | Slides | |
Mon 11/11 | Optimization: Part II | Lecture 25 | Slides | |
Wed 11/13 | Optimization: Part III | Lecture 26 | Slides | |
Fri 11/15 | Artificial Neural Networks: Part I | Lecture 27 | Slides | |
Mon 11/18 | Artificial Neural Networks: Part II | Lecture 28 | Slides | |
Wed 11/20 | Artificial Neural Networks: Part III | Lecture 29 | Slides | HW9 due |
Fri 11/22 | Artificial Neural Networks: Part IV | Lecture 30 | Slides | |
Mon 11/25 | Thanksgiving week 🦃 | | | |
Wed 11/27 | Thanksgiving week 🦃 | | | |
Fri 11/29 | Thanksgiving week 🦃 | | | |
Mon 12/02 | TBD | Lecture 31 | Slides | |
Wed 12/04 | TBD | Lecture 32 | Slides | |
Fri 12/06 | Final Project presentations 🦾 | | | Final Project report due |
Homework 1 contains detailed instructions for installing the virtual environment with all the packages required for this course. Below are more concise instructions for people already familiar with installing mamba and tensorflow:
- Install Mamba as described here. (See Homework 1 for additional instructions)
- Install Jupyter notebook and extensions in the base environment: `mamba install -c anaconda notebook nb_conda rise`
- Create a virtual environment for this course called '3dasm': `mamba create -n 3dasm python==3.10 numpy scipy matplotlib pandas scikit-learn ipykernel ipywidgets f3dasm`
- Install git, open a command window, and clone the repository to your computer: `git clone https://github.com/bessagroup/3dasm_course`
- Install tensorflow in the '3dasm' virtual environment. (See Homework 1 for additional instructions)
- Install scikeras in the '3dasm' virtual environment: activate it with `mamba activate 3dasm` and then run `pip install scikeras`
After you have installed every package, you are ready to go!
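Optionally, before launching Jupyter, you can verify that the core packages import correctly inside the '3dasm' environment. The sketch below only checks the packages listed in the `mamba create` command above; tensorflow, f3dasm, and scikeras can be checked the same way.

```python
# Optional sanity check: run inside the '3dasm' environment
# (e.g. after `mamba activate 3dasm`, in a Python session or a notebook cell).
import matplotlib
import numpy
import pandas
import scipy
import sklearn

# Print the installed version of each core package.
for module in (numpy, scipy, matplotlib, pandas, sklearn):
    print(f"{module.__name__} {module.__version__}")
```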
- Open a (mamba) command window and launch Jupyter notebook (it will open in your web browser): `jupyter notebook`
- Open a notebook (for example: 3dasm_course/Lectures/Lecture1/3dasm_Lecture1.ipynb) and choose the '3dasm' kernel.
You're all set!
Alternatively, you can run the notebooks on Google Colab:
- Go to Google Colab
- Log in with your credentials
- File > Open notebook
- Click on GitHub (no need to log in or authorize anything)
- Paste the git link: https://github.com/bessagroup/3dasm_course
- Click search and then click on the notebook (for example: 3dasm_course/Lectures/Lecture1/3dasm_Lecture1.ipynb)