-
Notifications
You must be signed in to change notification settings - Fork 22
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
1 parent
7a27ec1
commit 20acf8a
Showing
2 changed files
with
866 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,382 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"import numpy as np\n", | ||
"\n", | ||
"import matplotlib.pyplot as plt\n", | ||
"%matplotlib inline" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Today's data\n", | ||
"\n", | ||
"400 fotos of human faces. Each face is a 2d array [64x64] of pixel brightness." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from sklearn.datasets import fetch_olivetti_faces\n", | ||
"data = fetch_olivetti_faces().images" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# @this code showcases matplotlib subplots. The syntax is: plt.subplot(height, width, index_starting_from_1)\n", | ||
"plt.subplot(2,2,1)\n", | ||
"plt.imshow(data[0],cmap='gray')\n", | ||
"plt.subplot(2,2,2)\n", | ||
"plt.imshow(data[1],cmap='gray')\n", | ||
"plt.subplot(2,2,3)\n", | ||
"plt.imshow(data[2],cmap='gray')\n", | ||
"plt.subplot(2,2,4)\n", | ||
"plt.imshow(data[3],cmap='gray')" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Face reconstruction problem\n", | ||
"\n", | ||
"Let's solve the face reconstruction problem: given left halves of facex __(X)__, our algorithm shall predict the right half __(y)__. Our first step is to slice the photos into X and y using slices.\n", | ||
"\n", | ||
"* In regular python, slice looks roughly like this: `a[2:5]` _(select elements from 2 to 5)_\n", | ||
"* Numpy allows you to slice N-dimensional arrays along each dimension: [image_index, height, width]\n", | ||
" * `data[:10]` - Select first 10 images\n", | ||
" * `data[:, :10]` - For all images, select a horizontal stripe 10 pixels high at the top of the image\n", | ||
" * `data[10:20, :, -25:-15]` - Take images [10, 11, ..., 19], for each image select a _vetrical stripe_ of width 10 pixels, 15 pixels away from the _right_ side.\n", | ||
"\n", | ||
"\n", | ||
"Let's use slices to select all __left image halves as X__ and all __right halves as y__." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# select left half of each face as X, right half as Y\n", | ||
"X = <Slice left half-images>\n", | ||
"y = <Slice right half-images>" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# If you did everything right, you're gonna see left half-image and right half-image drawn separately in natural order\n", | ||
"plt.subplot(1,2,1)\n", | ||
"plt.imshow(X[0],cmap='gray')\n", | ||
"plt.subplot(1,2,2)\n", | ||
"plt.imshow(y[0],cmap='gray')\n", | ||
"\n", | ||
"assert X.shape == y.shape == (len(data), 64, 32), \"Please slice exactly the left half-face to X and right half-face to Y\"" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"def glue(left_half,right_half):\n", | ||
" # merge photos back together\n", | ||
" left_half = left_half.reshape([-1,64,32])\n", | ||
" right_half = right_half.reshape([-1,64,32])\n", | ||
" return np.concatenate([left_half,right_half],axis=-1)\n", | ||
"\n", | ||
"\n", | ||
"# if you did everything right, you're gonna see a valid face\n", | ||
"plt.imshow(glue(X,y)[99],cmap='gray')" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Machine learning stuff" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from sklearn.model_selection import train_test_split\n", | ||
"X_train,X_test,Y_train,Y_test = train_test_split(X.reshape([len(X),-1]),\n", | ||
" y.reshape([len(y),-1]),\n", | ||
" test_size=0.05,random_state=42)\n", | ||
"\n", | ||
"print(X_test.shape)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from sklearn.linear_model import LinearRegression\n", | ||
"model = LinearRegression()\n", | ||
"model.fit(X_train,Y_train)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"measure mean squared error" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from sklearn.metrics import mean_squared_error\n", | ||
"\n", | ||
"print(\"Train MSE:\", mean_squared_error(Y_train,model.predict(X_train)))\n", | ||
"print(\"Test MSE:\", mean_squared_error(Y_test,model.predict(X_test)))" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Train predictions\n", | ||
"pics = glue(X_train,model.predict(X_train))\n", | ||
"plt.figure(figsize=[16,12])\n", | ||
"for i in range(20):\n", | ||
" plt.subplot(4,5,i+1)\n", | ||
" plt.imshow(pics[i],cmap='gray')" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Test predictions\n", | ||
"pics = glue(X_test,model.predict(X_test))\n", | ||
"plt.figure(figsize=[16,12])\n", | ||
"for i in range(20):\n", | ||
" plt.subplot(4,5,i+1)\n", | ||
" plt.imshow(pics[i],cmap='gray')" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"```\n", | ||
"\n", | ||
"```\n", | ||
"\n", | ||
"```\n", | ||
"\n", | ||
"```\n", | ||
"\n", | ||
"```\n", | ||
"\n", | ||
"```\n", | ||
"\n", | ||
"```\n", | ||
"\n", | ||
"```\n", | ||
"\n", | ||
"```\n", | ||
"\n", | ||
"```\n", | ||
"\n", | ||
"```\n", | ||
"\n", | ||
"```\n", | ||
"\n", | ||
"```\n", | ||
"\n", | ||
"```\n", | ||
"\n", | ||
"```\n", | ||
"\n", | ||
"```\n", | ||
"\n", | ||
"\n", | ||
"# Ridge regression\n", | ||
"RidgeRegression is just a LinearRegression, with l2 regularization - penalized for $ \\alpha \\cdot \\sum _i w_i^2$\n", | ||
"\n", | ||
"Let's train such a model with alpha=0.5" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"collapsed": true | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"from sklearn.linear_model import Ridge\n", | ||
"\n", | ||
"ridge = Ridge(alpha=0.5)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"<YOUR CODE: fit the model on training set>" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"collapsed": true | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"<YOUR CODE: predict and measure MSE on train and test>" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"collapsed": true | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# Test predictions\n", | ||
"pics = glue(X_test,ridge.predict(X_test))\n", | ||
"plt.figure(figsize=[16,12])\n", | ||
"for i in range(20):\n", | ||
" plt.subplot(4,5,i+1)\n", | ||
" plt.imshow(pics[i],cmap='gray')" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"```\n", | ||
"\n", | ||
"```\n", | ||
"\n", | ||
"```\n", | ||
"\n", | ||
"```\n", | ||
"\n", | ||
"```\n", | ||
"\n", | ||
"```\n", | ||
"\n", | ||
"```\n", | ||
"\n", | ||
"```\n", | ||
"\n", | ||
"```\n", | ||
"\n", | ||
"```\n", | ||
"\n", | ||
"```\n", | ||
"\n", | ||
"```\n", | ||
"\n", | ||
"```\n", | ||
"\n", | ||
"```\n", | ||
"\n", | ||
"```\n", | ||
"\n", | ||
"```\n", | ||
"\n", | ||
"# Grid search\n", | ||
"\n", | ||
"Train model with diferent $\\alpha$ and find one that has minimal test MSE. It's okay to use loops or any other python stuff here." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"collapsed": true | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"<YOUR CODE>" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"collapsed": true | ||
}, | ||
"outputs": [], | ||
"source": [] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"collapsed": true | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# Test predictions\n", | ||
"pics = glue(X_test,<predict with your best model>)\n", | ||
"plt.figure(figsize=[16,12])\n", | ||
"for i in range(20):\n", | ||
" plt.subplot(4,5,i+1)\n", | ||
" plt.imshow(pics[i],cmap='gray')" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.6.2" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 1 | ||
} |
Oops, something went wrong.