diff --git a/doc/pub/Introduction/html/Introduction-bs.html b/doc/pub/Introduction/html/Introduction-bs.html
index 2be47e2..dac87ad 100644
--- a/doc/pub/Introduction/html/Introduction-bs.html
+++ b/doc/pub/Introduction/html/Introduction-bs.html
@@ -372,7 +372,7 @@
Nuclear Talent course on Machine Learning in Nuclear Experiment and Theory<
-Sep 25, 2022
+Sep 24, 2023
diff --git a/doc/pub/Introduction/html/Introduction-reveal.html b/doc/pub/Introduction/html/Introduction-reveal.html
index c3bf6dc..2df9a24 100644
--- a/doc/pub/Introduction/html/Introduction-reveal.html
+++ b/doc/pub/Introduction/html/Introduction-reveal.html
@@ -184,7 +184,7 @@ Nuclear Talent course on Machine Learning in Nu
-Sep 25, 2022
+Sep 24, 2023
diff --git a/doc/pub/Introduction/html/Introduction-solarized.html b/doc/pub/Introduction/html/Introduction-solarized.html
index 819010b..1d73bd8 100644
--- a/doc/pub/Introduction/html/Introduction-solarized.html
+++ b/doc/pub/Introduction/html/Introduction-solarized.html
@@ -307,7 +307,7 @@ Nuclear Talent course on Machine Learning in Nuclear Experiment and Theory<
-Sep 25, 2022
+Sep 24, 2023
diff --git a/doc/pub/Introduction/html/Introduction.html b/doc/pub/Introduction/html/Introduction.html
index f5a049a..d49fa44 100644
--- a/doc/pub/Introduction/html/Introduction.html
+++ b/doc/pub/Introduction/html/Introduction.html
@@ -384,7 +384,7 @@ Nuclear Talent course on Machine Learning in Nuclear Experiment and Theory<
-Sep 25, 2022
+Sep 24, 2023
diff --git a/doc/pub/Introduction/ipynb/Introduction.ipynb b/doc/pub/Introduction/ipynb/Introduction.ipynb
index 6c9031e..37045bd 100644
--- a/doc/pub/Introduction/ipynb/Introduction.ipynb
+++ b/doc/pub/Introduction/ipynb/Introduction.ipynb
@@ -2,8 +2,10 @@
"cells": [
{
"cell_type": "markdown",
- "id": "c7ef21df",
- "metadata": {},
+ "id": "2274e914",
+ "metadata": {
+ "editable": true
+ },
"source": [
"\n",
@@ -12,19 +14,23 @@
},
{
"cell_type": "markdown",
- "id": "97f45f07",
- "metadata": {},
+ "id": "b82d8ae5",
+ "metadata": {
+ "editable": true
+ },
"source": [
"# Nuclear Talent course on Machine Learning in Nuclear Experiment and Theory\n",
"**[Morten Hjorth-Jensen](http://mhjgit.github.io/info/doc/web/)**, Department of Physics and Astronomy and Facility for Rare Isotope Beams, Michigan State University, East Lansing, Michigan, USA and Department of Physics and Center for Computing in Science Education, University of Oslo, Oslo, Norway\n",
"\n",
- "Date: **Sep 25, 2022**"
+ "Date: **Sep 24, 2023**"
]
},
{
"cell_type": "markdown",
- "id": "8a9bffe0",
- "metadata": {},
+ "id": "f44d0bba",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Introduction\n",
"\n",
@@ -50,8 +56,10 @@
},
{
"cell_type": "markdown",
- "id": "30c5ba3f",
- "metadata": {},
+ "id": "cce75d86",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Overview of these introductory notes\n",
"\n",
@@ -79,16 +87,20 @@
},
{
"cell_type": "markdown",
- "id": "6df0c291",
- "metadata": {},
+ "id": "55fda48b",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Machine Learning, short overview"
]
},
{
"cell_type": "markdown",
- "id": "ba4a69e2",
- "metadata": {},
+ "id": "5910126a",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Machine Learning, a small (and probably biased) introduction\n",
"\n",
@@ -105,8 +117,10 @@
},
{
"cell_type": "markdown",
- "id": "a3fb13f7",
- "metadata": {},
+ "id": "249492da",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Machine Learning, an extremely rich field\n",
"\n",
@@ -127,8 +141,10 @@
},
{
"cell_type": "markdown",
- "id": "1cf698ec",
- "metadata": {},
+ "id": "db10ddbd",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## A multidisciplinary approach\n",
"\n",
@@ -144,8 +160,10 @@
},
{
"cell_type": "markdown",
- "id": "c2c4c286",
- "metadata": {},
+ "id": "671a6b16",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Learning outcomes\n",
"\n",
@@ -175,8 +193,10 @@
},
{
"cell_type": "markdown",
- "id": "b3426c55",
- "metadata": {},
+ "id": "ec15b54c",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Types of Machine Learning\n",
"\n",
@@ -202,8 +222,10 @@
},
{
"cell_type": "markdown",
- "id": "d9ec7493",
- "metadata": {},
+ "id": "c35188ba",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Essential elements of ML\n",
"\n",
@@ -218,8 +240,10 @@
},
{
"cell_type": "markdown",
- "id": "8eef38f1",
- "metadata": {},
+ "id": "cb25a580",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## An optimization/minimization problem\n",
"\n",
@@ -228,8 +252,10 @@
},
{
"cell_type": "markdown",
- "id": "e6673eed",
- "metadata": {},
+ "id": "541b3d5f",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## A Frequentist approach to data analysis\n",
"\n",
@@ -261,8 +287,10 @@
},
{
"cell_type": "markdown",
- "id": "9500c61e",
- "metadata": {},
+ "id": "a1c241e4",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## What is a good model?\n",
"\n",
@@ -289,8 +317,10 @@
},
{
"cell_type": "markdown",
- "id": "93cb1262",
- "metadata": {},
+ "id": "4c4f8a91",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## What is a good model? Can we define it?\n",
"\n",
@@ -318,16 +348,20 @@
},
{
"cell_type": "markdown",
- "id": "5311869b",
- "metadata": {},
+ "id": "d1c2ef4f",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Practicalities, choice of programming language and other computational issues"
]
},
{
"cell_type": "markdown",
- "id": "f7dd05f6",
- "metadata": {},
+ "id": "cbbb5b96",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Choice of Programming Language\n",
"\n",
@@ -347,8 +381,10 @@
},
{
"cell_type": "markdown",
- "id": "ac6d9ab8",
- "metadata": {},
+ "id": "d2d07f41",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Software and needed installations\n",
"\n",
@@ -382,8 +418,10 @@
},
{
"cell_type": "markdown",
- "id": "de9b90af",
- "metadata": {},
+ "id": "f8d13461",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Python installers\n",
"\n",
@@ -413,8 +451,10 @@
},
{
"cell_type": "markdown",
- "id": "b9de94c5",
- "metadata": {},
+ "id": "72365393",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Useful Python libraries\n",
"Here we list several useful Python libraries we strongly recommend (if you use anaconda many of these are already there)\n",
@@ -444,16 +484,20 @@
},
{
"cell_type": "markdown",
- "id": "21296417",
- "metadata": {},
+ "id": "fd8e32a2",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## More Practicalities, handling arrays"
]
},
{
"cell_type": "markdown",
- "id": "8181ac33",
- "metadata": {},
+ "id": "60a853ce",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Basic Matrix Features, Numpy examples and Important Matrix and vector handling packages\n",
"\n",
@@ -462,8 +506,10 @@
},
{
"cell_type": "markdown",
- "id": "88a7cbf8",
- "metadata": {},
+ "id": "bfc788ca",
+ "metadata": {
+ "editable": true
+ },
"source": [
"$$\n",
"\\mathbf{A} =\n",
@@ -483,16 +529,20 @@
},
{
"cell_type": "markdown",
- "id": "a370cf16",
- "metadata": {},
+ "id": "cd78df52",
+ "metadata": {
+ "editable": true
+ },
"source": [
"The inverse of a matrix is defined by"
]
},
{
"cell_type": "markdown",
- "id": "c64d3ba1",
- "metadata": {},
+ "id": "8ef8314d",
+ "metadata": {
+ "editable": true
+ },
"source": [
"$$\n",
"\\mathbf{A}^{-1} \\cdot \\mathbf{A} = I\n",
@@ -501,8 +551,10 @@
},
{
"cell_type": "markdown",
- "id": "ea332ac4",
- "metadata": {},
+ "id": "750461b6",
+ "metadata": {
+ "editable": true
+ },
"source": [
"
\n",
"\n",
@@ -520,8 +572,10 @@
},
{
"cell_type": "markdown",
- "id": "d831a9cf",
- "metadata": {},
+ "id": "9d9ea5ef",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Some famous Matrices\n",
"\n",
@@ -546,8 +600,10 @@
},
{
"cell_type": "markdown",
- "id": "a0ef408a",
- "metadata": {},
+ "id": "d1643c35",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## More Basic Matrix Features\n",
"\n",
@@ -570,8 +626,10 @@
},
{
"cell_type": "markdown",
- "id": "c784a027",
- "metadata": {},
+ "id": "5c736fdb",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Numpy and arrays\n",
"[Numpy](http://www.numpy.org/) provides an easy way to handle arrays in Python. The standard way to import this library is as"
@@ -580,8 +638,11 @@
{
"cell_type": "code",
"execution_count": 1,
- "id": "02cd7073",
- "metadata": {},
+ "id": "8eab8905",
+ "metadata": {
+ "collapsed": false,
+ "editable": true
+ },
"outputs": [],
"source": [
"import numpy as np"
@@ -589,8 +650,10 @@
},
{
"cell_type": "markdown",
- "id": "1b389c59",
- "metadata": {},
+ "id": "67e213f3",
+ "metadata": {
+ "editable": true
+ },
"source": [
"Here follows a simple example where we set up an array of ten elements, all determined by random numbers drawn according to the normal distribution,"
]
@@ -598,18 +661,12 @@
{
"cell_type": "code",
"execution_count": 2,
- "id": "f277fec9",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "[ 1.24826685 1.20222048 0.17700469 0.12954931 -1.15497332 -0.43018674\n",
- " -0.00894835 -1.37166367 -1.10791697 0.16654221]\n"
- ]
- }
- ],
+ "id": "8a619af1",
+ "metadata": {
+ "collapsed": false,
+ "editable": true
+ },
+ "outputs": [],
"source": [
"n = 10\n",
"x = np.random.normal(size=n)\n",
@@ -618,8 +675,10 @@
},
{
"cell_type": "markdown",
- "id": "dbd22f9d",
- "metadata": {},
+ "id": "f5e4b34c",
+ "metadata": {
+ "editable": true
+ },
"source": [
"We defined a vector $x$ with $n=10$ elements with its values given by the Normal distribution $N(0,1)$.\n",
"Another alternative is to declare a vector as follows"
@@ -628,17 +687,12 @@
{
"cell_type": "code",
"execution_count": 3,
- "id": "27332d9c",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "[1 2 3]\n"
- ]
- }
- ],
+ "id": "96338fb0",
+ "metadata": {
+ "collapsed": false,
+ "editable": true
+ },
+ "outputs": [],
"source": [
"import numpy as np\n",
"x = np.array([1, 2, 3])\n",
@@ -647,8 +701,10 @@
},
{
"cell_type": "markdown",
- "id": "ca6a0178",
- "metadata": {},
+ "id": "0ae5f183",
+ "metadata": {
+ "editable": true
+ },
"source": [
"Here we have defined a vector with three elements, with $x_0=1$, $x_1=2$ and $x_2=3$. Note that both Python and C++\n",
"start numbering array elements from $0$ and on. This means that a vector with $n$ elements has a sequence of entities $x_0, x_1, x_2, \\dots, x_{n-1}$. We could also let (recommended) Numpy to compute the logarithms of a specific array as"
@@ -657,17 +713,12 @@
{
"cell_type": "code",
"execution_count": 4,
- "id": "d47de077",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "[1.38629436 1.94591015 2.07944154]\n"
- ]
- }
- ],
+ "id": "4c57fd82",
+ "metadata": {
+ "collapsed": false,
+ "editable": true
+ },
+ "outputs": [],
"source": [
"import numpy as np\n",
"x = np.log(np.array([4, 7, 8]))\n",
@@ -676,8 +727,10 @@
},
{
"cell_type": "markdown",
- "id": "5f379483",
- "metadata": {},
+ "id": "8194fadb",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## More Examples\n",
"\n",
@@ -693,17 +746,12 @@
{
"cell_type": "code",
"execution_count": 5,
- "id": "d7b14782",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "[1 1 2]\n"
- ]
- }
- ],
+ "id": "f13cf964",
+ "metadata": {
+ "collapsed": false,
+ "editable": true
+ },
+ "outputs": [],
"source": [
"import numpy as np\n",
"from math import log\n",
@@ -715,8 +763,10 @@
},
{
"cell_type": "markdown",
- "id": "1f1e87e1",
- "metadata": {},
+ "id": "605144ec",
+ "metadata": {
+ "editable": true
+ },
"source": [
"We note that our code is much longer already and we need to import the **log** function from the **math** module. \n",
"The attentive reader will also notice that the output is $[1, 1, 2]$. Python interprets automagically our numbers as integers (like the **automatic** keyword in C++). To change this we could define our array elements to be double precision numbers as"
@@ -725,17 +775,12 @@
{
"cell_type": "code",
"execution_count": 6,
- "id": "5f0a358a",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "[1.38629436 1.94591015 2.07944154]\n"
- ]
- }
- ],
+ "id": "b694f766",
+ "metadata": {
+ "collapsed": false,
+ "editable": true
+ },
+ "outputs": [],
"source": [
"import numpy as np\n",
"x = np.log(np.array([4, 7, 8], dtype = np.float64))\n",
@@ -744,8 +789,10 @@
},
{
"cell_type": "markdown",
- "id": "d0e99292",
- "metadata": {},
+ "id": "6457ab02",
+ "metadata": {
+ "editable": true
+ },
"source": [
"or simply write them as double precision numbers (Python uses 64 bits as default for floating point type variables), that is"
]
@@ -753,17 +800,12 @@
{
"cell_type": "code",
"execution_count": 7,
- "id": "8b836c8d",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "[1.38629436 1.94591015 2.07944154]\n"
- ]
- }
- ],
+ "id": "35a567ef",
+ "metadata": {
+ "collapsed": false,
+ "editable": true
+ },
+ "outputs": [],
"source": [
"import numpy as np\n",
"x = np.log(np.array([4.0, 7.0, 8.0]))\n",
@@ -772,8 +814,10 @@
},
{
"cell_type": "markdown",
- "id": "37e1c191",
- "metadata": {},
+ "id": "2c24fe55",
+ "metadata": {
+ "editable": true
+ },
"source": [
"To check the number of bytes (remember that one byte contains eight bits for double precision variables), you can use simple use the **itemsize** functionality (the array $x$ is actually an object which inherits the functionalities defined in Numpy) as"
]
@@ -781,17 +825,12 @@
{
"cell_type": "code",
"execution_count": 8,
- "id": "b8c854b1",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "[1.38629436 1.94591015 2.07944154]\n"
- ]
- }
- ],
+ "id": "2837c505",
+ "metadata": {
+ "collapsed": false,
+ "editable": true
+ },
+ "outputs": [],
"source": [
"import numpy as np\n",
"x = np.log(np.array([4.0, 7.0, 8.0]))\n",
@@ -800,8 +839,10 @@
},
{
"cell_type": "markdown",
- "id": "2bdfaf0c",
- "metadata": {},
+ "id": "c3328494",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Matrices in Python\n",
"\n",
@@ -813,19 +854,12 @@
{
"cell_type": "code",
"execution_count": 9,
- "id": "a83f8003",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "[[1.38629436 1.94591015 2.07944154]\n",
- " [1.09861229 2.30258509 2.39789527]\n",
- " [1.38629436 1.60943791 1.94591015]]\n"
- ]
- }
- ],
+ "id": "f13acaa9",
+ "metadata": {
+ "collapsed": false,
+ "editable": true
+ },
+ "outputs": [],
"source": [
"import numpy as np\n",
"A = np.log(np.array([ [4.0, 7.0, 8.0], [3.0, 10.0, 11.0], [4.0, 5.0, 7.0] ]))\n",
@@ -834,8 +868,10 @@
},
{
"cell_type": "markdown",
- "id": "54e22f3d",
- "metadata": {},
+ "id": "199a7303",
+ "metadata": {
+ "editable": true
+ },
"source": [
"If we use the **shape** function we would get $(3, 3)$ as output, that is verifying that our matrix is a $3\\times 3$ matrix. We can slice the matrix and print for example the first column (Python organized matrix elements in a row-major order, see below) as"
]
@@ -843,17 +879,12 @@
{
"cell_type": "code",
"execution_count": 10,
- "id": "1efa1ca9",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "[1.38629436 1.09861229 1.38629436]\n"
- ]
- }
- ],
+ "id": "d9123012",
+ "metadata": {
+ "collapsed": false,
+ "editable": true
+ },
+ "outputs": [],
"source": [
"import numpy as np\n",
"A = np.log(np.array([ [4.0, 7.0, 8.0], [3.0, 10.0, 11.0], [4.0, 5.0, 7.0] ]))\n",
@@ -863,8 +894,10 @@
},
{
"cell_type": "markdown",
- "id": "dccf8fb1",
- "metadata": {},
+ "id": "5575f5f5",
+ "metadata": {
+ "editable": true
+ },
"source": [
"We can continue this was by printing out other columns or rows. The example here prints out the second column"
]
@@ -872,17 +905,12 @@
{
"cell_type": "code",
"execution_count": 11,
- "id": "45b96449",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "[1.09861229 2.30258509 2.39789527]\n"
- ]
- }
- ],
+ "id": "a5bd9cff",
+ "metadata": {
+ "collapsed": false,
+ "editable": true
+ },
+ "outputs": [],
"source": [
"import numpy as np\n",
"A = np.log(np.array([ [4.0, 7.0, 8.0], [3.0, 10.0, 11.0], [4.0, 5.0, 7.0] ]))\n",
@@ -892,8 +920,10 @@
},
{
"cell_type": "markdown",
- "id": "c42595ae",
- "metadata": {},
+ "id": "5c669557",
+ "metadata": {
+ "editable": true
+ },
"source": [
"Numpy contains many other functionalities that allow us to slice, subdivide etc etc arrays. We strongly recommend that you look up the [Numpy website for more details](http://www.numpy.org/). Useful functions when defining a matrix are the **np.zeros** function which declares a matrix of a given dimension and sets all elements to zero"
]
@@ -901,8 +931,11 @@
{
"cell_type": "code",
"execution_count": 12,
- "id": "fed5b056",
- "metadata": {},
+ "id": "c2085403",
+ "metadata": {
+ "collapsed": false,
+ "editable": true
+ },
"outputs": [],
"source": [
"import numpy as np\n",
@@ -914,8 +947,10 @@
},
{
"cell_type": "markdown",
- "id": "e76b4c7e",
- "metadata": {},
+ "id": "0c6009e7",
+ "metadata": {
+ "editable": true
+ },
"source": [
"or initializing all elements to"
]
@@ -923,8 +958,11 @@
{
"cell_type": "code",
"execution_count": 13,
- "id": "2bb38dfb",
- "metadata": {},
+ "id": "184f59b0",
+ "metadata": {
+ "collapsed": false,
+ "editable": true
+ },
"outputs": [],
"source": [
"import numpy as np\n",
@@ -936,8 +974,10 @@
},
{
"cell_type": "markdown",
- "id": "a2c3ecb9",
- "metadata": {},
+ "id": "8db84aa6",
+ "metadata": {
+ "editable": true
+ },
"source": [
"or as unitarily distributed random numbers (see the material on random number generators in the statistics part)"
]
@@ -945,8 +985,11 @@
{
"cell_type": "code",
"execution_count": 14,
- "id": "8b30d53c",
- "metadata": {},
+ "id": "4bab2072",
+ "metadata": {
+ "collapsed": false,
+ "editable": true
+ },
"outputs": [],
"source": [
"import numpy as np\n",
@@ -958,8 +1001,10 @@
},
{
"cell_type": "markdown",
- "id": "c789e06d",
- "metadata": {},
+ "id": "8b426031",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## More Examples, Covariance matrix\n",
"\n",
@@ -970,8 +1015,10 @@
},
{
"cell_type": "markdown",
- "id": "70d079bf",
- "metadata": {},
+ "id": "2e80be67",
+ "metadata": {
+ "editable": true
+ },
"source": [
"$$\n",
"\\boldsymbol{\\Sigma} = \\begin{bmatrix} \\sigma_{xx} & \\sigma_{xy} & \\sigma_{xz} \\\\\n",
@@ -983,16 +1030,20 @@
},
{
"cell_type": "markdown",
- "id": "bbedf417",
- "metadata": {},
+ "id": "8c8eba8d",
+ "metadata": {
+ "editable": true
+ },
"source": [
"where for example"
]
},
{
"cell_type": "markdown",
- "id": "3bab3dc6",
- "metadata": {},
+ "id": "b13f4900",
+ "metadata": {
+ "editable": true
+ },
"source": [
"$$\n",
"\\sigma_{xy} =\\frac{1}{n} \\sum_{i=0}^{n-1}(x_i- \\overline{x})(y_i- \\overline{y}).\n",
@@ -1001,8 +1052,10 @@
},
{
"cell_type": "markdown",
- "id": "7c13419d",
- "metadata": {},
+ "id": "aa9ece7a",
+ "metadata": {
+ "editable": true
+ },
"source": [
"The Numpy function **np.cov** calculates the covariance elements using the factor $1/(n-1)$ instead of $1/n$ since it assumes we do not have the exact mean values. \n",
"The following simple function uses the **np.vstack** function which takes each vector of dimension $1\\times n$ and produces a $3\\times n$ matrix $\\boldsymbol{W}$"
@@ -1010,8 +1063,10 @@
},
{
"cell_type": "markdown",
- "id": "0b2767e2",
- "metadata": {},
+ "id": "d3a197a0",
+ "metadata": {
+ "editable": true
+ },
"source": [
"$$\n",
"\\boldsymbol{W} = \\begin{bmatrix} x_0 & y_0 & z_0 \\\\\n",
@@ -1026,8 +1081,10 @@
},
{
"cell_type": "markdown",
- "id": "6ab47f8e",
- "metadata": {},
+ "id": "d6c0d341",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## More on the Covariance Matrix\n",
"\n",
@@ -1041,8 +1098,11 @@
{
"cell_type": "code",
"execution_count": 15,
- "id": "b18f6bb7",
- "metadata": {},
+ "id": "64babda3",
+ "metadata": {
+ "collapsed": false,
+ "editable": true
+ },
"outputs": [],
"source": [
"# Importing various packages\n",
@@ -1064,16 +1124,20 @@
},
{
"cell_type": "markdown",
- "id": "29817f31",
- "metadata": {},
+ "id": "dcd59049",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Practicalities, Reminder on Statistics"
]
},
{
"cell_type": "markdown",
- "id": "08994c2b",
- "metadata": {},
+ "id": "024cb789",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Brief Reminder on Statistical Analysis\n",
"The *probability distribution function (PDF)* is a function\n",
@@ -1083,8 +1147,10 @@
},
{
"cell_type": "markdown",
- "id": "3faa4374",
- "metadata": {},
+ "id": "6d77c41f",
+ "metadata": {
+ "editable": true
+ },
"source": [
"$$\n",
"p(x) = \\mathrm{prob}(X=x)\n",
@@ -1093,8 +1159,10 @@
},
{
"cell_type": "markdown",
- "id": "3b474f5e",
- "metadata": {},
+ "id": "9370c4e0",
+ "metadata": {
+ "editable": true
+ },
"source": [
"In the continuous case, the PDF does not directly depict the\n",
"actual probability. Instead we define the probability for the\n",
@@ -1107,8 +1175,10 @@
},
{
"cell_type": "markdown",
- "id": "f355e7de",
- "metadata": {},
+ "id": "5ca77a5e",
+ "metadata": {
+ "editable": true
+ },
"source": [
"$$\n",
"\\mathrm{prob}(a\\leq X\\leq b) = \\int_a^b p(x)dx\n",
@@ -1117,8 +1187,10 @@
},
{
"cell_type": "markdown",
- "id": "e4c05358",
- "metadata": {},
+ "id": "115f2dc7",
+ "metadata": {
+ "editable": true
+ },
"source": [
"Qualitatively speaking, a stochastic variable represents the values of\n",
"numbers chosen as if by chance from some specified PDF so that the\n",
@@ -1127,8 +1199,10 @@
},
{
"cell_type": "markdown",
- "id": "2950a8b9",
- "metadata": {},
+ "id": "932260d0",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Statistics, moments\n",
"A particularly useful class of special expectation values are the\n",
@@ -1138,8 +1212,10 @@
},
{
"cell_type": "markdown",
- "id": "dd3b4aa6",
- "metadata": {},
+ "id": "4a082486",
+ "metadata": {
+ "editable": true
+ },
"source": [
"$$\n",
"\\langle x^n\\rangle \\equiv \\int\\! x^n p(x)\\,dx\n",
@@ -1148,8 +1224,10 @@
},
{
"cell_type": "markdown",
- "id": "82a0124d",
- "metadata": {},
+ "id": "e36ffa9b",
+ "metadata": {
+ "editable": true
+ },
"source": [
"The zero-th moment $\\langle 1\\rangle$ is just the normalization condition of\n",
"$p$. The first moment, $\\langle x\\rangle$, is called the *mean* of $p$\n",
@@ -1158,8 +1236,10 @@
},
{
"cell_type": "markdown",
- "id": "e9037771",
- "metadata": {},
+ "id": "a47cf536",
+ "metadata": {
+ "editable": true
+ },
"source": [
"$$\n",
"\\langle x\\rangle = \\mu \\equiv \\int\\! x p(x)\\,dx\n",
@@ -1168,8 +1248,10 @@
},
{
"cell_type": "markdown",
- "id": "83b3d249",
- "metadata": {},
+ "id": "d601e5ce",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Statistics, central moments\n",
"A special version of the moments is the set of *central moments*,\n",
@@ -1178,8 +1260,10 @@
},
{
"cell_type": "markdown",
- "id": "e058bada",
- "metadata": {},
+ "id": "82b67f16",
+ "metadata": {
+ "editable": true
+ },
"source": [
"$$\n",
"\\langle (x-\\langle x \\rangle )^n\\rangle \\equiv \\int\\! (x-\\langle x\\rangle)^n p(x)\\,dx\n",
@@ -1188,8 +1272,10 @@
},
{
"cell_type": "markdown",
- "id": "ea24bda3",
- "metadata": {},
+ "id": "31dfa639",
+ "metadata": {
+ "editable": true
+ },
"source": [
"The zero-th and first central moments are both trivial, equal $1$ and\n",
"$0$, respectively. But the second central moment, known as the\n",
@@ -1199,8 +1285,10 @@
},
{
"cell_type": "markdown",
- "id": "bd72bdca",
- "metadata": {},
+ "id": "e25c8e30",
+ "metadata": {
+ "editable": true
+ },
"source": [
"\n",
"\n",
@@ -1216,8 +1304,10 @@
},
{
"cell_type": "markdown",
- "id": "23f4ded6",
- "metadata": {},
+ "id": "726b6af6",
+ "metadata": {
+ "editable": true
+ },
"source": [
"\n",
"\n",
@@ -1233,8 +1323,10 @@
},
{
"cell_type": "markdown",
- "id": "fb6a8d9e",
- "metadata": {},
+ "id": "314cb1a2",
+ "metadata": {
+ "editable": true
+ },
"source": [
"\n",
"\n",
@@ -1249,8 +1341,10 @@
},
{
"cell_type": "markdown",
- "id": "978d68ec",
- "metadata": {},
+ "id": "dfbfce81",
+ "metadata": {
+ "editable": true
+ },
"source": [
"\n",
"\n",
@@ -1265,8 +1359,10 @@
},
{
"cell_type": "markdown",
- "id": "c8a80bcd",
- "metadata": {},
+ "id": "4717ef67",
+ "metadata": {
+ "editable": true
+ },
"source": [
"The square root of the variance, $\\sigma =\\sqrt{\\langle (x-\\langle x\\rangle)^2\\rangle}$ is called the *standard deviation* of $p$. It is clearly just the RMS (root-mean-square)\n",
"value of the deviation of the PDF from its mean value, interpreted\n",
@@ -1275,8 +1371,10 @@
},
{
"cell_type": "markdown",
- "id": "2378f1c8",
- "metadata": {},
+ "id": "983ba9ba",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Statistics, covariance\n",
"Another important quantity is the so called covariance, a variant of\n",
@@ -1288,8 +1386,10 @@
},
{
"cell_type": "markdown",
- "id": "fd3a99c1",
- "metadata": {},
+ "id": "1b095a8f",
+ "metadata": {
+ "editable": true
+ },
"source": [
"$$\n",
"\\mathrm{cov}(X_i,\\,X_j) \\equiv \\langle (x_i-\\langle x_i\\rangle)(x_j-\\langle x_j\\rangle)\\rangle\n",
@@ -1299,8 +1399,10 @@
},
{
"cell_type": "markdown",
- "id": "d281b532",
- "metadata": {},
+ "id": "603d72cc",
+ "metadata": {
+ "editable": true
+ },
"source": [
"\n",
"\n",
@@ -1317,16 +1419,20 @@
},
{
"cell_type": "markdown",
- "id": "2c05a33a",
- "metadata": {},
+ "id": "4a4becd9",
+ "metadata": {
+ "editable": true
+ },
"source": [
"with"
]
},
{
"cell_type": "markdown",
- "id": "e1554578",
- "metadata": {},
+ "id": "eb342bf7",
+ "metadata": {
+ "editable": true
+ },
"source": [
"$$\n",
"\\langle x_i\\rangle =\n",
@@ -1336,8 +1442,10 @@
},
{
"cell_type": "markdown",
- "id": "314edffc",
- "metadata": {},
+ "id": "4023d375",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Statistics, more covariance\n",
"If we consider the above covariance as a matrix $C_{ij}=\\mathrm{cov}(X_i,\\,X_j)$, then the diagonal elements are just the familiar\n",
@@ -1350,8 +1458,10 @@
},
{
"cell_type": "markdown",
- "id": "a8440bcb",
- "metadata": {},
+ "id": "5f793dd3",
+ "metadata": {
+ "editable": true
+ },
"source": [
"\n",
"\n",
@@ -1366,8 +1476,10 @@
},
{
"cell_type": "markdown",
- "id": "bd93e9ad",
- "metadata": {},
+ "id": "152feb6c",
+ "metadata": {
+ "editable": true
+ },
"source": [
"\n",
"\n",
@@ -1382,8 +1494,10 @@
},
{
"cell_type": "markdown",
- "id": "5da7529c",
- "metadata": {},
+ "id": "f29e7119",
+ "metadata": {
+ "editable": true
+ },
"source": [
"\n",
"\n",
@@ -1399,8 +1513,10 @@
},
{
"cell_type": "markdown",
- "id": "bb3ead26",
- "metadata": {},
+ "id": "25f62c6c",
+ "metadata": {
+ "editable": true
+ },
"source": [
"\n",
"\n",
@@ -1416,8 +1532,10 @@
},
{
"cell_type": "markdown",
- "id": "e273da78",
- "metadata": {},
+ "id": "16bf231d",
+ "metadata": {
+ "editable": true
+ },
"source": [
"\n",
"\n",
@@ -1432,8 +1550,10 @@
},
{
"cell_type": "markdown",
- "id": "a8072593",
- "metadata": {},
+ "id": "187dfb80",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Statistics, independent variables\n",
"If $X_i$ and $X_j$ are independent, we get \n",
@@ -1444,8 +1564,10 @@
},
{
"cell_type": "markdown",
- "id": "5a7e9409",
- "metadata": {},
+ "id": "e03a7a5f",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Statistics, more variance\n",
"Since the variance is just $\\mathrm{var}(X_i) = \\mathrm{cov}(X_i, X_i)$, we get\n",
@@ -1454,8 +1576,10 @@
},
{
"cell_type": "markdown",
- "id": "b0827cfa",
- "metadata": {},
+ "id": "d29e9ca5",
+ "metadata": {
+ "editable": true
+ },
"source": [
"\n",
"\n",
@@ -1470,8 +1594,10 @@
},
{
"cell_type": "markdown",
- "id": "0b3bdfc5",
- "metadata": {},
+ "id": "b4baf0f6",
+ "metadata": {
+ "editable": true
+ },
"source": [
"And in the special case when the stochastic variables are\n",
"uncorrelated, the off-diagonal elements of the covariance are as we\n",
@@ -1480,8 +1606,10 @@
},
{
"cell_type": "markdown",
- "id": "52383e37",
- "metadata": {},
+ "id": "96ee711d",
+ "metadata": {
+ "editable": true
+ },
"source": [
"$$\n",
"\\mathrm{var}(U)=\\sum_i a_i^2\\mathrm{cov}(X_i, X_i) = \\sum_i a_i^2 \\mathrm{var}(X_i),\n",
@@ -1490,16 +1618,20 @@
},
{
"cell_type": "markdown",
- "id": "ef4cd7b0",
- "metadata": {},
+ "id": "476e7381",
+ "metadata": {
+ "editable": true
+ },
"source": [
"and"
]
},
{
"cell_type": "markdown",
- "id": "81b21052",
- "metadata": {},
+ "id": "3f7ca529",
+ "metadata": {
+ "editable": true
+ },
"source": [
"$$\n",
"\\mathrm{var}(\\sum_i a_i X_i) = \\sum_i a_i^2 \\mathrm{var}(X_i)\n",
@@ -1508,8 +1640,10 @@
},
{
"cell_type": "markdown",
- "id": "6b176024",
- "metadata": {},
+ "id": "1f4379b7",
+ "metadata": {
+ "editable": true
+ },
"source": [
"which will become very useful in our study of the error in the mean\n",
"value of a set of measurements."
@@ -1517,8 +1651,10 @@
},
{
"cell_type": "markdown",
- "id": "984bc899",
- "metadata": {},
+ "id": "4abdd76e",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Statistics and stochastic processes\n",
"A *stochastic process* is a process that produces sequentially a\n",
@@ -1527,8 +1663,10 @@
},
{
"cell_type": "markdown",
- "id": "2102af12",
- "metadata": {},
+ "id": "0545b421",
+ "metadata": {
+ "editable": true
+ },
"source": [
"$$\n",
"\\{x_1, x_2,\\dots\\,x_k,\\dots\\}.\n",
@@ -1537,8 +1675,10 @@
},
{
"cell_type": "markdown",
- "id": "0d14864b",
- "metadata": {},
+ "id": "9d2d86b8",
+ "metadata": {
+ "editable": true
+ },
"source": [
"We will call these\n",
"values our *measurements* and the entire set as our measured\n",
@@ -1556,8 +1696,10 @@
},
{
"cell_type": "markdown",
- "id": "eb7e7146",
- "metadata": {},
+ "id": "c806ed42",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Statistics and sample variables\n",
"In practical situations a sample is always of finite size. Let that\n",
@@ -1566,8 +1708,10 @@
},
{
"cell_type": "markdown",
- "id": "d3376649",
- "metadata": {},
+ "id": "f567485a",
+ "metadata": {
+ "editable": true
+ },
"source": [
"$$\n",
"\\bar{x}_n \\equiv \\frac{1}{n}\\sum_{k=1}^n x_k\n",
@@ -1576,16 +1720,20 @@
},
{
"cell_type": "markdown",
- "id": "0efde8df",
- "metadata": {},
+ "id": "ed2efdcf",
+ "metadata": {
+ "editable": true
+ },
"source": [
"The *sample variance* is:"
]
},
{
"cell_type": "markdown",
- "id": "9911c9bb",
- "metadata": {},
+ "id": "687a3113",
+ "metadata": {
+ "editable": true
+ },
"source": [
"$$\n",
"\\mathrm{var}(x) \\equiv \\frac{1}{n}\\sum_{k=1}^n (x_k - \\bar{x}_n)^2\n",
@@ -1594,8 +1742,10 @@
},
{
"cell_type": "markdown",
- "id": "550ca901",
- "metadata": {},
+ "id": "6b2b529b",
+ "metadata": {
+ "editable": true
+ },
"source": [
"its square root being the *standard deviation of the sample*. The\n",
"*sample covariance* is:"
@@ -1603,8 +1753,10 @@
},
{
"cell_type": "markdown",
- "id": "51ed53a4",
- "metadata": {},
+ "id": "334af3f9",
+ "metadata": {
+ "editable": true
+ },
"source": [
"$$\n",
"\\mathrm{cov}(x)\\equiv\\frac{1}{n}\\sum_{kl}(x_k - \\bar{x}_n)(x_l - \\bar{x}_n)\n",
@@ -1613,8 +1765,10 @@
},
{
"cell_type": "markdown",
- "id": "fb01f2f3",
- "metadata": {},
+ "id": "bf1509e8",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Statistics, sample variance and covariance\n",
"Note that the sample variance is the sample covariance without the\n",
@@ -1631,8 +1785,10 @@
},
{
"cell_type": "markdown",
- "id": "d88f8573",
- "metadata": {},
+ "id": "e3997fdf",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Statistics, law of large numbers\n",
"The law of large numbers\n",
@@ -1642,8 +1798,10 @@
},
{
"cell_type": "markdown",
- "id": "6144afbf",
- "metadata": {},
+ "id": "c8232d08",
+ "metadata": {
+ "editable": true
+ },
"source": [
"$$\n",
"\\lim_{n\\to\\infty}\\bar{x}_n = \\mu_X^{\\phantom X}\n",
@@ -1652,8 +1810,10 @@
},
{
"cell_type": "markdown",
- "id": "9ab77fa8",
- "metadata": {},
+ "id": "f28dbe3a",
+ "metadata": {
+ "editable": true
+ },
"source": [
"The sample mean $\\bar{x}_n$ works therefore as an estimate of the true\n",
"mean $\\mu_X^{\\phantom X}$.\n",
@@ -1673,8 +1833,10 @@
},
{
"cell_type": "markdown",
- "id": "7dfc17ac",
- "metadata": {},
+ "id": "52685df5",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Statistics, more on sample error\n",
"Let us first take a look at what happens to the sample error as the\n",
@@ -1686,8 +1848,10 @@
},
{
"cell_type": "markdown",
- "id": "f1fe449a",
- "metadata": {},
+ "id": "17b667c8",
+ "metadata": {
+ "editable": true
+ },
"source": [
"$$\n",
"\\overline X_n = \\frac{1}{n}\\sum_{i=1}^n X_i\n",
@@ -1696,8 +1860,10 @@
},
{
"cell_type": "markdown",
- "id": "ccb63cef",
- "metadata": {},
+ "id": "c597107c",
+ "metadata": {
+ "editable": true
+ },
"source": [
"All the coefficients are equal to $1/n$. The PDF of $\\overline X_n$,\n",
"denoted by $p_{\\overline X_n}(x)$, is the desired PDF of the sample\n",
@@ -1706,8 +1872,10 @@
},
{
"cell_type": "markdown",
- "id": "43d6b34b",
- "metadata": {},
+ "id": "3d2d0c07",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Statistics\n",
"The probability density of obtaining a sample mean $\\bar x_n$\n",
@@ -1718,8 +1886,10 @@
},
{
"cell_type": "markdown",
- "id": "26cdc0fa",
- "metadata": {},
+ "id": "34766f17",
+ "metadata": {
+ "editable": true
+ },
"source": [
"$$\n",
"p_{\\overline X_n}(x) = \\int p_X^{\\phantom X}(x_1)\\cdots\n",
@@ -1730,16 +1900,20 @@
},
{
"cell_type": "markdown",
- "id": "180b736d",
- "metadata": {},
+ "id": "53342f5a",
+ "metadata": {
+ "editable": true
+ },
"source": [
"And in particular we are interested in its variance $\\mathrm{var}(\\overline X_n)$."
]
},
{
"cell_type": "markdown",
- "id": "bc497962",
- "metadata": {},
+ "id": "9b5e9c80",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Statistics, central limit theorem\n",
"It is generally not possible to express $p_{\\overline X_n}(x)$ in a\n",
@@ -1753,8 +1927,10 @@
},
{
"cell_type": "markdown",
- "id": "94f0e9ad",
- "metadata": {},
+ "id": "8e5251c1",
+ "metadata": {
+ "editable": true
+ },
"source": [
"\n",
"\n",
@@ -1771,8 +1947,10 @@
},
{
"cell_type": "markdown",
- "id": "177a4e65",
- "metadata": {},
+ "id": "0fc50642",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Covariance example\n",
"\n",
@@ -1782,8 +1960,10 @@
},
{
"cell_type": "markdown",
- "id": "b8647838",
- "metadata": {},
+ "id": "d2f9a0c8",
+ "metadata": {
+ "editable": true
+ },
"source": [
"$$\n",
"\\boldsymbol{\\Sigma} = \\begin{bmatrix} \\sigma_{xx} & \\sigma_{xy} & \\sigma_{xz} \\\\\n",
@@ -1795,16 +1975,20 @@
},
{
"cell_type": "markdown",
- "id": "1fb8ee64",
- "metadata": {},
+ "id": "8faf4662",
+ "metadata": {
+ "editable": true
+ },
"source": [
"where for example"
]
},
{
"cell_type": "markdown",
- "id": "459b5931",
- "metadata": {},
+ "id": "af30715f",
+ "metadata": {
+ "editable": true
+ },
"source": [
"$$\n",
"\\sigma_{xy} =\\frac{1}{n} \\sum_{i=0}^{n-1}(x_i- \\overline{x})(y_i- \\overline{y}).\n",
@@ -1813,8 +1997,10 @@
},
{
"cell_type": "markdown",
- "id": "996f5655",
- "metadata": {},
+ "id": "0a37e1be",
+ "metadata": {
+ "editable": true
+ },
"source": [
"The Numpy function **np.cov** calculates the covariance elements using\n",
"the factor $1/(n-1)$ instead of $1/n$ since it assumes we do not have\n",
@@ -1827,8 +2013,10 @@
},
{
"cell_type": "markdown",
- "id": "950e7dbc",
- "metadata": {},
+ "id": "54464f41",
+ "metadata": {
+ "editable": true
+ },
"source": [
"$$\n",
"\\boldsymbol{W} = \\begin{bmatrix} x_0 & y_0 & z_0 \\\\\n",
@@ -1843,8 +2031,10 @@
},
{
"cell_type": "markdown",
- "id": "2946ac65",
- "metadata": {},
+ "id": "52e31f84",
+ "metadata": {
+ "editable": true
+ },
"source": [
"which in turn is converted into the $3\\times 3$ covariance matrix\n",
"$\\boldsymbol{\\Sigma}$ via the Numpy function **np.cov()**. We note that we can\n",
@@ -1856,31 +2046,23 @@
},
{
"cell_type": "markdown",
- "id": "3b643958",
- "metadata": {},
+ "id": "fa89bf2a",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Covariance in numpy"
]
},
{
"cell_type": "code",
- "execution_count": 12,
- "id": "858e99b8",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "0.09163914747417538\n",
- "4.375115764185571\n",
- "0.6892618254093753\n",
- "[[ 1.3495844 4.03015049 4.44924571]\n",
- " [ 4.03015049 13.06494732 13.31433367]\n",
- " [ 4.44924571 13.31433367 22.33736214]]\n"
- ]
- }
- ],
+ "execution_count": 16,
+ "id": "9d21a1c1",
+ "metadata": {
+ "collapsed": false,
+ "editable": true
+ },
+ "outputs": [],
"source": [
"# Importing various packages\n",
"import numpy as np\n",
@@ -1899,16 +2081,20 @@
},
{
"cell_type": "markdown",
- "id": "71d921e7",
- "metadata": {},
+ "id": "737f1bd2",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Practicalities, Useful Python Packages"
]
},
{
"cell_type": "markdown",
- "id": "5e4ede91",
- "metadata": {},
+ "id": "df198d83",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Meet the Pandas\n",
"\n",
@@ -1930,82 +2116,13 @@
},
{
"cell_type": "code",
- "execution_count": 13,
- "id": "4ec4aa9e",
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "
\n",
- " \n",
- " \n",
- " | \n",
- " First Name | \n",
- " Last Name | \n",
- " Place of birth | \n",
- " Date of Birth T.A. | \n",
- "
\n",
- " \n",
- " \n",
- " \n",
- " 0 | \n",
- " Frodo | \n",
- " Baggins | \n",
- " Shire | \n",
- " 2968 | \n",
- "
\n",
- " \n",
- " 1 | \n",
- " Bilbo | \n",
- " Baggins | \n",
- " Shire | \n",
- " 2890 | \n",
- "
\n",
- " \n",
- " 2 | \n",
- " Aragorn II | \n",
- " Elessar | \n",
- " Eriador | \n",
- " 2931 | \n",
- "
\n",
- " \n",
- " 3 | \n",
- " Samwise | \n",
- " Gamgee | \n",
- " Shire | \n",
- " 2980 | \n",
- "
\n",
- " \n",
- "
\n",
- "
"
- ],
- "text/plain": [
- " First Name Last Name Place of birth Date of Birth T.A.\n",
- "0 Frodo Baggins Shire 2968\n",
- "1 Bilbo Baggins Shire 2890\n",
- "2 Aragorn II Elessar Eriador 2931\n",
- "3 Samwise Gamgee Shire 2980"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
+ "execution_count": 17,
+ "id": "600ab712",
+ "metadata": {
+ "collapsed": false,
+ "editable": true
+ },
+ "outputs": [],
"source": [
"import pandas as pd\n",
"from IPython.display import display\n",
@@ -2020,8 +2137,10 @@
},
{
"cell_type": "markdown",
- "id": "e2a3cdc5",
- "metadata": {},
+ "id": "ad6a38f9",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## Data Frames in Pandas\n",
"\n",
@@ -2033,82 +2152,13 @@
},
{
"cell_type": "code",
- "execution_count": 14,
- "id": "1cec5f2d",
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "
\n",
- " \n",
- " \n",
- " | \n",
- " First Name | \n",
- " Last Name | \n",
- " Place of birth | \n",
- " Date of Birth T.A. | \n",
- "
\n",
- " \n",
- " \n",
- " \n",
- " Frodo | \n",
- " Frodo | \n",
- " Baggins | \n",
- " Shire | \n",
- " 2968 | \n",
- "
\n",
- " \n",
- " Bilbo | \n",
- " Bilbo | \n",
- " Baggins | \n",
- " Shire | \n",
- " 2890 | \n",
- "
\n",
- " \n",
- " Aragorn | \n",
- " Aragorn II | \n",
- " Elessar | \n",
- " Eriador | \n",
- " 2931 | \n",
- "
\n",
- " \n",
- " Sam | \n",
- " Samwise | \n",
- " Gamgee | \n",
- " Shire | \n",
- " 2980 | \n",
- "
\n",
- " \n",
- "
\n",
- "
"
- ],
- "text/plain": [
- " First Name Last Name Place of birth Date of Birth T.A.\n",
- "Frodo Frodo Baggins Shire 2968\n",
- "Bilbo Bilbo Baggins Shire 2890\n",
- "Aragorn Aragorn II Elessar Eriador 2931\n",
- "Sam Samwise Gamgee Shire 2980"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
+ "execution_count": 18,
+ "id": "2befa1e2",
+ "metadata": {
+ "collapsed": false,
+ "editable": true
+ },
+ "outputs": [],
"source": [
"data_pandas = pd.DataFrame(data,index=['Frodo','Bilbo','Aragorn','Sam'])\n",
"display(data_pandas)"
@@ -2116,130 +2166,46 @@
},
{
"cell_type": "markdown",
- "id": "79adee29",
- "metadata": {},
+ "id": "ca5f1c39",
+ "metadata": {
+ "editable": true
+ },
"source": [
"Thereafter we display the content of the row with the index label **Aragorn**"
]
},
{
"cell_type": "code",
- "execution_count": 15,
- "id": "526647ff",
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "First Name Aragorn II\n",
- "Last Name Elessar\n",
- "Place of birth Eriador\n",
- "Date of Birth T.A. 2931\n",
- "Name: Aragorn, dtype: object"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
+ "execution_count": 19,
+ "id": "c4105f00",
+ "metadata": {
+ "collapsed": false,
+ "editable": true
+ },
+ "outputs": [],
"source": [
"display(data_pandas.loc['Aragorn'])"
]
},
{
"cell_type": "markdown",
- "id": "2ddd01cc",
- "metadata": {},
+ "id": "9690a7aa",
+ "metadata": {
+ "editable": true
+ },
"source": [
"We can easily append data to this data frame, for example"
]
},
{
"cell_type": "code",
- "execution_count": 16,
- "id": "8299fdd9",
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "
\n",
- " \n",
- " \n",
- " | \n",
- " First Name | \n",
- " Last Name | \n",
- " Place of birth | \n",
- " Date of Birth T.A. | \n",
- "
\n",
- " \n",
- " \n",
- " \n",
- " Frodo | \n",
- " Frodo | \n",
- " Baggins | \n",
- " Shire | \n",
- " 2968 | \n",
- "
\n",
- " \n",
- " Bilbo | \n",
- " Bilbo | \n",
- " Baggins | \n",
- " Shire | \n",
- " 2890 | \n",
- "
\n",
- " \n",
- " Aragorn | \n",
- " Aragorn II | \n",
- " Elessar | \n",
- " Eriador | \n",
- " 2931 | \n",
- "
\n",
- " \n",
- " Sam | \n",
- " Samwise | \n",
- " Gamgee | \n",
- " Shire | \n",
- " 2980 | \n",
- "
\n",
- " \n",
- " Pippin | \n",
- " Peregrin | \n",
- " Took | \n",
- " Shire | \n",
- " 2990 | \n",
- "
\n",
- " \n",
- "
\n",
- "
"
- ],
- "text/plain": [
- " First Name Last Name Place of birth Date of Birth T.A.\n",
- "Frodo Frodo Baggins Shire 2968\n",
- "Bilbo Bilbo Baggins Shire 2890\n",
- "Aragorn Aragorn II Elessar Eriador 2931\n",
- "Sam Samwise Gamgee Shire 2980\n",
- "Pippin Peregrin Took Shire 2990"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
+ "execution_count": 20,
+ "id": "51c3f637",
+ "metadata": {
+ "collapsed": false,
+ "editable": true
+ },
+ "outputs": [],
"source": [
"new_hobbit = {'First Name': [\"Peregrin\"],\n",
" 'Last Name': [\"Took\"],\n",
@@ -2252,8 +2218,10 @@
},
{
"cell_type": "markdown",
- "id": "b400f59e",
- "metadata": {},
+ "id": "de931d19",
+ "metadata": {
+ "editable": true
+ },
"source": [
"## More Pandas\n",
"\n",
@@ -2263,289 +2231,13 @@
},
{
"cell_type": "code",
- "execution_count": 17,
- "id": "6ad49953",
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "
\n",
- " \n",
- " \n",
- " | \n",
- " 0 | \n",
- " 1 | \n",
- " 2 | \n",
- " 3 | \n",
- " 4 | \n",
- "
\n",
- " \n",
- " \n",
- " \n",
- " 0 | \n",
- " -1.749765 | \n",
- " 0.342680 | \n",
- " 1.153036 | \n",
- " -0.252436 | \n",
- " 0.981321 | \n",
- "
\n",
- " \n",
- " 1 | \n",
- " 0.514219 | \n",
- " 0.221180 | \n",
- " -1.070043 | \n",
- " -0.189496 | \n",
- " 0.255001 | \n",
- "
\n",
- " \n",
- " 2 | \n",
- " -0.458027 | \n",
- " 0.435163 | \n",
- " -0.583595 | \n",
- " 0.816847 | \n",
- " 0.672721 | \n",
- "
\n",
- " \n",
- " 3 | \n",
- " -0.104411 | \n",
- " -0.531280 | \n",
- " 1.029733 | \n",
- " -0.438136 | \n",
- " -1.118318 | \n",
- "
\n",
- " \n",
- " 4 | \n",
- " 1.618982 | \n",
- " 1.541605 | \n",
- " -0.251879 | \n",
- " -0.842436 | \n",
- " 0.184519 | \n",
- "
\n",
- " \n",
- " 5 | \n",
- " 0.937082 | \n",
- " 0.731000 | \n",
- " 1.361556 | \n",
- " -0.326238 | \n",
- " 0.055676 | \n",
- "
\n",
- " \n",
- " 6 | \n",
- " 0.222400 | \n",
- " -1.443217 | \n",
- " -0.756352 | \n",
- " 0.816454 | \n",
- " 0.750445 | \n",
- "
\n",
- " \n",
- " 7 | \n",
- " -0.455947 | \n",
- " 1.189622 | \n",
- " -1.690617 | \n",
- " -1.356399 | \n",
- " -1.232435 | \n",
- "
\n",
- " \n",
- " 8 | \n",
- " -0.544439 | \n",
- " -0.668172 | \n",
- " 0.007315 | \n",
- " -0.612939 | \n",
- " 1.299748 | \n",
- "
\n",
- " \n",
- " 9 | \n",
- " -1.733096 | \n",
- " -0.983310 | \n",
- " 0.357508 | \n",
- " -1.613579 | \n",
- " 1.470714 | \n",
- "
\n",
- " \n",
- "
\n",
- "
"
- ],
- "text/plain": [
- " 0 1 2 3 4\n",
- "0 -1.749765 0.342680 1.153036 -0.252436 0.981321\n",
- "1 0.514219 0.221180 -1.070043 -0.189496 0.255001\n",
- "2 -0.458027 0.435163 -0.583595 0.816847 0.672721\n",
- "3 -0.104411 -0.531280 1.029733 -0.438136 -1.118318\n",
- "4 1.618982 1.541605 -0.251879 -0.842436 0.184519\n",
- "5 0.937082 0.731000 1.361556 -0.326238 0.055676\n",
- "6 0.222400 -1.443217 -0.756352 0.816454 0.750445\n",
- "7 -0.455947 1.189622 -1.690617 -1.356399 -1.232435\n",
- "8 -0.544439 -0.668172 0.007315 -0.612939 1.299748\n",
- "9 -1.733096 -0.983310 0.357508 -1.613579 1.470714"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "0 -0.175300\n",
- "1 0.083527\n",
- "2 -0.044334\n",
- "3 -0.399836\n",
- "4 0.331939\n",
- "dtype: float64\n",
- "0 1.069584\n",
- "1 0.965548\n",
- "2 1.018232\n",
- "3 0.793167\n",
- "4 0.918992\n",
- "dtype: float64\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "
\n",
- " \n",
- " \n",
- " | \n",
- " 0 | \n",
- " 1 | \n",
- " 2 | \n",
- " 3 | \n",
- " 4 | \n",
- "
\n",
- " \n",
- " \n",
- " \n",
- " 0 | \n",
- " 3.061679 | \n",
- " 0.117430 | \n",
- " 1.329492 | \n",
- " 0.063724 | \n",
- " 0.962990 | \n",
- "
\n",
- " \n",
- " 1 | \n",
- " 0.264421 | \n",
- " 0.048920 | \n",
- " 1.144993 | \n",
- " 0.035909 | \n",
- " 0.065026 | \n",
- "
\n",
- " \n",
- " 2 | \n",
- " 0.209789 | \n",
- " 0.189367 | \n",
- " 0.340583 | \n",
- " 0.667239 | \n",
- " 0.452553 | \n",
- "
\n",
- " \n",
- " 3 | \n",
- " 0.010902 | \n",
- " 0.282259 | \n",
- " 1.060349 | \n",
- " 0.191963 | \n",
- " 1.250636 | \n",
- "
\n",
- " \n",
- " 4 | \n",
- " 2.621102 | \n",
- " 2.376547 | \n",
- " 0.063443 | \n",
- " 0.709698 | \n",
- " 0.034047 | \n",
- "
\n",
- " \n",
- " 5 | \n",
- " 0.878123 | \n",
- " 0.534362 | \n",
- " 1.853835 | \n",
- " 0.106431 | \n",
- " 0.003100 | \n",
- "
\n",
- " \n",
- " 6 | \n",
- " 0.049462 | \n",
- " 2.082875 | \n",
- " 0.572069 | \n",
- " 0.666597 | \n",
- " 0.563167 | \n",
- "
\n",
- " \n",
- " 7 | \n",
- " 0.207888 | \n",
- " 1.415201 | \n",
- " 2.858185 | \n",
- " 1.839818 | \n",
- " 1.518895 | \n",
- "
\n",
- " \n",
- " 8 | \n",
- " 0.296414 | \n",
- " 0.446453 | \n",
- " 0.000054 | \n",
- " 0.375694 | \n",
- " 1.689345 | \n",
- "
\n",
- " \n",
- " 9 | \n",
- " 3.003620 | \n",
- " 0.966899 | \n",
- " 0.127812 | \n",
- " 2.603636 | \n",
- " 2.162999 | \n",
- "
\n",
- " \n",
- "
\n",
- "
"
- ],
- "text/plain": [
- " 0 1 2 3 4\n",
- "0 3.061679 0.117430 1.329492 0.063724 0.962990\n",
- "1 0.264421 0.048920 1.144993 0.035909 0.065026\n",
- "2 0.209789 0.189367 0.340583 0.667239 0.452553\n",
- "3 0.010902 0.282259 1.060349 0.191963 1.250636\n",
- "4 2.621102 2.376547 0.063443 0.709698 0.034047\n",
- "5 0.878123 0.534362 1.853835 0.106431 0.003100\n",
- "6 0.049462 2.082875 0.572069 0.666597 0.563167\n",
- "7 0.207888 1.415201 2.858185 1.839818 1.518895\n",
- "8 0.296414 0.446453 0.000054 0.375694 1.689345\n",
- "9 3.003620 0.966899 0.127812 2.603636 2.162999"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
+ "execution_count": 21,
+ "id": "a5e79c53",
+ "metadata": {
+ "collapsed": false,
+ "editable": true
+ },
+ "outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
@@ -2564,198 +2256,23 @@
},
{
"cell_type": "markdown",
- "id": "c1243846",
- "metadata": {},
+ "id": "cc8a1c21",
+ "metadata": {
+ "editable": true
+ },
"source": [
"Thereafter we can select specific columns and plot the final results"
]
},
{
"cell_type": "code",
- "execution_count": 18,
- "id": "b8544a35",
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "
\n",
- " \n",
- " \n",
- " | \n",
- " First | \n",
- " Second | \n",
- " Third | \n",
- " Fourth | \n",
- " Fifth | \n",
- "
\n",
- " \n",
- " \n",
- " \n",
- " 0 | \n",
- " -1.749765 | \n",
- " 0.342680 | \n",
- " 1.153036 | \n",
- " -0.252436 | \n",
- " 0.981321 | \n",
- "
\n",
- " \n",
- " 1 | \n",
- " 0.514219 | \n",
- " 0.221180 | \n",
- " -1.070043 | \n",
- " -0.189496 | \n",
- " 0.255001 | \n",
- "
\n",
- " \n",
- " 2 | \n",
- " -0.458027 | \n",
- " 0.435163 | \n",
- " -0.583595 | \n",
- " 0.816847 | \n",
- " 0.672721 | \n",
- "
\n",
- " \n",
- " 3 | \n",
- " -0.104411 | \n",
- " -0.531280 | \n",
- " 1.029733 | \n",
- " -0.438136 | \n",
- " -1.118318 | \n",
- "
\n",
- " \n",
- " 4 | \n",
- " 1.618982 | \n",
- " 1.541605 | \n",
- " -0.251879 | \n",
- " -0.842436 | \n",
- " 0.184519 | \n",
- "
\n",
- " \n",
- " 5 | \n",
- " 0.937082 | \n",
- " 0.731000 | \n",
- " 1.361556 | \n",
- " -0.326238 | \n",
- " 0.055676 | \n",
- "
\n",
- " \n",
- " 6 | \n",
- " 0.222400 | \n",
- " -1.443217 | \n",
- " -0.756352 | \n",
- " 0.816454 | \n",
- " 0.750445 | \n",
- "
\n",
- " \n",
- " 7 | \n",
- " -0.455947 | \n",
- " 1.189622 | \n",
- " -1.690617 | \n",
- " -1.356399 | \n",
- " -1.232435 | \n",
- "
\n",
- " \n",
- " 8 | \n",
- " -0.544439 | \n",
- " -0.668172 | \n",
- " 0.007315 | \n",
- " -0.612939 | \n",
- " 1.299748 | \n",
- "
\n",
- " \n",
- " 9 | \n",
- " -1.733096 | \n",
- " -0.983310 | \n",
- " 0.357508 | \n",
- " -1.613579 | \n",
- " 1.470714 | \n",
- "
\n",
- " \n",
- "
\n",
- "
"
- ],
- "text/plain": [
- " First Second Third Fourth Fifth\n",
- "0 -1.749765 0.342680 1.153036 -0.252436 0.981321\n",
- "1 0.514219 0.221180 -1.070043 -0.189496 0.255001\n",
- "2 -0.458027 0.435163 -0.583595 0.816847 0.672721\n",
- "3 -0.104411 -0.531280 1.029733 -0.438136 -1.118318\n",
- "4 1.618982 1.541605 -0.251879 -0.842436 0.184519\n",
- "5 0.937082 0.731000 1.361556 -0.326238 0.055676\n",
- "6 0.222400 -1.443217 -0.756352 0.816454 0.750445\n",
- "7 -0.455947 1.189622 -1.690617 -1.356399 -1.232435\n",
- "8 -0.544439 -0.668172 0.007315 -0.612939 1.299748\n",
- "9 -1.733096 -0.983310 0.357508 -1.613579 1.470714"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "0.08352721390288316\n",
- "\n",
- "Int64Index: 10 entries, 0 to 9\n",
- "Data columns (total 5 columns):\n",
- " # Column Non-Null Count Dtype \n",
- "--- ------ -------------- ----- \n",
- " 0 First 10 non-null float64\n",
- " 1 Second 10 non-null float64\n",
- " 2 Third 10 non-null float64\n",
- " 3 Fourth 10 non-null float64\n",
- " 4 Fifth 10 non-null float64\n",
- "dtypes: float64(5)\n",
- "memory usage: 480.0 bytes\n",
- "None\n",
- " First Second Third Fourth Fifth\n",
- "count 10.000000 10.000000 10.000000 10.000000 10.000000\n",
- "mean -0.175300 0.083527 -0.044334 -0.399836 0.331939\n",
- "std 1.069584 0.965548 1.018232 0.793167 0.918992\n",
- "min -1.749765 -1.443217 -1.690617 -1.613579 -1.232435\n",
- "25% -0.522836 -0.633949 -0.713163 -0.785061 0.087887\n",
- "50% -0.280179 0.281930 -0.122282 -0.382187 0.463861\n",
- "75% 0.441264 0.657041 0.861676 -0.205231 0.923602\n",
- "max 1.618982 1.541605 1.361556 0.816847 1.470714\n"
- ]
- },
- {
- "data": {
- "image/png": "\n",
- "text/plain": [
- "