Merge pull request #2 from mike-ivs/gh-pages

Merge new changes from Mikes repo
carpentries-incubator · Mar 21, 2023 · 7d311dd
2 parents 29da1b6 + 79f4c55
commit 7d311dd
Showing 8 changed files with 105 additions and 74 deletions.
diff --git a/_episodes/02-regressionJens.md b/02-regressionJens.md
diff --git a/_episodes/01-introduction.md b/_episodes/01-introduction.md
@@ -1,10 +1,10 @@
 ---
 title: "Introduction"
-teaching: 30
+teaching: 20
 exercises: 10
 questions:
-- What is machine learning?
-- What are some useful machine learning techniques?
+- "What is machine learning?"
+- "What are some useful machine learning techniques?"
 objectives:
 - "Gain an overview of what machine learning is and the techniques available."
 - "Understand how machine learning and artificial intelligence differ."
@@ -19,7 +19,7 @@ keypoints:
 
 # What is machine learning?
 
-Machine learning is a set of techniques that enable computers to improve in their performance of a given task. This is similar in concept to how humans learn to make predictions based upon previous experience and knowledge. Machine learning encompasses a wide range of activities, but broadly speaking it can be used to: find trends in a dataset, classify data into groups or categories, make decisions and predictions based upon data, and even "learn" how to interact with an environment when provided with goals to achieve.
+Machine learning is a set of techniques that enable computers to use data to improve in their performance of a given task. This is similar in concept to how humans learn to make predictions based upon previous experience and knowledge. Machine learning encompasses a wide range of activities, but broadly speaking it can be used to: find trends in a dataset, classify data into groups or categories, make decisions and predictions based upon data, and even "learn" how to interact with an environment when provided with goals to achieve.
 
 ### Machine learning in our daily lives
 
@@ -45,15 +45,26 @@ Machine learning has quickly become an important technology and is now frequentl
 
 The term machine learning (ML) is often mentioned alongside artificial intelligence (AI) and deep learning (DL). Deep learning is a subset of machine learning, and machine learning is a subset of artificial intelligence.
 
-AI is a broad term used to describe a system possessing a "general intelligence" that can be applied to solve problems, often mimicking the behaviour of intelligent biological systems. Another definition of AI dates back to the 1950s and Alan Turing's "Immitation Game". Turing said we could consider a system intelligent when it could fool a human into thinking they were talking to another human when they were actually talking to a computer. Modern attempts are getting close to fooling humans, but although there have been great advances in AI and ML research, human-like intelligence is only possible in a few specialist areas.
+AI is a broad term used to describe a system possessing a "general intelligence" that can be applied to solve a diverse range of problems, often mimicking the behaviour of intelligent biological systems. Another definition of AI dates back to the 1950s and Alan Turing's "Imitation Game". Turing said we could consider a system intelligent when it could fool a human into thinking they were talking to another human when they were actually talking to a computer. Modern attempts are getting close to fooling humans, but although there have been great advances in AI and ML research, human-like intelligence is only possible in a few specialist areas.
 
-ML refers to techniques where a computer can "learn" patterns in data, usually by being shown many training examples. While computers can learn to solve specific problems, or multiple similar problems, they are not considered to possess a general intelligence. Computers often need hundreds or thousands of examples to learn a task and are confined to relatively simple classifications. A human-like system could learn much quicker, and potentially learn from a single example by using it's knowledge of many other problems.
+ML refers to techniques where a computer can "learn" patterns in data, usually by being shown many training examples. While ML-algorithms can learn to solve specific problems, or multiple similar problems, they are not considered to possess a general intelligence. ML-algorithms often need hundreds or thousands of examples to learn a task and are confined to tasks such as simple classifications. A human-like system could learn much quicker than this, and potentially learn from a single example by using its knowledge of many other problems.
 
-DL is a particular field of machine learning where algorithms called neural networks are used to create highly-complex systems. Large collections of neural networks are able to learn from vast quantities of data. Deep learning can be used to solve a wide range of problems, but it can also require huge amounts of input data and computational resources to train. The image below shows some of the relationships between artificial intelligence, machine learning and deep learning.
+DL is a particular field of machine learning where algorithms called neural networks are used to create highly-complex systems. Large collections of neural networks are able to learn from vast quantities of data. Deep learning can be used to solve a wide range of problems, but it can also require huge amounts of input data and computational resources to train. 
+
+The image below shows the relationships between artificial intelligence, machine learning and deep learning.
 
 ![An infographic showing some of the relationships between AI, ML, and DL](../fig/01_AI_ML_DL_differences.png)
 The image above is by Tukijaaliwa, CC BY-SA 4.0, via Wikimedia Commons, original source
 
+> ## Where have you encountered machine learning already?
+> Now that we have explored machine learning in a bit more detail, discuss with the person next to you:
+>
+> 1. Where have I seen machine learning in use?
+> 2. What kind of input data does that machine learning system use to make predictions/classifications?
+> 3. Is there any evidence that your interaction with the system contributes to further training?
+> 4. Do you have any examples of the system failing?
+{: .challenge}
+
 # What are some useful types of Machine Learning?
 
 This lesson will introduce you to some of the key concepts and sub-domains of ML such as supervised learning, unsupervised learning, and neural networks.
@@ -67,7 +78,7 @@ The figure below provides a nice overview of some of the sub-domains of ML and t
 
 ### Garbage in = garbage out
 
-There is a classic expression in computer science, "garbage in = garbage out". This means that if the input data we use is garbage then the ouput will be too. If, for eample, we try to use a machine learning system to find a link between two unlinked variables then it may well manage to produce a model attempting this, but the output will be meaningless. 
+There is a classic expression in computer science, "garbage in = garbage out". This means that if the input data we use is garbage then the output will be too. If, for example, we try to use a machine learning system to find a link between two unlinked variables then it may well manage to produce a model attempting this, but the output will be meaningless.
 
 ### Biases due to training data
 
@@ -85,13 +96,4 @@ Sometimes ML algorithms become over-trained and subsequently don't perform well
 
 Machine learning techniques will return an answer based on the input data and model parameters even if that answer is wrong. Most systems are unable to explain the logic used to arrive at that answer. This can make detecting and diagnosing problems difficult. 
 
-> ## Where have you encountered machine learning already?
-> Now that we have explored machine learning in a bit more detail, discuss with the person next to you:
->
-> 1. Where have I seen machine learning in use?
-> 2. What kind of input data does that machine learning system use to make predictions/classifications?
-> 3. Is there any evidence that your interaction with the system contributes to further training?
-> 4. Do you have any examples of the system failing?
-{: .challenge}
-
 {% include links.md %}
diff --git a/_episodes/02-regression.md b/_episodes/02-regression.md
@@ -3,11 +3,11 @@ title: "Regression"
 teaching: 30
 exercises: 20
 questions:
-- "How can I process data using Scikit-Learn?"
+- "What is Supervised Learning?"
+- "How can I model data and make predictions using regression?"
 objectives:
-- "Be aware of the built-in linear regression functions in Scikit-Learn."
-- "Measure the error between a regression model and real data."
 - "Apply linear regression with Scikit-Learn to create a model."
+- "Measure the error between a regression model and real data."
 - "Analyse and assess the accuracy of a linear model using Scikit-Learn's metrics library."
 - "Understand how more complex models can be built with non-linear equations."
 - "Apply polynomial modelling to non-linear data using Scikit-Learn."
@@ -17,15 +17,23 @@ keypoints:
 - "Scikit Learn includes a polynomial modelling function which is useful for modelling non-linear data."
 ---
 
- ## About Scikit-Learn
+# About Scikit-Learn
 
 [Scikit-Learn](http://github.com/scikit-learn/scikit-learn) is a Python package designed to give access to well-known machine learning algorithms within Python code, through a clean API. It has been built by hundreds of contributors from around the world, and is used across industry and academia.
 
 Scikit-Learn is built upon Python's [NumPy (Numerical Python)](http://numpy.org) and [SciPy (Scientific Python)](http://scipy.org) libraries, which enable efficient in-core numerical and scientific computation within Python. As such, Scikit-Learn is not specifically designed for extremely large datasets, though there is [some work](https://github.com/ogrisel/parallel_ml_tutorial) in this area. For this introduction to ML we are going to stick to processing small to medium datasets with Scikit-Learn, without the need for a graphics processing unit (GPU).
 
-# Supervised Learning intro
+# Supervised Learning
+
+Classical machine learning is often divided into two categories – Supervised and Unsupervised Learning. 
+
+For the case of supervised learning we act as a "supervisor" or "teacher" for our ML-algorithms by providing the algorithm with "labelled data" that contains example answers of what we wish the algorithm to achieve. 
+
+For instance, if we wish to train our algorithm to distinguish between images of cats and dogs, we would provide our algorithm with images that have already been labelled as "cat" or "dog" so that it can learn from these examples. If we wished to train our algorithm to predict house prices over time we would provide our algorithm with example data of house prices that are "labelled" with time values.
+
+Supervised learning is split up into two further categories: classification and regression. For classification the labelled data is discrete, such as the "cat" or "dog" example, whereas for regression the labelled data is continuous, such as the house price example.
 
-blah
+In this episode we will explore how we can use regression to build a "model" that can be used to make predictions.
 
 ## Linear Regression with Scikit-Learn