Expand alt text in episode 2 #510

Merged
merged 3 commits into from
Jul 30, 2024
Changes from 1 commit
15 changes: 9 additions & 6 deletions episodes/2-keras.Rmd
@@ -76,10 +76,10 @@
The `palmerpenguins` data contains size measurements for three penguin species observed on three islands in the Palmer Archipelago, Antarctica.
The physical attributes measured are flipper length, beak length, beak width, body mass, and sex.

![*Artwork by @allison_horst*][palmer-penguins]

Check warning on line 79 in episodes/2-keras.Rmd

GitHub Actions / Build markdown source files if valid

[image missing alt-text]: fig/palmer_penguins.png


![*Artwork by @allison_horst*][penguin-beaks]

Check warning on line 82 in episodes/2-keras.Rmd

GitHub Actions / Build markdown source files if valid

[image missing alt-text]: fig/culmen_depth.png


These data were collected from 2007 - 2009 by Dr. Kristen Gorman with the [Palmer Station Long Term Ecological Research Program](https://pal.lternet.edu/), part of the [US Long Term Ecological Research Network](https://lternet.edu/). The data were imported directly from the [Environmental Data Initiative](https://environmentaldatainitiative.org/) (EDI) Data Portal, and are available for use by CC0 license ("No Rights Reserved") in accordance with the [Palmer Station Data Policy](https://pal.lternet.edu/data/policies).
@@ -140,7 +140,7 @@
sns.pairplot(penguins, hue="species")
```

![][pairplot]
![][pairplot]{alt='Grid of scatter plots and histograms comparing observed values of the four physical attributes (variables) measured in the penguins sampled. Scatter plots illustrate the distribution of values observed for each pair of variables. On the diagonal, where one physical attribute would be compared with itself, histograms are displayed that show the distribution of values observed for that attribute, coloured according to the species of the individual sampled. The pair plot shows distinct but overlapping clusters of data points representing the different species, with no pair of variables providing a clean separation of clusters on its own.'}
tobyhodges marked this conversation as resolved.

::: challenge

@@ -165,7 +165,7 @@
sns.pairplot(penguins, hue='sex')
```

![][sex_pairplot]
![][sex_pairplot]{alt='Grid of scatter plots and histograms comparing observed values of the four physical attributes (variables) measured in the penguins sampled, with data points coloured according to the sex of the individual sampled. The pair plot shows similarly shaped distributions of values observed for each variable in male and female penguins, with the distribution of measurements for females skewed towards smaller values.'}
tobyhodges marked this conversation as resolved.

You see that for each species females have smaller bills and flippers, as well as a smaller body mass.
You would need a combination of the species and the numerical features to successfully distinguish males from females.
@@ -526,7 +526,7 @@
```python
sns.lineplot(x=history.epoch, y=history.history['loss'])
```
![][training_curve]

Check warning on line 529 in episodes/2-keras.Rmd

GitHub Actions / Build markdown source files if valid

[image missing alt-text]: fig/02_training_curve.png

This plot can be used to identify whether the training is well configured or whether there
are problems that need to be addressed.
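
As a rough illustration of what reading a training curve can involve, the sketch below reduces a list of per-epoch loss values to two indicators: how much of the initial loss was removed, and how often the loss went up between consecutive epochs. `summarize_loss_curve` is a hypothetical helper written for this example, not part of Keras or seaborn, and the numbers passed to it are made-up values rather than results from the lesson.

```python
def summarize_loss_curve(losses):
    """Summarize a list of per-epoch loss values with two rough indicators."""
    if len(losses) < 2:
        raise ValueError("need at least two epochs to describe a curve")
    # Fraction of the initial loss that was removed by the end of training.
    relative_drop = (losses[0] - losses[-1]) / losses[0]
    # Share of epoch-to-epoch steps where the loss went up instead of down.
    steps = list(zip(losses, losses[1:]))
    upward_fraction = sum(1 for prev, nxt in steps if nxt > prev) / len(steps)
    return relative_drop, upward_fraction


# Made-up curves for illustration only (not taken from the lesson's figures).
smooth = [2.0, 1.0, 0.5, 0.25]
jittery = [4.0, 2.0, 4.0, 2.0, 3.0]
print(summarize_loss_curve(smooth))
print(summarize_loss_curve(jittery))
```

Assuming `history` is the object used in the plotting call above, `summarize_loss_curve(history.history['loss'])` would apply the same summary to a real run. A small relative drop combined with a large share of upward steps suggests a jittery curve, but the plot itself remains the better diagnostic.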
@@ -545,7 +545,7 @@

3. (optional) Something went wrong here during training. What could be the problem, and how do you see that in the training curve?
Also compare the range on the y-axis with the previous training curve.
![](../fig/02_bad_training_history_1.png){alt='Very jittery training curve with the loss value jumping back and forth between 2 and 4. The range of the y-axis is from 2 to 4, whereas in the previous training curve it was from 0 to 2. The loss seems to decrease a little bit, but not as much as in the previous plot, where it dropped to almost 0. The minimum loss in the end is somewhere around 2.'}
![][bad-training-curve]

Check warning on line 548 in episodes/2-keras.Rmd

GitHub Actions / Build markdown source files if valid

[missing file]: [](../fig/02_bad_training_history_1.png)
[image missing alt-text]: ../fig/02_bad_training_history_1.png

:::: solution
## Solution
@@ -686,7 +686,7 @@
```python
sns.heatmap(confusion_df, annot=True)
```
![][confusion_matrix]

Check warning on line 689 in episodes/2-keras.Rmd

GitHub Actions / Build markdown source files if valid

[image missing alt-text]: fig/confusion_matrix.png

::: challenge
## Confusion Matrix
@@ -771,17 +771,20 @@
{alt='Illustration of the three species of penguins found in the Palmer Archipelago, Antarctica: Chinstrap, Gentoo and Adelie'}

[penguin-beaks]: fig/culmen_depth.png "Culmen Depth"
{alt='Illustration of the beak dimensions called culmen length and culmen depth in the dataset'}
{alt='Illustration of how the beak dimensions were measured. In the raw data, bill dimensions are recorded as "culmen length" and "culmen depth". The culmen is the dorsal ridge atop the bill.'}

[pairplot]: fig/pairplot.png "Pair Plot"
{alt='Pair plot showing the separability of the three species of penguin for combinations of dataset attributes'}
{alt='Grid of scatter plots and histograms comparing observed values of the four physical attributes (variables) measured in the penguins sampled. Scatter plots illustrate the distribution of values observed for each pair of variables. On the diagonal, where one physical attribute would be compared with itself, histograms are displayed that show the distribution of values observed for that attribute, coloured according to the species of the individual sampled. The pair plot shows distinct but overlapping clusters of data points representing the different species, with no pair of variables providing a clean separation of clusters on its own.'}

[sex_pairplot]: fig/02_sex_pairplot.png "Pair plot grouped by sex"
{alt='Pair plot showing the separability of the two sexes of penguin for combinations of dataset attributes'}
{alt='Grid of scatter plots and histograms comparing observed values of the four physical attributes (variables) measured in the penguins sampled, with data points coloured according to the sex of the individual sampled. The pair plot shows similarly shaped distributions of values observed for each variable in male and female penguins, with the distribution of measurements for females skewed towards smaller values.'}

[training_curve]: fig/02_training_curve.png "Training Curve"
{alt='Training loss curve of the neural network training which depicts exponential decrease in loss before a plateau from ~10 epochs'}

[bad-training-curve]: ../fig/02_bad_training_history_1.png "Training Curve Gone Wrong"
{alt='Very jittery training curve with the loss value jumping back and forth between 2 and 4. The range of the y-axis is from 2 to 4, whereas in the previous training curve it was from 0 to 2. The loss seems to decrease a little bit, but not as much as in the previous plot, where it dropped to almost 0. The minimum loss in the end is somewhere around 2.'}

[confusion_matrix]: fig/confusion_matrix.png "Confusion Matrix"
{alt='Confusion matrix of the test set with high accuracy for Adelie and Gentoo classification and no correctly predicted Chinstrap'}
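
The warnings in this diff flag images whose markup carries no `alt` attribute. A minimal sketch of a comparable local check is shown below, assuming that alt text is only recognised when it appears in a `{alt='...'}` block directly after the image markup. `images_missing_alt` and the regular expressions are hypothetical and written for this example; this is not the validation that the GitHub Actions workflow actually runs.

```python
import re

# Matches an inline image ![caption](target) or a reference-style image
# ![caption][label], optionally followed by a {...} attribute block.
IMAGE_RE = re.compile(
    r"!\[(?P<caption>[^\]]*)\]"
    r"(?:\((?P<target>[^)]*)\)|\[(?P<label>[^\]]*)\])"
    r"(?P<attrs>\{[^}]*\})?"
)
# An attribute block counts as providing alt text if it contains "alt=".
ALT_RE = re.compile(r"\balt\s*=")


def images_missing_alt(markdown_text):
    """Return (line_number, image_markup) pairs for images without alt text."""
    missing = []
    for line_number, line in enumerate(markdown_text.splitlines(), start=1):
        for match in IMAGE_RE.finditer(line):
            attrs = match.group("attrs") or ""
            if not ALT_RE.search(attrs):
                missing.append((line_number, match.group(0)))
    return missing


# Small made-up sample in the style of the lesson's image markup.
sample = (
    "![][pairplot]\n"
    "![][sex_pairplot]{alt='Pair plot grouped by sex'}\n"
    "![](../fig/x.png)\n"
)
for line_number, markup in images_missing_alt(sample):
    print(f"line {line_number}: image missing alt text: {markup}")
```

The check is line-based, so an attribute block placed on the line after a reference definition is not treated as alt text for the image that uses that reference.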
