Skip to content

Commit

Permalink
Add penguin crossfiltering demo [build:penguin_crossfilter] (#121)
Browse files Browse the repository at this point in the history
* Add penguin crossfiltering demo

* Add hvplot dependency

* Add scipy dependency

* Various updates

* Fixed servable command

* Update penguin_crossfilter/penguin_crossfilter.ipynb

Co-authored-by: James A. Bednar <[email protected]>

* Update dependencies

* Fix flake

Co-authored-by: James A. Bednar <[email protected]>
  • Loading branch information
philippjfr and jbednar authored Dec 29, 2020
1 parent 95ea711 commit 72a0307
Show file tree
Hide file tree
Showing 7 changed files with 716 additions and 0 deletions.
57 changes: 57 additions & 0 deletions penguin_crossfilter/anaconda-project.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,57 @@
# To reproduce: install 'anaconda-project', then 'anaconda-project run'
name: penguins
description: Palmer Penguin Cross-Filtering
maintainers:
- philippjfr
labels:
- panel
- holoviews
- hvplot

user_fields: [labels, skip, maintainers]

channels:
- pyviz

packages: &pkgs
- python=3.7
- scipy
- spatialpandas=0.3.6
- bokeh=2.2.2
- notebook=5.7.8
- ipykernel=5.1.0
- pyviz_comms=2.0.0
- holoviews=1.14.1
- panel=0.10.2
- hvplot=0.6.0

dependencies: *pkgs

commands:
dashboard:
unix: panel serve penguin_crossfilter.ipynb
supports_http_options: true
notebook:
notebook: penguin_crossfilter.ipynb
test:
unix: pytest --nbsmoke-run -k *.ipynb --ignore envs
windows: pytest --nbsmoke-run -k *.ipynb --ignore envs
env_spec: test
lint:
unix: pytest --nbsmoke-lint -k *.ipynb --ignore envs
windows: pytest --nbsmoke-lint -k *.ipynb --ignore envs
env_spec: test

variables: {}
downloads: {}

env_specs:
default: {}
test:
packages:
- nbsmoke ==0.2.8
- pytest ==4.4.1
platforms:
- linux-64
- osx-64
- win-64
Binary file added penguin_crossfilter/culmen_depth.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added penguin_crossfilter/logo.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added penguin_crossfilter/lter_penguins.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
325 changes: 325 additions & 0 deletions penguin_crossfilter/penguin_crossfilter.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,325 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Cross-filtering Palmer Penguins"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"import holoviews as hv\n",
"import hvplot.pandas # noqa\n",
"\n",
"hv.extension('bokeh')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this introduction to building interactive dashboards we will primarily be using 4 libraries:\n",
" \n",
"1. **Pandas**: To load and manipulate the data\n",
"2. **hvPlot**: To quickly generate plots using a simple and familiar API\n",
"3. **HoloViews**: To link selections between plots easily\n",
"4. **Panel**: To build a dashboard we can deploy"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Building some plots\n",
"\n",
"Let us first load the Palmer penguin dataset ([Gorman et al.](https://allisonhorst.github.io/palmerpenguins/)) which contains measurements about a number of penguin species:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"penguins = pd.read_csv('penguins.csv')\n",
"penguins"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This diagram provides some background about what these measurements mean:\n",
"\n",
"<img src=\"culmen_depth.png\"></img>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next we define an explicit colormapping for each species:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"colors = {\n",
" 'Adelie Penguin': '#1f77b4',\n",
" 'Gentoo penguin': '#ff7f0e',\n",
" 'Chinstrap penguin': '#2ca02c'\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can start plotting the data with hvPlot, which provides a familiar API to pandas `.plot` users but generates interactive plots.\n",
"\n",
"We start with a simple scatter plot of the culmen (think bill) length and depth for each species:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"scatter = penguins.hvplot.scatter('Culmen Length (mm)', 'Culmen Depth (mm)', c='Species', cmap=colors, responsive=True, min_height=300)\n",
"scatter"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next we generate a histogram of the body mass colored by species:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"histogram = penguins.hvplot.hist('Body Mass (g)', by='Species', color=hv.dim('Species').categorize(colors), legend=False, alpha=0.5, responsive=True, min_height=300)\n",
"histogram"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next we count the number of individuals of each species and generate a bar plot:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"bars = penguins.hvplot.bar('Species', 'Individual ID', c='Species', cmap=colors, responsive=True, min_height=300).aggregate(function=np.count_nonzero)\n",
"bars"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally we generate violin plots of the flipper length of each species split by the sex:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"violin = penguins.hvplot.violin('Flipper Length (mm)', by=['Species', 'Sex'], cmap='Category20', responsive=True, min_height=300).opts(split='Sex')\n",
"\n",
"violin"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Linking the plots"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"hvPlot let us build interactive plots very quickly but what if we want to gain deeper insights about this data by selecting along one dimension and seeing that selection appear on other plots? Using HoloViews we can easily compose and link these plots:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ls = hv.link_selections.instance()\n",
"\n",
"ls(scatter.opts(show_legend=False) + bars + histogram + violin).cols(2)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Building a dashboard"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a final step we will compose these plots into a dashboard using Panel, so as a first step we will load the Palmer penguins logo:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import panel as pn\n",
"\n",
"logo = pn.panel('logo.png', height=60)\n",
"logo"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next we define use some functionality on the `link_selections` object to display the count of currently selected penguins:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def count(selected):\n",
" return pn.pane.Markdown(f\"## {len(selected)}/{len(penguins)} penguins selected\", align='center')\n",
"\n",
"pn.panel(pn.bind(count, ls.selection_param(penguins)))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we will compose these two items into a Row which will serve as the header of our dashboard:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"welcome = \"## Welcome and meet the Palmer penguins!\"\n",
"\n",
"penguins_art = pn.panel('./lter_penguins.png', width=250)\n",
"\n",
"credit = \"### Artwork by @allison_horst\"\n",
"\n",
"instructions = \"\"\"\n",
"Use the box-select and lasso-select tools to select a subset of penguins\n",
"and reveal more information about the selected subgroup through the power\n",
"of cross-filtering.\n",
"\"\"\"\n",
"\n",
"license = \"\"\"\n",
"### License\n",
"\n",
"Data are available by CC-0 license in accordance with the Palmer Station LTER Data Policy and the LTER Data Access Policy for Type I data.\"\n",
"\"\"\"\n",
"art = pn.Column(welcome, penguins_art, credit, instructions, license, sizing_mode='stretch_width')\n",
"art"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Deploying the dashboard"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"pn.config.raw_css.append(\"body .bk-root { font-family: Ubuntu !important; }\")\n",
"\n",
"material = pn.template.MaterialTemplate(logo='logo.png', title='Palmer Penguins')\n",
"\n",
"ls = hv.link_selections.instance()\n",
"\n",
"header = pn.Row(\n",
" pn.layout.HSpacer(),\n",
" pn.bind(count, ls.selection_param(penguins)),\n",
" pn.Spacer(width=100),\n",
" sizing_mode='stretch_width'\n",
")\n",
"header\n",
"\n",
"selections = ls(scatter.opts(show_legend=False) + bars.opts(show_legend=False) + histogram + violin).cols(2)\n",
"\n",
"material.header.append(header)\n",
"material.sidebar.append(art)\n",
"material.main.append(selections)\n",
"\n",
"material.servable();"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<div><img src=\"screenshot.png\" style=\"display: flex; margin: auto;\" width=\"80%\"></img></div>"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
Loading

0 comments on commit 72a0307

Please sign in to comment.