Change Log
Added clarifying language about which clustering method is used for automatic hotspots, and tip for using hdbscan on smaller-scale data.
macOS 12.2.1 on Apple Silicon (M1 chip) with GPU acceleration
We've seen some great results from running multiple UMAP reductions of the 2,048-dimension feature space, and then animating between them.
To recap, you can execute pixplot with a runtime argument such as:
pixplot --images="images/*.jpg" --metadata="metadata.csv" --n_neighbors 5 10 20 --min_dist 0.01 0.1
...which will run UMAP with six different combinations of the n_neighbors and min_dist hyperparameters. Once this pixplot is built, you can animate between these various reduction states by using the two sliders exposed by clicking on the gear at the top right of your screen.
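The six layouts come from the Cartesian product of the supplied hyperparameter values (three values of n_neighbors times two values of min_dist). A minimal sketch of how such a sweep expands; the function name is illustrative, not pixplot's actual internals:

```python
from itertools import product

def expand_umap_grid(n_neighbors_values, min_dist_values):
    """Expand hyperparameter lists into every (n_neighbors, min_dist) pair."""
    return [
        {"n_neighbors": n, "min_dist": d}
        for n, d in product(n_neighbors_values, min_dist_values)
    ]

# The CLI invocation above supplies three n_neighbors and two min_dist values:
grid = expand_umap_grid([5, 10, 20], [0.01, 0.1])
print(len(grid))  # 6 combinations, one UMAP layout each
```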
Part of what makes this feature possible is the Aligned UMAP feature of Leland McInnes' package. This variant way of running multiple dimensionality reductions attempts to minimize the random variation between plural layouts, allowing you to focus on what's actually changing or moving between different hyperparameter settings.
However, we'd noticed that the highest value for n_neighbors -- regardless of what it was -- sometimes resulted in a blob of a reduction, compared with other values' more interesting layouts. With this commit, we're adjusting how we invoke Aligned UMAP slightly, and we seem to be getting better results for that highest value of n_neighbors. We'd welcome feedback from other folks who are exploring this feature.
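Aligned UMAP links corresponding points across layouts through a relations argument: one dict per pair of consecutive embeddings, mapping each point's index in one layout to its index in the next. Since pixplot projects the same images in every layout, those relations are identity mappings. A hedged sketch of that setup; the helper is ours, not pixplot's, and the commented-out call assumes umap-learn is installed:

```python
def identity_relations(n_points, n_layouts):
    """Build the relations dicts Aligned UMAP expects: for each consecutive
    pair of layouts, map every point index to itself, since the same images
    appear in every layout."""
    return [{i: i for i in range(n_points)} for _ in range(n_layouts - 1)]

relations = identity_relations(n_points=1000, n_layouts=6)

# With umap-learn installed, these relations would be passed along the
# lines of (feature_matrices is one copy of the features per layout):
# from umap import AlignedUMAP
# embeddings = AlignedUMAP(
#     n_neighbors=[5, 5, 10, 10, 20, 20],
#     min_dist=[0.01, 0.1, 0.01, 0.1, 0.01, 0.1],
# ).fit(feature_matrices, relations=relations)
```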
With this commit, we're enhancing the tools available to users for making considered selections of images from a pixplot visualization.
First, we've added the ability to shift-click images to add or remove them from the lasso'd group. You can use this to draw a big loop around the images you want, and then shift-click to selectively add or remove a few from the group.
Secondly, when you used the lasso to draw a line around a set of interesting images, you could bring up a "light table" view of those images, and use this gridded layout to selectively remove images from a list to be downloaded for other purposes. However, your choices in removing certain images (a zebra that snuck into a group of horses, for example) weren't reflected in the grouping 'behind' the light table view -- the images you had selected with the lasso originally. Now, they are.
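The shift-click behavior is effectively a toggle on the selected set: clicking an image adds it if absent and removes it if present. A minimal sketch of the semantics (names are illustrative, and we're sketching in Python rather than the project's browser-side JavaScript):

```python
def toggle_selection(selected, image_id):
    """Shift-clicking an image adds it if absent, removes it if present."""
    if image_id in selected:
        selected.remove(image_id)
    else:
        selected.add(image_id)
    return selected

# Lasso a group, then shift-click to drop a stray and add a missed image:
group = {"horse_01.jpg", "horse_02.jpg", "zebra_07.jpg"}
toggle_selection(group, "zebra_07.jpg")   # removes the stray zebra
toggle_selection(group, "horse_03.jpg")   # adds a missed horse
```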
We've added a feature to allow for an automated "tour" through a pixplot visualization. Adding #demo to the end of any pixplot URL, for example:
http://pixplot.yale.edu/v2/attract-mk/#demo
... will, after a few seconds, begin a tour through the hotspots that exist in the visualization. The camera will zoom in and out of each hotspot, and show a few representative images from each. The timing of the demonstration mode is customizable by looking at this section of the code:
this.delays = {
  initialize: 60000,      // ms of inactivity required to start attract mode
  layoutChange: 4000,     // ms of inactivity between zooming out and changing layout
  clusterZoom: 4000,      // ms between changing layout and flying to a cluster
  beforeLightbox: 4000,   // ms after zoom until we show the lightbox
  betweenLightbox: 3000,  // ms between images in the lightbox
  afterLightbox: 2000,    // ms after closing lightbox until next view
}
Visualizations built with pixplot after this commit will have the demo option available, but disabled by default. Add #demo onto the end of a URL to activate the demonstration mode.
We now support rapids.ai-accelerated UMAP dimensionality reduction, greatly reducing the time necessary to embed the 2,048-dimensional space into two dimensions. Here is a comparison of regular (umap-learn) and cuda-accelerated (rapids.ai) UMAP on the same dataset:
-----
Oslo dataset, no extra hyperparameters, regular (umap-learn) UMAP:
2021-07-20 16:18:44.520148: Creating UMAP layout
2021-07-20 16:19:08.822013: Creating umap pointgrid
= 24 seconds
-----
Oslo dataset, no extra hyperparameters, rapids.ai UMAP:
2021-07-20 17:21:41.096082: Creating UMAP layout
2021-07-20 17:21:43.098537: Creating umap pointgrid
= 2 seconds
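The elapsed times above come straight from the log timestamps; a quick check with the standard library:

```python
from datetime import datetime

def elapsed_seconds(start, end):
    """Seconds between two pixplot log timestamps."""
    fmt = "%Y-%m-%d %H:%M:%S.%f"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()

cpu = elapsed_seconds("2021-07-20 16:18:44.520148", "2021-07-20 16:19:08.822013")
gpu = elapsed_seconds("2021-07-20 17:21:41.096082", "2021-07-20 17:21:43.098537")
print(round(cpu), round(gpu))  # roughly 24 and 2 seconds: a ~12x speedup
```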
Due to the complexities of rapids.ai, we recommend setting up a conda environment as detailed on their setup page. Pixplot, which now uses TensorFlow 2.x (see below), can be set up in this conda environment and will automatically use GPU acceleration if the rapids.ai libraries are present and importable.
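The "present and importable" check can be sketched as a standard import-probe fallback; this is our illustration of the pattern, not pixplot's exact code:

```python
from importlib.util import find_spec

def pick_umap_backend():
    """Prefer the rapids.ai (cuml) GPU implementation when it is importable,
    otherwise fall back to CPU-based umap-learn."""
    if find_spec("cuml") is not None:
        return "cuml"        # GPU-accelerated UMAP from rapids.ai
    return "umap-learn"      # CPU implementation

backend = pick_umap_backend()
```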
Pixplot now uses TensorFlow 2.x instead of 1.x. This allows us to take advantage of newer GPUs, such as NVIDIA's 30x0 series, as well as GPU acceleration at other stages in our pipeline.
This shift does mean we have deprecated the TF1.x-based OpenPose support. We're looking at building a more robust framework for multiple kinds of specialized neural networks -- poses, faces, car makes, etc -- and hope to have some wiki pages about how to integrate these networks soon.