Include explained variance in PCA component plot #1239
Explained variance ratio (as percentage) included in brackets for each dimension in component plot, including 3D.
@bbengfort how do you feel about these changes?
@gregparkes in a PCA projection, how would you use the explained variance in the legend? E.g. would you trust a projection more if the sum of the explained variance percentage was greater than 85%, or differentiate between the different axes based on their explained variance? Would you mind attaching a figure produced from your changes to help us understand how the legend influences analysis of the visualization? The primary use of this projection is as a high-dimensional data visualization tool, the goal of which is to discern separability between classes or other patterns that might be easy to model. Explained variance is a useful tool for understanding a PCA projection, and we have a work-in-progress explained variance visualizer here: #1037. If you're interested, that tool could definitely use some help getting to the finish line!
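The 85% question above can be made concrete with a minimal sketch. It assumes the per-component ratios are already available (for example from the explained_variance_ratio_ attribute of a fitted sklearn PCA model); the function names and the sample ratios are hypothetical and not part of Yellowbrick.

```python
from itertools import accumulate


def cumulative_explained_variance(ratios):
    """Return the running total of per-component explained variance ratios."""
    return list(accumulate(ratios))


def projection_retains(ratios, n_components=2, threshold=0.85):
    """Check whether the first n_components explain at least `threshold` of the variance."""
    return sum(ratios[:n_components]) >= threshold


# Illustrative ratios only; they do not come from any real dataset.
example_ratios = [0.60, 0.30, 0.07, 0.03]

if __name__ == "__main__":
    print(cumulative_explained_variance(example_ratios))
    print(projection_retains(example_ratios))
```

Whether 85% is the right cutoff is a judgment call; the threshold is a parameter here precisely because the discussion leaves it open.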
Codecov Report
@@ Coverage Diff @@
## develop #1239 +/- ##
========================================
Coverage 90.58% 90.58%
========================================
Files 92 92
Lines 5213 5214 +1
========================================
+ Hits 4722 4723 +1
Misses 491 491
@bbengfort Thanks for your response!
Q: in a PCA projection, how would you use the explained variance in the legend?
Q+: E.g. would you trust a projection more if the sum of the explained variance percentage was greater than 85% or differentiate between the different axes based on their explained variance?
Q: Would you mind attaching a figure produced from your changes to help us understand how the legend influences analysis of the visualization?
from yellowbrick.features import PCA
from yellowbrick.datasets import load_credit
# Specify the features of interest and the target
X, y = load_credit()
classes = ['account in default', 'current with bills']
visualizer = PCA(scale=True, classes=classes)
visualizer.fit_transform(X, y)
visualizer.show()
The primary use of this projection is as a high-dimensional data visualization tool, the goal of which is to discern separability between classes or other patterns that might be easy to model. Explained variance is a useful tool for understanding a PCA projection, and we have a work-in-progress explained variance visualizer here: #1037. If you're interested, that tool could definitely use some help getting to the finish line!
Explained variance ratio (as percentage) included in brackets for each dimension in component plot, including 3D.
This PR contributes to #476 which suggests enhancements to PCA component plots, among other things.
I've added text to the x, y and z axis labels of the plot that includes the explained_variance_ratio_ attribute of a fitted sklearn PCA model.
This is a very small change and hence doesn't really warrant an example as all changes are clear in the diff.
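For readers without the diff at hand, the labelling idea can be sketched as follows. This is not the PR's actual code: the helper name, the "PC" prefix, and the exact label format are assumptions chosen for illustration, and the ratios are passed in directly rather than read from a fitted model.

```python
def axis_labels(ratios, prefix="PC"):
    """Build one axis label per component, with the explained variance
    ratio shown as a percentage in brackets, e.g. 'PC1 (62.5%)'."""
    return [
        f"{prefix}{index} ({ratio:.1%})"
        for index, ratio in enumerate(ratios, start=1)
    ]


if __name__ == "__main__":
    # In practice the ratios would come from a fitted sklearn PCA model's
    # explained_variance_ratio_ attribute; these values are made up.
    labels = axis_labels([0.625, 0.25, 0.125])
    print(labels)
    # With a matplotlib Axes object `ax`, the labels could then be applied
    # via ax.set_xlabel(labels[0]) and ax.set_ylabel(labels[1]).
```

The same list covers the 3D case, since a third entry is available for the z axis whenever three components are projected.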