Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIX] KMeans clear results on k change, do not recluster #2915

Merged
merged 4 commits into from
Apr 3, 2018

Conversation

pavlin-policar
Copy link
Collaborator

@pavlin-policar pavlin-policar commented Feb 19, 2018

Issue

run optimization for 2 to (say) 10 clusters, uncheck 'Apply Automatically', increase "From" to 5. At this point the 'Silhouette Scores' table is not updated. Select '2' from the scores table (or '3' just as long as it causes a send_data to be called). On the output is a data table with 7 clusters.

  • KMeans reclusters data when metas or class_vars change, even when X doesn't.
Description of changes

Fix both issues. Based on #2901. Remove possibility of crash on settings migrations (not that this ever happened)

Includes
  • Code changes
  • Tests
  • Documentation

@codecov-io
Copy link

codecov-io commented Feb 19, 2018

Codecov Report

Merging #2915 into master will increase coverage by 0.02%.
The diff coverage is 98.33%.

@@            Coverage Diff             @@
##           master    #2915      +/-   ##
==========================================
+ Coverage    82.1%   82.12%   +0.02%     
==========================================
  Files         328      328              
  Lines       56262    56392     +130     
==========================================
+ Hits        46194    46313     +119     
- Misses      10068    10079      +11

@pavlin-policar pavlin-policar force-pushed the kmeans-fixes branch 3 times, most recently from 4fe83c0 to 7b20a0d Compare February 19, 2018 15:04
@ajdapretnar
Copy link
Contributor

This looks ok to me. Just one question. When I change metas with Select Columns and don't have Apply Automatically selected in k-Means, the data still gets transmitted through the widget (the output changes). I would expect the output not to change, unless I press Apply. Correct me, if I am wrong.

p.s. I could not reproduce the mentioned issue, but everything works for me, so no complaints here.

@pavlin-policar
Copy link
Collaborator Author

You're right, this is not expected. I'll fix it right away.

The issue was a bit subtle, but you can notice it if you look at the output in a data table and look at how the clusters change.

Open k-means on e.g. iris, disable auto-apply and run optimization from 2-8 clusters. Show the output in a data table, selecting 2 clusters will give values C1 and C2, as expected. Now change the parameter "from" to 5 without pressing "Apply". The results will still show the previous range 2-8, but now selecting 2 will output clusterings for k=5 (values C1-C5), which is wrong (as seen on image).

@lanzagar lanzagar added this to the 3.12 milestone Mar 30, 2018
@lanzagar lanzagar merged commit 6ad4ad4 into biolab:master Apr 3, 2018
@pavlin-policar pavlin-policar deleted the kmeans-fixes branch April 4, 2018 05:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants