Ftr peak ml #1361

Open
wants to merge 117 commits into base: develop

sakshikukreja14
Contributor

No description provided.

Contributor

@saifulbkhan left a comment

Review part I

Added comments where I think some of the library-level code could be improved. Have yet to take a look at the front-end. Will do it in another sitting.

I did compile and try to test out basic functionality. Some comments here:

  • The slider is not on the same line as the signal/noise range labels. It is also very glitchy when I drag either of the two slider controls. Clicking along the bar works well.
  • The peak-detection dialog goes outside the bounds of my laptop screen (height-wise). Maybe we can have a separate PeakML tab in the detection dialog, even if it contains only a handful of elements. Anyhow, I feel some UI improvement can be made here.
  • The application crashed for me when I set cohorts for 8 example files. I had already run one detection (with PeakML) without setting the cohorts. Tried again and it crashed on the first run, this time with sample cohorts set. Attached the crash log here. Seems to be the same “missing cookie file” thing we talked about before.

Could not test further because of the crash happening every time now. Once you fix it, I will try out a few more things. Quite a bit has changed since I last worked on this branch.

.travis.yml (outdated, resolved)
src/core/libmaven/PeakGroup.cpp (outdated, resolved)
src/core/libmaven/PeakGroup.cpp (outdated, resolved)
src/core/libmaven/PeakGroup.cpp (outdated, resolved)
src/core/libmaven/PeakGroup.cpp (outdated, resolved)
src/gui/mzroll/backgroundopsthread.cpp (resolved)
src/gui/mzroll/backgroundopsthread.cpp (outdated, resolved)
src/gui/mzroll/backgroundopsthread.cpp (outdated, resolved)
src/gui/mzroll/backgroundopsthread.h (resolved)
src/gui/mzroll/QHistogramSlider.h (outdated, resolved)
@sakshikukreja14
Contributor Author

Thank you @saifulbkhan for your reviews. Sorry for the repeated crash. I will make the required changes soon.

Contributor

@saifulbkhan left a comment

Review part II:

  • I was able to get it to not crash when I logged out of EPI and then logged back in when selecting “Peak curation” before doing peak detection. Also is “Peak curation” the right thing to call it? Should it not be something like “Use PeakML to classify peaks”? Maybe get some input from Richa and Surbhi on this.
  • Even when I was able to get it to work without crashing I do not get any classification. The “Label” column was blank for all peak-groups.
  • When PeakML is not able to classify anything I get “nan%” values in the label-filtering dropdown in peak tables. It should instead display “NA” or “0%”.
  • Another thing that came to my mind: are we taking care of saving enough state in the emDB sessions? Such that reloading the file restores all of the data and state as it was when saved? UI state positioning and visibility excluded, of course.
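
For the “nan%” point above, a minimal sketch of the kind of guard I have in mind. The helper name `formatLabelShare` and its signature are hypothetical and not taken from the branch; the idea is only to check for an empty total before dividing, so the dropdown shows “NA” when nothing was classified.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not from the PR): formats the share of peak-groups
// that carry a given label, for display in the label-filtering dropdown.
std::string formatLabelShare(int labelledCount, int totalCount)
{
    // With no classified groups the ratio is 0/0, which is what ends up
    // rendered as "nan%". Return a readable placeholder instead.
    if (totalCount <= 0)
        return "NA";

    double percent = 100.0 * labelledCount / totalCount;

    char buffer[32];
    std::snprintf(buffer, sizeof(buffer), "%.1f%%", percent);
    return std::string(buffer);
}
```

Whether the placeholder should be “NA” or “0%” is a product decision; the sketch only shows where the check would sit.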

Also added some more comments on the code. Please make changes wherever required.

Still need to review 15 or so files. Will do in another sitting, next week. Hopefully you will have made changes to fix some of the blocking issues.

src/gui/mzroll/forms/infodialog.ui (outdated, resolved)
src/gui/mzroll/mainwindow.cpp (outdated, resolved)
src/gui/mzroll/loginform.cpp (outdated, resolved)
src/gui/mzroll/mainwindow.cpp (outdated, resolved)
src/gui/mzroll/mainwindow.h (outdated, resolved)
src/gui/mzroll/peakdetectiondialog.cpp (outdated, resolved)
Comment on lines 204 to 212
if(checked){
    getLoginForPeakMl();
}
else{
    peakMlSet = false;
    mainwindow->mavenParameters->peakMl = false;
    modelTypes->setEnabled(false);
}
Contributor

Let's add analytics here too? Mixpanel would be preferable over GA, since it gives us a better idea of who was interested in PeakML. But as I mentioned before, the current label for this group-box does not really hint at some cool ML tech.

Contributor Author

Haven't added it till now, was facing an issue. It will be resolved asap.

src/gui/mzroll/peakdetectiondialog.cpp (outdated, resolved)
src/gui/mzroll/peakdetectiondialog.h (outdated, resolved)
src/gui/mzroll/superSlider.h (outdated, resolved)
@sakshikukreja14 force-pushed the ftr_peak_ml branch 3 times, most recently from 9fd1666 to f77cc66 (December 21, 2020 17:11)
codecov-io commented Dec 21, 2020

Codecov Report

Merging #1361 (3333a62) into develop (6325183) will increase coverage by 8.88%.
The diff coverage is 50.25%.

@@             Coverage Diff             @@
##           develop    #1361      +/-   ##
===========================================
+ Coverage    45.45%   54.33%   +8.88%     
===========================================
  Files           58       56       -2     
  Lines         9889     9310     -579     
===========================================
+ Hits          4495     5059     +564     
+ Misses        5394     4251    -1143     
Impacted Files                              Coverage       Δ
src/core/libmaven/EIC.h                     75.00% <ø>     (ø)
src/core/libmaven/Fragment.cpp              11.01% <0.00%> (ø)
src/core/libmaven/Peak.h                    40.00% <0.00%> (ø)
src/core/libmaven/PeakGroup.cpp             67.40% <ø>     (+0.90%) ⬆️
src/core/libmaven/PeakGroup.h               44.73% <ø>     (-16.81%) ⬇️
src/core/libmaven/PolyAligner.cpp           0.00% <ø>      (ø)
src/core/libmaven/SRMList.cpp               23.85% <ø>     (+0.21%) ⬆️
src/core/libmaven/classifier.cpp            1.35% <ø>      (ø)
src/core/libmaven/classifierNeuralNet.cpp   54.08% <ø>     (-0.09%) ⬇️
src/core/libmaven/csvreports.cpp            24.40% <ø>     (-11.97%) ⬇️
... and 60 more

.travis.yml (outdated, resolved)
src/core/libmaven/PeakGroup.cpp (outdated, resolved)
src/core/libmaven/PeakGroup.cpp (outdated, resolved)
src/core/libmaven/csvreports.cpp (resolved)
src/core/libmaven/mavenparameters.cpp (outdated, resolved)
src/gui/mzroll/backgroundopsthread.cpp (outdated, resolved)
Contributor

@saifulbkhan left a comment

Review part III - leaving some more comments here. Will do user-based testing now based on the feature document that you shared earlier. Might do this today itself.

src/gui/mzroll/pollyelmaveninterface.cpp (outdated, resolved)
src/gui/mzroll/pollyelmaveninterface.cpp (outdated, resolved)
src/gui/mzroll/pollyelmaveninterface.cpp (resolved)
src/gui/mzroll/pollyelmaveninterface.cpp (outdated, resolved)
src/gui/mzroll/pollyelmaveninterface.cpp (outdated, resolved)
src/gui/mzroll/tabledockwidget.cpp (outdated, resolved)
src/gui/mzroll/tabledockwidget.cpp (outdated, resolved)
src/gui/mzroll/tabledockwidget.cpp (outdated, resolved)
@@ -2827,6 +3658,7 @@ void ScatterplotTableDockWidget::setupPeakTable() {
     colNames << "Max quality";
     colNames << "MS2 score";
     colNames << "#MS2 events";
+    colNames << "Probability"; // TODO: add this column conditionally
Contributor

Don't you think we should name this "Classification probability" instead? Ignore if this decision was made after discussion already.

Contributor Author

Sure. Will discuss it with Richa and change it accordingly.

@@ -35,7 +36,7 @@ PollyIntegration::PollyIntegration(DownloadManager* dlManager):
     nodeModulesPath = binDir + "node_modules" + QDir::separator();
 #endif

-    indexFileURL = "https://raw.githubusercontent.com/ElucidataInc/polly-cli/master/prod/index.js";
+    indexFileURL = "https://raw.githubusercontent.com/sakshikukreja14/polly-cli/ftr_moi_api/prod/index.js";
Contributor

I am going to keep this comment up so that the branch does not get merged without this change.

@saifulbkhan
Contributor

@sakshikukreja14 A few comments from previous reviews have not been resolved because I do not see the requested changes. Do take a look.

@sakshikukreja14
Contributor Author

@saifulbkhan Sure, will resolve the comments soon. Sorry for missing some of them previously.

@sakshikukreja14 force-pushed the ftr_peak_ml branch 2 times, most recently from 12abfc2 to 8a7fa58 (December 22, 2020 19:00)
src/gui/mzroll/backgroundopsthread.cpp (outdated, resolved)
src/gui/mzroll/backgroundopsthread.cpp (outdated, resolved)
@saifulbkhan
Contributor

@sakshikukreja14 Adding another usability review for PeakML:

  • Peak detection dialog comes up with “Polly-PeakML” tab selected by default. Should be “Detection method”.
  • The “Preparing inputs for classification…” step is very slow for a large peak count. Is it the write-to-CSV part? I am on an SSD; a regular hard disk could frustrate users even more. Eventually I force-quit the application, not sure whether it was stuck or just slow.
  • Tried the same dataset again in a fresh session and it worked very fast this time.
  • Sorting by label column does not work. Is that intentional?
  • Cycling through the “correlated peak-groups” worked well. But the currently selected peak-group in the peak-table was always the first entry in the correlation table. So if I cycle between masses 99, 100, 101 and 102 and the current mass selected is 101, then the order in the correlation table is 101, 99, 100, 102. Would be nice to have them always sorted by mass. The cycling itself is according to mass (which is the right behavior). So there is some inconsistency.
  • I edited one of the “maybe” peak-groups to include some of the missed peak areas and then marked it as “good” manually. Then pressed Cmd+z to see if undo is working. It worked the first time but was really slow (UI froze for about 5 seconds). Then I marked it as good again and tried undoing the marking once more. This time it froze for 5 seconds again but did not undo my manually set “good” mark. Tried undo (pressed Cmd+z) one more time and then it worked. This was a table with fewer than 200 peak-groups.
  • The force plot can do a little better when it comes to drawing labels for contributing attributes. In the example below, the markers are overlapping with the text (on the blue side) but there’s ample space to the right that can be used to position them without any overlaps.

[Image: Screenshot 2020-12-23 at 8 25 19 AM]

  • I am able to select multiple peak-groups in the table and then explain their classification (but the force plot only comes up for the last selected group). We should disable the “Explain classification” menu item during multi-selection.
  • After relabeling the peak-table once or twice using new ranges, the context menu (that comes up on right-click) for peak-groups stopped appearing for that table. It worked fine on any newly created table. Was not able to reproduce this bug again. Might be a one-off case.
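
On the correlation-table ordering point above, a rough sketch of what I mean by “always sorted by mass”. The `CorrelatedGroup` struct and `sortByMassKeepingSelection` function are hypothetical names, not code from this branch; the assumption is that each row has an id and a mean m/z. The rows are sorted by mass and the function returns the new row index of the selected group, so the table can highlight it instead of moving it to the top.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical row type for the correlation table (not from the PR).
struct CorrelatedGroup {
    int id;         // identifier of the peak-group
    double meanMz;  // mass used for ordering
};

// Sorts rows by ascending mass and returns the row index of the selected
// group after sorting. Returns groups.size() if the id is not present.
std::size_t sortByMassKeepingSelection(std::vector<CorrelatedGroup>& groups,
                                       int selectedId)
{
    std::stable_sort(groups.begin(), groups.end(),
                     [](const CorrelatedGroup& a, const CorrelatedGroup& b) {
                         return a.meanMz < b.meanMz;
                     });

    auto it = std::find_if(groups.begin(), groups.end(),
                           [selectedId](const CorrelatedGroup& g) {
                               return g.id == selectedId;
                           });
    return static_cast<std::size_t>(std::distance(groups.begin(), it));
}
```

With the masses from my example (101 selected, then 99, 100, 102), the rows would come out as 99, 100, 101, 102 and the selection would sit on the third row, which matches the order used when cycling.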

Everything else seems to be working as it should. The QA team might have some more feedback since they will probably play around with it more than I could.

@sakshikukreja14 force-pushed the ftr_peak_ml branch 5 times, most recently from 7e8f04f to 48f045f (December 25, 2020 12:09)
@sakshikukreja14 force-pushed the ftr_peak_ml branch 2 times, most recently from 8b2c36f to dd494f5 (January 10, 2021 20:21)
@sakshikukreja14 force-pushed the ftr_peak_ml branch 3 times, most recently from 4175171 to 7373061 (January 21, 2021 08:44)
@sakshikukreja14 force-pushed the ftr_peak_ml branch 2 times, most recently from 14fbb98 to 0e8f61b (February 11, 2021 08:05)
This is done because moi sometimes takes a while to run, depending on
the number of peaks, and the user might think the system has hung.
Hence, progress is shown at different steps.

Polly-phi didn't work because of the extra classified label in peak
tables. This has been corrected by removing the extra columns in case of
polly export.

Changed the predicted_label column name to peakML_label_id and added
an extra column that exports labels in string format.
This is done because the upt add in downstream needed this extra
column.

Added a warning for the user if they try to access peakML without
uploading the cohort file.

When sorted on label, marking a group good or bad pushes it down from
its sorted position.
It is fixed by enabling and disabling sorting.
4 participants