Match GTs and predictions based on participant_id, acq_id, and run_id #17
UPDATE: Resolved in-person with @mguaypaq using |
… acq_id, and run_id.
3b9d62e to ce16ec8
Just for the record, I removed the failing CI and unit tests from the original MetricsReloaded repo in 21c8619
yeah, true! we almost always tried region-based segmentations, and the models always segment the SC, so we/I never had a failed case with this. But I agree, this should be more robust!
quick question, does it also work when the filename does not have the |
might be unrelated, but because we're on the topic, I am documenting it: I think we should compute the metrics individually for each class. That is, say, if we have region-based seg/lesion label, then the current script automatically computes metrics for both of them sequentially in a for loop. But, for the how about we:
Good question! Yes, pd.merge works if an entity (e.g., acq) is not included in the filenames. Please see the recently pushed unit tests for details: 0c61d12
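For readers following along, here is a minimal sketch of why the merge still pairs files when an entity is absent. It is not the code from this PR: it assumes a missing entity is stored as an empty string in both dataframes, and the file names (`gt_01.nii.gz`, `pred_01.nii.gz`, and so on) are made up for illustration.

```python
import pandas as pd

keys = ["participant_id", "acq_id", "run_id"]

# Assumption: an entity that is missing from the filename is stored as "".
df_gt = pd.DataFrame({
    "participant_id": ["sub-01", "sub-02"],
    "acq_id": ["", ""],
    "run_id": ["run-1", "run-1"],
    "gt": ["gt_01.nii.gz", "gt_02.nii.gz"],          # hypothetical file names
})
# Predictions are listed in a different order on purpose.
df_pred = pd.DataFrame({
    "participant_id": ["sub-02", "sub-01"],
    "acq_id": ["", ""],
    "run_id": ["run-1", "run-1"],
    "pred": ["pred_02.nii.gz", "pred_01.nii.gz"],    # hypothetical file names
})

# Rows are paired on the key columns, so list order does not matter.
merged = pd.merge(df_gt, df_pred, on=keys, how="inner")
pairs = dict(zip(merged["gt"], merged["pred"]))
```

Because the empty `acq_id` is identical on both sides, it behaves like any other key value and the pairing is decided by the remaining entities.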
@naga-karthik, I opened a separate issue (#19) for your comment above, as it is indeed not directly related to this PR
Otherwise, I think that the PR is now ready to be merged. @naga-karthik, do you mind testing on some of your data (maybe |
will check tomorrow!
hey @valosekj, I tried running it on
I added session-based pairing in commit d45b330 but still got an error because of a mismatch in
I updated the fetch function by also getting the chunk-id (exists for bavaria dataset) in commit 17f1c39. With this the script runs as expected:
If my changes look fine, then please merge, @valosekj!
Thanks for the review and improving the code, Naga! I added some extra tests to test |
So far, we have been matching GTs and predictions based on simple list sorting, which is error-prone. Additionally, this approach worked only when the number of GTs and predictions was the same, which is not ideal: for example, we sometimes got no prediction at all when a model failed to segment a lesion.
I thus improved the matching by using BIDS-compatible keys (participant_id, ses_id, acq_id, chunk_id, and run_id) and pandas dataframes. This approach is more robust, and it also works for different numbers of GTs and predictions (after merging the dataframes for GTs and predictions, I simply drop NaNs).
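The matching described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation in this PR: the helper names `parse_entities`, `to_frame`, and `match_pairs` are hypothetical, entities are assumed to appear in filenames as `sub-`, `ses-`, `acq-`, `chunk-`, and `run-`, and a missing entity is assumed to be stored as an empty string.

```python
import os
import re

import pandas as pd

# Dataframe column -> BIDS entity prefix assumed to appear in the filename.
KEYS = {
    "participant_id": "sub",
    "ses_id": "ses",
    "acq_id": "acq",
    "chunk_id": "chunk",
    "run_id": "run",
}


def parse_entities(path):
    """Hypothetical helper: extract entity values from a filename ('' if absent)."""
    name = os.path.basename(path)
    row = {}
    for column, entity in KEYS.items():
        match = re.search(rf"(?:^|_){entity}-([a-zA-Z0-9]+)", name)
        row[column] = match.group(1) if match else ""
    return row


def to_frame(paths, column):
    """Build one row per file: the key columns plus the file path itself."""
    rows = []
    for path in paths:
        row = parse_entities(path)
        row[column] = path
        rows.append(row)
    return pd.DataFrame(rows, columns=list(KEYS) + [column])


def match_pairs(gt_paths, pred_paths):
    """Pair GTs and predictions on the key columns; unmatched files are dropped."""
    df_gt = to_frame(gt_paths, "gt")
    df_pred = to_frame(pred_paths, "pred")
    # The outer merge keeps unmatched files, leaving NaN in 'gt' or 'pred'.
    merged = pd.merge(df_gt, df_pred, on=list(KEYS), how="outer")
    # Dropping the NaN rows keeps only the files present on both sides.
    paired = merged.dropna(subset=["gt", "pred"])
    return paired.sort_values(list(KEYS)).reset_index(drop=True)


# Three GTs but only two predictions, listed in a different order.
gts = [
    "sub-01_ses-01_run-1_T2w_lesion.nii.gz",
    "sub-02_ses-01_run-1_T2w_lesion.nii.gz",
    "sub-03_ses-01_run-1_T2w_lesion.nii.gz",
]
preds = [
    "sub-03_ses-01_run-1_T2w_pred.nii.gz",
    "sub-01_ses-01_run-1_T2w_pred.nii.gz",
]
pairs = match_pairs(gts, preds)
```

In this example the GT for `sub-02` has no prediction, so it disappears after `dropna`, and the two remaining rows are paired by their entities, not by their position in the lists.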