
Dice score cannot be calculated for each class separately #1602

Closed
asbjrnmunk opened this issue Mar 7, 2023 · 4 comments · Fixed by #2725


asbjrnmunk commented Mar 7, 2023

🐛 Bug

The Dice score cannot be calculated for each class separately (i.e. without reduction); instead, a ValueError is raised.

To Reproduce

Minimal reproducible example:

>>> import torch
>>> from torchmetrics import Dice
>>> Dice(average='none', num_classes=3)

raises the following error:

Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "[redacted]/site-packages/torchmetrics/classification/dice.py", line 168, in __init__
      raise ValueError(f"The `reduce` {average} is not valid.")
ValueError: The `reduce` none is not valid.

The same error is encountered with average=None.

Expected behavior

The documentation states:

...
average:
Defines the reduction that is applied. Should be one of the following:
- 'micro' [default]: Calculate the metric globally, across all samples and classes.
- 'macro': Calculate the metric for each class separately, and average the
metrics across classes (with equal weights for each class).
- 'weighted': Calculate the metric for each class separately, and average the
metrics across classes, weighting each class by its support (tp + fn).
- 'none' or None: Calculate the metric for each class separately, and return
the metric for every class.
- 'samples': Calculate the metric for each sample, and average the metrics
across samples (with equal weights for each sample).

while neither 'none' nor None works.
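
For concreteness, this is what the documented behavior should produce for average='none' (a sketch; the per-class scores below are computed by hand from 2*tp / (2*tp + fp + fn), not actual library output):

>>> dice = Dice(average='none', num_classes=3)
>>> preds = torch.tensor([0, 1, 2, 2])
>>> target = torch.tensor([0, 1, 1, 2])
>>> dice(preds, target)
tensor([1.0000, 0.6667, 0.6667])  # one Dice score per class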

Environment

  • TorchMetrics version (and how you installed TM, e.g. conda, pip, build from source): pip, version 0.11.0 and 0.11.3.
  • Python & PyTorch Version (e.g., 1.0): 3.10.6, 1.13.1.
  • Any other relevant information such as OS (e.g., Linux): Mac.

Additional context

Looking at the code, something seems fishy. Compare the following two snippets of classification/dice.py:
https://github.com/Lightning-AI/metrics/blob/825d17f32ee0b9a2a8024c89d4a09863d7eb45c3/src/torchmetrics/classification/dice.py#L149-L151
and
https://github.com/Lightning-AI/metrics/blob/21b23b6d472ec542c764a789af63bd054fbb3512/src/torchmetrics/classification/dice.py#L167-L168

It seems line 167 is wrong, since average is not modified between the two snippets.
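
A rough paraphrase of the two checks (a hypothetical reconstruction for illustration, not the actual source):

# Earlier check in Dice.__init__: 'none' and None are accepted here.
allowed_average = ("micro", "macro", "weighted", "samples", "none", None)
if average not in allowed_average:
    raise ValueError(f"The `average` has to be one of {allowed_average}, got {average}.")

# Later, `average` is validated again as the internal `reduce` mode, whose
# allowed values do not include 'none'/None, so the call above fails here:
if average not in ("micro", "macro", "samples"):
    raise ValueError(f"The `reduce` {average} is not valid.")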

@asbjrnmunk asbjrnmunk added bug / fix Something isn't working help wanted Extra attention is needed labels Mar 7, 2023
@SkafteNicki SkafteNicki added this to the future milestone Mar 21, 2023
@SkafteNicki SkafteNicki modified the milestones: future, v1.2.0 Aug 18, 2023
@Borda Borda modified the milestones: v1.2.0, v1.1.x Aug 25, 2023

Borda commented Aug 25, 2023

@asbjrnmunk would you be interested in working on this case and adding no reduction?

@Borda Borda added the v0.11.x label Aug 25, 2023
@Borda Borda modified the milestones: v1.1.x, v1.2.x Sep 24, 2023

lirfu commented Sep 25, 2023

Isn't the Dice score equivalent to the F1 score (link)? Mathematically it works out the same; I'm not sure if there are some implementation nuances. If both are the same, maybe just add an alias for people who prefer the name 'Dice', plus a short line in the docs about their equivalence.
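
A quick numerical check of the claimed equivalence (a sketch; it uses average='macro', which Dice does accept, so it cannot rule out nuances under other settings):

>>> import torch
>>> from torchmetrics import Dice
>>> from torchmetrics.classification import MulticlassF1Score
>>> preds = torch.tensor([0, 1, 2, 2])
>>> target = torch.tensor([0, 1, 1, 2])
>>> Dice(average='macro', num_classes=3)(preds, target)
>>> MulticlassF1Score(num_classes=3, average='macro')(preds, target)
# both calls are expected to return the same value (7/9 ≈ 0.7778 here)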


noureddinekhiati commented Nov 27, 2023

I have the same issue. I used MulticlassF1Score with average=None and it worked for me (since Dice is equivalent to the F1 score: https://torchmetrics.readthedocs.io/en/stable/classification/f1_score.html#multiclassf1score):

import torch
from torchmetrics.classification import MulticlassF1Score

pred = torch.tensor([0, 2, 5, 2, 2, 2, 1])
target = torch.tensor([0, 1, 5, 2, 1, 2, 1])
dice = MulticlassF1Score(num_classes=6, average=None)
dice_score = dice(pred, target)

Output:
tensor([1.0000, 0.5000, 0.6667, 0.0000, 0.0000, 1.0000])
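
Until Dice itself accepts average=None, a thin wrapper can make the intent explicit (a hypothetical helper built on the equivalence above, not part of torchmetrics):

from torchmetrics.classification import MulticlassF1Score

def per_class_dice(num_classes: int) -> MulticlassF1Score:
    # Per-class Dice via the mathematically equivalent per-class F1 score.
    return MulticlassF1Score(num_classes=num_classes, average=None)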

ioangatop commented

Hi, any updates on this? It would be handy for the Dice score to support average=None, since it's also in the documentation.
