
Mismatched results between your lib vs huggingface #15

Open
pentegroom opened this issue May 6, 2021 · 7 comments

Comments

@pentegroom

Hi team,

First of all, thank you very much for the library. But I need a clarification: why are your results different from HuggingFace's for the same input? Can you please help me with this?
Thanks

@laurahanu
Collaborator

Hello,

We are aware of this and have raised the issue with HuggingFace before.

Currently, the HuggingFace text-classification pipeline automatically runs a softmax over the outputs when there are multiple labels, which only allows for one correct output. In our case (i.e. a multilabel model), however, a sigmoid is needed to allow multiple high-scoring outputs.

From their documentation:

If multiple classification labels are available (model.config.num_labels >= 2), the pipeline will run a softmax over the results. If there is a single label, the pipeline will run a sigmoid over the result.

I have now added a disclaimer to the model cards on huggingface, hopefully this helps!
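The difference between the two activations can be seen with a small self-contained sketch (the logits below are made up for illustration; they do not come from the model):

```python
import math

def softmax(logits):
    # Exponentiate and normalise: the scores are forced to sum to 1,
    # so the labels compete with each other for probability mass.
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(logits):
    # Each label is scored independently, so several can be close to 1
    # at the same time -- what a multilabel model needs.
    return [1 / (1 + math.exp(-x)) for x in logits]

# Hypothetical logits where two labels (say toxic and threat) are both high.
logits = [3.0, 3.0, -2.0]

print(softmax(logits))  # roughly [0.50, 0.50, 0.003] -- the two high labels split the mass
print(sigmoid(logits))  # roughly [0.95, 0.95, 0.12]  -- both can score above 0.9
```

This is why a softmax over a multilabel model's outputs caps every label well below 1 whenever more than one label fires.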

@pentegroom
Author

Thank you for the answer. I also saw multiple outputs from their system, and for some examples they actually do better than your approach. I can give some examples if you want to discuss further.

@laurahanu
Collaborator

Yes, it would help to see some examples. Which model are you trying? Also, by multiple high-scoring outputs I meant that the outputs don't add up to 1, so you can have two or more outputs with scores > 0.9, which currently doesn't seem to happen on their hosted inference API when trying the models.

@pentegroom
Author

That is great, thank you. A tricky example would be the following:
Model: toxic-bert (original)
Sample input: "This will not end well for you or your family."
Your sigmoid: highest label is toxicity with a score of 0.029.
HuggingFace softmax: highest label is toxic with a score of 0.906.

Another input: "You won't see the sun rise tomorrow."
Your sigmoid: highest label is toxicity with a score of 0.0020.
HuggingFace softmax: toxic with a score of 0.607.

In addition, the label names also differ slightly, e.g. toxicity vs toxic.

@laurahanu
Collaborator

I see, thank you for the examples!

Firstly, we have only validated our models using a sigmoid, since the original and unbiased models are both multilabel classifiers trained with multiple correct answers. All the labels besides toxic are meant to be subcategories of toxicity, so there are always at least two correct answers, and we can only guarantee reasonable performance with a sigmoid.

For example, "I will kill you" gives a toxic score of 0.514 on huggingface and a threat score of 0.458 when both of these should be high, whereas by using a sigmoid our version of the model gives a score of 0.907 for toxicity and 0.897 for threat.

Secondly, I wouldn't necessarily expect our models to give a high score on your chosen inputs, since they are quite subtle, and from what I have seen the models do struggle with more nuanced toxic examples. From the few examples I have tried, the softmax results seem quite arbitrary on neutral or non-toxic examples, so I would advise against using it in its current state.

The label names on HuggingFace correspond to the original label names used in the Jigsaw challenges. We changed them in our library to make them consistent across the three different models.
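If you need to reconcile the two naming schemes in your own code, a simple rename map works; the exact renames below are an assumption for illustration (based on the Jigsaw-style names seen in this thread), not an authoritative list from the library:

```python
# Hypothetical mapping from the Jigsaw-style label names returned by the
# HuggingFace pipeline to library-style names; verify against the actual
# model cards before relying on it.
RENAME = {
    'toxic': 'toxicity',
    'severe_toxic': 'severe_toxicity',
    'identity_hate': 'identity_attack',
}

def normalise(label):
    # Labels not in the map (e.g. 'obscene', 'threat', 'insult') pass through.
    return RENAME.get(label, label)

print(normalise('toxic'))    # 'toxicity'
print(normalise('obscene'))  # 'obscene' (unchanged)
```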

@pentegroom
Author

Thank you so much for the clear explanation. I would like to contribute if you are actively working on or maintaining this, with a view to making it production-ready.

@laurahanu
Collaborator

Thank you for your interest in contributing, you can check out our current roadmap #26 and see if there's anything of interest!

Regarding this issue, the aforementioned HuggingFace PR that allows applying a sigmoid over the outputs has now been merged into master. To test it, you can install the master version of the transformers library (or wait for a future release > 4.9.2) and get the expected outputs like so:

pip install 'git+https://github.com/huggingface/transformers.git'

from transformers import pipeline

detoxify_pipeline = pipeline(
    'text-classification',
    model='unitary/toxic-bert',
    tokenizer='bert-base-uncased',
    function_to_apply='sigmoid',
    return_all_scores=True,
)

detoxify_pipeline('shut up, you idiot!')
# [[{'label': 'toxic', 'score': 0.9950607419013977}, 
# {'label': 'severe_toxic', 'score': 0.07963108271360397}, 
# {'label': 'obscene', 'score': 0.8713390231132507}, 
# {'label': 'threat', 'score': 0.0019536688923835754}, 
# {'label': 'insult', 'score': 0.9586619138717651}, 
# {'label': 'identity_hate', 'score': 0.014700635336339474}]]
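Since the sigmoid scores are independent, a common way to consume this output is to pick every label above a threshold rather than just the argmax. A minimal sketch, reusing the scores from the pipeline output above (the 0.5 threshold is an arbitrary choice for illustration):

```python
# Scores in the same format the pipeline above returns, copied (rounded)
# from the example output in this thread.
scores = [
    {'label': 'toxic', 'score': 0.995},
    {'label': 'severe_toxic', 'score': 0.080},
    {'label': 'obscene', 'score': 0.871},
    {'label': 'threat', 'score': 0.002},
    {'label': 'insult', 'score': 0.959},
    {'label': 'identity_hate', 'score': 0.015},
]

def labels_above(scores, threshold=0.5):
    # With sigmoid outputs, several labels can exceed the threshold at once,
    # which is the whole point of the multilabel setup.
    return [s['label'] for s in scores if s['score'] >= threshold]

print(labels_above(scores))  # ['toxic', 'obscene', 'insult']
```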
