[Fix] PT - convert BF16 tensor to float before calling .numpy() #1342
Conversation
```python
def need_conversion_to_float(dtype):
    # pytorch: torch/csrc/utils/tensor_numpy.cpp:aten_to_numpy_dtype
    return dtype in [torch.bfloat16]

numpy_dtype_converter = lambda x: x.float() if need_conversion_to_float(x.dtype) else x
```
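As a hedged usage sketch (not code from the PR), the converter would sit in front of `.numpy()` like this:

```python
import torch

def need_conversion_to_float(dtype):
    # pytorch: torch/csrc/utils/tensor_numpy.cpp:aten_to_numpy_dtype
    return dtype in [torch.bfloat16]

numpy_dtype_converter = lambda x: x.float() if need_conversion_to_float(x.dtype) else x

t = torch.randn(2, 2, dtype=torch.bfloat16)
# t.numpy() would raise: TypeError: Got unsupported ScalarType BFloat16
arr = numpy_dtype_converter(t).numpy()  # upcast to float32 first, then convert
```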
Directly checking `dtype in [torch.bfloat16]` is simpler?
Updated as suggested.
Hi @chunyuan-w @jgong5 👋
Thanks for the fix 👍
Some points:
We should add a function for the conversion in:
https://github.com/mindee/doctr/blob/main/doctr/models/utils/pytorch.py
and for TF in:
https://github.com/mindee/doctr/blob/main/doctr/models/utils/tensorflow.py
because I expect we need this fix in multiple places (a sketch of such a helper follows the snippets below):
```python
for preds in self.postprocessor(prob_map.detach().cpu().permute((0, 2, 3, 1)).numpy())
out["preds"] = [dict(zip(self.class_names, preds)) for preds in self.postprocessor(prob_map.numpy())]
for preds in self.postprocessor(prob_map.detach().cpu().permute((0, 2, 3, 1)).numpy())
out["preds"] = [dict(zip(self.class_names, preds)) for preds in self.postprocessor(prob_map.numpy())]
```
Then a short test for the function (as sketched below) in:
https://github.com/mindee/doctr/blob/main/tests/pytorch/test_models_utils_pt.py
and
https://github.com/mindee/doctr/blob/main/tests/tensorflow/test_models_utils_tf.py
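For example, a hedged pytest sketch, assuming the illustrative `_bf16_to_float32` helper above lives in `doctr.models.utils.pytorch` (both names are hypothetical):

```python
import torch
from doctr.models.utils.pytorch import _bf16_to_float32  # hypothetical helper name

def test_bf16_to_float32():
    x = torch.randn(2, 2, dtype=torch.bfloat16)
    assert _bf16_to_float32(x).dtype == torch.float32
    # tensors that are already float32 should pass through unchanged
    y = torch.randn(2, 2)
    assert _bf16_to_float32(y).dtype == torch.float32
```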
Afterwards you can run:
```
make style
make quality
```
(sometimes it shows a typing issue in https://github.com/mindee/doctr/tree/main/doctr/models/artefacts which can be ignored)
```
make test-common
make test-torch
make test-tf
```
EDIT:
After double-checking, we need the conversion also for each recognition model (except CRNN), e.g.:
```python
out["preds"] = self.postprocessor(decoded_features)
```
And for the detection models I suggest converting the `prob_map` directly, if needed, e.g.:
```python
prob_map = torch.sigmoid(logits)
```
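A sketch of that detection-model change, reusing the hypothetical `_bf16_to_float32` helper from above:

```python
# assuming the illustrative _bf16_to_float32 helper sketched earlier
prob_map = _bf16_to_float32(torch.sigmoid(logits))
# downstream .numpy() calls on prob_map are then safe under BF16 autocast
```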
@chunyuan-w see: #1344 In your PR you can do the same for PyTorch and we are fine to merge 🤗
Thanks for the reference. Let me further refine this PR following #1344.
Thanks for the fix @chunyuan-w 👍
Could you please add a short comment that it fixes the issue in torchbench? :)
@odulcy-mindee mypy fix applied in #1344
Thanks for merging it!
Thanks for the update 👍
Summary: Update the version of `doctr` to include the fix in mindee/doctr#1342 for BF16 mode. Remove the change of `rapidfuzz==2.15.1` in `requirements.txt` (#1555) since the version has been set in the model repo in the updated version (mindee/doctr#1176).
Pull Request resolved: #1979
Reviewed By: aaronenyeshi
Differential Revision: D50242780
Pulled By: xuzhao9
fbshipit-source-id: d8ed9164d463a1217114408106b2c745431bd159
`.numpy()` in PyTorch only supports a limited set of scalar types (see `aten_to_numpy_dtype`). When running BF16 with autocast, calling `.numpy()` throws `TypeError: Got unsupported ScalarType BFloat16`. Convert the BF16 tensor to float before calling `.numpy()` to fix this error.
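A minimal, self-contained sketch of the failure and the fix (runnable on CPU; the exact autocast setup in doctr differs):

```python
import torch

with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    x = torch.randn(4, 4) @ torch.randn(4, 4)  # matmul under autocast yields bfloat16

# x.numpy()  # raises: TypeError: Got unsupported ScalarType BFloat16
arr = x.float().numpy()  # upcast to float32 first, then the conversion succeeds
print(arr.dtype)  # float32
```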