
Optimizations and WAs to support HPU execution for Detr-Resnet-50 #1334

Open · wants to merge 7 commits into base: main

Conversation

sandeep-maddipatla

Modifications to the Detr transformer, including workarounds (WAs) and optimizations, to run the Detr-Resnet-50 model in eager and lazy modes on the HPU.

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@sandeep-maddipatla
Author

This PR builds on #1155, which targets Eager mode, and adds the changes necessary for Lazy mode execution. Some of the review feedback on the former PR is addressed here. More details below.

[x] Pls. rebase/sync on top of main OH.
[x] Run make style.
[x] Pls. share the results of this test on g2 machines.
Done. Will share the test result in another comment below.
[ ] We need to add a README file for this in examples.
Skipped. There is a README.md at https://github.com/huggingface/optimum-habana/tree/main/examples/object-detection. We haven't changed that particular inference example. Please let us know if it still needs modification.
[x] Pls. add the appropriate CI tests for this.
Done. Extended the existing CI test to add a detr-resnet-50 test as well.

@sandeep-maddipatla
Author

sandeep-maddipatla commented Sep 16, 2024

make style result:
[screenshot of make style output]

@sandeep-maddipatla
Author

Test Result:
[screenshot of test results]

@vidyasiv (Contributor) left a comment

Please rebase to latest main; there are some changes in modeling_utils.py.

"""
Copied from https://github.com/huggingface/transformers/tree/v4.40.2
https://github.com/huggingface/transformers/blob/4fdf58afb72b0754da30037fc800b6044e7d9c99/src/transformers/models/detr/modeling_detr.py#L2287
The modications are:

Suggested change:
- The modications are:
+ The modifications are:


# Compute the classification cost. Contrary to the loss, we don't use the NLL,
# but approximate it in 1 - proba[target class].
# The 1 is a constant that doesn't change the matching, it can be ommitted.

Suggested change:
- # The 1 is a constant that doesn't change the matching, it can be ommitted.
+ # The 1 is a constant that doesn't change the matching, it can be omitted.
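The constant-shift claim in the comment above is easy to sanity-check with a toy sketch (made-up probabilities; not code from this PR): subtracting the probabilities from 1 shifts every cost by the same constant, so the cheapest match is unchanged.

```python
# Toy illustration (made-up probabilities): shifting a cost by a constant
# does not change which candidate is cheapest to match.
probs = [0.7, 0.2, 0.1]                   # proba[target class] for three candidates
cost_with_one = [1 - p for p in probs]    # cost as written in the code comment
cost_without = [-p for p in probs]        # same cost with the constant 1 dropped

best_a = min(range(len(probs)), key=lambda i: cost_with_one[i])
best_b = min(range(len(probs)), key=lambda i: cost_without[i])
print(best_a, best_b)  # both pick index 0
```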

@vidyasiv (Contributor) left a comment

@splotnikv , please take a look if you're covering for Sandeep

@splotnikv
Contributor

@splotnikv , please take a look if you're covering for Sandeep

Done. I don't have access rights to update this PR, so I created a new one. See #1404.

@sandeep-maddipatla
Author

Rebased to the latest optimum-habana, addressed the feedback from #1404, and merged in the changes from that PR.

Now that I'm back working on this, I will use this PR going forward to complete the review process. Sorry for the back-and-forth across the two PRs.

@vidyasiv
Contributor

vidyasiv commented Oct 29, 2024

@sandeep-maddipatla Sorry, I was not able to work the last few days, so I was unable to review sooner.

  • Instead of a Jira link, which shouldn't be pasted in a public repo, can you add a high-level summary of the changes?
  • Are the tests meant to address both lazy and eager modes, or should I be manually setting the environment to test that?
GAUDI2_CI=1
RUN_SLOW=true
# lazy mode: all 8 pass, but eager: 4 fail
PT_HPU_LAZY_MODE=0 pytest tests/test_object_detection.py 
FAILED tests/test_object_detection.py::GaudiDETRTester::test_inference_hpu_graphs - AttributeError: module 'habana_frameworks.torch.hpu' has no attribute 'wrap_in_hpu_graph'
FAILED tests/test_object_detection.py::GaudiDETRTester::test_no_latency_regression_autocast - AttributeError: module 'habana_frameworks.torch.hpu' has no attribute 'wrap_in_hpu_graph'
FAILED tests/test_object_detection.py::GaudiDetrResnet50_Tester::test_inference_hpu_graphs - AttributeError: module 'habana_frameworks.torch.hpu' has no attribute 'wrap_in_hpu_graph'
FAILED tests/test_object_detection.py::GaudiDetrResnet50_Tester::test_no_latency_regression_autocast - AttributeError: module 'habana_frameworks.torch.hpu' has no attribute 'wrap_in_hpu_graph'
  • README check (again, not sure if eager mode is supported):
export PT_HPU_LAZY_MODE=0
python3 run_example.py \
	--model_name_or_path facebook/detr-resnet-101 \
	--image_path "http://images.cocodataset.org/val2017/000000039769.jpg" \
	--use_hpu_graphs \
	--bf16 \
	--print_result
AttributeError: module 'habana_frameworks.torch.hpu' has no attribute 'wrap_in_hpu_graph'

Lazy mode passes

Detected cat with confidence 0.996 at location [344.0, 25.25, 640.0, 376.0]
Detected remote with confidence 0.996 at location [328.0, 76.0, 372.0, 188.0]
Detected remote with confidence 0.996 at location [39.5, 69.5, 175.0, 119.0]
Detected cat with confidence 1.0 at location [15.62, 52.5, 316.0, 472.0]
Detected couch with confidence 0.996 at location [-1.25, 0.94, 640.0, 472.0]

Stats:
------------------------------------------------------------
Total latency (ms): 59.30161476135254 (for n_iterations=10) 
Average latency (ms): 5.930161476135254 (per iteration) 

Please clarify how the testing is to be done.
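For the AttributeError above, one defensive pattern is to probe for the attribute before wrapping. This is only a hedged sketch (the helper name and the optional `hpu_module` parameter are invented for illustration; the PR's actual resolution, discussed below, is to skip those tests in eager mode):

```python
def maybe_wrap_in_hpu_graph(model, hpu_module=None):
    """Wrap `model` in an HPU graph only when the runtime exposes
    wrap_in_hpu_graph (i.e., lazy mode); otherwise return it unchanged.
    Hypothetical helper, not code from this PR."""
    if hpu_module is None:
        try:
            # Only available when the Habana PyTorch bridge is installed.
            import habana_frameworks.torch.hpu as hpu_module  # type: ignore
        except ImportError:
            return model
    wrap = getattr(hpu_module, "wrap_in_hpu_graph", None)
    return wrap(model) if wrap is not None else model
```

Passing `hpu_module` explicitly makes the guard testable on machines without Habana software installed.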

@vidyasiv
Contributor

vidyasiv commented Nov 6, 2024

@sandeep-maddipatla , could you update by EOW?

@vidyasiv
Contributor

@sandeep-maddipatla please resolve merge conflicts

@emascarenhas
Contributor

@sandeep-maddipatla, please make the changes and post the results of retesting; otherwise this will be pushed to the 1.20 release.

@vidyasiv
Contributor

@sandeep-maddipatla If we don't see an update by Wednesday, this will need to be part of the 1.20 release.

splotnikv and others added 6 commits December 2, 2024 20:00
- Add capability to ignore targets that have an out-of-range ID
- This helps pad target objects to avoid graph recompilation while
  not affecting the loss computation in training.
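The idea in this commit can be sketched in plain Python (all names here are hypothetical and the real code operates on tensors): pad every image's target list to a fixed count using an out-of-range class ID, then have the loss skip the padded entries so the result is unchanged.

```python
PAD_CLASS_ID = -1  # sentinel outside the valid class-ID range (assumption)

def pad_targets(labels, boxes, max_objects):
    """Pad per-image targets to a fixed length so batch shapes stay constant,
    which avoids HPU graph recompilation when object counts vary."""
    pad = max_objects - len(labels)
    return labels + [PAD_CLASS_ID] * pad, boxes + [[0.0] * 4] * pad

def classification_cost(log_probs, labels):
    """Toy NLL-style loss that ignores padded (out-of-range) target IDs,
    so padding does not change the computed value."""
    terms = [-lp[label] for lp, label in zip(log_probs, labels)
             if label != PAD_CLASS_ID]
    return sum(terms) / max(len(terms), 1)
```

Because padded slots are filtered out before averaging, the loss over a padded batch equals the loss over the original, variable-length targets.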
@sandeep-maddipatla
Author

sandeep-maddipatla commented Dec 3, 2024

Sorry for the delayed update here. It appears that the hpu_graphs feature is no longer supported in eager mode. As a workaround, I adjusted the tests to skip the functions that use hpu_graphs in eager mode. The results of the checks are as follows:

~/optimum-habana $ make style
...
ruff check . setup.py --fix
All checks passed!
ruff format . setup.py
401 files left unchanged
~/optimum-habana $ python setup.py install
~/optimum-habana $ pip install pytest timm sentencepiece
~/optimum-habana $ PT_HPU_LAZY_MODE=1 python -m pytest tests/test_object_detection.py
========================================================================================================================== test session starts ===========================================================================================================================
platform linux -- Python 3.10.12, pytest-8.3.4, pluggy-1.5.0
rootdir: /root/optimum-habana
configfile: setup.cfg
plugins: typeguard-4.3.0
collected 8 items

tests/test_object_detection.py ........                                                                                                                                                                                                                            [100%]

============================================================================================================================ warnings summary ============================================================================================================================
../../usr/lib/python3.10/inspect.py:288
  /usr/lib/python3.10/inspect.py:288: FutureWarning: `torch.distributed.reduce_op` is deprecated, please use `torch.distributed.ReduceOp` instead
    return isinstance(object, types.FunctionType)

../../usr/local/lib/python3.10/dist-packages/transformers-4.45.2-py3.10.egg/transformers/deepspeed.py:24
  /usr/local/lib/python3.10/dist-packages/transformers-4.45.2-py3.10.egg/transformers/deepspeed.py:24: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
    warnings.warn(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================================================================================================================ 8 passed, 2 warnings in 60.08s (0:01:00) ================================================================================================================


 ~/optimum-habana $ PT_HPU_LAZY_MODE=0 python -m pytest tests/test_object_detection.py
========================================================================================================================== test session starts ===========================================================================================================================
platform linux -- Python 3.10.12, pytest-8.3.4, pluggy-1.5.0
rootdir: /root/optimum-habana
configfile: setup.cfg
plugins: typeguard-4.3.0
collected 8 items

tests/test_object_detection.py ........                                                                                                                                                                                                                            [100%]

============================================================================================================================ warnings summary ============================================================================================================================
../../usr/lib/python3.10/inspect.py:288
  /usr/lib/python3.10/inspect.py:288: FutureWarning: `torch.distributed.reduce_op` is deprecated, please use `torch.distributed.ReduceOp` instead
    return isinstance(object, types.FunctionType)

../../usr/local/lib/python3.10/dist-packages/habana_frameworks/torch/hpex/kernels/__init__.py:18
  /usr/local/lib/python3.10/dist-packages/habana_frameworks/torch/hpex/kernels/__init__.py:18: UserWarning: CustomNms, RoiAlignFunction, ScaledMaskedSoftmax from habana_frameworks.torch.hpex.kernels are no yet supported in eager mode
    warnings.warn(

../../usr/local/lib/python3.10/dist-packages/transformers-4.45.2-py3.10.egg/transformers/deepspeed.py:24
  /usr/local/lib/python3.10/dist-packages/transformers-4.45.2-py3.10.egg/transformers/deepspeed.py:24: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
    warnings.warn(

tests/test_object_detection.py::GaudiDETRTester::test_inference_hpu_graphs
tests/test_object_detection.py::GaudiDetrResnet50_Tester::test_inference_hpu_graphs
  /root/optimum-habana/tests/test_object_detection.py:105: UserWarning: test_inference_hpu_graphs is supported only in lazy mode. Skipped
    warnings.warn("test_inference_hpu_graphs is supported only in lazy mode. Skipped")

tests/test_object_detection.py::GaudiDETRTester::test_no_latency_regression_autocast
tests/test_object_detection.py::GaudiDetrResnet50_Tester::test_no_latency_regression_autocast
  /root/optimum-habana/tests/test_object_detection.py:123: UserWarning: test_no_latency_regression_autocast is supported only in lazy mode. Skipped
    warnings.warn("test_no_latency_regression_autocast is supported only in lazy mode. Skipped")

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
===================================================================================================================== 8 passed, 7 warnings in 16.00s =====================================================================================================================

@sandeep-maddipatla
Author

Test results using the skipIf feature:

~/optimum-habana $ PT_HPU_LAZY_MODE=1 python -m pytest tests/test_object_detection.py
========================================================================================================================== test session starts ===========================================================================================================================
platform linux -- Python 3.10.12, pytest-8.3.4, pluggy-1.5.0
rootdir: /root/optimum-habana
configfile: setup.cfg
plugins: typeguard-4.3.0
collected 8 items

tests/test_object_detection.py ........                                                                                                                                                                                                                            [100%]

============================================================================================================================ warnings summary ============================================================================================================================
../../usr/lib/python3.10/inspect.py:288
  /usr/lib/python3.10/inspect.py:288: FutureWarning: `torch.distributed.reduce_op` is deprecated, please use `torch.distributed.ReduceOp` instead
    return isinstance(object, types.FunctionType)

../../usr/local/lib/python3.10/dist-packages/transformers-4.45.2-py3.10.egg/transformers/deepspeed.py:24
  /usr/local/lib/python3.10/dist-packages/transformers-4.45.2-py3.10.egg/transformers/deepspeed.py:24: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
    warnings.warn(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================================================================================================================ 8 passed, 2 warnings in 63.79s (0:01:03) ================================================================================================================



~/optimum-habana $ PT_HPU_LAZY_MODE=0 python -m pytest tests/test_object_detection.py
========================================================================================================================== test session starts ===========================================================================================================================
platform linux -- Python 3.10.12, pytest-8.3.4, pluggy-1.5.0
rootdir: /root/optimum-habana
configfile: setup.cfg
plugins: typeguard-4.3.0
collected 8 items

tests/test_object_detection.py ..ss..ss                                                                                                                                                                                                                            [100%]

============================================================================================================================ warnings summary ============================================================================================================================
../../usr/lib/python3.10/inspect.py:288
  /usr/lib/python3.10/inspect.py:288: FutureWarning: `torch.distributed.reduce_op` is deprecated, please use `torch.distributed.ReduceOp` instead
    return isinstance(object, types.FunctionType)

../../usr/local/lib/python3.10/dist-packages/habana_frameworks/torch/hpex/kernels/__init__.py:18
  /usr/local/lib/python3.10/dist-packages/habana_frameworks/torch/hpex/kernels/__init__.py:18: UserWarning: CustomNms, RoiAlignFunction, ScaledMaskedSoftmax from habana_frameworks.torch.hpex.kernels are no yet supported in eager mode
    warnings.warn(

../../usr/local/lib/python3.10/dist-packages/transformers-4.45.2-py3.10.egg/transformers/deepspeed.py:24
  /usr/local/lib/python3.10/dist-packages/transformers-4.45.2-py3.10.egg/transformers/deepspeed.py:24: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
    warnings.warn(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=============================================================================================================== 4 passed, 4 skipped, 3 warnings in 18.70s ================================================================================================================
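The skip behavior shown in this run can be sketched as follows (a minimal stand-alone illustration using unittest.skipIf; the real tests in tests/test_object_detection.py differ in detail, and the PT_HPU_LAZY_MODE convention is assumed from the commands in this thread):

```python
import os
import unittest

def eager_mode_enabled():
    # PT_HPU_LAZY_MODE=0 selects eager mode; unset or 1 means lazy mode
    # (assumed convention, matching the commands shown in this thread).
    return os.environ.get("PT_HPU_LAZY_MODE", "1") == "0"

class HpuGraphTests(unittest.TestCase):
    # skipIf is evaluated at class-definition time, so the environment
    # variable must be set before the test module is imported -- which is
    # exactly how pytest is invoked in the runs above.
    @unittest.skipIf(eager_mode_enabled(), "hpu_graphs is supported only in lazy mode")
    def test_inference_hpu_graphs(self):
        # The real test wraps the model in an HPU graph and runs inference;
        # this placeholder only illustrates the skip mechanics.
        self.assertTrue(True)
```

This matches the `..ss..ss` pattern above: the graph-dependent tests report as skipped rather than failed when eager mode is selected.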
