
feat(autoware_tensorrt_rtmdet): add tensorrt rtmdet model #8165

Open · wants to merge 65 commits into base: main

Conversation

@StepTurtle (Contributor) commented Jul 23, 2024

Description

This PR adds RTMDet, an instance segmentation method, to the perception pipeline.

Test Video: https://youtu.be/dBJdZtc4BC8

Process Times

| Process | Average Time (ms) | Standard Deviation (ms) |
|---|---|---|
| Preprocess | 1.61101 | 0.470295 |
| Inference | 1.71506 | 0.757723 |
| Postprocess | 13.4203 | 0.628055 |
| Visualization | 23.4338 | 6.98011 |

The following table shows the total time for preprocessing, inference, and postprocessing. It does not include visualization.

| | Average Time (ms) | Standard Deviation (ms) |
|---|---|---|
| Total | 15.791 | 1.58901 |

Process Times for 8 Cameras on a Single GPU

| Node | Average Time (ms) | Standard Deviation (ms) | Min (ms) | Max (ms) |
|---|---|---|---|---|
| node-0 | 43.5821 | 17.0037 | 16.2328 | 102.521 |
| node-1 | 42.7286 | 15.8802 | 22.3733 | 98.7796 |
| node-2 | 41.4004 | 15.0862 | 15.0169 | 87.0995 |
| node-3 | 42.3738 | 15.7069 | 21.1178 | 105.505 |
| node-4 | 36.4766 | 13.2308 | 20.3997 | 81.181 |
| node-5 | 42.2531 | 16.1687 | 16.265 | 91.1761 |
| node-6 | 35.9258 | 13.5001 | 19.7352 | 76.6217 |
| node-7 | 36.4776 | 15.4815 | 21.2165 | 80.9634 |

Computer Specifications

| Device | Model |
|---|---|
| GPU | GeForce RTX 3090 (24 GB VRAM) |
| CPU | AMD® Ryzen 7 2700X eight-core processor × 16 |
| Memory | 32 GB |

Related links

Parent Issue:

PR for message type which is used in this PR:

Repository for Plugin:

How was this PR tested?

Testers can follow these steps to test the PR.

1) Download the pre-trained model:

```shell
gdown https://drive.google.com/drive/folders/1P1JyMrstJIN9Y4F947nUkbKxQ2JDntgF?usp=drive_link -O ~/autoware_data/ --folder
```

2) For autoware_internal_msgs, use the following PR:

3) For trt_batched_nms use https://github.com/autowarefoundation/trt_batched_nms:

  • Clone the repository into universe/external

4) Build the package and the other dependencies of the package:

```shell
cd autoware
colcon build --symlink-install --cmake-args -DCMAKE_BUILD_TYPE=Release --packages-up-to tensorrt_rtmdet
```

5) Update the topic parameters in launch/rtmdet.launch.xml and run the launch file:

```shell
ros2 launch autoware_tensorrt_rtmdet rtmdet.launch.xml
```

Notes for reviewers

🟨 Plugin Loading

I load the TensorRT plugin with the same logic Autoware already uses elsewhere:

```cpp
for (const auto & plugin_path : plugin_paths) {
  int32_t flags{RTLD_LAZY};
  // plugin_path is e.g. '/path/to/plugin.so'
  void * handle = dlopen(plugin_path.c_str(), flags);
  if (!handle) {
    // dlerror() describes why the library could not be loaded
    logger_.log(
      nvinfer1::ILogger::Severity::kERROR,
      (std::string("Could not load plugin library: ") + dlerror()).c_str());
  }
}
```

After compilation, a file with the '.so' extension is created. This file is stored in the build directory, and its path must be passed to the dlopen() function.

Is there a way to handle this in CMake? If not, how can I provide the path to the file located inside the 'build' folder?

I was able to load the plugin using the file paths below:

  • ./build/tensorrt_rtmdet/libtensorrt_rtmdet_plugin.so (relative path from workspace)
  • /home/user/projects/workspace/build/tensorrt_rtmdet/libtensorrt_rtmdet_plugin.so (absolute path)

🟨 TRTBatchedNMS Codebase

The RTMDet model uses the TRTBatchedNMS plugin (a modified version of TensorRT's original TRTBatchedNMS). I put the plugin's entire code base into src/trt_batched_nms and include/trt_batched_nms, but I am not sure whether this is suitable.

🟨 int8 Precision Option

There are three precision options (fp16, fp32, and int8), and one of them (int8) did not work initially. It now runs, but the results are not entirely correct. Watch the video to see the problem:

Video Link https://youtu.be/3YlY3a9Xnpk

🟨 Message Type

Interface changes

None.

Effects on system behavior

None.

@StepTurtle StepTurtle added the component:perception Advanced sensor data processing and environment understanding. (auto-assigned) label Jul 23, 2024
@StepTurtle StepTurtle self-assigned this Jul 23, 2024
@github-actions github-actions bot added the type:documentation Creating or refining documentation. (auto-assigned) label Jul 23, 2024

github-actions bot commented Jul 23, 2024

Thank you for contributing to the Autoware project!

🚧 If your pull request is in progress, switch it to draft mode.

Please ensure:

@xmfcx xmfcx added the tag:run-build-and-test-differential Mark to enable build-and-test-differential workflow. (used-by-ci) label Jul 23, 2024
@StepTurtle StepTurtle force-pushed the feat/add_tensorrt_rtmdet branch 2 times, most recently from 2fb9f5c to d1149a4 on August 21, 2024 13:07
@StepTurtle StepTurtle changed the title feat(perception): add tensorrt rtmdet model feat(autoware_tensorrt_rtmdet): add tensorrt rtmdet model Aug 21, 2024
@StepTurtle StepTurtle added the tag:deploy-docs Mark for deploy-docs action generation. (used-by-ci) label Aug 22, 2024
@StepTurtle (Contributor Author)

Hi @mitsudome-r,

This PR looks ready for review.

@Owen-Liuyuxuan (Contributor)

Thank you for your PR. I am considering whether tensorrt_batched_nms should be a standalone package or be added to tensorrt_common. Any ideas?

@StepTurtle (Contributor Author)

StepTurtle commented Sep 2, 2024

> Thank you for your PR. I am considering whether tensorrt_batched_nms should be a standalone package or add to the tensorrt_common. Any ideas?

Hi @Owen-Liuyuxuan, thanks for your thoughts,

I am not sure it would be beneficial to use it as a separate package. If we organized it as a separate package, I would expect it to be used by multiple packages or models. Normally, batchedNMSPlugin is a plugin shipped with the TensorRT library, and the version we are using has been modified by mmdeploy. Therefore, I don't think it will be used by another model or package in the future. If this plugin is used elsewhere, it will most likely be in its original form, which I guess we can reach directly from the TensorRT headers.

Apart from that, if we want to gather all currently used and future plugins within tensorrt_common, that seems reasonable to me, but it is equally reasonable to keep it within the package. So, both options are OK for me (keep the current form or move it to tensorrt_common).

@Owen-Liuyuxuan (Contributor)

@StepTurtle
Your reasoning is solid. I agree to keep it in its current form.

@Shin-kyoto (Contributor) left a comment

@StepTurtle
As an honest opinion from a reviewer, reviewing a PR with more than 5000 lines is quite challenging.

May I suggest one of the following actions? 🙏

  • Add unit tests and ensure that they pass in CI.
    • The reviewer will then verify whether the tests are appropriate.
  • Split the PR into smaller parts.

cc: @mitsudome-r @kminoda @mojomex
Please give your comments.

@mojomex (Contributor)

mojomex commented Sep 7, 2024

@StepTurtle
Thank you for your PR! Thank you especially for the easy-to-follow PR description and documentation.

@Shin-kyoto

>   • Add unit tests and ensure that they pass in CI.
>     • The reviewer will then verify whether the tests are appropriate.
>   • Split the PR into smaller parts.
>
> cc: @mitsudome-r @kminoda @mojomex
> Please give your comments.

  • I agree that this PR is probably better split up into parts. I would suggest making trt_batched_nms and related code its own PR, as was also proposed by @Owen-Liuyuxuan.
  • In calibrator.hpp, remove things like Int8LegacyCalibrator if there is no strong technical reason why multiple calibrators are needed.
  • I agree that unit tests, or a module test with small sample rosbags as ground truths, should be included.

> After compilation, a file with the extension '.so' is created. This file stored in build and it should be parameter of dlopen() function.
>
> Is there any information about can we handle this in Cmake. If we cannot, how can I provide the path to the file located inside the 'build' folder?

This is best handled in the launch stage. ROS 2 provides substitutions like $(find-pkg-prefix <pkg_name>) which can be used to get the package's location. There is also $(exec-in-package <exec-name> <package-name>) which looks like it just might work for shared libraries as well (I haven't verified though).
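To illustrate the suggestion above, here is a minimal launch-file sketch using the `$(find-pkg-prefix …)` substitution. The node name, executable name, and the `plugin_path` parameter are assumptions for illustration only, not the PR's actual interface; the library path follows the pattern later confirmed in this thread.

```xml
<launch>
  <node pkg="autoware_tensorrt_rtmdet" exec="tensorrt_rtmdet_node" name="tensorrt_rtmdet">
    <!-- Resolve the installed plugin library via the package's install prefix,
         instead of pointing into the build tree -->
    <param name="plugin_path"
           value="$(find-pkg-prefix trt_batched_nms)/lib/libtrt_batched_nms_plugin.so"/>
  </node>
</launch>
```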

Loading plugins should also definitely be covered by unit tests to ensure they work reliably during runtime.

Thank you for your consideration.

@StepTurtle StepTurtle marked this pull request as draft September 7, 2024 11:34
@StepTurtle (Contributor Author)

StepTurtle commented Sep 9, 2024

Hey @mojomex @Shin-kyoto, thanks for the reviews.

At first, I thought we wouldn't focus much on the plugin since it was developed by others, but it's nice that you reviewed the plugin code.

If we want to add the plugin as a separate package, should we put it under the external folder or somewhere in the perception folder? I can create a repository for the plugin under the leo-drive organization, and we can clone the repository using the autoware.repos file.

I am working on the other topics you mentioned, like the calibrator.hpp code, unit tests, and so on. If I encounter any issues, I will let you know.

NOTE: Since there are a lot of TODOs, I switched the PR status to draft.

Signed-off-by: Barış Zeren <[email protected]>
@StepTurtle (Contributor Author)

StepTurtle commented Sep 23, 2024

Hi again @mojomex @Shin-kyoto,

I’ve updated the PR and made it ready for review.

  • I moved the custom plugin codebase to a new repository under the leo-drive organization. This allows us to add it to autoware.repos, enabling us to clone the plugin to an external folder during installation. (https://github.com/leo-drive/trt_batched_nms)
  • @mojomex, we can find the plugin path as you mentioned, thanks: $(find-pkg-prefix trt_batched_nms)/lib/libtrt_batched_nms_plugin.so.
  • I removed two calibration options from calibrator.hpp, leaving only one option for int8: Entropy. Since there is a performance issue with int8, if we cannot resolve it, we may consider removing the last option for int8 as well.
  • I also added a unit test that works with a single image instead of rosbag (I can also add a small rosbag if we think it’s necessary.). If you could check the test code, I would appreciate it.

Thanks!

@StepTurtle StepTurtle marked this pull request as ready for review September 23, 2024 12:25
@mojomex (Contributor)

mojomex commented Sep 24, 2024

@StepTurtle Thank you for the great work!
I've had a look and I think all of my feedback has been addressed 🥳 I'm not one of the main reviewers so I'll leave it up to the others to continue with the review process 🙇

@StepTurtle (Contributor Author)

@Shin-kyoto, I just realized I forgot to mention something in this comment. I also removed the multi_scale and with_roi options to simplify the codebase, as these are not default options and won't be needed in the near future. We can always add them back with a separate PR if required later on.

related commit: ccbd818

@StepTurtle (Contributor Author)

StepTurtle commented Oct 7, 2024

Hey @Shin-kyoto, if we decide on the way to import the trt_batched_nms plugin, I will create a PR to update autoware.repos by adding https://github.com/leo-drive/trt_batched_nms.

Also, we should add the ONNX model to the Autoware artifacts, so if you approve, I would like to start working on that as well.

@kminoda (Contributor)

kminoda commented Oct 11, 2024

@StepTurtle Hi, thank you for your work (and also to all the reviewers).
I posted a comment here: #7235 (comment)

Please check, thanks!

@StepTurtle (Contributor Author)

Closing the discussion and PRs as "won't do" due to a lack of motivation and requests on this topic, and no feedback received for a while.

@amadeuszsz (Contributor) left a comment

Thank you for your contribution! I just did a general review; more comments regarding the source code will come soon. I can confirm it works, but I wonder whether NMS works as expected (please check the attached image). What do you think about it? I also played with the parameters and didn't get the expected results.
(attached image: Screenshot from 2024-11-22 18-54-33)

EDIT:
You can add the missing words for the internal dictionary via the workflow. The source code itself does not require major changes and can be merged into autoware.universe. However, to improve code quality and save your time, please consider using clang tooling; there are multiple unused variables and headers that can be highlighted by automated tools.

perception/autoware_tensorrt_rtmdet/package.xml: resolved review threads
```cmake
ament_lint_auto_find_test_dependencies()

ament_add_ros_isolated_gtest(test_rtmdet test/test_rtmdet.cpp)
# It could take a long time on the first run to create the engine
set_tests_properties(test_rtmdet PROPERTIES TIMEOUT 300)
```
Contributor:

AFAIK, CI is limited to 60 s for each unit test; we need to check how it behaves in the end.

Contributor Author:

Changed it to 60 seconds, but I cannot complete the model build process within 60 seconds.

perception/autoware_tensorrt_rtmdet/README.md: resolved review thread
@StepTurtle (Contributor Author)

Thanks @amadeuszsz for your review.

> Thank you for your contribution! I just did general review, more comments regarding source code will come soon. I can confirm it works, but I wonder if NMS works as expected (please, check attached image). What do you think about it? Also played with parameters and I didn't get expected results.

I have never seen such a problem. Can you share your bag file so I can reproduce it and try to fix it?

> EDIT: Missing words for internal dictionary you can add via workflow. Source code itself does not require major changes and can be merged to autoware.universe. However, to improve code quality and save your time, please consider use clang tooling - there are multiple unused variables and headers which can be highlighted by automated tools.

Here are the remaining words that create spelling errors:

  • rtmdet
  • RTMDET
  • libtrt
  • libtensorrt

I was not able to create a PR with the workflow. I could not see the run workflow button; I think it's because I am not a member of the TIER IV repositories.

Labels: component:perception · tag:deploy-docs · tag:require-cuda-build-and-test · tag:run-build-and-test-differential · type:documentation
Project status: In Progress
7 participants