C++ serving example: adding saved model path #290
base: master
Conversation
Force-pushed from 27ef7b8 to 535c132.
@meena-at-work: for review
@DEKHTIARJonathan @meena-at-work any feedback please?
First up, thank you for the example!
Secondly, I feel that the distinction between the frozen graph and the saved model is not brought out clearly enough in the example. In particular, the load function is buried deep inside the rest of the example. I think it can be refactored so that there is only one example with a command-line option for saved model vs. frozen graph. However, if it is required to keep the two examples, I'd keep the model-loading functions in a separate file so that they're easier to see.
Also, I think the benchmarking code can be removed, as we have a separate benchmark now.
This example builds on Google's original TensorFlow C++ image classification [example](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/label_image), on top of which we added the TF-TRT conversion step and adapted the C++ code to load and run inference with the TF-TRT model.

## Docker environment

Docker images provide a convinient and repeatable environment for experimentation. This workflow was tested in the NVIDIA NGC TensorFlow 22.01 docker container, which comes with a TensorFlow 2.x build. Tools required for building this example, such as Bazel and the NVIDIA CUDA, cuDNN, and NCCL libraries, are all readily set up.
typo -- s/convinient/convenient/
To replicate the steps below, start by pulling the NGC TF container:

```
docker pull nvcr.io/nvidia/tensorflow:22.01-tf2-py3
```
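Once the image is pulled, the container is typically started with GPU access and the working directory mounted inside it; a sketch of such an invocation (the mount path `/workspace/tftrt-example` is an assumption for illustration, not taken from the example):

```
# Start the NGC container interactively with all GPUs visible.
# The host's current directory is mounted at an illustrative path.
docker run --gpus all -it --rm \
  -v "$PWD":/workspace/tftrt-example \
  nvcr.io/nvidia/tensorflow:22.01-tf2-py3
```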
@vinhngx, I know this is probably a bother, but is it possible to use a more up-to-date container for this example and make sure that it still works as is?
Yes, this is a good suggestion. When the example was developed, 22.01 was the current release; we can update to the latest now.
## Build the C++ example

The NVIDIA NGC container should already have everything you need to run this example.

To build it, first copy the build script `tftrt_build.sh` to `/opt/tensorflow`:
nit: s/tftrt_build.sh/tftrt-build.sh/
2022-04-29 04:20:24.377394: I tensorflow/examples/image_classification/frozen-graph/main.cc:276] Mexican hairless (269): 0.000520922
```

The program will also benchmark and output the throughput. Observe the improved throughput offered by moving from Python to C++ serving.
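For context, the reported throughput is simply images processed divided by wall-clock time; a minimal sketch of the computation (the counts below are made-up values, not measured results):

```
# Hypothetical throughput computation: images / elapsed seconds.
images=512
elapsed=4.0   # wall-clock seconds (made-up value)
awk -v n="$images" -v t="$elapsed" 'BEGIN { printf "%.1f img/s\n", n / t }'
```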
hmmm...I'm wondering if this (benchmarking) is necessary, given that we have benchmarks under the tftrt/benchmark-cpp/ directory?
## What's next

Try building TF-TRT FP16 and INT8 models, testing them on your own data, and serving them with C++.
Quick question @tfeher -- is INT8 conversion known to work in C++?
Same as before -- can you please test with 22.07 container & update the instructions here if everything works fine?
Agreed.
```
set -e
if [[ ! -f /opt/tensorflow/nvbuild.sh || ! -f /opt/tensorflow/nvbuildopts ]]; then
  echo This TF-TRT example is intended to be executed in the NGC TensorFlow container environment. Get one with, e.g. `docker pull nvcr.io/nvidia/tensorflow:19.10-py3`.
```
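The guard above follows a fail-fast pattern: abort early with a clear message when a required file is missing. A self-contained sketch of the same check (the file path is a placeholder for illustration, not the script's real input):

```
# Fail fast when a required file is missing (placeholder paths only).
require_file() {
  if [ ! -f "$1" ]; then
    echo "error: required file '$1' not found" >&2
    return 1
  fi
}

touch /tmp/nvbuild_demo.sh   # stand-in for /opt/tensorflow/nvbuild.sh
require_file /tmp/nvbuild_demo.sh && echo "prerequisites found"
```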
This looks like an old docker image? Please update.
Agreed
Thanks @meena-at-work for the review. By and large, I'd keep the two paths separate and clean. Conceptually it is possible to lump everything together and use "flags" to redirect, but I don't see much benefit, if any. Most users will use one path or the other in their code base, not both, so it's better to keep them separate. I think we can move the loading code into a separate file.
@vinhngx -- I was considering that too (a command-line option vs. just making a new example), and I think that's fair from a demo-application point of view. However, if that's the case and we want to get the most value from these examples, we should keep the common code and the model-loading code in separate files (and I think we're in agreement on that, based on your comment!).
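For comparison, the single-example alternative discussed above would dispatch on a command-line option; a minimal sketch of that dispatch (the flag name and messages are hypothetical, not part of this PR):

```
# Hypothetical dispatch on a --model-format flag (illustrative only).
run_example() {
  case "$1" in
    --model-format=saved-model)  echo "loading SavedModel" ;;
    --model-format=frozen-graph) echo "loading frozen graph" ;;
    *) echo "usage: run_example --model-format=<saved-model|frozen-graph>" >&2; return 2 ;;
  esac
}

run_example --model-format=saved-model
```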
Force-pushed from 90dd316 to 0de3370.
@vinhngx this path is completely useless: TF-TRT already freezes the graph during the conversion process.
Force-pushed from 02448bf to c73b400.
Force-pushed from f8c43a2 to 41bbfde.
Force-pushed from 6cf16c4 to ccc606f.
Force-pushed from 0c12576 to bfed396.
This PR extends the current C++ image classification example with a saved model path, so there are now two workflows: frozen graph and saved model. The two paths are reorganized into separate subfolders, with a master README added.
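The reorganization described above might produce a layout along these lines; a sketch that creates it in a scratch directory (folder and file names are assumptions, not the PR's final structure):

```
# Hypothetical layout: one subfolder per serving path plus a top-level README.
root=/tmp/tftrt-cpp-example
mkdir -p "$root/frozen-graph" "$root/saved-model"
touch "$root/README.md" "$root/frozen-graph/main.cc" "$root/saved-model/main.cc"
find "$root" -type f | sort
```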