
C++ serving example: adding saved model path #290

Open
wants to merge 4 commits into base: master
Conversation

@vinhngx (Contributor) commented Apr 29, 2022

This PR extends the current C++ image classification example with a saved model path. The two workflows are thus:

  • Keras saved model -> TF-TRT Python API -> frozen graph -> C++ serving
  • Keras saved model -> TF-TRT Python API -> saved model -> C++ serving

The two paths are reorganized into two separate subfolders, and a master README is added.
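The TF-TRT Python conversion step that both workflows share can be sketched roughly as below. This is a hedged sketch, not code from this PR: the function name, directory arguments, and the FP32 default are illustrative assumptions, and it assumes the NGC TF2 container (TensorFlow 2.x with TensorRT).

```python
# Sketch of the TF-TRT Python conversion step that produces a SavedModel for
# C++ serving. Directory names are placeholders.

def convert_to_tftrt(input_saved_model_dir, output_saved_model_dir,
                     precision_mode="FP32"):
    """Convert a Keras SavedModel with TF-TRT and re-save it for C++ serving."""
    # Imported inside the function so the sketch can be read (and the function
    # defined) without TensorFlow installed.
    from tensorflow.python.compiler.tensorrt import trt_convert as trt

    params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
        precision_mode=precision_mode)
    converter = trt.TrtGraphConverterV2(
        input_saved_model_dir=input_saved_model_dir,
        conversion_params=params)
    converter.convert()                      # builds the TRT-converted graph
    converter.save(output_saved_model_dir)   # SavedModel for the C++ example
```

On the C++ side, the resulting directory would presumably be loaded with `tensorflow::LoadSavedModel`, which is what distinguishes the saved-model path from the frozen-graph path.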

@review-notebook-app commented

Check out this pull request on ReviewNB to see visual diffs & provide feedback on Jupyter Notebooks.

@DEKHTIARJonathan force-pushed the master branch 3 times, most recently from 27ef7b8 to 535c132 on May 10, 2022 21:42
@DEKHTIARJonathan (Collaborator) commented

@meena-at-work: for review

@vinhngx (Contributor, Author) commented Jun 30, 2022

@DEKHTIARJonathan @meena-at-work any feedback please?

@meena-at-work (Collaborator) left a comment

First up, thank you for the example!

Secondly, I feel that the distinction between the frozen graph and saved model is not brought out clearly enough in the example. Particularly, I feel that the load function is buried deep inside the rest of the example. I think it can be refactored such that there is only one example with command line options for saved model vs frozen graph. However, if it is required to keep the two examples, I'd keep the model loading functions in a separate file so that they're easier to see.

Also, I think the benchmarking code can be removed, as we have a separate benchmark now.
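The single-example-with-a-flag interface suggested above could look something like the following sketch. It is written in Python for brevity (the actual example is C++), and the flag names and choices are invented for illustration:

```python
# Hypothetical command-line interface for a single example that switches
# between the frozen-graph and SavedModel loading paths. Flag names are
# invented; the real example is C++.
import argparse

def build_parser():
    parser = argparse.ArgumentParser(
        description="TF-TRT image classification serving example (sketch)")
    parser.add_argument(
        "--model-format",
        choices=["frozen-graph", "saved-model"],
        default="saved-model",
        help="Which model loading path to use")
    parser.add_argument(
        "--model-path", required=True,
        help="Frozen graph file or SavedModel directory")
    return parser

args = build_parser().parse_args(
    ["--model-format", "frozen-graph", "--model-path", "model.pb"])
print(args.model_format)  # -> frozen-graph
```

With such a switch, the two loading functions could live in one file each and be selected at runtime, rather than duplicating the rest of the example.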

This example is based on Google's original TensorFlow C++ image classification [example](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/label_image), on top of which we added the TF-TRT conversion step and adapted the C++ code to load and run inference with the TF-TRT model.

## Docker environment
Docker images provide a convinient and repeatable environment for experimentation. This workflow was tested in the NVIDIA NGC TensorFlow 22.01 docker container that comes with a TensorFlow 2.x build. Tools required for building this example, such as Bazel, NVIDIA CUDA, CUDNN, NCCL libraries are all readily setup.
Collaborator:

typo -- s/convinient/convenient/

To replicate the steps below, start by pulling the NGC TF container:

```
docker pull nvcr.io/nvidia/tensorflow:22.01-tf2-py3
```
Collaborator:

@vinhngx, I know this is probably a bother, but is it possible to use a more up-to-date container for this example & make sure that it still works as is?

Contributor (Author):

Yes, this is a good suggestion. The example used 22.01, which was current when it was developed. We can update to the latest now.

## Build the C++ example
The NVIDIA NGC container should have everything you need to run this example installed already.

To build it, first copy the build script `tftrt_build.sh` to `/opt/tensorflow`:
Collaborator:

nit: s/tftrt_build.sh/tftrt-build.sh/

```
2022-04-29 04:20:24.377394: I tensorflow/examples/image_classification/frozen-graph/main.cc:276] Mexican hairless (269): 0.000520922
```

The program will also benchmark and output the throughput. Observe the improved throughput offered by moving from Python to C++ serving.
Collaborator:

hmmm...I'm wondering if this (benchmarking) is necessary, given that we have benchmarks under the tftrt/benchmark-cpp/ directory?


## What's next

Try to build TF-TRT FP16 and INT8 models and test on your own data, and serve them with C++.
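For FP16, the conversion only needs a different precision mode; INT8 additionally requires a calibration pass over representative input batches. A hedged sketch of the INT8 case, assuming the TF-TRT Python API in the NGC TF2 container (function and directory names are placeholders, and the calibration data would come from your own inputs):

```python
# Sketch of an INT8 TF-TRT conversion with calibration. Names are placeholders.

def convert_int8(input_saved_model_dir, output_saved_model_dir,
                 calibration_input_fn):
    """Convert to an INT8 TF-TRT model, calibrating with real input batches.

    calibration_input_fn: a generator yielding tuples of input tensors,
    e.g. batches drawn from your own data.
    """
    # Imported lazily so the sketch can be read without TensorFlow installed.
    from tensorflow.python.compiler.tensorrt import trt_convert as trt

    params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
        precision_mode="INT8", use_calibration=True)
    converter = trt.TrtGraphConverterV2(
        input_saved_model_dir=input_saved_model_dir,
        conversion_params=params)
    converter.convert(calibration_input_fn=calibration_input_fn)
    converter.save(output_saved_model_dir)
```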
Collaborator:

Quick question @tfeher -- is INT8 conversion known to work in C++?

This example is based on Google's original TensorFlow C++ image classification [example](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/label_image), on top of which we added the TF-TRT conversion step and adapted the C++ code to load and run inference with the TF-TRT model.

## Docker environment
Docker images provide a convenient and repeatable environment for experimentation. This workflow was tested in the NVIDIA NGC TensorFlow 22.01 Docker container, which comes with a TensorFlow 2.x build. Tools required for building this example, such as Bazel and the NVIDIA CUDA, cuDNN, and NCCL libraries, are all readily set up.
Collaborator:

Same as before -- can you please test with 22.07 container & update the instructions here if everything works fine?

Contributor (Author):

Agreed.


```
set -e
if [[ ! -f /opt/tensorflow/nvbuild.sh || ! -f /opt/tensorflow/nvbuildopts ]]; then
  echo "This TF-TRT example is intended to be executed in the NGC TensorFlow container environment. Get one with, e.g.: docker pull nvcr.io/nvidia/tensorflow:19.10-py3"
```
Collaborator:

This looks like an old docker image? Please update.

Contributor (Author):

Agreed

@vinhngx (Contributor, Author) commented Jul 19, 2022

Thanks @meena-at-work for the review. By and large, I'd keep the two paths separate and clean.

Conceptually it is possible to lump everything together and use flags to redirect, but I don't see much benefit, if any. Most users will use one path or the other in their code base, not both, so it is better to keep them separate.

I think we can move the loading code into a separate file.

@meena-at-work (Collaborator) commented Jul 20, 2022

@vinhngx -- I was considering that too (a command-line option vs. just making a new example), and I think that's fair from a demo-application point of view. However, if that's the case and we want to get the most value from these examples, we should keep the common code and the model loading code in separate files (and I think we're in agreement on that, based on your comment!).
So if you can refactor the code to reflect the above along with the other changes, I can approve the PR.

@DEKHTIARJonathan force-pushed the master branch 10 times, most recently from 90dd316 to 0de3370 on August 1, 2022 16:11
@DEKHTIARJonathan (Collaborator) commented

@vinhngx this path is completely useless:

  • Keras saved model -> TF-TRT Python API -> frozen graph -> C++ serving

TF-TRT already freezes the graph during the conversion process.

@DEKHTIARJonathan force-pushed the master branch 3 times, most recently from 02448bf to c73b400 on August 3, 2022 01:55