This is an endpoint for embedding text into a vector using the model carlesoctav/multi-qa-en-id-mMiniLMv2-L6-H384.
A prebuilt Docker image is available as carlesoctav/my_bert_model. Pull and run it with these commands:
docker pull carlesoctav/my_bert_model
docker run -d -p 8501:8501 -p 8500:8500 --name bert carlesoctav/my_bert_model
Port 8501 serves the REST API, and port 8500 serves the gRPC API.
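To confirm the container is up before sending requests, you can query TensorFlow Serving's REST model status endpoint at /v1/models/<model name>. The sketch below is a minimal example, assuming the server is reachable on localhost and the model is served under the name "bert" (the MODEL_NAME used in the build steps of this README). The helper names are illustrative and not part of this repository.

```python
import json
import urllib.request


def model_status_url(host="localhost", port=8501, model_name="bert"):
    """Build the TensorFlow Serving REST URL that reports a model's status.

    The defaults (localhost, "bert") are assumptions; adjust them to your setup.
    """
    return f"http://{host}:{port}/v1/models/{model_name}"


def fetch_model_status(url, timeout=5.0):
    """GET the status URL and return the decoded JSON response."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def is_available(status):
    """Return True if any reported model version is in the AVAILABLE state."""
    versions = status.get("model_version_status", [])
    return any(v.get("state") == "AVAILABLE" for v in versions)
```

Once the container is running, is_available(fetch_model_status(model_status_url())) should return True when the model has finished loading.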
See test_TF_serving_docker.py for an example of calling the REST API. Before running the script, make sure the Docker container is running and all the requirements are installed. Then run the script with Python, for example:
python test_TF_serving_docker.py
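For orientation, here is a rough sketch of what a request to TensorFlow Serving's REST predict endpoint can look like. It is not the content of test_TF_serving_docker.py. It assumes the model is served as "bert" on localhost and that the exported signature accepts the tokenizer's output keys (for example input_ids and attention_mask); check both against your own export. The function names are hypothetical.

```python
import json
import urllib.request


def predict_url(host="localhost", port=8501, model_name="bert"):
    """Build the REST predict URL (host and model name are assumptions)."""
    return f"http://{host}:{port}/v1/models/{model_name}:predict"


def build_payload(encoded):
    """Turn tokenizer output into TensorFlow Serving's row ("instances") format.

    `encoded` maps each input name to a list with one padded sequence per text,
    e.g. {"input_ids": [[...], [...]], "attention_mask": [[...], [...]]}.
    """
    keys = list(encoded)
    batch_size = len(encoded[keys[0]])
    instances = [{k: list(encoded[k][i]) for k in keys} for i in range(batch_size)]
    return {"instances": instances}


def post_predict(url, payload, timeout=30.0):
    """POST the JSON payload and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A typical call would tokenize the texts with the model's Hugging Face tokenizer using padding=True (so every sequence in the batch has the same length), pass the result to build_payload, and send it with post_predict. For a row-format request, the server's reply carries the model outputs under the "predictions" key.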
If you want to create the Docker image yourself, you can follow these steps:
- Create a SavedModel. The model is stored on Hugging Face, so it must first be converted to the TensorFlow SavedModel format. To do that, run save_model_to_TF.py or use this script:
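from transformers import TFAutoModel

# Load the TensorFlow version of the model from the Hugging Face Hub
model = TFAutoModel.from_pretrained("carlesoctav/multi-qa-en-id-mMiniLMv2-L6-H384")
# saved_model=True also writes a SavedModel to ./model/saved_model/1
model.save_pretrained("./model", saved_model=True)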
- Create a Docker container with the SavedModel and run it. First, pull the TensorFlow Serving Docker image for CPU (for GPU, replace "serving" with "serving:latest-gpu"):
docker pull tensorflow/serving
- Next, run the serving image as a daemon named "serving_base":
docker run -d --name serving_base tensorflow/serving
- Copy the newly created SavedModel (written to ./model/saved_model by the script above) into the serving_base container's models folder:
docker cp model/saved_model serving_base:/models/bert
- Commit the container as a new image, setting the MODEL_NAME environment variable to the folder name the SavedModel was copied under (here, "bert"). TensorFlow Serving uses this name to locate and serve the model:
docker commit --change "ENV MODEL_NAME bert" serving_base my_bert_model
- Kill the serving_base container that was run as a daemon, since it is no longer needed:
docker kill serving_base
- Finally, run the image to serve our SavedModel as a daemon. We map the ports 8501 (REST API) and 8500 (gRPC API) in the container to the host, and we name the container "bert":
docker run -d -p 8501:8501 -p 8500:8500 --name bert my_bert_model
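If the server starts but reports no servable versions, a common cause is the folder layout copied in the docker cp step. TensorFlow Serving expects the model folder to contain numbered version subfolders, each holding a saved_model.pb file. The following is a small pre-flight sketch you could run before copying. The function name is hypothetical, and the path you pass is whichever saved_model folder your export produced.

```python
from pathlib import Path


def find_servable_versions(saved_model_dir):
    """Return the sorted version numbers found under saved_model_dir.

    A subfolder counts as a version only if its name is a whole number
    and it contains a saved_model.pb file.
    """
    root = Path(saved_model_dir)
    if not root.is_dir():
        return []
    versions = [
        int(child.name)
        for child in root.iterdir()
        if child.is_dir()
        and child.name.isdecimal()
        and (child / "saved_model.pb").is_file()
    ]
    return sorted(versions)
```

An empty list means the folder is missing or has no usable version subfolder, in which case the export step is worth re-checking before building the image.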