Mlserver example #1110
base: main
Conversation
```python
NUM_THREADS = 2
URL = "http://localhost:8080/v2/models/text-classification-model/infer"
sentences = ["I hate using GPUs for inference", "I love using DeepSparse on CPUs"] * 100
```
Should `* 100` be `* NUM_THREADS` if we are taking only `sentences[:NUM_THREADS]` elements?
@rsnm2 see suggestion below
```
@@ -0,0 +1,75 @@
# **Step 1: Installation**
```
best to add an intro paragraph to give users a heads up of what this example does.
```python
threads = [threading.Thread(target=tfunc, args=(sentence,)) for sentence in sentences[:NUM_THREADS]]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```
It looks like this creates `NUM_THREADS` threads to make the request; is that intended? It might make more sense to create `len(sentences)` threads and execute `NUM_THREADS` at a time. You can do this out of the box with `ThreadPoolExecutor`, with something like:
Suggested change:

```diff
-threads = [threading.Thread(target=tfunc, args=(sentence,)) for sentence in sentences[:NUM_THREADS]]
-for thread in threads:
-    thread.start()
-for thread in threads:
-    thread.join()
+from concurrent.futures.thread import ThreadPoolExecutor
+threadpool = ThreadPoolExecutor(max_workers=NUM_THREADS)
+results = threadpool.map(tfunc, sentences)
```
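For context, a self-contained version of the client using this approach might look like the sketch below. The V2/Open Inference Protocol payload is an assumption (the input name `sequences` is illustrative); the example's actual request body may differ.

```python
import requests
from concurrent.futures.thread import ThreadPoolExecutor

NUM_THREADS = 2
URL = "http://localhost:8080/v2/models/text-classification-model/infer"
sentences = ["I hate using GPUs for inference", "I love using DeepSparse on CPUs"] * 100

def inference_request(text):
    # assumed V2-style payload; input name "sequences" is illustrative
    payload = {
        "inputs": [
            {"name": "sequences", "shape": [1], "datatype": "BYTES", "data": [text]}
        ]
    }
    resp = requests.post(URL, json=payload).json()
    # return the parsed outputs instead of printing inside the worker
    return [output["data"] for output in resp["outputs"]]

# queue every sentence; at most NUM_THREADS requests are in flight at once
with ThreadPoolExecutor(max_workers=NUM_THREADS) as threadpool:
    results = list(threadpool.map(inference_request, sentences))
```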
```python
URL = "http://localhost:8080/v2/models/text-classification-model/infer"
sentences = ["I hate using GPUs for inference", "I love using DeepSparse on CPUs"] * 100


def tfunc(text):
```
would rename to something more descriptive, like `inference_request`
```python
for output in resp["outputs"]:
    print(output["data"])
```
Executing a list printout while multithreaded may cause a race condition; any reason not to return the value and print in sequence at the end? (Consider: if thread 1 and thread 2 happen to execute at exactly the same time, they will print their lines at the same time and you might not be able to tell which is which.)
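A minimal sketch of that pattern, reusing the `ThreadPoolExecutor` suggestion above (`inference_request` is the hypothetical rename from the earlier comment; it returns the parsed outputs rather than printing them):

```python
# print sequentially after all requests complete, so lines from
# different threads cannot interleave; map() preserves input order
results = list(threadpool.map(inference_request, sentences))
for sentence, outputs in zip(sentences, results):
    print(f"{sentence!r} -> {outputs}")
```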
```
@@ -0,0 +1,27 @@
import requests, threading
```
would suggest a few inline comments for self-documentation
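For example (the comment wording here is just illustrative):

```python
import requests, threading

NUM_THREADS = 2  # number of client threads issuing requests concurrently
# V2 inference endpoint exposed by MLServer for the deployed model
URL = "http://localhost:8080/v2/models/text-classification-model/infer"
# repeat the two demo sentences to build a larger workload
sentences = ["I hate using GPUs for inference", "I love using DeepSparse on CPUs"] * 100
```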
```python
task=self._settings.parameters.task,
model_path=self._settings.parameters.model_path,
batch_size=self._settings.parameters.batch_size,
sequence_length=self._settings.parameters.sequence_length,
```
Is there a place for generic kwargs in the settings? It would be cool if we could use that instead to dump extra pipeline args, so we get full generic pipeline support out of the box.
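A rough sketch of what that could look like, assuming (hypothetically) that `model-settings.json` allowed a free-form `kwargs` dict under `parameters`; `Pipeline.create` is DeepSparse's pipeline constructor:

```python
from deepsparse import Pipeline

# inside the model's load(), replacing the hard-coded kwargs above;
# "kwargs" is not an existing settings field, just an illustration
extra_kwargs = getattr(self._settings.parameters, "kwargs", None) or {}

self._pipeline = Pipeline.create(
    task=self._settings.parameters.task,
    model_path=self._settings.parameters.model_path,
    **extra_kwargs,  # forwards batch_size, sequence_length, etc. generically
)
```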
```
@@ -0,0 +1,19 @@
from mlserver import MLModel
```
This is great, love that it works out of the box. Let's throw in the serving command as a comment just for convenience.
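E.g., at the top of the file, using MLServer's standard CLI entry point:

```python
# Serve this model from the directory containing model-settings.json:
#   mlserver start .
from mlserver import MLModel
```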