Add an implementation of HF model and an example sentiment analysis a… #194
Conversation
Thanks a lot for starting this; it will be very good to have in the package. I left a few comments, and have some higher-level questions:
- Is this tied to a specific class of models? I see in the Inference docs (https://huggingface.co/docs/api-inference/detailed_parameters) that the parameters differ per task type. If not, let's leave a comment (perhaps module-level) that tells users what they are expected to put in the "prompt fn".
- Currently, there is no "inputs" key in the payload we send to the API, while all the samples in their docs have that key; maybe "text" was an older API that is no longer official? (See the payload sketch after this comment.)
- Perhaps we can make things even clearer by changing the name of the model to HuggingFaceInferenceAPIModel. I know that's quite a mouthful, but there are HuggingFace Inference Endpoints that are paid, so we should disambiguate from those (and also from general HuggingFace models that run locally).
Let me know what you think and if something is unclear!
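For reference, a minimal sketch of the payload shape the Inference API documentation describes for a text-classification call; the model name and token below are placeholders, not the ones used in this PR:

```python
import requests

# Placeholder model and token, for illustration only.
API_URL = "https://api-inference.huggingface.co/models/distilbert-base-uncased-finetuned-sst-2-english"
headers = {"Authorization": "Bearer hf_xxx"}

# The documented payload uses an "inputs" key; the code in this PR currently sends "text".
response = requests.post(API_URL, headers=headers, json={"inputs": "I like this package a lot."})
print(response.json())  # e.g. a list of {"label": ..., "score": ...} entries per input
```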
Indeed, different task types have different input/output formats and types; I only tried an example of the classification task.
Interesting. I did try that but "inputs" does not work, at least for the model I used in the asset. It returns
Sure!
Thanks a lot for the changes; the only big thing I'm worried about is the mismatched APIs ("inputs" in the docs vs "text" that seems to work for us). We should get this committed nevertheless, but if you have some time, let's try to dig into this and see why it is happening.
llmebench/models/HuggingFace.py (outdated)
)
if not response.ok:
    if response.status_code == 503:  # model loading
        time.sleep(1)
Any particular reason for the sleep here?
Hoping to give the model some time to load before retrying?
The retry mechanism has an inherent random delay (that gets sampled from an exponentially increasing range), so we don't need to worry about it here
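To make the point concrete, here is a generic sketch of retry delays sampled from an exponentially growing range with random jitter; this illustrates the idea, it is not the package's actual retry code:

```python
import random
import time

def backoff_delay(attempt, base=1.0, cap=60.0):
    """Random delay drawn from a range that doubles with each attempt, capped at `cap` seconds."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))

# Example: sleep before each of the first few retries.
for attempt in range(4):
    delay = backoff_delay(attempt)
    print(f"retry {attempt}: sleeping {delay:.2f}s")
    time.sleep(delay)
```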
assets/benchmark_v1/sentiment/sentiment/ArSASSentiment_HF_ZeroShot.py (outdated; resolved)
The returned format does not include the original text, and the dataset ground truth consists of labeled sentences, which cannot be recovered from the model output alone.
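For context, a text-classification response from the Inference API typically looks like the sketch below (an assumed shape; the exact labels depend on the model), which is why the original sentence has to be tracked on our side rather than recovered from the output:

```python
# Assumed response shape for a classification model; the input text is not echoed back.
response_json = [[{"label": "Positive", "score": 0.91},
                  {"label": "Negative", "score": 0.06},
                  {"label": "Neutral", "score": 0.03}]]

# Pair predictions with the dataset rows by position, since the output alone
# cannot tell us which sentence it belongs to.
inputs = ["an example sentence from the dataset"]
predictions = [max(scores, key=lambda s: s["score"])["label"] for scores in response_json]
print(list(zip(inputs, predictions)))  # [('an example sentence from the dataset', 'Positive')]
```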