Speed up via xformers #120
Hi Jamie, thanks for reaching out! I wanted to try this before answering, but it obviously took me way too long. I already gave it a shot a few weeks ago but failed to reach any significant speed-up; maybe I did something wrong (I used it for translation with the new ProstT5 model). Have you had positive experiences with this on any protein language models?
Sorry to hear that. I haven't tried this out yet for protein LLMs (only tested it on stable-diffusion), but it is on my radar. I'm hoping it could be useful for inference and speed up the embedding calculations (which we're finding to be a bottleneck for protein annotation).
Hm, how many proteins are you trying to label? In my experience, the ProtT5-XL-U50 encoder-only model in half-precision, with batching as described here, reaches around 0.1 s/protein on average for the ~20k proteins of the human proteome (so around 30 minutes for human).
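For readers who haven't seen it, here is a minimal sketch of the kind of batched, half-precision embedding extraction described above. It assumes the encoder-only half-precision ProtT5-XL-U50 checkpoint on the Hugging Face hub (`Rostlab/prot_t5_xl_half_uniref50-enc`) and the usual ProtT5 preprocessing (space-separated residues, rare amino acids mapped to X); the write-up linked above may differ in details.

```python
import re

import torch
from transformers import T5EncoderModel, T5Tokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32  # fp16 only makes sense on GPU
ckpt = "Rostlab/prot_t5_xl_half_uniref50-enc"  # assumed encoder-only half-precision checkpoint

tokenizer = T5Tokenizer.from_pretrained(ckpt, do_lower_case=False)
model = T5EncoderModel.from_pretrained(ckpt, torch_dtype=dtype).to(device).eval()

sequences = ["MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ", "MSILVTRPSPAGEEL"]  # toy examples
# ProtT5 expects residues separated by spaces, with rare amino acids mapped to X
prepped = [" ".join(re.sub(r"[UZOB]", "X", s)) for s in sequences]

batch = tokenizer(prepped, padding="longest", return_tensors="pt").to(device)
with torch.no_grad():
    out = model(input_ids=batch.input_ids, attention_mask=batch.attention_mask)

# Mean-pool over residues (masking out padding) -> one 1024-dim vector per protein
mask = batch.attention_mask.unsqueeze(-1).to(out.last_hidden_state.dtype)
per_protein = (out.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)
print(per_protein.shape)  # (2, 1024)
```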
I had a brief look and I stopped once I hit the following error: `AttributeError: 'FeatureExtractionPipeline' object has no attribute 'enable_xformers_memory_efficient_attention'` (I tried to extract embeddings from the ProtT5-XL-U50-fp16 model linked in my post above). So I'm not sure whether it is as easily plug-and-play as I had hoped. In case you find some example/tutorial that shows how this should be done for plain Transformers (no diffusion etc.), please send it my way and I'll give it a try. So far I have only found tutorials on how to use this with diffusion models in Hugging Face (but most likely I just missed the right source).
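(For context: `enable_xformers_memory_efficient_attention()` is, as far as I know, a method that diffusers adds to its pipelines and models, which is why the plain transformers `FeatureExtractionPipeline` doesn't have it. The primitive that xformers itself exposes is `xformers.ops.memory_efficient_attention`; a minimal sketch with illustrative shapes is below. Actually getting a T5 encoder to use it would still mean patching the model's attention modules, which is why it isn't plug-and-play here.)

```python
import torch
import xformers.ops as xops

# memory_efficient_attention expects (batch, seq_len, num_heads, head_dim) tensors
# on a CUDA device; the shapes and dtype below are illustrative only
q = torch.randn(2, 512, 16, 64, device="cuda", dtype=torch.float16)
k = torch.randn(2, 512, 16, 64, device="cuda", dtype=torch.float16)
v = torch.randn(2, 512, 16, 64, device="cuda", dtype=torch.float16)

out = xops.memory_efficient_attention(q, k, v)  # same shape as q
print(out.shape)  # torch.Size([2, 512, 16, 64])
```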
Regarding examples, I first saw xformers being used in https://github.com/Stability-AI/stablediffusion, so yes, I have only seen it used in diffusion models.
We were trying to embed all of UniRef at one point, but had to resort to just a subset. We're trying to embed proteins in microbial metagenomes, and those reference databases are often >50M proteins.
Yeah, I see your point. We also ran UniRef50 at one point but only to make predictions, not for embedding extraction (esp. as storing those embeddings becomes expensive quickly).
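As a rough back-of-the-envelope for why storage gets expensive, assuming one 1024-dimensional vector per protein (the ProtT5-XL hidden size after mean-pooling) at the ~50M-protein scale mentioned above:

```python
# Back-of-the-envelope storage estimate for per-protein embeddings (assumed 1024-dim)
n_proteins = 50_000_000   # ">50M proteins" in metagenomic reference sets, per the thread
dim = 1024                # ProtT5-XL hidden size

gb_fp16 = n_proteins * dim * 2 / 1e9   # ~102 GB at 2 bytes per value
gb_fp32 = n_proteins * dim * 4 / 1e9   # ~205 GB at 4 bytes per value
print(f"fp16: {gb_fp16:.0f} GB, fp32: {gb_fp32:.0f} GB")
# Per-residue embeddings (assuming ~300 residues per protein on average) would be ~300x larger.
```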
Just in case you weren't familiar with it, there is an xformers library that can allow for a >4x speed-up on all transformer operations:
https://github.com/facebookresearch/xformers
Could be low-hanging fruit to speed up the operations in this library :)