how to reproduce memory snapshot in doc? #112
Comments
Hi @LucQueen - which GPU type are you running on? Thank you.
@cpuhrsch thanks for your reply. The GPU type is A100 80G SXM.
@LucQueen - oh ok! That's probably because you're using the fused kernel. add_decomposed_rel_pos has been shortened and no longer materializes the full attention mask. See segment-anything-fast/segment_anything_fast/modeling/image_encoder.py, lines 233 to 247 in 387488b, for what we use instead.
@cpuhrsch thanks for your reply! I am using batch size 16, whereas the doc uses batch size 8, so I see a bigger memory footprint. But I'm still very confused: the memory snapshot in the doc also appears to use the fused kernel (see the part marked with a red box in the picture), so why can't I get the 'add_decomposed_rel_pos' stack information in my memory snapshot?
@LucQueen - Ah! Hm, I'm not sure. Is your picture from the latest version of segment-anything-fast? The picture you reference is from a section within the blog and not based on the most recent version of segment-anything-fast. It was recorded from an earlier version without the fused kernels.
Hi, how can I reproduce the memory snapshot in the doc?
What I get is:
I'm very confused about why I cannot get the 'add_decomposed_rel_pos' stack information in the memory snapshot, and about how to get the full stack backtrace.
The torch version is 2.2, installed by following the instructions at https://github.com/pytorch-labs/segment-anything-fast/tree/main/experiments#installation-instructions
Looking forward to a reply.
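For anyone trying the same thing, here is a minimal sketch of recording a memory snapshot with stack traces and checking whether a given function shows up in it. It assumes PyTorch 2.1 or later and the underscore-prefixed snapshot functions in `torch.cuda.memory`. The names `record_snapshot`, `find_frames`, and `run_workload` are hypothetical helpers for illustration, not part of segment-anything-fast, and the snapshot layout (a `device_traces` list whose events carry `frames`) is assumed from the PyTorch memory documentation.

```python
import pickle


def record_snapshot(run_workload, path="snapshot.pickle", max_entries=100_000):
    """Record allocator history while run_workload() executes, then dump it to path.

    run_workload is a hypothetical zero-argument callable, e.g. one inference step.
    """
    import torch  # imported lazily so find_frames below works without a GPU

    # Start recording allocation events together with their stack traces.
    torch.cuda.memory._record_memory_history(max_entries=max_entries)
    try:
        run_workload()
    finally:
        # Write the snapshot, then stop recording.
        torch.cuda.memory._dump_snapshot(path)
        torch.cuda.memory._record_memory_history(enabled=None)


def find_frames(snapshot, function_name):
    """Return sorted (filename, line) pairs of recorded frames whose name contains function_name.

    Assumes snapshot["device_traces"] is a list (one per device) of event dicts,
    each with a "frames" list of {"name", "filename", "line"} dicts.
    """
    hits = set()
    for device_trace in snapshot.get("device_traces", []):
        for event in device_trace:
            for frame in event.get("frames", []):
                if function_name in frame.get("name", ""):
                    hits.add((frame.get("filename", ""), frame.get("line", 0)))
    return sorted(hits)


def load_snapshot(path):
    """Load a snapshot previously written by record_snapshot."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

After `record_snapshot(my_step)`, calling `find_frames(load_snapshot("snapshot.pickle"), "add_decomposed_rel_pos")` lists where that function appears in the recorded stacks. An empty result would be consistent with the explanation above that the fused-kernel version no longer allocates the full attention mask inside that function.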