Does ToMe work for focal modulation networks? #23

Open
subneed opened this issue Mar 21, 2023 · 2 comments

Comments

@subneed

subneed commented Mar 21, 2023

Any help on modifying ToMe for focal modulation networks (FMNs)?
I guess in an FMN we could apply ToMe on Q/M. Also, it has downsampling layers in each stage, so would the r value and the model definition need to change for each stage?

@dbolya
Contributor

dbolya commented Mar 21, 2023

I'm not too familiar with FMNs, but it seems like it's a hierarchical network with a different attention mechanism? In principle you can use ToMe on anything that uses tokens, but like you said you'd need to be careful about the downsampling layers. You might be able to use ToMe instead of those downsampling layers, but that would probably require some exploration to figure out what's best.
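
To make the interaction between r and the downsampling layers concrete, here is a rough sketch that only counts tokens; it does not merge anything. It assumes a hypothetical four-stage layout, r tokens merged per layer, and a downsampling step that shrinks the token count 4x between stages. The function name `token_schedule`, the stage depths, and the r values are all made up for illustration and are not taken from ToMe or from any FMN codebase. The only ToMe-specific detail is that bipartite matching can merge at most half of the tokens in a single layer.

```python
def token_schedule(num_tokens, depths, r, downsample=4):
    """Token count at the end of each stage of a hypothetical hierarchical model.

    num_tokens: tokens entering the first stage
    depths:     number of layers in each stage
    r:          tokens merged per layer (one int, or one value per stage)
    downsample: assumed shrink factor between stages (2x2 pooling -> 4)
    """
    rs = [r] * len(depths) if isinstance(r, int) else list(r)
    counts = []
    for stage, (depth, r_stage) in enumerate(zip(depths, rs)):
        for _ in range(depth):
            # Bipartite matching can merge at most half of the tokens per layer.
            num_tokens -= min(r_stage, num_tokens // 2)
        counts.append(num_tokens)
        if stage < len(depths) - 1:
            # Assumption: downsampling still divides the count by a fixed
            # factor even though the tokens no longer form a regular grid.
            num_tokens = max(1, num_tokens // downsample)
    return counts


# A 56x56 token grid with assumed stage depths [2, 2, 6, 2].
print(token_schedule(56 * 56, [2, 2, 6, 2], r=8))               # [3120, 764, 143, 19]
print(token_schedule(56 * 56, [2, 2, 6, 2], r=[64, 16, 4, 1]))  # [3008, 720, 156, 37]
```

With a fixed r, the early stages barely shrink while the last stage loses almost half of its tokens, which is one reason a per-stage r might be worth exploring. Whether the downsampling layers can operate on merged tokens at all is a separate question that this count does not address.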

@chengyangfu
Contributor

This problem is still up for debate in the research world, so we can only answer things that have already been covered in our paper.


3 participants