Great work with an elegant yet effective idea! Thanks for sharing. However, I have a minor suggestion.
It is well-known that adapter-tuning [1] is a popular approach in the LLM finetuning paradigm: lightweight modules are inserted between transformer layers, and only those modules are updated for downstream tasks. In this work, however, the "adapters" the authors refer to are not such modules, but rather a selection of layers from the pretrained model. The authors are clearly aware of this terminology overlap, as there are even combined experiments on offsite-tuning + adapter-tuning (Table 5).
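For readers less familiar with the distinction, here is a minimal sketch (in PyTorch; the class and parameter names are illustrative, not taken from either paper) of the kind of bottleneck adapter module Houlsby et al. describe, which is quite different from selecting a subset of pretrained layers:

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Houlsby-style adapter: down-project, nonlinearity, up-project, residual."""
    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)  # d -> r (r << d)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)    # r -> d
        self.act = nn.GELU()
        # Near-identity init: zeroing the up-projection means the adapter
        # initially passes activations through unchanged.
        nn.init.zeros_(self.up.weight)
        nn.init.zeros_(self.up.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))  # residual connection

# During adapter-tuning the backbone stays frozen and only these small
# inserted modules receive gradient updates, e.g.:
#   for p in backbone.parameters():
#       p.requires_grad_(False)
```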
Given that both approaches fall within the realm of parameter-efficient finetuning, I'd encourage the authors to find an alternative term for their "adapters" to avoid potential confusion and ambiguity.
A couple of preliminary examples I can come up with are "bridging/pluggable/relay/alignment/shared" + "layers/units/components." Hope this helps!
[1] Houlsby et al. Parameter-Efficient Transfer Learning for NLP. ICML 2019.