I think there is a bug in the DoRA implementation: it takes neither `lora_dropout` nor `lora_alpha` into account. These arguments are passed as `*args` to the `__init__` call of the DoRA layers but are subsequently ignored inside `dora.py`. This is easy to miss because the DoRA paper does not include them in its equations, but they are mentioned elsewhere in the paper and should be applied the same way as in the LoRA implementation.
Also note that `lora_dropout` is applied only to the LoRA/DoRA output, not the base model output, which I believe has an impact on these lines, as they currently assume that the same `x` is used for the base layer and the DoRA part.
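For reference, here is a minimal sketch of how a DoRA forward pass could take both arguments into account, with `lora_alpha / r` scaling the adapter output and `lora_dropout` applied only to the adapter branch's input. The names (`DoRALinear`, `lora_a`, `lora_b`, `magnitude`) are made up for illustration and do not correspond to the actual code in `dora.py`:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DoRALinear(nn.Module):
    """Illustrative DoRA linear layer (hypothetical names, not the real dora.py)."""

    def __init__(self, in_features, out_features, r, lora_alpha=16.0, lora_dropout=0.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        # Low-rank adapter factors, same shapes as in LoRA
        self.lora_a = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(out_features, r))
        # lora_alpha enters through the usual LoRA scaling convention
        self.scaling = lora_alpha / r
        # lora_dropout is applied to the adapter input only
        self.dropout = nn.Dropout(lora_dropout)
        # DoRA magnitude vector, initialised from the column norms of the base weight
        self.magnitude = nn.Parameter(self.base.weight.norm(p=2, dim=1))

    def forward(self, x):
        # Base path sees the *undropped* input
        base_out = self.base(x)
        # Adapter path: dropout applies only here, output scaled by lora_alpha / r
        lora_out = F.linear(F.linear(self.dropout(x), self.lora_a), self.lora_b) * self.scaling
        # Column-wise norm of the merged weight used for DoRA's magnitude rescaling
        merged_weight = self.base.weight + self.scaling * (self.lora_b @ self.lora_a)
        weight_norm = merged_weight.norm(p=2, dim=1).detach()
        mag_scale = self.magnitude / weight_norm  # shape: (out_features,)
        return mag_scale * (base_out + lora_out)
```

The key point is that `base_out` is computed from the original `x` while the adapter path is computed from `dropout(x)`, so once dropout is active the two paths cannot share the same input, which is exactly why the lines linked above would need to change.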