DoRA training not taking dropout or alpha into account #68

Open
BenjaminBossan opened this issue Aug 15, 2024 · 0 comments
I think there is a bug in the DoRA implementation: it takes neither lora_dropout nor lora_alpha into account. These arguments are passed as *args to the __init__ call of the DoRA layers but are subsequently ignored inside dora.py. This is easy to miss because the DoRA paper does not include them in its equations, but they are mentioned elsewhere in the paper and should be applied the same way as in the LoRA implementation.

Also note that lora_dropout is only applied to the LoRA/DoRA output, not the base model output, which I believe has an impact on these lines, as they currently assume that the same x is used for the base layer and the DoRA part.
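For illustration, here is a minimal sketch (not the repository's actual code) of how lora_alpha and lora_dropout could be wired into a DoRA linear layer while keeping the base path on the undropped input. Names such as `DoRALinear`, `lora_A`, `lora_B`, and `magnitude` are placeholders, not the identifiers in dora.py:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DoRALinear(nn.Module):
    """Hypothetical DoRA wrapper around a frozen nn.Linear (illustrative only)."""

    def __init__(self, base_layer: nn.Linear, rank: int,
                 lora_alpha: float = 1.0, lora_dropout: float = 0.0):
        super().__init__()
        self.base_layer = base_layer
        out_features, in_features = base_layer.weight.shape
        self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))
        # DoRA magnitude vector m, initialized from the per-output-channel norms of W0
        self.magnitude = nn.Parameter(base_layer.weight.norm(p=2, dim=1))
        # lora_alpha enters through the usual LoRA scaling factor
        self.scaling = lora_alpha / rank
        # lora_dropout is applied to the adapter input only, not to the base path
        self.dropout = nn.Dropout(lora_dropout) if lora_dropout > 0.0 else nn.Identity()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Base path uses the *undropped* x, as in the frozen pretrained model.
        base_out = self.base_layer(x)

        # Adapter path: dropout is applied only here.
        x_d = self.dropout(x)

        # Direction: W0 + scaling * B @ A, normalized per output channel and
        # rescaled by the learned magnitude m (the DoRA decomposition).
        delta = self.scaling * (self.lora_B @ self.lora_A)
        combined = self.base_layer.weight + delta
        norm = combined.norm(p=2, dim=1, keepdim=True)
        directional = self.magnitude.unsqueeze(1) * combined / norm

        # Because x_d != x under dropout, the DoRA term cannot simply reuse
        # base_out; it runs the dropped input through the full decomposed
        # weight and subtracts the base contribution on the same dropped input.
        dora_out = F.linear(x_d, directional) - F.linear(x_d, self.base_layer.weight)
        return base_out + dora_out
```

With a structure like this, lora_alpha scales the low-rank update exactly as in LoRA, and the dropout-sensitive computation stays confined to the adapter branch, so the base layer output is never computed from a dropped-out x.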
