Replies: 2 comments
-
I would like to know this as well. The points you've raised all seem valid.
-
I guess one possible reason is that, as a defined part of the standard diffusers implementation, you can just save the model, load it, and have it work out of the box without having to patch the pipe. But that seems to negate two things I found so cool about LoRA: 1) the convenience of tiny files storing just the LoRA weights (with the minor inconvenience of having to patch them into the standard model), and 2) the fun of applying LoRA in all sorts of different places and observing what happens (without having to rewrite classes to make it work). If I could convince myself (or someone could clue me in) that the math works out the same, then I would definitely stick with the way you've set it up here.
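For what it's worth, the "math works out the same" part is easy to check numerically. Below is a minimal sketch in plain PyTorch (my own illustration, not code from either implementation; `rank` and `scale` are made-up parameters): fusing the low-rank update into the weight, W' = W + scale * (B A), gives the same output as keeping the LoRA weights on a separate side path, y = W x + scale * B (A x).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
in_features, out_features, rank, scale = 16, 32, 4, 1.0

base = nn.Linear(in_features, out_features, bias=False)  # frozen base layer
A = torch.randn(rank, in_features) * 0.01   # LoRA "down" projection
B = torch.randn(out_features, rank) * 0.01  # LoRA "up" projection
x = torch.randn(8, in_features)

# View 1: fuse the low-rank update into the base weight (patched model).
fused_weight = base.weight.data + scale * (B @ A)
y_fused = x @ fused_weight.t()

# View 2: keep the LoRA weights on the side (tiny-file formulation).
y_split = base(x) + scale * (x @ A.t() @ B.t())

print(torch.allclose(y_fused, y_split, atol=1e-6))  # True: same math
```

So the two setups are numerically identical; the difference is only in where the weights live and how they get loaded.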
-
There is a WIP PR, huggingface/diffusers#1884, that implements LoRA in the diffusers library. Can someone tracking this explain why one has to alter the CrossAttention classes?
@yasyf, I see you're working on a FLAX version; any insight here?
Thanks much in advance.
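To make the question concrete, here is roughly what the "patch the pipe" alternative looks like without touching the class itself. This is a hedged sketch of my own, not code from the PR or from this repo: `LoRALinear` and `patch_attention_projections` are hypothetical helpers, and the `to_q`/`to_k`/`to_v` names are an assumption based on diffusers' attention projection layout.

```python
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen Linear plus a trainable low-rank side path (hypothetical helper)."""
    def __init__(self, base: nn.Linear, rank: int = 4, scale: float = 1.0):
        super().__init__()
        self.base, self.scale = base, scale
        for p in self.base.parameters():
            p.requires_grad_(False)            # train only the LoRA path
        self.down = nn.Linear(base.in_features, rank, bias=False)
        self.up = nn.Linear(rank, base.out_features, bias=False)
        nn.init.normal_(self.down.weight, std=1e-3)
        nn.init.zeros_(self.up.weight)         # starts as an identity patch

    def forward(self, x):
        return self.base(x) + self.scale * self.up(self.down(x))

def patch_attention_projections(unet: nn.Module, rank: int = 4):
    """Swap q/k/v projection Linears for LoRA-wrapped ones, in place."""
    to_patch = []
    for module in unet.modules():
        for name in ("to_q", "to_k", "to_v"):
            child = getattr(module, name, None)
            if isinstance(child, nn.Linear):
                to_patch.append((module, name, child))
    for module, name, child in to_patch:
        setattr(module, name, LoRALinear(child, rank=rank))
```

Module swapping like this leaves CrossAttention itself untouched, which is why I'm curious what altering the class buys the PR.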