
diffusers implementation #9

Open
alexblattner opened this issue Sep 19, 2023 · 3 comments

@alexblattner
Hi, is there a diffusers implementation somewhere? If not, I'd be happy to implement one with some guidance.

@dvruette
Contributor

Hi Alex, this repo actually uses diffusers in the background, it's just a bit hacky and doesn't implement a proper pipeline. Since the purpose of this codebase is to document the research, we're probably not going to continue working on it, but it shouldn't be too hard to package the existing code into a proper diffusers pipeline (much of it is just copied and adapted from the diffusers StableDiffusionPipeline).

I’m happy to answer any questions if you decide to give it a shot!

@alexblattner
Author

Thanks a lot @dvruette! I honestly don't know where to start. What specifically should I look at to implement this? Also, could you give a step-by-step explanation of how the code works (without the UI and other unnecessary stuff)?

I really appreciate your time, and I'm available to discuss on Discord if you're fine with that: alexblattner

@dvruette
Contributor

The main thing to look at is the AttentionBasedGenerator.generate function here. This is the main generation loop, where feedback is fed into the diffusion process. It roughly consists of the following steps (see the sketch below the list):

  1. Resize the feedback images and encode them into latent space.
  2. In each diffusion step:
     1. Add noise to the feedback latents.
     2. Run the noised feedback latents through the U-Net and store all activations in the self-attention layers.
     3. Run the partially denoised image batch through the U-Net. In every self-attention layer, concatenate the pre-computed activations from the feedback images to the context (positive feedback for conditional images, negative feedback for unconditional images).
  3. Decode the final latents back to image space to get the generated images.
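
To make this concrete, here's a rough, self-contained sketch of that loop built from stock diffusers components. All names (`FeedbackAttnProcessor`, `fb_latents`, etc.) are mine, not the repo's, and it only handles a single positive-feedback image: classifier-free guidance, negative feedback, and FABRIC's attention re-weighting are omitted for brevity. Treat it as an illustration of the mechanism, not the actual implementation:

```python
import torch
from diffusers import StableDiffusionPipeline
from diffusers.models.attention_processor import Attention

torch.set_grad_enabled(False)  # inference only


class FeedbackAttnProcessor:
    """Self-attention processor with two modes: "store" records the layer's
    hidden states (feedback pass), "use" appends the recorded states to the
    key/value context (generation pass)."""

    def __init__(self):
        self.stored = None
        self.mode = "use"

    def __call__(self, attn: Attention, hidden_states,
                 encoder_hidden_states=None, attention_mask=None, **kwargs):
        if encoder_hidden_states is None:  # self-attention only
            if self.mode == "store":
                self.stored = hidden_states.detach()
                context = hidden_states
            elif self.stored is not None:
                # Broadcast the stored feedback activations (batch size 1)
                # across the batch and append them along the token dimension.
                fb = self.stored.expand(hidden_states.shape[0], -1, -1)
                context = torch.cat([hidden_states, fb], dim=1)
            else:
                context = hidden_states
        else:  # leave cross-attention untouched
            context = encoder_hidden_states

        q = attn.head_to_batch_dim(attn.to_q(hidden_states))
        k = attn.head_to_batch_dim(attn.to_k(context))
        v = attn.head_to_batch_dim(attn.to_v(context))
        out = torch.bmm(attn.get_attention_scores(q, k, attention_mask), v)
        out = attn.batch_to_head_dim(out)
        return attn.to_out[1](attn.to_out[0](out))  # linear, then dropout


pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Patch every self-attention layer ("attn1") with the feedback processor.
procs = {
    name: FeedbackAttnProcessor() if name.endswith("attn1.processor") else proc
    for name, proc in pipe.unet.attn_processors.items()
}
pipe.unet.set_attn_processor(procs)
fb_procs = [p for p in procs.values() if isinstance(p, FeedbackAttnProcessor)]

# Step 1: encode a feedback image (a dummy tensor here; in practice a real
# image resized to 512x512 and normalized to [-1, 1]) into latent space.
feedback_image = torch.randn(1, 3, 512, 512, device="cuda", dtype=torch.float16)
fb_latents = pipe.vae.encode(feedback_image).latent_dist.sample()
fb_latents = fb_latents * pipe.vae.config.scaling_factor

# Text conditioning, shared by both U-Net passes.
ids = pipe.tokenizer("a photo of a cat", padding="max_length",
                     max_length=pipe.tokenizer.model_max_length,
                     return_tensors="pt").input_ids.to("cuda")
cond_emb = pipe.text_encoder(ids)[0]

pipe.scheduler.set_timesteps(50)
latents = torch.randn_like(fb_latents) * pipe.scheduler.init_noise_sigma

for t in pipe.scheduler.timesteps:
    # Steps 2.1 + 2.2: noise the feedback latents to level t and run them
    # through the U-Net so the processors record self-attention activations.
    noised_fb = pipe.scheduler.add_noise(
        fb_latents, torch.randn_like(fb_latents), t)
    for p in fb_procs:
        p.mode = "store"
    pipe.unet(noised_fb, t, encoder_hidden_states=cond_emb)

    # Step 2.3: denoise the actual latents; every self-attention layer now
    # attends to the concatenated feedback activations as extra context.
    for p in fb_procs:
        p.mode = "use"
    model_input = pipe.scheduler.scale_model_input(latents, t)
    noise_pred = pipe.unet(model_input, t, encoder_hidden_states=cond_emb).sample
    latents = pipe.scheduler.step(noise_pred, t, latents).prev_sample

# Step 3: decode the final latents back to image space.
image = pipe.vae.decode(latents / pipe.vae.config.scaling_factor).sample
```

The real code additionally handles multiple feedback images, routes negative feedback through the unconditional batch, and schedules the feedback strength over the diffusion steps, so use this mainly as a map of where things happen.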

In case you're unfamiliar with the vanilla diffusion process (without FABRIC), I recommend taking a look at the StableDiffusionPipeline in the official diffusers repository, which our code is based on and adapted from:

https://github.com/huggingface/diffusers/blob/e312b2302b5445271198ebed8f2fbcd543633f78/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py#L70
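
For orientation, the vanilla pipeline boils the whole process down to a single call. This is standard diffusers usage (the checkpoint name is just an example):

```python
import torch
from diffusers import StableDiffusionPipeline

# Plain text-to-image generation with the unmodified pipeline.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
image = pipe("a photo of an astronaut riding a horse").images[0]
image.save("astronaut.png")
```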

I also wrote you on Discord, feel free to ping me if you have any questions.
