Designing an API for diffusion models #1

francois-rozet · 2024-07-30T15:00:39Z

Maintainer

Hello everyone 👋

This discussion is the continuation of probabilists/zuko#52. I have created the Azula repository. Its goal is to unify the different formalisms and notations of the generative diffusion models literature into a single, convenient and hackable interface. I have written a first draft for the API.

This post has been edited to take into account the changes in Azula 0.1.0.

Formalism

In Azula's formalism, a diffusion model is the composition of three elements: a noise schedule, a denoiser and a sampler.

A noise schedule is a mapping from a time $t \in [0, 1]$ to the signal scale $\alpha_t$ and the noise scale $\sigma_t$ in a perturbation kernel $p(X_t \mid X) = \mathcal{N}(X_t \mid \alpha_t X, \sigma_t^2 I)$ from a "clean" random variable $X \sim p(X)$ to a "noisy" random variable $X_t$.

Because $\alpha_t$ and $\sigma_t$ are not explicitly linked, any noise schedule can be implemented, such as the variance exploding ($\alpha_t = 1$) or variance preserving ($\sigma_t^2 = 1 - \alpha_t^2$) schedules.

A denoiser is a neural network trained to predict $X$ given $X_t$. In practice, it is a Gaussian denoiser $q_\phi(X \mid X_t) = \mathcal{N}(X \mid \mu_\phi(X_t), \Sigma_\phi(X_t))$. Different implementations parameterize the mean $\mu_\phi(X_t)$ and the covariance $\Sigma_\phi(X_t)$ differently.

A sampler defines a series of transition kernels $q_\phi(X_s \mid X_t)$ from $t$ to $s < t$ based on a noise schedule and a denoiser $q_\phi(X \mid X_t)$. Simulating these transitions from $t = 1$ to $0$ samples approximately from $p(X)$.
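To make the three elements concrete, here is a minimal scalar sketch in plain Python. It does not use Azula and is not how Azula implements anything. The cosine variance preserving schedule, the toy data distribution $p(X) = \mathcal{N}(2, 0.5^2)$, the closed-form denoiser, and the deterministic DDIM-style update are all assumptions chosen for illustration, as are the function names.

```python
import math
import random
import statistics


def vp_schedule(t):
    """Variance preserving schedule (assumed cosine form): alpha_t^2 + sigma_t^2 = 1."""
    return math.cos(0.5 * math.pi * t), math.sin(0.5 * math.pi * t)


def perturb(x, t, rng):
    """Draw X_t ~ N(alpha_t * x, sigma_t^2) from the perturbation kernel."""
    alpha, sigma = vp_schedule(t)
    return alpha * x + sigma * rng.gauss(0.0, 1.0)


def gaussian_denoiser(x_t, t, mean=2.0, std=0.5):
    """Exact posterior mean E[X | X_t] for the toy prior p(X) = N(mean, std^2).

    A trained network would stand in for this closed form in practice.
    """
    alpha, sigma = vp_schedule(t)
    gain = alpha * std**2 / (alpha**2 * std**2 + sigma**2)
    return mean + gain * (x_t - alpha * mean)


def ddim_sample(denoiser, rng, steps=64):
    """Simulate transitions from t = 1 to t = 0 with a DDIM-style deterministic update."""
    _, sigma_1 = vp_schedule(1.0)
    x_t = sigma_1 * rng.gauss(0.0, 1.0)  # alpha_1 is (numerically) zero, so X_1 is pure noise
    for i in range(steps):
        t = 1.0 - i / steps
        s = 1.0 - (i + 1) / steps
        alpha_t, sigma_t = vp_schedule(t)
        alpha_s, sigma_s = vp_schedule(s)
        x_hat = denoiser(x_t, t)                     # estimate of the clean X
        eps_hat = (x_t - alpha_t * x_hat) / sigma_t  # implied noise estimate
        x_t = alpha_s * x_hat + sigma_s * eps_hat    # move from time t to time s
    return x_t


rng = random.Random(0)
samples = [ddim_sample(gaussian_denoiser, rng) for _ in range(2000)]
print(statistics.mean(samples), statistics.stdev(samples))
```

With an exact denoiser, the empirical mean and standard deviation of the samples should land close to those of the toy prior, with a small discretization bias that shrinks as the number of steps grows.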

API

The API of the azula package closely follows this formalism and defines three core components: azula.noise.Schedule, azula.denoise.Denoiser, and azula.sample.Sampler.
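As a rough picture of how three such components can compose, here is a self-contained toy in plain Python. The classes ToyVESchedule, ToyDenoiser and ToySampler are hypothetical stand-ins invented for this sketch; they are not Azula's classes, and the real constructors and method signatures may well differ. The geometric variance exploding schedule and the DDIM-style update are also assumptions.

```python
import random


class ToyVESchedule:
    """Hypothetical variance exploding schedule: alpha_t = 1, sigma_t grows geometrically."""

    def __init__(self, sigma_min=1e-3, sigma_max=1e2):
        self.sigma_min = sigma_min
        self.sigma_max = sigma_max

    def __call__(self, t):
        sigma = self.sigma_min * (self.sigma_max / self.sigma_min) ** t
        return 1.0, sigma


class ToyDenoiser:
    """Hypothetical denoiser: pairs a schedule with an estimate of the mean of X given (X_t, t)."""

    def __init__(self, schedule, mean_fn):
        self.schedule = schedule
        self.mean_fn = mean_fn

    def __call__(self, x_t, t):
        return self.mean_fn(x_t, t)


class ToySampler:
    """Hypothetical sampler: walks from t = 1 to t = 0 with a deterministic DDIM-style update."""

    def __init__(self, denoiser, steps=32):
        self.denoiser = denoiser
        self.steps = steps

    def init(self, rng):
        # Approximate X_1 by pure noise, which is reasonable when sigma_1 is large.
        _, sigma_1 = self.denoiser.schedule(1.0)
        return sigma_1 * rng.gauss(0.0, 1.0)

    def __call__(self, x_1):
        x_t = x_1
        for i in range(self.steps):
            t = 1.0 - i / self.steps
            s = 1.0 - (i + 1) / self.steps
            alpha_t, sigma_t = self.denoiser.schedule(t)
            alpha_s, sigma_s = self.denoiser.schedule(s)
            x_hat = self.denoiser(x_t, t)
            eps_hat = (x_t - alpha_t * x_hat) / sigma_t
            x_t = alpha_s * x_hat + sigma_s * eps_hat
        return x_t


# Toy data distribution: a point mass at 3.0, for which the ideal denoiser is constant.
denoiser = ToyDenoiser(ToyVESchedule(), mean_fn=lambda x_t, t: 3.0)
sampler = ToySampler(denoiser, steps=32)
x_0 = sampler(sampler.init(random.Random(0)))
print(x_0)
```

The point of the layering is that the sampler only talks to the denoiser, and the denoiser carries its schedule, so swapping the schedule or the sampling rule does not require touching the other parts.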

In addition, the azula.guidance submodule implements guidance and posterior sampling algorithms. Finally, the azula.plugins submodule hosts contributed code and compatibility wrappers for other diffusion model libraries. For example, azula.plugins.adm lets you load pre-trained diffusion models from the openai/guided-diffusion repository and use them through the same convenient interface.

Questions

What do you think of the current API? Is it convenient for you? What would you add/change?
What do you think of the README/docs/logo? What tutorials should we add?
Would you like to contribute? If yes, what part (architectures, sampling algorithms, guidance algorithms, compatibility wrappers, ...)?

Feel free to ask more questions!

gerome-andry · 2024-08-20T11:43:29Z


Sounds good to me! We have already discussed many of these points, and the corresponding features are coming soon.

I would be pleased to help with the tutorials.

2 replies

francois-rozet Aug 22, 2024
Maintainer Author

Thank you @gerome-andry! There are two tutorials currently, but they are not commented. Do you want to work on that? I am also looking for a tutorial similar to Zuko's "basics" tutorial.

gerome-andry Aug 22, 2024

I'll take a look!
This should be within my reach.

francois-rozet · 2024-08-22T16:20:49Z

Maintainer Author

Hello all 👋

I've made a lot of progress on the interface and the repo in general (tests, docs, contributing guidelines, README, ...). I consider the current version (0.1.0) to be the first beta version of Azula. I'm now looking for feedback (see the questions in the main post).

cc @gerome-andry, @JuliaLinhart, @bkmi, @michaeldeistler, @janfb, @simonschnake, @blt2114

0 replies
