Learning with reparameterized gradients for native PyTorch modules.
Mathematical formulations of learning with samples from reparameterized distributions keep posteriors separate from network structures.
We achieve the same full separation of sampling procedures from network structures in code by implementing a custom procedure for loading a state dictionary into an arbitrary network's parameters (PyTorch's default load_state_dict loses the gradients of samples).
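To see why a custom loader is needed: load_state_dict copies values into the existing nn.Parameter objects under no_grad, which detaches them from the computation graph of the sampler. The snippet below is a minimal sketch of a gradient-preserving alternative; load_with_gradients is a hypothetical name for illustration and not the library's actual routine.

```python
import torch
import torch.nn as nn

def load_with_gradients(module: nn.Module, samples: dict) -> None:
    """Hypothetical sketch of gradient-preserving loading: each registered
    nn.Parameter is replaced by the sampled tensor itself, so gradients can
    still flow back to the sampler that produced it."""
    for name, tensor in samples.items():
        *path, leaf = name.split(".")
        submodule = module
        for part in path:
            submodule = getattr(submodule, part)
        # Drop the registered parameter and attach the sampled tensor as a
        # plain attribute; nn.Module.__setattr__ would otherwise reject it.
        del submodule._parameters[leaf]
        setattr(submodule, leaf, tensor)

# Usage: a reparameterized weight sample keeps its graph after loading.
net = nn.Linear(3, 1)
mu = nn.Parameter(torch.zeros(1, 3))
log_sigma = nn.Parameter(torch.zeros(1, 3))
weight_sample = mu + log_sigma.exp() * torch.randn(1, 3)  # reparameterization trick
load_with_gradients(net, {"weight": weight_sample})
net(torch.randn(5, 3)).sum().backward()
print(mu.grad is not None, log_sigma.grad is not None)  # True True
```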
The library can be installed using:
pip install git+https://github.com/tkusmierczyk/reparameterized_pytorch.git#egg=reparameterized
Native PyTorch modules cannot take multiple sampled parameter sets at once. Hence, when we sample more than one set, we need to loop over them using take_parameters_sample. The forward operation is then repeated in each iteration, which makes execution slower.
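Conceptually, the loop amounts to the sketch below. To keep it self-contained it indexes a dictionary of stacked samples directly and uses torch.func.functional_call as a stand-in for the library's gradient-preserving loading; the names, shapes, and calls here are assumptions, not the actual API of take_parameters_sample.

```python
import torch
import torch.nn as nn

# Plain-PyTorch illustration: run the network once per sampled parameter set.
n_samples, batch = 8, 16
net = nn.Linear(3, 1)
x = torch.randn(batch, 3)

# Assumed layout: one stacked tensor of samples per parameter name.
parameters_samples = {
    "weight": torch.randn(n_samples, 1, 3),
    "bias": torch.randn(n_samples, 1),
}

outputs = []
for i in range(n_samples):
    single_set = {name: s[i] for name, s in parameters_samples.items()}
    # Stand-in for the library's gradient-preserving state-dict loading:
    # functional_call runs the forward pass with the given parameter values.
    outputs.append(torch.func.functional_call(net, single_set, (x,)))
predictions = torch.stack(outputs)  # [n_samples, batch, 1]
```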
- Learn Normalizing Flows for a BNN with a single wide hidden layer and a Matern52-like activation (compared against an MCMC baseline in Pyro)
- Learn a Normalizing Flow for Bayesian linear regression (13 dimensions; using the BNN wrapper class)
- Learn full-rank Normal for Bayesian linear regression
- Learn factorized Normal for Bayesian linear regression
- Minimize KL(q||p) for q modeled as a Bayesian Hypernetwork
- Minimize KL(q||p) for q modeled as a RealNVP flow
- Minimize KL(q||p) for q and p being factorized Normals (a plain-PyTorch sketch of this objective follows the list)
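For intuition about the last objective, here is a plain-PyTorch sketch that minimizes a Monte Carlo estimate of KL(q||p) using reparameterized samples, with q a factorized Normal with learnable parameters and p assumed to be a standard Normal. It only illustrates the underlying technique and is not code from the repository's notebooks.

```python
import torch

# Minimize a Monte Carlo estimate of KL(q||p) with reparameterized samples,
# where q = N(mu, sigma^2) is factorized with learnable parameters and
# p = N(0, 1) is an assumed target.
d, n_samples = 10, 64
mu = torch.randn(d, requires_grad=True)
log_sigma = torch.zeros(d, requires_grad=True)
p = torch.distributions.Normal(0.0, 1.0)
optimizer = torch.optim.Adam([mu, log_sigma], lr=1e-2)

for step in range(1000):
    optimizer.zero_grad()
    eps = torch.randn(n_samples, d)
    w = mu + log_sigma.exp() * eps                        # reparameterized samples from q
    q = torch.distributions.Normal(mu, log_sigma.exp())
    kl = (q.log_prob(w) - p.log_prob(w)).sum(-1).mean()   # MC estimate of KL(q||p)
    kl.backward()
    optimizer.step()
```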
The RealNVP implementation is based on code by Jakub Tomczak. The code for flows includes contributions by Bartosz Wójcik [bartwojc(AT)gmail.com] and Marcin Sendera [marcin.sendera(AT)gmail.com].