## Why

A lot of interesting information gets “summed away” during mini-batch backpropagation. I'm curious to learn what this information can tell us.

## What

The purpose of this project is to look at the distributions of gradients and weights within a mini-batch, before they are summed. These distributions can be marginalized in various ways to answer interesting questions. I also used these uncertainty measures to alter the backprop algorithm, among other experiments.
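For instance, one such marginalization could summarize how much each weight's gradient varies across the batch. A hypothetical sketch (the tensor shapes and the signal-to-noise scaling below are my own illustration of the kind of alteration meant here, not the project's actual rule):

```python
import torch

# Stand-in for a per-example weight-gradient tensor of shape (batch, n_out, n_in);
# the real values would come from the tweaked backward pass described under "How".
per_example_grad_W = torch.randn(64, 10, 784)

grad_mean = per_example_grad_W.mean(dim=0)   # the usual averaged gradient
grad_std = per_example_grad_W.std(dim=0)     # spread across the mini-batch
snr = grad_mean.abs() / (grad_std + 1e-8)    # per-weight signal-to-noise

# One possible alteration: damp updates for weights whose gradients
# disagree strongly across examples.
scaled_grad = grad_mean * snr / (1.0 + snr)
```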

## How

I implemented a quick and dirty neural network for the good ol' MNIST with PyTorch (which is quickly becoming my favorite library for prototyping). There is a slight tweak to the backprop calculation which lets you look at the distributions of deltas that normally get reduce-summed in matmuls.
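A minimal sketch of that tweak for a single fully connected layer (the shapes and variable names are my own illustration, not code from this repo):

```python
import torch

# In a standard backward pass, the weight gradient grad_W = delta.T @ x
# reduce-sums over the batch inside the matmul. A batched outer product
# keeps the per-example axis instead, so the distribution stays visible.
batch, n_in, n_out = 64, 784, 10
x = torch.randn(batch, n_in)       # layer input for one mini-batch
delta = torch.randn(batch, n_out)  # upstream gradient w.r.t. pre-activations

# Shape (batch, n_out, n_in): one weight-gradient "slice" per example.
per_example_grad_W = delta.unsqueeze(2) * x.unsqueeze(1)

# Summing over the batch recovers the usual mini-batch gradient.
grad_W = per_example_grad_W.sum(dim=0)
assert torch.allclose(grad_W, delta.t() @ x, atol=1e-4)
```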

A blog post is in the works to explain the results of this project.