Binning for mutual info calculation #12

pmorerio opened this issue Feb 22, 2018 · 8 comments
pmorerio commented Feb 22, 2018

Hi, I have been running your code with the option -activation_function=1, i.e. with ReLUs according to the README. However, I noticed that the binning for the mutual information calculation is performed between -1 and 1, independently of the activation function:

bins = np.linspace(-1, 1, num_of_bins)

This range, however, only seems appropriate for tanh activations.
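
For context, here is a minimal sketch (assumed numpy code, not the repo's actual implementation) of why the fixed [-1, 1] range is a problem for ReLUs: every activation at or above 1 lands in the same top bin, so the discretized distribution loses most of its variability before the entropy / mutual information is computed.

import numpy as np

# Hypothetical ReLU activations: non-negative, with many values above 1.
acts = np.maximum(0.0, np.random.randn(1000, 10) * 3.0)

num_of_bins = 30
bins = np.linspace(-1, 1, num_of_bins)      # bin edges suited to tanh outputs

# np.digitize maps every value >= 1 to the same (last) bin, so much of the
# layer's variability is collapsed into one symbol.
idx = np.digitize(acts.ravel(), bins)
print("fraction of activations in the top bin:", np.mean(idx == num_of_bins))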

@pmorerio (Author)

Hi again. I changed the line of code above so that the bin interval better reflects the asymmetric range of ReLU outputs (approximately; I may still be missing some larger values):

bins = np.linspace(0, 10, num_of_bins)

Since this increases the range, I also used more bins, with -num_of_bins=100.

This is what I get.

[attached figure]
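
A data-dependent alternative to the hard-coded upper edge (just a sketch under that assumption, not something the repo provides) would be to take the top edge from the observed activations, so nothing falls outside the binning range however large the ReLU outputs grow:

import numpy as np

def relu_bins(acts, num_of_bins=100):
    # Bin edges from 0 up to the largest observed activation, so no values
    # land outside the binning range regardless of how large the ReLU
    # outputs become during training.
    return np.linspace(0.0, acts.max(), num_of_bins)

# e.g. bins = relu_bins(layer_activations)   # layer_activations is a hypothetical array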

choasma commented Feb 24, 2018

I think your figure looks somewhat similar to the one in the paper under review,
https://openreview.net/forum?id=ry_WPG-A-
see Figure 1(B). There are also a few comments below it that are worth checking.

@pmorerio (Author)

@gladiator8072 Yes indeed, thanks. I have read the paper and the comments. I am inclined to think that the compression phase should be ascribed to the saturation of tanh and, for ReLU, to this incorrect binning. What do you think?
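
To illustrate the saturation part of this, a quick sketch (assumed code, not from the repo): as the pre-activations grow in magnitude over training, tanh outputs pile up near +/-1, occupy fewer of the fixed bins, and the binned entropy drops, which shows up as "compression" in the information plane.

import numpy as np

bins = np.linspace(-1, 1, 30)

def binned_entropy(acts, bins):
    # Discrete entropy (in bits) of the activations after binning.
    idx = np.digitize(acts.ravel(), bins)
    p = np.bincount(idx, minlength=len(bins) + 1).astype(float)
    p = p[p > 0] / p.sum()
    return -(p * np.log2(p)).sum()

pre = np.random.randn(100000)
for scale in (0.5, 2.0, 8.0):   # growing weight scale, as a stand-in for later training
    print(scale, binned_entropy(np.tanh(scale * pre), bins))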

choasma commented Feb 25, 2018

Not so sure. Apart from the bin range, the number of bins might cause some noise effects, and ReLU is unbounded on the positive side, which implies that the range in that direction is not fixed.
Actually, they also defend the ReLU result in the comments (https://goo.gl/VrRm5W), which show that ReLU also has a compression phase. I guess the repo is just not quite up to date (the comment was posted on 04 Jan, while the repo was last updated a few months earlier).

Please let me know if you find some secret :D

@pmorerio (Author)

Thanks, I read that document. Naftali Tishby and @ravidziv state that the authors of the ICLR paper

don’t know how to estimate mutual information correctly

However, they do not explain how the mutual information should be estimated in practice, and the code provided here certainly gives a wrong estimate for ReLUs. It would be great to have an update of this repo that replicates the results they show; that would clarify everything and shed some light.

pmorerio commented Feb 26, 2018

@gladiator8072 The code for that ICLR paper is here: https://github.com/artemyk/ibsgd

choasma commented Feb 27, 2018

Awesome! I'm going to study this!! Yeah... they didn't describe the correct estimation in much detail.

ravidziv (Owner) commented Dec 4, 2018

Hi,
Sorry for the late response.
There are many papers right now that try to calculate the mutual information with ReLU; it is a very complicated problem. This paper explains why the calculation of Saxe et al. is not a good estimate. You can look at this paper, for example, to see how to calculate mutual information in networks; they used PixelCNN++ to estimate the information on MNIST with convolutional networks.
