
train_loss, train_acc, valid_loss, valid_acc not changing when training #34

Open
liaoandi opened this issue May 12, 2018 · 6 comments

@liaoandi

liaoandi commented May 12, 2018

When I ran the initial test for parts 1--3, train_loss, train_acc, valid_loss, and valid_acc all stay the same across epochs. Is that normal, or did I do something wrong?
Also, about the instruction "Initialize with 512 hidden units apiece (except for the last layer)": does the 512 apply to the input layer, to the hidden layers only, or to both? Using 512 units for the hidden layers seems to generate really bad accuracy results.

@bensoltoff
Contributor

No, they should be changing values. As for the second question, that is 512 units for all the layers except the final one, which has to have 10 units to generate a probability score for each digit. Define "really bad accuracy results": what values are you getting for accuracy? They should be somewhere above 95% across all epochs.
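
For reference, a minimal sketch of that layer stack, assuming the five-layer structure the problem set describes (an illustration, not the official solution):

from keras import models, layers

# 512 units in every dense layer except the last; the final 10-unit
# softmax layer outputs one probability score per digit (0-9).
network = models.Sequential()
network.add(layers.Dense(512, activation='relu', input_shape=(28 * 28,)))
network.add(layers.Dense(512, activation='relu'))
network.add(layers.Dense(512, activation='relu'))
network.add(layers.Dense(512, activation='relu'))
network.add(layers.Dense(10, activation='softmax'))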

@liaoandi
Author

liaoandi commented May 12, 2018

  • Actually, I am not sure I have the correct understanding of the layers, or of the instructions for the initial model here.

  • In the deep learning textbook, chapter 2.1, the network structure for the MNIST dataset looks like this:

from keras import models, layers  # imports the snippet needs

network = models.Sequential()
network.add(layers.Dense(512, activation='relu', input_shape=(28 * 28,)))
network.add(layers.Dense(10, activation='softmax'))

  • And running this command:

network.summary()

I got this:

Layer (type)         Output Shape    Param #
=============================================
dense_4 (Dense)      (None, 512)     401920
dense_5 (Dense)      (None, 10)      5130
=============================================
Total params: 407,050
Trainable params: 407,050
Non-trainable params: 0

  • There seem to be two dense layers in this model, and the training and validation accuracy can reach above 95%. But if another dense layer is added,
    network.add(layers.Dense(512, activation='relu'))
    the training and validation accuracy drop to around 0.6 or 0.7.

  • If "5 dense, fully-connected layers" is used, which I interpreted as one input layer, three hidden layers, and one output layer:

network = models.Sequential()

network.add(layers.Dense(512, activation='relu', input_shape=(28 * 28,)))

network.add(layers.Dense(512, activation='relu'))

network.add(layers.Dense(512, activation='relu'))

network.add(layers.Dense(512, activation='relu'))

network.add(layers.Dense(10, activation='softmax'))

the training accuracy and validation accuracy will decrease to about 0.1 and that is the "bad result" I mentioned before.
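
For context, a sketch of the compile-and-fit step that produces these metrics, following the textbook's chapter 2.1 calls (the validation_split argument is added here so that valid_loss and valid_acc get reported; exact values will differ per run):

from keras.datasets import mnist
from keras.utils import to_categorical

# Preprocessing as in the textbook: flatten each image to 784 floats
# scaled to [0, 1], and one-hot encode the labels.
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
train_images = train_images.reshape((60000, 28 * 28)).astype('float32') / 255
train_labels = to_categorical(train_labels)

# Continuing with the 5-layer `network` defined above.
network.compile(optimizer='rmsprop',
                loss='categorical_crossentropy',
                metrics=['accuracy'])
history = network.fit(train_images, train_labels,
                      epochs=5, batch_size=128,
                      validation_split=0.2)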

@bensoltoff
Contributor

There doesn't appear to be anything wrong with the model code. Are you looking at the correct metrics? Accuracy measures how often the model's prediction is correct; it should be above 95% for the initial model in the problem set. The loss function is categorical crossentropy; it should rise to a value of about 2.5 on the validation set by the last epoch.
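
One quick way to confirm is to print the metric keys Keras records (a sketch, assuming `history` holds the return value of network.fit() and Keras 2.x naming, where accuracy is logged as 'acc'):

# The History object stores one list per metric, one entry per epoch.
print(history.history.keys())
# expected keys: 'loss', 'acc', 'val_loss', 'val_acc'
print(history.history['val_loss'][-1])  # validation loss at the last epoch
print(history.history['val_acc'][-1])   # validation accuracy at the last epoch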

@liaoandi
Author

I believe I am looking at the correct metrics... I ran both the 5-layer and the 2-layer versions and pushed the results to the PS3 folder of my repo, as the output is too long to paste here. The results are in the first few blocks of the ipynb file.

@bensoltoff
Contributor

I'll try to clone and run your notebook on my Amazon instance to see if I get the same results. I'll let you know how it goes once I have some time to attempt it.

@liaoandi
Author

Okay, figured it out. It was an environment problem with RCC; the code works fine in other environments.
