train_loss, train_acc, valid_loss, valid_acc not changing when training #34
Comments
No, they should be changing values. As for the second question, that is 512 units for all the layers except the final one, which has to have 10 units to generate probability scores for each digit. Define "really bad accuracy results": what kinds of values are you getting for accuracy? They should be somewhere above 95% across all epochs.
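To make the layer sizing concrete, here is a rough sketch that counts the trainable parameters of a fully connected stack with 512 units in every layer except a 10-unit output. The input size of 784 (flattened 28x28 digit images) is my assumption, not something stated in the problem set, and `dense_param_count` and `layer_sizes` are hypothetical helper names. The totals can be compared against the "Param #" column of a model summary to check that the layers are wired as intended.

```python
def dense_param_count(sizes):
    """Count weights plus biases for consecutive fully connected layers."""
    total = 0
    for n_in, n_out in zip(sizes, sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix + bias vector
    return total


INPUT_DIM = 784     # assumption: flattened 28x28 digit images
HIDDEN_UNITS = 512  # from the instruction: 512 units apiece
NUM_CLASSES = 10    # one output unit per digit


def layer_sizes(num_hidden_layers):
    """Input size, then the 512-unit layers, then the 10-unit output layer."""
    return [INPUT_DIM] + [HIDDEN_UNITS] * num_hidden_layers + [NUM_CLASSES]


if __name__ == "__main__":
    for n in (1, 2):
        print(n, "hidden layer(s):", dense_param_count(layer_sizes(n)))
```

Note that the input is not a layer with units of its own here; its size is fixed by the data, and the 512 applies to the layers that follow it.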
I got this: Layer (type) Output Shape Param #
The training accuracy and validation accuracy decrease to about 0.1, and that is the "bad result" I mentioned before.
There doesn't appear to be anything wrong with the model code. Are you looking at the correct metrics? The accuracy measures how often the model's prediction is correct, and it should be above 95% for the initial model in the problem set. The loss function is categorical crossentropy, and it should have a value rising up to about 2.5 for the validation set by the last epoch.
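For reference, an accuracy of about 0.1 on a 10-class problem is what a model that has learned nothing would score on roughly balanced labels. The sketch below computes both metrics by hand in plain Python (the function names are mine, not a library API) for a model that outputs the same uniform probabilities for every input. It only shows what "stuck at chance" looks like numerically; it says nothing about why training is stuck.

```python
import math


def categorical_crossentropy(true_index, probs):
    """Loss for one sample: negative log of the probability given to the true class."""
    return -math.log(probs[true_index])


def accuracy(labels, predictions):
    """Fraction of samples whose predicted class equals the true class."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


NUM_CLASSES = 10
uniform = [1.0 / NUM_CLASSES] * NUM_CLASSES

# Balanced toy labels: each digit appears 10 times (illustrative data only).
labels = list(range(NUM_CLASSES)) * 10
# A model stuck on a single class predicts that class for every sample.
predictions = [0] * len(labels)

chance_loss = categorical_crossentropy(3, uniform)
chance_acc = accuracy(labels, predictions)

if __name__ == "__main__":
    print("loss at chance:", chance_loss)      # equals ln(10)
    print("accuracy at chance:", chance_acc)
```

If the reported loss sits near ln(10) and the accuracy near 0.1 from the first epoch onward without moving, the weights are probably not being updated at all.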
I believe I am looking at the correct metrics... I ran both the 5-layer version and the 2-layer version and pushed the results to the PS3 folder of my repo, as the output is too long to paste here. The results are in the first few blocks of the ipynb file.
I'll try to clone and run your notebook on my Amazon instance to see if I get the same results. I'll let you know how that goes once I get some time to attempt it.
Okay, figured it out. It is an environment problem on RCC. The code works fine in other environments.
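In case it helps anyone comparing environments, a small sketch like the one below prints the Python version and the installed versions of a few packages so the output from two machines can be diffed. The package names are my guess at what matters here, not a confirmed list of what differed on RCC, and `installed_versions` is a hypothetical helper. It uses only the standard library, so it runs even where the packages themselves fail to import.

```python
import sys
from importlib import metadata


def installed_versions(packages=("tensorflow", "keras", "numpy")):
    """Map each package name to its installed version, or None if not installed."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    print("python:", sys.version.split()[0])
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```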
When I tried to do the part 1 -- 3 initial test, train_loss, train_acc, valid_loss, and valid_acc stay the same. Is that normal, or did I do something wrong?
And for this line of the instructions, "Initialize with 512 hidden units apiece (except for the last layer)": is the parameter of 512 units for the input layer, for the hidden layers only, or for both the input and hidden layers? It seems that using 512 units for the hidden layers generates really bad accuracy results.