
API: summary mode=None keeps current mode (#331) #337

Open: wants to merge 4 commits into main

Conversation

ego-thales
@ego-thales ego-thales commented Nov 29, 2024

Closes #331.

Changes the behaviour of mode=None in summary.

Before: used the global default Mode.EVAL.
Now: keeps the model's current mode.
Note that this changes the default behaviour.
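The behaviour change can be sketched in isolation (the helper name and signature below are illustrative, not torchinfo's actual internals; only the `mode=None` semantics come from this PR):

```python
# Hypothetical sketch of the mode=None behaviour change.
# `resolve_training_flag` is an illustrative name, not torchinfo API.
def resolve_training_flag(mode, model_is_training):
    """Return the training flag to use for the summary forward pass."""
    if mode is None:
        # Before this PR: always fell back to eval mode (return False).
        # After this PR: keep whatever mode the model is currently in.
        return model_is_training
    return mode == "train"

# The model's current mode is preserved when mode is None:
assert resolve_training_flag(None, True) is True
assert resolve_training_flag(None, False) is False
# Explicit modes still override:
assert resolve_training_flag("eval", True) is False
assert resolve_training_flag("train", False) is True
```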


I couldn't pass tests/torchinfo_xl_test.py::test_flan_t5_small because my proxy didn't allow model downloading. Also, I didn't test on a GPU, so the GPU tests were not run.

Side questions: while running the tests, I noticed that, depending on the mode ("train" or "eval"), the summary sometimes shows the top-level parameter count and sometimes does not:

```
# EVAL MODE
==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
GoogLeNet                                [1, 1000]                 6,379,984
├─BasicConv2d: 1-1                       [1, 64, 56, 56]           --
│    └─Conv2d: 2-1                       [1, 64, 56, 56]           9,408
...
# TRAIN MODE
==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
GoogLeNet                                [1, 1000]                 --
├─BasicConv2d: 1-1                       [1, 64, 56, 56]           --
│    └─Conv2d: 2-1                       [1, 64, 56, 56]           9,408
...
```

Is there a specific reason for this behaviour?

Furthermore, in this case, the top level shows 6.4M params, while the total count shows 13M params and Google claims around 6.xM params. Any leads to help me understand this?

Thanks!

@TylerYep
Owner

Are there other modes besides train and eval? I'm wondering why we are deleting the enum - we should probably keep that around.

@eliegoudout

Are there other modes besides train and eval?

To my knowledge, no.

I'm wondering why we are deleting the enum - we should probably keep that around.

Well, I feel like with the new behaviour, it does not make sense to keep a class that would only be used exactly once. I think it brings more confusion than clarity or simplicity (as can be seen from the fact that this PR removes ~15 lines of code).
Why do you feel it would be a good idea to keep it? I don't have a strong opinion; I was just trying to avoid superfluous architecture.

@TylerYep
Owner

torchinfo is typed, so I prefer passing enum values over strings when possible. To most callers, it makes no difference, but I think it makes the internals easier to maintain in the long term, since it clearly states what string values are allowed from the API spec.

Could you also update the README with this typing change as well? Thanks!

@ego-thales
Author

I followed your preference and implemented the change using Mode Enum. There are infinitely many ways of doing so, but to keep changes minimal, since your enum subclasses str, I decided to use "same" instead of None for this new option.
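A minimal sketch of what such a str-subclassing enum looks like (the member names are taken from the discussion above; the class body itself is illustrative, not a copy of torchinfo's source):

```python
from enum import Enum

class Mode(str, Enum):
    # Illustrative sketch of the typed API discussed above.
    # "same" is the new option that keeps the model's current mode
    # instead of forcing train or eval.
    TRAIN = "train"
    EVAL = "eval"
    SAME = "same"

# Because Mode subclasses str, plain strings compare equal to members,
# so existing callers passing "train"/"eval" keep working:
assert Mode.SAME == "same"
assert Mode("train") is Mode.TRAIN
```

This is the usual trade-off of a `str`-mixin enum: callers may keep passing strings, while the type annotations still document the full set of allowed values.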

If this is an issue, feel free to build on this PR!

Also, I couldn't run pre-commit because of my proxy issue, so I committed with --no-verify. I think that's reasonable given the number and simplicity of the changed lines.

All the best.
Élie

Note: I noticed that the README.md example outputs don't match the indentation I get when I run them.

Successfully merging this pull request may close these issues.

Allow mode=None to keep current mode?
3 participants