Fix QAT resume with BN models, checkpoint name #260

oguzhanbsolak · 2023-10-17T20:00:00Z

I noticed these bugs recently, and this PR is independent of the bug fixes and updates I mentioned in last week's meeting.

When resuming from a checkpoint, a new optimizer is created from the model. Because batchnorm fusing reduces the model's parameter count, when resuming from a qat_checkpoint the new optimizer has fewer parameters than the optimizer state_dict stored in the checkpoint.
Therefore, I added an update_optimizer function, to be called at initiate_qat, that strips the batchnorm parameters from the optimizer.
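
To illustrate the idea, here is a minimal sketch; it is not the actual update_optimizer from this PR. It assumes PyTorch, stands in for batchnorm fusing by swapping the `BatchNorm2d` layer for `nn.Identity`, and uses a hypothetical helper name, `strip_stale_params`.

```python
import torch
from torch import nn


def strip_stale_params(model: nn.Module, optimizer: torch.optim.Optimizer) -> None:
    """Hypothetical helper: drop optimizer entries for parameters that are
    no longer part of `model`. Assumes a single param group."""
    live = {id(p) for p in model.parameters()}
    group = optimizer.param_groups[0]
    stale = [p for p in group["params"] if id(p) not in live]
    group["params"] = [p for p in group["params"] if id(p) in live]
    for p in stale:
        # Also discard per-parameter state such as momentum buffers.
        optimizer.state.pop(p, None)


model = nn.Sequential(nn.Conv2d(1, 4, 3), nn.BatchNorm2d(4))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# One training step so the optimizer holds state for every parameter.
model(torch.randn(2, 1, 8, 8)).sum().backward()
optimizer.step()
n_before = len(optimizer.param_groups[0]["params"])

# Stand-in for batchnorm fusing: the BN layer and its parameters disappear.
model[1] = nn.Identity()
strip_stale_params(model, optimizer)
n_after = len(optimizer.param_groups[0]["params"])
n_saved = len(optimizer.state_dict()["param_groups"][0]["params"])
```

After the call, the optimizer tracks only the convolution's weight and bias, so the state_dict it saves has the same number of entries as an optimizer built fresh from the fused model.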

Also, after resuming from a qat_checkpoint, checkpoint names were incorrect. This PR includes a simple fix for that as well.

Limitations:

This solution has been tested only for the single params_group case. Since our current train.py structure always produces a single params_group, it should be fine. However, if we ever add support for different learning rates for different layers, we should revisit this function.
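
For reference, the multi-group case might look like the sketch below. It is an assumption about how such an extension could work, not code from this PR or from train.py; the helper name `strip_stale_params_all_groups` is hypothetical, and replacing the `BatchNorm2d` layer with `nn.Identity` is only a stand-in for batchnorm fusing.

```python
import torch
from torch import nn


def strip_stale_params_all_groups(model: nn.Module,
                                  optimizer: torch.optim.Optimizer) -> None:
    """Hypothetical helper: filter every param group, keeping each group's
    own hyperparameters (for example its learning rate) untouched."""
    live = {id(p) for p in model.parameters()}
    for group in optimizer.param_groups:
        kept = []
        for p in group["params"]:
            if id(p) in live:
                kept.append(p)
            else:
                # Parameter left the model: drop any state held for it.
                optimizer.state.pop(p, None)
        group["params"] = kept


conv = nn.Conv2d(1, 4, 3)
bn = nn.BatchNorm2d(4)
model = nn.Sequential(conv, bn)

# Two groups with different learning rates, one per layer.
optimizer = torch.optim.SGD(
    [
        {"params": conv.parameters(), "lr": 0.1},
        {"params": bn.parameters(), "lr": 0.01},
    ],
    lr=0.1,
)

# Stand-in for batchnorm fusing.
model[1] = nn.Identity()
strip_stale_params_all_groups(model, optimizer)

group_sizes = [len(g["params"]) for g in optimizer.param_groups]
group_lrs = [g["lr"] for g in optimizer.param_groups]
```

Note that this sketch leaves the now-empty batchnorm group in place. Whether to keep or delete empty groups would need to match how the checkpoint's optimizer state_dict was written.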

Bugfixes: Qat resume with BN models, ckpt name

ermanok

Looks good.


oguzhanbsolak added 4 commits October 17, 2023 20:59

Bugfixes: Qat resume with BN models, ckpt name

32f3bc5

linter updates

3d8dd76

linter updates

c3b0298

Merge pull request #10 from oguzhanbsolak/pr259

93c237d


oguzhanbsolak requested review from rotx-eva, alicangok, ermanok, MaximGorkem, seldauyanik-maxim and asyatrhl October 17, 2023 20:00

ermanok approved these changes Oct 18, 2023


rotx-eva approved these changes Oct 18, 2023


rotx-eva changed the title ~~Bugfixes: Qat resume with BN models, ckpt name~~ Fix QAT resume with BN models, checkpoint name Oct 18, 2023

rotx-eva merged commit 87351b3 into analogdevicesinc:develop Oct 18, 2023
3 of 4 checks passed

rotx-eva pushed a commit that referenced this pull request Mar 8, 2024

Refactor Sample Motor Data Limerick Dataloader (#293)

9331bc1

* Add AutoEncoder Model and Evaluation Notebook (#260)
