Various questions about implementation #239
cestpasphoto asked this question in Q&A (unanswered)
Hi,
First, thank you very much to all contributors to this repo, which is highly pedagogical!
As a newbie in reinforcement learning and machine learning in general, I started to fork your project, and doing so raised a few questions for me:
Random generation of batch
In NNet.py, the samples for each batch are drawn at random, independently of the previous batches of the same epoch. The PyTorch and TensorFlow implementations share the same technique. But in this Stack Overflow answer, the training code is different and ensures that no sample is shared between batches within the same epoch. Are both approaches acceptable?
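For reference, here is a minimal sketch contrasting the two strategies as I understand them; `examples`, `batch_size` and the toy data are placeholders of my own, not the repo's exact code:

```python
import numpy as np

examples = list(range(1000))   # stand-in for (board, pi, v) training examples
batch_size = 64
num_batches = len(examples) // batch_size

# Strategy 1 (as I read NNet.py): every batch is drawn independently with
# replacement, so one sample may appear in several batches of the same epoch.
for _ in range(num_batches):
    sample_ids = np.random.randint(len(examples), size=batch_size)
    batch = [examples[i] for i in sample_ids]

# Strategy 2 (the Stack Overflow answer, as I understand it): permute once,
# then slice, so batches are disjoint and each sample is seen exactly once.
perm = np.random.permutation(len(examples))
for b in range(num_batches):
    batch = [examples[i] for i in perm[b * batch_size:(b + 1) * batch_size]]
```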
Double random?
In Coach.py, trainExamples is shuffled before calling nnet.train(trainExamples). But what is the point if batches are then sampled completely at random from it?
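A toy sketch of why the shuffle looks redundant to me (hypothetical names and data): since every batch index is drawn uniformly at random anyway, shuffling beforehand does not change the distribution of the resulting batches.

```python
import random
import numpy as np

trainExamples = list(range(100))   # toy stand-in for (board, pi, v) tuples

random.shuffle(trainExamples)      # the shuffle done before training

# Batches are then built from uniformly random indices, so they are
# distributed identically with or without the preceding shuffle.
sample_ids = np.random.randint(len(trainExamples), size=8)
batch = [trainExamples[i] for i in sample_ids]
```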
No use of lr variable?
It looks like the learning rate parameter lr isn't used at all, in particular not in the Adam optimizer constructor; the same goes for the TensorFlow implementation. Did I miss something?
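For what it's worth, a minimal sketch of what I would have expected, with a stand-in model and learning rate of my own choosing:

```python
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(4, 2)   # stand-in for the actual network
lr = 0.001                # stand-in for the configured learning rate

# If lr is not passed explicitly, Adam silently falls back to its
# default of 1e-3, and the configured value is ignored.
optimizer = optim.Adam(model.parameters(), lr=lr)
```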
Optimizer history
In NNet.py, the Adam optimizer is re-created at each iteration. Wouldn't it be better to keep it and reuse it? Adam maintains internal state (running estimates of the first and second moments of the gradients), so it might be better not to start from scratch at each training round?
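A minimal sketch of what I mean, again with hypothetical names:

```python
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(4, 2)   # stand-in for the actual network

# Created once, outside the iteration loop, so Adam's per-parameter moment
# estimates survive across training rounds instead of being reset each time.
optimizer = optim.Adam(model.parameters(), lr=0.001)

for iteration in range(10):   # self-play / training iterations
    # ... generate new examples, then run the training loop,
    # reusing the same optimizer instance ...
    pass
```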
Thank you in advance!