
logs35: RL finally works

Higepon Taro Minowa edited this page Aug 4, 2018 · 4 revisions

parameters

  • Lowering the learning rate from 0.01 to 0.001 made training stable.
  • Loss gradually decreased.
  • Average reward also decreased (suspicious: if RL training is working, the reward should go up, not down).
  • https://mega.nz/fm/LeZwWSgS
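
One way to check the suspicious reward trend is to compare the first and second halves of the logged loss and reward series. The sketch below is only an illustration: the `trend` helper and the numbers in `losses` and `rewards` are made up for this example and are not taken from this run.

```python
def trend(series):
    """Return mean of the second half minus mean of the first half.

    Negative means the series is falling overall. Needs at least 2 values.
    """
    half = len(series) // 2
    first, second = series[:half], series[half:]
    return sum(second) / len(second) - sum(first) / len(first)


# Hypothetical per-epoch values, for illustration only.
losses = [2.0, 1.8, 1.5, 1.3, 1.1, 1.0]
rewards = [0.9, 0.8, 0.8, 0.6, 0.5, 0.4]

loss_trend = trend(losses)
reward_trend = trend(rewards)

# Loss and reward falling together is the pattern flagged as suspicious above.
both_falling = loss_trend < 0 and reward_trend < 0
print(f"loss trend {loss_trend:+.2f}, reward trend {reward_trend:+.2f}, "
      f"both falling: {both_falling}")
```

If both trends come out negative on the real logs, it may be worth checking the sign of the reward term in the loss and how the reward average is computed.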