
logs4: See if RL framework is working

Higepon Taro Minowa edited this page Apr 18, 2018 · 1 revision

See if RL framework is working 2018/04/15 (paused)

  • Log 1: what specific output am I working on right now?
    • Before we go with complex RL, we should confirm that the RL framework is working.
  • Log 2: thinking out loud - e.g. hypotheses about the current problem, what to work on next
    • One of the easiest ways is to give a higher reward when the reply is longer.
    • By measuring the average reply length per step, we should observe the average length increasing.
    • steps
      • enable length reward
      • log average length
      • start RL
  • Log 3: record of currently ongoing runs along with a short reminder of what question each run is supposed to answer
    • If avg_len eventually goes up, this is a good sign.
    • The reward should also go up.
    • The loss should converge.
    • If avg_len doesn't go up, something is wrong :(
    • Someone told me the reward should be averaged, because it has high variance.
    • Paused here, because we have to fix the logging issue to see the results.
  • Log 4: results of runs (TensorBoard graphs, any other significant observations), separated by type of run (e.g. by environment the agent is being trained in)
    • TODO. This should be logged carefully.
    • TODO model/README + ipynb
    • models20180414ReinforcementLearningLengthReward
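The length-reward sanity check described in Log 2 (enable length reward, log average length, start RL) can be sketched as below. This is a minimal illustration, not the actual chatbot code: `length_reward` and `RunningAverage` are hypothetical names, and the reward here simply scales with the number of tokens in the reply, clipped at a maximum.

```python
def length_reward(reply_tokens, max_len=20):
    """Toy reward: longer replies score higher, clipped at max_len tokens."""
    return min(len(reply_tokens), max_len) / max_len

class RunningAverage:
    """Tracks the average reply length per step for the sanity check."""
    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, value):
        self.total += value
        self.count += 1

    @property
    def value(self):
        return self.total / self.count if self.count else 0.0

# Simulated training loop: if the policy is actually learning from the
# length reward, avg_len should trend upward across steps.
avg_len = RunningAverage()
for step, reply in enumerate([["hi"], ["hi", "there"], ["hi", "there", "friend"]]):
    reward = length_reward(reply)
    avg_len.update(len(reply))
    print(step, len(reply), round(reward, 2), round(avg_len.value, 2))
```

Plotting `avg_len.value` per step (e.g. in TensorBoard) gives the "increase of average length" signal the log is looking for.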