logs4: See if RL framework is working
- Log 1: what specific output am I working on right now?
  - Before we go on to complex RL, we should confirm that the RL framework is working at all.
- Log 2: thinking out loud - e.g. hypotheses about the current problem, what to work on next
  - One of the easiest ways is to give a higher reward when the reply is longer.
  - If we measure the average reply length at each step, we should observe the average length increasing (see the sketch after this list).
  - Steps:
    - enable the length reward
    - log the average length
    - start RL
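A minimal sketch of what the length reward and the avg_len metric could look like. The helper names, the EOS convention, the `max_len=30` cap, and the TF 1.x logging snippet are assumptions for illustration, not the repo's actual code:

```python
import numpy as np

EOS_ID = 2  # assumed id of the end-of-sequence token


def reply_length(token_ids):
    """Length of a reply in tokens, counted up to the first EOS."""
    for i, token in enumerate(token_ids):
        if token == EOS_ID:
            return i
    return len(token_ids)


def length_reward(token_ids, max_len=30):
    """Reward in [0, 1] that grows with reply length, capped at max_len."""
    return min(reply_length(token_ids), max_len) / float(max_len)


def average_length(batch_of_replies):
    """avg_len metric to log every step; it should trend upward if RL works."""
    return float(np.mean([reply_length(r) for r in batch_of_replies]))


# Logging avg_len to TensorBoard (TF 1.x style; assumes a tf.summary.FileWriter
# named `writer` and the current training step `step` already exist):
#   avg = average_length(batch_of_replies)
#   summary = tf.Summary(value=[tf.Summary.Value(tag="avg_len", simple_value=avg)])
#   writer.add_summary(summary, global_step=step)
```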
- Log 3: record of currently ongoing runs along with a short reminder of what question each run is supposed to answer
  - If avg_len eventually goes up, that is a good sign.
  - The reward should also go up.
  - The loss should converge.
  - If avg_len doesn't go up, something is wrong :(
  - Someone taught me that the reward should be averaged, because it has high variance (see the normalization sketch after this list).
  - Paused here, because we have to fix the logging issue before we can see the results.
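A minimal sketch of the reward-averaging idea: keep running statistics of the reward and subtract the running mean (and divide by the running std) before feeding it to the policy gradient. The class name and the decay value are assumptions, not the repo's actual code:

```python
import numpy as np


class RunningRewardNormalizer:
    """Tracks an exponential moving average of the reward mean and variance."""

    def __init__(self, decay=0.99):
        self.decay = decay
        self.mean = 0.0
        self.var = 1.0

    def normalize(self, rewards):
        rewards = np.asarray(rewards, dtype=np.float64)
        # Update the running statistics from the current batch.
        self.mean = self.decay * self.mean + (1 - self.decay) * rewards.mean()
        self.var = self.decay * self.var + (1 - self.decay) * rewards.var()
        # Centering and scaling the reward reduces the variance of the
        # policy-gradient estimate.
        return (rewards - self.mean) / (np.sqrt(self.var) + 1e-8)


# Usage (assuming the hypothetical length_reward helper from the sketch above):
#   normalizer = RunningRewardNormalizer()
#   rewards = [length_reward(r) for r in batch_of_replies]
#   advantages = normalizer.normalize(rewards)
```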
- Log 4: results of runs (TensorBoard graphs, any other significant observations), separated by type of run (e.g. by environment the agent is being trained in)
  - TODO. This should be logged carefully.
  - TODO: model/README + ipynb
    - models20180414ReinforcementLearningLengthReward