Home


Higepon Taro Minowa edited this page May 17, 2018 · 213 revisions

Seq2Seq RL Chatbot implementation notes

Links

Algorithm for Reinforcement Learning
Log Template
logs13: Understand Policy Gradient (in progress)

Ideas & concerns

Client ID on the chart is broken.
Log policy entropy.
Limit the number of iterations to check whether learning works.
Save iterator state.
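
For the policy-entropy item above, a minimal sketch of what could be logged, assuming the policy produces one vector of logits over the vocabulary per decoding step. The function names are hypothetical and not taken from this project's code; plain Python is used here instead of the training framework so the arithmetic is easy to check.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert logits to probabilities, subtracting the max for stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_entropy(logits: Sequence[float]) -> float:
    """Entropy (in nats) of the action distribution for one decoding step."""
    probs = softmax(logits)
    # Skip zero-probability entries, since p * log(p) tends to 0 as p -> 0.
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_policy_entropy(step_logits: Sequence[Sequence[float]]) -> float:
    """Average entropy over all decoding steps of one sampled reply."""
    entropies = [policy_entropy(logits) for logits in step_logits]
    return sum(entropies) / len(entropies)


if __name__ == "__main__":
    # Hypothetical logits for a 4-token vocabulary over two decoding steps.
    uniform_step = [0.0, 0.0, 0.0, 0.0]
    peaked_step = [100.0, 0.0, 0.0, 0.0]
    print("mean entropy:", mean_policy_entropy([uniform_step, peaked_step]))
```

A value near log(vocabulary size) means the policy is close to uniform, while a value near zero means it has collapsed onto a few tokens, so plotting this per training step is one way to spot premature collapse.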

History