Logs27: Mutual Information

Higepon Taro Minowa edited this page Jun 8, 2018 · 30 revisions

Steps

  • Wait ... we'll have to use conversations.db after all, because we need p_seq2seq(a | p_i, q_i)
  • Fully understand MI
    • Read the original paper
      • P_MI is trained by maximizing the mutual information between source and target.
      • P_RL is trained with RL agents (so that they can use the dialogue history)?
      • Let's check the existing implementation.
    • Understand where p_i and q_i come from in the training
      • p_i: "let's eat curry"
      • q_i: "How about kokoichi?"
      • p_{i+1}: "sounds good"
  • Always start with a small model.
  • Have backward seq2seq training in place.
  • Find old implementation of mutual information.
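As a sketch of the mutual-information scoring step above: given a forward model score log p_seq2seq(a | p_i, q_i) and a backward model score log p_backward(q_i | a), candidates can be reranked by their weighted sum. This is a minimal, hypothetical sketch (the function names and the weight `lam` are assumptions, not values from this repo):

```python
def mmi_score(forward_logprob, backward_logprob, lam=0.5):
    """Combine forward and backward seq2seq log-probabilities.

    Scores a candidate reply a for context (p_i, q_i) as
    log p_seq2seq(a | p_i, q_i) + lam * log p_backward(q_i | a).
    `lam` is a hypothetical weighting hyperparameter.
    """
    return forward_logprob + lam * backward_logprob


def rerank(candidates):
    """Rerank candidate replies by MMI score, best first.

    `candidates` is a list of (reply, forward_logprob, backward_logprob).
    """
    return sorted(candidates, key=lambda c: mmi_score(c[1], c[2]), reverse=True)


# Toy example: a generic reply has a high forward score, but the
# backward model finds the source hard to recover from it.
cands = [
    ("i don't know", -1.0, -8.0),
    ("how about kokoichi", -2.0, -1.5),
]
best_reply = rerank(cands)[0][0]
```

The point of the backward term is exactly the diversity effect noted above: generic replies like "i don't know" get penalized because they carry little information about the source.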