Logs27: Mutual Information

Higepon Taro Minowa edited this page Jun 13, 2018 · 30 revisions

Steps

  1. Training data set
    • p_i: "Let's have curry for lunch."
    • q_i: "Maybe Coco ichi?"
    • p_i+1: "Sounds good."
  2. Train seq2seq
    • X: concat(p_i, q_i)
    • Y: p_i+1
  3. Train seq2seq_backward
    • X: p_i+1
    • Y: q_i
  4. RL Training
    1. Beam Search
      • X: concat(p_i, q_i) [batch_size, decoder_length]
      • beam_replies: [batch_size, decoder_length, beam_width]
      • logits: [batch_size, decoder_length, vocab_size]
    2. Calc reward
    3. Get log_prob: [batch_size, decoder_length, beam_width]
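
Steps 1 to 3 above can be sketched as follows: each run of three consecutive turns (p_i, q_i, p_i+1) yields one training pair for seq2seq and one for seq2seq_backward. This is only a minimal sketch. The `SEP` token, the `build_pairs` helper, and the sliding window over every consecutive triple are assumptions for illustration; the notes do not say how concat(p_i, q_i) is actually built.

```python
from typing import List, Tuple

# Hypothetical separator; the notes do not specify how p_i and q_i are joined.
SEP = " <sep> "

Pair = Tuple[str, str]


def build_pairs(turns: List[str]) -> Tuple[List[Pair], List[Pair]]:
    """Turn a dialogue into (X, Y) pairs for the forward and backward models."""
    forward: List[Pair] = []
    backward: List[Pair] = []
    # Slide a window of three consecutive turns: (p_i, q_i, p_i+1).
    for p_i, q_i, p_next in zip(turns, turns[1:], turns[2:]):
        # seq2seq: X = concat(p_i, q_i), Y = p_i+1
        forward.append((p_i + SEP + q_i, p_next))
        # seq2seq_backward: X = p_i+1, Y = q_i
        backward.append((p_next, q_i))
    return forward, backward


turns = ["Let's have curry for lunch.", "Maybe Coco ichi?", "Sounds good."]
forward_pairs, backward_pairs = build_pairs(turns)
```

With the three example turns, this gives exactly one forward pair and one backward pair.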

OLD

Steps

  • Make it possible for beam search to coexist with infer (done)
  • Return infer_logits when running beam search
  • Get logits for predicted_id
  • Have beam_logits.
  • Refactoring
    • extract attention method.
    • Unify the model class?
  • Confirm beam_logits has the same shape and the same values as logits.
  • For one beam search result, get the indices.
  • Fetch the log-prob at those indices.
  • reward back? or make it for multiple.
  • Wait ... will we have to use conversations.db after all? We need p_seq2seq(a | p_i, q_i).
  • Fully understand MI
  • Read the original paper
  • Read the original original paper
  • we did not train a joint model (log p(T|S) − λ log p(T)), but instead trained maximum likelihood models, and used the MMI criterion only during testing.
  • P_MI is trained by calculating MI between source and target.
  • P_RL is trained by RL agents (so that they can get dialogue history)?
  • Let's check the existing implementation.
  • Understand where p_i, q_i come from in the training
    • p_i: let's eat curry
    • q_i: How about kokoichi
    • p_i+1: sounds good
  • Always start with a small model.
  • Have backward seq2seq training in place.
  • Find the old implementation of mutual information.
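
The "get indices, then fetch logprob" items above can be sketched in plain Python for a single beam search result. This is a rough illustration, not the code from this repository: `log_softmax` and `gather_log_probs` are hypothetical helpers, the logits are made-up numbers, and a real implementation would do the same gather on [batch_size, decoder_length, vocab_size] tensors.

```python
import math
from typing import List


def log_softmax(row: List[float]) -> List[float]:
    """Numerically stable log-softmax over one vocabulary-sized row of logits."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(v - m) for v in row))
    return [v - log_z for v in row]


def gather_log_probs(logits: List[List[float]], token_ids: List[int]) -> List[float]:
    """Pick the log-prob of each predicted token.

    logits:    [decoder_length, vocab_size]
    token_ids: [decoder_length], the indices of one beam search result
    """
    return [log_softmax(step)[tok] for step, tok in zip(logits, token_ids)]


# Made-up logits for a 2-step reply over a 3-word vocabulary.
logits = [[2.0, 0.5, 0.1], [0.3, 0.3, 3.0]]
reply_ids = [0, 2]

step_log_probs = gather_log_probs(logits, reply_ids)
# Log-probability of the whole reply is the sum over decoding steps.
sequence_log_prob = sum(step_log_probs)
```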

MI steps

  • Build the MI model. It is applied at decoding time, when the N-best results are generated and scored with mutual information.
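
One way to picture scoring the N-best results with mutual information is to rerank them with a forward score plus a weighted backward score. The sketch below assumes the bidirectional form log p(T|S) + λ log p(S|T), with the backward term coming from seq2seq_backward. The `rerank_by_mi` name, the λ value, and the log-probabilities are all made up for illustration; the notes do not confirm this is the scoring actually used.

```python
from typing import List, Tuple

# (reply, forward_log_prob, backward_log_prob)
Candidate = Tuple[str, float, float]


def rerank_by_mi(candidates: List[Candidate], lam: float = 0.5) -> List[Tuple[str, float]]:
    """Sort N-best replies by forward_log_prob + lam * backward_log_prob (assumed form)."""
    scored = [(reply, fwd + lam * bwd) for reply, fwd, bwd in candidates]
    # Highest score first.
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up scores: the generic reply is likelier under the forward model,
# but it says little about the source, so its backward score is poor.
n_best: List[Candidate] = [
    ("I don't know.", -1.0, -9.0),
    ("Sounds good.", -2.0, -3.0),
]

ranked = rerank_by_mi(n_best)
best_reply = ranked[0][0]
```

In this toy case the backward term is what moves the more specific reply ahead of the generic one.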