Beam Search 8: Revisit Beam Search Discussion Again


The discussion

I think we can reuse my_seq2seq.py.

Steps again :)

  • 1. Compare seq2seq_model.py (1.1) with seq2seq_model.py (??)
    • done: Find the matching seq2seq_model.py (??)
    • done: Take a diff (human-eyes diff)
    • done: Find the differences
    • done: Apply the diff to seq2seq_model.py (1.1)
  • 2. Compare seq2seq.py at r1.1 with my_seq2seq.py
    • done: Find the matching version
    • done: Take a diff for each function
    • done: Apply the diff to seq2seq.py for 1.1 and rename it to my_seq2seq.py
    • done: Connect seq2seq_r11.py with the current implementation to see if it works. Committed.
    • Connect it with beam search enabled to see if it works (see the sketch after this list)
    • Give credit to the original author.
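
For the "connect it with beam search enabled" step, something like the sketch below is what I have in mind. This is only a sketch: the beam_search and beam_size keyword arguments are assumptions borrowed from the beam-search forks of the translate tutorial, and the other constructor arguments follow the TF 1.1 example; the actual signature in this repo's seq2seq_model.py may differ.

```python
import tensorflow as tf
import seq2seq_model  # this repo's copy of the tutorial model (assumed module name)

buckets = [(5, 10), (10, 15), (20, 25), (40, 50)]  # tutorial-style buckets

with tf.Session() as sess:
    model = seq2seq_model.Seq2SeqModel(
        source_vocab_size=40000,
        target_vocab_size=40000,
        buckets=buckets,
        size=256,                  # matches the [?, 256] projections seen below
        num_layers=2,
        max_gradient_norm=5.0,
        batch_size=64,
        learning_rate=0.5,
        learning_rate_decay_factor=0.99,
        forward_only=True,         # beam search is a decode-time feature
        beam_search=True,          # assumed flag from the beam-search fork
        beam_size=10)              # matches the shape=(10,) tensors below
    sess.run(tf.global_variables_initializer())
```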

ValueError: Shape must be rank 2 but is rank 1 for 'model_with_buckets/sequence_loss/sequence_loss_by_example/sampled_softmax_loss/MatMul_1' (op: 'MatMul') with input shapes: [?], [?,256].
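
The op that fails is the MatMul inside sampled_softmax_loss: it is multiplying a rank-1 inputs tensor by the [?, 256] sampled weight matrix. A minimal sketch that reproduces the same complaint (TF 1.x; the placeholder shapes mirror the ones in the error):

```python
import tensorflow as tf

# inputs arrives as rank 1 (shape [?]) instead of [batch_size, dim]
inputs = tf.placeholder(tf.float32, shape=[None], name="inputs")
sampled_w = tf.placeholder(tf.float32, shape=[None, 256], name="sampled_w")

# Raises: ValueError: Shape must be rank 2 but is rank 1 for 'MatMul' (op: 'MatMul')
# with input shapes: [?], [?,256].
logits = tf.matmul(inputs, sampled_w, transpose_b=True)
```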

  • matmul error between inputs and sampled_w
    • Where is sampled_w coming from?
      • the [?, 256] operand comes from the sampled labels
      • the rank-1 inputs [?] looks wrong, because the docs say "inputs has shape [batch_size, dim]"
    • Then where do the inputs come from?
    • Let's set a breakpoint in sampled_loss_function and look at the stack trace (see the shape notes after this list).
  • beam_search=True
    • Tensor("model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/ArgMax:0", shape=(?,), dtype=int64)
      • the decoder outputs are a list of ten int64 ArgMax tensors: ArgMax:0 with shape=(?,), and ArgMax_1:0 through ArgMax_9:0 each with shape=(10,)
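
For reference when stepping into it: this is the shape contract of the sampled-softmax wrapper in the TF 1.1 tutorial's seq2seq_model.py, paraphrased as a self-contained sketch (variable names and sizes here are illustrative, not a verbatim copy):

```python
import tensorflow as tf

vocab_size, dim, num_samples = 40000, 256, 512  # tutorial-style sizes (assumed)

w_t = tf.get_variable("proj_w_t", [vocab_size, dim])  # transposed output projection
b = tf.get_variable("proj_b", [vocab_size])

def sampled_loss(labels, inputs):
    # labels arrive as [batch_size]; sampled_softmax_loss wants [batch_size, num_true]
    labels = tf.reshape(labels, [-1, 1])
    # inputs must be rank 2: [batch_size, dim]. With beam_search=True the decoder
    # hands over rank-1 int64 ArgMax tensors instead, hence the failing MatMul.
    return tf.nn.sampled_softmax_loss(
        weights=w_t, biases=b, labels=labels, inputs=inputs,
        num_sampled=num_samples, num_classes=vocab_size)
```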

  • beam_search=False
    • Tensor("model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/AttnOutputProjection/BiasAdd:0", shape=(?, 256), dtype=float32)
      • the decoder outputs are a list of ten float32 tensors, AttnOutputProjection/BiasAdd:0 through AttnOutputProjection_9/BiasAdd:0, each with shape=(?, 256)
  • beam_search=True: Tensor("model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/ArgMax_1:0", shape=(10,), dtype=int64)
  • beam_search=False: Tensor("model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/AttnOutputProjection_1/BiasAdd:0", shape=(?, 256), dtype=float32)
  • so beam_attention_decoder is returning something different from attention_decoder: int64 token ids from ArgMax instead of the float32 [?, 256] projected outputs that the loss expects
  • In my_seq2seq, what is the difference between the two loop functions, _extract_beam_search and _extract_argmax_and_embed? (see the sketch below)
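
For comparison, this is roughly what _extract_argmax_and_embed looks like in seq2seq.py at r1.1, simplified to public tf.* ops (the real code uses the internal math_ops/embedding_ops modules). Its loop function only feeds the argmax back in as the next decoder input, so the decoder still collects the float32 projected outputs. _extract_beam_search apparently also pushes its symbol tensors into the outputs list, which would explain the int64 ArgMax tensors above; that reading still needs to be confirmed against my_seq2seq.py.

```python
import tensorflow as tf

def _extract_argmax_and_embed(embedding, output_projection=None,
                              update_embedding=True):
    """Loop function for the non-beam decoder (simplified from r1.1)."""
    def loop_function(prev, _):
        if output_projection is not None:
            # project the cell output ([batch, 256]) up to vocabulary logits
            prev = tf.nn.xw_plus_b(prev, output_projection[0], output_projection[1])
        prev_symbol = tf.argmax(prev, 1)    # int64 ids, used only for feeding back
        emb_prev = tf.nn.embedding_lookup(embedding, prev_symbol)
        if not update_embedding:
            emb_prev = tf.stop_gradient(emb_prev)
        return emb_prev  # the decoder's collected outputs stay float32
    return loop_function
```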