Beam Search 8. Revisit Beam Search Discussion Again
Higepon Taro Minowa edited this page Jul 2, 2017
I think we can reuse my_seq2seq.py.
- Look closely at the discussion
- See if it works with Python 3
- Is it easy to port over to mine?
- This is a working seq2seq: https://github.com/tensorflow/tensorflow/blob/r1.1/tensorflow/contrib/legacy_seq2seq/python/ops/seq2seq.py
- And it should work with Python 3.
- 1. Compare seq2seq_model.py (1.1) with seq2seq_model.py (??)
    - done: Find the matching seq2seq_model.py (??)
    - done: Take the diff (by eye)
    - done: Find the differences
    - done: Apply the diff to seq2seq_model.py (1.1)
- 2. Compare seq2seq.py at r1.1 with my_seq2seq.py
    - done: Find the matching version
    - done: Take a diff of each function
    - done: Apply the diff to seq2seq.py for 1.1 and rename it to my_seq2seq.py
- done: Connect seq2seq_r11.py with the current implementation to see if it works. Committed.
- Connect it with beam search enabled to see if it works
- Give credit to the original author.
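The function-by-function diff step above can be sketched with Python's difflib. The file contents here are toy stand-ins (the real contents of seq2seq.py at r1.1 and my_seq2seq.py are not reproduced):

```python
import difflib

# Hypothetical stand-ins for seq2seq.py (r1.1) and my_seq2seq.py,
# just to illustrate the diff-then-apply workflow.
r11 = """def attention_decoder(x):
    pass
""".splitlines(keepends=True)
mine = """def attention_decoder(x):
    return beam_search(x)
""".splitlines(keepends=True)

# Unified diff between the two versions; the "+"/"-" lines are the
# edits that would be carried over to the r1.1 copy by hand.
for line in difflib.unified_diff(r11, mine,
                                 fromfile="seq2seq_r11.py",
                                 tofile="my_seq2seq.py"):
    print(line, end="")
```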
ValueError: Shape must be rank 2 but is rank 1 for 'model_with_buckets/sequence_loss/sequence_loss_by_example/sampled_softmax_loss/MatMul_1' (op: 'MatMul') with input shapes: [?], [?,256].
- This is a matmul error between inputs and sampled_w
    - Where is sampled_w coming from?
    - [?, 256] comes from the sampled labels
    - inputs [?] looks wrong, because the docs say "inputs has shape [batch_size, dim]"
    - Then where do the inputs come from?
    - Set a breakpoint in sampled_loss_function and look at the stack trace.
- beam_search=True
    - Tensor("model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/ArgMax:0", shape=(?,), dtype=int64)
    - <class 'list'>: [<tf.Tensor 'model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/ArgMax:0' shape=(?,) dtype=int64>, <tf.Tensor 'model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/ArgMax_1:0' shape=(10,) dtype=int64>, <tf.Tensor 'model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/ArgMax_2:0' shape=(10,) dtype=int64>, <tf.Tensor 'model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/ArgMax_3:0' shape=(10,) dtype=int64>, <tf.Tensor 'model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/ArgMax_4:0' shape=(10,) dtype=int64>, <tf.Tensor 'model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/ArgMax_5:0' shape=(10,) dtype=int64>, <tf.Tensor 'model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/ArgMax_6:0' shape=(10,) dtype=int64>, <tf.Tensor 'model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/ArgMax_7:0' shape=(10,) dtype=int64>, <tf.Tensor 'model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/ArgMax_8:0' shape=(10,) dtype=int64>, <tf.Tensor 'model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/ArgMax_9:0' shape=(10,) dtype=int64>]
- beam_search=False
    - Tensor("model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/AttnOutputProjection/BiasAdd:0", shape=(?, 256), dtype=float32)
    - <class 'list'>: [<tf.Tensor 'model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/AttnOutputProjection/BiasAdd:0' shape=(?, 256) dtype=float32>, <tf.Tensor 'model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/AttnOutputProjection_1/BiasAdd:0' shape=(?, 256) dtype=float32>, <tf.Tensor 'model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/AttnOutputProjection_2/BiasAdd:0' shape=(?, 256) dtype=float32>, <tf.Tensor 'model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/AttnOutputProjection_3/BiasAdd:0' shape=(?, 256) dtype=float32>, <tf.Tensor 'model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/AttnOutputProjection_4/BiasAdd:0' shape=(?, 256) dtype=float32>, <tf.Tensor 'model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/AttnOutputProjection_5/BiasAdd:0' shape=(?, 256) dtype=float32>, <tf.Tensor 'model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/AttnOutputProjection_6/BiasAdd:0' shape=(?, 256) dtype=float32>, <tf.Tensor 'model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/AttnOutputProjection_7/BiasAdd:0' shape=(?, 256) dtype=float32>, <tf.Tensor 'model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/AttnOutputProjection_8/BiasAdd:0' shape=(?, 256) dtype=float32>, <tf.Tensor 'model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/AttnOutputProjection_9/BiasAdd:0' shape=(?, 256) dtype=float32>]
- beam_search=True: Tensor("model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/ArgMax_1:0", shape=(10,), dtype=int64)
- beam_search=False: Tensor("model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/AttnOutputProjection_1/BiasAdd:0", shape=(?, 256), dtype=float32)
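The dumps suggest the cause of the ValueError: with beam search the decoder outputs are rank-1 int64 ArgMax tensors, while the sampled softmax loss expects rank-2 [batch_size, dim] activations like the BiasAdd outputs. A minimal NumPy sketch of the shape mismatch (assuming the loss multiplies its inputs against the sampled weight matrix; the shapes mirror the error message):

```python
import numpy as np

batch, dim, num_sampled = 10, 256, 64
sampled_w = np.random.randn(num_sampled, dim)  # sampled output weights

# What the loss expects: rank-2 decoder activations, shape [batch_size, dim]
# (like the AttnOutputProjection/BiasAdd tensors, shape (?, 256)).
inputs_ok = np.random.randn(batch, dim)
logits = inputs_ok @ sampled_w.T               # shape (10, 64): works

# What beam search is feeding it: rank-1 ArgMax token ids, shape [batch_size].
# The matmul fails, mirroring "Shape must be rank 2 but is rank 1".
inputs_bad = np.random.randint(0, 100, size=(batch,))
try:
    inputs_bad @ sampled_w.T
except ValueError as e:
    print("matmul failed:", e)
```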
- beam_attention_decoder is returning something different than attention_decoder
- in my_seq2seq, what is difference between the two functions
- _extract_beam_search and _extract_argmax_and_embed
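A sketch of the contrast between the two loop functions, modeled on the legacy_seq2seq _extract_argmax_and_embed (which returns the argmax token's embedding, rank 2) versus a hypothetical beam-search variant that returns the chosen token ids themselves (int64, rank 1), matching the ArgMax tensors in the dump above:

```python
import numpy as np

vocab_size, emb_dim, batch = 50, 8, 4
embedding = np.random.randn(vocab_size, emb_dim)

def extract_argmax_and_embed(embedding):
    """Sketch of what _extract_argmax_and_embed builds: a loop_function
    mapping the previous decoder output (logits over the vocab) to the
    embedding of its argmax token: float, rank 2, shape (batch, dim)."""
    def loop_function(prev_logits):
        prev_symbol = prev_logits.argmax(axis=1)   # token ids, rank 1
        return embedding[prev_symbol]              # embeddings, rank 2
    return loop_function

def extract_beam_search_ids(prev_logits):
    """Hypothetical stand-in for a beam-search loop function that emits
    the selected token ids themselves: int64, rank 1. Feeding these to a
    sampled softmax loss would trigger the rank error above."""
    return prev_logits.argmax(axis=1).astype(np.int64)

prev_logits = np.random.randn(batch, vocab_size)
emb_prev = extract_argmax_and_embed(embedding)(prev_logits)  # (4, 8) float
ids = extract_beam_search_ids(prev_logits)                   # (4,) int64
```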