Hello!

I'm running into a reshaping error when using RL and intermediate rewards.

The output of `intermediate_rewards()` is a `# list of max_dec_step * (batch_size, k)` (line 241), and then this is stacked and has shape `(batch_size, k)`, stored in `self.sampling_discounted_rewards`.

But then in `_add_loss_op()`, you iterate k times and append:

```python
for _ in range(self._hps.k):
    self._sampled_rewards.append(self.sampling_discounted_rewards[:, :, _])  # shape (max_enc_steps, batch_size)
```

But the index `[:, :, _]` would run into a dimension error, because the shape of `self.sampling_discounted_rewards` is `(batch_size, k)`.

Am I missing something here? What should be the correct shape/reshaping? Thank you for uploading this code!
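To make the mismatch concrete, here is a minimal NumPy sketch (standalone, not the repo's actual code; the array names and sizes are illustrative). It shows that a 2-D `(batch_size, k)` array rejects the 3-D index `[:, :, _]`, while stacking the per-step reward list along a new leading axis produces `(max_dec_step, batch_size, k)`, for which that index is valid:

```python
import numpy as np

# Illustrative sizes, not taken from the repo's hyperparameters.
max_dec_step, batch_size, k = 5, 4, 3

# intermediate_rewards() is described as returning a list of
# max_dec_step arrays, each of shape (batch_size, k).
rewards = [np.random.rand(batch_size, k) for _ in range(max_dec_step)]

# If the list collapses to a single (batch_size, k) array,
# the 3-D index used in _add_loss_op() raises an error:
collapsed = sum(rewards)            # shape (batch_size, k)
try:
    _ = collapsed[:, :, 0]
except IndexError as err:
    print("IndexError:", err)       # too many indices for a 2-D array

# Stacking along a new leading axis keeps all three dimensions,
# so slicing out one sample per k works as the loss op expects:
stacked = np.stack(rewards)         # shape (max_dec_step, batch_size, k)
sampled = [stacked[:, :, i] for i in range(k)]
print(sampled[0].shape)             # (max_dec_step, batch_size)
```

This suggests the rewards need to be stacked into a 3-D tensor before the `[:, :, _]` slicing, rather than reduced to `(batch_size, k)`.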
Possible solution: change lines 414 and 427 of `attention_decoder.py` from

```python
if FLAGS.use_discounted_rewards:
```

to

```python
if FLAGS.use_discounted_rewards or FLAGS.use_intermediate_rewards:
```