
Possible shaping error on _add_loss_op() in model.py #34

Open
thefirebanks opened this issue Jul 9, 2019 · 1 comment

thefirebanks commented Jul 9, 2019

Hello!

I'm running into a reshaping error when using RL and intermediate rewards.

The output of intermediate_rewards() is a list of max_dec_step tensors, each of shape (batch_size, k) (line 241),

which is then stacked and stored in self.sampling_discounted_rewards with shape (batch_size, k).

But then in _add_loss_op(), you iterate k times and append:

for _ in range(self._hps.k):
    self._sampled_rewards.append(self.sampling_discounted_rewards[:, :, _]) # shape (max_enc_steps, batch_size)

But indexing with [:, :, _] raises a dimension error, because self.sampling_discounted_rewards has shape (batch_size, k), i.e. only two dimensions.

Am I missing something here? What should be the correct shape/reshaping? Thank you for uploading this code!
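
For concreteness, here is a minimal NumPy sketch of the mismatch (the sizes are illustrative, taken from the shapes quoted above, not from the actual model):

import numpy as np

max_dec_steps, batch_size, k = 5, 4, 3

# intermediate_rewards() returns a list of max_dec_steps arrays, each (batch_size, k)
rewards = [np.zeros((batch_size, k)) for _ in range(max_dec_steps)]

stacked = np.stack(rewards)    # shape (max_dec_steps, batch_size, k)
print(stacked[:, :, 0].shape)  # (max_dec_steps, batch_size) -- what _add_loss_op() expects

flat = np.zeros((batch_size, k))  # the 2-D shape sampling_discounted_rewards actually has
flat[:, :, 0]  # IndexError: too many indices for a 2-dimensional array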

thefirebanks (Author) commented:

Possible solution:

Change lines 414 and 427 of attention_decoder.py from

if FLAGS.use_discounted_rewards:

to

if FLAGS.use_discounted_rewards or FLAGS.use_intermediate_rewards:
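
If I'm reading it right, the stack that builds the 3-D (max_dec_step, batch_size, k) tensor sits inside those guards, so with only use_intermediate_rewards set it never runs and sampling_discounted_rewards keeps its 2-D shape. A paraphrased sketch of what the guarded block would then do (the variable name `rewards` is illustrative, not the actual file contents):

# Paraphrased; `rewards` stands in for the per-step reward list,
# i.e. max_dec_step tensors of shape (batch_size, k).
if FLAGS.use_discounted_rewards or FLAGS.use_intermediate_rewards:  # was: use_discounted_rewards only
    rewards = tf.stack(rewards)  # -> (max_dec_step, batch_size, k), as _add_loss_op() expects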
