
unifying multi-task implementation #106

Closed
jsadler2 opened this issue Jun 2, 2021 · 2 comments · Fixed by #104

Comments

@jsadler2
Collaborator

jsadler2 commented Jun 2, 2021

Currently, the LSTM and GRU implementations of multi-task learning differ from the RGCN implementation:

  • LSTM and GRU -> combine the gradients of the two variables in the train_step function (sketched just below)
  • RGCN -> combines the losses of the two variables; the gradient of the combined loss is then taken under the "TF hood"
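
For reference, here is a rough sketch of the gradient-combining style (hypothetical code, not what is in this repo; the two-column target layout and the plain gradient sum are assumptions):

```python
import tensorflow as tf

class GradientCombiningModel(tf.keras.Model):
    """LSTM/GRU-style multi-task training: take a gradient per target
    variable and combine the gradients inside train_step.
    Minimal sketch only; names and loss form are assumptions."""

    def train_step(self, data):
        x, y = data  # y is assumed to hold one column per target variable
        with tf.GradientTape(persistent=True) as tape:
            y_pred = self(x, training=True)
            loss_a = tf.reduce_mean(tf.square(y[..., 0] - y_pred[..., 0]))
            loss_b = tf.reduce_mean(tf.square(y[..., 1] - y_pred[..., 1]))
        grads_a = tape.gradient(loss_a, self.trainable_variables)
        grads_b = tape.gradient(loss_b, self.trainable_variables)
        del tape
        # Combine the per-variable gradients (a plain sum here; the "fancy"
        # gradient correction mentioned below would modify this step).
        grads = [ga + gb for ga, gb in zip(grads_a, grads_b)]
        self.optimizer.apply_gradients(zip(grads, self.trainable_variables))
        return {"loss": loss_a + loss_b}
```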

I think it makes sense to choose one implementation or the other to simplify the code.

I lean toward the RGCN approach of just adding the losses together. It is simpler, and I think it will be sufficient for our needs. One drawback is that it rules out the fancy gradient-correction approach I was using before, but that didn't actually improve performance, and if we want it back we can always go look at the old code and resurrect it.
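
Something like this (again a hypothetical sketch, not the repo's actual loss function), where the summed loss is handed to the standard Keras training loop:

```python
import tensorflow as tf

def multitask_sum_loss(y_true, y_pred):
    """RGCN-style multi-task training: sum the per-variable losses and let
    the default train_step backpropagate the total. Hypothetical two-column
    target layout, squared-error losses assumed for illustration."""
    loss_a = tf.reduce_mean(tf.square(y_true[..., 0] - y_pred[..., 0]))
    loss_b = tf.reduce_mean(tf.square(y_true[..., 1] - y_pred[..., 1]))
    return loss_a + loss_b

# model.compile(optimizer="adam", loss=multitask_sum_loss)
# model.fit(x, y)  # gradients of the summed loss are taken under the "TF hood"
```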

@aappling-usgs
Member

I'm good with the RGCN approach given the exploration you've already done.

@jzwart
Member

jzwart commented Jun 3, 2021

When adding in the state-updating functionality, I thought the RGCN code was easier to follow. If the gradient correction doesn't improve performance, then I'd vote for the RGCN approach.
