logs11: See if RL works with medium model 2
- Run RL multiple times and see if it's stable.
-
CONCLUSION: I'm still not sure if it's working. Also not sure how I can validate it.
- Probably I should go back to basics.
2: Thinking out loud - e.g. hypotheses about the current problem, what to work on next, and how I can verify it
- We saw great results in the past; we may need some luck to get them again.
- Let's run 10 trials and see if we see it again.
- Note that we increase num_steps for the base seq2seq so that we know it has converged.
3: A record of currently ongoing runs along with a short reminder of what question each run is supposed to answer
-
run1: Trial 1
- Looks promising, but there is some future work: 1. seq2seq should be trained longer until it converges. 2. RL should also be trained longer, so that the actual avg_len goes up.
-
run2: Trial 2
- Pretty similar to Trial 1; also looks promising.
-
run3 and 4: Trial 3 & 4
- The avg_reply_len curves are flat toward the end, which means the models got caught in a local optimum (see the metric sketch after this list of runs).
- I wonder if we trained seq2seq too much, so that RL had no way to move out of the local optimum.
-
run 5 and 6: Trial 5 & 6
- Objective: see what happens if we don't train seq2seq as much.
- Didn't converge or produce any good responses.
-
run 7 and 8: Trial 7 & 8
- Mimic run2, but train RL longer to see if the reward goes up.
- Similar to runs 3 & 4, but slightly better.
- Let's train seq2seq less and see how it goes.
-
run 9 and 10: Trial 9 & 10
- Train seq2seq less.
- Didn't converge, and the reward didn't go up.
- I think RL training is slow and didn't have enough time.
- Let's train 4 times longer.
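For reference, the avg_reply_len mentioned above is just the mean token length of the sampled replies. A minimal sketch of how such a metric can be computed (hypothetical helper, not code from this repo):

```python
# Hypothetical helper, not from this repo: mean token length of sampled replies,
# i.e. the quantity tracked as avg_reply_len in the runs above.
def avg_reply_len(replies, eos_token="</s>"):
    """replies: list of token lists; length is counted up to the first EOS."""
    lengths = []
    for tokens in replies:
        if eos_token in tokens:
            tokens = tokens[:tokens.index(eos_token)]
        lengths.append(len(tokens))
    return sum(lengths) / max(len(lengths), 1)

# Example: replies of 3 and 2 tokens -> 2.5
print(avg_reply_len([["そう", "な", "な", "</s>"], ["💩", "!"]]))
```

If the RL objective rewards longer replies, this is the curve that should trend upward once training escapes the local optimum.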
-
What to save: the hparams for comparison, plus the ipynb and related files.
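A minimal sketch of that bookkeeping (the file names and layout here are assumptions, not the repo's actual convention):

```python
import json
import os
import shutil

def save_run_artifacts(run_dir, src_hparams, dst_hparams, notebook="rl_test.ipynb"):
    """Keep the hparams and the notebook next to each run so runs can be compared later.
    run_dir (e.g. the mega.nz directory name) and the notebook name are hypothetical."""
    os.makedirs(run_dir, exist_ok=True)
    with open(os.path.join(run_dir, "hparams.json"), "w") as f:
        json.dump({"src": src_hparams, "dst": dst_hparams}, f, indent=2)
    if os.path.exists(notebook):
        shutil.copy(notebook, run_dir)
```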
4: Results of runs (TensorBoard graphs, any other significant observations), separated by type of run (e.g. by the environment the agent is being trained in)
-
{'machine': 'master', 'batch_size': 64, 'num_units': 512, 'num_layers': 2, 'vocab_size': 5000, 'embedding_size': 256, 'learning_rate': 0.5, 'learning_rate_decay': 0.99, 'use_attention': True, 'encoder_length': 30, 'decoder_length': 30, 'max_gradient_norm': 5.0, 'beam_width': 0, 'num_train_steps': 3120, 'model_path': 'model/tweet_large'} dst {'machine': 'master', 'batch_size': 64, 'num_units': 512, 'num_layers': 2, 'vocab_size': 5000, 'embedding_size': 256, 'learning_rate': 0.1, 'learning_rate_decay': 0.99, 'use_attention': True, 'encoder_length': 30, 'decoder_length': 30, 'max_gradient_norm': 5.0, 'beam_width': 0, 'num_train_steps': 3120, 'model_path': 'model/tweet_large_rl'}
- mega.nz directory: 20180430rl_test_medium7
(TensorBoard graphs: seq2seq, RL)
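In every entry below, the first dict is the seq2seq pretraining config (src) and the second is the RL fine-tuning config (dst); they differ only in learning_rate, num_train_steps, and model_path. A small helper (plain Python, nothing repo-specific) makes those differences easy to spot:

```python
def diff_hparams(src, dst):
    """Return {key: (src_value, dst_value)} for every hparam that differs."""
    return {k: (src.get(k), dst.get(k))
            for k in sorted(set(src) | set(dst))
            if src.get(k) != dst.get(k)}

# Reduced example with the values that change in the medium7 run above:
src = {'learning_rate': 0.5, 'num_train_steps': 3120, 'model_path': 'model/tweet_large'}
dst = {'learning_rate': 0.1, 'num_train_steps': 3120, 'model_path': 'model/tweet_large_rl'}
print(diff_hparams(src, dst))
# {'learning_rate': (0.5, 0.1), 'model_path': ('model/tweet_large', 'model/tweet_large_rl')}
```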
-
src: {'machine': 'slave', 'batch_size': 64, 'num_units': 512, 'num_layers': 2, 'vocab_size': 5000, 'embedding_size': 256, 'learning_rate': 0.5, 'learning_rate_decay': 0.99, 'use_attention': True, 'encoder_length': 30, 'decoder_length': 30, 'max_gradient_norm': 5.0, 'beam_width': 0, 'num_train_steps': 3120, 'model_path': 'model/tweet_large'} dst {'machine': 'slave', 'batch_size': 64, 'num_units': 512, 'num_layers': 2, 'vocab_size': 5000, 'embedding_size': 256, 'learning_rate': 0.1, 'learning_rate_decay': 0.99, 'use_attention': True, 'encoder_length': 30, 'decoder_length': 30, 'max_gradient_norm': 5.0, 'beam_width': 0, 'num_train_steps': 3120, 'model_path': 'model/tweet_large_rl'}
-
mega.nz directory: 20180430rl_test_medium8
(TensorBoard graphs: seq2seq, RL)
-
{'machine': 'client1', 'batch_size': 64, 'num_units': 512, 'num_layers': 2, 'vocab_size': 5000, 'embedding_size': 256, 'learning_rate': 0.5, 'learning_rate_decay': 0.99, 'use_attention': True, 'encoder_length': 30, 'decoder_length': 30, 'max_gradient_norm': 5.0, 'beam_width': 0, 'num_train_steps': 6240, 'model_path': 'model/tweet_large'} dst {'machine': 'client1', 'batch_size': 64, 'num_units': 512, 'num_layers': 2, 'vocab_size': 5000, 'embedding_size': 256, 'learning_rate': 0.1, 'learning_rate_decay': 0.99, 'use_attention': True, 'encoder_length': 30, 'decoder_length': 30, 'max_gradient_norm': 5.0, 'beam_width': 0, 'num_train_steps': 6240, 'model_path': 'model/tweet_large_rl'}
- mega.nz 20180501rl_test_medium9
(TensorBoard graphs: seq2seq, RL)
-
{'machine': 'client2', 'batch_size': 64, 'num_units': 512, 'num_layers': 2, 'vocab_size': 5000, 'embedding_size': 256, 'learning_rate': 0.5, 'learning_rate_decay': 0.99, 'use_attention': True, 'encoder_length': 30, 'decoder_length': 30, 'max_gradient_norm': 5.0, 'beam_width': 0, 'num_train_steps': 6240, 'model_path': 'model/tweet_large'} dst {'machine': 'client2', 'batch_size': 64, 'num_units': 512, 'num_layers': 2, 'vocab_size': 5000, 'embedding_size': 256, 'learning_rate': 0.1, 'learning_rate_decay': 0.99, 'use_attention': True, 'encoder_length': 30, 'decoder_length': 30, 'max_gradient_norm': 5.0, 'beam_width': 0, 'num_train_steps': 6240, 'model_path': 'model/tweet_large_rl'}
- 20180501rl_test_medium10
(TensorBoard graphs: seq2seq, RL)
-
{'machine': 'client1', 'batch_size': 64, 'num_units': 512, 'num_layers': 2, 'vocab_size': 5000, 'embedding_size': 256, 'learning_rate': 0.5, 'learning_rate_decay': 0.99, 'use_attention': True, 'encoder_length': 30, 'decoder_length': 30, 'max_gradient_norm': 5.0, 'beam_width': 0, 'num_train_steps': 30, 'model_path': 'model/tweet_large'} dst {'machine': 'client1', 'batch_size': 64, 'num_units': 512, 'num_layers': 2, 'vocab_size': 5000, 'embedding_size': 256, 'learning_rate': 0.1, 'learning_rate_decay': 0.99, 'use_attention': True, 'encoder_length': 30, 'decoder_length': 30, 'max_gradient_norm': 5.0, 'beam_width': 0, 'num_train_steps': 6240, 'model_path': 'model/tweet_large_rl'}
- It didn't converge at all
-
ばいとおわ! (Just finished my part-time shift!)
[0]💩💩💩
[1]💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩
- 20180501rl_test_medium11
(TensorBoard graphs: RL)
-
{'machine': 'client2', 'batch_size': 64, 'num_units': 512, 'num_layers': 2, 'vocab_size': 5000, 'embedding_size': 256, 'learning_rate': 0.5, 'learning_rate_decay': 0.99, 'use_attention': True, 'encoder_length': 30, 'decoder_length': 30, 'max_gradient_norm': 5.0, 'beam_width': 0, 'num_train_steps': 30, 'model_path': 'model/tweet_large'} dst {'machine': 'client2', 'batch_size': 64, 'num_units': 512, 'num_layers': 2, 'vocab_size': 5000, 'embedding_size': 256, 'learning_rate': 0.1, 'learning_rate_decay': 0.99, 'use_attention': True, 'encoder_length': 30, 'decoder_length': 30, 'max_gradient_norm': 5.0, 'beam_width': 0, 'num_train_steps': 6240, 'model_path': 'model/tweet_large_rl'}
-
It didn't converge at all
-
ばいとおわ!
[0]💩💩💩
[1]💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩
- 20180501rl_test_medium12
(TensorBoard graphs: RL)
-
{'machine': 'client1', 'batch_size': 64, 'num_units': 512, 'num_layers': 2, 'vocab_size': 5000, 'embedding_size': 256, 'learning_rate': 0.5, 'learning_rate_decay': 0.99, 'use_attention': True, 'encoder_length': 30, 'decoder_length': 30, 'max_gradient_norm': 5.0, 'beam_width': 0, 'num_train_steps': 3120, 'model_path': 'model/tweet_large'} dst {'machine': 'client1', 'batch_size': 64, 'num_units': 512, 'num_layers': 2, 'vocab_size': 5000, 'embedding_size': 256, 'learning_rate': 0.1, 'learning_rate_decay': 0.99, 'use_attention': True, 'encoder_length': 30, 'decoder_length': 30, 'max_gradient_norm': 5.0, 'beam_width': 0, 'num_train_steps': 15600, 'model_path': 'model/tweet_large_rl'}
-
今回もよろしくです。 (Looking forward to working with you again.)
-
[0]wwwwww💩ちゃん、もう暇だな。!!
-
[1]wwwwww💩ちゃん、もう暇だな。マスター)これ、、、、、いいいいのな?
- 20180501rl_test_medium13
(TensorBoard graphs: seq2seq, RL)
-
{'machine': 'client2', 'batch_size': 64, 'num_units': 512, 'num_layers': 2, 'vocab_size': 5000, 'embedding_size': 256, 'learning_rate': 0.5, 'learning_rate_decay': 0.99, 'use_attention': True, 'encoder_length': 30, 'decoder_length': 30, 'max_gradient_norm': 5.0, 'beam_width': 0, 'num_train_steps': 3120, 'model_path': 'model/tweet_large'} dst {'machine': 'client2', 'batch_size': 64, 'num_units': 512, 'num_layers': 2, 'vocab_size': 5000, 'embedding_size': 256, 'learning_rate': 0.1, 'learning_rate_decay': 0.99, 'use_attention': True, 'encoder_length': 30, 'decoder_length': 30, 'max_gradient_norm': 5.0, 'beam_width': 0, 'num_train_steps': 15600, 'model_path': 'model/tweet_large_rl'}
-
今回もよろしくです。
-
[0]ほら(・ω・)スッこの今日は💩なんだけど!!
-
[1]ほら(・ω・)スッこの今日は💩なんだけど(^_^;)
- 20180501rl_test_medium14
(TensorBoard graphs: seq2seq, RL)
-
{'machine': 'client1', 'batch_size': 64, 'num_units': 512, 'num_layers': 2, 'vocab_size': 5000, 'embedding_size': 256, 'learning_rate': 0.5, 'learning_rate_decay': 0.99, 'use_attention': True, 'encoder_length': 30, 'decoder_length': 30, 'max_gradient_norm': 5.0, 'beam_width': 0, 'num_train_steps': 1560, 'model_path': 'model/tweet_large'} dst {'machine': 'client1', 'batch_size': 64, 'num_units': 512, 'num_layers': 2, 'vocab_size': 5000, 'embedding_size': 256, 'learning_rate': 0.1, 'learning_rate_decay': 0.99, 'use_attention': True, 'encoder_length': 30, 'decoder_length': 30, 'max_gradient_norm': 5.0, 'beam_width': 0, 'num_train_steps': 6240, 'model_path': 'model/tweet_large_rl'}
-
今回もよろしくです。
[0]💩!
[1]💩!!
- 20180501rl_test_medium15
(TensorBoard graphs: seq2seq, RL)
-
{'machine': 'client2', 'batch_size': 64, 'num_units': 512, 'num_layers': 2, 'vocab_size': 5000, 'embedding_size': 256, 'learning_rate': 0.5, 'learning_rate_decay': 0.99, 'use_attention': True, 'encoder_length': 30, 'decoder_length': 30, 'max_gradient_norm': 5.0, 'beam_width': 0, 'num_train_steps': 1560, 'model_path': 'model/tweet_large'} dst {'machine': 'client2', 'batch_size': 64, 'num_units': 512, 'num_layers': 2, 'vocab_size': 5000, 'embedding_size': 256, 'learning_rate': 0.1, 'learning_rate_decay': 0.99, 'use_attention': True, 'encoder_length': 30, 'decoder_length': 30, 'max_gradient_norm': 5.0, 'beam_width': 0, 'num_train_steps': 6240, 'model_path': 'model/tweet_large_rl'}
-
おはようございます。寒いですね。 (Good morning. It's cold, isn't it?)
[0]そうなな(˙-˙)
[1]そうなな(˘ω˘)
- 20180501rl_test_medium16
(TensorBoard graphs: seq2seq, RL)
-
{'machine': 'client1', 'batch_size': 64, 'num_units': 512, 'num_layers': 2, 'vocab_size': 5000, 'embedding_size': 256, 'learning_rate': 0.5, 'learning_rate_decay': 0.99, 'use_attention': True, 'encoder_length': 30, 'decoder_length': 30, 'max_gradient_norm': 5.0, 'beam_width': 0, 'num_train_steps': 1560, 'model_path': 'model/tweet_large'} dst {'machine': 'client1', 'batch_size': 64, 'num_units': 512, 'num_layers': 2, 'vocab_size': 5000, 'embedding_size': 256, 'learning_rate': 0.1, 'learning_rate_decay': 0.99, 'use_attention': True, 'encoder_length': 30, 'decoder_length': 30, 'max_gradient_norm': 5.0, 'beam_width': 0, 'num_train_steps': 31200, 'model_path': 'model/tweet_large_rl'}
-
"今回もよろしくです。\n", " [0]間違えたwww \n", " [1]間違えた!! \n",
-
20180501rl_test_medium17
(TensorBoard graphs: RL)
-
{'machine': 'client2', 'batch_size': 64, 'num_units': 512, 'num_layers': 2, 'vocab_size': 5000, 'embedding_size': 256, 'learning_rate': 0.5, 'learning_rate_decay': 0.99, 'use_attention': True, 'encoder_length': 30, 'decoder_length': 30, 'max_gradient_norm': 5.0, 'beam_width': 0, 'num_train_steps': 1560, 'model_path': 'model/tweet_large'} dst {'machine': 'client2', 'batch_size': 64, 'num_units': 512, 'num_layers': 2, 'vocab_size': 5000, 'embedding_size': 256, 'learning_rate': 0.1, 'learning_rate_decay': 0.99, 'use_attention': True, 'encoder_length': 30, 'decoder_length': 30, 'max_gradient_norm': 5.0, 'beam_width': 0, 'num_train_steps': 31200, 'model_path': 'model/tweet_large_rl'}
-> "今回もよろしくです。\n", " [0]💩が💩www \n", " [1]💩が💩💩💩💩をしてます‼ \n",
- 20180501rl_test_medium18
(TensorBoard graphs: RL)