-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add c51 for dqn and dqfd #115
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
algorithms/dqn/agent.py
Outdated
@@ -252,7 +251,7 @@ def write_log(self, i: int, loss: np.ndarray, score: int): | |||
"""Write log about loss and score""" | |||
print( | |||
"[INFO] episode %d, episode step: %d, total step: %d, total score: %d\n" | |||
"epsilon: %f, loss: %f, avg q-value: %f at %s\n" | |||
"epsilon: %f, loss: %f, avg_q_value: %f at %s\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why no f-string?
"epsilon: %f, loss: %f, avg_q_value: %f at %s\n" | |
f"epsilon: {i}, loss: {loss[0]}, avg_q_value: {loss[1]} at {now()}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't use f-string because it is not compatible with the python versions lower than 3.6.
algorithms/dqn/agent.py
Outdated
curr_q_value = q_values.gather(1, actions.long().unsqueeze(1)) | ||
next_q_value = next_target_q_values.gather( # Double DQN | ||
1, next_q_values.argmax(1).unsqueeze(1) | ||
batch_size = self.hyper_params["BATCH_SIZE"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Create a key instead of using str
?
batch_size = self.hyper_params["BATCH_SIZE"] | |
from params.keys import BATCH_SIZE | |
batch_size = self.hyper_params[BATCH_SIZE] |
# params/keys.py
BATCH_SIZE = "BATCH_SIZE"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am looking for any way not to use strings as keys, like enum in c.
It would be better if I don't have to make a new .py to define keys.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made an issue: #116
atom_size: int = 51, | ||
v_min: int = -10, | ||
v_max: int = 10, | ||
hidden_activation: Callable = F.relu, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hidden_activation: Callable = F.relu, | |
hidden_activation: Callable[[torch.Tensor], torch.Tensor] = F.relu, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will open an issue for it. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opened an issue: #117
This pull request introduces 1 alert when merging 75a4ac7 into 94d9fd8 - view on LGTM.com new alerts:
Comment posted by LGTM.com |
algorithms/dqn/utils.py
Outdated
return dq_loss_element_wise, q_values | ||
|
||
|
||
def get_dqn_loss( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should change method name (duplicate name)
Tested on lunarlander-v2.
performance is not so good :(
@MrSyee You don't have to review this now. This PR is for code review with external contributors.