Add c51 for dqn and dqfd #115

Merged
merged 7 commits into from
Mar 19, 2019
Conversation

Curt-Park
Collaborator

Tested on LunarLander-v2.
Performance is not very good :(

@MrSyee You don't have to review this now. This PR is for code review with external contributors.

@Curt-Park Curt-Park self-assigned this Mar 16, 2019
@Curt-Park Curt-Park requested a review from MrSyee March 16, 2019 07:21

@kkweon kkweon left a comment

LGTM

@@ -252,7 +251,7 @@ def write_log(self, i: int, loss: np.ndarray, score: int):
"""Write log about loss and score"""
print(
"[INFO] episode %d, episode step: %d, total step: %d, total score: %d\n"
- "epsilon: %f, loss: %f, avg q-value: %f at %s\n"
+ "epsilon: %f, loss: %f, avg_q_value: %f at %s\n"

Why no f-string?

Suggested change
- "epsilon: %f, loss: %f, avg_q_value: %f at %s\n"
+ f"epsilon: {i}, loss: {loss[0]}, avg_q_value: {loss[1]} at {now()}"

Collaborator Author

I didn't use an f-string because f-strings are not available in Python versions lower than 3.6.
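
For reference, a minimal sketch of the three formatting styles under discussion. The values and variable names are made up for illustration and are not the ones used in `write_log`.

```python
# Illustrative values only; not taken from the agent's write_log.
epsilon, loss, avg_q_value = 0.1, 0.5, 1.25

# %-formatting: the style used in this PR; works on Python versions before 3.6.
percent_style = "epsilon: %f, loss: %f, avg_q_value: %f" % (
    epsilon,
    loss,
    avg_q_value,
)

# str.format with explicit indices: also works before 3.6.
format_style = "epsilon: {0:f}, loss: {1:f}, avg_q_value: {2:f}".format(
    epsilon, loss, avg_q_value
)

# f-string: a SyntaxError on any interpreter older than Python 3.6.
fstring_style = f"epsilon: {epsilon:f}, loss: {loss:f}, avg_q_value: {avg_q_value:f}"
```

All three produce the same text, so the choice here comes down to which Python versions the project wants to support.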

curr_q_value = q_values.gather(1, actions.long().unsqueeze(1))
next_q_value = next_target_q_values.gather( # Double DQN
1, next_q_values.argmax(1).unsqueeze(1)
batch_size = self.hyper_params["BATCH_SIZE"]

Create a key instead of using str?

Suggested change
- batch_size = self.hyper_params["BATCH_SIZE"]
+ from params.keys import BATCH_SIZE
+ batch_size = self.hyper_params[BATCH_SIZE]
# params/keys.py
BATCH_SIZE = "BATCH_SIZE"

Collaborator Author

I am looking for a way to avoid using strings as keys, something like an enum in C.
It would be better if I didn't have to add a new .py file just to define the keys.
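
One possible direction, sketched with Python's standard `enum` module. The class name `HyperParam`, the `GAMMA` member, and the values are hypothetical; only the `BATCH_SIZE` key appears in this PR. The enum could live in an existing module, so it would not necessarily need a new file.

```python
from enum import Enum


class HyperParam(Enum):
    """Hypothetical key set; only BATCH_SIZE appears in this PR."""

    BATCH_SIZE = "BATCH_SIZE"
    GAMMA = "GAMMA"  # assumed example key, not from the PR


# Made-up values, keyed by enum members instead of raw strings.
hyper_params = {HyperParam.BATCH_SIZE: 32, HyperParam.GAMMA: 0.99}

batch_size = hyper_params[HyperParam.BATCH_SIZE]

# Existing string-based configs can still be mapped onto members by value.
same_key = HyperParam("BATCH_SIZE")
```

With this shape, a misspelled member name raises `AttributeError` at the point of use, and linters or IDE completion have a chance to catch it earlier than a mistyped string key.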

Collaborator Author

I opened an issue: #116

atom_size: int = 51,
v_min: int = -10,
v_max: int = 10,
hidden_activation: Callable = F.relu,

Suggested change
- hidden_activation: Callable = F.relu,
+ hidden_activation: Callable[[torch.Tensor], torch.Tensor] = F.relu,
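
A rough sketch of what the narrower annotation expresses. To keep it runnable without torch, `float` stands in for `torch.Tensor`; `Activation`, `relu`, and `apply_hidden` are hypothetical names, not part of this PR.

```python
from typing import Callable

# float stands in for torch.Tensor so the sketch runs without torch.
Activation = Callable[[float], float]


def relu(x: float) -> float:
    """Scalar stand-in for F.relu."""
    return max(0.0, x)


def apply_hidden(x: float, hidden_activation: Activation = relu) -> float:
    """Hypothetical helper: apply the given activation to one value."""
    return hidden_activation(x)


clipped = apply_hidden(-2.0)  # uses the default activation
swapped = apply_hidden(-2.0, hidden_activation=abs)  # any float -> float callable fits
```

A bare `Callable` accepts any callable, while the parameterized form lets a static type checker flag callables whose arguments or return type do not match.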

Collaborator Author

I will open an issue for it. Thanks.

Collaborator Author

I opened an issue: #117

@medipixel
Owner

This pull request introduces 1 alert when merging 75a4ac7 into 94d9fd8 - view on LGTM.com

new alerts:

  • 1 for Variable defined multiple times

Comment posted by LGTM.com

return dq_loss_element_wise, q_values


def get_dqn_loss(
Collaborator

This method name should be changed; it duplicates an existing name.
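
A small sketch of why a duplicated name matters, and likely what the "Variable defined multiple times" alert refers to. The function bodies are placeholders, not the loss functions from this PR.

```python
def get_dqn_loss():
    """First definition (placeholder body)."""
    return "first"


def get_dqn_loss():  # noqa: F811 (deliberate redefinition for illustration)
    """Second definition with the same name; it silently replaces the first."""
    return "second"


# Only the most recent definition is reachable under this name.
result = get_dqn_loss()
```

Python raises no error here, so callers that expected the first function get the second one; giving each loss function a distinct name avoids this.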

@MrSyee MrSyee merged commit 0269595 into master Mar 19, 2019
@Curt-Park Curt-Park deleted the feature/c51 branch March 19, 2019 10:36