A question about new feature SparseGaussianAdam #1066

Open

rylynchen opened this issue Nov 15, 2024 · 0 comments

@rylynchen
Why does SparseGaussianAdam not support multiple tensors in group["params"], the way torch.optim.Adam does?

```python
class SparseGaussianAdam(torch.optim.Adam):
    def __init__(self, params, lr, eps):
        super().__init__(params=params, lr=lr, eps=eps)

    @torch.no_grad()
    def step(self, visibility, N):
        for group in self.param_groups:
            lr = group["lr"]
            eps = group["eps"]

            assert len(group["params"]) == 1, "more than one tensor in group"
            param = group["params"][0]
            if param.grad is None:
                continue
            ...
```
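To frame the question: the assertion above means every param group handed to SparseGaussianAdam must wrap exactly one tensor. A minimal sketch of that layout, assuming per-attribute tensors each placed in their own group; the parameter names and learning rates here are made up for illustration, not taken from the repository:

```python
import torch

# Hypothetical per-attribute tensors, one param group each, as the assert requires.
xyz = torch.nn.Parameter(torch.zeros(10, 3))
opacity = torch.nn.Parameter(torch.zeros(10, 1))

param_groups = [
    {"params": [xyz], "lr": 1.6e-4, "name": "xyz"},
    {"params": [opacity], "lr": 5e-2, "name": "opacity"},
]

# SparseGaussianAdam here refers to the class quoted above.
optimizer = SparseGaussianAdam(param_groups, lr=0.0, eps=1e-15)
```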
For comparison, torch.optim.Adam's step loops over every tensor in group['params']:

```python
class Adam(Optimizer):
    @_use_grad_for_differentiable
    def step(self, closure=None, *, grad_scaler=None):
        ...
            for p in group['params']:
                if p.grad is not None:
                    params_with_grad.append(p)
                    if p.grad.is_sparse:
                        raise RuntimeError('Adam does not support sparse gradients, please consider SparseAdam instead')
                    grads.append(p.grad)
                    ...
                    state_steps.append(state['step'])
```
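For reference, plain torch.optim.Adam accepts several tensors sharing one group and updates them all in a single step, which is exactly the flexibility the question is about. A small self-contained example:

```python
import torch

# Two tensors in a single param group; torch.optim.Adam updates both in one step().
a = torch.nn.Parameter(torch.randn(4, 3))
b = torch.nn.Parameter(torch.randn(4, 1))

opt = torch.optim.Adam([{"params": [a, b], "lr": 1e-3}], eps=1e-15)
loss = (a.sum() + b.sum()) ** 2
loss.backward()
opt.step()  # both a and b receive Adam updates from the same group
```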