Feat: Support latest Jumanji version #1134

Open. Wants to merge 17 commits into base: develop

Contributor

@WiemKhlifi WiemKhlifi commented Nov 11, 2024

What?

Upgrade Jumanji from 0.3.1 to the latest version, 1.0.1, and pin the requirements to the latest upstream releases of Jumanji and Matrax.

How?

  • Change requirements.txt to use the upstream packages instead of a fork.
  • Adapt all the wrappers and systems to use cached specs, as the Jumanji wrappers and envs do.

Extra:

  • Note that this PR will be merged only after the corresponding PR with the Connector env updates has been pushed into Jumanji.
  • Note that because the spec outputs are now cached, some wrappers can't retrieve an attribute that is defined after super().__init__(env). The parent class's self.__getattr__(env, name) falls back to env, so the lookup fails if the wrapper class defines the attribute under a different name than the env does.
    For example:
# Should work:
super().__init__(env)
self.time_limit = self._env.time_limit 

# Shouldn't work:
super().__init__(env)
self.time_limit = self._env.max_episode_length
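
The two cases above can be pictured with a toy sketch. It assumes a base wrapper that computes and caches its spec inside `__init__` and forwards unknown attributes to the wrapped env through `__getattr__`. `ToyEnv`, `ToyEnvRenamed`, `ToyWrapper`, `TimeLimitWrapper` and `RenamedLimitWrapper` are hypothetical stand-ins written for illustration; they are not the actual Jumanji or Mava classes.

```python
from functools import cached_property
from typing import Any, Dict


class ToyEnv:
    """Hypothetical env that exposes the limit as `time_limit`."""

    def __init__(self, time_limit: int) -> None:
        self.time_limit = time_limit


class ToyEnvRenamed:
    """Hypothetical env that exposes the limit as `max_episode_length`."""

    def __init__(self, max_episode_length: int) -> None:
        self.max_episode_length = max_episode_length


class ToyWrapper:
    """Hypothetical base wrapper (assumption: the spec is cached in __init__)."""

    def __init__(self, env: Any) -> None:
        self._env = env
        # Touch the spec so it is computed and cached right here, before any
        # subclass code placed after super().__init__(env) has run.
        _ = self.observation_spec

    def __getattr__(self, name: str) -> Any:
        # Only reached when normal lookup fails: fall back to the wrapped env.
        if name == "_env":
            raise AttributeError(name)
        return getattr(self._env, name)

    @cached_property
    def observation_spec(self) -> Dict[str, int]:
        return {}


class TimeLimitWrapper(ToyWrapper):
    def __init__(self, env: Any) -> None:
        super().__init__(env)
        self.time_limit = self._env.time_limit  # same name as on the env

    @cached_property
    def observation_spec(self) -> Dict[str, int]:
        # During super().__init__ the wrapper has no `time_limit` yet, so this
        # lookup goes through __getattr__ and is answered by the env.
        return {"time_limit": self.time_limit}


class RenamedLimitWrapper(ToyWrapper):
    def __init__(self, env: Any) -> None:
        super().__init__(env)  # raises: the env has no `time_limit` attribute
        self.time_limit = self._env.max_episode_length

    @cached_property
    def observation_spec(self) -> Dict[str, int]:
        return {"time_limit": self.time_limit}


# Works: the env-level name matches the name the spec reads.
ok_wrapper = TimeLimitWrapper(ToyEnv(time_limit=10))

# Fails: the spec is computed before `time_limit` exists on the wrapper,
# and the env only knows the attribute as `max_episode_length`.
try:
    RenamedLimitWrapper(ToyEnvRenamed(max_episode_length=10))
    renamed_failed = False
except AttributeError:
    renamed_failed = True
```

In this toy setup, assigning `self.time_limit = env.max_episode_length` before calling `super().__init__(env)` would avoid the error, since the attribute would already exist when the spec is cached. Whether that ordering is appropriate in the real wrappers depends on how their parent class is initialised.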

Collaborator

@RuanJohn RuanJohn left a comment

Thanks for this @WiemKhlifi! Just a few questions, but it looks mostly good to me.
As a sanity check can you please do a few test runs to just check that the system performance is unaffected?

mava/configs/env/connector.yaml
mava/wrappers/jumanji.py (outdated)
mava/wrappers/jumanji.py
Contributor

@sash-a sash-a left a comment

Thanks Wiem, couple small things, mostly removing the git stuff from requirements.txt where possible

requirements/requirements.txt (outdated)
requirements/requirements.txt (outdated)
requirements/requirements.txt
mava/wrappers/jumanji.py (outdated)
mava/systems/sable/anakin/ff_sable.py
mava/systems/sable/anakin/rec_sable.py
Collaborator

@RuanJohn RuanJohn left a comment

Thanks @WiemKhlifi. Some suggestions from my side.

mava/configs/env/vector-connector.yaml (outdated)
mava/wrappers/jumanji.py (outdated)
# The environment returns a list of individual rewards and these are used as is.
return timestep.replace(observation=modified_observation)
# Whether or not to aggregate the list of individual rewards.
reward = aggregate_rewards(timestep.reward, self.num_agents, self._use_individual_rewards)
Collaborator

I must admit I am not a massive fan of the not here. But I prefer it over having the conditional in the aggregation function. What do you think of just having the config option be aggregate_rewards instead of use_individual_rewards? Then we could change the conditional here to if self._aggregate_rewards.

Suggested change
- reward = aggregate_rewards(timestep.reward, self.num_agents, self._use_individual_rewards)
+ if not self._use_individual_rewards:
+     reward = aggregate_rewards(timestep.reward, self.num_agents)

Contributor

eh either way I prefer this 😄

Contributor Author

I went with the second option: renaming the config option to aggregate_rewards so the not is no longer needed, which is less confusing 😅
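
For readers following the thread, here is a minimal sketch of the option that was chosen: a positive `aggregate_rewards` flag checked by the caller, with the aggregation function itself free of conditionals. It uses plain Python lists rather than arrays, and it assumes that aggregating means summing the individual rewards and sharing the total with every agent. The `RewardStep` class and its `process` method are hypothetical names for illustration, not the wrapper code in this PR.

```python
from typing import List


def aggregate_rewards(reward: List[float], num_agents: int) -> List[float]:
    # Assumption: the team reward is the sum of the individual rewards,
    # repeated once per agent.
    team_reward = sum(reward)
    return [team_reward] * num_agents


class RewardStep:
    """Hypothetical holder for the config flag discussed in the review."""

    def __init__(self, num_agents: int, aggregate: bool) -> None:
        self.num_agents = num_agents
        self._aggregate_rewards = aggregate

    def process(self, reward: List[float]) -> List[float]:
        # The conditional lives at the call site and reads positively,
        # so no `not` is needed.
        if self._aggregate_rewards:
            return aggregate_rewards(reward, self.num_agents)
        # Otherwise the individual rewards are used as they are.
        return list(reward)


shared = RewardStep(num_agents=3, aggregate=True).process([1.0, 2.0, 0.5])
individual = RewardStep(num_agents=3, aggregate=False).process([1.0, 2.0, 0.5])
```

Keeping the flag out of `aggregate_rewards` means the function has a single job, which is the trade-off the reviewers settled on.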

mava/wrappers/jumanji.py (outdated)
mava/wrappers/jumanji.py
mava/wrappers/jumanji.py (outdated)
3 participants