Initial commit Torch_PPO_Cleanrl_Atari_Envpool #243

Merged
merged 9 commits into from
Aug 6, 2024

roger-creus
Contributor

Hello!

Currently main.py is just a copy-paste of the original cleanrl script as discussed with Xavier. Should I modify it to use the observer utilities?

Also, I have been able to add the requirements and install them with: milabench install --config dev.yaml --base .

But when I then run: milabench run --config dev.yaml --base . some dependencies cannot be imported, as if they had not been installed. I think I am unable to load a shell with the venv that the first command created.
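
One way to check that suspicion is to ask the interpreter itself where it is running from and which packages it can see. The sketch below is only a diagnostic idea, not part of milabench; the function name `describe_interpreter` and the default module list are assumptions chosen for illustration.

```python
import importlib.util
import sys


def describe_interpreter(modules=("torch", "envpool")):
    """Report which interpreter is running and which modules it can find."""
    return {
        # Path of the Python binary that is executing this script.
        "executable": sys.executable,
        # In a venv, sys.prefix differs from sys.base_prefix.
        # A conda environment is not a venv, so this is False there.
        "in_venv": sys.prefix != sys.base_prefix,
        # find_spec returns None when a top-level module cannot be located.
        "importable": {
            name: importlib.util.find_spec(name) is not None for name in modules
        },
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If the reported executable is not the one under the venv directory that the install step created, the imports are being resolved against a different environment.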

Please let me know how to proceed!

@Delaunay
Collaborator

Delaunay commented Aug 1, 2024

I have made a small PR to your branch here

I was able to run the example with

conda create -n py310 python=3.10
conda activate py310
pip install -e .
cd benchmarks/torch_ppo_atari_envpool/
make install   # equivalent to `milabench install --config dev.yaml --base /opt/milabench --force`
make single   # equivalent to `milabench run --config dev.yaml --base /opt/milabench --select torch_ppo_atari_envpool`

Note: milabench install creates a sentinel file when it finishes, so the next time milabench install is executed the install steps are skipped unless --force is specified.
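
The sentinel-file pattern described in the note can be sketched as follows. This is a generic illustration, not milabench's actual implementation: the function name `install_once` and the file name `.install_done` are hypothetical.

```python
from pathlib import Path
import tempfile


def install_once(base: Path, steps, force: bool = False) -> bool:
    """Run `steps` unless a sentinel file from an earlier run exists.

    Returns True if the steps ran, False if they were skipped.
    """
    sentinel = base / ".install_done"  # hypothetical file name
    if sentinel.exists() and not force:
        return False  # a previous install completed; skip
    base.mkdir(parents=True, exist_ok=True)
    steps()
    sentinel.touch()  # written only after the steps succeed
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp) / "bench"
        print(install_once(base, lambda: None))              # runs
        print(install_once(base, lambda: None))              # skipped
        print(install_once(base, lambda: None, force=True))  # runs again
```

Writing the sentinel only after the steps finish means a failed install is retried on the next call without needing the force flag.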

Result of current run:

make single
milabench run --config dev.yaml --base /opt/milabench --select torch_ppo_atari_envpool
No system config found, using defaults
PerGPU([VoirCommand(PackCommand(pack))])
torch_ppo_atari_envpool.D0 [config.dirs.base] /opt/milabench
torch_ppo_atari_envpool.D0 [config.dirs.venv] /opt/milabench/venv/torch
torch_ppo_atari_envpool.D0 [config.dirs.data] /opt/milabench/data
torch_ppo_atari_envpool.D0 [config.dirs.runs] /opt/milabench/runs
torch_ppo_atari_envpool.D0 [config.dirs.extra] /opt/milabench/extra/torch_ppo_atari_envpool
torch_ppo_atari_envpool.D0 [config.dirs.cache] /opt/milabench/cache
torch_ppo_atari_envpool.D0 [config.group] torch_ppo_atari_envpool
torch_ppo_atari_envpool.D0 [config.install_group] torch
torch_ppo_atari_envpool.D0 [config.install_variant] cpu
torch_ppo_atari_envpool.D0 [config.run_name] lajijila.2024-08-01_12:19:09.256554
torch_ppo_atari_envpool.D0 [config.enabled] True
torch_ppo_atari_envpool.D0 [config.capabilities.nodes] 1
torch_ppo_atari_envpool.D0 [config.definition] /home/newton/work/milabench_dev/milabench/benchmarks/torch_ppo_atari_envpool
torch_ppo_atari_envpool.D0 [config.install-variant] unpinned
torch_ppo_atari_envpool.D0 [config.plan.method] per_gpu
torch_ppo_atari_envpool.D0 [config.config_base] /home/newton/work/milabench_dev/milabench/benchmarks/torch_ppo_atari_envpool
torch_ppo_atari_envpool.D0 [config.config_file] 
/home/newton/work/milabench_dev/milabench/benchmarks/torch_ppo_atari_envpool/dev.yaml
torch_ppo_atari_envpool.D0 [config.tags] []
torch_ppo_atari_envpool.D0 [config.name] torch_ppo_atari_envpool
torch_ppo_atari_envpool.D0 [config.tag] ['torch_ppo_atari_envpool', 'D0']
torch_ppo_atari_envpool.D0 [config.device] 0
torch_ppo_atari_envpool.D0 [config.devices] [0]
torch_ppo_atari_envpool.D0 [config.env.CPU_VISIBLE_DEVICE] 0
torch_ppo_atari_envpool.D0 [start] /opt/milabench/venv/torch/bin/voir 
/home/newton/work/milabench_dev/milabench/benchmarks/torch_ppo_atari_envpool/main.py [at 2024-08-01 12:19:12.407467]
torch_ppo_atari_envpool.D0 [stderr] /opt/milabench/venv/torch/lib/python3.10/site-packages/tyro/_fields.py:330: UserWarning:
The field wandb_entity is annotated with type <class 'str'>, but the default value None has type <class 'NoneType'>. We'll 
try to handle this gracefully, but it may cause unexpected behavior.
torch_ppo_atari_envpool.D0 [stderr]   warnings.warn(
torch_ppo_atari_envpool.D0 [stderr] /opt/milabench/venv/torch/lib/python3.10/site-packages/tyro/_fields.py:330: UserWarning:
The field target_kl is annotated with type <class 'float'>, but the default value None has type <class 'NoneType'>. We'll 
try to handle this gracefully, but it may cause unexpected behavior.
torch_ppo_atari_envpool.D0 [stderr]   warnings.warn(
torch_ppo_atari_envpool.D0 [stderr] 
/home/newton/work/milabench_dev/milabench/benchmarks/torch_ppo_atari_envpool/main.py:214: UserWarning: Creating a tensor 
from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with 
numpy.array() before converting to a tensor. (Triggered internally at ../torch/csrc/utils/tensor_new.cpp:278.)
torch_ppo_atari_envpool.D0 [stderr]   next_obs = torch.Tensor(envs.reset()).to(device)
torch_ppo_atari_envpool.D0 [stderr] Traceback (most recent call last):
torch_ppo_atari_envpool.D0 [stderr]   File "/opt/milabench/venv/torch/bin/voir", line 8, in <module>
torch_ppo_atari_envpool.D0 [stderr]     sys.exit(main())
torch_ppo_atari_envpool.D0 [stderr]   File "/opt/milabench/venv/torch/lib/python3.10/site-packages/voir/cli.py", line 128, 
in main
torch_ppo_atari_envpool.D0 [stderr]     ov(sys.argv[1:] if argv is None else argv)
torch_ppo_atari_envpool.D0 [stderr]   File "/opt/milabench/venv/torch/lib/python3.10/site-packages/voir/phase.py", line 331,
in __call__
torch_ppo_atari_envpool.D0 [stderr]     self._run(*args, **kwargs)
torch_ppo_atari_envpool.D0 [stderr]   File "/opt/milabench/venv/torch/lib/python3.10/site-packages/voir/overseer.py", line 
242, in _run
torch_ppo_atari_envpool.D0 [stderr]     set_value(func())
torch_ppo_atari_envpool.D0 [stderr]   File "/opt/milabench/venv/torch/lib/python3.10/site-packages/voir/scriptutils.py", 
line 37, in <lambda>
torch_ppo_atari_envpool.D0 [stderr]     return lambda: exec(mainsection, glb, glb)
torch_ppo_atari_envpool.D0 [stderr]   File 
"/home/newton/work/milabench_dev/milabench/benchmarks/torch_ppo_atari_envpool/main.py", line 214, in <module>
torch_ppo_atari_envpool.D0 [stderr]     next_obs = torch.Tensor(envs.reset()).to(device)
torch_ppo_atari_envpool.D0 [stderr] ValueError: expected sequence of length 128 at dim 1 (got 6)
torch_ppo_atari_envpool.D0 ValueError: expected sequence of length 128 at dim 1 (got 6)
torch_ppo_atari_envpool.D0 [data] {'gpudata': {}, 'task': 'main'}
torch_ppo_atari_envpool.D0 [end (1)] /opt/milabench/venv/torch/bin/voir 
/home/newton/work/milabench_dev/milabench/benchmarks/torch_ppo_atari_envpool/main.py [at 2024-08-01 12:19:22.919134]
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ GLOBAL                                      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ (1/1)                               │
│ torch_ppo_atari_envpool.D0   main gpudata   {}                                                                           │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
torch_ppo_atari_envpool.D0
==========================
  * no training rate retrieved
  * Error codes = 1, 1
  * 1 exceptions found
    * 1 x ValueError: expected sequence of length 128 at dim 1 (got 6)
        | Traceback (most recent call last):
        |   File "/opt/milabench/venv/torch/bin/voir", line 8, in <module>
        |     sys.exit(main())
        |   File "/opt/milabench/venv/torch/lib/python3.10/site-packages/voir/cli.py", line 128, in main
        |     ov(sys.argv[1:] if argv is None else argv)
        |   File "/opt/milabench/venv/torch/lib/python3.10/site-packages/voir/phase.py", line 331, in __call__
        |     self._run(*args, **kwargs)
        |   File "/opt/milabench/venv/torch/lib/python3.10/site-packages/voir/overseer.py", line 242, in _run
        |     set_value(func())
        |   File "/opt/milabench/venv/torch/lib/python3.10/site-packages/voir/scriptutils.py", line 37, in <lambda>
        |     return lambda: exec(mainsection, glb, glb)
        |   File "/home/newton/work/milabench_dev/milabench/benchmarks/torch_ppo_atari_envpool/main.py", line 214, in <module>
        |     next_obs = torch.Tensor(envs.reset()).to(device)
        | ValueError: expected sequence of length 128 at dim 1 (got 6)

[DONE] Reports directory: /opt/milabench/runs/lajijila.2024-08-01_12:19:09.256554
Source: /opt/milabench/runs/lajijila.2024-08-01_12:19:09.256554
=================
Benchmark results
=================
bench                          | fail |   n |       perf |   sem% |   std% | peak_memory |      score | weight
torch_ppo_atari_envpool        |    1 |   1 |        nan |   nan% |   nan% |         nan |        nan |   0.00

Scores
------
/home/newton/work/milabench_dev/milabench/milabench/report.py:322: RuntimeWarning: invalid value encountered in scalar divide
  logscore = np.sum(np.log(perf) * weights) / weight_total
Failure rate:     100.00% (FAIL)
Score:                nan

Errors
------
1 errors, details in HTML report.
make: *** [Makefile:25: single] Error 1
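
The message `expected sequence of length 128 at dim 1 (got 6)` is what torch.Tensor reports when the items of a nested sequence have different lengths. One possible explanation, which this thread does not confirm, is that envs.reset() returned an (obs, info) pair instead of the observations alone, so the two items had lengths 128 and 6. A minimal sketch of a defensive unpacking helper under that assumption follows; the name `split_reset` is hypothetical and is not part of cleanrl, envpool, or milabench.

```python
def split_reset(result):
    """Return (obs, info) whether reset() gave obs alone or an (obs, info) pair."""
    is_pair = (
        isinstance(result, tuple)
        and len(result) == 2
        and isinstance(result[1], dict)
    )
    if is_pair:
        obs, info = result
        return obs, info
    # Older-style API (assumed): reset() returns only the observations.
    return result, {}
```

With such a helper, the failing line could be written as `obs, _ = split_reset(envs.reset())` followed by `next_obs = torch.Tensor(obs).to(device)`. Whether this matches the fix applied in the PR is not stated here.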

@Delaunay
Collaborator

Delaunay commented Aug 2, 2024

I added the instrumentation for it in my PR.

I also updated the argument so that the number of envs scales with the number of CPUs available per GPU.

=================
Benchmark results
=================
bench                          | fail |   n |       perf |   sem% |   std% | peak_memory |      score | weight
torch_ppo_atari_envpool        |    0 |   8 |    2423.74 |   0.7% |  10.5% |        1736 |   19688.23 |   0.00
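
The exact rule used to scale the number of envs is not shown in this thread. A minimal sketch of one plausible rule, one env per CPU with the CPUs split evenly across GPUs, could look like this; the function name `envs_per_gpu` and its parameters are assumptions for illustration only.

```python
import os


def envs_per_gpu(cpu_count: int, gpu_count: int, minimum: int = 1) -> int:
    """Split the available CPUs evenly across GPUs, one env per CPU."""
    if gpu_count < 1:
        gpu_count = 1  # CPU-only run: treat it as a single worker
    # Integer division drops leftover CPUs; never go below `minimum`.
    return max(minimum, cpu_count // gpu_count)


if __name__ == "__main__":
    cpus = os.cpu_count() or 1  # os.cpu_count() may return None
    print(envs_per_gpu(cpus, gpu_count=8))
```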

@Delaunay Delaunay changed the base branch from master to staging August 6, 2024 11:47
@Delaunay Delaunay merged commit eed157a into mila-iqia:staging Aug 6, 2024