Reproduce the results in the paper #1

Open
ByunghyunYoo opened this issue Mar 8, 2023 · 3 comments

Comments

@ByunghyunYoo

I'm trying to reproduce your results, but the win rate does not increase in the Corridor scenario.
(I assume "dfop" in the code refers to MACPF.)
I haven't changed your code at all. In the version you uploaded, Alpha and Alpha_{i} are fixed to 0.001, which appears to match the Corridor setting described in the paper.
Is there an additional parameter I need to set for the Corridor scenario?
Attached are the Corridor results I reproduced and the config file for MACPF (dfop in the code).
[Attached image: MACPF_res]

@RetiaAdolf
Collaborator

Hi,
Sorry for the late reply; I don't check this repo very often.
Can you provide some details about the software environment you used for the reproduction, such as the torch and CUDA versions?
For the results in the paper, I used torch 1.7.1+cu110.
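
For anyone gathering the details requested above, here is a minimal sketch of a script that prints the Python, torch, and CUDA versions for pasting into a reply. The function name `collect_env_report` is hypothetical and not part of this repo; it only reads standard PyTorch attributes and falls back gracefully if torch is not installed.

```python
import importlib
import platform
import sys


def collect_env_report():
    """Return a dict of version details useful in a reproduction report.

    Hypothetical helper for illustration; not part of the repository.
    """
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # torch is missing in this interpreter; report that instead of failing.
        report["torch"] = "not installed"
        return report
    report["torch"] = torch.__version__        # e.g. "1.7.1+cu110"
    report["cuda_build"] = torch.version.cuda  # CUDA version torch was built with (None on CPU builds)
    report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in collect_env_report().items():
        print(f"{key}: {value}")
```

Running it inside the same environment used for training should give values directly comparable to the torch 1.7.1+cu110 setup mentioned above.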

@RetiaAdolf
Collaborator

I tried this code again on another machine with torch 1.11.0+cu113, and it works fine there (at least, it achieves a non-zero win rate for most seeds). So I don't think it is that picky about the software environment, and I need more details to figure out what the problem is.

Another potential problem is the SC2 version. I used SC2 4.10 for the paper, so if you are using 4.6, the performance may vary a lot.
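
To check which SC2 build is installed, a rough sketch like the one below may help. It assumes the usual pysc2/SMAC convention on Linux, where the game lives under the `SC2PATH` environment variable (defaulting to `~/StarCraftII`) and each installed build appears as a `Base<build number>` folder inside `Versions`. The helper name `installed_sc2_builds` is hypothetical, and mapping a build number back to a release such as 4.10 or 4.6 is left to the reader.

```python
import os
from pathlib import Path


def installed_sc2_builds(sc2_root=None):
    """List the Base<build> folder names found under <SC2 root>/Versions.

    Hypothetical helper; assumes the standard StarCraft II directory layout.
    """
    root = Path(sc2_root or os.environ.get("SC2PATH", "~/StarCraftII")).expanduser()
    versions_dir = root / "Versions"
    if not versions_dir.is_dir():
        # No install found at this location.
        return []
    return sorted(
        entry.name
        for entry in versions_dir.iterdir()
        if entry.is_dir() and entry.name.startswith("Base")
    )


if __name__ == "__main__":
    builds = installed_sc2_builds()
    print("Installed SC2 builds:", builds or "none found")
```

Including this list alongside the torch and CUDA versions makes it easier to rule out a game-version mismatch.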

@giangbang

Hi @RetiaAdolf

I'm facing a similar issue replicating the results in the paper. I ran the experiment on 8m_vs_9m using the default configuration, but the performance lags well behind QMIX. After 500k steps, the win rate is only around 0.1 (10%), whereas the paper reports over 80% at this point. Could you share the complete configuration files used in the paper?

Additionally, I'm curious why the code does not support parallel running, given that the number of parallel threads is also an important hyperparameter that can significantly affect performance (see qmix_high_sample_efficiency in pymarl2, where the number of training threads is set to 4, lower than in normal QMIX). Training also runs much faster with parallel threads.
