Inquiry for possible performance improvement #162
Comments
No, I haven't read the paper, but I will. It sounds promising.
I don't understand step (5): where is the instant counterfactual value updated by σ^{t+1} used?
Adding it to the end of actionUtility() does improve performance on some public boards, such as 6h6c6d, 7d7h2h...
I believe you need calculation (5) to proceed with the parent node calculations, but I don't understand it very well. That is why I opened an issue instead of coding it myself and opening a pull request.
It seems the payoff needs to be recalculated using the new strategy. I tried it: in some cases, such as the benchmark settings, it converges faster, but in large-scale games it performs worse. Maybe I misunderstood something.
The performance of this repo is already amazing, but I wanted to ask a question.
Have you checked the family of improvements defined in this paper? (https://realworld-sdm.github.io/paper/27.pdf)
It derives existing algorithms like CFR+ or DCFR by computing "instant updates" to the counterfactual value, the regret and the strategy.
I don't know if this would add a lot of complexity to the existing codebase, but it allows, for example, for even faster convergence.
This would make CFR+ converge faster than DCFR without worrying about tuning alpha, beta and gamma.
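For context, here is a minimal Python sketch of the two baseline regret updates mentioned above, CFR+ and DCFR, plus regret matching. It does not implement the paper's "instant updates" and is not taken from this repo; the function names are hypothetical, and the DCFR defaults (alpha=1.5, beta=0) are the commonly quoted values, used here as an assumption.

```python
from typing import List


def regret_matching(regrets: List[float]) -> List[float]:
    """Turn cumulative regrets into a strategy (probability per action)."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No positive regret: fall back to the uniform strategy.
    return [1.0 / len(regrets)] * len(regrets)


def cfr_plus_update(cum_regrets: List[float],
                    instant_regrets: List[float]) -> List[float]:
    """CFR+ style update: add the new regret, then floor at zero."""
    return [max(c + r, 0.0) for c, r in zip(cum_regrets, instant_regrets)]


def dcfr_update(cum_regrets: List[float],
                instant_regrets: List[float],
                t: int,
                alpha: float = 1.5,
                beta: float = 0.0) -> List[float]:
    """DCFR style update (sketch): discount the old cumulative regret,
    then add the new instantaneous regret.

    Positive cumulative regrets are scaled by t^alpha / (t^alpha + 1),
    negative ones by t^beta / (t^beta + 1). Here t is assumed to be the
    index of the iteration that produced cum_regrets; implementations
    differ on the exact indexing and ordering.
    """
    pos_factor = t ** alpha / (t ** alpha + 1)
    neg_factor = t ** beta / (t ** beta + 1)
    updated = []
    for c, r in zip(cum_regrets, instant_regrets):
        factor = pos_factor if c > 0 else neg_factor
        updated.append(c * factor + r)
    return updated
```

The gamma parameter of DCFR weights contributions to the average strategy and is left out of this sketch. The tuning concern raised above is visible in the signature: CFR+ has no free parameters, while DCFR needs alpha, beta and gamma to be chosen.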