Clarify how to measure the performance of an agent in a 1 v 1 #1
@RemiFabre:

Hi,

First of all, thank you for your work, this is great stuff.

I'd like to compare the performance of 2 agents in a series of 1 v 1 games, for example mc vs fl. I'd expect this would do it:

But it generates this:

What are "groups"? I'd expect only 5 games to be played per player, not 30. And why do the win rates not sum to 100%?

Clearly I did not understand something about the call options; please clarify if there is a way to perform duels.

Best,

Comments
Hello @RemiFabre! To start, something I forgot to add to the readme: this was presented at a workshop, and the corresponding paper might offer some extra explanations: http://id.nii.ac.jp/1001/00207567/

To answer your question directly, the "default rules" expect a 4-player game. As only two players were given, the arena has to create groups of 4, and the mode "All" will create all possible groups (well, except the ones with always the same player), namely the following:

Since games are 4-player, the average win rate should be 25% (so summing the win rates in this context shouldn't give 100%). Lastly, for your last question: to perform duels, the only thing really needed is to set the number of players in the rules to 2, and for that... I don't think I put an option to do it in the command-line interface.
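To picture what "All" enumerates with two players, here is a small illustration; this is my own sketch rather than the library's code, and it assumes seat order matters (the arena may well treat some of these line-ups as equivalent):

```python
# All ways to fill 4 seats with 2 players, excluding the two line-ups
# where the same player occupies every seat.
from itertools import product

players = ["mc", "fl"]
groups = [g for g in product(players, repeat=4) if len(set(g)) > 1]
print(len(groups))  # 2**4 - 2 = 14
for group in groups:
    print(group)
```

Since every game has four seats, a player facing equal opposition wins about one game in four, which is why the reported win rates hover around 25% instead of summing to 100%.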
So, that felt like a weird time jump. I want to do some clean-up now (such as moving to ...). In any case, I added a quick fix to handle duels in the branch arena-improvements (I will add it to the main branch after doing the clean-up). If you want to run duels / set the number of players to two, you can use:

Which, on my computer, gave me:
I mentioned in the above message it was also possible to do the same in Python; here is the code (it should work in both branches):

```python
from ceramic.players import MonteCarloPlayer, RandomPlayer
from ceramic.arena import AllArena, PairsArena
from ceramic.rules import Rules

# Base rules, but with two players instead of the default four
rules = Rules.BASE
rules.player_count = 2

# A random baseline against a Monte-Carlo player with 100 rollouts
players = [RandomPlayer(), MonteCarloPlayer(rollouts=100)]

# PairsArena runs the head-to-head games
arena = PairsArena(rules, players)
arena.count = 10
arena.run()
arena.print()
```

Tell me if this answers your question!
Amazing!
OK, so I read the paper, good work! In the paper you give some figures on the size of the game tree:

If you have more data on this, I'd be interested in reading it. My intuitions are:

Also, here is a repo I worked on with some statistics about the game, using a modest brute-force method:
Printing the number of legal moves in a game between 2 default MC players:

So it's still a tad big :)
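For anyone wanting to reproduce this kind of measurement, the gist is to count the legal moves at every turn of a playout; in the sketch below, `is_over()`, `legal_moves()`, and `play()` are hypothetical stand-in names, not necessarily this package's real API:

```python
import random

def branching_stats(game):
    """Record the number of legal moves at each turn of one random playout."""
    counts = []
    while not game.is_over():            # hypothetical API
        moves = game.legal_moves()       # hypothetical API
        counts.append(len(moves))
        game.play(random.choice(moves))  # hypothetical API
    return counts
```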
It seems you have already answered most of the questions yourself ;), but for my input:

- They outperform the best agent in the paper (Monte-Carlo-based) which, while not that bad, is still less advanced than more recent machine-learning-based ones. Its main advantage is being a good baseline without immediate exploits.
- For a single round in a two-player game (especially the last ones, where there are fewer possibilities), it might be possible. But you showed the numbers: they are still quite big. It might be possible to reach a very good solution with smart pruning, either by hand-crafted heuristics or a machine-learning-based one. If the code is clear enough, I hope you'll be able to do that with this package ;) (there still remains the challenge of having a good heuristic for the end of a round).
- It seems the numbers showed your intuitions were correct! I believe that introducing a lot of hand-crafted heuristics will likely result in a less-than-optimal policy, but it will surely be better than the baselines currently present. I think that yes, it is feasible to get pretty good results this way, although it's hard to tell now how it would compare to humans. If you're still going for minimax with alpha-beta pruning, solving the end of a round according to a heuristic (and the last round perfectly) could give it an edge; a sketch follows below.
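To make the minimax idea concrete, here is a generic, depth-limited alpha-beta sketch with a pluggable heuristic; it is not tied to this package, and the `state` methods used (`is_round_over`, `legal_moves`, `play`) are hypothetical stand-ins for whatever interface the game actually exposes:

```python
def alphabeta(state, depth, alpha, beta, maximizing, heuristic):
    """Depth-limited minimax with alpha-beta pruning.

    `heuristic(state)` scores a position for the maximizing player;
    `state.play(move)` is assumed to return the successor state.
    """
    if depth == 0 or state.is_round_over():
        return heuristic(state)
    if maximizing:
        value = float("-inf")
        for move in state.legal_moves():
            value = max(value, alphabeta(state.play(move), depth - 1,
                                         alpha, beta, False, heuristic))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # beta cut-off: the opponent will avoid this branch
        return value
    else:
        value = float("inf")
        for move in state.legal_moves():
            value = min(value, alphabeta(state.play(move), depth - 1,
                                         alpha, beta, True, heuristic))
            beta = min(beta, value)
            if alpha >= beta:
                break  # alpha cut-off: we will avoid this branch
        return value
```

Solving the last round exactly then amounts to calling this with a depth larger than the number of remaining moves and an exact scoring function as the heuristic.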
If I may ask, what would be your approach to try to make the best agent possible using the tools we have today?
Since we have access to an environment in which to run simulations, I would, in the case of 1 vs 1:

I'm of course not sure how well this would work, but I expect it to be a simple, straightforward way of obtaining a very good agent. In the case of 3 or 4 players, the path is less clear for me. The above strategy could work too, by assuming all opponents behave more or less like the agent (and thus pruning their actions the same way), but if that assumption doesn't hold, I would look into Multi-Agent Reinforcement Learning algorithms.
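Whatever the concrete training steps, an approach like this needs a loop that scores each candidate against a fixed baseline. Here is a minimal sketch reusing the arena API shown earlier in the thread; `train_step` is a hypothetical placeholder, and `RandomPlayer` merely stands in for the agent being improved:

```python
from ceramic.players import MonteCarloPlayer, RandomPlayer
from ceramic.arena import PairsArena
from ceramic.rules import Rules

def train_step(agent):
    """Placeholder: a real implementation would update the agent here."""

rules = Rules.BASE
rules.player_count = 2
baseline = MonteCarloPlayer(rollouts=100)
candidate = RandomPlayer()  # stand-in for the agent being improved

for iteration in range(10):
    train_step(candidate)
    arena = PairsArena(rules, [candidate, baseline])
    arena.count = 100  # more games give a tighter win-rate estimate
    arena.run()
    arena.print()      # track the candidate's win rate across iterations
```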
Very interesting! Thanks for the detailed answer.