Skip to content

Play, learn, solve, and analyze No-Limit Texas Hold Em. Implementation follows from Monte Carlo counter-factual regret minimization over with hierarchical K-means imperfect recall abstractions.

License

Notifications You must be signed in to change notification settings

krukah/robopoker

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

robopoker

license build

robopoker is a Rust library to play, learn, analyze, track, and solve No-Limit Texas Hold'em.

Overview

This started as a simple Rust project before evolving into a state-of-the-art poker solver and analysis tool seeking functional parity with Pluribus1, the first superhuman agent in multiplayer No Limit Texas Hold'em.

Training Progress
Monte Carlo Tree Search
Strategy Growth
Equity Distributions

The guiding philosophy of this crate is to use very precise struct and trait abstractions to represent the rules, mechanics, and strategies of NLHE. Every module is modeled as closely as possible to its real-world analogue, while also utilizing clever representations and transformations to be as memory- and compute-efficient as possible. We lean heavily into idiomatic Rust by using lazy functional patterns, efficient data structure representations, infallible type conversions, thread-safe multi-processing, and strictly safe code.

The intended use case is a one-time resource-intensive training run that will save information abstractions, k-means clusters, distance metrics, and blueprint profiles to disk for use in later runs or analyses. To generate these datasets under arbitrary parametrization, the program will iterate through the following steps:

  1. For each layer of hierarchical abstraction (preflop, flop, turn, river):

    • Generate isomorphic hand clusters by exhaustively iterating through strategically equivalent situations
    • Initialize k-means centroids using k-means++ seeding over abstract distribution space2
    • Run hierarchical k-means clustering to group hands into strategically similar situations
    • Calculate Earth Mover's Distance metrics via optimal transport5 between all cluster pairs
    • Save abstraction results and distance metrics to disk
  2. Run iterative Monte Carlo CFR training3:

    • Initialize regret tables and strategy profiles
    • Sample game trajectories using external sampling MCCFR
    • Update regret values and compute counterfactual values
    • Accumulate strategy updates with linear weighting
    • Periodically checkpoint blueprint strategy to disk
    • Continue until convergence criteria met
  3. Perform real-time search during gameplay (in progress):

    • Load pre-computed abstractions and blueprint strategy
    • Use depth-limited subgame solving with blueprint as prior
    • Dynamically build local game trees
    • Run targeted Monte Carlo rollouts
    • Return optimal actions within time constraints

System Requirements

The abstraction and counterfactual regret minimization algorithms are quite resource intensive.

  • Hierarchical k-means requires holding all strategically isomorphic observations at a given street, as well as their projected distributions onto their future streets.
  • Monte Carlo CFR requires sampling game trees with full game state information and accumulating regret and policy information
Street Abstraction Size Metric Size
Preflop 4 KB 301 KB
Flop 32 MB 175 KB
Turn 347 MB 175 KB
River 3.02 GB -

Modules

cards

Core functionality for working with standard playing cards and Texas Hold'em rules:

  • Hand Evaluation: Nanosecond hand strength calculation using lazy evaluation; fastest open-source hand evaluation algorithm; benchmarks outperform the popular Cactus Kev implementation
  • Equity Calculation: Fast equity calculations between ranges of hands, supporting both exact enumeration and Monte Carlo simulation
  • Exhaustive Iteration: Efficient iteration over cards, hands, decks, and private-public observations with lazy bitwise advancing
  • Distribution Analysis: Tools for analyzing equity distributions and range vs range scenarios
  • Short Deck Support: Full support for 36-card short deck variant with adjusted hand rankings and iterators
  • Bijective Representations: Multiple card representations (u8/u16/u32/u64) allow for maximally efficient operations and transformations

gameplay

A complete poker game engine implementation:

  • Standard Rules: Full implementation of No-Limit Texas Hold'em rules and mechanics
  • Complex Showdowns: Elegant handling and thorough testing of showdown edge cases like side pots, all-ins, dead cards, and multi-way ties
  • Flexible Payout Logic: Configurable payout structures for different game formats
  • Decider Abstraction: Generic trait system for implementing different player decision strategies
  • Functional Design: Clean Node/Edge/Tree implementation for game state representation

clustering

Advanced clustering capabilities for poker hand analysis:

  • Isomorphic Exhaustion: Plays out every one of 3.1T possible situations by respecting symmetries and enforcing permutation invariance4
  • Earth Mover's Distance (EMD): Implementation of EMD metric for comparing hand distributions over equity and hierarchical abstraction clusters
  • Hierarchical K-means: Multi-level clustering algorithm for creating strategic abstractions from bottom-up distribution clustering
  • Optimal Transport: High level abstractions for calculating Wasserstein distance between two arbitrary distributions supported over a joint metric space
  • Persistence: Efficient serialization and deserialization of clustering results to/from disk using Postgres binary formats

mccfr

Monte Carlo Counterfactual Regret Minimization solver:

  • Blueprint Convergence: Previously demonstrated convergence on Rock-Paper-Scissors as validation
  • External Sampling: Implementation of external sampling MCCFR variant
  • Dynamic Tree Building: On-the-fly game tree construction for memory efficiency
  • Linear Strategy Weighting: Efficient strategy updates using iterative weighting and discount schemes5
  • Persistence: Efficient serialization and deserialization of blueprint results to/from disk using Postgres binary formats

api

Coming soon. A distributed and scalable single-binary WebSocket-based HTTP server that allows players to play, learn, analyze, and track hands remotely.

References

  1. (2019). Superhuman AI for multiplayer poker. (Science)
  2. (2014). Potential-Aware Imperfect-Recall Abstraction with Earth Mover's Distance in Imperfect-Information Games. (AAAI)
  3. (2007). Regret Minimization in Games with Incomplete Information. (NIPS)
  4. (2013). A Fast and Optimal Hand Isomorphism Algorithm. (AAAI)
  5. (2018). Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration. (NIPS)
  6. (2019). Solving Imperfect-Information Games via Discounted Regret Minimization. (AAAI)
  7. (2013). Action Translation in Extensive-Form Games with Large Action Spaces: Axioms, Paradoxes, and the Pseudo-Harmonic Mapping. (IJCAI)
  8. (2015). Discretization of Continuous Action Spaces in Extensive-Form Games. (AAMAS)
  9. (2015). Regret-Based Pruning in Extensive-Form Games. (NIPS)
  10. (2018). Depth-Limited Solving for Imperfect-Information Games. (NeurIPS)
  11. (2017). Reduced Space and Faster Convergence in Imperfect-Information Games via Pruning. (ICML)
  12. (2017). Safe and Nested Subgame Solving for Imperfect-Information Games. (NIPS)