
OpenSpiel 1.3

Released by @lanctot on 30 May, 18:49

This release adds several new games and algorithms, along with general improvements, bug fixes, and documentation updates.

Support and Process changes

  • Added Python 3.11 support
  • Added Roshambo bot population to wheels
  • Removed Python 3.6 support
  • Upgraded versions of supported extra packages (OR-Tools, abseil, JAX, TensorFlow, PyTorch, etc.)

Games

  • Bach or Stravinsky matrix game
  • Block Dominoes (Python)
  • Crazy Eights
  • Dou Dizhu
  • Liar's Poker (Python)
  • MAEDN (Mensch Ärgere Dich Nicht)
  • Nine Men's Morris
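
All of the new games are played through the standard pyspiel API. As a quick smoke test, the sketch below loads one of them and runs a random playthrough; the short name dou_dizhu is an assumption based on OpenSpiel's usual snake_case registration, so substitute whatever name the game actually registers.

```python
# Minimal smoke test: load one of the new games and play a random episode.
# "dou_dizhu" is an assumed registered short name; any game name works.
import random
import pyspiel

game = pyspiel.load_game("dou_dizhu")
state = game.new_initial_state()
while not state.is_terminal():
    if state.is_chance_node():
        # Sample chance outcomes according to their probabilities.
        actions, probs = zip(*state.chance_outcomes())
        state.apply_action(random.choices(actions, weights=probs)[0])
    else:
        state.apply_action(random.choice(state.legal_actions()))
print(state.returns())
```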

Game Transforms

  • Add noisy utility to leaves game transform
  • Add Zero-sum game transform

Other environments

  • Arcade Learning Environment (ALE)

Algorithms

  • Boltzmann Policy Iteration (for mean-field games)
  • Correlated Q-learning
  • Information Set MCTS (IS-MCTS), Cowling et al. '12 (Python)
  • LOLA and LOLA-DiCE (Foerster, Chen, Al-Shedivat, et al. '18) and Opponent Shaping (JAX)
  • MIP Nash solver (Sandholm, Gilpin, and Conitzer '05)
  • Proximal Policy Optimization (PPO), adapted from CleanRL. Supports the single-agent use case; tested on ALE.
  • Regret-matching (Hart & Mas-Colell '00) for normal-form games and as a PSROv2 meta-solver (see the sketch after this list)
  • Regularized Nash Dynamics (R-NaD), Perolat, De Vylder, et al. '22, Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning
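
As background for the regret-matching entry above, here is a concept-level self-play sketch on Rock-Paper-Scissors. It is illustrative only, not OpenSpiel's implementation: each player plays in proportion to positive cumulative regret, and the averaged strategies approach equilibrium.

```python
# Regret matching (Hart & Mas-Colell '00) in self-play on RPS.
import numpy as np

# Row player's payoffs for Rock, Paper, Scissors (zero-sum game).
payoff = np.array([[0.0, -1.0, 1.0],
                   [1.0, 0.0, -1.0],
                   [-1.0, 1.0, 0.0]])

regrets = np.zeros((2, 3))        # cumulative regrets per player and action
strategy_sums = np.zeros((2, 3))  # running sums, for the average strategy

def regret_matching(r):
    """Play proportionally to positive cumulative regret."""
    positive = np.maximum(r, 0.0)
    total = positive.sum()
    return positive / total if total > 0 else np.full(len(r), 1.0 / len(r))

for _ in range(10000):
    strats = [regret_matching(regrets[p]) for p in range(2)]
    for p in range(2):
        strategy_sums[p] += strats[p]
    action_values = [payoff @ strats[1],     # row player's value per action
                     -(strats[0] @ payoff)]  # column player's (zero-sum)
    for p in range(2):
        u = action_values[p]
        regrets[p] += u - strats[p] @ u      # regret of each pure action

avg = strategy_sums / strategy_sums.sum(axis=1, keepdims=True)
print(np.round(avg, 3))  # both rows approach the uniform (1/3, 1/3, 1/3) mix
```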

Bots

  • Simple heuristic Gin Rummy bot
  • Roshambo bot population (see python/examples/roshambo_bot_population.py)
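
The sketch below condenses how the population might be wired up, following python/examples/roshambo_bot_population.py. The binding names used here (pyspiel.ROSHAMBO_NUM_THROWS, pyspiel.roshambo_bot_names, pyspiel.make_roshambo_bot) are assumptions taken from that script; treat this as illustrative wiring rather than reference usage.

```python
# Illustrative sketch: pit two bots from the Roshambo population against
# each other in repeated RPS. Binding names are assumed from the example
# script, not independently verified.
import numpy as np
import pyspiel
from open_spiel.python.algorithms import evaluate_bots

game = pyspiel.load_game(
    "repeated_game(stage_game=matrix_rps(),"
    f"num_repetitions={pyspiel.ROSHAMBO_NUM_THROWS})")
names = pyspiel.roshambo_bot_names()
bots = [pyspiel.make_roshambo_bot(p, names[p]) for p in range(2)]
returns = evaluate_bots.evaluate_bots(
    game.new_initial_state(), bots, np.random.RandomState(42))
print(dict(zip(names[:2], returns)))
```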

Examples

  • Opponent shaping on iterated matrix games example
  • Roshambo population example
  • Using the Nash bargaining solution for negotiation example
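
For context on the last example: given disagreement utilities (d1, d2), the Nash bargaining solution selects the feasible agreement maximizing the product (u1 - d1)(u2 - d2). A minimal concept sketch over a finite set of candidate agreements (not the example's actual code):

```python
# Nash bargaining solution over a finite outcome set: choose the feasible
# agreement maximizing (u1 - d1) * (u2 - d2), where (d1, d2) are the
# players' disagreement (no-deal) utilities.
def nash_bargaining_solution(outcomes, d1=0.0, d2=0.0):
    feasible = [(u1, u2) for (u1, u2) in outcomes if u1 >= d1 and u2 >= d2]
    return max(feasible, key=lambda o: (o[0] - d1) * (o[1] - d2))

# Three candidate splits of 10 units of value; the symmetric split wins.
print(nash_bargaining_solution([(8, 2), (5, 5), (2, 8)]))  # -> (5, 5)
```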

Improvements and other additions

  • Add Bot::Clone() method for cloning bots
  • Avoid relying on C++ exceptions for playthrough tests
  • Add support for the Agent-vs-Task case in Nash averaging
  • Add scoring variants to the game Oh Hell
  • Add eligibility traces to C++ Q-learning and SARSA (see the sketch after this list)
  • Allow creation of per-player random policies
  • Support simultaneous move games in policy aggregator and exploitability
  • Support UCIBot via pybind11
  • Add single_tensor observer for all games
  • Add used_indices for non-marginal solvers in PSROv2
  • Add Flat Dirichlet random policy sampling
  • Add several options to bargaining game: probabilistic ending, max turns, discounted utilities
  • Add lambda returns support to JAX policy gradient
  • Several improvements to Gambit EFG parser / support
  • Add support for softmax policies in fictitious play
  • Add temperature parameter to fixed point MFG algorithms
  • Add information state tensor to battleship
  • Add option to tabular BR to return maximum entropy BR
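
To illustrate the eligibility-traces item flagged above: accumulating-trace SARSA(λ) bumps the trace of the visited state-action pair, then credits every traced pair with the TD error while decaying all traces by γλ. A tabular Python sketch of the update rule (the release's implementation is in C++, so this is concept-level only):

```python
# One accumulating-trace SARSA(lambda) update on tabular Q-values.
import collections

def sarsa_lambda_update(q, e, s, a, r, s2, a2,
                        alpha=0.1, gamma=0.99, lam=0.9):
    delta = r + gamma * q[(s2, a2)] - q[(s, a)]  # TD error
    e[(s, a)] += 1.0                             # accumulate trace
    for key in list(e):
        q[key] += alpha * delta * e[key]  # credit all traced pairs
        e[key] *= gamma * lam             # decay traces
    return q, e

q = collections.defaultdict(float)  # Q-values
e = collections.defaultdict(float)  # eligibility traces
q, e = sarsa_lambda_update(q, e, "s0", 0, 1.0, "s1", 1)
```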

Fixes

  • Fix UCIBot compilation on Windows
  • Misc fixes to Nash averaging
  • R-NaD: fix MLP torso in final layer
  • Fix Dark Hex observation (max length)
  • Fix max game length in abstracted poker games
  • Fix legal moves in some ACPC (poker) game cases
  • Fix joint policy aggregator
  • Fix non-uniform chance outcome sampling in Deep CFR (TF2 & PyTorch)
  • Fix randomization bug in alpha_zero_torch

Several other miscellaneous fixes and improvements.

Known issues

There are a few known issues that will be fixed in the coming months.

  • Collision between pybind11 and the version used by C++ LibTorch AlphaZero. See #966.
  • PyTorch NFSP convergence issue. See #1008.

Acknowledgments

Thanks to Google DeepMind for continued support of development and maintenance of OpenSpiel.

Thanks to all of our contributors: