# AlphaMatico 1 This project contains an implementation of multiple agents designed to solve the game [Mathematico](https://github.com/balgot/mathematico). There are two types of agents: * pure Monte Carlo Tree Search * *AlphaZero* adaptation based on two different libraries: * [`open_spiel`](https://github.com/deepmind/open_spiel) * [`mcts`](https://github.com/pbsinclair42/MCTS) (custom AlphaZero implementation uses only the Value Head). ## Requirements This project was build using `Python 3.10` but should be able to support `3.11>Python>=3.9` out-of-the-box. Install the necessary packages: ```bash pip install -r requirements.txt ``` ## Usage This project contains multiple files for different purposes: * to **run the evaluation** and view the **results**, check out `evaluation/` * to see **examples**, and the learning algorithms, **train an agent**, visit `notebooks/` * to view the implementation, head to `src/` See [Documentation](#documentation) below for further exaples. ## Documentation In the last part, we will present the interface of both, `mathematico` and this repository. ### 1. mathematico The studied game, *Mathematico*, is part of another package. To install the game, use: ```bash pip install --quiet 'git+https://github.com/balgot/mathematico.git#egg=mathematico&subdirectory=game' ``` In order to play the game, you need to supply a `Player` instance to the `Mathematico` object, e.g.: ```python from mathematico import Mathematico, RandomPlayer game = Mathematico() player1 = RandomPlayer() game.add_player(player1) game.add_player(RandomPlayer()) # ... add as many players as needed game.play() ``` which returns the achieved scores per specified players in the order, they were added to the game. For other options of installation (`Python < 3.9`), detailed rules explanation and detailed interface options, refer to the [github repository](https://github.com/balgot/mathematico). See also `notebooks/mathematico.ipynb` for examples. ### 2. `open_spiel` Adaptation of `Mathematico` To use `Mathematico` from the [`open_spiel`](https://github.com/deepmind/open_spiel) (`pyspiel`) package, it is sufficient to import one package: ```python import src.agents.ospiel # registers the game automatically import pyspiel game = pyspiel.load_game("mathematico") state = game.new_initial_state() ``` Check `notebooks.mcts_open_spiel.ipynb` for examples on how to use this. ### 3. MCTS Agents There are two types of MCTS agent in this repository: * customised Python implementation, inspired by [mcts](https://github.com/pbsinclair42/MCTS), see the corresponding class for further details: ```python from src.agents.mcts_player import MctsPlayer MAX_TIME = 500 # 500 ms per move MAX_SIMULATIONS = 20 # 20 MCTS rollouts per move custom_mcts_player = MctsPlayer(MAX_TIME, MAX_SIMULATIONS) ``` * `open_spiel` implementations, available as: ```python from src.agents.ospiel import OpenSpielPlayer MAX_SIMULATIONS = 20 # 20 MCTS rollouts per move open_spiel_player = OpenSpielPlayer(MAX_SIMULATIONS) ``` ### 4. Train `open_spiel` AlphaZero Agent To train (customised `open_spiel`) AlphaZero agent, use script at `src/train_azero.py`. This will require authentication for online logging, you can disable this by defining environment variable `WANDB_MODE=offline`. ```bash # to see help python src/train_azero.py --help # train default agent python src/train_azero.py ``` #### 4.1 Loading trained agent Assuming that the previous script saved the configuration and the checkpoints to `PATH/` folder, it is possible to load the trained bot using: ```python from azero import load_trained_bot as _load_azero_bot import json import os PATH = "PATH/" CHECKPOINT = -1 def load_trained_bot(): with open(os.path.join(PATH, "config.json"), "r") as f: cfg = json.load(f) bot, _ = _load_azero_bot(cfg, PATH, CHECKPOINT, is_eval=True) return bot ``` The trained bot is not compatible with `mathematico` interface, it is necessary to wrap it, for example by: ```python from src.agents.ospiel import OpenSpielPlayer MAX_SIMULATIONS = 20 # does not matter here, original (training) value will be used player = OpenSpielPlayer(MAX_SIMULATIONS) player.bot = bot ``` ### 5. Train (Value Network Only) Agent Use notebooks `notebooks/rf-mlp.ipynb` or `notebooks/rf-cnn.ipynb` to train these agents. ### 6. Evaluation of Trained Agents See `evaluation/perft.py` for detailed instructions and `evaluation/perft.sh` for examples on how to use this script. ## License Open source, see `LICENSE` file for further details. ## Authors Samuel Gazda, Michal BarniĊĦin