### Abstract

Markov games are a framework that can be used to formalise n-agent reinforcement learning (RL). Littman (Markov games as a framework for multi-agent reinforcement learning, in: Proceedings of the 11th International Conference on Machine Learning (ICML-94), 1994) uses this framework to model two-agent zero-sum problems and, within this context, proposes the minimax-Q algorithm. This paper reviews RL algorithms for two-player zero-sum Markov games and introduces a new, simple, and fast algorithm called QL2. QL2 is compared to several standard algorithms (Q-learning, Minimax and minimax-Q) implemented with the Qash library written in Python. The experiments show that QL2, like minimax-Q, converges empirically to optimal mixed policies, but uses a surprisingly simple and cheap update rule. © 2009 Elsevier B.V. All rights reserved.

Original language | English |
---|---|
Pages (from-to) | 1494-1507 |
Number of pages | 14 |
Journal | Neurocomputing |
Volume | 72 |
Issue number | 7-9 |
DOIs | https://doi.org/10.1016/j.neucom.2008.12.022 |
Publication status | Published - 2008 |
Externally published | Yes |

### Keywords

- Markov games
- Multi-agent
- Q-Learning
- Reinforcement learning
- Two-player zero-sum games

## Cite this

QL2, a simple reinforcement learning scheme for two-player zero-sum Markov games. *Neurocomputing*, *72*(7-9), 1494-1507. https://doi.org/10.1016/j.neucom.2008.12.022