QL2, a simple reinforcement learning scheme for two-player zero-sum Markov games

Benoît Frénay, Marco Saerens

Research output: Contribution to journal › Article › Peer-reviewed

Abstract

Markov games are a framework that can be used to formalise n-agent reinforcement learning (RL). Littman (Markov games as a framework for multi-agent reinforcement learning, in: Proceedings of the 11th International Conference on Machine Learning (ICML-94), 1994) uses this framework to model two-agent zero-sum problems and, within this context, proposes the minimax-Q algorithm. This paper reviews RL algorithms for two-player zero-sum Markov games and introduces a new, simple, fast algorithm called QL2. QL2 is compared to several standard algorithms (Q-learning, Minimax and minimax-Q) implemented with the Qash library written in Python. The experiments show that QL2 converges empirically to optimal mixed policies, like minimax-Q, but uses a surprisingly simple and cheap updating rule. © 2009 Elsevier B.V. All rights reserved.
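The abstract does not reproduce QL2's updating rule. For context, the sketch below illustrates the minimax-Q baseline from Littman (1994) that the paper compares against: the Q-update Q(s,a,o) ← (1-α)Q(s,a,o) + α(r + γV(s')), where the state value V(s) = max_π min_o Σ_a π(a)Q(s,a,o) is obtained by solving a small linear program. This is a minimal illustration, assuming tabular Q-values stored as NumPy arrays and scipy.optimize.linprog for the stage-game LP; the function names minimax_value and minimax_q_update are illustrative, not taken from the paper or the Qash library.

import numpy as np
from scipy.optimize import linprog

def minimax_value(Q_s):
    """Solve V(s) = max_pi min_o sum_a pi(a) * Q_s[a, o] by linear programming.

    Q_s is an (n_actions, n_opponent_actions) payoff matrix for one state.
    Returns the game value and the maximizing mixed policy pi.
    """
    n_a, n_o = Q_s.shape
    # Variables: [pi_1 .. pi_{n_a}, v]; linprog minimizes, so objective is -v.
    c = np.zeros(n_a + 1)
    c[-1] = -1.0
    # For every opponent action o: v - sum_a pi(a) * Q_s[a, o] <= 0.
    A_ub = np.hstack([-Q_s.T, np.ones((n_o, 1))])
    b_ub = np.zeros(n_o)
    # The policy probabilities must sum to one (v has coefficient 0 here).
    A_eq = np.ones((1, n_a + 1))
    A_eq[0, -1] = 0.0
    b_eq = np.array([1.0])
    bounds = [(0.0, 1.0)] * n_a + [(None, None)]  # v is unbounded
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[-1], res.x[:n_a]

def minimax_q_update(Q, s, a, o, r, s_next, alpha=0.1, gamma=0.9):
    """One minimax-Q step: Q(s,a,o) <- (1-alpha)*Q(s,a,o) + alpha*(r + gamma*V(s'))."""
    v_next, _ = minimax_value(Q[s_next])
    Q[s][a, o] = (1 - alpha) * Q[s][a, o] + alpha * (r + gamma * v_next)

Note that minimax-Q must solve this linear program at every update, which dominates its per-step cost; the cheaper updating rule is precisely what the abstract highlights as QL2's advantage.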

Original language: English
Pages (from-to): 1494-1507
Number of pages: 14
Journal: Neurocomputing
Volume: 72
Issue number: 7-9
DOIs:
Publication status: Published - 2008
Externally published: Yes
