Factored temporal difference learning in the New Ties environment



Bibliographic Details
Main Authors: Gyenes Viktor
Bontovics Ákos
Lőrincz András
Corporate Author: Symposium of Young Scientists on Intelligent Systems (2.) (2007) (Budapest)
Format: Article
Published: 2008
Series: Acta Cybernetica 18 No. 4
Keywords: Computer Science, Cybernetics
Online Access: http://acta.bibl.u-szeged.hu/12840
Description
Summary: Although reinforcement learning is a popular method for training an agent to make decisions based on rewards, well-studied tabular methods are not applicable to large, realistic problems. In this paper, we experiment with a factored version of temporal difference learning, which boils down to a linear function approximation scheme utilising natural features coming from the structure of the task. We conducted experiments in the New Ties environment, a novel platform for multi-agent simulations. We show that learning with a factored representation is effective even in large state spaces; furthermore, thanks to its generalisation capabilities, it outperforms tabular methods even on smaller problems, in both learning speed and stability.
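The core idea named in the abstract, temporal difference learning with linear function approximation over structured (factored) features, can be illustrated by the standard semi-gradient TD(0) update. The sketch below is a minimal illustration of that general technique, not the authors' specific method; the feature vectors, step size, and discount factor are all assumed for the example.

```python
import numpy as np

def td0_linear_update(w, x, r, x_next, alpha=0.1, gamma=0.9):
    """One semi-gradient TD(0) step with a linear value estimate V(s) = w . x.

    w       : current weight vector
    x       : feature vector of the current state
    r       : reward observed on the transition
    x_next  : feature vector of the next state
    """
    # TD error: bootstrapped target minus current estimate.
    delta = r + gamma * np.dot(w, x_next) - np.dot(w, x)
    # Gradient of the linear value function w.r.t. w is just x.
    return w + alpha * delta * x

# Toy factored representation: a state is a binary feature vector,
# so the update generalises across all states sharing active features.
w = np.zeros(4)
x = np.array([1.0, 0.0, 1.0, 0.0])
x_next = np.array([0.0, 1.0, 0.0, 1.0])
w = td0_linear_update(w, x, r=1.0, x_next=x_next)
# Credit for the reward is spread over the features active in x.
```

Because the value estimate is a sum over feature weights rather than a per-state table entry, states that share features share learning, which is the generalisation effect the summary credits for the speed and stability gains over tabular methods.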
Physical Description: 651-668
ISSN:0324-721X