Factored temporal difference learning in the New Ties environment

Bibliographic Details
Authors: Gyenes Viktor
Bontovics Ákos
Lőrincz András
Corporate Author: Symposium of Young Scientists on Intelligent Systems (2.) (2007) (Budapest)
Document Type: Article
Published: 2008
Series: Acta Cybernetica 18 No. 4
Keywords: Computer science, Cybernetics
Online Access:http://acta.bibl.u-szeged.hu/12840
Description
Abstract: Although reinforcement learning is a popular method for training an agent for decision making based on rewards, well-studied tabular methods are not applicable to large, realistic problems. In this paper, we experiment with a factored version of temporal difference learning, which boils down to a linear function approximation scheme utilising natural features coming from the structure of the task. We conducted experiments in the New Ties environment, a novel platform for multi-agent simulations. We show that learning with a factored representation is effective even in large state spaces; furthermore, it outperforms tabular methods even on smaller problems, in both learning speed and stability, owing to its generalisation capabilities.
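The abstract notes that factored temporal difference learning reduces to linear function approximation over task-derived features. The following is a minimal sketch of that general idea, TD(0) with a linear value function, on a toy deterministic chain; the environment, feature map, step size, and function names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def features(state, n_states):
    """One-hot feature vector; a factored task would instead concatenate
    short per-variable codes, keeping the vector small for large spaces."""
    phi = np.zeros(n_states)
    phi[state] = 1.0
    return phi

def td0_linear(n_states=5, episodes=200, alpha=0.1, gamma=0.9):
    """TD(0) with linear value approximation V(s) = w . phi(s) on a
    deterministic right-moving chain; reward 1 on reaching the end."""
    w = np.zeros(n_states)  # one weight per feature
    for _ in range(episodes):
        s = 0
        while s < n_states - 1:
            s_next = s + 1
            r = 1.0 if s_next == n_states - 1 else 0.0
            phi = features(s, n_states)
            # Terminal state has value 0 by convention.
            v_next = 0.0 if s_next == n_states - 1 else w @ features(s_next, n_states)
            delta = r + gamma * v_next - w @ phi  # TD error
            w += alpha * delta * phi              # gradient-style update
            s = s_next
    return w
```

On this chain the learned weights approach the discounted returns gamma^k of each state, illustrating how the tabular case is recovered when the features are one-hot; the paper's point is that with genuinely factored features the same update generalises across states.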
Extent/Physical Description: 651-668
ISSN:0324-721X