Event-learning and robust policy heuristics

A. Lőrincz, Imre Pólik, István Szita

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

In this paper we introduce a novel reinforcement learning algorithm called event-learning. The algorithm uses events, ordered pairs of two consecutive states. We define the event-value function and derive the corresponding learning rules. Combining our method with a well-known robust control method, the SDS algorithm, we introduce Robust Policy Heuristics (RPH). We show that RPH, a fast-adapting non-Markovian policy, is particularly useful for coarse models of the environment and could be useful for some partially observed systems. RPH may also help to alleviate the 'curse of dimensionality' problem. Event-learning and RPH can be used to separate the time scales of value-function learning and adaptation. We argue that the definition of modules is straightforward for event-learning and that event-learning makes planning feasible in the reinforcement learning (RL) framework. Computer simulations of a rotational inverted pendulum with coarse discretization demonstrate the principle.
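
To make the abstract concrete, the following is a minimal tabular sketch of the event-learning idea in Python. It is built only from the abstract's description, not the paper's exact equations: an event is an ordered pair (s, s') of consecutive states, E[s, s'] is the event-value function, the policy selects a desired next state, and a lower-level controller (the SDS controller in the paper) tries to realize it. Here that controller and the plant are replaced by a hypothetical stochastic stand-in, toy_step; all names and constants (n_states, alpha, gamma, the 0.8 success rate, the goal state) are illustrative assumptions.

import numpy as np

n_states = 16             # assumed discrete (coarsely discretized) state space
alpha, gamma = 0.1, 0.95  # illustrative learning rate and discount factor
epsilon = 0.1             # exploration rate

E = np.zeros((n_states, n_states))  # event-value function E(s, s')

def select_desired_next(s, rng):
    # Epsilon-greedy choice of the *desired* successor state for s.
    if rng.random() < epsilon:
        return int(rng.integers(n_states))
    return int(np.argmax(E[s]))

def toy_step(s, s_desired, rng):
    # Stand-in for controller + environment: the desired transition
    # succeeds most of the time, otherwise the state drifts randomly.
    # (The paper instead uses a robust SDS controller on a pendulum.)
    s_next = s_desired if rng.random() < 0.8 else int(rng.integers(n_states))
    reward = 1.0 if s_next == n_states - 1 else 0.0  # arbitrary goal state
    return s_next, reward

def update(s, s_next, reward):
    # One Q-learning-style backup on the *experienced* event (s, s_next).
    td_target = reward + gamma * np.max(E[s_next])
    E[s, s_next] += alpha * (td_target - E[s, s_next])

rng = np.random.default_rng(0)
s = 0
for _ in range(10_000):
    s_des = select_desired_next(s, rng)
    s_next, r = toy_step(s, s_des, rng)
    update(s, s_next, r)
    s = s_next

Because the learned values attach to desired transitions rather than raw actions, the same table E can be reused while the low-level controller adapts, which is the separation of time scales the abstract refers to.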

Original language: English
Pages (from-to): 319-337
Number of pages: 19
Journal: Cognitive Systems Research
Volume: 4
Issue number: 4
DOI: 10.1016/S1389-0417(03)00014-7
Scopus record: http://www.scopus.com/inward/record.url?scp=2442486448&partnerID=8YFLogxK
Publication status: Published - Dec 2003

Keywords

  • Continuous SDS controller
  • Event-learning
  • Reinforcement learning
  • Robust control

ASJC Scopus subject areas

  • Artificial Intelligence
  • Cognitive Neuroscience
  • Experimental and Cognitive Psychology

Cite this

Lőrincz, A., Pólik, I., & Szita, I. (2003). Event-learning and robust policy heuristics. Cognitive Systems Research, 4(4), 319-337. https://doi.org/10.1016/S1389-0417(03)00014-7

@article{5ea86561397542089cacae7b4a29a4ed,
  title     = "Event-learning and robust policy heuristics",
  author    = "A. L{\H{o}}rincz and Imre P{\'o}lik and Istv{\'a}n Szita",
  journal   = "Cognitive Systems Research",
  issn      = "1389-0417",
  publisher = "Elsevier",
  volume    = "4",
  number    = "4",
  pages     = "319--337",
  month     = dec,
  year      = "2003",
  doi       = "10.1016/S1389-0417(03)00014-7",
  keywords  = "Continuous SDS controller, Event-learning, Reinforcement learning, Robust control",
  language  = "English",
}
