The many faces of optimism: A unifying approach

István Szita, A. Lőrincz

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

35 Citations (Scopus)

Abstract

The exploration-exploitation dilemma has been an intriguing and unsolved problem within the framework of reinforcement learning. "Optimism in the face of uncertainty" and model building play central roles in advanced exploration methods. Here, we integrate several concepts and obtain a fast and simple algorithm. We show that the proposed algorithm finds a near-optimal policy in polynomial time, and give experimental evidence that it is robust and efficient compared to its ascendants.

Original language: English
Title of host publication: Proceedings of the 25th International Conference on Machine Learning
Pages: 1048-1055
Number of pages: 8
Publication status: Published - 2008
Event: 25th International Conference on Machine Learning - Helsinki, Finland
Duration: Jul 5 2008 to Jul 9 2008

Other

Other: 25th International Conference on Machine Learning
Country: Finland
City: Helsinki
Period: 7/5/08 to 7/9/08

Fingerprint

Reinforcement learning
Polynomials
Uncertainty

ASJC Scopus subject areas

  • Artificial Intelligence
  • Human-Computer Interaction
  • Software

Cite this

Szita, I., & Lőrincz, A. (2008). The many faces of optimism: A unifying approach. In Proceedings of the 25th International Conference on Machine Learning (pp. 1048-1055)

The many faces of optimism: A unifying approach. / Szita, István; Lőrincz, A.

Proceedings of the 25th International Conference on Machine Learning. 2008. p. 1048-1055.

Szita, I & Lőrincz, A 2008, The many faces of optimism: A unifying approach. in Proceedings of the 25th International Conference on Machine Learning. pp. 1048-1055, 25th International Conference on Machine Learning, Helsinki, Finland, 7/5/08.
Szita I, Lőrincz A. The many faces of optimism: A unifying approach. In Proceedings of the 25th International Conference on Machine Learning. 2008. p. 1048-1055
Szita, István; Lőrincz, A. / The many faces of optimism: A unifying approach. Proceedings of the 25th International Conference on Machine Learning. 2008. pp. 1048-1055
@inproceedings{708f5140e1514435a149d35d9bb881e2,
title = "The many faces of optimism: A unifying approach",
abstract = "The exploration-exploitation dilemma has been an intriguing and unsolved problem within the framework of reinforcement learning. {"}Optimism in the face of uncertainty{"} and model building play central roles in advanced exploration methods. Here, we integrate several concepts and obtain a fast and simple algorithm. We show that the proposed algorithm finds a near-optimal policy in polynomial time, and give experimental evidence that it is robust and efficient compared to its ascendants.",
author = "Istv{\'a}n Szita and A. Lőrincz",
year = "2008",
language = "English",
isbn = "9781605582054",
pages = "1048--1055",
booktitle = "Proceedings of the 25th International Conference on Machine Learning",
}

TY - GEN
T1 - The many faces of optimism
T2 - A unifying approach
AU - Szita, István
AU - Lőrincz, A.
PY - 2008
Y1 - 2008
N2 - The exploration-exploitation dilemma has been an intriguing and unsolved problem within the framework of reinforcement learning. "Optimism in the face of uncertainty" and model building play central roles in advanced exploration methods. Here, we integrate several concepts and obtain a fast and simple algorithm. We show that the proposed algorithm finds a near-optimal policy in polynomial time, and give experimental evidence that it is robust and efficient compared to its ascendants.
AB - The exploration-exploitation dilemma has been an intriguing and unsolved problem within the framework of reinforcement learning. "Optimism in the face of uncertainty" and model building play central roles in advanced exploration methods. Here, we integrate several concepts and obtain a fast and simple algorithm. We show that the proposed algorithm finds a near-optimal policy in polynomial time, and give experimental evidence that it is robust and efficient compared to its ascendants.
UR - http://www.scopus.com/inward/record.url?scp=56449092664&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=56449092664&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:56449092664
SN - 9781605582054
SP - 1048
EP - 1055
BT - Proceedings of the 25th International Conference on Machine Learning
ER -