Context tree estimation for not necessarily finite memory processes, via BIC and MDL

I. Csiszár, Zsolt Talata

Research output: Contribution to journal › Article

89 Citations (Scopus)

Abstract

The concept of context tree, usually defined for finite memory processes, is extended to arbitrary stationary ergodic processes (with finite alphabet). These context trees are not necessarily complete, and may be of infinite depth. The familiar Bayesian information criterion (BIC) and minimum description length (MDL) principles are shown to provide strongly consistent estimators of the context tree, via optimization of a criterion for hypothetical context trees of finite depth, allowed to grow with the sample size n as o(log n). Algorithms are provided to compute these estimators in O(n) time, and to compute them on-line for all i ≤ n in o(n log n) time.
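To make the optimized criterion concrete: in its usual form, the BIC of a hypothetical context tree T with |T| leaves is BIC_T(x_1^n) = -log ML_T(x_1^n) + ((|A|-1)|T|/2) log n, where ML_T is the maximized likelihood of the sample under the tree model T, and the estimator minimizes this over candidate trees of bounded depth. The following Python sketch illustrates one simple way such a BIC-style pruning can be carried out over a bounded-depth suffix tree, in the spirit of context tree maximization (CTM). The function name bic_context_tree, the fixed depth bound D, and the naive counting are assumptions of this illustration only; it is not the paper's O(n) algorithm.

import math
from collections import defaultdict

def bic_context_tree(x, alphabet, D):
    """Estimate a context tree from the sample x by BIC-style pruning.

    For every candidate context s of length <= D (the symbols preceding
    a position), count the transition frequencies N(s, a); a context is
    kept as a leaf only if its penalized log-likelihood is at least as
    good as the total over its one-symbol extensions.
    """
    n = len(x)
    counts = defaultdict(lambda: defaultdict(int))
    # N(s, a): number of times context s is followed by symbol a.
    for t in range(D, n):
        for d in range(D + 1):
            s = tuple(x[t - d:t])              # the d symbols preceding x[t]
            counts[s][x[t]] += 1

    def log_ml(s):
        # Maximized log-likelihood at node s: sum_a N(s,a) log(N(s,a)/N(s)).
        cs = counts[s]
        total = sum(cs.values())
        return sum(c * math.log(c / total) for c in cs.values() if c > 0)

    penalty = 0.5 * (len(alphabet) - 1) * math.log(n)   # BIC penalty per leaf

    def prune(s):
        # Return (penalized score, leaf contexts) for the subtree rooted at s.
        keep_score = log_ml(s) - penalty
        if len(s) == D:                        # depth bound reached
            return keep_score, [s]
        child_score, child_leaves = 0.0, []
        for a in alphabet:
            sa = (a,) + s                      # extend the context one step back
            if sum(counts[sa].values()) == 0:
                continue
            sc, leaves = prune(sa)
            child_score += sc
            child_leaves += leaves
        if child_leaves and child_score > keep_score:
            return child_score, child_leaves   # splitting s pays off
        return keep_score, [s]                 # keep s as a leaf

    return prune(())[1]

# Usage on a binary sample; in practice the depth bound may grow as o(log n).
sample = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0] * 20
print(bic_context_tree(sample, alphabet=(0, 1), D=3))

This naive recounting takes O(nD) time and is meant only to show the recursion; the algorithms described in the paper reach O(n) overall, and o(n log n) for the on-line variant.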

Original language: English
Pages (from-to): 1007-1016
Number of pages: 10
Journal: IEEE Transactions on Information Theory
Volume: 52
Issue number: 3
DOI: 10.1109/TIT.2005.864431
Publication status: Published - Mar 2006

Keywords

  • Bayesian information criterion (BIC)
  • Consistent estimation
  • Context tree
  • Context tree maximization (CTM)
  • Infinite memory
  • Minimum description length (MDL)
  • Model selection

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Information Systems

Cite this

Context tree estimation for not necessarily finite memory processes, via BIC and MDL. / Csiszár, I.; Talata, Zsolt.

In: IEEE Transactions on Information Theory, Vol. 52, No. 3, 03.2006, p. 1007-1016.

Research output: Contribution to journal › Article

@article{a41cc92bb13747adaa486d444fc0d1b3,
title = "Context tree estimation for not necessarily finite memory processes, via BIC and MDL",
abstract = "The concept of context tree, usually defined for finite memory processes, is extended to arbitrary stationary ergodic processes (with finite alphabet). These context trees are not necessarily complete, and may be of infinite depth. The familiar Bayesian information criterion (BIC) and minimum description length (MDL) principles are shown to provide strongly consistent estimators of the context tree, via optimization of a criterion for hypothetical context trees of finite depth, allowed to grow with the sample size n as o(log n). Algorithms are provided to compute these estimators in O(n) time, and to compute them on-line for all i ≤ n in o(n log n) time.",
keywords = "Bayesian information criterion (BIC), Consistent estimation, Context tree, Context tree maximization (CTM), Infinite memory, Minimum description length (MDL), Model selection",
author = "I. Csisz{\'a}r and Zsolt Talata",
year = "2006",
month = "3",
doi = "10.1109/TIT.2005.864431",
language = "English",
volume = "52",
pages = "1007--1016",
journal = "IEEE Transactions on Information Theory",
issn = "0018-9448",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "3",

}

TY - JOUR

T1 - Context tree estimation for not necessarily finite memory processes, via BIC and MDL

AU - Csiszár, I.

AU - Talata, Zsolt

PY - 2006/3

Y1 - 2006/3

N2 - The concept of context tree, usually defined for finite memory processes, is extended to arbitrary stationary ergodic processes (with finite alphabet). These context trees are not necessarily complete, and may be of infinite depth. The familiar Bayesian information criterion (BIC) and minimum description length (MDL) principles are shown to provide strongly consistent estimators of the context tree, via optimization of a criterion for hypothetical context trees of finite depth, allowed to grow with the sample size n as o(log n). Algorithms are provided to compute these estimators in O(n) time, and to compute them on-line for all i ≤ n in o(n log n) time.

AB - The concept of context tree, usually defined for finite memory processes, is extended to arbitrary stationary ergodic processes (with finite alphabet). These context trees are not necessarily complete, and may be of infinite depth. The familiar Bayesian information criterion (BIC) and minimum description length (MDL) principles are shown to provide strongly consistent estimators of the context tree, via optimization of a criterion for hypothetical context trees of finite depth, allowed to grow with the sample size n as o(log n). Algorithms are provided to compute these estimators in O(n) time, and to compute them on-line for all i ≤ n in o(n log n) time.

KW - Bayesian information criterion (BIC)

KW - Consistent estimation

KW - Context tree

KW - Context tree maximization (CTM)

KW - Infinite memory

KW - Minimum description length (MDL)

KW - Model selection

UR - http://www.scopus.com/inward/record.url?scp=33744799923&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33744799923&partnerID=8YFLogxK

U2 - 10.1109/TIT.2005.864431

DO - 10.1109/TIT.2005.864431

M3 - Article

AN - SCOPUS:33744799923

VL - 52

SP - 1007

EP - 1016

JO - IEEE Transactions on Information Theory

JF - IEEE Transactions on Information Theory

SN - 0018-9448

IS - 3

ER -