Fast parallel estimation of high dimensional information theoretical quantities with low dimensional random projection ensembles

Zoltan Szabo, A. Lőrincz

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

The estimation of relevant information theoretical quantities, such as entropy, mutual information, and various divergences, is computationally expensive in high dimensions. However, for this task, one may apply pairwise Euclidean distances of sample points, which suits random projection (RP) based low dimensional embeddings. The Johnson-Lindenstrauss (JL) lemma gives a theoretical bound on the dimension of the low dimensional embedding. We adapt the RP technique for the estimation of information theoretical quantities. Intriguingly, we find that embeddings into extremely small dimensions, far below the bounds of the JL lemma, provide satisfactory estimates for the original task. We illustrate this in the Independent Subspace Analysis (ISA) task, where we combine RP dimension reduction with a simple ensemble method. We gain considerable speed-up with the potential of real-time parallel estimation of high dimensional information theoretical quantities.
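The interplay between random projections, pairwise distances, and ensembles described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the Gaussian projection matrix, and the choice to average squared distances over ensemble members are all assumptions made for illustration (the paper applies the ensemble to the ISA task itself). The function `jl_min_dim` uses one commonly quoted form of the JL bound, k >= 4 ln(n) / (eps^2/2 - eps^3/3), to show how large the lemma's dimension is compared with the very small target dimensions the abstract refers to.

```python
import math
import random


def jl_min_dim(n_samples, eps):
    """One commonly quoted JL bound on the embedding dimension (an assumption here)."""
    return math.ceil(4.0 * math.log(n_samples) / (eps ** 2 / 2.0 - eps ** 3 / 3.0))


def random_projection_matrix(d_high, d_low, rng):
    """Gaussian matrix with N(0, 1/d_low) entries, so squared norms are preserved in expectation."""
    scale = 1.0 / math.sqrt(d_low)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d_low)] for _ in range(d_high)]


def project(X, R):
    """Map each row of X (length d_high) to d_low dimensions."""
    d_low = len(R[0])
    return [[sum(x[i] * R[i][j] for i in range(len(x))) for j in range(d_low)] for x in X]


def pairwise_sq_dists(X):
    """Full matrix of squared Euclidean distances between the rows of X."""
    n = len(X)
    return [[sum((a - b) ** 2 for a, b in zip(X[i], X[j])) for j in range(n)] for i in range(n)]


def ensemble_sq_dists(X, d_low, n_members, seed=0):
    """Average squared distances over independent projections (hypothetical ensemble rule)."""
    rng = random.Random(seed)
    n, d_high = len(X), len(X[0])
    acc = [[0.0] * n for _ in range(n)]
    for _ in range(n_members):  # members are independent, so this loop could run in parallel
        R = random_projection_matrix(d_high, d_low, rng)
        D = pairwise_sq_dists(project(X, R))
        for i in range(n):
            for j in range(n):
                acc[i][j] += D[i][j] / n_members
    return acc


def mean_rel_error(D_est, D_true):
    """Mean relative error over all distinct pairs of points."""
    n = len(D_true)
    errs = [abs(D_est[i][j] - D_true[i][j]) / D_true[i][j]
            for i in range(n) for j in range(i + 1, n)]
    return sum(errs) / len(errs)


if __name__ == "__main__":
    rng = random.Random(1)
    X = [[rng.gauss(0.0, 1.0) for _ in range(50)] for _ in range(12)]
    D_true = pairwise_sq_dists(X)
    print("JL dimension for n=1000, eps=0.1:", jl_min_dim(1000, 0.1))
    for members in (1, 20, 200):
        D_est = ensemble_sq_dists(X, d_low=2, n_members=members, seed=3)
        print(members, "members, mean relative error:", mean_rel_error(D_est, D_true))
```

A single two-dimensional projection distorts distances heavily, while the average over many independent projections approaches the original squared distances. Because the ensemble members do not depend on each other, they could be evaluated in parallel, which is the property the abstract's speed-up claim relies on.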

Original language: English
Pages (from-to): 146-153
Number of pages: 8
Journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5441
DOIs: 10.1007/978-3-642-00599-2_19
Publication status: Published - 2009

Fingerprint

Random Projection
Ensemble
High-dimensional
Lemma
Ensemble Methods
Sample point
Dimension Reduction
Euclidean Distance
Mutual Information
Higher Dimensions
Pairwise
Divergence
Speedup
Entropy
Subspace
Real-time
Estimate

Keywords

  • Independent subspace analysis
  • Information theoretical estimations
  • Pairwise distances
  • Random projection

ASJC Scopus subject areas

  • Computer Science(all)
  • Theoretical Computer Science

Cite this

@article{0c53aeead5314dad89d29d6af23f779f,
title = "Fast parallel estimation of high dimensional information theoretical quantities with low dimensional random projection ensembles",
abstract = "The estimation of relevant information theoretical quantities, such as entropy, mutual information, and various divergences is computationally expensive in high dimensions. However, for this task, one may apply pairwise Euclidean distances of sample points, which suits random projection (RP) based low dimensional embeddings. The Johnson-Lindenstrauss (JL) lemma gives theoretical bound on the dimension of the low dimensional embedding. We adapt the RP technique for the estimation of information theoretical quantities. Intriguingly, we find that embeddings into extremely small dimensions, far below the bounds of the JL lemma, provide satisfactory estimates for the original task. We illustrate this in the Independent Subspace Analysis (ISA) task; we combine RP dimension reduction with a simple ensemble method. We gain considerable speed-up with the potential of real-time parallel estimation of high dimensional information theoretical quantities.",
keywords = "Independent subspace analysis, Information theoretical estimations, Pairwise distances, Random projection",
author = "Zoltan Szabo and A. Lőrincz",
year = "2009",
doi = "10.1007/978-3-642-00599-2_19",
language = "English",
volume = "5441",
pages = "146--153",
journal = "Lecture Notes in Computer Science",
issn = "0302-9743",
publisher = "Springer Verlag",

}

TY - JOUR

T1 - Fast parallel estimation of high dimensional information theoretical quantities with low dimensional random projection ensembles

AU - Szabo, Zoltan

AU - Lőrincz, A.

PY - 2009

Y1 - 2009

AB - The estimation of relevant information theoretical quantities, such as entropy, mutual information, and various divergences is computationally expensive in high dimensions. However, for this task, one may apply pairwise Euclidean distances of sample points, which suits random projection (RP) based low dimensional embeddings. The Johnson-Lindenstrauss (JL) lemma gives theoretical bound on the dimension of the low dimensional embedding. We adapt the RP technique for the estimation of information theoretical quantities. Intriguingly, we find that embeddings into extremely small dimensions, far below the bounds of the JL lemma, provide satisfactory estimates for the original task. We illustrate this in the Independent Subspace Analysis (ISA) task; we combine RP dimension reduction with a simple ensemble method. We gain considerable speed-up with the potential of real-time parallel estimation of high dimensional information theoretical quantities.

KW - Independent subspace analysis

KW - Information theoretical estimations

KW - Pairwise distances

KW - Random projection

UR - http://www.scopus.com/inward/record.url?scp=67149130044&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=67149130044&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-00599-2_19

DO - 10.1007/978-3-642-00599-2_19

M3 - Article

AN - SCOPUS:67149130044

VL - 5441

SP - 146

EP - 153

JO - Lecture Notes in Computer Science

JF - Lecture Notes in Computer Science

SN - 0302-9743

ER -