Multidimensional indexing tools for the virtual observatory

I. Csabai, L. Dobos, M. Trencséni, G. Herczegh, P. Józsa, N. Purger, T. Budavári, A. S. Szalay

Research output: Contribution to journal (Article)

51 Citations (Scopus)

Abstract

The last decade has seen a dramatic change in the way astronomy is carried out. The dawn of new microelectronic devices, like CCDs, has dramatically extended the amount of observed data. Large, in some cases all-sky, surveys have emerged in almost all wavelength ranges of the observable electromagnetic spectrum. This large amount of data has to be organized and published electronically, and a new style of data retrieval is essential to exploit all the hidden information in the multiwavelength data. Many statistical algorithms required for these tasks run reasonably fast on small sets of in-memory data, but take noticeable performance hits when operating on large databases that do not fit into memory. We utilize new software technologies to develop and evaluate fast multidimensional indexing schemes that inherently follow the underlying, highly non-uniform distribution of the data: layered uniform indices, hierarchical binary space partitioning, and sampled flat Voronoi tessellation of the data. These techniques can dramatically speed up operations such as finding similar objects by example, classifying objects, or comparing extensive simulation sets with observations.
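
As an illustration of the hierarchical binary space partitioning idea mentioned in the abstract, the following is a minimal sketch of a k-d tree with a nearest-neighbour ("find similar objects by example") query. It is not the authors' implementation: the k-d tree is simply one common form of binary space partitioning, the data live in memory rather than in a database, and all names (`Node`, `build_kdtree`, `nearest`, `sq_dist`) are hypothetical.

```python
import random
from typing import List, Optional, Tuple

Point = Tuple[float, ...]


class Node:
    """One split of the space: a point plus the axis it splits on."""

    def __init__(self, point: Point, axis: int,
                 left: "Optional[Node]", right: "Optional[Node]") -> None:
        self.point = point
        self.axis = axis
        self.left = left
        self.right = right


def build_kdtree(points: List[Point], depth: int = 0) -> Optional[Node]:
    """Recursively split at the median, cycling through the axes.

    Median splits put more cells where the data are dense, so the
    partition follows a non-uniform distribution.
    """
    if not points:
        return None
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return Node(ordered[mid], axis,
                build_kdtree(ordered[:mid], depth + 1),
                build_kdtree(ordered[mid + 1:], depth + 1))


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node: Optional[Node], query: Point,
            best: Optional[Point] = None) -> Optional[Point]:
    """Return the stored point closest to `query` (None for an empty tree)."""
    if node is None:
        return best
    if best is None or sq_dist(node.point, query) < sq_dist(best, query):
        best = node.point
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the
    # best match found so far; otherwise the whole subtree is pruned.
    if diff ** 2 < sq_dist(best, query):
        best = nearest(far, query, best)
    return best


# Usage: a clustered (non-uniform) 3-D sample, queried by example.
random.seed(0)
sample = [tuple(random.gauss(0.0, 1.0) for _ in range(3)) for _ in range(1000)]
tree = build_kdtree(sample)
match = nearest(tree, (0.5, -0.2, 0.1))
```

The pruning step is what makes the index pay off: subtrees whose bounding half-space lies farther away than the current best match are never visited, so a typical query touches only a small fraction of the points.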

Original language: English
Pages (from-to): 852-857
Number of pages: 6
Journal: Astronomische Nachrichten
Volume: 328
Issue number: 8
DOI: https://doi.org/10.1002/asna.200710817
Publication status: Published - Aug 2007

Fingerprint

observatories
data retrieval
classifying
astronomy
microelectronics
charge coupled devices
electromagnetic radiation
computer programs
wavelengths
simulation
electromagnetic wave
partitioning
software

Keywords

  • Astronomical databases: miscellaneous
  • Methods: data analysis

ASJC Scopus subject areas

  • Astronomy and Astrophysics
  • Space and Planetary Science

Cite this

Csabai, I., Dobos, L., Trencséni, M., Herczegh, G., Józsa, P., Purger, N., ... Szalay, A. S. (2007). Multidimensional indexing tools for the virtual observatory. Astronomische Nachrichten, 328(8), 852-857. https://doi.org/10.1002/asna.200710817

@article{1a1775f361ba4059896e52378806a815,
title = "Multidimensional indexing tools for the virtual observatory",
abstract = "The last decade has seen a dramatic change in the way astronomy is carried out. The dawn of new microelectronic devices, like CCDs, has dramatically extended the amount of observed data. Large, in some cases all-sky, surveys have emerged in almost all wavelength ranges of the observable electromagnetic spectrum. This large amount of data has to be organized and published electronically, and a new style of data retrieval is essential to exploit all the hidden information in the multiwavelength data. Many statistical algorithms required for these tasks run reasonably fast on small sets of in-memory data, but take noticeable performance hits when operating on large databases that do not fit into memory. We utilize new software technologies to develop and evaluate fast multidimensional indexing schemes that inherently follow the underlying, highly non-uniform distribution of the data: layered uniform indices, hierarchical binary space partitioning, and sampled flat Voronoi tessellation of the data. These techniques can dramatically speed up operations such as finding similar objects by example, classifying objects, or comparing extensive simulation sets with observations.",
keywords = "Astronomical databases: miscellaneous, Methods: data analysis",
author = "I. Csabai and L. Dobos and M. Trencs{\'e}ni and G. Herczegh and P. J{\'o}zsa and N. Purger and T. Budav{\'a}ri and A. S. Szalay",
year = "2007",
month = "8",
doi = "10.1002/asna.200710817",
language = "English",
volume = "328",
pages = "852--857",
journal = "Astronomische Nachrichten",
issn = "0004-6337",
publisher = "Wiley-VCH Verlag",
number = "8",

}

TY - JOUR

T1 - Multidimensional indexing tools for the virtual observatory

AU - Csabai, I.

AU - Dobos, L.

AU - Trencséni, M.

AU - Herczegh, G.

AU - Józsa, P.

AU - Purger, N.

AU - Budavári, T.

AU - Szalay, A. S.

PY - 2007/8

Y1 - 2007/8

N2 - The last decade has seen a dramatic change in the way astronomy is carried out. The dawn of new microelectronic devices, like CCDs, has dramatically extended the amount of observed data. Large, in some cases all-sky, surveys have emerged in almost all wavelength ranges of the observable electromagnetic spectrum. This large amount of data has to be organized and published electronically, and a new style of data retrieval is essential to exploit all the hidden information in the multiwavelength data. Many statistical algorithms required for these tasks run reasonably fast on small sets of in-memory data, but take noticeable performance hits when operating on large databases that do not fit into memory. We utilize new software technologies to develop and evaluate fast multidimensional indexing schemes that inherently follow the underlying, highly non-uniform distribution of the data: layered uniform indices, hierarchical binary space partitioning, and sampled flat Voronoi tessellation of the data. These techniques can dramatically speed up operations such as finding similar objects by example, classifying objects, or comparing extensive simulation sets with observations.

KW - Astronomical databases: miscellaneous

KW - Methods: data analysis

UR - http://www.scopus.com/inward/record.url?scp=35348976684&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=35348976684&partnerID=8YFLogxK

U2 - 10.1002/asna.200710817

DO - 10.1002/asna.200710817

M3 - Article

VL - 328

SP - 852

EP - 857

JO - Astronomische Nachrichten

JF - Astronomische Nachrichten

SN - 0004-6337

IS - 8

ER -