Spatial clustering of galaxies in large datasets

Alexander S. Szalay, Tamás Budavari, Andrew Connolly, Jim Gray, Takahiko Matsubara, Adrian Pope, István Szapudi

Research output: Contribution to journal › Conference article

1 Citation (Scopus)


Datasets with tens of millions of galaxies present new challenges for the analysis of spatial clustering. We have built a framework that integrates a database of object catalogs, tools for creating masks of bad regions, and a fast (N log N) correlation code. This system has enabled unprecedented efficiency in carrying out the analysis of galaxy clustering in the SDSS catalog. A similar approach is used to compute the three-dimensional spatial clustering of galaxies on very large scales. We describe our strategy to estimate the effect of photometric errors using a database. We discuss our efforts as an early example of data-intensive science. While it would have been possible to get these results without the framework we describe, it will be infeasible to perform these computations on the huge datasets of the future without using this framework.
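
The abstract does not say how the correlation code works, so the following is only an illustrative sketch of the kind of pair counting that underlies a clustering measurement. It is not the authors' N log N code. As a stand-in technique it hashes points into a uniform grid, so that each point is compared only with points in neighbouring cells, and then forms the simple DD/RR - 1 excess of pairs within a single separation. The function names (count_pairs, excess_clustering), the flat two-dimensional coordinates, and the single cut-off r_max are all assumptions made for illustration.

```python
import math
import random
from collections import defaultdict


def count_pairs(points, r_max):
    """Count unordered pairs of 2-D points separated by less than r_max.

    Illustrative grid-hashing counter (an assumption, not the paper's code):
    each point goes into a square cell of side r_max, so any pair closer
    than r_max lies in the same cell or in two adjacent cells.
    """
    cells = defaultdict(list)
    for x, y in points:
        cells[(math.floor(x / r_max), math.floor(y / r_max))].append((x, y))

    r2 = r_max * r_max
    pairs = 0
    for (cx, cy), members in cells.items():
        # Pairs whose two points fall in the same cell.
        for i in range(len(members)):
            xi, yi = members[i]
            for j in range(i + 1, len(members)):
                xj, yj = members[j]
                if (xi - xj) ** 2 + (yi - yj) ** 2 < r2:
                    pairs += 1
        # Pairs across cells: visit only four of the eight neighbours so
        # that every pair of adjacent cells is examined exactly once.
        for dx, dy in ((1, -1), (1, 0), (1, 1), (0, 1)):
            other = cells.get((cx + dx, cy + dy))
            if not other:
                continue
            for xi, yi in members:
                for xj, yj in other:
                    if (xi - xj) ** 2 + (yi - yj) ** 2 < r2:
                        pairs += 1
    return pairs


def excess_clustering(data, randoms, r_max):
    """Return DD/RR - 1 for pairs within r_max, normalised by catalog size.

    `randoms` stands for an unclustered catalog covering the same area as
    `data`; in a real survey it would be drawn from the unmasked region.
    """
    dd = count_pairs(data, r_max)
    rr = count_pairs(randoms, r_max)
    if rr == 0:
        raise ValueError("no random pairs within r_max; use more randoms")
    nd, nr = len(data), len(randoms)
    norm = (nr * (nr - 1)) / (nd * (nd - 1))
    return norm * dd / rr - 1.0


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical toy catalogs on the unit square.
    randoms = [(random.random(), random.random()) for _ in range(2000)]
    clustered = [(0.5 + random.gauss(0, 0.05), 0.5 + random.gauss(0, 0.05))
                 for _ in range(500)]
    print(excess_clustering(clustered, randoms, 0.02))
```

For a fixed r_max and roughly uniform data, the grid keeps the number of distance tests far below the N^2 of a brute-force double loop, which is the practical point of any fast correlation code. A production analysis would bin pairs by separation and work on the sphere or in three dimensions.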

Original language: English
Pages (from-to): 1-12
Number of pages: 12
Journal: Proceedings of SPIE - The International Society for Optical Engineering
Publication status: Published - Dec 1 2002
Event: Astronomical Data Analysis II - Waikoloa, HI, United States
Duration: Aug 27 2002 → Aug 28 2002



Keywords

  • Clustering
  • Cosmology
  • Databases
  • Galaxies
  • Large-scale structure
  • Spatial statistics

ASJC Scopus subject areas

  • Electronic, Optical and Magnetic Materials
  • Condensed Matter Physics
  • Computer Science Applications
  • Applied Mathematics
  • Electrical and Electronic Engineering
