Effects of sampling on statistics of large-scale structure

Stéphane Colombi, István Szapudi, Alexander S. Szalay

Research output: Contribution to journal › Article

38 Citations (Scopus)


The effects of sampling are investigated on measurements of counts-in-cells in three-dimensional magnitude-limited galaxy surveys, with emphasis on moments of the underlying smooth galaxy density field convolved with a spherical window. A new estimator is proposed for measuring the kth-order moment 〈ρ^k〉: the weighted factorial moment F̃_k[ω]. Since these statistics are corrected for the effects of the varying selection function, they can extract the moments in one pass without the need to construct a series of volume-limited samples. The cosmic error on the measurement of F̃_k[ω] is computed via the formalism of Szapudi & Colombi, which is generalized to include the effects of the selection function. The integral equation for finding the minimum variance weight is solved numerically, and an accurate and intuitive analytical approximation is derived, ω_optimal(r) ∝ 1/Δ(r), where Δ(r) is the cosmic error as a function of the distance from the observer. The resulting estimator is more accurate than the traditional method of counts-in-cells in volume-limited samples, which discards useful information. As a practical example, it is demonstrated that, unless unforeseen systematics prevent it, the proposed method will extract moments of the galaxy distribution in the future Sloan Digital Sky Survey (hereafter SDSS) with accuracy of order a few per cent for k = 2, 3 and better than 10 per cent for k = 4 in the scale range 1 ≤ ℓ ≤ 50 h⁻¹ Mpc. In the particular case of the SDSS, a homogeneous (spatial) weight ω = 1 is reasonably close to optimal. Optimal sampling strategies for designing magnitude-limited redshift surveys are investigated as well. The arguments of Kaiser are extended to higher order moments, and it is found that the optimal strategy depends greatly on the statistics and scales considered. A sampling rate f ∼ 1/3 to 1/10 is appropriate to measure low-order moments with k ≤ 4 in the scale range 1 ≲ ℓ ≲ 50 h⁻¹ Mpc. However, the optimal sampling rate increases with the order considered, k, and with 1/ℓ. Therefore counts-in-cells statistics in general, such as the shape of the distribution function, high-order moments, cluster selection, etc., require full sampling, especially at small, highly non-linear scales ℓ ∼ 1 h⁻¹ Mpc. Another design issue is the optimal geometry of a catalogue when it covers only a small fraction of the sky. Similarly to Kaiser, we find that a survey composed of several compact subsamples of angular size Ω_F spread over the sky on a glass-like structure would do better, with regard to the cosmic error, than the compact or the traditional slice-like configurations, at least at small scales. The required dynamic range of the measurements determines the characteristic size of the subsamples. It is, however, difficult to estimate, since an accurate and cumbersome calculation of edge effects would be required at scales comparable to the size of a subsample.
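The idea of a weighted factorial moment can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the exact definition of F̃_k[ω], so the normalisation used here (dividing each cell's falling factorial by the k-th power of the mean count expected at that cell's distance) is an assumption, and the function names `falling_factorial`, `inverse_error_weights` and `weighted_factorial_moment` are hypothetical. The sketch relies only on the standard result that, for Poisson-sampled counts N, the factorial moment 〈N(N-1)...(N-k+1)〉 estimates the k-th moment of the underlying field without a shot-noise correction, and on the approximate optimal weight ω ∝ 1/Δ(r) quoted in the abstract.

```python
from math import prod


def falling_factorial(n, k):
    """Return (n)_k = n (n-1) ... (n-k+1); this is 0 for integer 0 <= n < k."""
    return prod(n - j for j in range(k))


def inverse_error_weights(cosmic_errors):
    """Weights proportional to 1/Delta(r), the approximation quoted in the abstract.

    cosmic_errors holds an assumed, precomputed error Delta for each cell.
    """
    return [1.0 / d for d in cosmic_errors]


def weighted_factorial_moment(counts, mean_counts, weights, k):
    """Weighted average over cells of (N)_k / Nbar^k.

    counts      : galaxy count N in each cell
    mean_counts : expected mean count Nbar at each cell's distance
                  (assumed known from the selection function)
    weights     : weight omega for each cell
    """
    numerator = sum(
        w * falling_factorial(n, k) / nbar**k
        for n, nbar, w in zip(counts, mean_counts, weights)
    )
    return numerator / sum(weights)


# Toy example: three cells with uniform weight (omega = 1), second-order moment.
counts = [3, 0, 2]
mean_counts = [2.0, 1.0, 1.0]
uniform = [1.0, 1.0, 1.0]
estimate = weighted_factorial_moment(counts, mean_counts, uniform, k=2)
```

Passing `inverse_error_weights(...)` instead of `uniform` down-weights cells at distances where the assumed error is large. In a real survey the cells would densely sample the volume, and Δ(r) would come from the cosmic error calculation described in the abstract rather than being supplied by hand.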

Original language: English
Pages (from-to): 253-274
Number of pages: 22
Journal: Monthly Notices of the Royal Astronomical Society
Issue number: 2
Publication status: Published - May 11 1998


  • Galaxies: clusters: general
  • Large-scale structure of Universe
  • Methods: numerical
  • Methods: statistical

ASJC Scopus subject areas

  • Astronomy and Astrophysics
  • Space and Planetary Science
