### Abstract

We study the problem of estimating the smallest achievable mean-squared error in regression function estimation. The problem is equivalent to estimating the second moment of the regression function of Y on X ∈ ℝ^{d}. We introduce a nearest-neighbor-based estimate and obtain a normal limit law for the estimate when X has an absolutely continuous distribution, without any condition on the density. We also compute the asymptotic variance explicitly and derive a non-asymptotic bound on the variance that does not depend on the dimension d. The asymptotic variance does not depend on the smoothness of the density of X or of the regression function. A non-asymptotic exponential concentration inequality is also proved. We illustrate the use of the new estimate through testing whether a component of the vector X carries information for predicting Y.
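The abstract does not spell out the estimator's form, so the following is a hypothetical illustration only: a natural nearest-neighbor construction for the second moment E[m(X)²] of the regression function m(x) = E[Y | X = x] averages Y_i against the response of X_i's nearest neighbor among the other sample points; the minimal mean-squared error then follows as E[Y²] − E[m(X)²]. A minimal NumPy sketch under that assumption:

```python
import numpy as np

def nn_second_moment(X, Y):
    """Estimate E[m(X)^2] as (1/n) * sum_i Y_i * Y_{NN(i)}.

    Hypothetical 1-nearest-neighbor construction (not necessarily the
    estimator studied in the paper): NN(i) is the index of X_i's nearest
    neighbor among the other sample points.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Pairwise squared Euclidean distances; exclude self-matches.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)
    nn = d2.argmin(axis=1)  # index of each point's nearest neighbor
    return float(np.mean(Y * Y[nn]))

# Synthetic check: Y = X + noise with X ~ Uniform(0, 1), so m(x) = x and
# E[m(X)^2] = 1/3. The smallest achievable MSE is E[Y^2] - E[m(X)^2].
rng = np.random.default_rng(0)
n = 2000
X = rng.uniform(0.0, 1.0, size=(n, 1))
Y = X[:, 0] + 0.5 * rng.standard_normal(n)

est = nn_second_moment(X, Y)          # close to 1/3 for large n
min_mse_est = np.mean(Y**2) - est     # close to the noise variance 0.25
print(est, min_mse_est)
```

The variable-importance test mentioned in the abstract could, in this sketch's terms, compare the estimate computed on the full X with the estimate computed after dropping one coordinate: if E[m(X)²] barely changes, that coordinate carries little predictive information.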

| Original language | English |
|---|---|
| Pages (from-to) | 1752-1778 |
| Number of pages | 27 |
| Journal | Electronic Journal of Statistics |
| Volume | 12 |
| Issue number | 1 |
| DOIs | 10.1214/18-EJS1438 |
| Publication status | Published - Jan 1 2018 |

### Keywords

- Asymptotic normality
- Concentration inequalities
- Dimension reduction
- Nearest-neighbor-based estimate
- Regression functional

### ASJC Scopus subject areas

- Statistics and Probability

### Cite this

*Electronic Journal of Statistics*, *12*(1), 1752-1778. https://doi.org/10.1214/18-EJS1438