### Abstract

We study the problem of estimating the smallest achievable mean-squared error in regression function estimation. The problem is equivalent to estimating the second moment of the regression function of Y on X ∈ ℝ^{d}. We introduce a nearest-neighbor-based estimate and obtain a normal limit law for the estimate when X has an absolutely continuous distribution, without any condition on the density. We also compute the asymptotic variance explicitly and derive a non-asymptotic bound on the variance that does not depend on the dimension d. The asymptotic variance does not depend on the smoothness of the density of X or of the regression function. A non-asymptotic exponential concentration inequality is also proved. We illustrate the use of the new estimate through testing whether a component of the vector X carries information for predicting Y.
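The equivalence mentioned in the abstract follows from a standard identity: writing m(X) = E[Y | X] for the regression function, the smallest achievable mean-squared error is E[(Y - m(X))^2] = E[Y^2] - E[m(X)^2], because E[Y m(X)] = E[m(X)^2]. The sketch below illustrates one way a nearest-neighbor-based estimate of this quantity could look. The abstract does not spell out the estimator, so the first-nearest-neighbor product form used here, the function name `nn_residual_variance`, and the brute-force distance computation are all assumptions made for illustration.

```python
import numpy as np


def nn_residual_variance(X, Y):
    """Illustrative sketch (assumed form, not taken from the abstract).

    Estimates E[(Y - m(X))^2] as mean(Y_i^2) - mean(Y_i * Y_nn(i)),
    where nn(i) is the index of the nearest neighbor of X_i among the
    other sample points (Euclidean distance).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]  # treat a 1-D input as n points in dimension 1

    # Brute-force pairwise squared distances: O(n^2) memory, small n only.
    diff = X[:, None, :] - X[None, :, :]
    dist2 = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(dist2, np.inf)  # a point is not its own neighbor

    nn = np.argmin(dist2, axis=1)  # index of each point's nearest neighbor

    # mean(Y_i * Y_nn(i)) plays the role of an estimate of E[m(X)^2].
    second_moment = np.mean(Y * Y[nn])
    return np.mean(Y ** 2) - second_moment


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.uniform(size=(1000, 2))
    # Synthetic data: m(x) = sin(2*pi*x_1), noise variance 0.25.
    Y = np.sin(2 * np.pi * X[:, 0]) + 0.5 * rng.standard_normal(1000)
    print(nn_residual_variance(X, Y))
```

On synthetic data like the example in the `__main__` block, the returned value should land near the noise variance. For larger samples, a tree-based neighbor search would be a more practical choice than the full distance matrix used here.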

Original language | English |
---|---|
Pages (from-to) | 1752-1778 |
Number of pages | 27 |
Journal | Electronic Journal of Statistics |
Volume | 12 |
Issue number | 1 |
DOIs | https://doi.org/10.1214/18-EJS1438 |
Publication status | Published - Jan 1 2018 |

### Keywords

- Asymptotic normality
- Concentration inequalities
- Dimension reduction
- Nearest-neighbor-based estimate
- Regression functional

### ASJC Scopus subject areas

- Statistics and Probability

### Cite this

**A nearest neighbor estimate of the residual variance.** / Devroye, Luc; Györfi, L.; Lugosi, Gábor; Walk, Harro.

Research output: Contribution to journal › Article

*Electronic Journal of Statistics*, vol. 12, no. 1, pp. 1752-1778. https://doi.org/10.1214/18-EJS1438

TY - JOUR

T1 - A nearest neighbor estimate of the residual variance

AU - Devroye, Luc

AU - Györfi, L.

AU - Lugosi, Gábor

AU - Walk, Harro

PY - 2018/1/1

Y1 - 2018/1/1

AB - We study the problem of estimating the smallest achievable mean-squared error in regression function estimation. The problem is equivalent to estimating the second moment of the regression function of Y on X ∈ ℝ^{d}. We introduce a nearest-neighbor-based estimate and obtain a normal limit law for the estimate when X has an absolutely continuous distribution, without any condition on the density. We also compute the asymptotic variance explicitly and derive a non-asymptotic bound on the variance that does not depend on the dimension d. The asymptotic variance does not depend on the smoothness of the density of X or of the regression function. A non-asymptotic exponential concentration inequality is also proved. We illustrate the use of the new estimate through testing whether a component of the vector X carries information for predicting Y.

KW - Asymptotic normality

KW - Concentration inequalities

KW - Dimension reduction

KW - Nearest-neighbor-based estimate

KW - Regression functional

UR - http://www.scopus.com/inward/record.url?scp=85048490062&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85048490062&partnerID=8YFLogxK

U2 - 10.1214/18-EJS1438

DO - 10.1214/18-EJS1438

M3 - Article

AN - SCOPUS:85048490062

VL - 12

SP - 1752

EP - 1778

JO - Electronic Journal of Statistics

JF - Electronic Journal of Statistics

SN - 1935-7524

IS - 1

ER -