How to avoid over-fitting in multivariate calibration: The conventional validation approach and an alternative

N. M. Faber, R. Rajkó

Research output: Contribution to journal › Article

138 Citations (Scopus)

Abstract

This paper critically reviews the problem of over-fitting in multivariate calibration and the conventional validation-based approach to avoid it. It proposes a randomization test that enables one to assess the statistical significance of each component that enters the model. This alternative is compared with cross-validation and independent test set validation for the calibration of a near-infrared spectral data set using partial least squares (PLS) regression. The results indicate that the alternative approach is more objective, since, unlike the validation-based approach, it does not require the use of 'soft' decision rules. The alternative approach therefore appears to be a useful addition to the chemometrician's toolbox.
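To make the idea of testing each component for significance concrete, the following is a minimal sketch of a permutation (randomization) test for single-response PLS components. It is an illustration under stated assumptions, not necessarily the procedure used in the paper: the function name `pls1_component_pvalues`, the choice of test statistic (the norm of X'y, which is proportional to the covariance between the component score and y), the permutation of the deflated y-values, and the synthetic demo data are all hypothetical choices made for this example.

```python
import math
import random


def _center(values):
    """Subtract the mean from a list of numbers."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def _xty(X, y):
    """Compute X^T y for a row-major matrix X and a vector y."""
    return [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(len(X[0]))]


def _norm(v):
    return math.sqrt(sum(c * c for c in v))


def pls1_component_pvalues(X, y, n_components, n_perm=199, seed=0):
    """Permutation p-value for each successive PLS1 component (sketch).

    Assumed statistic: ||X^T y|| on the current (deflated) data. With a
    unit-length weight vector w = X^T y / ||X^T y|| and score t = X w,
    this equals t.y, i.e. (n - 1) times the score/response covariance.
    """
    rng = random.Random(seed)
    n_cols = len(X[0])
    # Column-center X and center y.
    cols = [_center([row[j] for row in X]) for j in range(n_cols)]
    Xr = [[cols[j][i] for j in range(n_cols)] for i in range(len(X))]
    yr = _center(y)

    pvalues = []
    for _component in range(n_components):
        w = _xty(Xr, yr)
        observed = _norm(w)
        if observed == 0.0:
            break  # nothing left to model

        # Null distribution: destroy the X-y pairing by shuffling y.
        exceed = 0
        for _perm in range(n_perm):
            y_perm = yr[:]
            rng.shuffle(y_perm)
            if _norm(_xty(Xr, y_perm)) >= observed:
                exceed += 1
        # The +1 terms count the observed arrangement itself.
        pvalues.append((exceed + 1) / (n_perm + 1))

        # NIPALS-style deflation before testing the next component.
        w = [c / observed for c in w]
        t = [sum(row[j] * w[j] for j in range(n_cols)) for row in Xr]
        tt = sum(ti * ti for ti in t)
        p_load = [c / tt for c in _xty(Xr, t)]
        q = sum(ti * yi for ti, yi in zip(t, yr)) / tt
        Xr = [[row[j] - ti * p_load[j] for j in range(n_cols)]
              for row, ti in zip(Xr, t)]
        yr = [yi - q * ti for yi, ti in zip(yr, t)]
    return pvalues


# Synthetic demo (hypothetical data): y depends almost entirely on column 0.
demo_rng = random.Random(1)
X_demo = [[demo_rng.gauss(0, 1) for _ in range(3)] for _ in range(40)]
y_demo = [2.0 * row[0] + 0.1 * demo_rng.gauss(0, 1) for row in X_demo]
pvalues = pls1_component_pvalues(X_demo, y_demo, n_components=2)
print(pvalues)
```

In a sketch like this, components would be added only while their p-value stays below a chosen significance level, which replaces a judgment about where a validation error curve flattens with an explicit statistical criterion.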

Original language: English
Pages (from-to): 98-106
Number of pages: 9
Journal: Analytica Chimica Acta
Volume: 595
Issue number: 1-2 SPEC. ISS.
Publication status: Published - Jul 9 2007

Keywords

  • Component selection
  • Cross-validation
  • Multivariate calibration
  • Near-infrared spectroscopy
  • PLS
  • Randomization test
  • Test set validation

ASJC Scopus subject areas

  • Analytical Chemistry
  • Biochemistry
  • Environmental Chemistry
  • Spectroscopy
