Motivation: In this paper, we propose using the Kalman filter (KF) as a pre-processing step in microarray-based molecular diagnosis. Incorporating the expression covariance between genes is important in such classification problems, since it represents the functional relationships that govern tissue state. Ignoring this covariance may result in biologically implausible class prediction models. Here, we show that using the KF to remove noise while retaining meaningful covariance, and thereby estimating the underlying biological state from microarray measurements, yields linearly separable data suitable for most classification algorithms.

Results: We demonstrate the utility and performance of the KF as a robust disease-state estimator on publicly available binary and multi-class microarray datasets, in combination with the most widely used classification methods to date. Moreover, using popular graphical representation schemes, we show that our filtered datasets are also easier to visualize.
ASJC Scopus subject areas
- Statistics and Probability
- Molecular Biology
- Computer Science Applications
- Computational Theory and Mathematics
- Computational Mathematics