A method is proposed for feature selection in classification problems affected by label noise. The performance of traditional feature selection algorithms often degrades sharply when some samples are mislabelled. The method combines a probabilistic label noise model with a nearest-neighbour-based entropy estimator to robustly estimate mutual information, a popular relevance criterion for feature selection. A backward greedy search procedure uses this criterion to find relevant sets of features. Experiments show that (i) label noise needs to be taken into account when selecting features and (ii) the proposed methodology reduces the negative impact of mislabelled data points on the feature selection process.
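To make the two building blocks named in the abstract concrete, the following is a minimal sketch of a nearest-neighbour entropy estimator driving a backward greedy search. It is an illustration under stated assumptions, not the authors' implementation: it uses the standard Kozachenko-Leonenko estimator, computes mutual information as I(X;Y) = H(X) - Σ_y p(y) H(X|Y=y), and does not include the probabilistic label noise model, so it corresponds to the noise-free baseline only. All function names, the choice k=3, and the synthetic data are hypothetical.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma function at a positive integer n (exact harmonic-sum formula)."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def kl_entropy(points, k=3):
    """Kozachenko-Leonenko k-NN estimate of differential entropy, in nats."""
    n, d = len(points), len(points[0])
    # Log-volume of the d-dimensional Euclidean unit ball.
    log_unit_ball = (d / 2.0) * math.log(math.pi) - math.lgamma(d / 2.0 + 1.0)
    log_dist_sum = 0.0
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        # Distance to the k-th nearest neighbour; guard against log(0).
        log_dist_sum += math.log(max(dists[k - 1], 1e-12))
    return digamma_int(n) - digamma_int(k) + log_unit_ball + d * log_dist_sum / n


def mutual_information(X, y, features, k=3):
    """Estimate I(X_features; Y) as H(X) minus the class-weighted H(X | Y)."""
    sub = [[row[f] for f in features] for row in X]
    mi = kl_entropy(sub, k)
    for label in set(y):
        group = [s for s, lab in zip(sub, y) if lab == label]
        mi -= len(group) / len(sub) * kl_entropy(group, k)
    return mi


def backward_selection(X, y, n_keep, k=3):
    """Repeatedly drop the feature whose removal leaves the highest MI."""
    selected = list(range(len(X[0])))
    while len(selected) > n_keep:
        to_drop = max(
            selected,
            key=lambda f: mutual_information(
                X, y, [g for g in selected if g != f], k
            ),
        )
        selected.remove(to_drop)
    return selected


# Hypothetical toy data: feature 0 separates the classes, features 1-2 are noise.
random.seed(0)
y = [i % 2 for i in range(100)]
X = [
    [random.gauss(3.0 if lab else -3.0, 1.0), random.gauss(0, 1), random.gauss(0, 1)]
    for lab in y
]
kept = backward_selection(X, y, n_keep=1)
print("kept features:", kept)
```

The brute-force neighbour search is quadratic in the number of samples, which is acceptable for a small illustration; a tree-based neighbour search would be the usual choice for larger data sets. Making the criterion robust to mislabelled samples, as the paper proposes, would require replacing the hard class partition in `mutual_information` with the label noise model, which this sketch does not attempt.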
- Entropy estimation
- Feature selection
- Label noise
- Mutual information
Frénay, B., Doquire, G., & Verleysen, M. (2014). Estimating mutual information for feature selection in the presence of label noise. Computational Statistics and Data Analysis, 71, 832-848. https://doi.org/10.1016/j.csda.2013.05.001