## Abstract

Statistical inference using machine learning techniques may be difficult with small datasets because of abnormally frequent data (AFDs). AFDs are observations that are much more frequent in the training sample than they should be with respect to their theoretical probability; outliers are one example. Parameter estimates tend to be biased towards models that support such data. This paper proposes pointwise probability reinforcements (PPRs): the probability of each observation is reinforced by a PPR, and a regularisation term controls the amount of reinforcement that compensates for AFDs. The proposed solution is generic, since it can be used to robustify any statistical inference method that can be formulated as a likelihood maximisation. Experiments show that PPRs can easily be used to tackle regression, classification and projection: models are freed from the influence of outliers. Moreover, outliers can be filtered manually, since an abnormality degree is obtained for each observation.
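The abstract does not give the objective function, so the following is only a minimal sketch of the general idea and not the paper's own algorithm. It assumes a reinforced log-likelihood of the form `sum_i log(p_i + r_i) - alpha * sum_i r_i` with `r_i >= 0`, where `p_i` is the model density of observation `i`, `r_i` is its reinforcement, and the L1-style penalty and the value of `alpha` are illustrative choices. The function name `ppr_robust_mean`, the weights `w` and the score `1 - w` are hypothetical names introduced here. Under these assumptions, the reinforcements have a closed form for fixed parameters, and the parameter update reduces to a weighted maximum-likelihood step.

```python
import numpy as np


def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at each entry of x."""
    z = (x - mu) / sigma
    return np.exp(-0.5 * z ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def ppr_robust_mean(x, sigma=1.0, alpha=10.0, n_iter=50):
    """Estimate a Gaussian mean with reinforced likelihoods (illustrative).

    Assumed objective: sum_i log(p_i + r_i) - alpha * sum_i r_i, r_i >= 0.
    """
    x = np.asarray(x, dtype=float)
    mu = x.mean()  # start from the ordinary (non-robust) ML estimate
    for _ in range(n_iter):
        p = gaussian_pdf(x, mu, sigma)
        # Closed-form maximiser over r_i for fixed mu under the assumed penalty:
        # observations with density below 1/alpha are topped up to 1/alpha.
        r = np.maximum(1.0 / alpha - p, 0.0)
        # Share of each reinforced probability that the model itself explains.
        w = p / (p + r)
        # Weighted ML update of the mean; poorly explained points count less.
        mu = np.sum(w * x) / np.sum(w)
    return mu, w, r


# Toy data (made up for illustration): 21 inliers centred on 0, two outliers at 30.
x = np.concatenate([np.linspace(-2.0, 2.0, 21), [30.0, 30.0]])
mu_hat, w, r = ppr_robust_mean(x)

# Hypothetical abnormality score: close to 1 for observations the model rejects.
abnormality = 1.0 - w
```

In this toy setting the plain sample mean is pulled towards the two outliers, whereas the reweighted estimate stays near the centre of the inliers, and the outliers receive a score close to 1. A larger `alpha` lowers the density threshold `1 / alpha` and therefore allows less reinforcement.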

Original language | English |
---|---|
Pages (from-to) | 124-141 |
Number of pages | 18 |
Journal | Neural Networks |
Volume | 50 |
DOIs | |
Publication status | Published - 2014 |
Externally published | Yes |

## Keywords

- Cleansing
- Filtering
- Maximum likelihood
- Outliers
- Probability reinforcements
- Robust inference