In machine learning, ensemble methods have been receiving a great deal of
attention. Techniques such as Bagging and Boosting have been successfully
applied to a variety of problems. Nevertheless, such techniques are still
susceptible to the effects of noise and outliers in the training data. We
propose a new method for the generation of pools of classifiers based on
Bagging, in which the probability of an instance being selected during the
resampling process is inversely proportional to its instance hardness, which
can be understood as the likelihood of an instance being misclassified,
regardless of the choice of classifier. The goal of the proposed method is to
remove noisy data without sacrificing the hard instances which are likely to be
found on class boundaries. We evaluate the performance of the method on
nineteen public data sets, and compare it to the performance of the Bagging and
Random Subspace algorithms. Our experiments show that in high noise scenarios
the accuracy of our method is significantly better than that of Bagging.
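
The resampling idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how instance hardness is estimated or how the weights are normalised, so the sketch assumes that a hardness score in [0, 1] is already available for each training instance, and it assumes a weight of 1 / (hardness + eps) to realise "inversely proportional". The function names and the `eps` parameter are hypothetical.

```python
import random


def sampling_weights(hardness, eps=1e-6):
    """Return selection probabilities inversely proportional to hardness.

    `hardness` is assumed to hold one precomputed score in [0, 1] per
    instance. `eps` (an assumption of this sketch) avoids division by
    zero for instances whose hardness is exactly 0.
    """
    raw = [1.0 / (h + eps) for h in hardness]
    total = sum(raw)
    return [w / total for w in raw]


def hardness_weighted_bootstrap(hardness, rng):
    """Draw one bag of indices, with replacement, the same size as the data."""
    n = len(hardness)
    weights = sampling_weights(hardness)
    return rng.choices(range(n), weights=weights, k=n)


def make_bags(hardness, n_bags, seed=0):
    """Generate `n_bags` index bags; each would train one pool member."""
    rng = random.Random(seed)
    return [hardness_weighted_bootstrap(hardness, rng) for _ in range(n_bags)]


if __name__ == "__main__":
    # Toy hardness scores: the last instance is the most likely to be noise.
    scores = [0.05, 0.10, 0.40, 0.95]
    for bag in make_bags(scores, n_bags=3):
        print(bag)
```

Each bag is a list of indices into the training set. Under these assumptions, an instance with high hardness is drawn less often than in standard Bagging, where every instance has the same selection probability, but it is never excluded outright.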