Diversity in Machine Learning

4 Jul 2018 · Zhiqiang Gong, Ping Zhong, Weidong Hu ·

Machine learning methods have achieved good performance and been widely applied in various real-world applications. They can learn the model adaptively and be better fit for special requirements of different tasks. Generally, a good machine learning system is composed of plentiful training data, a good model training process, and an accurate inference. Many factors can affect the performance of the machine learning process, among which the diversity of the machine learning process is an important one. The diversity can help each procedure to guarantee a total good machine learning: diversity of the training data ensures that the training data can provide more discriminative information for the model, diversity of the learned model (diversity in parameters of each model or diversity among different base models) makes each parameter/model capture unique or complement information and the diversity in inference can provide multiple choices each of which corresponds to a specific plausible local optimal result. Even though the diversity plays an important role in machine learning process, there is no systematical analysis of the diversification in machine learning system. In this paper, we systematically summarize the methods to make data diversification, model diversification, and inference diversification in the machine learning process, respectively. In addition, the typical applications where the diversity technology improved the machine learning performance have been surveyed, including the remote sensing imaging tasks, machine translation, camera relocalization, image segmentation, object detection, topic modeling, and others. Finally, we discuss some challenges of the diversity technology in machine learning and point out some directions in future work.

PDF Abstract