We contribute the first large-scale comparison of stopping criteria, using a cost measure to quantify the accuracy/label trade-off, public implementations of all stopping criteria we evaluate, and an open-source framework for evaluating stopping criteria.
This work explores a new training method for semi-supervised learning that is based on similarity function learning using a Siamese network to obtain a suitable embedding.
Self-training is a simple semi-supervised learning approach: unlabelled examples that attract high-confidence predictions are labelled with those predictions and added to the training set, and this process is repeated several times.
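As a hedged sketch of the self-training loop just described (the base classifier, the 0.95 confidence threshold, and the round limit are illustrative assumptions, not taken from any of the papers listed here):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X_lab, y_lab, X_unlab, threshold=0.95, max_rounds=5):
    """Iteratively pseudo-label high-confidence unlabelled examples.

    In each round, the model is refit, unlabelled examples whose top
    predicted probability reaches `threshold` are labelled with that
    prediction and moved into the training set, and the rest remain
    unlabelled for the next round.
    """
    model = LogisticRegression(max_iter=1000)
    for _ in range(max_rounds):
        model.fit(X_lab, y_lab)
        if len(X_unlab) == 0:
            break
        probs = model.predict_proba(X_unlab)
        conf = probs.max(axis=1)
        keep = conf >= threshold
        if not keep.any():
            break  # no confident predictions left; stop early
        pseudo_labels = model.classes_[probs[keep].argmax(axis=1)]
        X_lab = np.vstack([X_lab, X_unlab[keep]])
        y_lab = np.concatenate([y_lab, pseudo_labels])
        X_unlab = X_unlab[~keep]
    return model
```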
Deep neural networks produce state-of-the-art results when trained on a large number of labeled examples but tend to overfit when only small amounts of labeled data are available for training.
Game-theoretic attribution techniques based on Shapley values are used extensively to interpret black-box machine learning models, but their exact calculation is generally NP-hard, requiring approximation methods for non-trivial models.
SHAP (SHapley Additive exPlanation) values provide a game theoretic interpretation of the predictions of machine learning models based on Shapley values.
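To make the game-theoretic definition concrete, here is a hedged brute-force sketch of exact Shapley values for a toy value function; the enumeration over all feature coalitions is precisely why exact computation is intractable for non-trivial models. The additive value function at the end is an illustrative assumption:

```python
from itertools import combinations
from math import factorial

def shapley_values(value, n_features):
    """Exact Shapley values by enumerating all coalitions of features.

    `value` maps a set of feature indices to the model's payoff when
    exactly those features are present. Cost grows as O(2^n), which is
    why approximation methods are needed in practice.
    """
    phi = [0.0] * n_features
    players = range(n_features)
    for i in players:
        others = [j for j in players if j != i]
        for r in range(len(others) + 1):
            for subset in combinations(others, r):
                s = frozenset(subset)
                # Shapley weight: |S|! (n - |S| - 1)! / n!
                weight = (factorial(len(s)) * factorial(n_features - len(s) - 1)
                          / factorial(n_features))
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

# Additive toy model: payoff is the sum of weights of present features,
# so the Shapley attributions should recover the weights (up to float error).
weights = {0: 2.0, 1: -1.0, 2: 0.5}
phi = shapley_values(lambda s: sum(weights[j] for j in s), 3)
```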
no code implementations • 7 Oct 2020 • Moi Hoon Yap, Ryo Hachiuma, Azadeh Alavi, Raphael Brungel, Bill Cassidy, Manu Goyal, Hongtao Zhu, Johannes Ruckert, Moshe Olshansky, Xiao Huang, Hideo Saito, Saeed Hassanpour, Christoph M. Friedrich, David Ascher, Anping Song, Hiroki Kajita, David Gillespie, Neil D. Reeves, Joseph Pappachan, Claire O'Shea, Eibe Frank
DFUC2020 provided participants with a comprehensive dataset consisting of 2,000 images for training and 2,000 images for testing.
The proposed method creates new members of the ensemble from mini-batches of data as new data becomes available.
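One simple way such an ensemble could be grown from mini-batches is sketched below; the base learner, the member cap, and the majority-vote aggregation are illustrative assumptions, not the paper's exact method:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

class StreamingEnsemble:
    """Grow an ensemble by fitting one new member per mini-batch."""

    def __init__(self, max_members=10):
        self.max_members = max_members
        self.members = []

    def partial_fit(self, X_batch, y_batch):
        # Each incoming mini-batch trains a fresh ensemble member.
        member = DecisionTreeClassifier(max_depth=3).fit(X_batch, y_batch)
        self.members.append(member)
        if len(self.members) > self.max_members:
            self.members.pop(0)  # drop the oldest member to bound memory
        return self

    def predict(self, X):
        # Majority vote over all current members.
        votes = np.stack([m.predict(X) for m in self.members])
        return np.apply_along_axis(
            lambda col: np.bincount(col.astype(int)).argmax(), 0, votes)
```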
no code implementations • 24 Apr 2020 • Bill Cassidy, Neil D. Reeves, Pappachan Joseph, David Gillespie, Claire O'Shea, Satyan Rajbhandari, Arun G. Maiya, Eibe Frank, Andrew Boulton, David Armstrong, Bijan Najafi, Justina Wu, Moi Hoon Yap
Every 20 seconds, a limb is amputated somewhere in the world due to diabetes.
code2vec is a recently released embedding approach that uses the proxy task of method name prediction to map Java methods to feature vectors.
This performance led to further studies of how exactly it works and how it could be improved. In the last decade, numerous studies have explored the mechanisms of classifier chains at a theoretical level, and many improvements have been made to the training and inference procedures, such that this method remains among the state-of-the-art options for multi-label learning.
Nested dichotomies are used as a method of transforming a multiclass classification problem into a series of binary problems.
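As a hedged sketch of the decomposition just described (the balanced, arbitrary class split and the logistic-regression base learner are illustrative assumptions — in practice the choice of split matters and ensembles of dichotomy structures are often used):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

class NestedDichotomy:
    """One system of nested dichotomies with arbitrary balanced splits.

    Classes are recursively split into two groups; a binary classifier is
    trained at each internal node, and class probabilities are products of
    branch probabilities along the path from the root to each leaf.
    """

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.tree_ = self._build(X, y, list(self.classes_))
        return self

    def _build(self, X, y, classes):
        if len(classes) == 1:
            return classes[0]  # leaf holding a single class
        left, right = classes[: len(classes) // 2], classes[len(classes) // 2:]
        z = np.isin(y, right).astype(int)  # binary target: 0 = left, 1 = right
        clf = LogisticRegression(max_iter=1000).fit(X, z)
        return (clf,
                self._build(X[z == 0], y[z == 0], left),
                self._build(X[z == 1], y[z == 1], right))

    def predict_proba(self, X):
        out = np.zeros((len(X), len(self.classes_)))
        idx = {c: i for i, c in enumerate(self.classes_)}

        def walk(node, p):
            if not isinstance(node, tuple):
                out[:, idx[node]] += p  # leaf: accumulate path probability
                return
            clf, left, right = node
            pr = clf.predict_proba(X)[:, 1]
            walk(left, p * (1 - pr))
            walk(right, p * pr)

        walk(self.tree_, np.ones(len(X)))
        return out
```

Because the branch probabilities at each node sum to one, the leaf probabilities also sum to one for every instance.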
Obtaining accurate and well calibrated probability estimates from classifiers is useful in many applications, for example, when minimising the expected cost of classifications.
Effective regularisation of neural networks is essential to combat overfitting due to the large number of parameters involved.
We investigate the effect of explicitly enforcing the Lipschitz continuity of neural networks with respect to their inputs.
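As a hedged illustration of one common way such a constraint can be enforced (shown here for a single linear layer; the exact mechanism used in the work above is not reproduced): a linear map x → Wx has Lipschitz constant equal to the spectral norm of W, so rescaling W bounds the layer's Lipschitz constant.

```python
import numpy as np

def constrain_spectral_norm(W, max_norm=1.0):
    """Rescale W so its largest singular value is at most max_norm.

    This bounds the Lipschitz constant of the linear map x -> W @ x
    with respect to the Euclidean norm.
    """
    sigma = np.linalg.norm(W, ord=2)  # largest singular value of W
    if sigma > max_norm:
        W = W * (max_norm / sigma)
    return W

rng = np.random.default_rng(0)
W = constrain_spectral_norm(rng.normal(size=(4, 3)) * 5.0)
x, y = rng.normal(size=3), rng.normal(size=3)
# The constrained layer cannot expand distances between inputs:
assert np.linalg.norm(W @ x - W @ y) <= np.linalg.norm(x - y) + 1e-9
```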
To investigate this question we use population-based optimisation algorithms to generate artificial surrogate training data for naive Bayes for regression.
A system of nested dichotomies is a method of decomposing a multi-class problem into a collection of binary problems.