Search Results for author: Andrey Malinin

Found 16 papers, 9 papers with code

Multi-Sentence Resampling: A Simple Approach to Alleviate Dataset Length Bias and Beam-Search Degradation

1 code implementation EMNLP 2021 Ivan Provilkov, Andrey Malinin

We demonstrate that MSR significantly reduces degradation with growing beam size and improves final translation quality on the IWSLT'15 En-Vi, IWSLT'17 En-Fr, and WMT'14 En-De datasets.

Automatic Speech Recognition • Data Augmentation • +3
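
The abstract excerpt above reports results only; as context, here is a minimal sketch of the resampling idea, assuming (per the title) that MSR builds longer training examples by concatenating several randomly sampled sentence pairs from the parallel corpus, flattening the training-time length distribution. All names are illustrative, not the authors' code:

```python
import random

def multi_sentence_resample(pairs, max_sentences=4, seed=0):
    """Resample a parallel corpus so each example concatenates
    1..max_sentences randomly drawn (source, target) pairs,
    flattening the sentence-length distribution seen in training."""
    rng = random.Random(seed)
    resampled = []
    for _ in range(len(pairs)):
        k = rng.randint(1, max_sentences)
        sampled = [pairs[rng.randrange(len(pairs))] for _ in range(k)]
        resampled.append((" ".join(s for s, _ in sampled),
                          " ".join(t for _, t in sampled)))
    return resampled

corpus = [("hello .", "bonjour ."), ("how are you ?", "comment ça va ?")]
print(multi_sentence_resample(corpus, max_sentences=2))
```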

Shifts: A Dataset of Real Distributional Shift Across Multiple Large-Scale Tasks

3 code implementations 15 Jul 2021 Andrey Malinin, Neil Band, Alexander Ganshin, German Chesnokov, Yarin Gal, Mark J. F. Gales, Alexey Noskov, Andrey Ploskonosov, Liudmila Prokhorenkova, Ivan Provilkov, Vatsal Raina, Vyas Raina, Denis Roginskiy, Mariya Shmatova, Panos Tigas, Boris Yangel

However, many tasks of practical interest have different modalities, such as tabular data, audio, text, or sensor data, which offer significant challenges involving regression and discrete or continuous structured prediction.

Image Classification • Machine Translation • +3

Scaling Ensemble Distribution Distillation to Many Classes with Proxy Targets

1 code implementation NeurIPS 2021 Max Ryabinin, Andrey Malinin, Mark Gales

Ensemble Distribution Distillation is an approach that allows a single model to efficiently capture both the predictive performance and uncertainty estimates of an ensemble.
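
A minimal PyTorch sketch of the base EnD$^2$ objective this paper builds on: the student outputs Dirichlet concentration parameters and is trained to maximize the likelihood of the ensemble members' categorical predictions under that Dirichlet. The proxy-target method of this paper modifies the targets to scale this to many classes; tensor names here are illustrative:

```python
import torch

def end2_nll(alphas, ensemble_probs, eps=1e-8):
    """Negative log-likelihood of ensemble member predictions under
    the student's Dirichlet.
    alphas:         (batch, classes)          student concentrations > 0
    ensemble_probs: (batch, members, classes) teacher ensemble outputs
    """
    a0 = alphas.sum(-1)                                     # (batch,)
    log_norm = torch.lgamma(a0) - torch.lgamma(alphas).sum(-1)
    log_probs = torch.log(ensemble_probs + eps)             # (b, m, c)
    # Dirichlet log-density of each member's prediction, averaged over members
    ll = log_norm.unsqueeze(1) + ((alphas.unsqueeze(1) - 1) * log_probs).sum(-1)
    return -ll.mean()

alphas = torch.nn.functional.softplus(torch.randn(8, 10)) + 1.0
probs = torch.softmax(torch.randn(8, 5, 10), dim=-1)
print(end2_nll(alphas, probs))
```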

Ensemble Distillation Approaches for Grammatical Error Correction

no code implementations 24 Nov 2020 Yassir Fathullah, Mark Gales, Andrey Malinin

It is, however, more challenging than the standard tasks investigated for distillation, as the prediction of any grammatical correction for a word is highly dependent on both the input sequence and the generated output history for that word.

Grammatical Error Correction
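
Because each correction depends on the input and the decoded history, distillation is typically applied per output position under teacher forcing. A schematic, simplified token-level variant (not necessarily the paper's exact formulation):

```python
import torch
import torch.nn.functional as F

def sequence_token_kd(student_logits, teacher_logits, mask, T=1.0):
    """Per-token distillation for seq2seq: both models are teacher-forced
    on the same output history, so distributions align position-by-position.
    student_logits, teacher_logits: (batch, length, vocab)
    mask: (batch, length) with 1 for real tokens, 0 for padding
    """
    t = F.log_softmax(teacher_logits / T, dim=-1)
    s = F.log_softmax(student_logits / T, dim=-1)
    kl = (t.exp() * (t - s)).sum(-1)   # KL(teacher || student) per token
    return (kl * mask).sum() / mask.sum()

s = torch.randn(2, 7, 100)
t = torch.randn(2, 7, 100)
print(sequence_token_kd(s, t, torch.ones(2, 7)))
```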

Regression Prior Networks

1 code implementation 20 Jun 2020 Andrey Malinin, Sergey Chervontsev, Ivan Provilkov, Mark Gales

Prior Networks are a recently developed class of models which yield interpretable measures of uncertainty and have been shown to outperform state-of-the-art ensemble approaches on a range of tasks.

Monocular Depth Estimation
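
As a sketch of the regression construction (univariate case, assuming the network outputs the parameters of a Normal-Gamma prior over the mean and precision of a Gaussian; the paper's multivariate version uses a Normal-Wishart), the marginal predictive is a Student-t whose parameters expose the decomposed uncertainties:

$$
p(\mu, \lambda \mid x; \theta) = \mathcal{N}\big(\mu \mid m(x), (\kappa(x)\lambda)^{-1}\big)\,\mathrm{Gam}\big(\lambda \mid a(x), b(x)\big),
\qquad
p(y \mid x; \theta) = \mathrm{St}\Big(y \,\Big|\, m(x),\; \tfrac{a(x)\,\kappa(x)}{b(x)\,(\kappa(x)+1)},\; 2a(x)\Big).
$$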

Uncertainty in Gradient Boosting via Ensembles

no code implementations ICLR 2021 Andrey Malinin, Liudmila Prokhorenkova, Aleksei Ustimenko

For many practical, high-risk applications, it is essential to quantify uncertainty in a model's predictions to avoid costly mistakes.

General Classification
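
The approach builds ensembles of gradient-boosted models and decomposes their predictive uncertainty. A minimal scikit-learn sketch, using independently seeded stochastic-gradient-boosting models as a stand-in for the paper's SGB/SGLB ensembles:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# An ensemble of stochastic GBDTs that differ only in random seed.
X, y = make_classification(n_samples=500, random_state=0)
ensemble = [
    GradientBoostingClassifier(subsample=0.5, random_state=s).fit(X, y)
    for s in range(5)
]

def uncertainty_decomposition(models, X, eps=1e-12):
    """Total uncertainty = entropy of the mean prediction;
    data uncertainty  = mean entropy of member predictions;
    knowledge (distributional) uncertainty = their difference,
    i.e. the mutual information between prediction and model."""
    probs = np.stack([m.predict_proba(X) for m in models])   # (M, N, C)
    mean = probs.mean(axis=0)
    total = -(mean * np.log(mean + eps)).sum(-1)
    data = -(probs * np.log(probs + eps)).sum(-1).mean(0)
    return total, data, total - data

total, data, knowledge = uncertainty_decomposition(ensemble, X)
```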

Reverse KL-Divergence Training of Prior Networks: Improved Uncertainty and Adversarial Robustness

1 code implementation NeurIPS 2019 Andrey Malinin, Mark Gales

Second, taking advantage of this new training criterion, this paper investigates using Prior Networks to detect adversarial attacks and proposes a generalized form of adversarial training.

Adversarial Attack Detection • Adversarial Robustness • +3
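
Schematically, the change of criterion: given a target Dirichlet $\mathrm{Dir}(\mu; \hat{\beta}(x))$ (sharp for in-domain inputs, flat for out-of-distribution ones), training minimizes the reverse rather than the forward KL divergence to the model's output Dirichlet:

$$
\mathcal{L}_{\text{fwd}}(\theta) = \mathbb{E}_{p(x)}\!\left[\mathrm{KL}\big(\mathrm{Dir}(\mu;\hat{\beta}(x)) \,\|\, p(\mu \mid x;\theta)\big)\right]
\;\;\longrightarrow\;\;
\mathcal{L}_{\text{rev}}(\theta) = \mathbb{E}_{p(x)}\!\left[\mathrm{KL}\big(p(\mu \mid x;\theta) \,\|\, \mathrm{Dir}(\mu;\hat{\beta}(x))\big)\right]
$$

Because the reverse direction is zero-forcing, the model matches a single target Dirichlet per input rather than trying to cover a multimodal mixture of in-domain and out-of-distribution targets.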

Ensemble Distribution Distillation

1 code implementation ICLR 2020 Andrey Malinin, Bruno Mlodozeniec, Mark Gales

The properties of EnD$^2$ are investigated on both an artificial dataset, and on the CIFAR-10, CIFAR-100 and TinyImageNet datasets, where it is shown that EnD$^2$ can approach the classification performance of an ensemble, and outperforms both standard DNNs and Ensemble Distillation on the tasks of misclassification and out-of-distribution input detection.

Prior Networks for Detection of Adversarial Attacks

no code implementations 6 Dec 2018 Andrey Malinin, Mark Gales

In this work, Prior Networks are applied to adversarial attack detection using measures of uncertainty in a similar fashion to Monte-Carlo Dropout.

Adversarial Attack Detection
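
A minimal PyTorch sketch of the Monte-Carlo Dropout baseline referred to here (`model` is any classifier with dropout layers; this is an illustration, not the paper's code):

```python
import torch

def mc_dropout_uncertainty(model, x, samples=20, eps=1e-12):
    """Monte-Carlo Dropout: keep dropout active at test time, run several
    stochastic forward passes, and compute the usual uncertainty measures.
    Inputs with high mutual information are candidate adversarial or
    out-of-distribution examples."""
    model.train()  # keeps dropout stochastic (a careful implementation
                   # would enable only the dropout layers)
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(x), dim=-1) for _ in range(samples)]
        )                                            # (S, batch, classes)
    mean = probs.mean(0)
    total = -(mean * (mean + eps).log()).sum(-1)             # predictive entropy
    expected = -(probs * (probs + eps).log()).sum(-1).mean(0)
    mutual_info = total - expected                           # knowledge uncertainty
    return total, mutual_info
```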

Predictive Uncertainty Estimation via Prior Networks

1 code implementation NeurIPS 2018 Andrey Malinin, Mark Gales

Experiments on synthetic data and on the MNIST and CIFAR-10 datasets show that, unlike previous non-Bayesian methods, PNs are able to distinguish between data and distributional uncertainty.
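
Concretely, the separation referred to is the mutual-information decomposition used with Prior Networks, where $\mu$ is a categorical distribution drawn from the network's output Dirichlet:

$$
\underbrace{\mathcal{I}\big[y, \mu \mid x^{*}\big]}_{\text{distributional uncertainty}}
= \underbrace{\mathcal{H}\!\Big[\mathbb{E}_{p(\mu \mid x^{*};\theta)}\big[\mathrm{P}(y \mid \mu)\big]\Big]}_{\text{total uncertainty}}
- \underbrace{\mathbb{E}_{p(\mu \mid x^{*};\theta)}\!\Big[\mathcal{H}\big[\mathrm{P}(y \mid \mu)\big]\Big]}_{\text{expected data uncertainty}}
$$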

Incorporating Uncertainty into Deep Learning for Spoken Language Assessment

no code implementations ACL 2017 Andrey Malinin, Anton Ragni, Kate Knill, Mark Gales

In experiments conducted on data from the Business Language Testing Service (BULATS), the proposed approach is found to outperform GPs and DNNs with MCD in uncertainty-based rejection whilst achieving comparable grading performance.
