Search Results for author: Mark Gales

Found 44 papers, 18 papers with code

On Assessing and Developing Spoken ‘Grammatical Error Correction’ Systems

no code implementations NAACL (BEA) 2022 Yiting Lu, Stefano Bannò, Mark Gales

Due to a lack of end-to-end training data, SGEC is often implemented as a cascaded, modular system, consisting of speech recognition, disfluency removal, and grammatical error correction (GEC).

Grammatical Error Correction speech-recognition +1

LLM Task Interference: An Initial Study on the Impact of Task-Switch in Conversational History

1 code implementation 28 Feb 2024 Akash Gupta, Ivaxi Sheth, Vyas Raina, Mark Gales, Mario Fritz

With the recent emergence of powerful instruction-tuned large language models (LLMs), various helpful conversational Artificial Intelligence (AI) systems have been deployed across many applications.

Is LLM-as-a-Judge Robust? Investigating Universal Adversarial Attacks on Zero-shot LLM Assessment

no code implementations 21 Feb 2024 Vyas Raina, Adian Liusie, Mark Gales

Large Language Models (LLMs) are powerful zero-shot assessors and are increasingly used in real-world situations such as for written exams or benchmarking systems.

Adversarial Robustness Benchmarking

An Information-Theoretic Approach to Analyze NLP Classification Tasks

1 code implementation 1 Feb 2024 Luran Wang, Mark Gales, Vatsal Raina

This work provides an information-theoretic framework to analyse the influence of inputs for text classification tasks.

Multiple-choice Reading Comprehension +4

Structural-Based Uncertainty in Deep Learning Across Anatomical Scales: Analysis in White Matter Lesion Segmentation

1 code implementation 15 Nov 2023 Nataliia Molchanova, Vatsal Raina, Andrey Malinin, Francesco La Rosa, Adrien Depeursinge, Mark Gales, Cristina Granziera, Henning Muller, Mara Graziani, Meritxell Bach Cuadra

The results from a multi-centric MRI dataset of 172 patients demonstrate that our proposed measures more effectively capture model errors at the lesion and patient scales compared to measures that average voxel-scale uncertainty values.

Lesion Segmentation Uncertainty Quantification

Assessing Distractors in Multiple-Choice Tests

no code implementations 8 Nov 2023 Vatsal Raina, Adian Liusie, Mark Gales

Specifically, we define quality in terms of the incorrectness, plausibility and diversity of the distractor options.

Multiple-choice Reading Comprehension

Minimum Bayes' Risk Decoding for System Combination of Grammatical Error Correction Systems

1 code implementation 12 Sep 2023 Vyas Raina, Mark Gales

Minimum Bayes' Risk (MBR) decoding can be used to combine system outputs in a manner that encourages better alignment with the final assessment criterion.

Grammatical Error Correction

Analyzing Multiple-Choice Reading and Listening Comprehension Tests

no code implementations 3 Jul 2023 Vatsal Raina, Adian Liusie, Mark Gales

Multiple-choice reading and listening comprehension tests are an important part of language assessment.

Multiple-choice Reading Comprehension +1

Sample Attackability in Natural Language Adversarial Attacks

1 code implementation 21 Jun 2023 Vyas Raina, Mark Gales

Adversarial attack research in natural language processing (NLP) has made significant progress in designing powerful attack methods and defence approaches.

Adversarial Attack

CUED at ProbSum 2023: Hierarchical Ensemble of Summarization Models

1 code implementation 8 Jun 2023 Potsawee Manakul, Yassir Fathullah, Adian Liusie, Vyas Raina, Vatsal Raina, Mark Gales

In this paper, we consider the challenge of summarizing patients' medical progress notes in a limited data setting.

Sentiment Perception Adversarial Attacks on Neural Machine Translation Systems

no code implementations 2 May 2023 Vyas Raina, Mark Gales

In this work, adversarial attacks for NMT systems are explored from an output perception perspective.

Machine Translation NMT +1

Tackling Bias in the Dice Similarity Coefficient: Introducing nDSC for White Matter Lesion Segmentation

1 code implementation 10 Feb 2023 Vatsal Raina, Nataliia Molchanova, Mara Graziani, Andrey Malinin, Henning Muller, Meritxell Bach Cuadra, Mark Gales

This work describes a detailed analysis of the recently proposed normalised Dice Similarity Coefficient (nDSC) for binary segmentation tasks as an adaptation of DSC which scales the precision at a fixed recall rate to tackle this bias.

Lesion Segmentation Segmentation

Identifying Adversarially Attackable and Robust Samples

1 code implementation 30 Jan 2023 Vyas Raina, Mark Gales

We propose a deep-learning-based detector to identify the adversarially attackable and robust samples in an unseen dataset for an unseen target model.

Active Learning Adversarial Attack +1

World Knowledge in Multiple Choice Reading Comprehension

1 code implementation 13 Nov 2022 Adian Liusie, Vatsal Raina, Mark Gales

Two metrics are described: the expected number of options, which measures whether a passage-free system can identify the answer to a question using world knowledge; and the contextual mutual information, which measures the importance of context for a given question.

General Knowledge Multiple-choice +2

Parallel Attention Forcing for Machine Translation

no code implementations 6 Nov 2022 Qingyun Dou, Mark Gales

Attention forcing has been introduced to address the mismatch, guiding the model with the generated back-history and reference attention.

Machine Translation NMT +1

Deliberation Networks and How to Train Them

no code implementations 6 Nov 2022 Qingyun Dou, Mark Gales

A deliberation network consists of multiple standard sequence-to-sequence models, each one conditioned on the initial input and the output of the previous model.

Machine Translation Speech Synthesis

Multiple-Choice Question Generation: Towards an Automated Assessment Framework

no code implementations 23 Sep 2022 Vatsal Raina, Mark Gales

Applying n-gram based approaches is challenging for this form of system as the reference set is unlikely to capture the full range of possible questions and answer options.

Multiple-choice Question Generation +2

Gender Bias and Universal Substitution Adversarial Attacks on Grammatical Error Correction Systems for Automated Assessment

no code implementations 19 Aug 2022 Vyas Raina, Mark Gales

When considering the application of GEC systems to automated language assessment, the aim of an adversary could be to cheat by making a small change to a grammatically incorrect input sentence that conceals the errors from a GEC system, such that no edits are found and the candidate is unjustly awarded a perfect fluency score.

Adversarial Attack Grammatical Error Correction +1

Residue-Based Natural Language Adversarial Attack Detection

1 code implementation NAACL 2022 Vyas Raina, Mark Gales

Many popular image adversarial detection approaches are able to identify adversarial examples from embedding feature spaces, whilst in the NLP domain existing state-of-the-art detection approaches focus solely on input text features, without consideration of model embedding spaces.

Adversarial Attack Detection Sentence +2

Scaling Ensemble Distribution Distillation to Many Classes with Proxy Targets

1 code implementation NeurIPS 2021 Max Ryabinin, Andrey Malinin, Mark Gales

Ensemble Distribution Distillation is an approach that allows a single model to efficiently capture both the predictive performance and uncertainty estimates of an ensemble.

Should Ensemble Members Be Calibrated?

no code implementations 13 Jan 2021 Xixin Wu, Mark Gales

It is shown that well-calibrated ensemble members will not necessarily yield a well-calibrated ensemble prediction, and that if the ensemble prediction is well calibrated, its performance cannot exceed the average performance of the calibrated ensemble members.

Image Classification

CUED_speech at TREC 2020 Podcast Summarisation Track

no code implementations 4 Dec 2020 Potsawee Manakul, Mark Gales

Our approach consists of two steps: (1) Filtering redundant or less informative sentences in the transcription using the attention of a hierarchical model; (2) Applying a state-of-the-art text summarisation system (BART) fine-tuned on the Podcast data using a sequence-level reward function.

Ensemble Distillation Approaches for Grammatical Error Correction

no code implementations 24 Nov 2020 Yassir Fathullah, Mark Gales, Andrey Malinin

It is, however, more challenging than the standard tasks investigated for distillation as the prediction of any grammatical correction to a word will be highly dependent on both the input sequence and the generated output history for the word.

Grammatical Error Correction

Complementary Systems for Off-Topic Spoken Response Detection

no code implementations WS 2020 Vatsal Raina, Mark Gales, Kate Knill

This paper examines one form of spoken language assessment: whether the response from the candidate is relevant to the prompt provided.

Data Augmentation

Regression Prior Networks

1 code implementation 20 Jun 2020 Andrey Malinin, Sergey Chervontsev, Ivan Provilkov, Mark Gales

Prior Networks are a recently developed class of models which yield interpretable measures of uncertainty and have been shown to outperform state-of-the-art ensemble approaches on a range of tasks.

Monocular Depth Estimation regression

Reverse KL-Divergence Training of Prior Networks: Improved Uncertainty and Adversarial Robustness

1 code implementation NeurIPS 2019 Andrey Malinin, Mark Gales

Second, taking advantage of this new training criterion, this paper investigates using Prior Networks to detect adversarial attacks and proposes a generalized form of adversarial training.

Adversarial Attack Detection Adversarial Robustness +3

Ensemble Distribution Distillation

1 code implementation ICLR 2020 Andrey Malinin, Bruno Mlodozeniec, Mark Gales

The properties of EnD² are investigated on both an artificial dataset, and on the CIFAR-10, CIFAR-100 and TinyImageNet datasets, where it is shown that EnD² can approach the classification performance of an ensemble, and outperforms both standard DNNs and Ensemble Distillation on the tasks of misclassification and out-of-distribution input detection.

Prior Networks for Detection of Adversarial Attacks

no code implementations 6 Dec 2018 Andrey Malinin, Mark Gales

In this work, Prior Networks are applied to adversarial attack detection using measures of uncertainty in a similar fashion to Monte-Carlo Dropout.

Adversarial Attack Detection

Confidence Estimation and Deletion Prediction Using Bidirectional Recurrent Neural Networks

no code implementations 30 Oct 2018 Anton Ragni, Qiujia Li, Mark Gales, Yu Wang

These errors are not accounted for by the standard confidence estimation schemes and are hard to rectify in the upstream and downstream processing.

Bi-Directional Lattice Recurrent Neural Networks for Confidence Estimation

4 code implementations 30 Oct 2018 Qiujia Li, Preben Ness, Anton Ragni, Mark Gales

The standard approach to mitigate errors made by an automatic speech recognition system is to use confidence scores associated with each predicted word.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Predictive Uncertainty Estimation via Prior Networks

1 code implementation NeurIPS 2018 Andrey Malinin, Mark Gales

Experiments on synthetic, MNIST and CIFAR-10 data show that, unlike previous non-Bayesian methods, PNs are able to distinguish between data and distributional uncertainty.

Phonetic and Graphemic Systems for Multi-Genre Broadcast Transcription

no code implementations 1 Feb 2018 Yu Wang, Xie Chen, Mark Gales, Anton Ragni, Jeremy Wong

As the combination approaches become more complicated, the difference between the phonetic and graphemic systems further decreases.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Future Word Contexts in Neural Network Language Models

no code implementations 18 Aug 2017 Xie Chen, Xunying Liu, Anton Ragni, Yu Wang, Mark Gales

Instead of using a recurrent unit to capture the complete future word context, a feedforward unit is used to model a finite number of succeeding (future) words.

speech-recognition Speech Recognition

Incorporating Uncertainty into Deep Learning for Spoken Language Assessment

no code implementations ACL 2017 Andrey Malinin, Anton Ragni, Kate Knill, Mark Gales

In experiments conducted on data from the Business Language Testing Service (BULATS), the proposed approach is found to outperform GPs and DNNs with MCD in uncertainty-based rejection whilst achieving comparable grading performance.
