Search Results for author: Marek Rei

Found 54 papers, 18 papers with code

Did the Neurons Read your Book? Document-level Membership Inference for Large Language Models

no code implementations 23 Oct 2023 Matthieu Meeus, Shubham Jain, Marek Rei, Yves-Alexandre de Montjoye

First, we propose a procedure for the development and evaluation of document-level membership inference for LLMs by leveraging commonly used data sources for training and the model release date.


On the application of Large Language Models for language teaching and assessment technology

no code implementations 17 Jul 2023 Andrew Caines, Luca Benedetto, Shiva Taslimipoor, Christopher Davis, Yuan Gao, Oeistein Andersen, Zheng Yuan, Mark Elliott, Russell Moore, Christopher Bryant, Marek Rei, Helen Yannakoudakis, Andrew Mullooly, Diane Nicholls, Paula Buttery

The recent release of very large language models such as PaLM and GPT-4 has made an unprecedented impact in the popular media and public consciousness, giving rise to a mixture of excitement and fear as to their capabilities and potential uses, and shining a light on natural language processing research which had not previously received so much attention.

Grammatical Error Correction Misinformation +1

Logical Reasoning for Natural Language Inference Using Generated Facts as Atoms

no code implementations 22 May 2023 Joe Stacey, Pasquale Minervini, Haim Dubossarsky, Oana-Maria Camburu, Marek Rei

We apply our method to the highly challenging ANLI dataset, where our framework improves the performance of both a DeBERTa-base and BERT baseline.

Logical Reasoning Natural Language Inference +1

Improving Robustness in Knowledge Distillation Using Domain-Targeted Data Augmentation

no code implementations 22 May 2023 Joe Stacey, Marek Rei

DMU is complementary to the domain-targeted augmentation, and substantially improves performance on SNLI-hard.

Data Augmentation Knowledge Distillation +2

Finding the Needle in a Haystack: Unsupervised Rationale Extraction from Long Text Classifiers

no code implementations 14 Mar 2023 Kamil Bujel, Andrew Caines, Helen Yannakoudakis, Marek Rei

Long-sequence transformers are designed to improve the representation of longer texts by language models and their performance on downstream document-level tasks.

Document Classification Language Modelling +2

Modelling Temporal Document Sequences for Clinical ICD Coding

no code implementations 24 Feb 2023 Clarence Boon Liang Ng, Diogo Santos, Marek Rei

Past studies on the ICD coding problem focus on predicting clinical codes primarily based on the discharge summary.

An Extended Sequence Tagging Vocabulary for Grammatical Error Correction

2 code implementations 12 Feb 2023 Stuart Mesham, Christopher Bryant, Marek Rei, Zheng Yuan

We extend a current sequence-tagging approach to Grammatical Error Correction (GEC) by introducing specialised tags for spelling correction and morphological inflection using the SymSpell and LemmInflect algorithms.

Grammatical Error Correction Morphological Inflection +1

Probing for targeted syntactic knowledge through grammatical error detection

1 code implementation 28 Oct 2022 Christopher Davis, Christopher Bryant, Andrew Caines, Marek Rei, Paula Buttery

Targeted studies testing knowledge of subject-verb agreement (SVA) indicate that pre-trained language models encode syntactic information.

Grammatical Error Detection

Control Prefixes for Parameter-Efficient Text Generation

2 code implementations 15 Oct 2021 Jordan Clive, Kris Cao, Marek Rei

Prefix-tuning is a powerful lightweight technique for adapting a large pre-trained language model to a downstream application.

Abstractive Text Summarization Data-to-Text Generation +2

Guiding Visual Question Generation

no code implementations NAACL 2022 Nihir Vedd, Zixu Wang, Marek Rei, Yishu Miao, Lucia Specia

In traditional Visual Question Generation (VQG), most images have multiple concepts (e.g. objects and categories) for which a question could be generated, but models are trained to mimic an arbitrary choice of concept as given in their training data.

Question Generation Question-Generation +2

Supervising Model Attention with Human Explanations for Robust Natural Language Inference

1 code implementation 16 Apr 2021 Joe Stacey, Yonatan Belinkov, Marek Rei

Natural Language Inference (NLI) models are known to learn from biases and artefacts within their training data, impacting how well they generalise to other unseen datasets.

Natural Language Inference

Memorisation versus Generalisation in Pre-trained Language Models

1 code implementation ACL 2022 Michael Tänzer, Sebastian Ruder, Marek Rei

State-of-the-art pre-trained language models have been shown to memorise facts and perform well with limited amounts of training data.

Few-Shot Learning Low Resource Named Entity Recognition +3

Zero-shot Sequence Labeling for Transformer-based Sentence Classifiers

1 code implementation ACL (RepL4NLP) 2021 Kamil Bujel, Helen Yannakoudakis, Marek Rei

We investigate how sentence-level transformers can be modified into effective sequence labelers at the token level without any direct supervision.

Visual Cues and Error Correction for Translation Robustness

1 code implementation Findings (EMNLP) 2021 Zhenhao Li, Marek Rei, Lucia Specia

Neural Machine Translation models are sensitive to noise in the input texts, such as misspelled words and ungrammatical constructions.

Machine Translation Translation

Grammatical error detection in transcriptions of spoken English

no code implementations COLING 2020 Andrew Caines, Christian Bentz, Kate Knill, Marek Rei, Paula Buttery

We describe the collection of transcription corrections and grammatical error annotations for the CrowdED Corpus of spoken English monologues on business topics.

Grammatical Error Detection

Grammatical Error Correction in Low Error Density Domains: A New Benchmark and Analyses

no code implementations EMNLP 2020 Simon Flachs, Ophélie Lacroix, Helen Yannakoudakis, Marek Rei, Anders Søgaard

Evaluation of grammatical error correction (GEC) systems has primarily focused on essays written by non-native learners of English, which however is only part of the full spectrum of GEC applications.

Grammatical Error Correction Language Modelling

Multidirectional Associative Optimization of Function-Specific Word Representations

1 code implementation ACL 2020 Daniela Gerz, Ivan Vulić, Marek Rei, Roi Reichart, Anna Korhonen

We present a neural framework for learning associations between interrelated groups of words such as the ones found in Subject-Verb-Object (SVO) structures.

Semi-supervised Bootstrapping of Dialogue State Trackers for Task Oriented Modelling

no code implementations 26 Nov 2019 Bo-Hsiang Tseng, Marek Rei, Paweł Budzianowski, Richard E. Turner, Bill Byrne, Anna Korhonen

Dialogue systems benefit greatly from optimizing on detailed annotations, such as transcribed utterances, internal dialogue state representations and dialogue act labels.

Semi-Supervised Bootstrapping of Dialogue State Trackers for Task-Oriented Modelling

no code implementations IJCNLP 2019 Bo-Hsiang Tseng, Marek Rei, Paweł Budzianowski, Richard Turner, Bill Byrne, Anna Korhonen

Dialogue systems benefit greatly from optimizing on detailed annotations, such as transcribed utterances, internal dialogue state representations and dialogue act labels.

Context is Key: Grammatical Error Detection with Contextual Word Representations

1 code implementation WS 2019 Samuel Bell, Helen Yannakoudakis, Marek Rei

Grammatical error detection (GED) in non-native writing requires systems to identify a wide range of errors in text written by language learners.

Grammatical Error Detection

A Simple and Robust Approach to Detecting Subject-Verb Agreement Errors

no code implementations NAACL 2019 Simon Flachs, Ophélie Lacroix, Marek Rei, Helen Yannakoudakis, Anders Søgaard

While rule-based detection of subject-verb agreement (SVA) errors is sensitive to syntactic parsing errors and irregularities and exceptions to the main rules, neural sequential labelers have a tendency to overfit their training data.

Jointly Learning to Label Sentences and Tokens

2 code implementations 14 Nov 2018 Marek Rei, Anders Søgaard

Learning to construct text representations in end-to-end systems can be difficult, as natural languages are highly compositional and task-specific annotated datasets are often limited in size.

Grammatical Error Detection Sentence Classification

Sequence Classification with Human Attention

1 code implementation CONLL 2018 Maria Barrett, Joachim Bingel, Nora Hollenstein, Marek Rei, Anders Søgaard

Learning attention functions requires large volumes of data, but many NLP tasks simulate human behavior, and in this paper, we show that human attention really does provide a good inductive bias on many attention functions in NLP.

Abusive Language Classification +4

Variable Typing: Assigning Meaning to Variables in Mathematical Text

no code implementations NAACL 2018 Yiannos Stathopoulos, Simon Baker, Marek Rei, Simone Teufel

Our results show that the best performing MIR models make use of our typed index, compared to a formula index only containing raw symbols, thereby demonstrating the usefulness of variable typing.

Information Retrieval Retrieval

Scoring Lexical Entailment with a Supervised Directional Similarity Network

1 code implementation ACL 2018 Marek Rei, Daniela Gerz, Ivan Vulić

Experiments show excellent performance on scoring graded lexical entailment, raising the state-of-the-art on the HyperLex dataset by approximately 25%.

Lexical Entailment Word Embeddings

Zero-shot Sequence Labeling: Transferring Knowledge from Sentences to Tokens

no code implementations NAACL 2018 Marek Rei, Anders Søgaard

Can attention- or gradient-based visualization techniques be used to infer token-level labels for binary sequence tagging problems, using networks trained only on sentence-level labels?

Neural Multi-task Learning in Automated Assessment

no code implementations 21 Jan 2018 Ronan Cummins, Marek Rei

Grammatical error detection and automated essay scoring are two tasks in the area of automated assessment.

Automated Essay Scoring BIG-bench Machine Learning +2

An Error-Oriented Approach to Word Embedding Pre-Training

no code implementations WS 2017 Youmna Farag, Marek Rei, Ted Briscoe

Additionally, extending the model with corrections provides further performance gains when data sparsity is an issue.

Auxiliary Objectives for Neural Error Detection Models

no code implementations WS 2017 Marek Rei, Helen Yannakoudakis

We investigate the utility of different auxiliary objectives and training strategies within a neural sequence labeling approach to error detection in learner writing.

Grammatical Error Detection

Detecting Off-topic Responses to Visual Prompts

no code implementations WS 2017 Marek Rei

Automated methods for essay scoring have made great progress in recent years, achieving accuracies very close to human annotators.

Semi-supervised Multitask Learning for Sequence Labeling

3 code implementations ACL 2017 Marek Rei

We propose a sequence labeling framework with a secondary training objective, learning to predict surrounding words for every word in the dataset.

Chunking Grammatical Error Detection +5

Attending to Characters in Neural Sequence Labeling Models

no code implementations COLING 2016 Marek Rei, Gamal K. O. Crichton, Sampo Pyysalo

Sequence labeling architectures use word embeddings for capturing similarity, but suffer when handling previously unseen or rare words.

Chunking Grammatical Error Detection +3

Automatic Text Scoring Using Neural Networks

3 code implementations ACL 2016 Dimitrios Alikaniotis, Helen Yannakoudakis, Marek Rei

Automated Text Scoring (ATS) provides a cost-effective and consistent alternative to human marking.

A Joint Model for Word Embedding and Word Morphology

no code implementations WS 2016 Kris Cao, Marek Rei

This paper presents a joint model for performing unsupervised morphological analysis on words, and learning a character-level composition function from morphemes to word embeddings.

Morphological Analysis Word Embeddings
