1 code implementation • NAACL (BEA) 2022 • Gladys Tyen, Mark Brenchley, Andrew Caines, Paula Buttery
State-of-the-art chatbots for English are now able to hold conversations on virtually any topic (e.g. Adiwardana et al., 2020; Roller et al., 2021).
no code implementations • NAACL (BEA) 2022 • Thiemo Wambsganss, Andrew Caines, Paula Buttery
We present an approach which automatically detects claim-premise structures and provides visual feedback to the learner to prompt them to repair any broken argumentation structures. To investigate whether our persuasive feedback on language learners' essay writing tasks engages them and supports them in improving their English, we designed the ALEN app (Argumentation for Learning English).
no code implementations • NAACL (BEA) 2022 • Roman Rietsche, Andrew Caines, Cornelius Schramm, Dominik Pfütze, Paula Buttery
This peer-to-peer feedback has become increasingly important whether in MOOCs to provide feedback to thousands of students or in large-scale classes at universities.
no code implementations • EMNLP (WNUT) 2020 • Jack Hughes, Seth Aycock, Andrew Caines, Paula Buttery, Alice Hutchings
We present a lightweight method for identifying currently trending terms in relation to a known prior of terms, using a weighted log-odds ratio with an informative prior.
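As a rough illustration of the idea in this abstract, here is a minimal sketch of one common formulation of a weighted log-odds ratio with an informative prior: the difference in smoothed log-odds between two corpora, divided by its estimated standard deviation. It is not the authors' implementation, and whether the paper uses exactly this formulation is an assumption. The function name `weighted_log_odds` and its parameters are hypothetical; `prior` holds pseudo-counts taken from some known background collection of terms.

```python
import math
from collections import Counter


def weighted_log_odds(target, reference, prior):
    """Return a z-score per term: positive means over-represented in `target`.

    target, reference: term -> count in each corpus (hypothetical inputs).
    prior: term -> positive pseudo-count from a background corpus (assumed).
    The prior should contain at least two terms so the denominators stay positive.
    """
    n_target = sum(target.values())
    n_reference = sum(reference.values())
    prior_total = sum(prior.values())
    scores = {}
    for term, alpha in prior.items():
        if alpha <= 0:
            continue  # terms without prior mass cannot be smoothed
        y_t = target.get(term, 0)
        y_r = reference.get(term, 0)
        # Smoothed log-odds of the term within each corpus.
        log_odds_t = math.log((y_t + alpha) / (n_target + prior_total - y_t - alpha))
        log_odds_r = math.log((y_r + alpha) / (n_reference + prior_total - y_r - alpha))
        # Approximate variance of the difference; rarer terms get larger variance.
        variance = 1.0 / (y_t + alpha) + 1.0 / (y_r + alpha)
        scores[term] = (log_odds_t - log_odds_r) / math.sqrt(variance)
    return scores


if __name__ == "__main__":
    # Toy counts, invented purely for illustration.
    recent = Counter({"exploit": 30, "the": 100, "forum": 10})
    earlier = Counter({"exploit": 5, "the": 100, "forum": 10})
    background = {"exploit": 1, "the": 10, "forum": 1}
    ranked = sorted(weighted_log_odds(recent, earlier, background).items(),
                    key=lambda item: item[1], reverse=True)
    for term, score in ranked:
        print(f"{term}\t{score:.2f}")
```

Dividing by the standard deviation is what keeps very rare terms from dominating the ranking, since a large log-odds difference based on a handful of occurrences is scaled down.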
1 code implementation • LREC 2022 • Mariano Felice, Shiva Taslimipoor, Øistein E. Andersen, Paula Buttery
Open cloze tests are a standard type of exercise where examinees must complete a text by filling in the gaps without any given options to choose from.
no code implementations • EACL (VarDial) 2021 • Rami Aly, Andrew Caines, Paula Buttery
The most successful approach to Neural Machine Translation (NMT) when only monolingual training data is available, called unsupervised machine translation, is based on back-translation where noisy translations are generated to turn the task into a supervised one.
no code implementations • 15 Jan 2024 • Christopher Davis, Andrew Caines, Øistein Andersen, Shiva Taslimipoor, Helen Yannakoudakis, Zheng Yuan, Christopher Bryant, Marek Rei, Paula Buttery
Thanks to recent advances in generative AI, we are able to prompt large language models (LLMs) to produce texts which are fluent and grammatical.
no code implementations • 15 Nov 2023 • Richard Diehl Martinez, Zebulon Goriely, Hope McGovern, Christopher Davis, Andrew Caines, Paula Buttery, Lisa Beinborn
We describe our team's contribution to the STRICT-SMALL track of the BabyLM Challenge.
no code implementations • 17 Jul 2023 • Andrew Caines, Luca Benedetto, Shiva Taslimipoor, Christopher Davis, Yuan Gao, Oeistein Andersen, Zheng Yuan, Mark Elliott, Russell Moore, Christopher Bryant, Marek Rei, Helen Yannakoudakis, Andrew Mullooly, Diane Nicholls, Paula Buttery
The recent release of very large language models such as PaLM and GPT-4 has made an unprecedented impact in the popular media and public consciousness, giving rise to a mixture of excitement and fear as to their capabilities and potential uses, and shining a light on natural language processing research which had not previously received so much attention.
1 code implementation • 28 Oct 2022 • Christopher Davis, Christopher Bryant, Andrew Caines, Marek Rei, Paula Buttery
Targeted studies testing knowledge of subject-verb agreement (SVA) indicate that pre-trained language models encode syntactic information.
no code implementations • Findings (ACL) 2022 • Mariano Felice, Shiva Taslimipoor, Paula Buttery
This paper presents the first multi-objective transformer model for constructing open cloze tests that exploits generation and discrimination capabilities to improve performance.
no code implementations • COLING 2020 • Andrew Caines, Christian Bentz, Kate Knill, Marek Rei, Paula Buttery
We describe the collection of transcription corrections and grammatical error annotations for the CrowdED Corpus of spoken English monologues on business topics.
no code implementations • NLP4CALL (COLING) 2020 • Andrew Caines, Helen Yannakoudakis, Helena Edmondson, Helen Allen, Pascual Pérez-Paredes, Bill Byrne, Paula Buttery
The Teacher-Student Chatroom Corpus (TSCC) is a collection of written conversations captured during one-to-one lessons between teachers and learners of English.
no code implementations • ACL 2020 • Hannah Craighead, Andrew Caines, Paula Buttery, Helen Yannakoudakis
We address the task of automatically grading the language proficiency of spontaneous speech based on textual features from automatic speech recognition transcripts.
no code implementations • LREC 2020 • Andrew Caines, Paula Buttery
We report on our attempts to reproduce the work described in Vajjala & Rama 2018, 'Experiments with universal CEFR classification', as part of REPROLANG 2020: this involves feature-based and neural approaches to essay scoring in Czech, German and Italian.
no code implementations • 23 Apr 2020 • Ahmed Zaidi, Andrew Caines, Russell Moore, Paula Buttery, Andrew Rice
The forgetting curve has been extensively explored by psychologists, educationalists and cognitive scientists alike.
no code implementations • RANLP 2019 • Mariano Felice, Paula Buttery
To the best of our knowledge, this is the first unsupervised information-theoretical approach to evaluating the quality of cloze tests.
no code implementations • SEMEVAL 2019 • Guy Aglionby, Chris Davis, Pushkar Mishra, Andrew Caines, Helen Yannakoudakis, Marek Rei, Ekaterina Shutova, Paula Buttery
We describe the CAMsterdam team entry to the SemEval-2019 Shared Task 6 on offensive language identification in Twitter data.
no code implementations • WS 2018 • Andrew Caines, Sergio Pastrana, Alice Hutchings, Paula Buttery
We probe the heterogeneity in levels of abusive language in different sections of the Internet, using an annotated corpus of Wikipedia page edit comments to train a binary classifier for abuse detection.
no code implementations • WS 2017 • Andrew Caines, Michael McCarthy, Paula Buttery
We present an analysis of parser performance on speech data, comparing word type and token frequency distributions with written data, and evaluating parse accuracy by length of input string.
no code implementations • WS 2017 • Emma Flint, Elliot Ford, Olivia Thomas, Andrew Caines, Paula Buttery
This paper investigates the problem of text normalisation; specifically, the normalisation of non-standard words (NSWs) in English.
no code implementations • WS 2017 • Andrew Caines, Emma Flint, Paula Buttery
We present a crowdsourced collection of error annotations for transcriptions of spoken learner English.
no code implementations • COLING 2016 • Russell Moore, Andrew Caines, Calbert Graham, Paula Buttery
In order to apply computational linguistic analyses and pass information to downstream applications, transcriptions of speech obtained via automatic speech recognition (ASR) need to be divided into smaller meaningful units, in a task we refer to as 'speech-unit (SU) delimitation'.
1 code implementation • LREC 2016 • Wanru Zhang, Andrew Caines, Dimitrios Alikaniotis, Paula Buttery
no code implementations • LREC 2016 • Andrew Caines, Christian Bentz, Calbert Graham, Tim Polzehl, Paula Buttery
We announce the release of the CROWDED CORPUS: a pair of speech corpora collected via crowdsourcing, containing a native speaker corpus of English (CROWDED_ENGLISH), and a corpus of German/English bilinguals (CROWDED_BILINGUAL).
no code implementations • LREC 2012 • Andrew Caines, Paula Buttery
We present a set of stand-off annotations for the ninety thousand sentences in the spoken section of the British National Corpus (BNC) which feature a progressive aspect verb group.
no code implementations • LREC 2012 • Paula Buttery, Andrew Caines
The premise was not only to compare the results of two quite different methods for our own interest, but also to enable other researchers to choose whichever reclassification better suited their purpose (one being grounded purely in theoretical linguistics and the other in practical language engineering).