1 code implementation • ACL 2022 • Rabeeh Karimi Mahabadi, Luke Zettlemoyer, James Henderson, Lambert Mathias, Marzieh Saeidi, Veselin Stoyanov, Majid Yazdani
Current methods for few-shot fine-tuning of pretrained masked language models (PLMs) require carefully engineered prompts and verbalizers for each new task to convert examples into a cloze format that the PLM can score.
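As background, a minimal sketch of the prompt-and-verbalizer conversion that such methods require (and that this paper seeks to avoid); the pattern and label words below are purely illustrative, not from the paper:

```python
# Hypothetical example of the conventional cloze-style setup this sentence
# describes; the pattern and verbalizer are illustrative.

def to_cloze(text: str, mask_token: str = "[MASK]") -> str:
    """Wrap the input in a hand-written pattern containing one mask slot."""
    return f"{text} It was {mask_token}."

# Verbalizer: maps each class label to a vocabulary token whose masked-LM
# probability at the mask position serves as the class score.
VERBALIZER = {"positive": "great", "negative": "terrible"}

example = "The film kept me on the edge of my seat."
print(to_cloze(example))        # "... It was [MASK]."
print(VERBALIZER["positive"])   # token scored at the mask position
```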
1 code implementation • EMNLP (newsum) 2021 • Andreas Marfurt, James Henderson
Abstractive summarization models heavily rely on copy mechanisms, such as the pointer network or attention, to achieve good performance, measured by textual overlap with reference summaries.
no code implementations • 13 Aug 2024 • Andrei C. Coman, Christos Theodoropoulos, Marie-Francine Moens, James Henderson
Link prediction models can benefit from incorporating textual descriptions of entities and relations, enabling fully inductive learning and flexibility in dynamic graphs.
1 code implementation • 18 Jul 2024 • Christos Theodoropoulos, Andrei Catalin Coman, James Henderson, Marie-Francine Moens
The ever-growing volume of biomedical publications creates a critical need for efficient knowledge discovery.
2 code implementations • 10 Jun 2024 • Yuta Nagano, Andrew Pyo, Martina Milighetti, James Henderson, John Shawe-Taylor, Benny Chain, Andreas Tiffeau-Mayer
Here we introduce a TCR language model called SCEPTR (Simple Contrastive Embedding of the Primary sequence of T cell Receptors), capable of data-efficient transfer learning.
no code implementations • 19 Apr 2024 • James Henderson, Yuta Nagano, Martina Milighetti, Andreas Tiffeau-Mayer
Performing this mapping requires identifying the sequence features that are most informative about function.
no code implementations • 1 Dec 2023 • Fabio Fehr, James Henderson
We extend the NVIB framework to replace all types of attention functions in Transformers, and show that existing pretrained Transformers can be reinterpreted as Nonparametric Variational (NV) models using a proposed identity initialisation.
1 code implementation • 27 Oct 2023 • James Henderson, Alireza Mohammadshahi, Andrei C. Coman, Lesly Miculicich
We argue that Transformers are essentially graph-to-graph models, with sequences just being a special case.
2 code implementations • 26 Oct 2023 • Melika Behjati, Fabio Fehr, James Henderson
Finally, we show that NVIB compression results in a model which is more robust to adversarial perturbations.
no code implementations • 28 Aug 2023 • Andrei C. Coman, Christos Theodoropoulos, Marie-Francine Moens, James Henderson
Document-level relation extraction typically relies on text-based encoders and hand-coded pooling heuristics to aggregate information learned by the encoder.
2 code implementations • 15 May 2023 • Rabeeh Karimi Mahabadi, Hamish Ivison, Jaesung Tae, James Henderson, Iz Beltagy, Matthew E. Peters, Arman Cohan
Diffusion models have emerged as a powerful paradigm for generation, obtaining strong performance in various continuous domains.
1 code implementation • 2 Nov 2022 • Alireza Mohammadshahi, Thomas Scialom, Majid Yazdani, Pouya Yanki, Angela Fan, James Henderson, Marzieh Saeidi
We demonstrate that RQUGE has a higher correlation with human judgment without relying on the reference question.
3 code implementations • 20 Oct 2022 • Alireza Mohammadshahi, Vassilina Nikoulina, Alexandre Berard, Caroline Brun, James Henderson, Laurent Besacier
In recent years, multilingual machine translation models have achieved promising performance on low-resource language pairs by sharing information between similar languages, thus enabling zero-shot translation.
no code implementations • 27 Jul 2022 • James Henderson, Fabio Fehr
We propose a VAE for Transformers by developing a variational information bottleneck regulariser for Transformer embeddings.
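For readers unfamiliar with the underlying idea, a minimal sketch of a standard Gaussian information bottleneck layer of the kind such regularisers build on; this is a generic formulation (PyTorch assumed), not the paper's nonparametric variant for attention-based embeddings:

```python
import torch
import torch.nn as nn

class GaussianVIB(nn.Module):
    """Generic Gaussian information-bottleneck layer: encode an input into a
    stochastic latent and penalise its KL divergence from a standard normal."""
    def __init__(self, d_in: int, d_z: int):
        super().__init__()
        self.mu = nn.Linear(d_in, d_z)
        self.logvar = nn.Linear(d_in, d_z)

    def forward(self, h: torch.Tensor):
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterisation trick
        # KL(q(z|h) || N(0, I)); added to the task loss with a trade-off weight
        kl = 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1).sum(-1).mean()
        return z, kl
```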
no code implementations • *SEM (NAACL) 2022 • Luis Espinosa-Anke, Alexander Shvets, Alireza Mohammadshahi, James Henderson, Leo Wanner
Recognizing and categorizing lexical collocations in context is useful for language learning, dictionary compilation and downstream NLP.
1 code implementation • 22 May 2022 • Alireza Mohammadshahi, Vassilina Nikoulina, Alexandre Berard, Caroline Brun, James Henderson, Laurent Besacier
In this work, we assess the impact of compression methods on multilingual neural machine translation (MNMT) models across various language groups and with respect to gender and semantic biases, through extensive analysis of compressed models on different machine translation benchmarks, i.e., FLORES-101, MT-Gender, and DiBiMT.
2 code implementations • 3 Apr 2022 • Rabeeh Karimi Mahabadi, Luke Zettlemoyer, James Henderson, Marzieh Saeidi, Lambert Mathias, Veselin Stoyanov, Majid Yazdani
Current methods for few-shot fine-tuning of pretrained masked language models (PLMs) require carefully engineered prompts and verbalizers for each new task to convert examples into a cloze format that the PLM can score.
1 code implementation • Findings (ACL) 2022 • Lesly Miculicich, James Henderson
The state-of-the-art models for coreference resolution are based on independent pairwise decisions over mentions.
Ranked #9 on Coreference Resolution on OntoNotes
3 code implementations • 7 Mar 2022 • Florian Mai, Arnaud Pannatier, Fabio Fehr, Haolin Chen, Francois Marelli, Francois Fleuret, James Henderson
We find that existing architectures such as MLPMixer, which achieves token mixing through a static MLP applied to each feature independently, are too detached from the inductive biases required for natural language understanding.
no code implementations • 13 Oct 2021 • Florian Mai, James Henderson
We address this issue by extending their method to Bag-of-Vectors Autoencoders (BoV-AEs), which encode the text into a variable-size bag of vectors that grows with the size of the text, as in attention-based models.
1 code implementation • CoNLL (EMNLP) 2021 • Christos Theodoropoulos, James Henderson, Andrei C. Coman, Marie-Francine Moens
Though language model text embeddings have revolutionized NLP research, their ability to capture high-level semantic information, such as relations between entities in text, is limited.
Ranked #10 on Relation Extraction on Adverse Drug Events (ADE) Corpus
1 code implementation • ACL (IWPT) 2021 • James Barry, Alireza Mohammadshahi, Joachim Wagner, Jennifer Foster, James Henderson
The task involves parsing Enhanced UD graphs, an extension of the basic dependency trees designed to better represent semantic structure.
1 code implementation • ICLR 2021 • Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
Moreover, we show that our VIB model finds sentence representations that are more robust to biases in natural language inference datasets, and thereby obtains better generalization to out-of-domain datasets.
2 code implementations • NeurIPS 2021 • Rabeeh Karimi Mahabadi, James Henderson, Sebastian Ruder
In this work, we propose Compacter, a method for fine-tuning large-scale language models with a better trade-off between task performance and the number of trainable parameters than prior work.
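Compacter's key ingredient is parameterising adapter weights as a sum of Kronecker products of small matrices; a simplified sketch of such a layer (PyTorch assumed; the parameter sharing and low-rank factorisation of the full method are omitted):

```python
import torch
import torch.nn as nn

class KroneckerLinear(nn.Module):
    """Linear layer whose weight matrix is a sum of Kronecker products,
    so the number of trainable parameters is far smaller than d_in * d_out."""
    def __init__(self, d_in: int, d_out: int, n: int = 4):
        super().__init__()
        assert d_in % n == 0 and d_out % n == 0
        self.A = nn.Parameter(torch.randn(n, n, n) * 0.02)                  # n small (n x n) factors
        self.B = nn.Parameter(torch.randn(n, d_out // n, d_in // n) * 0.02)
        self.bias = nn.Parameter(torch.zeros(d_out))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # W has shape (d_out, d_in): kron of (n x n) with (d_out/n x d_in/n)
        W = sum(torch.kron(self.A[i], self.B[i]) for i in range(self.A.size(0)))
        return x @ W.T + self.bias
```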
1 code implementation • ACL 2021 • Rabeeh Karimi Mahabadi, Sebastian Ruder, Mostafa Dehghani, James Henderson
State-of-the-art parameter-efficient fine-tuning methods rely on introducing adapter modules between the layers of a pretrained language model.
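The adapter modules referred to here are small bottleneck networks inserted after each Transformer sub-layer while the pretrained weights stay frozen; a generic sketch (PyTorch assumed; this is the standard adapter, not the hypernetwork-generated variant the paper proposes):

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Standard bottleneck adapter: project down, apply a nonlinearity,
    project back up, and add a residual connection."""
    def __init__(self, d_model: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)
        self.up = nn.Linear(bottleneck, d_model)
        self.act = nn.GELU()

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # Only the adapter parameters are trained; the residual keeps the
        # pretrained representation intact when the adapter output is near zero.
        return hidden + self.up(self.act(self.down(hidden)))
```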
no code implementations • 15 Apr 2021 • Alireza Mohammadshahi, James Henderson
Recent models have shown that incorporating syntactic knowledge into the semantic role labelling (SRL) task leads to a significant improvement.
Ranked #6 on Semantic Role Labeling on CoNLL 2005 (using extra training data)
no code implementations • 1 Feb 2021 • Melika Behjati, James Henderson
Characters do not convey meaning, but sequences of characters do.
no code implementations • NAACL 2021 • Haozhou Wang, James Henderson, Paola Merlo
Generative adversarial networks (GANs) have succeeded in inducing cross-lingual word embeddings -- maps of matching words across languages -- without supervision.
1 code implementation • EMNLP 2020 • Florian Mai, Nikolaos Pappas, Ivan Montero, Noah A. Smith, James Henderson
Text autoencoders are commonly used for conditional generation tasks such as style transfer.
no code implementations • ACL 2020 • James Henderson
In this paper, we trace the history of neural networks applied to natural language understanding tasks, and identify key contributions which the nature of language has made to the development of neural network architectures.
1 code implementation • 29 Mar 2020 • Alireza Mohammadshahi, James Henderson
We propose the Recursive Non-autoregressive Graph-to-Graph Transformer architecture (RNGTr) for the iterative refinement of arbitrary graphs through the recursive application of a non-autoregressive Graph-to-Graph Transformer and apply it to syntactic dependency parsing.
Ranked #8 on Dependency Parsing on Penn Treebank
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Alireza Mohammadshahi, James Henderson
We propose the Graph2Graph Transformer architecture for conditioning on and predicting arbitrary graphs, and apply it to the challenging task of transition-based dependency parsing.
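One simple way to condition attention on an arbitrary input graph, in the spirit of this architecture, is to embed the relation label between each token pair and let it bias the attention scores; a single-head sketch (PyTorch assumed; shapes and details are illustrative, not the paper's exact formulation):

```python
import torch
import torch.nn as nn

class GraphConditionedAttention(nn.Module):
    """Self-attention whose scores are biased by embeddings of the labelled
    relations between token pairs in an input graph."""
    def __init__(self, d_model: int, num_relations: int):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        self.rel = nn.Embedding(num_relations, d_model)  # one vector per relation label

    def forward(self, x: torch.Tensor, relations: torch.LongTensor) -> torch.Tensor:
        # x: (seq, d_model); relations: (seq, seq) relation label for each token pair
        q, k, v = self.q(x), self.k(x), self.v(x)
        rel_k = self.rel(relations)                               # (seq, seq, d_model)
        scores = q @ k.T + torch.einsum("id,ijd->ij", q, rel_k)   # content + graph terms
        attn = torch.softmax(scores / x.size(-1) ** 0.5, dim=-1)
        return attn @ v
```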
no code implementations • 25 Sep 2019 • Rabeeh Karimi Mahabadi*, Florian Mai*, James Henderson
Large datasets on natural language inference are a potentially valuable resource for inducing semantic representations of natural language sentences.
2 code implementations • ACL 2020 • Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
We experiment on large-scale natural language inference and fact verification benchmarks, evaluating on out-of-domain datasets that are specifically designed to assess the robustness of models against known biases in the training data.
no code implementations • COLING (CRAC) 2020 • Lesly Miculicich, James Henderson
Learning to detect entity mentions without using syntactic information can be useful for integration and joint optimization with other tasks.
no code implementations • 29 May 2019 • Navid Rekabsaz, Nikolaos Pappas, James Henderson, Banriskhem K. Khonglah, Srikanth Madikeri
In this study, we propose a multilingual neural language model architecture, trained jointly on the domain-specific data of several low-resource languages.
1 code implementation • 14 May 2019 • Nikolaos Pappas, James Henderson
Many tasks, including language generation, benefit from learning the structure of the output space, particularly when the space of output labels is large and the data is sparse.
Ranked #12 on Language Modelling on Penn Treebank (Word Level)
no code implementations • IJCNLP 2019 • Haozhou Wang, James Henderson, Paola Merlo
Distributed representations of words which map each word to a continuous vector have proven useful in capturing important linguistic information not only in a single language but also across different languages.
no code implementations • 13 Dec 2018 • Navid Rekabsaz, Robert West, James Henderson, Allan Hanbury
The common approach to measuring such biases with a corpus is to calculate the similarities between the embedding vector of a word (like nurse) and the vectors of the representative words of the concepts of interest (such as genders).
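A minimal sketch of that measurement scheme (NumPy; the word lists and the exact aggregation are illustrative):

```python
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def gender_bias(word_vec: np.ndarray, female_vecs: np.ndarray, male_vecs: np.ndarray) -> float:
    """Bias score of a word: cosine similarity to the centroid of
    female-representative word vectors minus the similarity to the
    centroid of male-representative word vectors."""
    female_dir = female_vecs.mean(axis=0)
    male_dir = male_vecs.mean(axis=0)
    return cosine(word_vec, female_dir) - cosine(word_vec, male_dir)
```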
1 code implementation • TACL 2018 • Xiao Pu, Nikolaos Pappas, James Henderson, Andrei Popescu-Belis
We show that the concatenation of these vectors, and the use of a sense selection mechanism based on the weighted average of sense vectors, outperforms several baselines including sense-aware ones.
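A minimal sketch of such a sense selection mechanism: a softmax over sense-context similarities weighting the average of a word's sense vectors (NumPy; the scoring function is illustrative):

```python
import numpy as np

def soft_sense_vector(sense_vecs: np.ndarray, context_vec: np.ndarray) -> np.ndarray:
    """Weighted average of a word's sense vectors, with weights given by a
    softmax over the similarity of each sense vector to the context vector."""
    scores = sense_vecs @ context_vec        # (num_senses,)
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights @ sense_vecs              # (dim,) disambiguated vector
```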
2 code implementations • EMNLP 2018 • Lesly Miculicich, Dhananjay Ram, Nikolaos Pappas, James Henderson
Neural Machine Translation (NMT) can be improved by including document-level contextual information.
1 code implementation • WS 2018 • Nikolaos Pappas, Lesly Miculicich Werlen, James Henderson
The model is a generalized form of weight tying which shares parameters but allows learning a more flexible relationship with input word embeddings and allows the effective capacity of the output layer to be controlled.
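A sketch of the idea: reuse the input embedding matrix in the output layer, but insert learned transforms between it and the decoder state rather than tying the weights directly (PyTorch assumed; dimensions are illustrative):

```python
import torch
import torch.nn as nn

class GeneralizedTiedOutput(nn.Module):
    """Output layer that shares the input embedding matrix but learns a
    joint space relating decoder states and word embeddings."""
    def __init__(self, embedding: nn.Embedding, d_hidden: int, d_joint: int):
        super().__init__()
        self.embedding = embedding                    # shared with the input layer
        self.proj_h = nn.Linear(d_hidden, d_joint)    # decoder state -> joint space
        self.proj_e = nn.Linear(embedding.embedding_dim, d_joint)  # embeddings -> joint space

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        label_emb = self.proj_e(self.embedding.weight)    # (V, d_joint)
        return self.proj_h(hidden) @ label_emb.T          # logits over the vocabulary
```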
1 code implementation • TACL 2019 • Nikolaos Pappas, James Henderson
This forces their parametrization to be dependent on the label set size, and, hence, they are unable to scale to large label sets and generalize to unseen ones.
no code implementations • ICLR 2018 • James Henderson
Entailment vectors are a principled way to encode in a vector what information is known and what is unknown.
no code implementations • 6 Oct 2017 • James Henderson
Lexical entailment, such as hyponymy, is a fundamental issue in the semantics of natural language.
no code implementations • 30 Sep 2017 • Diana Nicoleta Popa, James Henderson
We demonstrate the usefulness of this representation by training bag-of-vector embeddings of dependency graphs and evaluating them on unsupervised semantic induction for the Semantic Textual Similarity and Natural Language Inference tasks.
no code implementations • CoNLL 2017 • Christophe Moor, Paola Merlo, James Henderson, Haozhou Wang
This paper describes the University of Geneva's submission to the CoNLL 2017 shared task Multilingual Parsing from Raw Text to Universal Dependencies (listed as the CLCL (Geneva) entry).
no code implementations • ACL 2016 • James Henderson, Diana Nicoleta Popa
Distributional semantics creates vector-space representations that capture many forms of semantic similarity, but their relation to semantic entailment has been less clear.
no code implementations • 4 Mar 2016 • Nikhil Garg, James Henderson
We propose a Bayesian model of unsupervised semantic role induction in multiple languages, and use it to explore the usefulness of parallel corpora for this task.
no code implementations • WS 2014 • Helen Hastie, Marie-Aude Aufaure, Panos Alexopoulos, Hugues Bouchard, Catherine Breslin, Heriberto Cuayáhuitl, Nina Dethlefs, Milica Gašić, James Henderson, Oliver Lemon, Xingkun Liu, Peter Mika, Nesrine Ben Mustapha, Tim Potter, Verena Rieser, Blaise Thomson, Pirros Tsiakoulis, Yves Vanrompay, Boris Villazon-Terrazas, Majid Yazdani, Steve Young, Yanchao Yu
no code implementations • WS 2013 • Helen Hastie, Marie-Aude Aufaure, Panos Alexopoulos, Heriberto Cuayáhuitl, Nina Dethlefs, Milica Gašić, James Henderson, Oliver Lemon, Xingkun Liu, Peter Mika, Nesrine Ben Mustapha, Verena Rieser, Blaise Thomson, Pirros Tsiakoulis, Yves Vanrompay
no code implementations • 16 Apr 2013 • Joel Lang, James Henderson
Previous work has shown the effectiveness of random walk hitting times as a measure of dissimilarity in a variety of graph-based learning problems such as collaborative filtering, query suggestion or finding paraphrases.
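As a reminder of the quantity involved, hitting times can be computed by solving a linear system over the walk's transition matrix; a textbook sketch (NumPy; assumes the target is reachable from every node so the system is solvable):

```python
import numpy as np

def hitting_times(P: np.ndarray, target: int) -> np.ndarray:
    """Expected number of steps for a random walk with row-stochastic
    transition matrix P to first reach `target` from every node."""
    n = P.shape[0]
    idx = [i for i in range(n) if i != target]
    Q = P[np.ix_(idx, idx)]          # walk restricted to non-target nodes
    h_rest = np.linalg.solve(np.eye(n - 1) - Q, np.ones(n - 1))
    h = np.zeros(n)
    h[idx] = h_rest                  # h[target] stays 0
    return h
```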