1 code implementation • ACL 2022 • Rabeeh Karimi Mahabadi, Luke Zettlemoyer, James Henderson, Lambert Mathias, Marzieh Saeidi, Veselin Stoyanov, Majid Yazdani
Current methods for few-shot fine-tuning of pretrained masked language models (PLMs) require carefully engineered prompts and verbalizers for each new task to convert examples into a cloze format that the PLM can score.
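For readers unfamiliar with the prompt-and-verbalizer setup this paper moves away from, a minimal sketch is below: a handcrafted template turns the example into a cloze question, and a verbalizer maps label words to classes, which the masked LM then scores. The model name, template, and label words are illustrative choices, not the paper's.

```python
# Minimal sketch of prompt-based scoring with a masked LM (the prior approach
# described above, not this paper's prompt-free method). Model name, prompt
# template, and verbalizer words are illustrative assumptions.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()

review = "The movie was painfully slow and dull."
# Handcrafted prompt turns the example into a cloze question.
prompt = f"{review} It was {tokenizer.mask_token}."
# Handcrafted verbalizer maps label words to class labels.
verbalizer = {"great": "positive", "terrible": "negative"}

inputs = tokenizer(prompt, return_tensors="pt")
mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]

with torch.no_grad():
    logits = model(**inputs).logits[0, mask_pos.item()]

scores = {
    label: logits[tokenizer.convert_tokens_to_ids(word)].item()
    for word, label in verbalizer.items()
}
print(scores)  # higher score -> predicted class
```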
1 code implementation • EMNLP (newsum) 2021 • Andreas Marfurt, James Henderson
Abstractive summarization models heavily rely on copy mechanisms, such as the pointer network or attention, to achieve good performance, measured by textual overlap with reference summaries.
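As a rough illustration of the copy mechanisms referred to here, the sketch below mixes a generator's vocabulary distribution with a copy distribution over source tokens, in the style of a pointer-generator; the random tensors, shapes, and gating are illustrative stand-ins, not this paper's model.

```python
# Toy pointer-generator-style copy mechanism: the final word distribution mixes
# vocabulary generation with copying source tokens via attention.
# Shapes and random inputs are illustrative only.
import torch
import torch.nn.functional as F

vocab_size, src_len, hidden = 50, 7, 16
torch.manual_seed(0)

decoder_state = torch.randn(hidden)            # current decoder hidden state
encoder_states = torch.randn(src_len, hidden)  # one vector per source token
src_token_ids = torch.randint(0, vocab_size, (src_len,))

# Attention over source tokens doubles as the copy distribution.
attn = F.softmax(encoder_states @ decoder_state, dim=0)   # [src_len]
gen_dist = F.softmax(torch.randn(vocab_size), dim=0)      # stand-in for the generator softmax
p_gen = torch.sigmoid(torch.randn(()))                    # gate: generate vs. copy

final_dist = p_gen * gen_dist
final_dist = final_dist.index_add(0, src_token_ids, (1 - p_gen) * attn)
print(final_dist.sum())  # ~1.0: a valid mixture distribution
```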
no code implementations • 23 May 2022 • Luis Espinosa-Anke, Alexander Shvets, Alireza Mohammadshahi, James Henderson, Leo Wanner
Recognizing and categorizing lexical collocations in context is useful for language learning, dictionary compilation and downstream NLP.
no code implementations • 22 May 2022 • Alireza Mohammadshahi, Vassilina Nikoulina, Alexandre Berard, Caroline Brun, James Henderson, Laurent Besacier
Recently, very large pre-trained models achieve state-of-the-art results in various natural language processing (NLP) tasks, but their size makes it more challenging to apply them in resource-constrained environments.
no code implementations • Findings (ACL) 2022 • Lesly Miculicich, James Henderson
State-of-the-art models for coreference resolution are based on independent pairwise decisions over mentions.
1 code implementation • 7 Mar 2022 • Florian Mai, Arnaud Pannatier, Fabio Fehr, Haolin Chen, Francois Marelli, Francois Fleuret, James Henderson
We find that existing architectures such as MLPMixer, which achieves token mixing through a static MLP applied to each feature independently, are too detached from the inductive biases required for natural language understanding.
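A minimal sketch of the MLPMixer-style token mixing referred to here is below: a static MLP, whose weights are tied to a fixed sequence length, mixes information across tokens independently for each feature channel. The module and sizes are illustrative, and this is the baseline being critiqued, not the paper's own architecture.

```python
# Sketch of MLPMixer-style token mixing: a *static* MLP applied across the
# token dimension, independently per feature channel. Sizes are illustrative.
import torch
import torch.nn as nn

class TokenMixingMLP(nn.Module):
    def __init__(self, num_tokens: int, hidden: int):
        super().__init__()
        # Weights depend on the (fixed) sequence length, hence "static".
        self.mlp = nn.Sequential(
            nn.Linear(num_tokens, hidden),
            nn.GELU(),
            nn.Linear(hidden, num_tokens),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [batch, num_tokens, features] -> mix along the token axis
        return self.mlp(x.transpose(1, 2)).transpose(1, 2)

x = torch.randn(2, 10, 32)              # batch of 2, 10 tokens, 32 features
print(TokenMixingMLP(10, 64)(x).shape)  # torch.Size([2, 10, 32])
```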
no code implementations • 13 Oct 2021 • Florian Mai, James Henderson
We address this issue by extending their method to Bag-of-Vectors Autoencoders (BoV-AEs), which encode the text into a variable-size bag of vectors that grows with the size of the text, as in attention-based models.
1 code implementation • CoNLL (EMNLP) 2021 • Christos Theodoropoulos, James Henderson, Andrei C. Coman, Marie-Francine Moens
Though language model text embeddings have revolutionized NLP research, their ability to capture high-level semantic information, such as relations between entities in text, is limited.
1 code implementation • ACL (IWPT) 2021 • James Barry, Alireza Mohammadshahi, Joachim Wagner, Jennifer Foster, James Henderson
The task involves parsing Enhanced UD graphs, which extend the basic dependency trees to better represent semantic structure.
1 code implementation • ICLR 2021 • Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
Moreover, we show that our VIB model finds sentence representations that are more robust to biases in natural language inference datasets, and thereby obtains better generalization to out-of-domain datasets.
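As a rough sketch of a variational information bottleneck (VIB) objective of the kind described here, the snippet below compresses a sentence encoding into a stochastic latent code and trades a task loss against a KL penalty; the layer sizes, beta, and classifier are illustrative assumptions, not the paper's exact model.

```python
# Minimal VIB sketch: compress a sentence encoding into a stochastic latent z,
# trading task loss against a KL penalty. All sizes/values are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

enc_dim, z_dim, num_classes, beta = 64, 16, 3, 1e-3
to_mu, to_logvar = nn.Linear(enc_dim, z_dim), nn.Linear(enc_dim, z_dim)
classifier = nn.Linear(z_dim, num_classes)

h = torch.randn(8, enc_dim)                  # stand-in sentence encodings (batch of 8)
y = torch.randint(0, num_classes, (8,))      # stand-in labels

mu, logvar = to_mu(h), to_logvar(h)
z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterization trick

task_loss = F.cross_entropy(classifier(z), y)
kl = 0.5 * (torch.exp(logvar) + mu**2 - 1.0 - logvar).sum(dim=1).mean()
loss = task_loss + beta * kl                 # bottleneck: keep only label-relevant information
print(float(loss))
```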
1 code implementation • NeurIPS 2021 • Rabeeh Karimi Mahabadi, James Henderson, Sebastian Ruder
In this work, we propose Compacter, a method for fine-tuning large-scale language models with a better trade-off between task performance and the number of trainable parameters than prior work.
1 code implementation • ACL 2021 • Rabeeh Karimi Mahabadi, Sebastian Ruder, Mostafa Dehghani, James Henderson
State-of-the-art parameter-efficient fine-tuning methods rely on introducing adapter modules between the layers of a pretrained language model.
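A minimal sketch of the kind of bottleneck adapter these methods insert between frozen transformer layers is below; the sizes and single ReLU bottleneck are illustrative, and Compacter-style variants replace the dense projections with far smaller parameterizations.

```python
# Sketch of a standard bottleneck adapter: a small down/up projection with a
# residual connection, applied to the hidden states of a frozen PLM layer.
# Dimensions are illustrative only.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    def __init__(self, dim: int, bottleneck: int = 16):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        return hidden + self.up(torch.relu(self.down(hidden)))  # residual update

hidden = torch.randn(2, 10, 768)     # stand-in for frozen PLM hidden states
print(Adapter(768)(hidden).shape)    # torch.Size([2, 10, 768])
```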
no code implementations • 15 Apr 2021 • Alireza Mohammadshahi, James Henderson
Our architecture is general and can be applied to encode any graph information for a desired downstream task.
Ranked #3 on Semantic Role Labeling on CoNLL 2005
no code implementations • 1 Feb 2021 • Melika Behjati, James Henderson
Characters do not convey meaning, but sequences of characters do.
no code implementations • NAACL 2021 • Haozhou Wang, James Henderson, Paola Merlo
Generative adversarial networks (GANs) have succeeded in inducing cross-lingual word embeddings -- maps of matching words across languages -- without supervision.
Bilingual Lexicon Induction • Cross-Lingual Word Embeddings • +1
1 code implementation • EMNLP 2020 • Florian Mai, Nikolaos Pappas, Ivan Montero, Noah A. Smith, James Henderson
Text autoencoders are commonly used for conditional generation tasks such as style transfer.
no code implementations • ACL 2020 • James Henderson
In this paper, we trace the history of neural networks applied to natural language understanding tasks, and identify key contributions which the nature of language has made to the development of neural network architectures.
1 code implementation • 29 Mar 2020 • Alireza Mohammadshahi, James Henderson
We propose the Recursive Non-autoregressive Graph-to-Graph Transformer architecture (RNGTr) for the iterative refinement of arbitrary graphs through the recursive application of a non-autoregressive Graph-to-Graph Transformer and apply it to syntactic dependency parsing.
Ranked #7 on Dependency Parsing on Penn Treebank
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Alireza Mohammadshahi, James Henderson
We propose the Graph2Graph Transformer architecture for conditioning on and predicting arbitrary graphs, and apply it to the challenging task of transition-based dependency parsing.
no code implementations • 25 Sep 2019 • Rabeeh Karimi Mahabadi*, Florian Mai*, James Henderson
Large datasets on natural language inference are a potentially valuable resource for inducing semantic representations of natural language sentences.
2 code implementations • ACL 2020 • Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
We experiment on large-scale natural language inference and fact verification benchmarks, evaluating on out-of-domain datasets that are specifically designed to assess the robustness of models against known biases in the training data.
no code implementations • COLING (CRAC) 2020 • Lesly Miculicich, James Henderson
Learning to detect entity mentions without using syntactic information can be useful for integration and joint optimization with other tasks.
no code implementations • 29 May 2019 • Navid Rekabsaz, Nikolaos Pappas, James Henderson, Banriskhem K. Khonglah, Srikanth Madikeri
In this study, we propose a multilingual neural language model architecture, trained jointly on the domain-specific data of several low-resource languages.
1 code implementation • 14 May 2019 • Nikolaos Pappas, James Henderson
Many tasks, including language generation, benefit from learning the structure of the output space, particularly when the space of output labels is large and the data is sparse.
Ranked #9 on Language Modelling on WikiText-2
no code implementations • IJCNLP 2019 • Haozhou Wang, James Henderson, Paola Merlo
Distributed representations of words which map each word to a continuous vector have proven useful in capturing important linguistic information not only in a single language but also across different languages.
no code implementations • 13 Dec 2018 • Navid Rekabsaz, Robert West, James Henderson, Allan Hanbury
The common approach to measuring such biases with a corpus is to calculate the similarities between the embedding vector of a word (like nurse) and the vectors of the representative words of the concepts of interest (such as genders).
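A toy version of that similarity-based measure is sketched below, with random vectors standing in for real pretrained embeddings; the word lists and the cosine/averaging choices are illustrative assumptions.

```python
# Sketch of the described bias measure: compare a word's embedding (e.g. "nurse")
# to representative words for each concept (e.g. genders) via cosine similarity.
# Random vectors stand in for real pretrained embeddings.
import numpy as np

rng = np.random.default_rng(0)
emb = {w: rng.normal(size=50) for w in ["nurse", "she", "woman", "he", "man"]}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def concept_similarity(word, concept_words):
    # average similarity to the concept's representative words
    return np.mean([cosine(emb[word], emb[w]) for w in concept_words])

bias = concept_similarity("nurse", ["she", "woman"]) - concept_similarity("nurse", ["he", "man"])
print(bias)  # > 0 would indicate "nurse" sits closer to the female concept
```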
1 code implementation • TACL 2018 • Xiao Pu, Nikolaos Pappas, James Henderson, Andrei Popescu-Belis
We show that the concatenation of these vectors, and the use of a sense selection mechanism based on the weighted average of sense vectors, outperforms several baselines including sense-aware ones.
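To make the idea of a weighted average of sense vectors concrete, the sketch below scores each candidate sense against a context vector and averages the sense vectors under the resulting softmax weights; the dot-product scoring and dimensions are illustrative, not the paper's exact mechanism.

```python
# Sketch of a sense-selection mechanism: weight a word's sense vectors by their
# match with the context and take the weighted average. Random vectors and
# dot-product scoring are illustrative stand-ins.
import numpy as np

rng = np.random.default_rng(0)
sense_vectors = rng.normal(size=(3, 50))   # 3 candidate senses of one word
context_vector = rng.normal(size=50)       # encoding of the surrounding context

scores = sense_vectors @ context_vector
weights = np.exp(scores - scores.max())
weights /= weights.sum()                   # softmax over senses

word_vector = weights @ sense_vectors      # weighted average of sense vectors
print(weights.sum(), word_vector.shape)    # 1.0 and (50,)
```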
2 code implementations • EMNLP 2018 • Lesly Miculicich, Dhananjay Ram, Nikolaos Pappas, James Henderson
Neural Machine Translation (NMT) can be improved by including document-level contextual information.
1 code implementation • WS 2018 • Nikolaos Pappas, Lesly Miculicich Werlen, James Henderson
The model is a generalized form of weight tying which shares parameters but allows learning a more flexible relationship with input word embeddings and allows the effective capacity of the output layer to be controlled.
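The sketch below contrasts strict weight tying, where output logits reuse the input embedding matrix directly, with a generalized form that inserts a learned mapping between hidden states and the shared embeddings; the single linear mapping and sizes are illustrative assumptions, not the paper's exact parameterization.

```python
# Sketch: strict weight tying vs. a generalized, more flexible variant that
# still shares the input embedding matrix but learns how hidden states map to it.
# Sizes and the single linear mapping are illustrative.
import torch
import torch.nn as nn

vocab, emb_dim, hid_dim = 1000, 64, 64
embedding = nn.Embedding(vocab, emb_dim)

hidden = torch.randn(2, hid_dim)                      # decoder hidden states

# Strict weight tying: logits = hidden @ E^T
tied_logits = hidden @ embedding.weight.t()

# Generalized tying: shares E, but learns the relationship to input embeddings
joint = nn.Linear(hid_dim, emb_dim, bias=False)
flexible_logits = joint(hidden) @ embedding.weight.t()

print(tied_logits.shape, flexible_logits.shape)       # both torch.Size([2, 1000])
```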
1 code implementation • TACL 2019 • Nikolaos Pappas, James Henderson
This forces their parametrization to be dependent on the label set size, and, hence, they are unable to scale to large label sets and generalize to unseen ones.
no code implementations • ICLR 2018 • James Henderson
Entailment vectors are a principled way to encode in a vector what information is known and what is unknown.
no code implementations • 6 Oct 2017 • James Henderson
Lexical entailment, such as hyponymy, is a fundamental issue in the semantics of natural language.
no code implementations • 30 Sep 2017 • Diana Nicoleta Popa, James Henderson
We demonstrate the usefulness of this representation by training bag-of-vector embeddings of dependency graphs and evaluating them on unsupervised semantic induction for the Semantic Textual Similarity and Natural Language Inference tasks.
no code implementations • CoNLL 2017 • Christophe Moor, Paola Merlo, James Henderson, Haozhou Wang
This paper describes the University of Geneva's submission to the CoNLL 2017 shared task Multilingual Parsing from Raw Text to Universal Dependencies (listed as the CLCL (Geneva) entry).
no code implementations • ACL 2016 • James Henderson, Diana Nicoleta Popa
Distributional semantics creates vector-space representations that capture many forms of semantic similarity, but their relation to semantic entailment has been less clear.
no code implementations • 4 Mar 2016 • Nikhil Garg, James Henderson
We propose a Bayesian model of unsupervised semantic role induction in multiple languages, and use it to explore the usefulness of parallel corpora for this task.
no code implementations • WS 2014 • Helen Hastie, Marie-Aude Aufaure, Panos Alexopoulos, Hugues Bouchard, Catherine Breslin, Heriberto Cuayáhuitl, Nina Dethlefs, Milica Gašić, James Henderson, Oliver Lemon, Xingkun Liu, Peter Mika, Nesrine Ben Mustapha, Tim Potter, Verena Rieser, Blaise Thomson, Pirros Tsiakoulis, Yves Vanrompay, Boris Villazon-Terrazas, Majid Yazdani, Steve Young, Yanchao Yu
no code implementations • WS 2013 • Helen Hastie, Marie-Aude Aufaure, Panos Alexopoulos, Heriberto Cuayáhuitl, Nina Dethlefs, Milica Gašić, James Henderson, Oliver Lemon, Xingkun Liu, Peter Mika, Nesrine Ben Mustapha, Verena Rieser, Blaise Thomson, Pirros Tsiakoulis, Yves Vanrompay
no code implementations • 16 Apr 2013 • Joel Lang, James Henderson
Previous work has shown the effectiveness of random walk hitting times as a measure of dissimilarity in a variety of graph-based learning problems such as collaborative filtering, query suggestion or finding paraphrases.
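For a concrete sense of hitting times as a dissimilarity measure, the toy sketch below computes the expected number of random-walk steps to first reach a target node by solving a small linear system; the graph is an arbitrary illustrative example, not data from the paper.

```python
# Toy sketch of random-walk hitting times: expected number of steps for a walk
# to first reach a target node, obtained by solving a linear system over the
# non-target nodes. The 4-node graph is purely illustrative.
import numpy as np

# Adjacency matrix of a small undirected graph (4 nodes)
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
P = A / A.sum(axis=1, keepdims=True)       # random-walk transition matrix

target = 3
others = [i for i in range(len(A)) if i != target]
Q = P[np.ix_(others, others)]              # transitions among non-target nodes

# h = 1 + Q h  =>  (I - Q) h = 1
h = np.linalg.solve(np.eye(len(others)) - Q, np.ones(len(others)))
print(dict(zip(others, h)))                # expected steps from each node to node 3
```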