Search Results for author: James Henderson

Found 58 papers, 24 papers with code

Sentence-level Planning for Especially Abstractive Summarization

1 code implementation EMNLP (newsum) 2021 Andreas Marfurt, James Henderson

Abstractive summarization models heavily rely on copy mechanisms, such as the pointer network or attention, to achieve good performance, measured by textual overlap with reference summaries.

Abstractive Text Summarization

Prompt-free and Efficient Few-shot Learning with Language Models

1 code implementation ACL 2022 Rabeeh Karimi Mahabadi, Luke Zettlemoyer, James Henderson, Lambert Mathias, Marzieh Saeidi, Veselin Stoyanov, Majid Yazdani

Current methods for few-shot fine-tuning of pretrained masked language models (PLMs) require carefully engineered prompts and verbalizers for each new task to convert examples into a cloze-format that the PLM can score.

Few-Shot Learning

Nonparametric Variational Regularisation of Pretrained Transformers

no code implementations 1 Dec 2023 Fabio Fehr, James Henderson

We extend the NVIB framework to replace all types of attention functions in Transformers, and show that existing pretrained Transformers can be reinterpreted as Nonparametric Variational (NV) models using a proposed identity initialisation.

Transformers as Graph-to-Graph Models

1 code implementation 27 Oct 2023 James Henderson, Alireza Mohammadshahi, Andrei C. Coman, Lesly Miculicich

We argue that Transformers are essentially graph-to-graph models, with sequences just being a special case.

Learning to Abstract with Nonparametric Variational Information Bottleneck

no code implementations 26 Oct 2023 Melika Behjati, Fabio Fehr, James Henderson

Finally, we show that NVIB compression results in a model which is more robust to adversarial perturbations.

TESS: Text-to-Text Self-Conditioned Simplex Diffusion

1 code implementation 15 May 2023 Rabeeh Karimi Mahabadi, Jaesung Tae, Hamish Ivison, James Henderson, Iz Beltagy, Matthew E. Peters, Arman Cohan

Diffusion models have emerged as a powerful paradigm for generation, obtaining strong performance in various domains with continuous-valued inputs.

Natural Language Understanding Paraphrase Generation +3

SMaLL-100: Introducing Shallow Multilingual Machine Translation Model for Low-Resource Languages

3 code implementations 20 Oct 2022 Alireza Mohammadshahi, Vassilina Nikoulina, Alexandre Berard, Caroline Brun, James Henderson, Laurent Besacier

In recent years, multilingual machine translation models have achieved promising performance on low-resource language pairs by sharing information between similar languages, thus enabling zero-shot translation.

Machine Translation Translation

A Variational AutoEncoder for Transformers with Nonparametric Variational Information Bottleneck

no code implementations 27 Jul 2022 James Henderson, Fabio Fehr

We propose a VAE for Transformers by developing a variational information bottleneck regulariser for Transformer embeddings.

What Do Compressed Multilingual Machine Translation Models Forget?

1 code implementation 22 May 2022 Alireza Mohammadshahi, Vassilina Nikoulina, Alexandre Berard, Caroline Brun, James Henderson, Laurent Besacier

In this work, we assess the impact of compression methods on Multilingual Neural Machine Translation models (MNMT) for various language groups, gender, and semantic biases by extensive analysis of compressed models on different machine translation benchmarks, i.e. FLORES-101, MT-Gender, and DiBiMT.

Machine Translation Memorization +1

PERFECT: Prompt-free and Efficient Few-shot Learning with Language Models

1 code implementation 3 Apr 2022 Rabeeh Karimi Mahabadi, Luke Zettlemoyer, James Henderson, Marzieh Saeidi, Lambert Mathias, Veselin Stoyanov, Majid Yazdani

Current methods for few-shot fine-tuning of pretrained masked language models (PLMs) require carefully engineered prompts and verbalizers for each new task to convert examples into a cloze-format that the PLM can score.

Few-Shot Learning

Graph Refinement for Coreference Resolution

1 code implementation Findings (ACL) 2022 Lesly Miculicich, James Henderson

The state-of-the-art models for coreference resolution are based on independent mention pair-wise decisions.


HyperMixer: An MLP-based Low Cost Alternative to Transformers

2 code implementations 7 Mar 2022 Florian Mai, Arnaud Pannatier, Fabio Fehr, Haolin Chen, Francois Marelli, Francois Fleuret, James Henderson

We find that existing architectures such as MLPMixer, which achieves token mixing through a static MLP applied to each feature independently, are too detached from the inductive biases required for natural language understanding.

Natural Language Understanding

Bag-of-Vectors Autoencoders for Unsupervised Conditional Text Generation

no code implementations 13 Oct 2021 Florian Mai, James Henderson

We address this issue by extending their method to Bag-of-Vectors Autoencoders (BoV-AEs), which encode the text into a variable-size bag of vectors that grows with the size of the text, as in attention-based models.

Conditional Text Generation Sentence Summarization

Imposing Relation Structure in Language-Model Embeddings Using Contrastive Learning

1 code implementation CoNLL (EMNLP) 2021 Christos Theodoropoulos, James Henderson, Andrei C. Coman, Marie-Francine Moens

Though language model text embeddings have revolutionized NLP research, their ability to capture high-level semantic information, such as relations between entities in text, is limited.

Contrastive Learning Language Modelling +5

The DCU-EPFL Enhanced Dependency Parser at the IWPT 2021 Shared Task

1 code implementation ACL (IWPT) 2021 James Barry, Alireza Mohammadshahi, Joachim Wagner, Jennifer Foster, James Henderson

The task involves parsing Enhanced UD graphs, which are an extension of the basic dependency trees designed to be more facilitative towards representing semantic structure.


Variational Information Bottleneck for Effective Low-Resource Fine-Tuning

1 code implementation ICLR 2021 Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson

Moreover, we show that our VIB model finds sentence representations that are more robust to biases in natural language inference datasets, and thereby obtains better generalization to out-of-domain datasets.

Natural Language Inference Transfer Learning

Compacter: Efficient Low-Rank Hypercomplex Adapter Layers

2 code implementations NeurIPS 2021 Rabeeh Karimi Mahabadi, James Henderson, Sebastian Ruder

In this work, we propose Compacter, a method for fine-tuning large-scale language models with a better trade-off between task performance and the number of trainable parameters than prior work.

Syntax-Aware Graph-to-Graph Transformer for Semantic Role Labelling

no code implementations 15 Apr 2021 Alireza Mohammadshahi, James Henderson

Recent models have shown that incorporating syntactic knowledge into the semantic role labelling (SRL) task leads to a significant improvement.

Ranked #6 on Semantic Role Labeling on CoNLL 2005 (using extra training data)

Semantic Role Labeling

Multi-Adversarial Learning for Cross-Lingual Word Embeddings

no code implementations NAACL 2021 Haozhou Wang, James Henderson, Paola Merlo

Generative adversarial networks (GANs) have succeeded in inducing cross-lingual word embeddings -- maps of matching words across languages -- without supervision.

Bilingual Lexicon Induction Cross-Lingual Word Embeddings +1

The Unstoppable Rise of Computational Linguistics in Deep Learning

no code implementations ACL 2020 James Henderson

In this paper, we trace the history of neural networks applied to natural language understanding tasks, and identify key contributions which the nature of language has made to the development of neural network architectures.

Natural Language Understanding

Recursive Non-Autoregressive Graph-to-Graph Transformer for Dependency Parsing with Iterative Refinement

1 code implementation 29 Mar 2020 Alireza Mohammadshahi, James Henderson

We propose the Recursive Non-autoregressive Graph-to-Graph Transformer architecture (RNGTr) for the iterative refinement of arbitrary graphs through the recursive application of a non-autoregressive Graph-to-Graph Transformer and apply it to syntactic dependency parsing.

Dependency Parsing

Graph-to-Graph Transformer for Transition-based Dependency Parsing

1 code implementation Findings of the Association for Computational Linguistics 2020 Alireza Mohammadshahi, James Henderson

We propose the Graph2Graph Transformer architecture for conditioning on and predicting arbitrary graphs, and apply it to the challenging task of transition-based dependency parsing.

Structured Prediction Transition-Based Dependency Parsing

Learning Entailment-Based Sentence Embeddings from Natural Language Inference

no code implementations 25 Sep 2019 Rabeeh Karimi Mahabadi*, Florian Mai*, James Henderson

Large datasets on natural language inference are a potentially valuable resource for inducing semantic representations of natural language sentences.

Inductive Bias Natural Language Inference +1

End-to-End Bias Mitigation by Modelling Biases in Corpora

2 code implementations ACL 2020 Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson

We experiment on large-scale natural language inference and fact verification benchmarks, evaluating on out-of-domain datasets that are specifically designed to assess the robustness of models against known biases in the training data.

Fact Verification Natural Language Inference +1

Partially-supervised Mention Detection

no code implementations COLING (CRAC) 2020 Lesly Miculicich, James Henderson

Learning to detect entity mentions without using syntactic information can be useful for integration and joint optimization with other tasks.


Deep Residual Output Layers for Neural Language Generation

1 code implementation 14 May 2019 Nikolaos Pappas, James Henderson

Many tasks, including language generation, benefit from learning the structure of the output space, particularly when the space of output labels is large and the data is sparse.

Language Modelling Machine Translation +1

Weakly-Supervised Concept-based Adversarial Learning for Cross-lingual Word Embeddings

no code implementations IJCNLP 2019 Haozhou Wang, James Henderson, Paola Merlo

Distributed representations of words which map each word to a continuous vector have proven useful in capturing important linguistic information not only in a single language but also across different languages.

Cross-Lingual Word Embeddings Word Embeddings

Measuring Societal Biases from Text Corpora with Smoothed First-Order Co-occurrence

no code implementations 13 Dec 2018 Navid Rekabsaz, Robert West, James Henderson, Allan Hanbury

The common approach to measuring such biases using a corpus is by calculating the similarities between the embedding vector of a word (like nurse) and the vectors of the representative words of the concepts of interest (such as genders).

Word Embeddings

Integrating Weakly Supervised Word Sense Disambiguation into Neural Machine Translation

1 code implementation TACL 2018 Xiao Pu, Nikolaos Pappas, James Henderson, Andrei Popescu-Belis

We show that the concatenation of these vectors, and the use of a sense selection mechanism based on the weighted average of sense vectors, outperforms several baselines including sense-aware ones.

Clustering Machine Translation +3

Beyond Weight Tying: Learning Joint Input-Output Embeddings for Neural Machine Translation

1 code implementation WS 2018 Nikolaos Pappas, Lesly Miculicich Werlen, James Henderson

The model is a generalized form of weight tying which shares parameters but allows learning a more flexible relationship with input word embeddings and allows the effective capacity of the output layer to be controlled.

Machine Translation Translation +1

GILE: A Generalized Input-Label Embedding for Text Classification

1 code implementation TACL 2019 Nikolaos Pappas, James Henderson

This forces their parametrization to be dependent on the label set size, and, hence, they are unable to scale to large label sets and generalize to unseen ones.

General Classification Multi-Task Learning +3

Unsupervised Learning of Entailment-Vector Word Embeddings

no code implementations ICLR 2018 James Henderson

Entailment vectors are a principled way to encode in a vector what information is known and what is unknown.

Word Embeddings

Bag-of-Vector Embeddings of Dependency Graphs for Semantic Induction

no code implementations 30 Sep 2017 Diana Nicoleta Popa, James Henderson

We demonstrate the usefulness of this representation by training bag-of-vector embeddings of dependency graphs and evaluating them on unsupervised semantic induction for the Semantic Textual Similarity and Natural Language Inference tasks.

Natural Language Inference Semantic Textual Similarity +1

CLCL (Geneva) DINN Parser: a Neural Network Dependency Parser Ten Years Later

no code implementations CONLL 2017 Christophe Moor, Paola Merlo, James Henderson, Haozhou Wang

This paper describes the University of Geneva's submission to the CoNLL 2017 shared task Multilingual Parsing from Raw Text to Universal Dependencies (listed as the CLCL (Geneva) entry).

Dependency Parsing Feature Engineering +1

A Vector Space for Distributional Semantics for Entailment

no code implementations ACL 2016 James Henderson, Diana Nicoleta Popa

Distributional semantics creates vector-space representations that capture many forms of semantic similarity, but their relation to semantic entailment has been less clear.

Lexical Entailment Semantic Similarity +1

A Bayesian Model of Multilingual Unsupervised Semantic Role Induction

no code implementations 4 Mar 2016 Nikhil Garg, James Henderson

We propose a Bayesian model of unsupervised semantic role induction in multiple languages, and use it to explore the usefulness of parallel corpora for this task.

Efficient Computation of Mean Truncated Hitting Times on Very Large Graphs

no code implementations 16 Apr 2013 Joel Lang, James Henderson

Previous work has shown the effectiveness of random walk hitting times as a measure of dissimilarity in a variety of graph-based learning problems such as collaborative filtering, query suggestion or finding paraphrases.

Collaborative Filtering
