Search Results for author: Jasmijn Bastings

Found 20 papers, 9 papers with code

Training Text-to-Text Transformers with Privacy Guarantees

no code implementations NAACL (PrivateNLP) 2022 Natalia Ponomareva, Jasmijn Bastings, Sergei Vassilvitskii

We focus on T5 and show that by using recent advances in JAX and XLA we can train models with DP that do not suffer a large drop in pre-training utility, nor in training speed, and can still be fine-tuned to high accuracies on downstream tasks (e.g.


Dissecting Recall of Factual Associations in Auto-Regressive Language Models

no code implementations 28 Apr 2023 Mor Geva, Jasmijn Bastings, Katja Filippova, Amir Globerson

Given a subject-relation query, we study how the model aggregates information about the subject and relation to predict the correct attribute.

Attribute Extraction

Simple Recurrence Improves Masked Language Models

no code implementations 23 May 2022 Tao Lei, Ran Tian, Jasmijn Bastings, Ankur P. Parikh

In this work, we explore whether modeling recurrence into the Transformer architecture can both be beneficial and efficient, by building an extremely simple recurrent module into the Transformer.

Autoregressive Diffusion Models

1 code implementation ICLR 2022 Emiel Hoogeboom, Alexey A. Gritsenko, Jasmijn Bastings, Ben Poole, Rianne van den Berg, Tim Salimans

We introduce Autoregressive Diffusion Models (ARDMs), a model class encompassing and generalizing order-agnostic autoregressive models (Uria et al., 2014) and absorbing discrete diffusion (Austin et al., 2021), which we show are special cases of ARDMs under mild assumptions.

Ranked #6 on Image Generation on CIFAR-10 (bits/dimension metric)

Image Generation
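The order-agnostic autoregressive idea that ARDMs generalize can be sketched in a few lines: sample a random generation order, then fill in one position at a time conditioned on what has been generated so far. The sketch below is illustrative only; `predict_fn` is a hypothetical stand-in for a trained model, not the paper's architecture.

```python
import numpy as np

def order_agnostic_sample(predict_fn, seq_len, vocab_size, rng):
    """Generate a sequence in a uniformly random order.

    `predict_fn(x, filled, i)` is a hypothetical model interface: given
    the partially generated sequence `x`, a boolean mask of already
    filled positions, and a target position `i`, it returns a
    probability distribution over the vocabulary for position `i`.
    """
    x = np.zeros(seq_len, dtype=int)       # all positions start "absorbed"
    filled = np.zeros(seq_len, dtype=bool)
    order = rng.permutation(seq_len)       # one random generation order
    for i in order:
        probs = predict_fn(x, filled, i)
        x[i] = rng.choice(vocab_size, p=probs)
        filled[i] = True
    return x
```

Averaging the per-position log-likelihoods over random orders is what makes such a model order-agnostic during training; absorbing discrete diffusion arises as a special case of this scheme.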

The elephant in the interpretability room: Why use attention as explanation when we have saliency methods?

1 code implementation EMNLP (BlackboxNLP) 2020 Jasmijn Bastings, Katja Filippova

There is a recent surge of interest in using attention as explanation of model predictions, with mixed evidence on whether attention can be used as such.
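The saliency methods the paper favours attribute a prediction to input features via gradients. For a linear scorer the attribution is exact and fits in one line; this toy `saliency_linear` helper is a hypothetical illustration of gradient×input attribution, not the paper's experimental setup.

```python
import numpy as np

def saliency_linear(x, w):
    """Gradient-times-input saliency for a linear scorer f(x) = w·x.

    For this model the gradient of the score w.r.t. each input feature
    is simply w, so the attribution for feature j is w[j] * x[j].
    """
    grad = w          # d(w·x)/dx = w for a linear model
    return grad * x   # elementwise gradient×input attribution
```

For nonlinear models the gradient is computed by backpropagation instead, but the attribution recipe is the same.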

We Need to Talk About Random Splits

1 code implementation EACL 2021 Anders Søgaard, Sebastian Ebert, Jasmijn Bastings, Katja Filippova

We argue that random splits, like standard splits, lead to overly optimistic performance estimates.

Domain Adaptation

Joey NMT: A Minimalist NMT Toolkit for Novices

8 code implementations IJCNLP 2019 Julia Kreutzer, Jasmijn Bastings, Stefan Riezler

We present Joey NMT, a minimalist neural machine translation toolkit based on PyTorch that is specifically designed for novices.

General Knowledge, Machine Translation +2

Interpretable Neural Predictions with Differentiable Binary Variables

1 code implementation ACL 2019 Jasmijn Bastings, Wilker Aziz, Ivan Titov

The success of neural networks comes hand in hand with a desire for more interpretability.

Modeling Latent Sentence Structure in Neural Machine Translation

no code implementations 18 Jan 2019 Jasmijn Bastings, Wilker Aziz, Ivan Titov, Khalil Sima'an

Recently it was shown that linguistic structure predicted by a supervised parser can be beneficial for neural machine translation (NMT).

Machine Translation, NMT +1

Jump to better conclusions: SCAN both left and right

1 code implementation WS 2018 Jasmijn Bastings, Marco Baroni, Jason Weston, Kyunghyun Cho, Douwe Kiela

Lake and Baroni (2018) recently introduced the SCAN data set, which consists of simple commands paired with action sequences and is intended to test the strong generalization abilities of recurrent sequence-to-sequence models.
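SCAN pairs compositional commands with action sequences, e.g. "jump twice" maps to "JUMP JUMP". A minimal interpreter for a small fragment of the grammar (primitives, "twice"/"thrice", and "and" conjunction; directions and "after"/"opposite"/"around" are omitted) can be sketched as:

```python
# Primitive SCAN commands and their actions.
PRIM = {"jump": "JUMP", "walk": "WALK", "run": "RUN", "look": "LOOK"}

def interpret(command):
    """Interpret a small fragment of SCAN: primitives, the repetition
    modifiers 'twice'/'thrice', and 'and' conjunction."""
    actions = []
    for part in command.split(" and "):
        words = part.split()
        reps = {"twice": 2, "thrice": 3}.get(words[-1], 1)
        core = words[:-1] if words[-1] in ("twice", "thrice") else words
        actions.extend([PRIM[core[0]]] * reps)
    return " ".join(actions)
```

For example, `interpret("walk and run thrice")` yields `"WALK RUN RUN RUN"`. The generalization test asks whether a sequence-to-sequence model can produce such compositions for command combinations unseen in training.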

Exploiting Semantics in Neural Machine Translation with Graph Convolutional Networks

no code implementations NAACL 2018 Diego Marcheggiani, Jasmijn Bastings, Ivan Titov

Semantic representations have long been argued as potentially useful for enforcing meaning preservation and improving generalization performance of machine translation methods.

Machine Translation, Translation

Graph Convolutional Encoders for Syntax-aware Neural Machine Translation

no code implementations EMNLP 2017 Jasmijn Bastings, Ivan Titov, Wilker Aziz, Diego Marcheggiani, Khalil Sima'an

We present a simple and effective approach to incorporating syntactic structure into neural attention-based encoder-decoder models for machine translation.

Machine Translation, Translation
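The core operation behind a graph-convolutional encoder is a layer that updates each word's representation from its syntactic neighbours. A minimal sketch of one such layer, H' = ReLU((A + I)HW + b), is shown below; the paper's version additionally uses direction-specific weights and edge-label gating, which this simplification omits.

```python
import numpy as np

def gcn_layer(H, A, W, b):
    """One simplified graph-convolutional layer.

    H: (n, d) node features, one row per word.
    A: (n, n) adjacency matrix of the dependency graph.
    W: (d, d_out) weight matrix; b: (d_out,) bias.
    Returns ReLU((A + I) H W + b): each word aggregates its
    neighbours' features (plus its own, via the self-loop).
    """
    A_hat = A + np.eye(A.shape[0])     # add self-loops
    return np.maximum(0, A_hat @ H @ W + b)
```

Stacking several such layers on top of word embeddings lets information flow along multi-hop syntactic paths before the attention-based decoder reads the encoder states.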

All Fragments Count in Parser Evaluation

no code implementations LREC 2014 Jasmijn Bastings, Khalil Sima'an

PARSEVAL, the default paradigm for evaluating constituency parsers, calculates parsing success (Precision/Recall) as a function of the number of matching labeled brackets across the test set.

Human Parsing, Machine Translation
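The PARSEVAL computation described above can be sketched directly: represent each tree as a multiset of labeled brackets (label, start, end) and score matches across the test set. This helper illustrates the idea only; it is not a replacement for the standard evalb tool.

```python
from collections import Counter

def parseval(gold_trees, pred_trees):
    """Micro-averaged labeled-bracket precision/recall.

    Each tree is a list of (label, start, end) spans. A bracket in the
    prediction counts as correct if the same labeled span occurs in
    the gold tree (multiset intersection handles duplicates).
    """
    match = gold_total = pred_total = 0
    for gold, pred in zip(gold_trees, pred_trees):
        g, p = Counter(gold), Counter(pred)
        match += sum((g & p).values())      # matching labeled brackets
        gold_total += sum(g.values())
        pred_total += sum(p.values())
    precision = match / pred_total if pred_total else 0.0
    recall = match / gold_total if gold_total else 0.0
    return precision, recall
```

A prediction that recovers two of three gold brackets with no spurious ones scores precision 1.0 and recall 2/3, which is exactly the Precision/Recall-over-brackets view the paper takes as its starting point.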
