no code implementations • Findings (EMNLP) 2021 • Yuntian Deng, Alexander Rush
Non-autoregressive machine translation (NAT) approaches enable fast generation by utilizing parallelizable generative processes.
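The parallelism NAT exploits can be made concrete with a tiny sketch: instead of emitting tokens one at a time conditioned on previous outputs, every target position is predicted in a single parallel pass over the source. The module below is an illustrative toy, not the paper's model; all names and dimensions are assumptions.

```python
import torch
import torch.nn as nn

# Toy non-autoregressive decoder: one learned query per output slot attends
# to the source once, and logits for ALL target positions come out together,
# so decoding cost does not grow step-by-step with target length.
class TinyNATDecoder(nn.Module):
    def __init__(self, vocab_size=1000, d_model=64, max_len=32):
        super().__init__()
        self.pos_emb = nn.Embedding(max_len, d_model)
        self.attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, src_states):                           # (batch, src_len, d_model)
        queries = self.pos_emb.weight[None].expand(src_states.size(0), -1, -1)
        ctx, _ = self.attn(queries, src_states, src_states)  # attend to the source once
        return self.out(ctx)                                 # (batch, max_len, vocab)

logits = TinyNATDecoder()(torch.randn(2, 10, 64))
tokens = logits.argmax(-1)   # all target tokens emitted in a single step
```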
1 code implementation • ACL 2020 • Alexander Rush
The literature on structured prediction for NLP describes a rich collection of distributions and algorithms over sequences, segmentations, alignments, and trees; however, these algorithms are difficult to utilize in deep learning frameworks.
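A representative example of such an algorithm is the forward (log-partition) computation of a linear-chain CRF. The sketch below writes it directly in PyTorch to show why these routines fit deep learning frameworks once expressed as differentiable tensor operations; it is a generic illustration, not the library's API.

```python
import torch

# Differentiable forward algorithm for a linear-chain CRF.
def log_partition(emissions, transitions):
    # emissions: (seq_len, num_tags), transitions: (num_tags, num_tags)
    alpha = emissions[0]                                # log-scores ending in each tag
    for t in range(1, emissions.size(0)):
        # logsumexp over the previous tag keeps everything differentiable
        alpha = torch.logsumexp(alpha[:, None] + transitions, dim=0) + emissions[t]
    return torch.logsumexp(alpha, dim=0)

emissions = torch.randn(5, 3, requires_grad=True)
Z = log_partition(emissions, torch.randn(3, 3))
Z.backward()   # d(log Z)/d(emissions) recovers the CRF node marginals
```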
1 code implementation • 13 Dec 2024 • Yair Schiff, Subham Sekhar Sahoo, Hao Phung, Guanghan Wang, Sam Boshar, Hugo Dalla-torre, Bernardo P. de Almeida, Alexander Rush, Thomas Pierrot, Volodymyr Kuleshov
Diffusion models for continuous data have gained widespread adoption owing to their high-quality generation and control mechanisms.
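For context, the continuous (Gaussian) forward process those models rely on can be written in closed form; the sketch below uses a common linear noise schedule, with shapes and values chosen purely for illustration.

```python
import torch

# Closed-form marginal of the Gaussian forward process:
#   q(x_t | x_0) = N(sqrt(abar_t) * x_0, (1 - abar_t) * I)
def q_sample(x0, t, betas):
    alpha_bar = torch.cumprod(1.0 - betas, dim=0)[t]
    noise = torch.randn_like(x0)
    return alpha_bar.sqrt() * x0 + (1.0 - alpha_bar).sqrt() * noise, noise

betas = torch.linspace(1e-4, 0.02, 1000)    # common linear schedule
x0 = torch.randn(4, 8)                      # toy "continuous data"
xt, eps = q_sample(x0, t=500, betas=betas)  # a model learns to predict eps from xt
```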
2 code implementations • 11 Jun 2024 • Subham Sekhar Sahoo, Marianne Arriola, Yair Schiff, Aaron Gokaslan, Edgar Marroquin, Justin T Chiu, Alexander Rush, Volodymyr Kuleshov
While diffusion models excel at generating high-quality images, prior work reports a significant performance gap between diffusion and autoregressive (AR) methods in language modeling.
Ranked #1 on Language Modelling on the One Billion Word benchmark
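The discrete analogue used by masked diffusion language models replaces Gaussian noise with an absorbing MASK state: each token is independently masked with a time-dependent probability, and the model learns to unmask. A minimal sketch, with the vocabulary, schedule, and MASK id all assumed for illustration:

```python
import torch

MASK_ID = 0  # id 0 reserved for the absorbing MASK symbol (an assumption)

def mask_tokens(tokens, t):
    # t in (0, 1]: the expected fraction of tokens masked at this noise level
    keep = torch.rand_like(tokens, dtype=torch.float) >= t
    return torch.where(keep, tokens, torch.full_like(tokens, MASK_ID))

tokens = torch.randint(1, 100, (2, 16))   # toy token ids
noisy = mask_tokens(tokens, t=0.5)        # roughly half the positions masked
```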
no code implementations • 13 Jul 2021 • Stanislav Lukyanenko, Won-Dong Jang, Donglai Wei, Robbert Struyven, Yoon Kim, Brian Leahy, Helen Yang, Alexander Rush, Dalit Ben-Yosef, Daniel Needleman, Hanspeter Pfister
In this work, we propose a two-stream model for developmental stage classification.
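The general shape of a two-stream classifier, independent of this task, is two parallel encoders over different views of the input, fused before a shared head. The sketch below is a generic illustration with assumed dimensions, not the paper's architecture.

```python
import torch
import torch.nn as nn

# Generic two-stream classifier: each stream encodes one view of the input,
# and the features are fused by concatenation before classification.
class TwoStreamClassifier(nn.Module):
    def __init__(self, dim_a=128, dim_b=128, num_classes=5):
        super().__init__()
        self.stream_a = nn.Sequential(nn.Linear(dim_a, 64), nn.ReLU())
        self.stream_b = nn.Sequential(nn.Linear(dim_b, 64), nn.ReLU())
        self.head = nn.Linear(128, num_classes)

    def forward(self, feats_a, feats_b):
        fused = torch.cat([self.stream_a(feats_a), self.stream_b(feats_b)], dim=-1)
        return self.head(fused)

logits = TwoStreamClassifier()(torch.randn(4, 128), torch.randn(4, 128))
```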
1 code implementation • NAACL 2021 • Xinya Du, Alexander Rush, Claire Cardie
Template filling is generally tackled by a pipeline of two separate supervised systems: one for role-filler extraction and another for template/event recognition.
no code implementations • Findings (EMNLP) 2020 • Zonglin Yang, Xinya Du, Alexander Rush, Claire Cardie
End-to-end models in NLP rarely encode external world knowledge about the length of time.
2 code implementations • EMNLP 2020 • Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Remi Louf, Morgan Funtowicz, Joe Davison, Sam Shleifer, Patrick von Platen, Clara Ma, Yacine Jernite, Julien Plu, Canwen Xu, Teven Le Scao, Sylvain Gugger, Mariama Drame, Quentin Lhoest, Alexander Rush
Transformer architectures have facilitated building higher-capacity models and pretraining has made it possible to effectively utilize this capacity for a wide variety of tasks.
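This entry is the HuggingFace Transformers library; its high-level `pipeline` API shows how little code reusing a pretrained model takes (the task string below is just one example):

```python
from transformers import pipeline

# Loads a default pretrained model and tokenizer for the task.
classifier = pipeline("sentiment-analysis")
print(classifier("Transformers made pretrained models easy to reuse."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```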
no code implementations • 29 Sep 2019 • Thierry Tambe, En-Yu Yang, Zishen Wan, Yuntian Deng, Vijay Janapa Reddi, Alexander Rush, David Brooks, Gu-Yeon Wei
Conventional hardware-friendly quantization methods, such as fixed-point or integer, tend to perform poorly at very low word sizes as their shrinking dynamic ranges cannot adequately capture the wide data distributions commonly seen in sequence transduction models.
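This failure mode is easy to reproduce: with a single scale per tensor, one outlier stretches the representable range, so the few low-bit levels left for small values become very coarse. A sketch of symmetric fixed-point quantization, with the bit width and data assumed for illustration:

```python
import torch

# Symmetric per-tensor quantization: one scale covers the whole tensor.
def quantize_fixed_point(x, bits=4):
    qmax = 2 ** (bits - 1) - 1
    scale = x.abs().max() / qmax
    return torch.clamp((x / scale).round(), -qmax - 1, qmax) * scale

x = torch.cat([torch.randn(1000), torch.tensor([80.0])])  # one outlier widens the range
err = (x - quantize_fixed_point(x, bits=4)).abs().mean()
print(err)  # large error: the outlier forces a coarse step size for all small values
```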
no code implementations • 8 Feb 2019 • Fritz Obermeyer, Eli Bingham, Martin Jankowiak, Justin Chiu, Neeraj Pradhan, Alexander Rush, Noah Goodman
To exploit efficient tensor algebra in graphs with plates of variables, we generalize undirected factor graphs to plated factor graphs and variable elimination to a tensor variable elimination algorithm that operates directly on plated factor graphs.
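The core operation, summing a variable out of the product of the factors that mention it, is a single tensor contraction; the toy factor graph below shows it with `einsum` (plates, which the paper adds, are omitted here):

```python
import torch

A, B, C = 3, 4, 5
f_ab = torch.rand(A, B)   # factor over variables (a, b)
f_bc = torch.rand(B, C)   # factor over variables (b, c)

# Eliminate b: contract the factors that mention it into a new factor over (a, c).
f_ac = torch.einsum('ab,bc->ac', f_ab, f_bc)
Z = f_ac.sum()            # partition function of this tiny factor graph
```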
no code implementations • 3 Jun 2018 • Gabriel Grand, Aron Szanto, Yoon Kim, Alexander Rush
Visual question answering (VQA) models respond to open-ended natural language questions about images.