no code implementations • Findings (EMNLP) 2021 • Yuntian Deng, Alexander Rush
Non-autoregressive machine translation (NAT) approaches enable fast generation by utilizing parallelizable generative processes.
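To make "parallelizable" concrete, here is a toy non-autoregressive decoder (a minimal sketch, not this paper's model; all shapes and module names are invented): every target position is predicted in one parallel pass, with no left-to-right dependency.

```python
import torch
import torch.nn as nn

class ToyNATDecoder(nn.Module):
    """Toy non-autoregressive decoder: all target positions are predicted
    in a single parallel pass, with no left-to-right dependency."""
    def __init__(self, hidden=64, vocab=1000, max_len=32):
        super().__init__()
        self.pos_emb = nn.Embedding(max_len, hidden)
        self.proj = nn.Linear(hidden, vocab)

    def forward(self, enc_summary):                             # (batch, hidden)
        positions = torch.arange(self.pos_emb.num_embeddings)
        h = enc_summary.unsqueeze(1) + self.pos_emb(positions)  # (batch, len, hidden)
        return self.proj(h).argmax(-1)                          # all tokens at once

tokens = ToyNATDecoder()(torch.randn(2, 64))                    # (2, 32) token ids
```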
no code implementations • 5 Sep 2024 • Yuntian Deng, Wenting Zhao, Jack Hessel, Xiang Ren, Claire Cardie, Yejin Choi
The increasing availability of real-world conversation data offers exciting opportunities for researchers to study user-chatbot interactions.
no code implementations • 24 Jul 2024 • Wenting Zhao, Tanya Goyal, Yu Ying Chiu, Liwei Jiang, Benjamin Newman, Abhilasha Ravichander, Khyathi Chandu, Ronan Le Bras, Claire Cardie, Yuntian Deng, Yejin Choi
While hallucinations of large language models (LLMs) prevail as a major challenge, existing evaluation benchmarks on factuality do not cover the diverse domains of knowledge that the real-world users of LLMs seek information about.
1 code implementation • 12 Jun 2024 • Zhangchen Xu, Fengqing Jiang, Luyao Niu, Yuntian Deng, Radha Poovendran, Yejin Choi, Bill Yuchen Lin
High-quality instruction data is critical for aligning large language models (LLMs).
1 code implementation • 7 Jun 2024 • Bill Yuchen Lin, Yuntian Deng, Khyathi Chandu, Faeze Brahman, Abhilasha Ravichander, Valentina Pyatkin, Nouha Dziri, Ronan Le Bras, Yejin Choi
For automated evaluation with WildBench, we have developed two metrics, WB-Reward and WB-Score, which are computable using advanced LLMs such as GPT-4-turbo.
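As a sketch of how such LLM-judged pairwise metrics are typically computed (the actual WildBench judge prompts and answer parsing are more involved; the prompt below is invented for illustration):

```python
from openai import OpenAI  # assumes an OPENAI_API_KEY in the environment

client = OpenAI()

def pairwise_judgment(query: str, resp_a: str, resp_b: str) -> str:
    """Illustrative pairwise comparison in the spirit of WB-Reward;
    the real WildBench judge uses structured checklists and rubrics."""
    prompt = (
        f"Query: {query}\n\nResponse A: {resp_a}\n\nResponse B: {resp_b}\n\n"
        "Which response is better? Answer with exactly 'A', 'B', or 'tie'."
    )
    out = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return out.choices[0].message.content.strip()
```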
no code implementations • 3 Jun 2024 • Jinjie Ni, Fuzhao Xue, Xiang Yue, Yuntian Deng, Mahir Shah, Kabir Jain, Graham Neubig, Yang You
Our benchmarks' advantages lie in (1) a 0.96 model ranking correlation with Chatbot Arena arising from the highly impartial query distribution and grading mechanism, (2) fast, cheap, and reproducible execution (6% of the time and cost of MMLU), and (3) dynamic evaluation enabled by the rapid and stable data update pipeline.
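For reference, a model-ranking correlation like the reported 0.96 is a rank correlation between per-model benchmark scores and Chatbot Arena ratings; the numbers in this sketch are made up.

```python
from scipy.stats import spearmanr

benchmark_scores = [72.1, 68.4, 65.0, 61.3, 55.9]   # hypothetical per-model scores
arena_elo        = [1251, 1230, 1190, 1154, 1101]   # hypothetical Arena ratings
rho, _ = spearmanr(benchmark_scores, arena_elo)
print(f"ranking correlation: {rho:.2f}")            # 1.00 for this toy ordering
```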
1 code implementation • 23 May 2024 • Yuntian Deng, Yejin Choi, Stuart Shieber
When leveraging language models for reasoning tasks, generating explicit chain-of-thought (CoT) steps often proves essential for achieving high accuracy in final outputs.
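A minimal illustration of eliciting explicit CoT (generic prompting practice, not this paper's specific training setup):

```python
question = "A farmer has 17 sheep and buys 5 more. How many sheep now?"

# Explicit chain of thought: the model is asked to write out intermediate steps.
cot_prompt = f"Q: {question}\nA: Let's think step by step."

# Direct answering: the same question with no intermediate reasoning requested.
direct_prompt = f"Q: {question}\nA: The answer is"
```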
no code implementations • 2 May 2024 • Wenting Zhao, Xiang Ren, Jack Hessel, Claire Cardie, Yejin Choi, Yuntian Deng
In addition to timestamped chat transcripts, we enrich the dataset with demographic data, including state, country, and hashed IP addresses, alongside request headers.
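As a sketch of the kind of anonymization this implies (the dataset's exact hashing scheme is not described here), salted hashing maps an IP to a stable but practically non-reversible identifier:

```python
import hashlib

SALT = b"replace-with-a-secret-salt"  # hypothetical; keep such salts private

def hash_ip(ip: str) -> str:
    """Salted SHA-256 of an IP: stable per user, not reversible in practice."""
    return hashlib.sha256(SALT + ip.encode()).hexdigest()

print(hash_ip("203.0.113.7"))  # documentation-range example address
```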
1 code implementation • 2 Nov 2023 • Yuntian Deng, Kiran Prasad, Roland Fernandez, Paul Smolensky, Vishrav Chaudhary, Stuart Shieber
In this work, we explore an alternative reasoning approach: instead of explicitly producing the chain of thought reasoning steps, we use the language model's internal hidden states to perform implicit reasoning.
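A rough sketch of the idea, with invented shapes: instead of supervising chain-of-thought tokens, train a student to match a teacher's internal hidden states, so the reasoning happens implicitly.

```python
import torch
import torch.nn.functional as F

# Hypothetical hidden states: the teacher produces them while emitting explicit
# CoT tokens; the student is trained to match them without emitting any CoT.
teacher_states = torch.randn(4, 12, 64)   # (batch, layers, hidden)
student_states = torch.randn(4, 12, 64, requires_grad=True)

# Implicit-reasoning objective: align internal states instead of CoT text.
distill_loss = F.mse_loss(student_states, teacher_states.detach())
distill_loss.backward()
```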
2 code implementations • 21 Oct 2023 • John X. Morris, Chandan Singh, Alexander M. Rush, Jianfeng Gao, Yuntian Deng
Prompting language models (LMs) is the main interface for applying them to new tasks.
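One way prompts can be composed into a classifier, sketched as a toy decision tree whose internal nodes are yes/no prompts; `ask_lm` is a stand-in for a real LM call, and the tree itself is invented for illustration:

```python
def ask_lm(prompt: str) -> bool:
    """Stand-in for a real LM call returning a yes/no answer."""
    return "great" in prompt.lower()  # trivially fake logic for the demo

def classify(text: str) -> str:
    # Each internal node routes on the answer to one prompt.
    if ask_lm(f"Is the following review positive? {text}"):
        return "positive"
    if ask_lm(f"Is the following review neutral? {text}"):
        return "neutral"
    return "negative"

print(classify("A great, moving film."))  # -> positive
```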
no code implementations • 6 Oct 2023 • Shuaiwen Leon Song, Bonnie Kruft, Minjia Zhang, Conglong Li, Shiyang Chen, Chengming Zhang, Masahiro Tanaka, Xiaoxia Wu, Jeff Rasley, Ammar Ahmad Awan, Connor Holmes, Martin Cai, Adam Ghanem, Zhongzhu Zhou, Yuxiong He, Pete Luferenko, Divya Kumar, Jonathan Weyn, Ruixiong Zhang, Sylwester Klocek, Volodymyr Vragov, Mohammed AlQuraishi, Gustaf Ahdritz, Christina Floristean, Cristina Negri, Rao Kotamarthi, Venkatram Vishwanath, Arvind Ramanathan, Sam Foreman, Kyle Hippe, Troy Arcomano, Romit Maulik, Maxim Zvyagin, Alexander Brace, Bin Zhang, Cindy Orozco Bohorquez, Austin Clyde, Bharat Kale, Danilo Perez-Rivera, Heng Ma, Carla M. Mann, Michael Irvin, J. Gregory Pauloski, Logan Ward, Valerie Hayot, Murali Emani, Zhen Xie, Diangen Lin, Maulik Shukla, Ian Foster, James J. Davis, Michael E. Papka, Thomas Brettin, Prasanna Balaprakash, Gina Tourassi, John Gounley, Heidi Hanson, Thomas E Potok, Massimiliano Lupo Pasini, Kate Evans, Dan Lu, Dalton Lunga, Junqi Yin, Sajal Dash, Feiyi Wang, Mallikarjun Shankar, Isaac Lyngaas, Xiao Wang, Guojing Cong, Pei Zhang, Ming Fan, Siyan Liu, Adolfy Hoisie, Shinjae Yoo, Yihui Ren, William Tang, Kyle Felker, Alexey Svyatkovskiy, Hang Liu, Ashwin Aji, Angela Dalton, Michael Schulte, Karl Schulz, Yuntian Deng, Weili Nie, Josh Romero, Christian Dallago, Arash Vahdat, Chaowei Xiao, Thomas Gibbs, Anima Anandkumar, Rick Stevens
In the upcoming decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences.
1 code implementation • 16 Oct 2022 • Yuntian Deng, Volodymyr Kuleshov, Alexander M. Rush
Language models have demonstrated the ability to generate highly fluent text; however, it remains unclear whether their output retains coherent high-level structure (e.g., story progression).
1 code implementation • 11 Oct 2022 • Yuntian Deng, Noriyuki Kojima, Alexander M. Rush
These experiments each verify the effectiveness of the diffusion process and the use of scheduled sampling to fix generation issues.
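For the scheduled-sampling ingredient, here is the generic recipe (not the paper's diffusion-specific variant): during training, each step consumes the model's own previous output with some probability instead of the gold token.

```python
import random

def scheduled_inputs(gold, predict, p_model=0.3):
    """Build training inputs where each step uses the model's own previous
    output with probability p_model, otherwise the gold token."""
    inputs, prev = [], "<s>"
    for gold_tok in gold:
        inputs.append(prev)
        prev = predict(prev) if random.random() < p_model else gold_tok
    return inputs

print(scheduled_inputs(["a", "b", "c"], predict=lambda t: "<pred>"))
```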
2 code implementations • 24 May 2022 • Richa Rastogi, Yair Schiff, Alon Hacohen, Zhaozhi Li, Ian Lee, Yuntian Deng, Mert R. Sabuncu, Volodymyr Kuleshov
We introduce semi-parametric inducing point networks (SPIN), a general-purpose architecture that can query the training set at inference time in a compute-efficient manner.
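A minimal sketch of the querying idea, assuming a random subsample stands in for SPIN's learned inducing points: cross-attend from test queries to a small set of training points.

```python
import torch
import torch.nn.functional as F

train_x = torch.randn(1000, 16)          # training inputs
train_y = torch.randn(1000, 1)           # training targets
idx = torch.randperm(1000)[:32]          # stand-in for learned inducing points
points_x, points_y = train_x[idx], train_y[idx]

query = torch.randn(4, 16)               # test-time queries
attn = F.softmax(query @ points_x.T / 16 ** 0.5, dim=-1)   # (4, 32)
pred = attn @ points_y                   # predictions read off the training set
```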
1 code implementation • NeurIPS 2021 • Justin T. Chiu, Yuntian Deng, Alexander M. Rush
This work demonstrates a simple approach to reduce the computational and memory complexity of a large class of structured models.
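The flavor of the speed-up, as a generic sketch rather than the paper's full construction: if a transition matrix factors as K = U Vᵀ with small rank r, a forward-algorithm step drops from O(n²) to O(nr).

```python
import numpy as np

n, r = 512, 16
U, V = np.random.rand(n, r), np.random.rand(n, r)
K = U @ V.T                               # full transition matrix (n x n)
alpha = np.random.rand(n)                 # forward messages
emission = np.random.rand(n)

full    = (alpha @ K) * emission          # O(n^2) per step
lowrank = ((alpha @ U) @ V.T) * emission  # O(n*r) per step
assert np.allclose(full, lowrank)
```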
2 code implementations • EMNLP 2021 • Keyon Vafa, Yuntian Deng, David M. Blei, Alexander M. Rush
Compared to existing baselines, greedy rationalization is best at optimizing the combinatorial objective and provides the most faithful rationales.
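The core greedy loop, sketched generically (the paper pairs it with models trained for compatibility with subsets): repeatedly add whichever token most increases the model's probability of its own prediction.

```python
def greedy_rationale(tokens, prob_of_prediction, target_prob=0.9):
    """prob_of_prediction(subset) -> P(model's prediction | only subset shown).
    Greedily add the single most helpful token until the prediction is explained."""
    rationale = set()
    while prob_of_prediction(rationale) < target_prob and len(rationale) < len(tokens):
        best = max(set(tokens) - rationale,
                   key=lambda t: prob_of_prediction(rationale | {t}))
        rationale.add(best)
    return rationale

# Toy scorer: the prediction is explained once "not" and "good" are both visible.
toy = lambda s: 1.0 if {"not", "good"} <= s else (0.4 if s & {"not", "good"} else 0.0)
print(greedy_rationale(["the", "movie", "was", "not", "good"], toy))
```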
no code implementations • 6 Jul 2021 • Yuntian Deng, Xingyu Zhou, Baekjin Kim, Ambuj Tewari, Abhishek Gupta, Ness Shroff
To this end, we develop WGP-UCB, a novel UCB-type algorithm based on weighted Gaussian process regression.
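A sketch of the ingredients, with arbitrary kernel, weights, and confidence width (the paper's choices are specific to its regret analysis): weighted Gaussian process regression plus an upper-confidence-bound rule.

```python
import numpy as np

def rbf(a, b, ls=1.0):
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ls) ** 2)

X = np.array([0.1, 0.4, 0.9])        # observed actions
y = np.array([0.2, 0.8, 0.3])        # observed rewards
w = np.array([1.0, 0.5, 0.25])       # observation weights (e.g., recency)
noise = 0.1

grid = np.linspace(0, 1, 101)
K = rbf(X, X) + noise * np.diag(1.0 / w)      # weighted noise: low weight = noisy
k = rbf(grid, X)
mu = k @ np.linalg.solve(K, y)
var = rbf(grid, grid).diagonal() - np.einsum("ij,ij->i", k, np.linalg.solve(K, k.T).T)
ucb = mu + 2.0 * np.sqrt(np.maximum(var, 0))  # beta = 2 chosen arbitrarily
next_action = grid[np.argmax(ucb)]
```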
no code implementations • 9 Jan 2021 • Yuntian Deng, Shiping Shao, Archak Mittal, Richard Twumasi-Boakye, James Fishelson, Abhishek Gupta, Ness B. Shroff
Accordingly, in this paper, we use cooperative game theory coupled with the hyperpath-based stochastic user equilibrium framework to study such a market.
1 code implementation • NeurIPS 2020 • Yuntian Deng, Alexander M. Rush
The two dominant approaches to neural text generation are fully autoregressive models, using serial beam search decoding, and non-autoregressive models, using parallel decoding with no output dependencies.
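For contrast with parallel decoding, here is serial beam search over a stand-in scorer (the next-token distribution below is fake):

```python
import heapq, math

def next_logprobs(prefix):
    """Stand-in scorer: real systems use a neural LM here."""
    return {"a": math.log(0.6), "b": math.log(0.3), "</s>": math.log(0.1)}

def beam_search(beam_size=2, max_len=4):
    beams = [(0.0, ["<s>"])]
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            if seq[-1] == "</s>":
                candidates.append((score, seq))
                continue
            for tok, lp in next_logprobs(seq).items():
                candidates.append((score + lp, seq + [tok]))
        beams = heapq.nlargest(beam_size, candidates)  # serial: one step at a time
    return max(beams)

print(beam_search())
```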
1 code implementation • ICLR 2020 • Yuntian Deng, Anton Bakhtin, Myle Ott, Arthur Szlam, Marc'Aurelio Ranzato
In this work, we investigate un-normalized energy-based models (EBMs) which operate not at the token but at the sequence level.
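Operationally, a residual sequence-level EBM can be used to re-rank samples from a base LM; the candidates, log-probabilities, and energies below are invented.

```python
import numpy as np

candidates = ["the cat sat", "cat the sat", "the cat slept"]
base_logp = np.array([-4.0, -9.0, -4.5])      # hypothetical base-LM log-probs
energy    = np.array([1.0, 0.2, 0.1])         # hypothetical learned energies

# Residual EBM: p(x) ∝ p_base(x) * exp(-E(x)), so re-rank by base_logp - energy.
scores = base_logp - energy
probs = np.exp(scores - scores.max())
probs /= probs.sum()
print(candidates[np.argmax(probs)])           # "the cat slept" under these numbers
```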
no code implementations • 6 Apr 2020 • Anton Bakhtin, Yuntian Deng, Sam Gross, Myle Ott, Marc'Aurelio Ranzato, Arthur Szlam
Current large-scale auto-regressive language models display impressive fluency and can generate convincing text.
no code implementations • 29 Sep 2019 • Thierry Tambe, En-Yu Yang, Zishen Wan, Yuntian Deng, Vijay Janapa Reddi, Alexander Rush, David Brooks, Gu-Yeon Wei
Conventional hardware-friendly quantization methods, such as fixed-point or integer, tend to perform poorly at very low word sizes as their shrinking dynamic ranges cannot adequately capture the wide data distributions commonly seen in sequence transduction models.
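To see the dynamic-range failure concretely, here is plain symmetric fixed-point quantization: a single outlier stretches the representable range, and most values collapse to a few levels.

```python
import numpy as np

def fixed_point_quantize(x, bits=4):
    """Symmetric uniform quantization to 2**bits levels over [-max|x|, max|x|]."""
    scale = np.abs(x).max() / (2 ** (bits - 1) - 1)
    return np.round(x / scale).clip(-(2 ** (bits - 1)), 2 ** (bits - 1) - 1) * scale

x = np.concatenate([np.random.randn(1000), [40.0]])  # one outlier widens the range
xq = fixed_point_quantize(x, bits=4)
print(np.abs(x - xq).mean())  # large error: the outlier eats the dynamic range
```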
1 code implementation • IJCNLP 2019 • Zachary M. Ziegler, Yuntian Deng, Alexander M. Rush
Whereas traditional cryptography encrypts a secret message into an unintelligible form, steganography conceals that communication is taking place by encoding a secret message into a cover signal.
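A deliberately crude version of the idea (the paper encodes bits with arithmetic coding over the LM's full distribution): hide one secret bit per token by choosing between the LM's top two candidates.

```python
def hide_bits(bits, top2_candidates):
    """top2_candidates[i] = (most_likely_token, runner_up) at step i.
    Each secret bit selects which of the two plausible tokens to emit."""
    return [pair[b] for b, pair in zip(bits, top2_candidates)]

steps = [("the", "a"), ("weather", "forecast"), ("is", "looks"), ("nice", "fine")]
cover_text = hide_bits([0, 1, 0, 1], steps)
print(" ".join(cover_text))  # "the forecast is fine" encodes bits 0101
```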
no code implementations • 7 Jun 2019 • Anton Bakhtin, Sam Gross, Myle Ott, Yuntian Deng, Marc'Aurelio Ranzato, Arthur Szlam
Energy-based models (EBMs), a.k.a. un-normalized models.
5 code implementations • EMNLP 2018 • Sebastian Gehrmann, Yuntian Deng, Alexander M. Rush
We use this selector as a bottom-up attention step to constrain the model to likely phrases.
Ranked #4 on Multi-Document Summarization on Multi-News
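A sketch of the constraint described above (the real system thresholds a trained content selector): zero out attention on source tokens the selector deems unlikely to appear in the summary, then renormalize.

```python
import torch

attn = torch.softmax(torch.randn(1, 8), dim=-1)   # decoder attention over source
select_prob = torch.rand(1, 8)                    # content-selector probabilities

# Bottom-up constraint: only selected source tokens may receive attention.
masked = attn * (select_prob > 0.5).float()
masked = masked / masked.sum(dim=-1, keepdim=True).clamp_min(1e-9)
```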
1 code implementation • NeurIPS 2018 • Yuntian Deng, Yoon Kim, Justin Chiu, Demi Guo, Alexander M. Rush
This work considers variational attention networks, alternatives to soft and hard attention for learning latent variable alignment models, with tighter approximation bounds based on amortized variational inference.
Ranked #29 on Machine Translation on IWSLT2014 German-English
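To contrast the variants (shapes and distributions here are illustrative): soft attention takes an expectation over source positions, hard attention samples one, and variational attention samples from a separately parameterized posterior trained with an ELBO.

```python
import torch

scores = torch.randn(1, 6)                       # alignment scores over source
memory = torch.randn(6, 32)                      # source hidden states

soft_ctx = torch.softmax(scores, -1) @ memory    # soft: expected context

hard_idx = torch.distributions.Categorical(logits=scores).sample()
hard_ctx = memory[hard_idx]                      # hard: one sampled position

q_logits = torch.randn(1, 6)                     # variational posterior q(z | x, y)
var_idx = torch.distributions.Categorical(logits=q_logits).sample()
var_ctx = memory[var_idx]                        # trained with an ELBO in the paper
```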
9 code implementations • WS 2018 • Guillaume Klein, Yoon Kim, Yuntian Deng, Vincent Nguyen, Jean Senellart, Alexander M. Rush
OpenNMT is an open-source toolkit for neural machine translation (NMT).
no code implementations • 12 Sep 2017 • Guillaume Klein, Yoon Kim, Yuntian Deng, Josep Crego, Jean Senellart, Alexander M. Rush
We introduce an open-source toolkit for neural machine translation (NMT) to support research into model architectures, feature representations, and source modalities, while maintaining competitive performance, modularity and reasonable training requirements.
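A quickstart in the spirit of the legacy OpenNMT-py README; script names, flags, and checkpoint naming varied across versions, so treat this as an illustration rather than a reference.

```python
import subprocess

# Preprocess parallel data, train, then translate (legacy OpenNMT-py workflow;
# exact flags and the checkpoint filename differ across toolkit versions).
subprocess.run(["python", "preprocess.py",
                "-train_src", "data/src-train.txt", "-train_tgt", "data/tgt-train.txt",
                "-valid_src", "data/src-val.txt", "-valid_tgt", "data/tgt-val.txt",
                "-save_data", "data/demo"], check=True)
subprocess.run(["python", "train.py", "-data", "data/demo",
                "-save_model", "demo-model"], check=True)
subprocess.run(["python", "translate.py", "-model", "demo-model_step_100000.pt",
                "-src", "data/src-test.txt", "-output", "pred.txt"], check=True)
```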
no code implementations • ICML 2017 • Pengtao Xie, Yuntian Deng, Yi Zhou, Abhimanu Kumar, Yao-Liang Yu, James Zou, Eric P. Xing
The large model capacity of latent space models (LSMs) enables them to achieve great performance on various applications, but it also renders them prone to overfitting.
4 code implementations • ACL 2017 • Guillaume Klein, Yoon Kim, Yuntian Deng, Jean Senellart, Alexander M. Rush
We describe an open-source toolkit for neural machine translation (NMT).
no code implementations • 26 Sep 2016 • Xuezhe Ma, Yingkai Gao, Zhiting Hu, Yao-Liang Yu, Yuntian Deng, Eduard Hovy
Algorithmically, we show that our proposed measure of the inference gap can be used to regularize the standard dropout training objective, resulting in an explicit control of the gap.
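A toy version of regularizing an inference gap (the paper's measure is derived differently): penalize the distance between the network's dropout-mode and eval-mode outputs.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

net = nn.Sequential(nn.Linear(10, 50), nn.ReLU(), nn.Dropout(0.5), nn.Linear(50, 2))
x, y = torch.randn(8, 10), torch.randint(0, 2, (8,))

net.train()
out_train = net(x)                 # stochastic (dropout on)
net.eval()
out_eval = net(x)                  # deterministic (weights rescaled)
net.train()

task_loss = F.cross_entropy(out_train, y)
gap = F.mse_loss(out_train, out_eval.detach())   # toy proxy for the inference gap
loss = task_loss + 0.1 * gap
loss.backward()
```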
14 code implementations • ICML 2017 • Yuntian Deng, Anssi Kanervisto, Jeffrey Ling, Alexander M. Rush
We present a neural encoder-decoder model to convert images into presentational markup based on a scalable coarse-to-fine attention mechanism.
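A compact sketch of the attention core (the full model adds a row encoder and the coarse-to-fine mechanism): the decoder attends over a CNN feature grid at each step.

```python
import torch
import torch.nn.functional as F

feat = torch.randn(1, 64, 8, 20)                       # CNN features: (B, C, H, W)
grid = feat.flatten(2).transpose(1, 2)                 # (B, H*W, C) attention memory
dec_state = torch.randn(1, 64)                         # decoder hidden state

attn = F.softmax(grid @ dec_state.unsqueeze(-1) / 8.0, dim=1)  # over H*W cells
context = (attn * grid).sum(dim=1)                     # (B, C) feeds token prediction
```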
no code implementations • EACL 2017 • Zichao Yang, Zhiting Hu, Yuntian Deng, Chris Dyer, Alex Smola
Knowing which words have been attended to in previous time steps while generating a translation is a rich source of information for predicting what words will be attended to in the future.
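One standard way to exploit that signal is coverage-style attention, sketched here generically rather than as this paper's exact formulation: keep a running sum of past attention and feed it into the current scores.

```python
import torch
import torch.nn.functional as F

src_len, hid = 6, 32
memory = torch.randn(src_len, hid)       # encoder states
w_cov = torch.tensor(-1.0)               # hypothetical learned weight; negative
coverage = torch.zeros(src_len)          # accumulated past attention

for step in range(3):
    query = torch.randn(hid)             # stand-in decoder state
    scores = memory @ query + w_cov * coverage   # history informs new scores
    attn = F.softmax(scores, dim=0)
    coverage = coverage + attn           # remember where we have attended
```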
no code implementations • ACL 2016 • Hao Zhang, Zhiting Hu, Yuntian Deng, Mrinmaya Sachan, Zhicheng Yan, Eric P. Xing
We study the problem of automatically building hypernym taxonomies from textual and visual data.
no code implementations • 23 Dec 2015 • Pengtao Xie, Yuntian Deng, Eric Xing
On two popular latent variable models, the restricted Boltzmann machine and distance metric learning, we demonstrate that MAR can effectively capture long-tail patterns, reduce model complexity without sacrificing expressivity, and improve interpretability.
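As a sketch of what an angle-based diversity regularizer looks like (MAR's exact definition differs): penalize pairwise cosine similarity among component vectors so they spread apart.

```python
import torch

components = torch.randn(10, 64, requires_grad=True)   # e.g., RBM weight vectors

normed = components / components.norm(dim=1, keepdim=True)
cos = normed @ normed.T                                 # pairwise cosine similarities
off_diag = cos - torch.eye(10)
diversity_penalty = (off_diag ** 2).sum()               # pushes components apart
diversity_penalty.backward()
```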
no code implementations • 23 Nov 2015 • Pengtao Xie, Yuntian Deng, Eric Xing
Recently diversity-inducing regularization methods for latent variable models (LVMs), which encourage the components in LVMs to be diverse, have been studied to address several issues involved in latent variable modeling: (1) how to capture long-tail patterns underlying data; (2) how to reduce model complexity without sacrificing expressivity; (3) how to improve the interpretability of learned patterns.
no code implementations • 21 Oct 2015 • Aaron Q. Li, Yuntian Deng, Kublai Jing, Joseph W Robinson
In this project we outline a modularized, scalable system for comparing Amazon products in an interactive and informative way using efficient latent variable models and dynamic visualization.