1 code implementation • 25 Mar 2022 • Erik Nijkamp, Bo Pang, Hiroaki Hayashi, Lifu Tu, Huan Wang, Yingbo Zhou, Silvio Savarese, Caiming Xiong
We train a family of large language models, called CodeGen, on natural language and programming language data.
Ranked #1 on Program Synthesis on HumanEval
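The CodeGen checkpoints are publicly released on the Hugging Face Hub. As an illustration, a minimal sketch of sampling a program completion with the transformers API (the 350M mono-language checkpoint name is assumed here; swap in another released size as needed):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load a released CodeGen checkpoint (350M, Python-only variant assumed here).
tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen-350M-mono")
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-350M-mono")

# Prompt with a function signature; the model samples a completion of the body.
inputs = tokenizer("def fibonacci(n):", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.2)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```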
no code implementations • 27 Aug 2021 • Lifu Tu
In this dissertation, we discuss the concept of the energy function and structured models with different energy functions.
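For context, the standard energy-based formulation the dissertation builds on: an energy function scores input-output pairs, probabilities come from a Gibbs distribution, and prediction is energy minimization.

```latex
% Energy-based model over structured outputs y given input x
p_\theta(y \mid x) = \frac{\exp\!\big(-E_\theta(x, y)\big)}{\sum_{y'} \exp\!\big(-E_\theta(x, y')\big)},
\qquad \hat{y} = \arg\min_{y} E_\theta(x, y)
```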
1 code implementation • EMNLP 2020 • Lifu Tu, Tianyu Liu, Kevin Gimpel
Many tasks in natural language processing involve predicting structured outputs, e.g., sequence labeling, semantic role labeling, parsing, and machine translation.
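As a concrete illustration of a structured energy (a generic linear-chain score, not this paper's exact parameterization), the following combines per-position label scores with label-transition scores for sequence labeling:

```python
import torch

def chain_energy(unary, trans, labels):
    """Energy of a label sequence under a toy linear-chain model.

    unary:  (T, L) per-position label scores
    trans:  (L, L) score for moving from label i to label j
    labels: (T,) integer label sequence
    """
    T = labels.shape[0]
    score = unary[torch.arange(T), labels].sum()       # per-position terms
    score = score + trans[labels[:-1], labels[1:]].sum()  # pairwise terms
    return -score  # lower energy = more compatible label sequence

# Toy usage: 5 positions, 3 labels.
e = chain_energy(torch.randn(5, 3), torch.randn(3, 3), torch.tensor([0, 1, 1, 2, 0]))
```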
1 code implementation • 14 Jul 2020 • Lifu Tu, Garima Lalwani, Spandana Gella, He He
Recent work has shown that pre-trained language models such as BERT improve robustness to spurious correlations in the dataset.
1 code implementation • ACL 2020 • Lifu Tu, Richard Yuanzhe Pang, Sam Wiseman, Kevin Gimpel
We propose to train a non-autoregressive machine translation model to minimize the energy defined by a pretrained autoregressive model.
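A minimal sketch of the idea (shapes and stand-in tensors are illustrative, not the paper's code): relax the non-autoregressive model's output to per-position token distributions, and use the frozen autoregressive model's negative log-likelihood of that relaxed output as the energy to minimize.

```python
import torch
import torch.nn.functional as F

T, V = 8, 1000                        # target length and vocab size (illustrative)
nat_logits = torch.randn(T, V, requires_grad=True)  # stand-in for the NAT decoder output
ar_logits = torch.randn(T, V)                       # stand-in for frozen AR model scores

soft_y = F.softmax(nat_logits, dim=-1)    # relaxed (soft) translation
ar_logp = F.log_softmax(ar_logits, dim=-1)
energy = -(soft_y * ar_logp).sum()        # expected NLL under the pretrained AR model
energy.backward()                         # gradients update only the NAT parameters
```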
no code implementations • EMNLP (spnlp) 2020 • Lifu Tu, Richard Yuanzhe Pang, Kevin Gimpel
Deep energy-based models are powerful, but pose challenges for learning and inference (Belanger and McCallum, 2016).
no code implementations • WS 2019 • Lifu Tu, Xiaoan Ding, Dong Yu, Kevin Gimpel
We propose a simple and effective modeling framework for controlled generation of multiple, diverse outputs.
1 code implementation • NAACL 2019 • Lifu Tu, Kevin Gimpel
One approach is to perform gradient descent with respect to the output structure directly (Belanger and McCallum, 2016).
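A sketch of that approach under toy assumptions: relax the discrete output to a continuous, simplex-constrained variable and run gradient descent on the energy, discretizing at the end. (With this toy unary-only energy the minimizer is simply the per-position argmax; a real structured energy would couple positions.)

```python
import torch
import torch.nn.functional as F

T, L = 6, 4
unary = torch.randn(T, L)                       # toy scores (stand-in for a learned energy)

logits = torch.zeros(T, L, requires_grad=True)  # unconstrained relaxation parameters
opt = torch.optim.SGD([logits], lr=0.5)

for _ in range(50):                             # gradient descent on the energy
    soft_y = F.softmax(logits, dim=-1)          # keep the relaxed output on the simplex
    energy = -(soft_y * unary).sum()            # toy energy; lower = better
    opt.zero_grad()
    energy.backward()
    opt.step()

pred = F.softmax(logits, dim=-1).argmax(dim=-1)  # discretize the relaxed solution
```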
no code implementations • SEMEVAL 2018 • Manasvi Sagarkar, John Wieting, Lifu Tu, Kevin Gimpel
We study the problem of measuring the quality of automatically-generated stories.
3 code implementations • ICLR 2018 • Lifu Tu, Kevin Gimpel
Prior work used gradient descent for inference, relaxing the structured output to a set of continuous variables and then optimizing the energy with respect to them.
no code implementations • ACL 2017 • Zheng Cai, Lifu Tu, Kevin Gimpel
We consider the ROC story cloze task (Mostafazadeh et al., 2016) and present several findings.
no code implementations • WS 2017 • Lifu Tu, Kevin Gimpel, Karen Livescu
We present models for embedding words in the context of surrounding words.
no code implementations • 7 Feb 2016 • Qingming Tang, Lifu Tu, Weiran Wang, Jinbo Xu
We propose a novel method for network inference from partially observed edges using a node-specific degree prior.