1 code implementation • DeeLIO (ACL) 2022 • Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin, Weizhu Chen
In this work, we investigate whether there are more effective strategies for judiciously selecting in-context examples (relative to random sampling) that better leverage GPT-3's in-context learning capabilities. Inspired by the recent success of leveraging a retrieval module to augment neural networks, we propose to retrieve examples that are semantically similar to a test query to formulate its corresponding prompt.
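The retrieve-then-prompt recipe above is straightforward to sketch. Below is a minimal, hypothetical illustration in Python: `embed` is a toy stand-in for the pre-trained sentence encoder the paper relies on, and the prompt template is an assumption, not the paper's exact format.

```python
import numpy as np

def embed(texts, dim=256):
    """Toy hashed bag-of-words encoder standing in for the paper's
    pre-trained sentence encoder; only here so the demo runs end to end."""
    out = np.zeros((len(texts), dim))
    for i, t in enumerate(texts):
        for w in t.lower().split():
            out[i, hash(w) % dim] += 1.0
    return out

def build_prompt(train_pairs, test_query, k=4):
    """Retrieve the k training examples most similar to the test query and
    concatenate them (most similar placed last) into an in-context prompt."""
    E = embed([x for x, _ in train_pairs])                 # (n, d) candidates
    q = embed([test_query])[0]                             # (d,) query
    sims = E @ q / (np.linalg.norm(E, axis=1) * np.linalg.norm(q) + 1e-9)
    top = np.argsort(sims)[-k:]                            # k nearest neighbors
    demos = [f"Input: {train_pairs[i][0]}\nOutput: {train_pairs[i][1]}" for i in top]
    return "\n\n".join(demos + [f"Input: {test_query}\nOutput:"])
```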
1 code implementation • ACL 2021 • Jiaao Chen, Dinghan Shen, Weizhu Chen, Diyi Yang
Fine-tuning large pre-trained models with task-specific data has achieved great success in NLP.
3 code implementations • 17 Jan 2021 • Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin, Weizhu Chen
Inspired by the recent success of leveraging a retrieval module to augment large-scale neural network models, we propose to retrieve examples that are semantically similar to a test sample to formulate its corresponding prompt.
no code implementations • ICLR 2021 • Kevin J Liang, Weituo Hao, Dinghan Shen, Yufan Zhou, Weizhu Chen, Changyou Chen, Lawrence Carin
Large-scale language models have recently demonstrated impressive empirical performance.
no code implementations • ICLR 2021 • Yanru Qu, Dinghan Shen, Yelong Shen, Sandra Sajeev, Jiawei Han, Weizhu Chen
To verify the effectiveness of the proposed framework, we apply CoDA to Transformer-based models on a wide range of natural language understanding tasks.
no code implementations • 12 Oct 2020 • Mingzhi Zheng, Dinghan Shen, Yelong Shen, Weizhu Chen, Lin Xiao
We prove, from a theoretical perspective, that the gradients derived from this new masking schema have a smaller variance and can lead to more efficient self-supervised training.
Ranked #1 on Sentence Classification on ACL-ARC
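For context on the masking entry above: the snippet below shows the standard BERT-style uniform masking that work in this area takes as its baseline. The paper's proposed lower-variance schema itself is not reproduced here; this is only the starting point it improves on.

```python
import torch

def random_mlm_mask(input_ids, mask_token_id, mask_prob=0.15):
    """Standard uniform MLM masking (the baseline, not the paper's improved
    schema): mask ~15% of positions; the loss is computed only on them."""
    mask = torch.rand(input_ids.shape) < mask_prob
    labels = input_ids.clone()
    labels[~mask] = -100                 # ignored by cross-entropy
    masked = input_ids.clone()
    masked[mask] = mask_token_id
    return masked, labels
```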
no code implementations • EMNLP 2020 • Guoyin Wang, Chunyuan Li, Jianqiao Li, Hao Fu, Yuh-Chen Lin, Liqun Chen, Yizhe Zhang, Chenyang Tao, Ruiyi Zhang, Wenlin Wang, Dinghan Shen, Qian Yang, Lawrence Carin
An extension is further proposed to improve the OT learning, based on the structural and contextual information of the text sequences.
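As background for the OT-based entry above, here is a generic entropy-regularized optimal-transport (Sinkhorn) distance between two sets of token embeddings. The paper's structural and contextual extensions sit on top of this kind of base distance and are not shown; treat this as background, not their method.

```python
import numpy as np

def sinkhorn_distance(X, Y, eps=0.1, n_iters=100):
    """Entropy-regularized OT between point clouds X (n, d) and Y (m, d)
    with uniform weights, via Sinkhorn's alternating scaling iterations."""
    C = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)  # pairwise cost
    K = np.exp(-C / eps)                                        # Gibbs kernel
    a = np.full(len(X), 1.0 / len(X))
    b = np.full(len(Y), 1.0 / len(Y))
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)            # scale columns to match b
        u = a / (K @ v)              # scale rows to match a
    P = u[:, None] * K * v[None, :]  # transport plan
    return (P * C).sum()             # OT cost under P
```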
2 code implementations • 29 Sep 2020 • Dinghan Shen, Mingzhi Zheng, Yelong Shen, Yanru Qu, Weizhu Chen
Adversarial training has been shown effective at endowing the learned representations with stronger generalization ability.
Ranked #9 on Machine Translation on IWSLT2014 German-English
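The entry above concerns adversarial training of representations. A common embedding-space recipe in this line of work looks like the sketch below; `model` and `loss_fn` are hypothetical callables, and this is a generic FGSM-style variant rather than the paper's exact procedure.

```python
import torch

def adversarial_step(model, embeds, labels, loss_fn, epsilon=1e-2):
    """Perturb input embeddings along the loss gradient, then train on
    clean + perturbed inputs (generic FGSM-style adversarial training)."""
    embeds = embeds.detach().requires_grad_(True)
    clean_loss = loss_fn(model(embeds), labels)
    # retain_graph so clean_loss can still be backpropagated afterwards
    grad, = torch.autograd.grad(clean_loss, embeds, retain_graph=True)
    delta = epsilon * grad / (grad.norm(dim=-1, keepdim=True) + 1e-9)
    adv_loss = loss_fn(model(embeds + delta.detach()), labels)
    return clean_loss + adv_loss
```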
no code implementations • ACL 2020 • Lin Zheng, Qinliang Su, Dinghan Shen, Changyou Chen
Generative semantic hashing is a promising technique for large-scale information retrieval thanks to its fast retrieval speed and small memory footprint.
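The speed and memory advantages come from representing items as short binary codes and ranking by Hamming distance. A minimal, self-contained illustration of the retrieval mechanics (random codes only):

```python
import numpy as np

def hamming_search(db_codes, query_code, top_k=10):
    """Rank database items by Hamming distance to the query's binary code."""
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(dists)[:top_k]

db = (np.random.rand(100_000, 64) > 0.5).astype(np.uint8)   # 64-bit codes
q = (np.random.rand(64) > 0.5).astype(np.uint8)
print(hamming_search(db, q)[:5])
```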
no code implementations • ACL 2020 • Pengyu Cheng, Martin Renqiang Min, Dinghan Shen, Christopher Malon, Yizhe Zhang, Yitong Li, Lawrence Carin
Learning disentangled representations of natural language is essential for many NLP tasks, e.g., conditional text generation, style transfer, personalized dialogue systems, etc.
no code implementations • ACL 2020 • Ruiyi Zhang, Changyou Chen, Zhe Gan, Wenlin Wang, Dinghan Shen, Guoyin Wang, Zheng Wen, Lawrence Carin
Auto-regressive text generation models usually focus on local fluency and can produce semantically inconsistent text when generating long passages.
no code implementations • 10 Nov 2019 • Dhanasekar Sundararaman, Vivek Subramanian, Guoyin Wang, Shijing Si, Dinghan Shen, Dong Wang, Lawrence Carin
Attention-based models have shown significant improvement over traditional algorithms in several NLP tasks.
no code implementations • IJCNLP 2019 • Qian Yang, Zhouyuan Huo, Dinghan Shen, Yong Cheng, Wenlin Wang, Guoyin Wang, Lawrence Carin
Generating high-quality paraphrases is a fundamental yet challenging natural language processing task.
no code implementations • 5 Oct 2019 • Pengyu Cheng, Chang Liu, Chunyuan Li, Dinghan Shen, Ricardo Henao, Lawrence Carin
The Straight-Through (ST) estimator is a widely used technique for back-propagating gradients through discrete random variables.
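The ST estimator itself is compact enough to show in full; the Bernoulli variant below is the textbook form of the family this paper analyzes.

```python
import torch

def st_bernoulli(logits):
    """Straight-Through estimator for a Bernoulli variable: the forward
    pass emits a hard 0/1 sample, while the backward pass treats the
    sampling step as the identity, so gradients flow into `logits`."""
    probs = torch.sigmoid(logits)
    hard = torch.bernoulli(probs)
    return probs + (hard - probs).detach()
```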
no code implementations • IJCNLP 2019 • Wei Dong, Qinliang Su, Dinghan Shen, Changyou Chen
Hashing is promising for large-scale information retrieval tasks thanks to the efficiency of distance evaluation between binary codes.
1 code implementation • ACL 2019 • Dinghan Shen, Pengyu Cheng, Dhanasekar Sundararaman, Xinyuan Zhang, Qian Yang, Meng Tang, Asli Celikyilmaz, Lawrence Carin
Vector representations of sentences, trained on massive text corpora, are widely used as generic sentence embeddings across a variety of NLP problems.
no code implementations • ACL 2019 • Liqun Chen, Guoyin Wang, Chenyang Tao, Dinghan Shen, Pengyu Cheng, Xinyuan Zhang, Wenlin Wang, Yizhe Zhang, Lawrence Carin
Constructing highly informative network embeddings is an important tool for network analysis.
no code implementations • ACL 2019 • Xinyuan Zhang, Yi Yang, Siyang Yuan, Dinghan Shen, Lawrence Carin
We present a syntax-infused variational autoencoder (SIVAE) that integrates sentences with their syntactic trees to improve the grammar of generated sentences.
no code implementations • NAACL 2019 • Wenlin Wang, Zhe Gan, Hongteng Xu, Ruiyi Zhang, Guoyin Wang, Dinghan Shen, Changyou Chen, Lawrence Carin
We propose a topic-guided variational auto-encoder (TGVAE) model for text generation.
no code implementations • ACL 2019 • Dinghan Shen, Asli Celikyilmaz, Yizhe Zhang, Liqun Chen, Xin Wang, Jianfeng Gao, Lawrence Carin
Variational autoencoders (VAEs) have received much attention recently as an end-to-end architecture for text generation with latent variables.
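For readers new to text VAEs, the objective in its plainest form is the sketch below; `encoder` and `decoder` are hypothetical modules, and the paper's multi-level latent structure is not reflected here.

```python
import torch
import torch.nn.functional as F

def vae_loss(encoder, decoder, tokens):
    """Negative ELBO of a textbook text VAE: token reconstruction loss
    plus KL from the Gaussian posterior to the standard-normal prior."""
    mu, logvar = encoder(tokens)                           # q(z|x)
    z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterize
    logits = decoder(z, tokens)                            # (batch, seq, vocab)
    rec = F.cross_entropy(logits.transpose(1, 2), tokens)  # reconstruction
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
    return rec + kl
```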
no code implementations • ICLR 2019 • Liqun Chen, Yizhe Zhang, Ruiyi Zhang, Chenyang Tao, Zhe Gan, Haichao Zhang, Bai Li, Dinghan Shen, Changyou Chen, Lawrence Carin
Sequence-to-sequence models are commonly trained via maximum likelihood estimation (MLE).
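The MLE baseline referenced above is ordinary teacher-forcing cross-entropy, sketched here for contrast with the sequence-level objectives this line of work explores:

```python
import torch.nn.functional as F

def mle_loss(logits, targets, pad_id=0):
    """Token-level maximum likelihood: predict each gold next token
    given the gold prefix; padding positions are ignored."""
    # logits: (batch, seq, vocab); targets: (batch, seq)
    return F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           targets.reshape(-1),
                           ignore_index=pad_id)
```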
no code implementations • CVPR 2019 • Xin Wang, Qiuyuan Huang, Asli Celikyilmaz, Jianfeng Gao, Dinghan Shen, Yuan-Fang Wang, William Yang Wang, Lei Zhang
Vision-language navigation (VLN) is the task of navigating an embodied agent to carry out natural language instructions inside real 3D environments.
Ranked #2 on Vision-Language Navigation on Room2Room
no code implementations • 2 Nov 2018 • Ruiyi Zhang, Changyou Chen, Zhe Gan, Wenlin Wang, Liqun Chen, Dinghan Shen, Guoyin Wang, Lawrence Carin
Sequence generation with reinforcement learning (RL) has received significant attention recently.
no code implementations • 27 Sep 2018 • Dinghan Shen, Asli Celikyilmaz, Yizhe Zhang, Liqun Chen, Xin Wang, Lawrence Carin
Variational autoencoders (VAEs) have received much attention recently as an end-to-end architecture for text generation.
1 code implementation • NeurIPS 2018 • Liqun Chen, Shuyang Dai, Chenyang Tao, Dinghan Shen, Zhe Gan, Haichao Zhang, Yizhe Zhang, Lawrence Carin
However, the discrete nature of text hinders the application of GANs to text-generation tasks.
no code implementations • EMNLP 2018 • Dinghan Shen, Xinyuan Zhang, Ricardo Henao, Lawrence Carin
Network embeddings, which learn low-dimensional representations for each vertex in a large-scale network, have received considerable attention in recent years.
no code implementations • NeurIPS 2018 • Xinyuan Zhang, Yitong Li, Dinghan Shen, Lawrence Carin
Textual network embedding leverages rich text information associated with the network to learn low-dimensional vectorial representations of vertices.
2 code implementations • ACL 2018 • Dinghan Shen, Guoyin Wang, Wenlin Wang, Martin Renqiang Min, Qinliang Su, Yizhe Zhang, Chunyuan Li, Ricardo Henao, Lawrence Carin
Many deep learning architectures have been proposed to model the compositionality in text sequences, requiring a substantial number of parameters and expensive computations.
Ranked #1 on Named Entity Recognition (NER) on CoNLL 2000
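The SWEM paper's parameter-free composition is easy to state exactly: pool the word embeddings. The variants below (average, max, and their concatenation) follow the paper's SWEM-aver / SWEM-max / SWEM-concat naming.

```python
import numpy as np

def swem_features(word_vecs):
    """Compose a sequence representation from word embeddings alone.
    word_vecs: (seq_len, dim) array of pre-trained embeddings."""
    avg = word_vecs.mean(axis=0)          # SWEM-aver
    mx = word_vecs.max(axis=0)            # SWEM-max
    return np.concatenate([avg, mx])      # SWEM-concat
```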
1 code implementation • ACL 2018 • Dinghan Shen, Qinliang Su, Paidamoyo Chapfuwa, Wenlin Wang, Guoyin Wang, Lawrence Carin, Ricardo Henao
Semantic hashing has become a powerful paradigm for fast similarity search in many information retrieval systems.
2 code implementations • ACL 2018 • Guoyin Wang, Chunyuan Li, Wenlin Wang, Yizhe Zhang, Dinghan Shen, Xinyuan Zhang, Ricardo Henao, Lawrence Carin
Word embeddings are effective intermediate representations for capturing semantic regularities between words, when learning the representations of text sequences.
Ranked #11 on Text Classification on DBpedia
no code implementations • ICLR 2018 • Dinghan Shen, Guoyin Wang, Wenlin Wang, Martin Renqiang Min, Qinliang Su, Yizhe Zhang, Ricardo Henao, Lawrence Carin
In this paper, we conduct an extensive comparative study between Simple Word Embeddings-based Models (SWEMs), with no compositional parameters, relative to employing word embeddings within RNN/CNN-based models.
no code implementations • 28 Dec 2017 • Wenlin Wang, Zhe Gan, Wenqi Wang, Dinghan Shen, Jiaji Huang, Wei Ping, Sanjeev Satheesh, Lawrence Carin
The TCNLM learns the global semantic coherence of a document via a neural topic model. The probability of each learned latent topic is then used to build a Mixture-of-Experts (MoE) language model, in which each expert (corresponding to one topic) is a recurrent neural network (RNN) that learns the local structure of a word sequence.
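The MoE construction described above can be sketched compactly. The toy module below uses fully independent GRU experts mixed by topic probabilities; the actual TCNLM ties the experts together more economically, so treat this only as a shape-level illustration.

```python
import torch
import torch.nn as nn

class TopicMoELM(nn.Module):
    """Toy topic-mixture-of-experts LM: per-topic RNN experts whose
    hidden states are mixed by topic probabilities from a topic model."""
    def __init__(self, vocab, dim, n_topics):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.experts = nn.ModuleList(
            nn.GRU(dim, dim, batch_first=True) for _ in range(n_topics))
        self.out = nn.Linear(dim, vocab)

    def forward(self, tokens, topic_probs):
        # tokens: (batch, seq); topic_probs: (batch, n_topics)
        x = self.embed(tokens)
        hs = torch.stack([e(x)[0] for e in self.experts], dim=1)  # (b, K, s, d)
        h = (topic_probs[:, :, None, None] * hs).sum(dim=1)       # mix experts
        return self.out(h)                                        # next-token logits
```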
no code implementations • 14 Oct 2017 • Martin Renqiang Min, Hongyu Guo, Dinghan Shen
Parametric embedding methods such as parametric t-SNE (pt-SNE) have been widely adopted for data visualization and out-of-sample data embedding without further computationally expensive optimization or approximation.
no code implementations • EMNLP 2018 • Dinghan Shen, Martin Renqiang Min, Yitong Li, Lawrence Carin
The role of the meta network is to abstract the contextual information of a sentence or document into a set of input-aware filters.
Ranked #13 on Text Classification on DBpedia
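The input-aware-filter idea above is essentially a small hypernetwork: a meta network reads a summary of the sentence and emits the convolution filters that are then applied to that same sentence. The sketch below uses a mean-pooled summary and a single linear filter generator; both choices are illustrative, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContextFilterConv(nn.Module):
    """Generate convolution filters from a sentence summary (meta network),
    then convolve the same sentence with them."""
    def __init__(self, dim, n_filters, width=3):
        super().__init__()
        self.n_filters, self.dim, self.width = n_filters, dim, width
        self.meta = nn.Linear(dim, n_filters * dim * width)  # filter generator

    def forward(self, x):                        # x: (seq_len, dim), one sentence
        summary = x.mean(dim=0)                  # crude sentence summary
        filt = self.meta(summary).view(self.n_filters, self.dim, self.width)
        feats = F.conv1d(x.t().unsqueeze(0),     # (1, dim, seq_len)
                         filt, padding=self.width // 2)
        return feats.squeeze(0)                  # (n_filters, seq_len)
```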
no code implementations • 21 Sep 2017 • Dinghan Shen, Yizhe Zhang, Ricardo Henao, Qinliang Su, Lawrence Carin
A latent-variable model is introduced for text matching, inferring sentence representations by jointly optimizing generative and discriminative objectives.
4 code implementations • NeurIPS 2017 • Yizhe Zhang, Dinghan Shen, Guoyin Wang, Zhe Gan, Ricardo Henao, Lawrence Carin
Learning latent representations from long text sequences is an important first step in many natural language processing applications.
1 code implementation • ICML 2017 • Yizhe Zhang, Zhe Gan, Kai Fan, Zhi Chen, Ricardo Henao, Dinghan Shen, Lawrence Carin
We propose a framework for generating realistic text via adversarial training.