Search Results for author: Dinghan Shen

Found 39 papers, 11 papers with code

What Makes Good In-Context Examples for GPT-3?

no code implementations • DeeLIO (ACL) 2022 • Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin, Weizhu Chen

In this work, we investigate whether there are more effective strategies for judiciously selecting in-context examples (relative to random sampling) that better leverage GPT-3’s in-context learning capabilities. Inspired by the recent success of leveraging a retrieval module to augment neural networks, we propose to retrieve examples that are semantically-similar to a test query sample to formulate its corresponding prompt.

In-Context Learning Natural Language Understanding +4

Paper
Add Code

HiddenCut: Simple Data Augmentation for Natural Language Understanding with Better Generalizability

1 code implementation • ACL 2021 • Jiaao Chen, Dinghan Shen, Weizhu Chen, Diyi Yang

Fine-tuning large pre-trained models with task-specific data has achieved great success in NLP.

Data Augmentation Natural Language Understanding

Paper
Code

HiddenCut: Simple Data Augmentation for Natural Language Understanding with Better Generalization

1 code implementation • 31 May 2021 • Jiaao Chen, Dinghan Shen, Weizhu Chen, Diyi Yang

Fine-tuning large pre-trained models with task-specific data has achieved great success in NLP.

Data Augmentation Natural Language Understanding

Paper
Code

What Makes Good In-Context Examples for GPT-$3$?

3 code implementations • 17 Jan 2021 • Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin, Weizhu Chen

Inspired by the recent success of leveraging a retrieval module to augment large-scale neural network models, we propose to retrieve examples that are semantically-similar to a test sample to formulate its corresponding prompt.

Few-Shot Learning Natural Language Understanding +4

10,165

Paper
Code

MixKD: Towards Efficient Distillation of Large-scale Language Models

no code implementations • ICLR 2021 • Kevin J Liang, Weituo Hao, Dinghan Shen, Yufan Zhou, Weizhu Chen, Changyou Chen, Lawrence Carin

Large-scale language models have recently demonstrated impressive empirical performance.

Data Augmentation Knowledge Distillation

Paper
Add Code

CoDA: Contrast-enhanced and Diversity-promoting Data Augmentation for Natural Language Understanding

no code implementations • ICLR 2021 • Yanru Qu, Dinghan Shen, Yelong Shen, Sandra Sajeev, Jiawei Han, Weizhu Chen

To verify the effectiveness of the proposed framework, we apply CoDA to Transformer-based models on a wide range of natural language understanding tasks.

Data Augmentation Natural Language Understanding

Paper
Add Code

Improving Self-supervised Pre-training via a Fully-Explored Masked Language Model

no code implementations • 12 Oct 2020 • Mingzhi Zheng, Dinghan Shen, Yelong Shen, Weizhu Chen, Lin Xiao

We prove, from a theoretical perspective, that the gradients derived from this new masking schema have a smaller variance and can lead to more efficient self-supervised training.

Ranked #1 on Sentence Classification on ACL-ARC

Language Modelling Sentence Classification

Paper
Add Code

Improving Text Generation with Student-Forcing Optimal Transport

no code implementations • EMNLP 2020 • Guoyin Wang, Chunyuan Li, Jianqiao Li, Hao Fu, Yuh-Chen Lin, Liqun Chen, Yizhe Zhang, Chenyang Tao, Ruiyi Zhang, Wenlin Wang, Dinghan Shen, Qian Yang, Lawrence Carin

An extension is further proposed to improve the OT learning, based on the structural and contextual information of the text sequences.

Machine Translation Text Generation +2

Paper
Add Code

A Simple but Tough-to-Beat Data Augmentation Approach for Natural Language Understanding and Generation

2 code implementations • 29 Sep 2020 • Dinghan Shen, Mingzhi Zheng, Yelong Shen, Yanru Qu, Weizhu Chen

Adversarial training has been shown effective at endowing the learned representations with stronger generalization ability.

Ranked #8 on Machine Translation on IWSLT2014 German-English

Data Augmentation Machine Translation +3

Paper
Code

Generative Semantic Hashing Enhanced via Boltzmann Machines

no code implementations • ACL 2020 • Lin Zheng, Qinliang Su, Dinghan Shen, Changyou Chen

Generative semantic hashing is a promising technique for large-scale information retrieval thanks to its fast retrieval speed and small memory footprint.

Information Retrieval Retrieval

Paper
Add Code

Improving Disentangled Text Representation Learning with Information-Theoretic Guidance

no code implementations • ACL 2020 • Pengyu Cheng, Martin Renqiang Min, Dinghan Shen, Christopher Malon, Yizhe Zhang, Yitong Li, Lawrence Carin

Learning disentangled representations of natural language is essential for many NLP tasks, e. g., conditional text generation, style transfer, personalized dialogue systems, etc.

Conditional Text Generation Representation Learning +2

Paper
Add Code

Improving Adversarial Text Generation by Modeling the Distant Future

no code implementations • ACL 2020 • Ruiyi Zhang, Changyou Chen, Zhe Gan, Wenlin Wang, Dinghan Shen, Guoyin Wang, Zheng Wen, Lawrence Carin

Auto-regressive text generation models usually focus on local fluency, and may cause inconsistent semantic meaning in long text generation.

Adversarial Text Imitation Learning +1

Paper
Add Code

Syntax-Infused Transformer and BERT models for Machine Translation and Natural Language Understanding

no code implementations • 10 Nov 2019 • Dhanasekar Sundararaman, Vivek Subramanian, Guoyin Wang, Shijing Si, Dinghan Shen, Dong Wang, Lawrence Carin

Attention-based models have shown significant improvement over traditional algorithms in several NLP tasks.

Machine Translation Natural Language Understanding +2

Paper
Add Code

An End-to-End Generative Architecture for Paraphrase Generation

no code implementations • IJCNLP 2019 • Qian Yang, Zhouyuan Huo, Dinghan Shen, Yong Cheng, Wenlin Wang, Guoyin Wang, Lawrence Carin

Generating high-quality paraphrases is a fundamental yet challenging natural language processing task.

Paraphrase Generation

Paper
Add Code

Straight-Through Estimator as Projected Wasserstein Gradient Flow

no code implementations • 5 Oct 2019 • Pengyu Cheng, Chang Liu, Chunyuan Li, Dinghan Shen, Ricardo Henao, Lawrence Carin

The Straight-Through (ST) estimator is a widely used technique for back-propagating gradients through discrete random variables.

Paper
Add Code

Document Hashing with Mixture-Prior Generative Models

no code implementations • IJCNLP 2019 • Wei Dong, Qinliang Su, Dinghan Shen, Changyou Chen

Hashing is promising for large-scale information retrieval tasks thanks to the efficiency of distance evaluation between binary codes.

Information Retrieval Retrieval

Paper
Add Code

Learning Compressed Sentence Representations for On-Device Text Processing

1 code implementation • ACL 2019 • Dinghan Shen, Pengyu Cheng, Dhanasekar Sundararaman, Xinyuan Zhang, Qian Yang, Meng Tang, Asli Celikyilmaz, Lawrence Carin

Vector representations of sentences, trained on massive text corpora, are widely used as generic sentence embeddings across a variety of NLP problems.

Retrieval Sentence +1

Paper
Code

Improving Textual Network Embedding with Global Attention via Optimal Transport

no code implementations • ACL 2019 • Liqun Chen, Guoyin Wang, Chenyang Tao, Dinghan Shen, Pengyu Cheng, Xinyuan Zhang, Wenlin Wang, Yizhe Zhang, Lawrence Carin

Constituting highly informative network embeddings is an important tool for network analysis.

Network Embedding

Paper
Add Code

Syntax-Infused Variational Autoencoder for Text Generation

no code implementations • ACL 2019 • Xinyuan Zhang, Yi Yang, Siyang Yuan, Dinghan Shen, Lawrence Carin

We present a syntax-infused variational autoencoder (SIVAE), that integrates sentences with their syntactic trees to improve the grammar of generated sentences.

Sentence Text Generation

Paper
Add Code

Topic-Guided Variational Auto-Encoder for Text Generation

no code implementations • NAACL 2019 • Wenlin Wang, Zhe Gan, Hongteng Xu, Ruiyi Zhang, Guoyin Wang, Dinghan Shen, Changyou Chen, Lawrence Carin

We propose a topic-guided variational auto-encoder (TGVAE) model for text generation.

Conditional Text Generation

Paper
Add Code

Topic-Guided Variational Autoencoders for Text Generation

no code implementations • 17 Mar 2019 • Wenlin Wang, Zhe Gan, Hongteng Xu, Ruiyi Zhang, Guoyin Wang, Dinghan Shen, Changyou Chen, Lawrence Carin

We propose a topic-guided variational autoencoder (TGVAE) model for text generation.

Conditional Text Generation

Paper
Add Code

Towards Generating Long and Coherent Text with Multi-Level Latent Variable Models

no code implementations • ACL 2019 • Dinghan Shen, Asli Celikyilmaz, Yizhe Zhang, Liqun Chen, Xin Wang, Jianfeng Gao, Lawrence Carin

Variational autoencoders (VAEs) have received much attention recently as an end-to-end architecture for text generation with latent variables.

Sentence Text Generation

Paper
Add Code

Improving Sequence-to-Sequence Learning via Optimal Transport

no code implementations • ICLR 2019 • Liqun Chen, Yizhe Zhang, Ruiyi Zhang, Chenyang Tao, Zhe Gan, Haichao Zhang, Bai Li, Dinghan Shen, Changyou Chen, Lawrence Carin

Sequence-to-sequence models are commonly trained via maximum likelihood estimation (MLE).

Abstractive Text Summarization Image Captioning +3

Paper
Add Code

Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation

no code implementations • CVPR 2019 • Xin Wang, Qiuyuan Huang, Asli Celikyilmaz, Jianfeng Gao, Dinghan Shen, Yuan-Fang Wang, William Yang Wang, Lei Zhang

Vision-language navigation (VLN) is the task of navigating an embodied agent to carry out natural language instructions inside real 3D environments.

Ranked #2 on Vision-Language Navigation on Room2Room

Imitation Learning Reinforcement Learning (RL) +2

Paper
Add Code

Sequence Generation with Guider Network

no code implementations • 2 Nov 2018 • Ruiyi Zhang, Changyou Chen, Zhe Gan, Wenlin Wang, Liqun Chen, Dinghan Shen, Guoyin Wang, Lawrence Carin

Sequence generation with reinforcement learning (RL) has received significant attention recently.

Reinforcement Learning (RL)

Paper
Add Code

Hierarchically-Structured Variational Autoencoders for Long Text Generation

no code implementations • 27 Sep 2018 • Dinghan Shen, Asli Celikyilmaz, Yizhe Zhang, Liqun Chen, Xin Wang, Lawrence Carin

Variational autoencoders (VAEs) have received much attention recently as an end-to-end architecture for text generation.

Sentence Text Generation

Paper
Add Code

Adversarial Text Generation via Feature-Mover's Distance

1 code implementation • NeurIPS 2018 • Liqun Chen, Shuyang Dai, Chenyang Tao, Dinghan Shen, Zhe Gan, Haichao Zhang, Yizhe Zhang, Lawrence Carin

However, the discrete nature of text hinders the application of GAN to text-generation tasks.

Adversarial Text Style Transfer +1

Paper
Code

Improved Semantic-Aware Network Embedding with Fine-Grained Word Alignment

no code implementations • EMNLP 2018 • Dinghan Shen, Xinyuan Zhang, Ricardo Henao, Lawrence Carin

Network embeddings, which learn low-dimensional representations for each vertex in a large-scale network, have received considerable attention in recent years.

Link Prediction Network Embedding +1

Paper
Add Code

Diffusion Maps for Textual Network Embedding

no code implementations • NeurIPS 2018 • Xinyuan Zhang, Yitong Li, Dinghan Shen, Lawrence Carin

Textual network embedding leverages rich text information associated with the network to learn low-dimensional vectorial representations of vertices.

General Classification Link Prediction +1

Paper
Add Code

Baseline Needs More Love: On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms

2 code implementations • ACL 2018 • Dinghan Shen, Guoyin Wang, Wenlin Wang, Martin Renqiang Min, Qinliang Su, Yizhe Zhang, Chunyuan Li, Ricardo Henao, Lawrence Carin

Many deep learning architectures have been proposed to model the compositionality in text sequences, requiring a substantial number of parameters and expensive computations.

Ranked #1 on Named Entity Recognition (NER) on CoNLL 2000

Document Classification General Classification +4

284

Paper
Code

NASH: Toward End-to-End Neural Architecture for Generative Semantic Hashing

1 code implementation • ACL 2018 • Dinghan Shen, Qinliang Su, Paidamoyo Chapfuwa, Wenlin Wang, Guoyin Wang, Lawrence Carin, Ricardo Henao

Semantic hashing has become a powerful paradigm for fast similarity search in many information retrieval systems.

Information Retrieval Retrieval +1

Paper
Code

Joint Embedding of Words and Labels for Text Classification

2 code implementations • ACL 2018 • Guoyin Wang, Chunyuan Li, Wenlin Wang, Yizhe Zhang, Dinghan Shen, Xinyuan Zhang, Ricardo Henao, Lawrence Carin

Word embeddings are effective intermediate representations for capturing semantic regularities between words, when learning the representations of text sequences.

Ranked #11 on Text Classification on DBpedia

General Classification Sentiment Analysis +2

323

Paper
Code

On the Use of Word Embeddings Alone to Represent Natural Language Sequences

no code implementations • ICLR 2018 • Dinghan Shen, Guoyin Wang, Wenlin Wang, Martin Renqiang Min, Qinliang Su, Yizhe Zhang, Ricardo Henao, Lawrence Carin

In this paper, we conduct an extensive comparative study between Simple Word Embeddings-based Models (SWEMs), with no compositional parameters, relative to employing word embeddings within RNN/CNN-based models.

Sentence Word Embeddings

Paper
Add Code

Topic Compositional Neural Language Model

no code implementations • 28 Dec 2017 • Wenlin Wang, Zhe Gan, Wenqi Wang, Dinghan Shen, Jiaji Huang, Wei Ping, Sanjeev Satheesh, Lawrence Carin

The TCNLM learns the global semantic coherence of a document via a neural topic model, and the probability of each learned latent topic is further used to build a Mixture-of-Experts (MoE) language model, where each expert (corresponding to one topic) is a recurrent neural network (RNN) that accounts for learning the local structure of a word sequence.

Language Modelling

Paper
Add Code

Parametric t-Distributed Stochastic Exemplar-centered Embedding

no code implementations • 14 Oct 2017 • Martin Renqiang Min, Hongyu Guo, Dinghan Shen

Parametric embedding methods such as parametric t-SNE (pt-SNE) have been widely adopted for data visualization and out-of-sample data embedding without further computationally expensive optimization or approximation.

Data Visualization

Paper
Add Code

Learning Context-Sensitive Convolutional Filters for Text Processing

no code implementations • EMNLP 2018 • Dinghan Shen, Martin Renqiang Min, Yitong Li, Lawrence Carin

The role of meta network is to abstract the contextual information of a sentence or document into a set of input-aware filters.

Ranked #13 on Text Classification on DBpedia

Paraphrase Identification Sentence +2

Paper
Add Code

Deconvolutional Latent-Variable Model for Text Sequence Matching

no code implementations • 21 Sep 2017 • Dinghan Shen, Yizhe Zhang, Ricardo Henao, Qinliang Su, Lawrence Carin

A latent-variable model is introduced for text matching, inferring sentence representations by jointly optimizing generative and discriminative objectives.

Sentence Text Matching

Paper
Add Code

Deconvolutional Paragraph Representation Learning

4 code implementations • NeurIPS 2017 • Yizhe Zhang, Dinghan Shen, Guoyin Wang, Zhe Gan, Ricardo Henao, Lawrence Carin

Learning latent representations from long text sequences is an important first step in many natural language processing applications.

General Classification Representation Learning +1

151

Paper
Code

Adversarial Feature Matching for Text Generation

1 code implementation • ICML 2017 • Yizhe Zhang, Zhe Gan, Kai Fan, Zhi Chen, Ricardo Henao, Dinghan Shen, Lawrence Carin

We propose a framework for generating realistic text via adversarial training.

Generative Adversarial Network Text Generation

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.