Search Results for author: Dinghan Shen

Found 36 papers, 8 papers with code

What Makes Good In-Context Examples for GPT-3?

no code implementations 17 Jan 2021 Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin, Weizhu Chen

Inspired by the recent success of leveraging a retrieval module to augment large-scale neural network models, we propose to retrieve examples that are semantically-similar to a test sample to formulate its corresponding prompt.

Few-Shot Learning Natural Language Understanding +2

Improving Self-supervised Pre-training via a Fully-Explored Masked Language Model

no code implementations 12 Oct 2020 Mingzhi Zheng, Dinghan Shen, Yelong Shen, Weizhu Chen, Lin Xiao

We prove, from a theoretical perspective, that the gradients derived from this new masking schema have a smaller variance and can lead to more efficient self-supervised training.

Language Modelling Sentence Classification

Generative Semantic Hashing Enhanced via Boltzmann Machines

no code implementations ACL 2020 Lin Zheng, Qinliang Su, Dinghan Shen, Changyou Chen

Generative semantic hashing is a promising technique for large-scale information retrieval thanks to its fast retrieval speed and small memory footprint.

Information Retrieval

Improving Disentangled Text Representation Learning with Information-Theoretic Guidance

no code implementations ACL 2020 Pengyu Cheng, Martin Renqiang Min, Dinghan Shen, Christopher Malon, Yizhe Zhang, Yitong Li, Lawrence Carin

Learning disentangled representations of natural language is essential for many NLP tasks, e.g., conditional text generation, style transfer, personalized dialogue systems, etc.

Conditional Text Generation Representation Learning +2

Straight-Through Estimator as Projected Wasserstein Gradient Flow

no code implementations 5 Oct 2019 Pengyu Cheng, Chang Liu, Chunyuan Li, Dinghan Shen, Ricardo Henao, Lawrence Carin

The Straight-Through (ST) estimator is a widely used technique for back-propagating gradients through discrete random variables.

Document Hashing with Mixture-Prior Generative Models

no code implementations IJCNLP 2019 Wei Dong, Qinliang Su, Dinghan Shen, Changyou Chen

Hashing is promising for large-scale information retrieval tasks thanks to the efficiency of distance evaluation between binary codes.

Information Retrieval

Learning Compressed Sentence Representations for On-Device Text Processing

1 code implementation ACL 2019 Dinghan Shen, Pengyu Cheng, Dhanasekar Sundararaman, Xinyuan Zhang, Qian Yang, Meng Tang, Asli Celikyilmaz, Lawrence Carin

Vector representations of sentences, trained on massive text corpora, are widely used as generic sentence embeddings across a variety of NLP problems.

Sentence Embeddings

Syntax-Infused Variational Autoencoder for Text Generation

no code implementations ACL 2019 Xinyuan Zhang, Yi Yang, Siyang Yuan, Dinghan Shen, Lawrence Carin

We present a syntax-infused variational autoencoder (SIVAE) that integrates sentences with their syntactic trees to improve the grammar of generated sentences.

Text Generation

Sequence Generation with Guider Network

no code implementations 2 Nov 2018 Ruiyi Zhang, Changyou Chen, Zhe Gan, Wenlin Wang, Liqun Chen, Dinghan Shen, Guoyin Wang, Lawrence Carin

Sequence generation with reinforcement learning (RL) has received significant attention recently.

Improved Semantic-Aware Network Embedding with Fine-Grained Word Alignment

no code implementations EMNLP 2018 Dinghan Shen, Xinyuan Zhang, Ricardo Henao, Lawrence Carin

Network embeddings, which learn low-dimensional representations for each vertex in a large-scale network, have received considerable attention in recent years.

Link Prediction Network Embedding +1

Diffusion Maps for Textual Network Embedding

no code implementations NeurIPS 2018 Xinyuan Zhang, Yitong Li, Dinghan Shen, Lawrence Carin

Textual network embedding leverages rich text information associated with the network to learn low-dimensional vectorial representations of vertices.

General Classification Link Prediction +1

Joint Embedding of Words and Labels for Text Classification

2 code implementations ACL 2018 Guoyin Wang, Chunyuan Li, Wenlin Wang, Yizhe Zhang, Dinghan Shen, Xinyuan Zhang, Ricardo Henao, Lawrence Carin

Word embeddings are effective intermediate representations for capturing semantic regularities between words, when learning the representations of text sequences.

General Classification Sentiment Analysis +1

On the Use of Word Embeddings Alone to Represent Natural Language Sequences

no code implementations ICLR 2018 Dinghan Shen, Guoyin Wang, Wenlin Wang, Martin Renqiang Min, Qinliang Su, Yizhe Zhang, Ricardo Henao, Lawrence Carin

In this paper, we conduct an extensive comparative study between Simple Word Embeddings-based Models (SWEMs), with no compositional parameters, relative to employing word embeddings within RNN/CNN-based models.

Word Embeddings

Topic Compositional Neural Language Model

no code implementations 28 Dec 2017 Wenlin Wang, Zhe Gan, Wenqi Wang, Dinghan Shen, Jiaji Huang, Wei Ping, Sanjeev Satheesh, Lawrence Carin

The Topic Compositional Neural Language Model (TCNLM) learns the global semantic coherence of a document via a neural topic model, and the probability of each learned latent topic is further used to build a Mixture-of-Experts (MoE) language model, where each expert (corresponding to one topic) is a recurrent neural network (RNN) that accounts for learning the local structure of a word sequence.

Language Modelling

Parametric t-Distributed Stochastic Exemplar-centered Embedding

no code implementations 14 Oct 2017 Martin Renqiang Min, Hongyu Guo, Dinghan Shen

Parametric embedding methods such as parametric t-SNE (pt-SNE) have been widely adopted for data visualization and out-of-sample data embedding without further computationally expensive optimization or approximation.

Data Visualization

Deconvolutional Latent-Variable Model for Text Sequence Matching

no code implementations 21 Sep 2017 Dinghan Shen, Yizhe Zhang, Ricardo Henao, Qinliang Su, Lawrence Carin

A latent-variable model is introduced for text matching, inferring sentence representations by jointly optimizing generative and discriminative objectives.

Latent Variable Models Text Matching

Deconvolutional Paragraph Representation Learning

4 code implementations NeurIPS 2017 Yizhe Zhang, Dinghan Shen, Guoyin Wang, Zhe Gan, Ricardo Henao, Lawrence Carin

Learning latent representations from long text sequences is an important first step in many natural language processing applications.

General Classification Representation Learning +2
