Search Results for author: Zichao Yang

Found 27 papers, 13 papers with code

Dense-to-Sparse Gate for Mixture-of-Experts

1 code implementation 29 Dec 2021 Xiaonan Nie, Shijie Cao, Xupeng Miao, Lingxiao Ma, Jilong Xue, Youshan Miao, Zichao Yang, Zhi Yang, Bin Cui

However, we found that the current approach of jointly training experts and the sparse gate introduces a negative impact on model accuracy, diminishing the efficiency of expensive large-scale model training.
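The idea of moving from a dense to a sparse gate can be illustrated with a temperature-annealed softmax: early in training every expert receives some weight, and as the temperature drops the routing concentrates on the top expert. This is only a minimal sketch of the annealing idea; the function names and schedule are illustrative, not the paper's exact mechanism.

```python
import math

def gate(logits, temperature):
    """Softmax gate over experts; low temperature -> near one-hot (sparse)."""
    exps = [math.exp(l / temperature) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

logits = [2.0, 1.0, 0.5, -1.0]       # router scores for 4 experts
dense = gate(logits, temperature=10.0)   # early training: soft mixture of experts
sparse = gate(logits, temperature=0.1)   # after annealing: mass on the top expert
```

Annealing lets the experts and the gate co-adapt before the routing hardens, which is the kind of joint-training interference the abstract refers to.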

NAFS: A Simple yet Tough-to-Beat Baseline for Graph Representation Learning

no code implementations 29 Sep 2021 Wentao Zhang, Zeang Sheng, Mingyu Yang, Yang Li, Yu Shen, Zhi Yang, Zichao Yang, Bin Cui

First, GNNs can learn higher-order structural information by stacking more layers, but cannot go very deep due to the over-smoothing issue.

Graph Representation Learning Link Prediction +1
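Parameter-free feature smoothing of the kind NAFS builds on can be sketched as repeated neighborhood averaging, keeping every hop's result rather than only the deepest one. The toy graph and the simple mean combination below are illustrative; the actual method uses a node-adaptive weighting of the hops.

```python
# Toy sketch: multi-hop feature smoothing on a 3-node path graph 0-1-2.
adj = {0: [1], 1: [0, 2], 2: [1]}   # neighbor lists
feats = [[1.0], [0.0], [0.0]]       # one-dimensional node features

def smooth(x):
    """One hop of mean aggregation over each node's closed neighborhood."""
    out = []
    for v, nbrs in adj.items():
        vals = [x[v][0]] + [x[u][0] for u in nbrs]
        out.append([sum(vals) / len(vals)])
    return out

hops = [feats]
for _ in range(3):                  # smooth for 3 hops, keeping every scale
    hops.append(smooth(hops[-1]))
# Combine all hops (plain average here; NAFS weights them per node).
combined = [[sum(h[v][0] for h in hops) / len(hops)] for v in range(3)]
```

Keeping shallow and deep hops together is what avoids the over-smoothing that pure stacking suffers from.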

Don't Take It Literally: An Edit-Invariant Sequence Loss for Text Generation

1 code implementation 29 Jun 2021 Guangyi Liu, Zichao Yang, Tianhua Tao, Xiaodan Liang, Junwei Bao, Zhen Li, Xiaodong He, Shuguang Cui, Zhiting Hu

Such a training objective is sub-optimal when the target sequence is not perfect, e.g., when the target sequence is corrupted with noise, or when only weak sequence supervision is available.

Machine Translation Style Transfer +3

Local Additivity Based Data Augmentation for Semi-supervised NER

1 code implementation EMNLP 2020 Jiaao Chen, Zhenghui Wang, Ran Tian, Zichao Yang, Diyi Yang

Named Entity Recognition (NER) is one of the first stages in deep language understanding yet current NER models heavily rely on human-annotated data.

Data Augmentation Named Entity Recognition +1

Progressive Generation of Long Text with Pretrained Language Models

1 code implementation NAACL 2021 Bowen Tan, Zichao Yang, Maruan Al-Shedivat, Eric P. Xing, Zhiting Hu

However, as our systematic examination reveals, it is still challenging for such models to generate coherent long passages of text (e.g., 1000 tokens), especially when the models are fine-tuned to the target domain on a small corpus.

Pretrained Language Models

MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification

2 code implementations ACL 2020 Jiaao Chen, Zichao Yang, Diyi Yang

This paper presents MixText, a semi-supervised learning method for text classification, which uses our newly designed data augmentation method called TMix.

Data Augmentation General Classification +2
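The TMix augmentation behind MixText can be sketched as mixup applied to hidden states: two examples' representations and label distributions are linearly interpolated with a Beta-distributed weight. The flat vectors below stand in for a transformer's intermediate hidden layer; names and the alpha value are illustrative.

```python
import random

def tmix(h_a, h_b, y_a, y_b, alpha=0.75):
    """Interpolate two hidden states and their label distributions
    (mixup in hidden space); a Beta(alpha, alpha) draw sets the weight."""
    lam = random.betavariate(alpha, alpha)
    lam = max(lam, 1 - lam)  # keep the first example dominant
    h = [lam * a + (1 - lam) * b for a, b in zip(h_a, h_b)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y_a, y_b)]
    return h, y

random.seed(0)
h, y = tmix([1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0])
```

The mixed pair (h, y) is then fed through the remaining layers as an additional training example, which is what makes the augmentation label-aware rather than purely textual.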

Multimodal Intelligence: Representation Learning, Information Fusion, and Applications

no code implementations 10 Nov 2019 Chao Zhang, Zichao Yang, Xiaodong He, Li Deng

This review provides a comprehensive analysis of recent works on multimodal deep learning from three perspectives: learning multimodal representations, fusing multimodal signals at various levels, and multimodal applications.

Multimodal Deep Learning Question Answering +5

Let's Make Your Request More Persuasive: Modeling Persuasive Strategies via Semi-Supervised Neural Nets on Crowdfunding Platforms

no code implementations NAACL 2019 Diyi Yang, Jiaao Chen, Zichao Yang, Dan Jurafsky, Eduard Hovy

Modeling what makes a request persuasive - eliciting the desired response from a reader - is critical to the study of propaganda, behavioral economics, and advertising.

Data-to-Text Generation with Style Imitation

1 code implementation Findings of the Association for Computational Linguistics 2020 Shuai Lin, Wentao Wang, Zichao Yang, Xiaodan Liang, Frank F. Xu, Eric Xing, Zhiting Hu

That is, the model learns to imitate the writing style of any given exemplar sentence, with automatic adaptions to faithfully describe the content record.

Data-to-Text Generation Style Transfer

Connecting the Dots Between MLE and RL for Sequence Prediction

no code implementations 24 Nov 2018 Bowen Tan, Zhiting Hu, Zichao Yang, Ruslan Salakhutdinov, Eric Xing

Reinforcement learning such as policy gradient addresses the issue but can have prohibitively poor exploration efficiency.

Imitation Learning Machine Translation +2

Connecting the Dots Between MLE and RL for Sequence Generation

no code implementations ICLR Workshop drlStructPred 2019 Bowen Tan*, Zhiting Hu*, Zichao Yang, Ruslan Salakhutdinov, Eric P. Xing

We present a generalized entropy regularized policy optimization formulation, and show that the apparently divergent algorithms can all be reformulated as special instances of the framework, with the only difference being the configurations of reward function and a couple of hyperparameters.

Machine Translation Text Summarization +1
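A generalized entropy-regularized policy optimization objective of the kind described above can be written roughly as follows (our notation; the paper's exact reward definitions and hyperparameter configurations may differ):

```latex
% Sketch of an entropy-regularized policy optimization objective.
% q is a variational distribution over output sequences y,
% p_theta is the model policy, R is a reward, H is Shannon entropy.
\mathcal{L}(q, \theta) =
    \mathbb{E}_{q(\mathbf{y}\mid\mathbf{x})}\big[ R(\mathbf{y}\mid\mathbf{x}) \big]
    - \alpha\,\mathrm{KL}\big( q(\mathbf{y}\mid\mathbf{x}) \,\|\, p_\theta(\mathbf{y}\mid\mathbf{x}) \big)
    + \beta\,\mathrm{H}(q)
```

Different choices of the reward R and of the weights (alpha, beta) then recover MLE-like and policy-gradient-like training as special cases, which is the unification the abstract describes.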

Differentiable Expected BLEU for Text Generation

no code implementations 27 Sep 2018 Wentao Wang, Zhiting Hu, Zichao Yang, Haoran Shi, Eric P. Xing

Neural text generation models such as recurrent networks are typically trained by maximizing data log-likelihood based on cross entropy.

Image Captioning Machine Translation +2
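The cross-entropy baseline that this paper contrasts with BLEU-based training can be sketched as teacher-forced negative log-likelihood of each gold token under the model's per-step distribution. The toy distributions below are illustrative.

```python
import math

def xent_loss(target_ids, step_probs):
    """Teacher-forced cross entropy: average negative log-probability
    of each gold token under the model's per-step distribution."""
    nll = -sum(math.log(p[t]) for t, p in zip(target_ids, step_probs))
    return nll / len(target_ids)

# Two-step toy example: vocabulary of size 3, gold sequence is [0, 2].
step_probs = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]
loss = xent_loss([0, 2], step_probs)
```

The mismatch between this token-level objective and corpus-level metrics like BLEU is what motivates a differentiable BLEU surrogate.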

Unsupervised Text Style Transfer using Language Models as Discriminators

1 code implementation NeurIPS 2018 Zichao Yang, Zhiting Hu, Chris Dyer, Eric P. Xing, Taylor Berg-Kirkpatrick

Binary classifiers are often employed as discriminators in GAN-based unsupervised style transfer systems to ensure that transferred sentences are similar to sentences in the target domain.

Decipherment Language Modelling +4
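Using a language model as the discriminator amounts to scoring a transferred sentence by its likelihood under a target-domain LM instead of by a binary classifier. The toy bigram model below is a stand-in for a trained neural LM; all probabilities here are made up for illustration.

```python
import math

# Toy target-domain bigram LM; in the paper's setting a real LM is trained
# on target-style sentences and its likelihood replaces a binary classifier.
bigram_logp = {
    ("<s>", "the"): math.log(0.6), ("the", "movie"): math.log(0.5),
    ("movie", "rocks"): math.log(0.4), ("movie", "the"): math.log(0.05),
}
FLOOR = math.log(1e-4)  # unseen bigrams get a small floored probability

def lm_score(tokens):
    """Average log-probability under the LM: higher = more target-like."""
    pairs = list(zip(["<s>"] + tokens, tokens))
    return sum(bigram_logp.get(p, FLOOR) for p in pairs) / len(pairs)

fluent = lm_score(["the", "movie", "rocks"])
scrambled = lm_score(["rocks", "movie", "the"])
```

Because the LM assigns a graded, token-level score rather than a single real/fake bit, it gives the generator a denser training signal.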

On Unifying Deep Generative Models

no code implementations ICLR 2018 Zhiting Hu, Zichao Yang, Ruslan Salakhutdinov, Eric P. Xing

Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs), two emerging families of deep generative models, have largely been treated as distinct paradigms and studied independently.

Toward Controlled Generation of Text

3 code implementations ICML 2017 Zhiting Hu, Zichao Yang, Xiaodan Liang, Ruslan Salakhutdinov, Eric P. Xing

Generic generation and manipulation of text is challenging and has had limited success compared to recent deep generative modeling in the visual domain.

Improved Variational Autoencoders for Text Modeling using Dilated Convolutions

3 code implementations ICML 2017 Zichao Yang, Zhiting Hu, Ruslan Salakhutdinov, Taylor Berg-Kirkpatrick

Recent work on generative modeling of text has found that variational auto-encoders (VAE) incorporating LSTM decoders perform worse than simpler LSTM language models (Bowman et al., 2015).

Text Generation

Reference-Aware Language Models

no code implementations EMNLP 2017 Zichao Yang, Phil Blunsom, Chris Dyer, Wang Ling

We propose a general class of language models that treat reference as an explicit stochastic latent variable.

Dialogue Generation Recipe Generation

Neural Machine Translation with Recurrent Attention Modeling

no code implementations EACL 2017 Zichao Yang, Zhiting Hu, Yuntian Deng, Chris Dyer, Alex Smola

Knowing which words have been attended to in previous time steps while generating a translation is a rich source of information for predicting what words will be attended to in the future.

Machine Translation Translation

Deep Fried Convnets

1 code implementation ICCV 2015 Zichao Yang, Marcin Moczulski, Misha Denil, Nando de Freitas, Alex Smola, Le Song, Ziyu Wang

The fully connected layers of a deep convolutional neural network typically contain over 90% of the network parameters, and consume the majority of the memory required to store the network parameters.

Image Classification
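The claim that fully connected layers dominate the parameter count can be checked with back-of-envelope arithmetic for an AlexNet-like network (layer sizes below are the standard approximate ones, used purely for illustration):

```python
# Parameter counts (weights only, biases ignored) for an AlexNet-like net.
conv_params = (
    96 * (3 * 11 * 11) +        # conv1
    256 * (96 * 5 * 5) +        # conv2
    384 * (256 * 3 * 3) +       # conv3
    384 * (384 * 3 * 3) +       # conv4
    256 * (384 * 3 * 3)         # conv5
)
fc_params = (
    9216 * 4096 +               # fc6
    4096 * 4096 +               # fc7
    4096 * 1000                 # fc8 (1000-way classifier)
)
fc_share = fc_params / (conv_params + fc_params)
```

The fully connected layers come out at roughly 94% of the weights, which is why replacing them with a cheap structured transform (the "deep fried" idea) shrinks the model so dramatically.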

À la Carte - Learning Fast Kernels

no code implementations 19 Dec 2014 Zichao Yang, Alexander J. Smola, Le Song, Andrew Gordon Wilson

Kernel methods have great promise for learning rich statistical representations of large modern datasets.
