Search Results for author: Xuancheng Ren

Found 60 papers, 32 papers with code

Rethinking Denoised Auto-Encoding in Language Pre-Training

no code implementations EMNLP 2021 Fuli Luo, Pengcheng Yang, Shicheng Li, Xuancheng Ren, Xu Sun, Songfang Huang, Fei Huang

Pre-trained self-supervised models such as BERT have achieved striking success in learning sequence representations, especially for natural language processing.

Natural Language Understanding Sentence

OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist Models

1 code implementation 8 Dec 2022 Jinze Bai, Rui Men, Hao Yang, Xuancheng Ren, Kai Dang, Yichang Zhang, Xiaohuan Zhou, Peng Wang, Sinan Tan, An Yang, Zeyu Cui, Yu Han, Shuai Bai, Wenbin Ge, Jianxin Ma, Junyang Lin, Jingren Zhou, Chang Zhou

As a starting point, we provide presets of 7 different modalities and 23 highly-diverse example tasks in OFASys, with which we also develop a first-in-kind, single model, OFA+, that can handle text, image, speech, video, and motion data.

Multi-Task Learning

DiMBERT: Learning Vision-Language Grounded Representations with Disentangled Multimodal-Attention

no code implementations 28 Oct 2022 Fenglin Liu, Xian Wu, Shen Ge, Xuancheng Ren, Wei Fan, Xu Sun, Yuexian Zou

To enhance the correlation between vision and language in disentangled spaces, we introduce the visual concepts to DiMBERT which represent visual information in textual format.

Image Captioning Language Modelling +3

Prophet Attention: Predicting Attention with Future Attention for Image Captioning

no code implementations 19 Oct 2022 Fenglin Liu, Xuancheng Ren, Xian Wu, Wei Fan, Yuexian Zou, Xu Sun

Especially for image captioning, the attention based models are expected to ground correct image regions with proper generated words.

Image Captioning

From Mimicking to Integrating: Knowledge Integration for Pre-Trained Language Models

1 code implementation 11 Oct 2022 Lei Li, Yankai Lin, Xuancheng Ren, Guangxiang Zhao, Peng Li, Jie Zhou, Xu Sun

We then design a Model Uncertainty-aware Knowledge Integration (MUKI) framework to recover the golden supervision for the student.

Delving into the Openness of CLIP

1 code implementation 4 Jun 2022 Shuhuai Ren, Lei Li, Xuancheng Ren, Guangxiang Zhao, Xu Sun

However, evaluating the openness of CLIP-like models is challenging, as the models are open to arbitrary vocabulary in theory, but their accuracy varies in practice.

Image Classification Text Matching

Hierarchical Inductive Transfer for Continual Dialogue Learning

no code implementations Findings (ACL) 2022 Shaoxiong Feng, Xuancheng Ren, Kan Li, Xu Sun

However, for the continual increase of online chit-chat scenarios, directly fine-tuning these models for each of the new tasks not only explodes the capacity of the dialogue system on the embedded devices but also causes knowledge forgetting on pre-trained models and knowledge interference among diverse dialogue tasks.

General Knowledge

Model Uncertainty-Aware Knowledge Amalgamation for Pre-Trained Language Models

no code implementations 14 Dec 2021 Lei Li, Yankai Lin, Xuancheng Ren, Guangxiang Zhao, Peng Li, Jie Zhou, Xu Sun

As many fine-tuned pre-trained language models (PLMs) with promising performance are generously released, investigating better ways to reuse these models is vital as it can greatly reduce the retraining computational cost and the potential environmental side-effects.

Well-classified Examples are Underestimated in Classification with Deep Neural Networks

1 code implementation 13 Oct 2021 Guangxiang Zhao, Wenkai Yang, Xuancheng Ren, Lei Li, Yunfang Wu, Xu Sun

The conventional wisdom behind learning deep classification models is to focus on bad-classified examples and ignore well-classified examples that are far from the decision boundary.

Graph Classification Imbalanced Classification +4

Topology-Imbalance Learning for Semi-Supervised Node Classification

1 code implementation NeurIPS 2021 Deli Chen, Yankai Lin, Guangxiang Zhao, Xuancheng Ren, Peng Li, Jie Zhou, Xu Sun

The class imbalance problem, as an important issue in learning node representations, has drawn increasing attention from the community.

Classification Node Classification

Adversarial Parameter Defense by Multi-Step Risk Minimization

no code implementations 7 Sep 2021 Zhiyuan Zhang, Ruixuan Luo, Xuancheng Ren, Qi Su, Liangyou Li, Xu Sun

To enhance neural networks, we propose the adversarial parameter defense algorithm that minimizes the average risk of multiple adversarial parameter corruptions.

Neural Network Surgery: Injecting Data Patterns into Pre-trained Models with Minimal Instance-wise Side Effects

no code implementations NAACL 2021 Zhiyuan Zhang, Xuancheng Ren, Qi Su, Xu Sun, Bin He

Motivated by neuroscientific evidence and theoretical results, we demonstrate that side effects can be controlled by the number of changed parameters and thus, we propose to conduct neural network surgery by only modifying a limited number of parameters.

A Global Past-Future Early Exit Method for Accelerating Inference of Pre-trained Language Models

1 code implementation NAACL 2021 Kaiyuan Liao, Yi Zhang, Xuancheng Ren, Qi Su, Xu Sun, Bin He

We first take into consideration all the linguistic information embedded in the past layers and then take a further step to engage the future information which is originally inaccessible for predictions.

Rethinking Skip Connection with Layer Normalization in Transformers and ResNets

no code implementations 15 May 2021 Fenglin Liu, Xuancheng Ren, Zhiyuan Zhang, Xu Sun, Yuexian Zou

In this work, we investigate how the scale factors in the effectiveness of the skip connection and reveal that a trivial adjustment of the scale will lead to spurious gradient exploding or vanishing in line with the deepness of the models, which could be addressed by normalization, in particular, layer normalization, which induces consistent improvements over the plain skip connection.

Image Classification Machine Translation +1

Be Careful about Poisoned Word Embeddings: Exploring the Vulnerability of the Embedding Layers in NLP Models

1 code implementation NAACL 2021 Wenkai Yang, Lei Li, Zhiyuan Zhang, Xuancheng Ren, Xu Sun, Bin He

However, in this paper, we find that it is possible to hack the model in a data-free way by modifying one single word embedding vector, with almost no accuracy sacrificed on clean samples.

Backdoor Attack Data Poisoning +4

Multi-View Feature Representation for Dialogue Generation with Bidirectional Distillation

no code implementations 22 Feb 2021 Shaoxiong Feng, Xuancheng Ren, Kan Li, Xu Sun

The finding of general knowledge is further hindered by the unidirectional distillation, as the student should obey the teacher and may discard some knowledge that is truly general but refuted by the teacher.

Dialogue Generation General Knowledge +1

High-Likelihood Area Matters --- Rewarding Near-Correct Predictions Under Imbalanced Distributions

no code implementations 1 Jan 2021 Guangxiang Zhao, Lei Li, Xuancheng Ren, Xu Sun, Bin He

We find in practice that the high-likelihood area contains correct predictions for tail classes and it plays a vital role in learning imbalanced class distributions.

Vocal Bursts Intensity Prediction

Rethinking the Promotion Brought by Contrastive Learning to Semi-Supervised Node Classification

no code implementations 14 Dec 2020 Deli Chen, Yankai Lin, Lei Li, Xuancheng Ren, Peng Li, Jie Zhou, Xu Sun

Graph Contrastive Learning (GCL) has proven highly effective in promoting the performance of Semi-Supervised Node Classification (SSNC).

Contrastive Learning Graph Learning +1

Rethinking Skip Connection with Layer Normalization

no code implementations COLING 2020 Fenglin Liu, Xuancheng Ren, Zhiyuan Zhang, Xu Sun, Yuexian Zou

In this work, we investigate how the scale factors in the effectiveness of the skip connection and reveal that a trivial adjustment of the scale will lead to spurious gradient exploding or vanishing in line with the deepness of the models, which could be addressed by normalization, in particular, layer normalization, which induces consistent improvements over the plain skip connection.

Image Classification Machine Translation +1

Prophet Attention: Predicting Attention with Future Attention

no code implementations NeurIPS 2020 Fenglin Liu, Xuancheng Ren, Xian Wu, Shen Ge, Wei Fan, Yuexian Zou, Xu Sun

Especially for image captioning, the attention based models are expected to ground correct image regions with proper generated words.

Image Captioning

CAPT: Contrastive Pre-Training for Learning Denoised Sequence Representations

no code implementations 13 Oct 2020 Fuli Luo, Pengcheng Yang, Shicheng Li, Xuancheng Ren, Xu Sun

Pre-trained self-supervised models such as BERT have achieved striking success in learning sequence representations, especially for natural language processing.

Natural Language Understanding Sentence

Regularizing Dialogue Generation by Imitating Implicit Scenarios

no code implementations EMNLP 2020 Shaoxiong Feng, Xuancheng Ren, Hongshen Chen, Bin Sun, Kan Li, Xu Sun

Human dialogues are scenario-based and appropriate responses generally relate to the latent context knowledge entailed by the specific scenario.

Dialogue Generation Diversity +1

Collaborative Group Learning

no code implementations 16 Sep 2020 Shaoxiong Feng, Hongshen Chen, Xuancheng Ren, Zhuoye Ding, Kan Li, Xu Sun

Collaborative learning has successfully applied knowledge transfer to guide a pool of small student networks towards robust local minima.

Computational Efficiency Inductive Bias +1

Exploring the Vulnerability of Deep Neural Networks: A Study of Parameter Corruption

1 code implementation 10 Jun 2020 Xu Sun, Zhiyuan Zhang, Xuancheng Ren, Ruixuan Luo, Liangyou Li

We argue that the vulnerability of model parameters is of crucial value to the study of model robustness and generalization but little research has been devoted to understanding this matter.

Rethinking and Improving Natural Language Generation with Layer-Wise Multi-View Decoding

no code implementations 16 May 2020 Fenglin Liu, Xuancheng Ren, Guangxiang Zhao, Chenyu You, Xuewei Ma, Xian Wu, Xu Sun

While it is common practice to draw information from only the last encoder layer, recent work has proposed to use representations from different encoder layers for diversified levels of information.

Abstractive Text Summarization Decoder +6

Exploring and Distilling Cross-Modal Information for Image Captioning

no code implementations 28 Feb 2020 Fenglin Liu, Xuancheng Ren, Yuanxin Liu, Kai Lei, Xu Sun

Recently, attention-based encoder-decoder models have been used extensively in image captioning.

Attribute Decoder +1

Explicit Sparse Transformer: Concentrated Attention Through Explicit Selection

2 code implementations 25 Dec 2019 Guangxiang Zhao, Junyang Lin, Zhiyuan Zhang, Xuancheng Ren, Qi Su, Xu Sun

Self-attention based Transformer has demonstrated the state-of-the-art performances in a number of natural language processing tasks.

Image Captioning Language Modelling +2

An Adaptive and Momental Bound Method for Stochastic Learning

2 code implementations 27 Oct 2019 Jianbang Ding, Xuancheng Ren, Ruixuan Luo, Xu Sun

The dynamic learning rate bounds are based on the exponential moving averages of the adaptive learning rates themselves, which smooth out unexpected large learning rates and stabilize the training of deep neural networks.

Stochastic Optimization

Sparse Transformer: Concentrated Attention Through Explicit Selection

no code implementations 25 Sep 2019 Guangxiang Zhao, Junyang Lin, Zhiyuan Zhang, Xuancheng Ren, Xu Sun

Extensive experimental results on a series of natural language processing tasks, including neural machine translation, image captioning, and language modeling, all demonstrate the advantages of Sparse Transformer in model performance.

Image Captioning Language Modelling +2

PKUSEG: A Toolkit for Multi-Domain Chinese Word Segmentation

4 code implementations 27 Jun 2019 Ruixuan Luo, Jingjing Xu, Yi Zhang, Zhiyuan Zhang, Xuancheng Ren, Xu Sun

Through this method, we generate synthetic data using a large amount of unlabeled data in the target domain and then obtain a word segmentation model for the target domain.

Chinese Word Segmentation Domain Adaptation +3

A Hierarchical Reinforced Sequence Operation Method for Unsupervised Text Style Transfer

1 code implementation ACL 2019 Chen Wu, Xuancheng Ren, Fuli Luo, Xu Sun

Unsupervised text style transfer aims to alter text styles while preserving the content, without aligned data for supervision.

Sentence Style Transfer +2

Memorized Sparse Backpropagation

no code implementations 24 May 2019 Zhiyuan Zhang, Pengcheng Yang, Xuancheng Ren, Qi Su, Xu Sun

Neural network learning is usually time-consuming since backpropagation needs to compute full gradients and backpropagate them across multiple layers.

Learning When to Concentrate or Divert Attention: Self-Adaptive Attention Temperature for Neural Machine Translation

1 code implementation EMNLP 2018 Junyang Lin, Xu Sun, Xuancheng Ren, Muyu Li, Qi Su

Most of the Neural Machine Translation (NMT) models are based on the sequence-to-sequence (Seq2Seq) model with an encoder-decoder framework equipped with the attention mechanism.

Decoder Machine Translation +2

A Skeleton-Based Model for Promoting Coherence Among Sentences in Narrative Story Generation

1 code implementation EMNLP 2018 Jingjing Xu, Xuancheng Ren, Yi Zhang, Qi Zeng, Xiaoyan Cai, Xu Sun

Compared to the state-of-the-art models, our skeleton-based model can generate significantly more coherent text according to human evaluation and automatic evaluation.

Sentence Story Generation

Sememe Prediction: Learning Semantic Knowledge from Unstructured Textual Wiki Descriptions

no code implementations 16 Aug 2018 Wei Li, Xuancheng Ren, Damai Dai, Yunfang Wu, Houfeng Wang, Xu Sun

In the experiments, we take a real-world sememe knowledge base HowNet and the corresponding descriptions of the words in Baidu Wiki for training and evaluation.

Deconvolution-Based Global Decoding for Neural Machine Translation

1 code implementation COLING 2018 Junyang Lin, Xu Sun, Xuancheng Ren, Shuming Ma, Jinsong Su, Qi Su

A great proportion of sequence-to-sequence (Seq2Seq) models for Neural Machine Translation (NMT) adopt Recurrent Neural Network (RNN) to generate translation word by word following a sequential order.

Machine Translation NMT +1

Query and Output: Generating Words by Querying Distributed Word Representations for Paraphrase Generation

1 code implementation NAACL 2018 Shuming Ma, Xu Sun, Wei Li, Sujian Li, Wenjie Li, Xuancheng Ren

The existing sequence-to-sequence model tends to memorize the words and the patterns in the training dataset instead of learning the meaning of the words.

Abstractive Text Summarization Decoder +3

Building an Ellipsis-aware Chinese Dependency Treebank for Web Text

1 code implementation LREC 2018 Xuancheng Ren, Xu Sun, Ji Wen, Bingzhen Wei, Weidong Zhan, Zhiyuan Zhang

Web 2.0 has brought with it numerous user-produced data revealing one's thoughts, experiences, and knowledge, which are a great source for many tasks, such as information extraction, and knowledge base construction.

Dependency Parsing Sentence

Hybrid Oracle: Making Use of Ambiguity in Transition-based Chinese Dependency Parsing

1 code implementation 28 Nov 2017 Xuancheng Ren, Xu Sun

In the training of transition-based dependency parsers, an oracle is used to predict a transition sequence for a sentence and its gold tree.

Chinese Dependency Parsing Dependency Parsing +1

Does Higher Order LSTM Have Better Accuracy for Segmenting and Labeling Sequence Data?

1 code implementation COLING 2018 Yi Zhang, Xu Sun, Shuming Ma, Yang Yang, Xuancheng Ren

In our work, we first design a new model called "high order LSTM" to predict multiple tags for the current token which contains not only the current tag but also the previous several tags.

Chunking NER +1

Training Simplification and Model Simplification for Deep Learning: A Minimal Effort Back Propagation Method

3 code implementations 17 Nov 2017 Xu Sun, Xuancheng Ren, Shuming Ma, Bingzhen Wei, Wei Li, Jingjing Xu, Houfeng Wang, Yi Zhang

Based on the sparsified gradients, we further simplify the model by eliminating the rows or columns that are seldom updated, which will reduce the computational cost both in the training and decoding, and potentially accelerate decoding in real-world applications.

Label Embedding Network: Learning Label Representation for Soft Training of Deep Networks

1 code implementation ICLR 2018 Xu Sun, Bingzhen Wei, Xuancheng Ren, Shuming Ma

We propose a method, called Label Embedding Network, which can learn label representation (label embedding) during the training process of deep networks.

Minimal Effort Back Propagation for Convolutional Neural Networks

no code implementations 18 Sep 2017 Bingzhen Wei, Xu Sun, Xuancheng Ren, Jingjing Xu

As traditional neural networks consume a significant amount of computing resources during back propagation, Sun et al. (2017) propose a simple yet effective technique to alleviate this problem.

meProp: Sparsified Back Propagation for Accelerated Deep Learning with Reduced Overfitting

2 code implementations ICML 2017 Xu Sun, Xuancheng Ren, Shuming Ma, Houfeng Wang

In back propagation, only a small subset of the full gradient is computed to update the model parameters.

Towards Easier and Faster Sequence Labeling for Natural Language Processing: A Search-based Probabilistic Online Learning Framework (SAPO)

4 code implementations 29 Mar 2015 Xu Sun, Shuming Ma, Yi Zhang, Xuancheng Ren

We show that this method with fast training and theoretical guarantee of convergence, which is easy to implement, can support search-based optimization and obtain top accuracy.
