Search Results for author: Xu Sun

Found 196 papers, 96 papers with code

Position Offset Label Prediction for Grammatical Error Correction

no code implementations COLING 2022 Xiuyu Wu, Jingsong Yu, Xu Sun, Yunfang Wu

We introduce a novel position offset label prediction subtask to the encoder-decoder architecture for grammatical error correction (GEC) task.

Data Augmentation Grammatical Error Correction +2

Rethinking Denoised Auto-Encoding in Language Pre-Training

no code implementations EMNLP 2021 Fuli Luo, Pengcheng Yang, Shicheng Li, Xuancheng Ren, Xu Sun, Songfang Huang, Fei Huang

Pre-trained self-supervised models such as BERT have achieved striking success in learning sequence representations, especially for natural language processing.

Natural Language Understanding Sentence

Translation as Cross-Domain Knowledge: Attention Augmentation for Unsupervised Cross-Domain Segmenting and Labeling Tasks

1 code implementation Findings (EMNLP) 2021 Ruixuan Luo, Yi Zhang, Sishuo Chen, Xu Sun

The absence of word delimiters or inflection that could indicate segment boundaries or word semantics makes Chinese text harder to understand, and intensifies the demand for word-level semantic knowledge to accomplish the tagging goal in Chinese segmenting and labeling tasks.

Translation

Towards Multimodal Video Paragraph Captioning Models Robust to Missing Modality

1 code implementation 28 Mar 2024 Sishuo Chen, Lei Li, Shuhuai Ren, Rundong Gao, Yuanxin Liu, Xiaohan Bi, Xu Sun, Lu Hou

Video paragraph captioning (VPC) involves generating detailed narratives for long videos, utilizing supportive modalities such as speech and event boundaries.

Data Augmentation Video Understanding

TempCompass: Do Video LLMs Really Understand Videos?

1 code implementation 1 Mar 2024 Yuanxin Liu, Shicheng Li, Yi Liu, Yuxiang Wang, Shuhuai Ren, Lei Li, Sishuo Chen, Xu Sun, Lu Hou

Motivated by these two problems, we propose the TempCompass benchmark, which introduces a diversity of temporal aspects and task formats.

Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents

1 code implementation 17 Feb 2024 Wenkai Yang, Xiaohan Bi, Yankai Lin, Sishuo Chen, Jie Zhou, Xu Sun

We first formulate a general framework of agent backdoor attacks, then we present a thorough analysis on the different forms of agent backdoor attacks.

Backdoor Attack Data Poisoning

TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding

1 code implementation 4 Dec 2023 Shuhuai Ren, Linli Yao, Shicheng Li, Xu Sun, Lu Hou

This work proposes TimeChat, a time-sensitive multimodal large language model specifically designed for long video understanding.

Dense Captioning Highlight Detection +5

RECALL: A Benchmark for LLMs Robustness against External Counterfactual Knowledge

no code implementations 14 Nov 2023 Yi Liu, Lianzhe Huang, Shicheng Li, Sishuo Chen, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun

Therefore, to evaluate the ability of LLMs to discern the reliability of external knowledge, we create a benchmark from existing knowledge bases.

counterfactual Knowledge Graphs +2

TESTA: Temporal-Spatial Token Aggregation for Long-form Video-Language Understanding

1 code implementation 29 Oct 2023 Shuhuai Ren, Sishuo Chen, Shicheng Li, Xu Sun, Lu Hou

TESTA can reduce the number of visual tokens by 75% and thus accelerate video encoding.

Ranked #1 on Video Retrieval on Condensed Movies (using extra training data)

Language Modelling Retrieval +2

Incorporating Pre-trained Model Prompting in Multimodal Stock Volume Movement Prediction

1 code implementation 11 Sep 2023 Ruibo Chen, Zhiyuan Zhang, Yi Liu, Ruihan Bao, Keiko Harimoto, Xu Sun

Existing multimodal works that train models from scratch face the problem of lacking universal knowledge when modeling financial news.

Time Series

MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning

1 code implementation 25 Aug 2023 Bang Yang, Fenglin Liu, Xian Wu, YaoWei Wang, Xu Sun, Yuexian Zou

To deal with the label shortage problem, we present a simple yet effective zero-shot approach MultiCapCLIP that can generate visual captions for different scenarios and languages without any labeled vision-caption pairs of downstream datasets.

Image Captioning Video Captioning

Towards Codable Watermarking for Injecting Multi-bits Information to LLMs

1 code implementation 29 Jul 2023 Lean Wang, Wenkai Yang, Deli Chen, Hao Zhou, Yankai Lin, Fandong Meng, Jie Zhou, Xu Sun

As large language models (LLMs) generate texts with increasing fluency and realism, there is a growing need to identify the source of texts to prevent the abuse of LLMs.

Language Modelling

M³IT: A Large-Scale Dataset towards Multi-Modal Multilingual Instruction Tuning

no code implementations 7 Jun 2023 Lei Li, Yuwei Yin, Shicheng Li, Liang Chen, Peiyi Wang, Shuhuai Ren, Mukai Li, Yazheng Yang, Jingjing Xu, Xu Sun, Lingpeng Kong, Qi Liu

To tackle this challenge and promote research in the vision-language field, we introduce the Multi-Modal, Multilingual Instruction Tuning (M³IT) dataset, designed to optimize VLM alignment with human instructions.

World Knowledge

Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning

1 code implementation 23 May 2023 Lean Wang, Lei Li, Damai Dai, Deli Chen, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun

In-context learning (ICL) emerges as a promising capability of large language models (LLMs) by providing them with demonstration examples to perform diverse tasks.

In-Context Learning

Can Language Models Understand Physical Concepts?

1 code implementation 23 May 2023 Lei Li, Jingjing Xu, Qingxiu Dong, Ce Zheng, Qi Liu, Lingpeng Kong, Xu Sun

Language models (LMs) gradually become general-purpose interfaces in the interactive and embodied world, where the understanding of physical concepts is an essential prerequisite.

Communication Efficient Federated Learning for Multilingual Neural Machine Translation with Adapter

1 code implementation 21 May 2023 Yi Liu, Xiaohan Bi, Lei Li, Sishuo Chen, Wenkai Yang, Xu Sun

However, as pre-trained language models (PLMs) continue to increase in size, the communication cost for transmitting parameters during synchronization has become a training speed bottleneck.

Clustering Federated Learning +2

PALM: Open Fundus Photograph Dataset with Pathologic Myopia Recognition and Anatomical Structure Annotation

1 code implementation 13 May 2023 Huihui Fang, Fei Li, Junde Wu, Huazhu Fu, Xu Sun, José Ignacio Orlando, Hrvoje Bogunović, Xiulan Zhang, Yanwu Xu

Our database comprises 1200 images with associated labels for the pathologic myopia category and manual annotations of the optic disc, the position of the fovea and delineations of lesions such as patchy retinal atrophy (including peripapillary atrophy) and retinal detachment.

Diffusion Theory as a Scalpel: Detecting and Purifying Poisonous Dimensions in Pre-trained Language Models Caused by Backdoor or Bias

no code implementations 8 May 2023 Zhiyuan Zhang, Deli Chen, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun

To settle this issue, we propose the Fine-purifying approach, which utilizes the diffusion theory to study the dynamic process of fine-tuning for finding potentially poisonous dimensions.

Fine-Tuning Deteriorates General Textual Out-of-Distribution Detection by Distorting Task-Agnostic Features

2 code implementations 30 Jan 2023 Sishuo Chen, Wenkai Yang, Xiaohan Bi, Xu Sun

We find that: (1) no existing method behaves well in both settings; (2) fine-tuning PLMs on in-distribution data benefits detecting semantic shifts but severely deteriorates detecting non-semantic shifts, which can be attributed to the distortion of task-agnostic features.

Out-of-Distribution Detection Out of Distribution (OOD) Detection

Integrating Local Real Data with Global Gradient Prototypes for Classifier Re-Balancing in Federated Long-Tailed Learning

no code implementations 25 Jan 2023 Wenkai Yang, Deli Chen, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun

Federated Learning (FL) has become a popular distributed learning paradigm that involves multiple clients training a global model collaboratively in a data privacy-preserving manner.

Federated Learning Privacy Preserving

When to Trust Aggregated Gradients: Addressing Negative Client Sampling in Federated Learning

no code implementations 25 Jan 2023 Wenkai Yang, Yankai Lin, Guangxiang Zhao, Peng Li, Jie Zhou, Xu Sun

Federated Learning has become a widely-used framework which allows learning a global model on decentralized local datasets under the condition of protecting local data privacy.

Federated Learning text-classification +1

A Survey on In-context Learning

1 code implementation 31 Dec 2022 Qingxiu Dong, Damai Dai, Ce Zheng, Zhiyong Wu, Baobao Chang, Xu Sun, Jingjing Xu, Lei Li, Zhifang Sui

With the increasing ability of large language models (LLMs), in-context learning (ICL) has become a new paradigm for natural language processing (NLP), where LLMs make predictions only based on contexts augmented with a few examples.

In-Context Learning

Aligning Source Visual and Target Language Domains for Unpaired Video Captioning

no code implementations 22 Nov 2022 Fenglin Liu, Xian Wu, Chenyu You, Shen Ge, Yuexian Zou, Xu Sun

To this end, we introduce the unpaired video captioning task, which aims to train models without coupled video-caption pairs in the target language.

Translation Video Captioning

Gradient Knowledge Distillation for Pre-trained Language Models

1 code implementation 2 Nov 2022 Lean Wang, Lei Li, Xu Sun

Knowledge distillation (KD) is an effective framework to transfer knowledge from a large-scale teacher to a compact yet well-performing student.

Knowledge Distillation

DiMBERT: Learning Vision-Language Grounded Representations with Disentangled Multimodal-Attention

no code implementations 28 Oct 2022 Fenglin Liu, Xian Wu, Shen Ge, Xuancheng Ren, Wei Fan, Xu Sun, Yuexian Zou

To enhance the correlation between vision and language in disentangled spaces, we introduce the visual concepts to DiMBERT which represent visual information in textual format.

Image Captioning Language Modelling +3

Generating Accurate and Faithful Discharge Instructions: Task, Dataset, and Model

2 code implementations 23 Oct 2022 Fenglin Liu, Bang Yang, Chenyu You, Xian Wu, Shen Ge, Zhangdaihong Liu, Xu Sun, Yang Yang, David A. Clifton

We build a benchmark clinical dataset and propose the Re3Writer, which imitates the working patterns of physicians to first retrieve related working experience from historical PIs written by physicians, then reason related medical knowledge.

Prophet Attention: Predicting Attention with Future Attention for Image Captioning

no code implementations 19 Oct 2022 Fenglin Liu, Xuancheng Ren, Xian Wu, Wei Fan, Yuexian Zou, Xu Sun

Especially for image captioning, the attention based models are expected to ground correct image regions with proper generated words.

Image Captioning

Fine-mixing: Mitigating Backdoors in Fine-tuned Language Models

1 code implementation 18 Oct 2022 Zhiyuan Zhang, Lingjuan Lyu, Xingjun Ma, Chenguang Wang, Xu Sun

In this work, we take the first step to exploit the pre-trained (unfine-tuned) weights to mitigate backdoors in fine-tuned language models.

Language Modelling Sentence +4

Expose Backdoors on the Way: A Feature-Based Efficient Defense against Textual Backdoor Attacks

1 code implementation 14 Oct 2022 Sishuo Chen, Wenkai Yang, Zhiyuan Zhang, Xiaohan Bi, Xu Sun

In this work, we take the first step to investigate the unconcealment of textual poisoned samples at the intermediate-feature level and propose a feature-based efficient online defense method.

backdoor defense Sentiment Analysis

Holistic Sentence Embeddings for Better Out-of-Distribution Detection

1 code implementation 14 Oct 2022 Sishuo Chen, Xiaohan Bi, Rundong Gao, Xu Sun

On the basis of the observations that token averaging and layer combination contribute to improving OOD detection, we propose a simple embedding approach named Avg-Avg, which averages all token representations from each intermediate layer as the sentence embedding and significantly surpasses the state-of-the-art on a comprehensive suite of benchmarks by a 9.33% FAR95 margin.

Avg Out-of-Distribution Detection +4

Dim-Krum: Backdoor-Resistant Federated Learning for NLP with Dimension-wise Krum-Based Aggregation

no code implementations 13 Oct 2022 Zhiyuan Zhang, Qi Su, Xu Sun

NLP attacks tend to have small relative backdoor strengths, which may result in the failure of robust federated aggregation methods for NLP attacks.

Federated Learning

Stock Trading Volume Prediction with Dual-Process Meta-Learning

1 code implementation 11 Oct 2022 Ruibo Chen, Wei Li, Zhiyuan Zhang, Ruihan Bao, Keiko Harimoto, Xu Sun

Our method can model the common pattern behind different stocks with a meta-learner, while modeling the specific pattern for each stock across time spans with stock-dependent parameters.

Algorithmic Trading Meta-Learning

From Mimicking to Integrating: Knowledge Integration for Pre-Trained Language Models

1 code implementation 11 Oct 2022 Lei Li, Yankai Lin, Xuancheng Ren, Guangxiang Zhao, Peng Li, Jie Zhou, Xu Sun

We then design a Model Uncertainty-aware Knowledge Integration (MUKI) framework to recover the golden supervision for the student.

Distributional Correlation-Aware Knowledge Distillation for Stock Trading Volume Prediction

1 code implementation 4 Aug 2022 Lei Li, Zhiyuan Zhang, Ruihan Bao, Keiko Harimoto, Xu Sun

Traditional knowledge distillation in classification problems transfers the knowledge via class correlations in the soft label produced by teacher models, which are not available in regression problems like stock trading volume prediction.

Knowledge Distillation regression

Delving into the Openness of CLIP

1 code implementation 4 Jun 2022 Shuhuai Ren, Lei Li, Xuancheng Ren, Guangxiang Zhao, Xu Sun

However, evaluating the openness of CLIP-like models is challenging, as the models are open to arbitrary vocabulary in theory, but their accuracy varies in practice.

Image Classification Text Matching

Hierarchical Inductive Transfer for Continual Dialogue Learning

no code implementations Findings (ACL) 2022 Shaoxiong Feng, Xuancheng Ren, Kan Li, Xu Sun

However, for the continual increase of online chit-chat scenarios, directly fine-tuning these models for each of the new tasks not only explodes the capacity of the dialogue system on the embedded devices but also causes knowledge forgetting on pre-trained models and knowledge interference among diverse dialogue tasks.

General Knowledge

DFTR: Depth-supervised Fusion Transformer for Salient Object Detection

no code implementations 12 Mar 2022 Heqin Zhu, Xu Sun, Yuexiang Li, Kai Ma, S. Kevin Zhou, Yefeng Zheng

This paper, for the first time, seeks to expand the applicability of depth supervision to the Transformer architecture.

Benchmarking Object +3

ADAM Challenge: Detecting Age-related Macular Degeneration from Fundus Images

no code implementations 16 Feb 2022 Huihui Fang, Fei Li, Huazhu Fu, Xu Sun, Xingxing Cao, Fengbin Lin, Jaemin Son, Sunho Kim, Gwenole Quellec, Sarah Matta, Sharath M Shankaranarayana, Yi-Ting Chen, Chuen-heng Wang, Nisarg A. Shah, Chia-Yen Lee, Chih-Chung Hsu, Hai Xie, Baiying Lei, Ujjwal Baid, Shubham Innani, Kang Dang, Wenxiu Shi, Ravi Kamble, Nitin Singhal, Ching-Wei Wang, Shih-Chang Lo, José Ignacio Orlando, Hrvoje Bogunović, Xiulan Zhang, Yanwu Xu, iChallenge-AMD study group

The ADAM challenge consisted of four tasks which cover the main aspects of detecting and characterizing AMD from fundus images, including detection of AMD, detection and segmentation of optic disc, localization of fovea, and detection and segmentation of lesions.

Model Uncertainty-Aware Knowledge Amalgamation for Pre-Trained Language Models

no code implementations 14 Dec 2021 Lei Li, Yankai Lin, Xuancheng Ren, Guangxiang Zhao, Peng Li, Jie Zhou, Xu Sun

As many fine-tuned pre-trained language models (PLMs) with promising performance are generously released, investigating better ways to reuse these models is vital as it can greatly reduce the retraining computational cost and the potential environmental side-effects.

KNAS: Green Neural Architecture Search

1 code implementation 26 Nov 2021 Jingjing Xu, Liang Zhao, Junyang Lin, Rundong Gao, Xu Sun, Hongxia Yang

Many existing neural architecture search (NAS) solutions rely on downstream training for architecture evaluation, which takes enormous computations.

Image Classification Neural Architecture Search +2

RAP: Robustness-Aware Perturbations for Defending against Backdoor Attacks on NLP Models

1 code implementation EMNLP 2021 Wenkai Yang, Yankai Lin, Peng Li, Jie Zhou, Xu Sun

Motivated by this observation, we construct a word-based robustness-aware perturbation to distinguish poisoned samples from clean samples to defend against the backdoor attacks on natural language processing (NLP) models.

Sentiment Analysis

Well-classified Examples are Underestimated in Classification with Deep Neural Networks

1 code implementation 13 Oct 2021 Guangxiang Zhao, Wenkai Yang, Xuancheng Ren, Lei Li, Yunfang Wu, Xu Sun

The conventional wisdom behind learning deep classification models is to focus on bad-classified examples and ignore well-classified examples that are far from the decision boundary.

Graph Classification imbalanced classification +4

Topology-Imbalance Learning for Semi-Supervised Node Classification

1 code implementation NeurIPS 2021 Deli Chen, Yankai Lin, Guangxiang Zhao, Xuancheng Ren, Peng Li, Jie Zhou, Xu Sun

The class imbalance problem, as an important issue in learning node representations, has drawn increasing attention from the community.

Classification Node Classification

Dynamic Knowledge Distillation for Pre-trained Language Models

1 code implementation EMNLP 2021 Lei Li, Yankai Lin, Shuhuai Ren, Peng Li, Jie Zhou, Xu Sun

Knowledge distillation (KD) has been proved effective for compressing large-scale pre-trained language models.

Knowledge Distillation

Adversarial Parameter Defense by Multi-Step Risk Minimization

no code implementations 7 Sep 2021 Zhiyuan Zhang, Ruixuan Luo, Xuancheng Ren, Qi Su, Liangyou Li, Xu Sun

To enhance neural networks, we propose the adversarial parameter defense algorithm that minimizes the average risk of multiple adversarial parameter corruptions.

How to Inject Backdoors with Better Consistency: Logit Anchoring on Clean Data

no code implementations ICLR 2022 Zhiyuan Zhang, Lingjuan Lyu, Weiqiang Wang, Lichao Sun, Xu Sun

In this work, we observe an interesting phenomenon that the variations of parameters are always AWPs when tuning the trained clean model to inject backdoors.

Text AutoAugment: Learning Compositional Augmentation Policy for Text Classification

1 code implementation EMNLP 2021 Shuhuai Ren, Jinchao Zhang, Lei Li, Xu Sun, Jie Zhou

Data augmentation aims to enrich training samples for alleviating the overfitting issue in low-resource or class-imbalanced situations.

Bayesian Optimization Data Augmentation +2

ASAT: Adaptively Scaled Adversarial Training in Time Series

no code implementations 20 Aug 2021 Zhiyuan Zhang, Wei Li, Ruihan Bao, Keiko Harimoto, Yunfang Wu, Xu Sun

Besides the security concerns of potential adversarial examples, adversarial training can also improve the generalization ability of neural networks, train robust neural networks, and provide interpretability for neural networks.

Adversarial Robustness Time Series +1

Rethinking Stealthiness of Backdoor Attack against NLP Models

1 code implementation ACL 2021 Wenkai Yang, Yankai Lin, Peng Li, Jie Zhou, Xu Sun

In this work, we point out a potential problem of current backdoor attacking research: its evaluation ignores the stealthiness of backdoor attacks, and most existing backdoor attacking methods are not stealthy either to system deployers or to system users.

Backdoor Attack Data Augmentation +2

Contrastive Attention for Automatic Chest X-ray Report Generation

no code implementations Findings (ACL) 2021 Fenglin Liu, Changchang Yin, Xian Wu, Shen Ge, Ping Zhang, Yuexian Zou, Xu Sun

In addition, according to the analysis, the CA model can help existing models better attend to the abnormal regions and provide more accurate descriptions which are crucial for an interpretable diagnosis.

Neural Network Surgery: Injecting Data Patterns into Pre-trained Models with Minimal Instance-wise Side Effects

no code implementations NAACL 2021 Zhiyuan Zhang, Xuancheng Ren, Qi Su, Xu Sun, Bin He

Motivated by neuroscientific evidence and theoretical results, we demonstrate that side effects can be controlled by the number of changed parameters and thus, we propose to conduct neural network surgery by only modifying a limited number of parameters.

A Global Past-Future Early Exit Method for Accelerating Inference of Pre-trained Language Models

1 code implementation NAACL 2021 Kaiyuan Liao, Yi Zhang, Xuancheng Ren, Qi Su, Xu Sun, Bin He

We first take into consideration all the linguistic information embedded in the past layers and then take a further step to engage the future information which is originally inaccessible for predictions.

Alleviating the Knowledge-Language Inconsistency: A Study for Deep Commonsense Knowledge

no code implementations 28 May 2021 Yi Zhang, Lei Li, Yunfang Wu, Qi Su, Xu Sun

Knowledge facts are typically represented by relational triples, while we observe that some commonsense facts are represented by the triples whose forms are inconsistent with the expression of language.

Learning Relation Alignment for Calibrated Cross-modal Retrieval

1 code implementation ACL 2021 Shuhuai Ren, Junyang Lin, Guangxiang Zhao, Rui Men, An Yang, Jingren Zhou, Xu Sun, Hongxia Yang

To bridge the semantic gap between the two modalities, previous studies mainly focus on word-region alignment at the object level, lacking the matching between the linguistic relation among the words and the visual relation among the regions.

Cross-Modal Retrieval Image-to-Text Retrieval +4

Rethinking Skip Connection with Layer Normalization in Transformers and ResNets

no code implementations 15 May 2021 Fenglin Liu, Xuancheng Ren, Zhiyuan Zhang, Xu Sun, Yuexian Zou

In this work, we investigate how the scale factors in the effectiveness of the skip connection and reveal that a trivial adjustment of the scale will lead to spurious gradient exploding or vanishing in line with the deepness of the models, which could be addressed by normalization, in particular, layer normalization, which induces consistent improvements over the plain skip connection.

Image Classification Machine Translation +1

Be Careful about Poisoned Word Embeddings: Exploring the Vulnerability of the Embedding Layers in NLP Models

1 code implementation NAACL 2021 Wenkai Yang, Lei Li, Zhiyuan Zhang, Xuancheng Ren, Xu Sun, Bin He

However, in this paper, we find that it is possible to hack the model in a data-free way by modifying one single word embedding vector, with almost no accuracy sacrificed on clean samples.

Backdoor Attack Data Poisoning +4

Multi-View Feature Representation for Dialogue Generation with Bidirectional Distillation

no code implementations 22 Feb 2021 Shaoxiong Feng, Xuancheng Ren, Kan Li, Xu Sun

The finding of general knowledge is further hindered by the unidirectional distillation, as the student should obey the teacher and may discard some knowledge that is truly general but refuted by the teacher.

Dialogue Generation General Knowledge +1

High-Likelihood Area Matters --- Rewarding Near-Correct Predictions Under Imbalanced Distributions

no code implementations 1 Jan 2021 Guangxiang Zhao, Lei Li, Xuancheng Ren, Xu Sun, Bin He

We find in practice that the high-likelihood area contains correct predictions for tail classes and it plays a vital role in learning imbalanced class distributions.

Vocal Bursts Intensity Prediction

A Gradient-based Kernel Approach for Efficient Network Architecture Search

no code implementations 1 Jan 2021 Jingjing Xu, Liang Zhao, Junyang Lin, Xu Sun, Hongxia Yang

Inspired by our new finding, we explore a simple yet effective network architecture search (NAS) approach that leverages gradient correlation and gradient values to find well-performing architectures.

Image Classification text-classification +1

Rethinking the Promotion Brought by Contrastive Learning to Semi-Supervised Node Classification

no code implementations 14 Dec 2020 Deli Chen, Yankai Lin, Lei Li, Xuancheng Ren, Peng Li, Jie Zhou, Xu Sun

Graph Contrastive Learning (GCL) has proven highly effective in promoting the performance of Semi-Supervised Node Classification (SSNC).

Contrastive Learning Graph Learning +1

EQG-RACE: Examination-Type Question Generation

1 code implementation 11 Dec 2020 Xin Jia, Wenjie Zhou, Xu Sun, Yunfang Wu

Question Generation (QG) is an essential component of the automatic intelligent tutoring systems, which aims to generate high-quality questions for facilitating the reading practice and assessments.

Question Generation Question-Generation +2

Prophet Attention: Predicting Attention with Future Attention

no code implementations NeurIPS 2020 Fenglin Liu, Xuancheng Ren, Xian Wu, Shen Ge, Wei Fan, Yuexian Zou, Xu Sun

Especially for image captioning, the attention based models are expected to ground correct image regions with proper generated words.

Image Captioning

Rethinking Skip Connection with Layer Normalization

no code implementations COLING 2020 Fenglin Liu, Xuancheng Ren, Zhiyuan Zhang, Xu Sun, Yuexian Zou

In this work, we investigate how the scale factors in the effectiveness of the skip connection and reveal that a trivial adjustment of the scale will lead to spurious gradient exploding or vanishing in line with the deepness of the models, which could be addressed by normalization, in particular, layer normalization, which induces consistent improvements over the plain skip connection.

Image Classification Machine Translation +1

Pretrain-KGE: Learning Knowledge Representation from Pretrained Language Models

no code implementations Findings of the Association for Computational Linguistics 2020 Zhiyuan Zhang, Xiaoqian Liu, Yi Zhang, Qi Su, Xu Sun, Bin He

Conventional knowledge graph embedding (KGE) often suffers from limited knowledge representation, leading to performance degradation especially on the low-resource problem.

Knowledge Graph Embedding World Knowledge

A Backbone Replaceable Fine-tuning Framework for Stable Face Alignment

no code implementations 19 Oct 2020 Xu Sun, Zhenfeng Fan, Zihao Zhang, Yingjie Guo, Shihong Xia

The proposed framework achieves at least 40% improvement on stability evaluation metrics while enhancing detection accuracy versus state-of-the-art methods.

Attribute Face Alignment

CAPT: Contrastive Pre-Training for Learning Denoised Sequence Representations

no code implementations 13 Oct 2020 Fuli Luo, Pengcheng Yang, Shicheng Li, Xuancheng Ren, Xu Sun

Pre-trained self-supervised models such as BERT have achieved striking success in learning sequence representations, especially for natural language processing.

Natural Language Understanding Sentence

Regularizing Dialogue Generation by Imitating Implicit Scenarios

no code implementations EMNLP 2020 Shaoxiong Feng, Xuancheng Ren, Hongshen Chen, Bin Sun, Kan Li, Xu Sun

Human dialogues are scenario-based and appropriate responses generally relate to the latent context knowledge entailed by the specific scenario.

Dialogue Generation Imitation Learning

Graph-based Multi-hop Reasoning for Long Text Generation

no code implementations 28 Sep 2020 Liang Zhao, Jingjing Xu, Junyang Lin, Yichang Zhang, Hongxia Yang, Xu Sun

The reasoning module is responsible for searching skeleton paths from a knowledge graph to imitate the imagination process in the human writing for semantic transfer.

Review Generation Sentence +1

Collaborative Group Learning

no code implementations 16 Sep 2020 Shaoxiong Feng, Hongshen Chen, Xuancheng Ren, Zhuoye Ding, Kan Li, Xu Sun

Collaborative learning has successfully applied knowledge transfer to guide a pool of small student networks towards robust local minima.

Computational Efficiency Inductive Bias +1

Robust Retinal Vessel Segmentation from a Data Augmentation Perspective

1 code implementation 31 Jul 2020 Xu Sun, Huihui Fang, Yehui Yang, Dongwei Zhu, Lei Wang, Junwei Liu, Yanwu Xu

In this paper, we propose two new data augmentation modules, namely, channel-wise random Gamma correction and channel-wise random vessel augmentation.

Data Augmentation Retinal Vessel Segmentation

How to Ask Good Questions? Try to Leverage Paraphrases

no code implementations ACL 2020 Xin Jia, Wenjie Zhou, Xu Sun, Yunfang Wu

Given a sentence and its relevant answer, how to ask good questions is a challenging task, which has many real applications.

Multi-Task Learning Paraphrase Generation +4

Exploring the Vulnerability of Deep Neural Networks: A Study of Parameter Corruption

1 code implementation 10 Jun 2020 Xu Sun, Zhiyuan Zhang, Xuancheng Ren, Ruixuan Luo, Liangyou Li

We argue that the vulnerability of model parameters is of crucial value to the study of model robustness and generalization but little research has been devoted to understanding this matter.

Building BROOK: A Multi-modal and Facial Video Database for Human-Vehicle Interaction Research

no code implementations 18 May 2020 Xiangjun Peng, Zhentao Huang, Xu Sun

Finally, we discuss related issues when building such a database and our future directions in the context of BROOK.

Autonomous Vehicles

Rethinking and Improving Natural Language Generation with Layer-Wise Multi-View Decoding

no code implementations 16 May 2020 Fenglin Liu, Xuancheng Ren, Guangxiang Zhao, Chenyu You, Xuewei Ma, Xian Wu, Xu Sun

While it is common practice to draw information from only the last encoder layer, recent work has proposed to use representations from different encoder layers for diversified levels of information.

Abstractive Text Summarization Image Captioning +5

Parallel Data Augmentation for Formality Style Transfer

1 code implementation ACL 2020 Yi Zhang, Tao Ge, Xu Sun

The main barrier to progress in the task of Formality Style Transfer is the inadequacy of training data.

Data Augmentation Formality Style Transfer +2

Query-Variant Advertisement Text Generation with Association Knowledge

1 code implementation 14 Apr 2020 Siyu Duan, Wei Li, Cai Jing, Yancheng He, Yunfang Wu, Xu Sun

In this paper, we propose the query-variant advertisement text generation task that aims to generate candidate advertisement texts for different web search queries with various needs based on queries and item keywords.

Text Generation

Jointly Modeling Aspect and Sentiment with Dynamic Heterogeneous Graph Neural Networks

2 code implementations 14 Apr 2020 Shu Liu, Wei Li, Yunfang Wu, Qi Su, Xu Sun

Target-Based Sentiment Analysis aims to detect the opinion aspects (aspect extraction) and the sentiment polarities (sentiment detection) towards them.

Aspect Extraction Sentiment Analysis

Exploring and Distilling Cross-Modal Information for Image Captioning

no code implementations 28 Feb 2020 Fenglin Liu, Xuancheng Ren, Yuanxin Liu, Kai Lei, Xu Sun

Recently, attention-based encoder-decoder models have been used extensively in image captioning.

Attribute Image Captioning

Mining Commonsense Facts from the Physical World

no code implementations 8 Feb 2020 Yanyan Zou, Wei Lu, Xu Sun

In this paper, we propose a new task of mining commonsense facts from the raw text that describes the physical world.

Knowledge Base Completion

Visual Agreement Regularized Training for Multi-Modal Machine Translation

no code implementations 27 Dec 2019 Pengcheng Yang, Boxing Chen, Pei Zhang, Xu Sun

Further analysis demonstrates that the proposed regularized training can effectively improve the agreement of attention on the image, leading to better use of visual information.

Machine Translation Sentence +1

Explicit Sparse Transformer: Concentrated Attention Through Explicit Selection

2 code implementations 25 Dec 2019 Guangxiang Zhao, Junyang Lin, Zhiyuan Zhang, Xuancheng Ren, Qi Su, Xu sun

The self-attention-based Transformer has demonstrated state-of-the-art performance in a number of natural language processing tasks.

Image Captioning Language Modelling +2

MUSE: Parallel Multi-Scale Attention for Sequence to Sequence Learning

2 code implementations 17 Nov 2019 Guangxiang Zhao, Xu sun, Jingjing Xu, Zhiyuan Zhang, Liangchen Luo

In this work, we explore parallel multi-scale representation learning on sequence data, striving to capture both long-range and short-range language structures.

Machine Translation Representation Learning +1

Understanding and Improving Layer Normalization

1 code implementation NeurIPS 2019 Jingjing Xu, Xu sun, Zhiyuan Zhang, Guangxiang Zhao, Junyang Lin

Unlike them, we find that the derivatives of the mean and variance are more important than forward normalization by re-centering and re-scaling backward gradients.

Machine Translation Translation

HighwayGraph: Modelling Long-distance Node Relations for Improving General Graph Neural Network

no code implementations 10 Nov 2019 Deli Chen, Xiaoqian Liu, Yankai Lin, Peng Li, Jie zhou, Qi Su, Xu sun

To address this issue, we propose to model long-distance node relations by simply relying on shallow GNN architectures with two solutions: (1) Implicitly modelling by learning to predict node pair relations (2) Explicitly modelling by adding edges between nodes that potentially have the same label.

General Classification Node Classification

Specificity-Driven Cascading Approach for Unsupervised Sentiment Modification

no code implementations IJCNLP 2019 Pengcheng Yang, Junyang Lin, Jingjing Xu, Jun Xie, Qi Su, Xu sun

The task of unsupervised sentiment modification aims to reverse the sentiment polarity of the input text while preserving its semantic content without any parallel data.

Specificity

Blast-wave description of $Υ$ elliptic flow at energies available at the CERN Large Hadron Collider

1 code implementation 31 Oct 2019 Klaus Reygers, Alexander Schmah, Anastasia Berdnikova, Xu sun

A simultaneous blast-wave fit to particle yields and elliptic flow ($v_{2}$) measured as a function of transverse momentum in Pb-Pb collisions at LHC energies is presented.

High Energy Physics - Phenomenology Nuclear Experiment Nuclear Theory

An Adaptive and Momental Bound Method for Stochastic Learning

2 code implementations 27 Oct 2019 Jianbang Ding, Xuancheng Ren, Ruixuan Luo, Xu sun

The dynamic learning rate bounds are based on the exponential moving averages of the adaptive learning rates themselves, which smooth out unexpected large learning rates and stabilize the training of deep neural networks.

Stochastic Optimization

Pun-GAN: Generative Adversarial Network for Pun Generation

1 code implementation IJCNLP 2019 Fuli Luo, Shunyao Li, Pengcheng Yang, Lei LI, Baobao Chang, Zhifang Sui, Xu sun

It consists of a generator to produce pun sentences, and a discriminator to distinguish between the generated pun sentences and the real sentences with specific word senses.

Generative Adversarial Network Sentence

Sparse Transformer: Concentrated Attention Through Explicit Selection

no code implementations 25 Sep 2019 Guangxiang Zhao, Junyang Lin, Zhiyuan Zhang, Xuancheng Ren, Xu sun

Extensive experimental results on a series of natural language processing tasks, including neural machine translation, image captioning, and language modeling, all demonstrate the advantages of Sparse Transformer in model performance.

Image Captioning Language Modelling +2

Cross-Modal Commentator: Automatic Machine Commenting Based on Cross-Modal Information

1 code implementation ACL 2019 Pengcheng Yang, Zhihan Zhang, Fuli Luo, Lei LI, Chengyang Huang, Xu sun

Automatic commenting of online articles can provide additional opinions and facts to the reader, which improves user experience and engagement on social media platforms.

Comment Generation

MAAM: A Morphology-Aware Alignment Model for Unsupervised Bilingual Lexicon Induction

no code implementations ACL 2019 Pengcheng Yang, Fuli Luo, Peng Chen, Tianyu Liu, Xu sun

The task of unsupervised bilingual lexicon induction (UBLI) aims to induce word translations from monolingual corpora in two languages.

Bilingual Lexicon Induction Denoising +2

Enhancing Topic-to-Essay Generation with External Commonsense Knowledge

no code implementations ACL 2019 Pengcheng Yang, Lei LI, Fuli Luo, Tianyu Liu, Xu sun

Experiments show that with external commonsense knowledge and adversarial training, the generated essays are more novel, diverse, and topic-consistent than existing methods in terms of both automatic and human evaluation.

Concept-To-Text Generation

A Deep Reinforced Sequence-to-Set Model for Multi-Label Classification

1 code implementation ACL 2019 Pengcheng Yang, Fuli Luo, Shuming Ma, Junyang Lin, Xu sun

In this way, we can reduce the dependence of the model on the label order, as well as capture high-order correlations between labels.

General Classification Multi-Label Classification

Learning to Control the Fine-grained Sentiment for Story Ending Generation

no code implementations ACL 2019 Fuli Luo, Damai Dai, Pengcheng Yang, Tianyu Liu, Baobao Chang, Zhifang Sui, Xu sun

Therefore, we propose a generic and novel framework which consists of a sentiment analyzer and a sentimental generator, respectively addressing the two challenges.

Text Generation

Coherent Comments Generation for Chinese Articles with a Graph-to-Sequence Model

1 code implementation ACL 2019 Wei Li, Jingjing Xu, Yancheng He, ShengLi Yan, Yunfang Wu, Xu sun

In this paper, we propose to generate comments with a graph-to-sequence model that models the input news as a topic interaction graph.

Graph-to-Sequence

PKUSEG: A Toolkit for Multi-Domain Chinese Word Segmentation

4 code implementations 27 Jun 2019 Ruixuan Luo, Jingjing Xu, Yi Zhang, Zhiyuan Zhang, Xuancheng Ren, Xu sun

Through this method, we generate synthetic data using a large amount of unlabeled data in the target domain and then obtain a word segmentation model for the target domain.

Chinese Word Segmentation Domain Adaptation +3

A Hierarchical Reinforced Sequence Operation Method for Unsupervised Text Style Transfer

1 code implementation ACL 2019 Chen Wu, Xuancheng Ren, Fuli Luo, Xu sun

Unsupervised text style transfer aims to alter text styles while preserving the content, without aligned data for supervision.

Sentence Style Transfer +2

Coherent Comment Generation for Chinese Articles with a Graph-to-Sequence Model

1 code implementation 4 Jun 2019 Wei Li, Jingjing Xu, Yancheng He, ShengLi Yan, Yunfang Wu, Xu sun

In this paper, we propose to generate comments with a graph-to-sequence model that models the input news as a topic interaction graph.

Comment Generation Graph-to-Sequence

A Dual Reinforcement Learning Framework for Unsupervised Text Style Transfer

2 code implementations 24 May 2019 Fuli Luo, Peng Li, Jie zhou, Pengcheng Yang, Baobao Chang, Zhifang Sui, Xu sun

Therefore, in this paper, we propose a dual reinforcement learning framework to directly transfer the style of the text via a one-step mapping model, without any separation of content and style.

reinforcement-learning Reinforcement Learning (RL) +2

Memorized Sparse Backpropagation

no code implementations 24 May 2019 Zhiyuan Zhang, Pengcheng Yang, Xuancheng Ren, Qi Su, Xu sun

Neural network learning is usually time-consuming since backpropagation needs to compute full gradients and backpropagate them across multiple layers.

Adaptive Gradient Methods with Dynamic Bound of Learning Rate

5 code implementations ICLR 2019 Liangchen Luo, Yuanhao Xiong, Yan Liu, Xu sun

Recent work has put forward some algorithms such as AMSGrad to tackle this issue but they failed to achieve considerable improvement over existing methods.

Learning Personalized End-to-End Goal-Oriented Dialog

no code implementations 12 Nov 2018 Liangchen Luo, Wenhao Huang, Qi Zeng, Zaiqing Nie, Xu sun

Most existing works on dialog systems only consider conversation content while neglecting the personality of the user the bot is interacting with, which begets several unsolved issues.

Goal-Oriented Dialog

Learning Unsupervised Word Mapping by Maximizing Mean Discrepancy

no code implementations 1 Nov 2018 Pengcheng Yang, Fuli Luo, Shuangzhi Wu, Jingjing Xu, Dong-dong Zhang, Xu sun

In order to avoid such sophisticated alternate optimization, we propose to learn unsupervised word mapping by directly maximizing the mean discrepancy between the distribution of transferred embedding and target embedding.

Cross-Lingual Word Embeddings Density Estimation +4

Unsupervised Machine Commenting with Neural Variational Topic Model

no code implementations 13 Sep 2018 Shuming Ma, Lei Cui, Furu Wei, Xu sun

To fully exploit the unpaired data, we completely remove the need for parallel data and propose a novel unsupervised approach to train an automatic article commenting model, relying on nothing but unpaired articles and comments.

Retrieval

An Auto-Encoder Matching Model for Learning Utterance-Level Semantic Dependency in Dialogue Generation

1 code implementation EMNLP 2018 Liangchen Luo, Jingjing Xu, Junyang Lin, Qi Zeng, Xu sun

Different from conventional text generation tasks, the mapping between inputs and responses in conversations is more complicated, which highly demands the understanding of utterance-level semantic dependency, a relation between the whole meanings of inputs and outputs.

Dialogue Generation

Identifying High-Quality Chinese News Comments Based on Multi-Target Text Matching Model

no code implementations 22 Aug 2018 Deli Chen, Shuming Ma, Pengcheng Yang, Xu sun

In this work, we introduce a novel task: high-quality comment identification (HQCI), which aims to automatically assess the quality of online comments.

Informativeness Text Matching

Learning Sentiment Memories for Sentiment Modification without Parallel Data

1 code implementation EMNLP 2018 Yi Zhang, Jingjing Xu, Pengcheng Yang, Xu sun

The task of sentiment modification requires reversing the sentiment of the input and preserving the sentiment-independent content.

Text Style Transfer

Learning When to Concentrate or Divert Attention: Self-Adaptive Attention Temperature for Neural Machine Translation

1 code implementation EMNLP 2018 Junyang Lin, Xu sun, Xuancheng Ren, Muyu Li, Qi Su

Most of the Neural Machine Translation (NMT) models are based on the sequence-to-sequence (Seq2Seq) model with an encoder-decoder framework equipped with the attention mechanism.

Machine Translation NMT +1

A Skeleton-Based Model for Promoting Coherence Among Sentences in Narrative Story Generation

1 code implementation EMNLP 2018 Jingjing Xu, Xuancheng Ren, Yi Zhang, Qi Zeng, Xiaoyan Cai, Xu sun

Compared to the state-of-the-art models, our skeleton-based model can generate significantly more coherent text according to human evaluation and automatic evaluation.

Sentence Story Generation

Sememe Prediction: Learning Semantic Knowledge from Unstructured Textual Wiki Descriptions

no code implementations 16 Aug 2018 Wei Li, Xuancheng Ren, Damai Dai, Yunfang Wu, Houfeng Wang, Xu sun

In the experiments, we take a real-world sememe knowledge base HowNet and the corresponding descriptions of the words in Baidu Wiki for training and evaluation.

Primal Meaning Recommendation via On-line Encyclopedia

no code implementations 14 Aug 2018 Zhiyuan Zhang, Wei Li, Jingjing Xu, Xu sun

We define the primal meaning of an expression to be a frequently used sense of that expression from which its other frequent senses can be deduced.

A Neural Question Answering Model Based on Semi-Structured Tables

no code implementations COLING 2018 Hao Wang, Xiaodong Zhang, Shuming Ma, Xu sun, Houfeng Wang, Mengxiang Wang

Then the system measures the relevance between each question and the candidate table cells, and chooses the most related cell as the source of the answer.

Knowledge Graphs Multiple-choice +1

SGM: Sequence Generation Model for Multi-label Classification

1 code implementation COLING 2018 Pengcheng Yang, Xu sun, Wei Li, Shuming Ma, Wei Wu, Houfeng Wang

Further analysis of experimental results demonstrates that the proposed methods not only capture the correlations between labels, but also select the most informative words automatically when predicting different labels.

Classification General Classification +1

Deconvolution-Based Global Decoding for Neural Machine Translation

1 code implementation COLING 2018 Junyang Lin, Xu sun, Xuancheng Ren, Shuming Ma, Jinsong Su, Qi Su

A great proportion of sequence-to-sequence (Seq2Seq) models for Neural Machine Translation (NMT) adopt Recurrent Neural Network (RNN) to generate translation word by word following a sequential order.

Machine Translation NMT +1

Bag-of-Words as Target for Neural Machine Translation

1 code implementation ACL 2018 Shuming Ma, Xu sun, Yizhong Wang, Junyang Lin

However, most of the existing neural machine translation models only use one of the correct translations as the targets, and the other correct sentences are punished as the incorrect sentences in the training stage.

Machine Translation Sentence +1

Global Encoding for Abstractive Summarization

4 code implementations ACL 2018 Junyang Lin, Xu sun, Shuming Ma, Qi Su

To tackle the problem, we propose a global encoding framework, which controls the information flow from the encoder to the decoder based on the global information of the source context.

Abstractive Text Summarization

Automatic Academic Paper Rating Based on Modularized Hierarchical Convolutional Neural Network

1 code implementation ACL 2018 Pengcheng Yang, Xu sun, Wei Li, Shuming Ma

As more and more academic papers are being submitted to conferences and journals, evaluating all these papers by professionals is time-consuming and can cause inequality due to the personal factors of the reviewers.

Decoding-History-Based Adaptive Control of Attention for Neural Machine Translation

no code implementations 6 Feb 2018 Junyang Lin, Shuming Ma, Qi Su, Xu sun

ACA learns to control the attention by keeping track of the decoding history and the current information with a memory vector, so that the model can take the translated contents and the current information into consideration.

Machine Translation NMT +1

Exploration on Generating Traditional Chinese Medicine Prescription from Symptoms with an End-to-End method

no code implementations 27 Jan 2018 Wei Li, Zheng Yang, Xu sun

Traditional Chinese Medicine (TCM) is an influential form of medical treatment in China and surrounding areas.

Building an Ellipsis-aware Chinese Dependency Treebank for Web Text

1 code implementation LREC 2018 Xuancheng Ren, Xu sun, Ji Wen, Bingzhen Wei, Weidong Zhan, Zhiyuan Zhang

Web 2.0 has brought with it numerous user-produced data revealing one's thoughts, experiences, and knowledge, which are a great source for many tasks, such as information extraction and knowledge base construction.

Dependency Parsing Sentence

A Chinese Dataset with Negative Full Forms for General Abbreviation Prediction

1 code implementation LREC 2018 Yi Zhang, Xu sun

However, due to the deficiency in the abbreviation corpora, such a task is limited in current studies, especially considering general abbreviation prediction should also include those full form expressions that do not have valid abbreviations, namely the negative full forms (NFFs).

valid

Hybrid Oracle: Making Use of Ambiguity in Transition-based Chinese Dependency Parsing

1 code implementation 28 Nov 2017 Xuancheng Ren, Xu sun

In the training of transition-based dependency parsers, an oracle is used to predict a transition sequence for a sentence and its gold tree.

Chinese Dependency Parsing Dependency Parsing +1

Does Higher Order LSTM Have Better Accuracy for Segmenting and Labeling Sequence Data?

1 code implementation COLING 2018 Yi Zhang, Xu sun, Shuming Ma, Yang Yang, Xuancheng Ren

In our work, we first design a new model called "high order LSTM" to predict multiple tags for the current token which contains not only the current tag but also the previous several tags.

Chunking NER +1

A Discourse-Level Named Entity Recognition and Relation Extraction Dataset for Chinese Literature Text

2 code implementations 19 Nov 2017 Jingjing Xu, Ji Wen, Xu sun, Qi Su

To build a high quality dataset, we propose two tagging methods to solve the problem of data inconsistency, including a heuristic tagging method and a machine auxiliary tagging method.

named-entity-recognition Named Entity Recognition +3

Training Simplification and Model Simplification for Deep Learning: A Minimal Effort Back Propagation Method

3 code implementations 17 Nov 2017 Xu Sun, Xuancheng Ren, Shuming Ma, Bingzhen Wei, Wei Li, Jingjing Xu, Houfeng Wang, Yi Zhang

Based on the sparsified gradients, we further simplify the model by eliminating the rows or columns that are seldom updated, which will reduce the computational cost both in the training and decoding, and potentially accelerate decoding in real-world applications.

Deep Stacking Networks for Low-Resource Chinese Word Segmentation with Transfer Learning

no code implementations 4 Nov 2017 Jingjing Xu, Xu sun, Sujian Li, Xiaoyan Cai, Bingzhen Wei

In this paper, we propose a deep stacking framework to improve the performance on word segmentation tasks with insufficient data by integrating datasets from diverse domains.

Chinese Word Segmentation Transfer Learning

Addressing Domain Adaptation for Chinese Word Segmentation with Global Recurrent Structure

no code implementations IJCNLP 2017 Shen Huang, Xu sun, Houfeng Wang

Boundary features are widely used in traditional Chinese Word Segmentation (CWS) methods as they can utilize unlabeled data to help improve the Out-of-Vocabulary (OOV) word recognition performance.

Chinese Word Segmentation Domain Adaptation +2

Label Embedding Network: Learning Label Representation for Soft Training of Deep Networks

1 code implementation ICLR 2018 Xu Sun, Bingzhen Wei, Xuancheng Ren, Shuming Ma

We propose a method, called Label Embedding Network, which can learn label representation (label embedding) during the training process of deep networks.

A Semantic Relevance Based Neural Network for Text Summarization and Text Simplification

1 code implementation 6 Oct 2017 Shuming Ma, Xu sun

In this work, our goal is to improve semantic relevance between source texts and simplified texts for text summarization and text simplification.

Semantic Similarity Semantic Textual Similarity +3

Minimal Effort Back Propagation for Convolutional Neural Networks

no code implementations 18 Sep 2017 Bingzhen Wei, Xu sun, Xuancheng Ren, Jingjing Xu

As traditional neural networks consume a significant amount of computing resources during back propagation, Sun et al. (2017) propose a simple yet effective technique to alleviate this problem.

meProp: Sparsified Back Propagation for Accelerated Deep Learning with Reduced Overfitting

2 code implementations ICML 2017 Xu Sun, Xuancheng Ren, Shuming Ma, Houfeng Wang

In back propagation, only a small subset of the full gradient is computed to update the model parameters.

Lock-Free Parallel Perceptron for Graph-based Dependency Parsing

no code implementations 2 Mar 2017 Xu Sun, Shuming Ma

To deal with this problem, we propose a parallel algorithm called parallel perceptron.

Dependency Parsing

A Generic Online Parallel Learning Framework for Large Margin Models

no code implementations 2 Mar 2017 Shuming Ma, Xu sun

To speed up the training process, many existing systems use parallel technology for online learning algorithms.

Transfer Deep Learning for Low-Resource Chinese Word Segmentation with a Novel Neural Network

no code implementations 15 Feb 2017 Jingjing Xu, Xu sun

First, we train a teacher model on high-resource corpora and then use the learned knowledge to initialize a student model.

Chinese Word Segmentation Segmentation +1

Asynchronous Parallel Learning for Neural Networks and Structured Models with Dense Features

no code implementations COLING 2016 Xu Sun

Existing asynchronous parallel learning methods are only for the sparse feature models, and they face new challenges for the dense feature models like neural networks (e.g., LSTM, RNN).

Low-Rank Matrix Completion

A New Recurrent Neural CRF for Learning Non-linear Edge Features

no code implementations 14 Nov 2016 Shuming Ma, Xu sun

Conditional Random Field (CRF) and recurrent neural models have achieved success in structured prediction.

Chinese Word Segmentation Chunking +3

Towards Easier and Faster Sequence Labeling for Natural Language Processing: A Search-based Probabilistic Online Learning Framework (SAPO)

4 code implementations 29 Mar 2015 Xu Sun, Shuming Ma, Yi Zhang, Xuancheng Ren

We show that this method, which is easy to implement and offers fast training with a theoretical guarantee of convergence, can support search-based optimization and obtain top accuracy.

Structure Regularization for Structured Prediction

no code implementations NeurIPS 2014 Xu Sun

Many existing systems on structured prediction focus on increasing the level of structural dependencies within the model.

Structured Prediction

Structure Regularization for Structured Prediction: Theories and Experiments

no code implementations 23 Nov 2014 Xu Sun

Many existing systems on structured prediction focus on increasing the level of structural dependencies within the model.

Structured Prediction

Exact Decoding on Latent Variable Conditional Models is NP-Hard

no code implementations 18 Jun 2014 Xu Sun

The computational complexity of the exact decoding/inference in latent conditional random fields is unclear.
