Search Results for author: Liang Ding

Found 43 papers, 13 papers with code

Redistributing Low-Frequency Words: Making the Most of Monolingual Data in Non-Autoregressive Translation

1 code implementation ACL 2022 Liang Ding, Longyue Wang, Shuming Shi, DaCheng Tao, Zhaopeng Tu

In this work, we provide an appealing alternative for NAT: monolingual KD, which trains the NAT student on external monolingual data with an AT teacher trained on the original bilingual data.

Knowledge Distillation Translation +1
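
As a rough, hedged illustration of the monolingual-KD idea above (not the paper's exact pipeline), the Python sketch below lets an AT teacher, here just a stand-in callable, translate external monolingual source sentences into synthetic pairs on which a NAT student would then be trained; every name in it is hypothetical.

# Minimal sketch: distill an AT teacher (trained on the original bilingual
# data) into synthetic (source, target) pairs built from external monolingual
# source sentences; the NAT student is trained on these pairs afterwards.
from typing import Callable, Iterable, List, Tuple

def build_monolingual_kd_corpus(
    monolingual_src: Iterable[str],
    at_teacher_translate: Callable[[str], str],
) -> List[Tuple[str, str]]:
    corpus = []
    for src in monolingual_src:
        hyp = at_teacher_translate(src)   # teacher decodes the monolingual sentence
        corpus.append((src, hyp))         # the NAT student later trains on (src, hyp)
    return corpus

# Toy usage with a dummy "teacher" that merely upper-cases its input.
toy_teacher = lambda s: s.upper()
print(build_monolingual_kd_corpus(["guten morgen", "wie geht es dir"], toy_teacher))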

Comprehensive Graph Gradual Pruning for Sparse Training in Graph Neural Networks

no code implementations18 Jul 2022 Chuang Liu, Xueqi Ma, Yibing Zhan, Liang Ding, Dapeng Tao, Bo Du, Wenbin Hu, Danilo Mandic

However, the LTH-based methods suffer from two major drawbacks: 1) they require exhaustive and iterative training of dense models, resulting in an extremely large training computation cost, and 2) they only trim graph structures and model parameters but ignore the node feature dimension, where significant redundancy exists.

Node Classification

Dynamic Contrastive Distillation for Image-Text Retrieval

no code implementations4 Jul 2022 Jun Rao, Liang Ding, Shuhan Qi, Meng Fang, Yang Liu, Li Shen, DaCheng Tao

Although cross-modal image-text retrieval (ITR) equipped with vision-and-language pretraining (VLP) has achieved remarkable progress in the past two years, it suffers from a major drawback: the ever-increasing size of VLP models restricts their deployment in real-world search scenarios, where high latency is unacceptable.

Contrastive Learning Metric Learning +1

E2S2: Encoding-Enhanced Sequence-to-Sequence Pretraining for Language Understanding and Generation

no code implementations30 May 2022 Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du, DaCheng Tao

Sequence-to-sequence (seq2seq) learning has become a popular trend for pretraining language models, due to its succinct and universal framework.

Denoising Language Modelling

Parameter-Efficient and Student-Friendly Knowledge Distillation

no code implementations28 May 2022 Jun Rao, Xv Meng, Liang Ding, Shuhan Qi, DaCheng Tao

In this paper, we present a parameter-efficient and student-friendly knowledge distillation method, namely PESF-KD, to achieve efficient and sufficient knowledge transfer by updating relatively few partial parameters.

Knowledge Distillation Transfer Learning
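
To make the "relatively few partial parameters" idea concrete, here is a hedged PyTorch sketch in which most weights are frozen and only the student's last layer receives gradients from a temperature-scaled KD loss. It illustrates the general principle only; the module sizes and the choice of which parameters stay trainable are assumptions, not the PESF-KD recipe.

import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 10))
student = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 10))

for p in teacher.parameters():       # the teacher is frozen
    p.requires_grad_(False)
for p in student[0].parameters():    # freeze most of the student as well;
    p.requires_grad_(False)          # only its last linear layer is updated

trainable = [p for p in student.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-3)
T = 2.0                              # distillation temperature

x = torch.randn(8, 16)               # dummy batch
with torch.no_grad():
    t_logits = teacher(x)
s_logits = student(x)
kd_loss = F.kl_div(F.log_softmax(s_logits / T, dim=-1),
                   F.softmax(t_logits / T, dim=-1),
                   reduction="batchmean") * T * T
optimizer.zero_grad()
kd_loss.backward()
optimizer.step()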

Interpretable Proof Generation via Iterative Backward Reasoning

1 code implementation NAACL 2022 Hanhao Qu, Yu Cao, Jun Gao, Liang Ding, Ruifeng Xu

We present IBR, an Iterative Backward Reasoning model to solve the proof generation tasks on rule-based Question Answering (QA), where models are required to reason over a series of textual rules and facts to find the related proof path and derive the final answer.

Question Answering

Bridging Cross-Lingual Gaps During Leveraging the Multilingual Sequence-to-Sequence Pretraining for Text Generation

no code implementations16 Apr 2022 Changtong Zan, Liang Ding, Li Shen, Yu Cao, Weifeng Liu, DaCheng Tao

For multilingual sequence-to-sequence pretrained language models (multilingual Seq2Seq PLMs), e.g. mBART, the self-supervised pretraining task is trained on a wide range of monolingual languages, e.g. 25 languages from CommonCrawl, while the downstream cross-lingual tasks generally proceed on a bilingual language subset, e.g. English-German. This creates a cross-lingual data discrepancy, namely the domain discrepancy, and a cross-lingual learning-objective discrepancy, namely the task discrepancy, between the pretraining and finetuning stages.

Pretrained Language Models Text Generation +2

BLISS: Robust Sequence-to-Sequence Learning via Self-Supervised Input Representation

no code implementations16 Apr 2022 Zheng Zhang, Liang Ding, Dazhao Cheng, Xuebo Liu, Min Zhang, DaCheng Tao

Data augmentation (DA) is central to achieving robust sequence-to-sequence learning on various natural language processing (NLP) tasks.

Grammatical Error Correction Machine Translation +2

Fine-tuning Global Model via Data-Free Knowledge Distillation for Non-IID Federated Learning

no code implementations CVPR 2022 Lin Zhang, Li Shen, Liang Ding, DaCheng Tao, Ling-Yu Duan

Instead, we propose a data-free knowledge distillation method to fine-tune the global model in the server (FedFTG), which relieves the issue of direct model aggregation.

Federated Learning Knowledge Distillation
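
A highly simplified, hedged sketch of the server-side idea follows: average the client models into a global model, then fine-tune that global model by distilling from the client ensemble on generated inputs. Random noise stands in for the trained generator used by the actual data-free method, and all shapes and step counts are toy assumptions.

import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

def fedavg(models):
    """Parameter-wise average of the client models (the usual aggregation)."""
    global_model = copy.deepcopy(models[0])
    with torch.no_grad():
        for p_glob, *p_clients in zip(global_model.parameters(),
                                      *[m.parameters() for m in models]):
            p_glob.copy_(torch.stack(p_clients).mean(dim=0))
    return global_model

clients = [nn.Linear(16, 10) for _ in range(3)]
global_model = fedavg(clients)
opt = torch.optim.SGD(global_model.parameters(), lr=0.1)

for _ in range(5):                       # data-free fine-tuning steps on the server
    x = torch.randn(32, 16)              # pseudo data (generator stand-in)
    with torch.no_grad():
        teacher = torch.stack([c(x) for c in clients]).mean(dim=0)
    loss = F.kl_div(F.log_softmax(global_model(x), dim=-1),
                    F.softmax(teacher, dim=-1), reduction="batchmean")
    opt.zero_grad()
    loss.backward()
    opt.step()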

Where Does the Performance Improvement Come From? -- A Reproducibility Concern about Image-Text Retrieval

1 code implementation8 Mar 2022 Jun Rao, Fei Wang, Liang Ding, Shuhan Qi, Yibing Zhan, Weifeng Liu, DaCheng Tao

In contrast to previous works, we focus on the reproducibility of the approaches and the examination of the elements that lead to improved performance by pretrained and non-pretrained models in retrieving images and text.

Information Retrieval

Kernel Packet: An Exact and Scalable Algorithm for Gaussian Process Regression with Matérn Correlations

no code implementations7 Mar 2022 HaoYuan Chen, Liang Ding, Rui Tuo

We develop an exact and scalable algorithm for one-dimensional Gaussian process regression with Matérn correlations whose smoothness parameter ν is a half-integer.
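
For orientation, the snippet below is a plain O(n^3) reference implementation of one-dimensional GP regression with a Matérn-5/2 kernel (half-integer smoothness ν = 5/2). It is only a baseline sketch for comparison; the Kernel Packet algorithm computes the same posterior exactly with far better scaling, and none of that algorithm is reproduced here.

import numpy as np

def matern52(x1, x2, lengthscale=1.0, variance=1.0):
    """Matérn kernel with smoothness nu = 5/2 on 1-D inputs."""
    d = np.abs(x1[:, None] - x2[None, :]) / lengthscale
    s = np.sqrt(5.0) * d
    return variance * (1.0 + s + s**2 / 3.0) * np.exp(-s)

def gp_posterior_mean(x_train, y_train, x_test, noise=1e-2):
    K = matern52(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = matern52(x_test, x_train)
    return Ks @ np.linalg.solve(K, y_train)   # O(n^3) dense solve

x = np.linspace(0.0, 1.0, 50)
y = np.sin(2.0 * np.pi * x)
print(gp_posterior_mean(x, y, np.array([0.25, 0.5, 0.75])))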

Improving Neural Machine Translation by Denoising Training

no code implementations19 Jan 2022 Liang Ding, Keqin Peng, DaCheng Tao

We present a simple and effective pretraining strategy, Denoising Training (DoT), for neural machine translation.

Denoising Knowledge Distillation +2

Knowledge Graph Augmented Network Towards Multiview Representation Learning for Aspect-based Sentiment Analysis

1 code implementation13 Jan 2022 Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du, Hua Jin, DaCheng Tao

To this end, we propose a knowledge graph augmented network (KGAN), which aims to effectively incorporate external knowledge with explicitly syntactic and contextual information.

Aspect-Based Sentiment Analysis Knowledge Graphs +1

A Sparse Expansion For Deep Gaussian Processes

no code implementations11 Dec 2021 Liang Ding, Rui Tuo, Shahin Shahrampour

In this work, we propose an efficient scheme for accurate inference and prediction based on a range of Gaussian Processes, called the Tensor Markov Gaussian Processes (TMGP).

Gaussian Processes

Unified Instance and Knowledge Alignment Pretraining for Aspect-based Sentiment Analysis

1 code implementation26 Oct 2021 Juhua Liu, Qihuang Zhong, Liang Ding, Hua Jin, Bo Du, DaCheng Tao

In practice, we formulate the model pretrained on the sampled instances into a knowledge guidance model and a learner model, respectively.

Aspect-Based Sentiment Analysis Transfer Learning

On the Complementarity between Pre-Training and Back-Translation for Neural Machine Translation

1 code implementation Findings (EMNLP) 2021 Xuebo Liu, Longyue Wang, Derek F. Wong, Liang Ding, Lidia S. Chao, Shuming Shi, Zhaopeng Tu

Pre-training (PT) and back-translation (BT) are two simple and powerful methods to utilize monolingual data for improving the model performance of neural machine translation (NMT).

Machine Translation Translation

FLBoost: On-the-Fly Fine-tuning Boosts Federated Learning via Data-free Distillation

no code implementations29 Sep 2021 Lin Zhang, Li Shen, Liang Ding, DaCheng Tao, Lingyu Duan

On the contrary, we propose a new solution, dubbed FLBoost: fine-tuning the global model on the fly in the server via data-free distillation to boost its performance and relieve the issue of direct model aggregation.

Federated Learning

Improving Neural Machine Translation by Bidirectional Training

no code implementations EMNLP 2021 Liang Ding, Di wu, DaCheng Tao

We present a simple and effective pretraining strategy -- bidirectional training (BiT) for neural machine translation.

Machine Translation Translation
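
One simple way to realize bidirectional training is to pretrain on a corpus containing every parallel pair in both directions before fine-tuning on the original direction. The toy sketch below only performs that corpus doubling and does not claim to capture the exact BiT schedule.

from typing import List, Tuple

def make_bidirectional(pairs: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return the corpus augmented with the reversed (tgt -> src) direction."""
    return list(pairs) + [(tgt, src) for src, tgt in pairs]

parallel = [("guten morgen", "good morning"), ("danke", "thank you")]
print(make_bidirectional(parallel))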

Progressive Multi-Granularity Training for Non-Autoregressive Translation

no code implementations Findings (ACL) 2021 Liang Ding, Longyue Wang, Xuebo Liu, Derek F. Wong, DaCheng Tao, Zhaopeng Tu

Non-autoregressive translation (NAT) significantly accelerates the inference process via predicting the entire target sequence.

Translation

Rejuvenating Low-Frequency Words: Making the Most of Parallel Data in Non-Autoregressive Translation

1 code implementation ACL 2021 Liang Ding, Longyue Wang, Xuebo Liu, Derek F. Wong, DaCheng Tao, Zhaopeng Tu

Results demonstrate that the proposed approach can significantly and universally improve translation quality by reducing translation errors on low-frequency words.

Knowledge Distillation Translation

Self-Guided Curriculum Learning for Neural Machine Translation

no code implementations ACL (IWSLT) 2021 Lei Zhou, Liang Ding, Kevin Duh, Shinji Watanabe, Ryohei Sasano, Koichi Takeda

In the field of machine learning, the well-trained model is assumed to be able to recover the training labels, i.e., the synthetic labels predicted by the model should be as close to the ground-truth labels as possible.

Machine Translation Translation

Bridging the Gap Between Clean Data Training and Real-World Inference for Spoken Language Understanding

no code implementations13 Apr 2021 Di wu, Yiren Chen, Liang Ding, DaCheng Tao

A spoken language understanding (SLU) system usually consists of various pipeline components, where each component heavily relies on the results of its upstream ones.

Automatic Speech Recognition Denoising +5

Towards Efficiently Diversifying Dialogue Generation via Embedding Augmentation

1 code implementation2 Mar 2021 Yu Cao, Liang Ding, Zhiliang Tian, Meng Fang

Dialogue generation models face the challenge of producing generic and repetitive responses.

Dialogue Generation

Unsupervised Word Alignment via Cross-Lingual Contrastive Learning

no code implementations1 Jan 2021 Di wu, Liang Ding, Shuo Yang, DaCheng Tao

Recently, the performance of neural word alignment models has exceeded that of statistical models.

Contrastive Learning Translation +1

Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning

1 code implementation ICLR 2021 Xuebo Liu, Longyue Wang, Derek F. Wong, Liang Ding, Lidia S. Chao, Zhaopeng Tu

Encoder layer fusion (EncoderFusion) is a technique to fuse all the encoder layers (instead of the uppermost layer) for sequence-to-sequence (Seq2Seq) models, which has proven effective on various NLP tasks.

Grammatical Error Correction Machine Translation +3
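
One common instantiation of fusing all encoder layers, shown here purely as a hedged illustration rather than as any of the specific EncoderFusion variants analysed in the paper, is a learned softmax-weighted sum over the per-layer outputs:

import torch
import torch.nn as nn

class WeightedLayerFusion(nn.Module):
    """Fuse all encoder layers via a learned softmax-weighted sum."""
    def __init__(self, num_layers: int):
        super().__init__()
        self.layer_logits = nn.Parameter(torch.zeros(num_layers))

    def forward(self, layer_outputs):          # list of [batch, seq, dim] tensors
        stacked = torch.stack(layer_outputs)   # [layers, batch, seq, dim]
        weights = torch.softmax(self.layer_logits, dim=0)
        return torch.einsum("l,lbsd->bsd", weights, stacked)

fusion = WeightedLayerFusion(num_layers=6)
dummy_layers = [torch.randn(2, 5, 512) for _ in range(6)]
print(fusion(dummy_layers).shape)              # torch.Size([2, 5, 512])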

Understanding and Improving Lexical Choice in Non-Autoregressive Translation

no code implementations ICLR 2021 Liang Ding, Longyue Wang, Xuebo Liu, Derek F. Wong, DaCheng Tao, Zhaopeng Tu

To this end, we introduce an extra Kullback-Leibler divergence term derived by comparing the lexical choice of the NAT model and that embedded in the raw data.

Knowledge Distillation Translation
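
The extra term can be pictured as a KL regularizer pulling the NAT model's per-position lexical distribution towards a prior estimated from the raw bilingual data. In the hedged sketch below the prior is a dummy tensor and the loss weight is arbitrary; how the prior is actually estimated (e.g. via word alignment) follows the paper, not this code.

import torch
import torch.nn.functional as F

vocab, batch, seq = 1000, 4, 7
model_logits = torch.randn(batch, seq, vocab)                  # NAT decoder output
raw_data_prior = torch.softmax(torch.randn(batch, seq, vocab), dim=-1)  # stand-in prior

kl_term = F.kl_div(F.log_softmax(model_logits, dim=-1),        # log q (model)
                   raw_data_prior,                             # p (raw-data prior)
                   reduction="batchmean")
nat_loss = torch.zeros(())        # placeholder for the usual NAT training loss
lambda_kl = 0.5                   # illustrative weighting
total_loss = nat_loss + lambda_kl * kl_term
print(total_loss.item())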

Context-Aware Cross-Attention for Non-Autoregressive Translation

1 code implementation COLING 2020 Liang Ding, Longyue Wang, Di wu, DaCheng Tao, Zhaopeng Tu

Non-autoregressive translation (NAT) significantly accelerates the inference process by predicting the entire target sequence.

Translation

Sample and Computationally Efficient Stochastic Kriging in High Dimensions

no code implementations14 Oct 2020 Liang Ding, Xiaowei Zhang

However, its use is limited to cases where the design space is low-dimensional because, in general, the sample complexity (i.e., the number of design points required for stochastic kriging to produce an accurate prediction) grows exponentially in the dimensionality of the design space.

Zero-Shot Translation Quality Estimation with Explicit Cross-Lingual Patterns

no code implementations WMT (EMNLP) 2020 Lei Zhou, Liang Ding, Koichi Takeda

In response to this issue, we propose to expose explicit cross-lingual patterns, e.g. word alignments and generation score, to our proposed zero-shot models.

Translation

High-Dimensional Non-Parametric Density Estimation in Mixed Smooth Sobolev Spaces

no code implementations5 Jun 2020 Liang Ding, Lu Zou, Wenjia Wang, Shahin Shahrampour, Rui Tuo

Density estimation plays a key role in many tasks in machine learning, statistical inference, and visualization.

Density Estimation

Self-Attention with Cross-Lingual Position Representation

no code implementations ACL 2020 Liang Ding, Long-Yue Wang, DaCheng Tao

Position encoding (PE), an essential part of self-attention networks (SANs), is used to preserve the word order information for natural language processing tasks, generating fixed position indices for input sequences.

Machine Translation Natural Language Processing +1

Recurrent Graph Syntax Encoder for Neural Machine Translation

no code implementations19 Aug 2019 Liang Ding, DaCheng Tao

Syntax-incorporated machine translation models have been proven successful in improving the model's reasoning and meaning preservation ability.

Machine Translation Translation

Efficient Learning of Optimal Markov Network Topology with k-Tree Modeling

1 code implementation21 Jan 2018 Liang Ding, Di Chang, Russell Malmberg, Aaron Martinez, David Robinson, Matthew Wicker, Hongfei Yan, Liming Cai

The seminal work of Chow and Liu (1968) shows that approximation of a finite probabilistic system by Markov trees can achieve the minimum information loss with the topology of a maximum spanning tree.
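
For reference, the classic Chow-Liu construction mentioned above can be sketched as: estimate pairwise mutual information from data, then take a maximum spanning tree over those weights (Prim's algorithm below). This covers only the tree case; the paper's k-tree generalization is not shown.

import numpy as np
from collections import Counter

def mutual_information(xi, xj):
    """Plug-in estimate of I(X_i; X_j) from two discrete samples."""
    n = len(xi)
    joint, pi, pj = Counter(zip(xi, xj)), Counter(xi), Counter(xj)
    return sum((c / n) * np.log((c / n) / ((pi[a] / n) * (pj[b] / n)))
               for (a, b), c in joint.items())

def chow_liu_tree(data):                      # data: [n_samples, n_vars] integer array
    d = data.shape[1]
    w = np.zeros((d, d))
    for i in range(d):
        for j in range(i + 1, d):
            w[i, j] = w[j, i] = mutual_information(data[:, i], data[:, j])
    in_tree, edges = {0}, []                  # Prim's maximum spanning tree
    while len(in_tree) < d:
        i, j = max(((a, b) for a in in_tree for b in range(d) if b not in in_tree),
                   key=lambda e: w[e])
        edges.append((i, j))
        in_tree.add(j)
    return edges

rng = np.random.default_rng(0)
print(chow_liu_tree(rng.integers(0, 2, size=(200, 4))))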
