Search Results for author: Mu Li

Found 87 papers, 30 papers with code

Recurrent Attention for Neural Machine Translation

1 code implementation EMNLP 2021 Jiali Zeng, Shuangzhi Wu, Yongjing Yin, Yufan Jiang, Mu Li

Across an extensive set of experiments on 10 machine translation tasks, we find that RAN models are competitive and outperform their Transformer counterparts in certain scenarios, with fewer parameters and lower inference time.

Machine Translation Translation

Tencent Translation System for the WMT21 News Translation Task

no code implementations WMT (EMNLP) 2021 Longyue Wang, Mu Li, Fangxu Liu, Shuming Shi, Zhaopeng Tu, Xing Wang, Shuangzhi Wu, Jiali Zeng, Wen Zhang

Based on our success in the last WMT, we continuously employed advanced techniques such as large batch training, data selection and data filtering.

Data Augmentation Translation

Modeling Multi-Granularity Hierarchical Features for Relation Extraction

1 code implementation 9 Apr 2022 Xinnian Liang, Shuangzhi Wu, Mu Li, Zhoujun Li

In this paper, we propose a novel method to extract multi-granularity features based solely on the original input sentences.

Relation Extraction

Task-guided Disentangled Tuning for Pretrained Language Models

1 code implementation Findings (ACL) 2022 Jiali Zeng, Yufan Jiang, Shuangzhi Wu, Yongjing Yin, Mu Li

Pretrained language models (PLMs) trained on large-scale unlabeled corpora are typically fine-tuned on task-specific downstream datasets, which have produced state-of-the-art results on various NLP tasks.

Pretrained Language Models

Learning Confidence for Transformer-based Neural Machine Translation

1 code implementation ACL 2022 Yu Lu, Jiali Zeng, Jiajun Zhang, Shuangzhi Wu, Mu Li

Confidence estimation aims to quantify the confidence of the model prediction, providing an expectation of success.
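As context for what "confidence" means here, a common naive baseline (not the learned confidence estimate this paper proposes) is the softmax probability the model assigns to its own prediction; a minimal self-contained sketch, with all names illustrative:

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def naive_confidence(logits):
    """Confidence = probability assigned to the argmax prediction."""
    return max(softmax(logits))

# A peaked distribution yields high confidence, a flat one low confidence.
print(naive_confidence([5.0, 0.1, 0.2]))  # close to 1
print(naive_confidence([1.0, 1.0, 1.0]))  # exactly 1/3
```

Learned confidence methods aim to do better than this baseline, which is known to be overconfident on wrong predictions.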

Machine Translation Translation

Pseudocylindrical Convolutions for Learned Omnidirectional Image Compression

1 code implementation 25 Dec 2021 Mu Li, Kede Ma, Jinxing Li, David Zhang

We first describe parametric pseudocylindrical representation as a generalization of common pseudocylindrical map projections.

Image Compression

Benchmarking Multimodal AutoML for Tabular Data with Text Fields

1 code implementation 4 Nov 2021 Xingjian Shi, Jonas Mueller, Nick Erickson, Mu Li, Alexander J. Smola

We consider the use of automated supervised learning systems for data tables that not only contain numeric/categorical columns, but one or more text fields as well.


Blending Anti-Aliasing into Vision Transformer

no code implementations NeurIPS 2021 Shengju Qian, Hao Shao, Yi Zhu, Mu Li, Jiaya Jia

In this work, we analyze the uncharted problem of aliasing in vision transformers and explore how to incorporate anti-aliasing properties.

Distiller: A Systematic Study of Model Distillation Methods in Natural Language Processing

no code implementations EMNLP (sustainlp) 2021 Haoyu He, Xingjian Shi, Jonas Mueller, Sheng Zha, Mu Li, George Karypis

We aim to identify how different components in the KD pipeline affect the resulting performance and how much the optimal KD pipeline varies across different datasets/tasks, such as the data augmentation policy, the loss function, and the intermediate representation for transferring the knowledge between teacher and student.

Data Augmentation Hyperparameter Optimization

Unsupervised Keyphrase Extraction by Jointly Modeling Local and Global Context

1 code implementation EMNLP 2021 Xinnian Liang, Shuangzhi Wu, Mu Li, Zhoujun Li

In terms of the local view, we first build a graph structure based on the document where phrases are regarded as vertices and the edges are similarities between vertices.
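The local-view construction reads as a standard similarity graph; a minimal sketch assuming each phrase comes with an embedding vector (the phrase vectors and the 0.5 threshold below are illustrative assumptions, not the paper's setup):

```python
import math

def cosine(u, v):
    """Cosine similarity of two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def build_phrase_graph(phrase_vecs, threshold=0.5):
    """Vertices are phrases; weighted edges connect phrase pairs
    whose embedding similarity exceeds the threshold."""
    edges = {}
    phrases = list(phrase_vecs)
    for i in range(len(phrases)):
        for j in range(i + 1, len(phrases)):
            sim = cosine(phrase_vecs[phrases[i]], phrase_vecs[phrases[j]])
            if sim >= threshold:
                edges[(phrases[i], phrases[j])] = sim
    return edges

vecs = {"deep learning": [1.0, 0.9],
        "neural network": [0.9, 1.0],
        "banana": [-1.0, 0.2]}
print(build_phrase_graph(vecs))  # only the two related phrases are connected
```

Graph-based rankers such as TextRank then score vertices by centrality on a graph like this one.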

Document Embedding Keyphrase Extraction

Progressive Coordinate Transforms for Monocular 3D Object Detection

1 code implementation NeurIPS 2021 Li Wang, Li Zhang, Yi Zhu, Zhi Zhang, Tong He, Mu Li, Xiangyang Xue

Recognizing and localizing objects in the 3D space is a crucial ability for an AI agent to perceive its surrounding environment.

Monocular 3D Object Detection Object Localization

Attention Calibration for Transformer in Neural Machine Translation

no code implementations ACL 2021 Yu Lu, Jiali Zeng, Jiajun Zhang, Shuangzhi Wu, Mu Li

Attention mechanisms have achieved substantial improvements in neural machine translation by dynamically selecting relevant inputs for different predictions.
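The dynamic selection described above is realized by softmax-normalized query-key scores; a minimal single-query sketch of standard scaled dot-product attention (illustrative background only, not the paper's calibration method):

```python
import math

def attention(query, keys, values):
    """Single-query scaled dot-product attention over lists of vectors."""
    d = len(query)
    # Similarity of the query to each key, scaled by sqrt(d).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    # Softmax over the scores gives the attention weights.
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    weights = [w / z for w in weights]
    # Output is the attention-weighted average of the values.
    dim_v = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(dim_v)]

# The query matches the first key, so the output leans toward the first value.
out = attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [[10.0], [0.0]])
print(out)
```

Calibration work like this paper asks whether those softmax weights actually reflect each input's true contribution.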

Machine Translation Translation

A Unified Efficient Pyramid Transformer for Semantic Segmentation

no code implementations 29 Jul 2021 Fangrui Zhu, Yi Zhu, Li Zhang, Chongruo Wu, Yanwei Fu, Mu Li

Semantic segmentation is a challenging problem due to difficulties in modeling context in complex scenes and class confusions along boundaries.

Semantic Segmentation

Dive into Deep Learning

1 code implementation 21 Jun 2021 Aston Zhang, Zachary C. Lipton, Mu Li, Alexander J. Smola

This open-source book represents our attempt to make deep learning approachable, teaching readers the concepts, the context, and the code.

Multi-Domain Recommender Systems

Multimodal AutoML on Structured Tables with Text Fields

2 code implementations ICML Workshop AutoML 2021 Xingjian Shi, Jonas Mueller, Nick Erickson, Mu Li, Alex Smola

We design automated supervised learning systems for data tables that not only contain numeric/categorical columns, but text fields as well.


CrossNorm and SelfNorm for Generalization under Distribution Shifts

1 code implementation ICCV 2021 Zhiqiang Tang, Yunhe Gao, Yi Zhu, Zhi Zhang, Mu Li, Dimitris Metaxas

Can we develop new normalization methods to improve generalization robustness under distribution shifts?

Unity of Opposites: SelfNorm and CrossNorm for Model Robustness

no code implementations1 Jan 2021 Zhiqiang Tang, Yunhe Gao, Yi Zhu, Zhi Zhang, Mu Li, Dimitris N. Metaxas

CrossNorm exchanges styles between feature channels to perform style augmentation, diversifying the content and style mixtures.
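The style exchange can be illustrated with instance-style statistics: normalize each channel by its own mean/std, then re-style it with the other channel's statistics. A toy 1-D sketch under that assumption (not the paper's exact formulation):

```python
import math

def stats(x):
    """Mean and standard deviation of a 1-D channel."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return mean, math.sqrt(var)

def cross_norm(a, b, eps=1e-5):
    """Normalize each channel, then re-style it with the
    other channel's mean/std (the 'style exchange')."""
    ma, sa = stats(a)
    mb, sb = stats(b)
    a_styled = [(v - ma) / (sa + eps) * sb + mb for v in a]
    b_styled = [(v - mb) / (sb + eps) * sa + ma for v in b]
    return a_styled, b_styled

a, b = [0.0, 2.0, 4.0], [10.0, 10.5, 11.0]
a2, b2 = cross_norm(a, b)
# a2 now carries b's style (mean ~10.5), and b2 carries a's (mean ~2.0).
```

The content (relative ordering within each channel) is preserved while the style statistics are swapped, which is what makes this usable as augmentation.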

Object Recognition Unity

Improving Machine Reading Comprehension with Single-choice Decision and Transfer Learning

no code implementations 6 Nov 2020 Yufan Jiang, Shuangzhi Wu, Jing Gong, Yahui Cheng, Peng Meng, Weiliang Lin, Zhibo Chen, Mu Li

In addition, by transferring knowledge from other kinds of MRC tasks, our model achieves new state-of-the-art results in both single and ensemble settings.

AutoML Machine Reading Comprehension +1

FeatGraph: A Flexible and Efficient Backend for Graph Neural Network Systems

no code implementations 26 Aug 2020 Yuwei Hu, Zihao Ye, Minjie Wang, Jiali Yu, Da Zheng, Mu Li, Zheng Zhang, Zhiru Zhang, Yida Wang

FeatGraph provides a flexible programming interface to express diverse GNN models by composing coarse-grained sparse templates with fine-grained user-defined functions (UDFs) on each vertex/edge.

CSER: Communication-efficient SGD with Error Reset

no code implementations NeurIPS 2020 Cong Xie, Shuai Zheng, Oluwasanmi Koyejo, Indranil Gupta, Mu Li, Haibin Lin

The scalability of Distributed Stochastic Gradient Descent (SGD) is today limited by communication bottlenecks.
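A common way to cut that communication cost, which CSER's error reset builds on, is top-k gradient sparsification with an error-feedback residual; a hedged toy sketch (function names are illustrative, and this is not the exact CSER algorithm):

```python
def topk_compress(grad, k):
    """Keep the k largest-magnitude entries; zero out the rest."""
    idx = sorted(range(len(grad)), key=lambda i: abs(grad[i]),
                 reverse=True)[:k]
    return [g if i in idx else 0.0 for i, g in enumerate(grad)]

def step_with_error_feedback(grad, residual, k, reset=False):
    """Add the local residual before compressing; what was not sent
    becomes the new residual. An 'error reset' drops the residual."""
    corrected = [g + r for g, r in zip(grad, residual)]
    sent = topk_compress(corrected, k)
    new_residual = [c - s for c, s in zip(corrected, sent)]
    if reset:
        new_residual = [0.0] * len(grad)
    return sent, new_residual

sent, res = step_with_error_feedback([0.5, -2.0, 0.1],
                                     [0.0, 0.0, 0.0], k=1)
# Only the largest entry is transmitted; the rest is kept locally.
```

The residual makes lossy compression unbiased over time; periodically resetting it bounds the staleness of what it carries.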

Accelerated Large Batch Optimization of BERT Pretraining in 54 minutes

1 code implementation 24 Jun 2020 Shuai Zheng, Haibin Lin, Sheng Zha, Mu Li

Using the proposed LANS method and the learning rate scheme, we scaled up the mini-batch sizes to 96K and 33K in phases 1 and 2 of BERT pretraining, respectively.
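LANS belongs to the family of layer-wise adaptive large-batch optimizers; the core trust-ratio idea (shown here as a hedged LAMB-style sketch, not the exact LANS update) scales each layer's step by the ratio of its weight norm to its update norm:

```python
import math

def trust_ratio_update(weights, update, lr=0.1, eps=1e-8):
    """Scale the raw update so each layer's step size is
    proportional to that layer's weight norm."""
    w_norm = math.sqrt(sum(w * w for w in weights))
    u_norm = math.sqrt(sum(u * u for u in update))
    ratio = w_norm / (u_norm + eps) if w_norm > 0 and u_norm > 0 else 1.0
    return [w - lr * ratio * u for w, u in zip(weights, update)]

# Layer with weight norm 5 and unit-norm update: the step is rescaled by 5.
w = trust_ratio_update([3.0, 4.0], [0.6, 0.8])
```

This per-layer rescaling is what keeps training stable at batch sizes like 96K, where a single global learning rate over- or under-shoots different layers.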

Natural Language Understanding

Nimble: Efficiently Compiling Dynamic Neural Networks for Model Inference

no code implementations 4 Jun 2020 Haichen Shen, Jared Roesch, Zhi Chen, Wei Chen, Yong Wu, Mu Li, Vin Sharma, Zachary Tatlock, Yida Wang

Modern deep neural networks increasingly make use of features such as dynamic control flow, data structures and dynamic tensor shapes.

Machine learning formation enthalpies of intermetallics

1 code implementation 26 May 2020 Zhaohan Zhang, Mu Li, Katharine Flores, Rohan Mishra

The model uses easily accessible elemental properties as descriptors and has a mean absolute error (MAE) of 0.025 eV/atom in predicting the formation enthalpy of stable binary intermetallics reported in the Materials Project database.

Materials Science

Learning Context-Based Non-local Entropy Modeling for Image Compression

no code implementations 10 May 2020 Mu Li, Kai Zhang, WangMeng Zuo, Radu Timofte, David Zhang

To address this issue, we propose a non-local operation for context modeling by employing the global similarity within the context.

Image Compression

Improving Semantic Segmentation via Self-Training

no code implementations 30 Apr 2020 Yi Zhu, Zhongyue Zhang, Chongruo Wu, Zhi Zhang, Tong He, Hang Zhang, R. Manmatha, Mu Li, Alexander Smola

In the case of semantic segmentation, this means that large amounts of pixelwise annotations are required to learn accurate models.

Domain Generalization Semantic Segmentation

AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data

7 code implementations 13 Mar 2020 Nick Erickson, Jonas Mueller, Alexander Shirkov, Hang Zhang, Pedro Larroy, Mu Li, Alexander Smola

We introduce AutoGluon-Tabular, an open-source AutoML framework that requires only a single line of Python to train highly accurate machine learning models on an unprocessed tabular dataset such as a CSV file.

Neural Architecture Search

GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing

4 code implementations 9 Jul 2019 Jian Guo, He He, Tong He, Leonard Lausen, Mu Li, Haibin Lin, Xingjian Shi, Chenguang Wang, Junyuan Xie, Sheng Zha, Aston Zhang, Hang Zhang, Zhi Zhang, Zhongyue Zhang, Shuai Zheng, Yi Zhu

We present GluonCV and GluonNLP, the deep learning toolkits for computer vision and natural language processing based on Apache MXNet (incubating).

Efficient and Effective Context-Based Convolutional Entropy Modeling for Image Compression

2 code implementations 24 Jun 2019 Mu Li, Kede Ma, Jane You, David Zhang, WangMeng Zuo

For the former, we directly apply a CCN to the binarized representation of an image to compute the Bernoulli distribution of each code for entropy estimation.

Image Compression

Dynamic Mini-batch SGD for Elastic Distributed Training: Learning in the Limbo of Resources

2 code implementations 26 Apr 2019 Haibin Lin, Hang Zhang, Yifei Ma, Tong He, Zhi Zhang, Sheng Zha, Mu Li

One difficulty we observe is that the noise in the stochastic momentum estimation is accumulated over time and will have delayed effects when the batch size changes.

Image Classification Object Detection +1

Language Models with Transformers

1 code implementation arXiv 2019 Chenguang Wang, Mu Li, Alexander J. Smola

In this paper, we explore effective Transformer architectures for language modeling, including adding additional LSTM layers to better capture the sequential context while still keeping the computation efficient.

Language Modelling Neural Architecture Search

Learning Content-Weighted Deep Image Compression

1 code implementation 1 Apr 2019 Mu Li, WangMeng Zuo, Shuhang Gu, Jane You, David Zhang

Learning-based lossy image compression usually involves the joint optimization of rate-distortion performance.

Image Compression

Bag of Tricks for Image Classification with Convolutional Neural Networks

24 code implementations CVPR 2019 Tong He, Zhi Zhang, Hang Zhang, Zhongyue Zhang, Junyuan Xie, Mu Li

Much of the recent progress made in image classification research can be credited to training procedure refinements, such as changes in data augmentations and optimization methods.

General Classification Image Classification +3

Bidirectional Generative Adversarial Networks for Neural Machine Translation

no code implementations CONLL 2018 Zhirui Zhang, Shujie Liu, Mu Li, Ming Zhou, Enhong Chen

To address this issue and stabilize the GAN training, in this paper, we propose a novel Bidirectional Generative Adversarial Network for Neural Machine Translation (BGAN-NMT), which aims to introduce a generator model to act as the discriminator, whereby the discriminator naturally considers the entire translation space so that the inadequate training problem can be alleviated.

Language Modelling Machine Translation +1

Approximate Distribution Matching for Sequence-to-Sequence Learning

no code implementations 24 Aug 2018 Wenhu Chen, Guanlin Li, Shujie Liu, Zhirui Zhang, Mu Li, Ming Zhou

Then, we interpret sequence-to-sequence learning as learning a transductive model to transform the source local latent distributions to match their corresponding target distributions.

Image Captioning Machine Translation +1

Style Transfer as Unsupervised Machine Translation

no code implementations 23 Aug 2018 Zhirui Zhang, Shuo Ren, Shujie Liu, Jianyong Wang, Peng Chen, Mu Li, Ming Zhou, Enhong Chen

Language style transferring rephrases text with specific stylistic attributes while preserving the original attribute-independent content.

Style Transfer Translation +1

Regularizing Neural Machine Translation by Target-bidirectional Agreement

no code implementations 13 Aug 2018 Zhirui Zhang, Shuangzhi Wu, Shujie Liu, Mu Li, Ming Zhou, Tong Xu

Although Neural Machine Translation (NMT) has achieved remarkable progress in the past several years, most NMT systems still suffer from a fundamental shortcoming as in other sequence generation tasks: errors made early in generation process are fed as inputs to the model and can be quickly amplified, harming subsequent sequence generation.

Machine Translation Translation

Generative Bridging Network for Neural Sequence Prediction

no code implementations NAACL 2018 Wenhu Chen, Guanlin Li, Shuo Ren, Shujie Liu, Zhirui Zhang, Mu Li, Ming Zhou

In order to alleviate data sparsity and overfitting problems in maximum likelihood estimation (MLE) for sequence prediction tasks, we propose the Generative Bridging Network (GBN), in which a novel bridge module is introduced to assist the training of the sequence prediction model (the generator network).

Abstractive Text Summarization Image Captioning +4

Triangular Architecture for Rare Language Translation

no code implementations ACL 2018 Shuo Ren, Wenhu Chen, Shujie Liu, Mu Li, Ming Zhou, Shuai Ma

Neural Machine Translation (NMT) performs poorly on the low-resource language pair $(X, Z)$, especially when $Z$ is a rare language.

Machine Translation Translation

Joint Training for Neural Machine Translation Models with Monolingual Data

no code implementations 1 Mar 2018 Zhirui Zhang, Shujie Liu, Mu Li, Ming Zhou, Enhong Chen

Monolingual data have been demonstrated to be helpful in improving translation quality of both statistical machine translation (SMT) systems and neural machine translation (NMT) systems, especially in resource-poor or domain adaptation tasks where parallel data are not rich enough.

Domain Adaptation Machine Translation +1

Shift-Net: Image Inpainting via Deep Feature Rearrangement

1 code implementation ECCV 2018 Zhaoyi Yan, Xiaoming Li, Mu Li, WangMeng Zuo, Shiguang Shan

To this end, the encoder feature of the known region is shifted to serve as an estimation of the missing parts.

Image Inpainting

Enlarging Context with Low Cost: Efficient Arithmetic Coding with Trimmed Convolution

no code implementations 15 Jan 2018 Mu Li, Shuhang Gu, David Zhang, WangMeng Zuo

One key issue of the arithmetic encoding method is to predict the probability of the current coding symbol from its context, i.e., the preceding encoded symbols, which usually can be done by building a look-up table (LUT).
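The LUT idea can be sketched for a binary symbol stream: index a count table by the preceding k bits and read off a smoothed probability (an illustrative sketch, not the paper's trimmed-convolution model):

```python
def lut_probability(symbols, k=2):
    """Estimate P(bit = 1 | previous k bits) with a context look-up table."""
    table = {}  # context tuple -> [count of 0s, count of 1s]
    for i in range(k, len(symbols)):
        ctx = tuple(symbols[i - k:i])
        counts = table.setdefault(ctx, [1, 1])  # Laplace smoothing
        counts[symbols[i]] += 1
    return {ctx: c[1] / (c[0] + c[1]) for ctx, c in table.items()}

# An alternating stream is perfectly predictable from its 2-bit context.
probs = lut_probability([0, 1, 0, 1, 0, 1, 0, 1])
print(probs)
```

The table grows exponentially in k, which is exactly the "enlarging context with low cost" problem this paper attacks with trimmed convolutions instead.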

Image Compression

Stack-based Multi-layer Attention for Transition-based Dependency Parsing

no code implementations EMNLP 2017 Zhirui Zhang, Shujie Liu, Mu Li, Ming Zhou, Enhong Chen

Although sequence-to-sequence (seq2seq) network has achieved significant success in many NLP tasks such as machine translation and text summarization, simply applying this approach to transition-based dependency parsing cannot yield a comparable performance gain as in other state-of-the-art methods, such as stack-LSTM and head selection.

Language Modelling Machine Translation +3

Sequence-to-Dependency Neural Machine Translation

no code implementations ACL 2017 Shuangzhi Wu, Dong-dong Zhang, Nan Yang, Mu Li, Ming Zhou

Nowadays a typical Neural Machine Translation (NMT) model generates translations from left to right as a linear sequence, during which latent syntactic structures of the target sentences are not explicitly considered.

Machine Translation Translation

Generative Bridging Network in Neural Sequence Prediction

no code implementations 28 Jun 2017 Wenhu Chen, Guanlin Li, Shuo Ren, Shujie Liu, Zhirui Zhang, Mu Li, Ming Zhou

In order to alleviate data sparsity and overfitting problems in maximum likelihood estimation (MLE) for sequence prediction tasks, we propose the Generative Bridging Network (GBN), in which a novel bridge module is introduced to assist the training of the sequence prediction model (the generator network).

Abstractive Text Summarization Language Modelling +2

Learning Convolutional Networks for Content-weighted Image Compression

1 code implementation CVPR 2018 Mu Li, WangMeng Zuo, Shuhang Gu, Debin Zhao, David Zhang

Therefore, the encoder, decoder, binarizer and importance map can be jointly optimized in an end-to-end manner by using a subset of the ImageNet database.

Binarization Image Compression +1

Improving Attention Modeling with Implicit Distortion and Fertility for Machine Translation

no code implementations COLING 2016 Shi Feng, Shujie Liu, Nan Yang, Mu Li, Ming Zhou, Kenny Q. Zhu

In neural machine translation, the attention mechanism facilitates the translation process by producing a soft alignment between the source sentence and the target sentence.

Machine Translation Translation

Deep Identity-aware Transfer of Facial Attributes

no code implementations 18 Oct 2016 Mu Li, WangMeng Zuo, David Zhang

In general, our model consists of a mask network and an attribute transform network which work in synergy to generate a photo-realistic facial image with the reference attribute.

Denoising Face Hallucination +1

Convolutional Network for Attribute-driven and Identity-preserving Human Face Generation

no code implementations 23 Aug 2016 Mu Li, WangMeng Zuo, David Zhang

Here we address this problem from the view of optimization, and suggest an optimization model to generate human face with the given attributes while keeping the identity of the reference image.

Face Generation

On the Powerball Method for Optimization

no code implementations 24 Mar 2016 Ye Yuan, Mu Li, Jun Liu, Claire J. Tomlin

We propose a new method to accelerate the convergence of optimization algorithms.

Revise Saturated Activation Functions

no code implementations 18 Feb 2016 Bing Xu, Ruitong Huang, Mu Li

In this paper, we revise two commonly used saturated functions, the logistic sigmoid and the hyperbolic tangent (tanh).

Implicit Distortion and Fertility Models for Attention-based Encoder-Decoder NMT Model

no code implementations 13 Jan 2016 Shi Feng, Shujie Liu, Mu Li, Ming Zhou

Aiming to resolve these problems, we propose new variations of attention-based encoder-decoder and compare them with other models on machine translation.

Image Captioning Machine Translation +2

Data Driven Resource Allocation for Distributed Learning

no code implementations 15 Dec 2015 Travis Dick, Mu Li, Venkata Krishna Pillutla, Colin White, Maria Florina Balcan, Alex Smola

In distributed machine learning, data is dispatched to multiple machines for processing.

MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems

2 code implementations 3 Dec 2015 Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, Zheng Zhang

This paper describes both the API design and the system implementation of MXNet, and explains how embedding of both symbolic expression and tensor operation is handled in a unified fashion.

Dimensionality Reduction General Classification

High Performance Latent Variable Models

no code implementations 21 Oct 2015 Aaron Q. Li, Amr Ahmed, Mu Li, Vanja Josifovski

Latent variable models have accumulated a considerable amount of interest from the industry and academia for their versatility in a wide range of applications.

AdaDelay: Delay Adaptive Distributed Stochastic Convex Optimization

no code implementations 20 Aug 2015 Suvrit Sra, Adams Wei Yu, Mu Li, Alexander J. Smola

We study distributed stochastic convex optimization under the delayed gradient model where the server nodes perform parameter updates, while the worker nodes compute stochastic gradients.

Graph Partitioning via Parallel Submodular Approximation to Accelerate Distributed Machine Learning

no code implementations 18 May 2015 Mu Li, Dave G. Andersen, Alexander J. Smola

Distributed computing excels at processing large scale data, but the communication cost for synchronizing the shared parameters may slow down the overall performance.

Distributed Computing graph partitioning

Empirical Evaluation of Rectified Activations in Convolutional Network

2 code implementations 5 May 2015 Bing Xu, Naiyan Wang, Tianqi Chen, Mu Li

In this paper we investigate the performance of different types of rectified activation functions in convolutional neural network: standard rectified linear unit (ReLU), leaky rectified linear unit (Leaky ReLU), parametric rectified linear unit (PReLU) and a new randomized leaky rectified linear units (RReLU).
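The four variants compared above can be written down directly; the RReLU sampling bounds below follow the U(1/8, 1/3) range used in the paper's experiments (a minimal sketch, not an exact reimplementation of the evaluation):

```python
import random

def relu(x):
    """Standard rectified linear unit."""
    return max(0.0, x)

def leaky_relu(x, slope=0.01):
    """Fixed small negative slope instead of a hard zero."""
    return x if x > 0 else slope * x

def prelu(x, a):
    """Like Leaky ReLU, but the negative slope `a` is a learned parameter."""
    return x if x > 0 else a * x

def rrelu(x, lower=1.0 / 8, upper=1.0 / 3, training=True):
    """Randomized Leaky ReLU: sample the negative slope per activation
    during training; use the mean slope at test time."""
    if x > 0:
        return x
    a = random.uniform(lower, upper) if training else (lower + upper) / 2
    return a * x
```

All four agree on positive inputs and differ only in how they treat the negative half-line, which is exactly the axis the paper's comparison varies.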

General Classification Image Classification

Beyond Word-based Language Model in Statistical Machine Translation

no code implementations 5 Feb 2015 Jiajun Zhang, Shujie Liu, Mu Li, Ming Zhou, Cheng-qing Zong

The language model is one of the most important modules in statistical machine translation, and currently the word-based language model dominates this community.

Language Modelling Machine Translation +1
