Search Results for author: Mu Li

Found 110 papers, 45 papers with code

Tencent Translation System for the WMT21 News Translation Task

no code implementations WMT (EMNLP) 2021 Longyue Wang, Mu Li, Fangxu Liu, Shuming Shi, Zhaopeng Tu, Xing Wang, Shuangzhi Wu, Jiali Zeng, Wen Zhang

Based on our success in the last WMT, we continuously employed advanced techniques such as large batch training, data selection and data filtering.

Data Augmentation Translation

Recurrent Attention for Neural Machine Translation

1 code implementation EMNLP 2021 Jiali Zeng, Shuangzhi Wu, Yongjing Yin, Yufan Jiang, Mu Li

Across an extensive set of experiments on 10 machine translation tasks, we find that RAN models are competitive and outperform their Transformer counterparts in certain scenarios, with fewer parameters and less inference time.

Machine Translation NMT +1

PreDiff: Precipitation Nowcasting with Latent Diffusion Models

no code implementations 19 Jul 2023 Zhihan Gao, Xingjian Shi, Boran Han, Hao Wang, Xiaoyong Jin, Danielle Maddix, Yi Zhu, Mu Li, Yuyang Wang

We conduct empirical studies on two datasets: N-body MNIST, a synthetic dataset with chaotic behavior, and SEVIR, a real-world precipitation nowcasting dataset.


Tailoring Instructions to Student's Learning Levels Boosts Knowledge Distillation

1 code implementation 16 May 2023 Yuxin Ren, Zihan Zhong, Xingjian Shi, Yi Zhu, Chun Yuan, Mu Li

It has been commonly observed that a teacher model with superior performance does not necessarily result in a stronger student, highlighting a discrepancy between current teacher training practices and effective knowledge transfer.

Knowledge Distillation text-classification +2

XTab: Cross-table Pretraining for Tabular Transformers

1 code implementation 10 May 2023 Bingzhao Zhu, Xingjian Shi, Nick Erickson, Mu Li, George Karypis, Mahsa Shoaran

The success of self-supervised learning in computer vision and natural language processing has motivated pretraining methods on tabular data.

AutoML Federated Learning +1

Scanpath Prediction in Panoramic Videos via Expected Code Length Minimization

no code implementations 4 May 2023 Mu Li, Kanglong Fan, Kede Ma

Predicting human scanpaths when exploring panoramic videos is a challenging task due to the spherical geometry and the multimodality of the input, and the inherent uncertainty and diversity of the output.

Data Compression Imitation Learning +1

A Cheaper and Better Diffusion Language Model with Soft-Masked Noise

1 code implementation 10 Apr 2023 Jiaao Chen, Aston Zhang, Mu Li, Alex Smola, Diyi Yang

Diffusion models that are based on iterative denoising have been recently proposed and leveraged in various generation tasks like image generation.

Denoising Image Generation +1

LayoutDiffuse: Adapting Foundational Diffusion Models for Layout-to-Image Generation

no code implementations 16 Feb 2023 Jiaxin Cheng, Xiao Liang, Xingjian Shi, Tong He, Tianjun Xiao, Mu Li

Layout-to-image generation refers to the task of synthesizing photo-realistic images based on semantic layouts.

Layout-to-Image Generation

AIM: Adapting Image Models for Efficient Video Action Recognition

1 code implementation 6 Feb 2023 Taojiannan Yang, Yi Zhu, Yusheng Xie, Aston Zhang, Chen Chen, Mu Li

Recent vision transformer based video models mostly follow the "image pre-training then finetuning" paradigm and have achieved great success on multiple video benchmarks.

 Ranked #1 on Action Recognition on Diving-48 (using extra training data)

Action Classification Action Recognition +2

Multimodal Chain-of-Thought Reasoning in Language Models

2 code implementations 2 Feb 2023 Zhuosheng Zhang, Aston Zhang, Mu Li, Hai Zhao, George Karypis, Alex Smola

Large language models (LLMs) have shown impressive performance on complex reasoning by leveraging chain-of-thought (CoT) prompting to generate intermediate reasoning chains as the rationale to infer the answer.

Language Modelling Science Question Answering

Parameter-Efficient Fine-Tuning Design Spaces

no code implementations 4 Jan 2023 Jiaao Chen, Aston Zhang, Xingjian Shi, Mu Li, Alex Smola, Diyi Yang

We discover the following design patterns: (i) group layers in a spindle pattern; (ii) allocate the number of trainable parameters to layers uniformly; (iii) tune all the groups; (iv) assign proper tuning strategies to different groups.

Learning Multimodal Data Augmentation in Feature Space

1 code implementation 29 Dec 2022 Zichang Liu, Zhiqiang Tang, Xingjian Shi, Aston Zhang, Mu Li, Anshumali Shrivastava, Andrew Gordon Wilson

The ability to jointly learn from multiple modalities, such as text, audio, and visual data, is a defining feature of intelligent systems.

Data Augmentation Image Classification +1

What Makes for Good Tokenizers in Vision Transformer?

no code implementations 21 Dec 2022 Shengju Qian, Yi Zhu, Wenbo Li, Mu Li, Jiaya Jia

The transformer architecture, which has recently seen booming applications in vision tasks, departs from the widespread convolutional paradigm.

SPT: Semi-Parametric Prompt Tuning for Multitask Prompted Learning

no code implementations 21 Dec 2022 M Saiful Bari, Aston Zhang, Shuai Zheng, Xingjian Shi, Yi Zhu, Shafiq Joty, Mu Li

Pre-trained large language models can efficiently interpolate human-written prompts in a natural way.

Language Modelling

Visual Prompt Tuning for Test-time Domain Adaptation

no code implementations 10 Oct 2022 Yunhe Gao, Xingjian Shi, Yi Zhu, Hao Wang, Zhiqiang Tang, Xiong Zhou, Mu Li, Dimitris N. Metaxas

First, DePT plugs visual prompts into the vision Transformer and only tunes these source-initialized prompts during adaptation.

Unsupervised Domain Adaptation

Automatic Chain of Thought Prompting in Large Language Models

4 code implementations 7 Oct 2022 Zhuosheng Zhang, Aston Zhang, Mu Li, Alex Smola

Providing these steps for prompting demonstrations is called chain-of-thought (CoT) prompting.
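
The idea in the excerpt above can be made concrete with a toy sketch: a few-shot CoT prompt simply interleaves question/rationale/answer demonstrations before the new question. The demonstration text and helper name below are illustrative assumptions, not code or data from the paper.

```python
# Illustrative sketch (not the paper's released code): assembling a
# few-shot chain-of-thought prompt, where each demonstration pairs a
# question with intermediate reasoning steps before the final answer.
DEMOS = [
    (
        "Roger has 5 balls and buys 2 cans of 3 balls each. How many balls now?",
        "Roger started with 5 balls. 2 cans of 3 balls is 6 balls. 5 + 6 = 11.",
        "11",
    ),
]

def build_cot_prompt(question, demos=DEMOS):
    """Concatenate reasoning demonstrations, then append the new question."""
    parts = [f"Q: {q}\nA: {r} The answer is {a}." for q, r, a in demos]
    parts.append(f"Q: {question}\nA:")  # the model continues from here
    return "\n\n".join(parts)

prompt = build_cot_prompt("A farm has 3 pens with 4 hens each. How many hens?")
```

The paper's contribution is automating the construction of such demonstrations rather than writing them by hand.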

An Efficient Coarse-to-Fine Facet-Aware Unsupervised Summarization Framework based on Semantic Blocks

1 code implementation COLING 2022 Xinnian Liang, Jing Li, Shuangzhi Wu, Jiali Zeng, Yufan Jiang, Mu Li, Zhoujun Li

To tackle this problem, in this paper we propose an efficient Coarse-to-Fine Facet-Aware Ranking (C2F-FAR) framework for unsupervised long document summarization, which is based on semantic blocks.

Document Summarization

Earthformer: Exploring Space-Time Transformers for Earth System Forecasting

1 code implementation 12 Jul 2022 Zhihan Gao, Xingjian Shi, Hao Wang, Yi Zhu, Yuyang Wang, Mu Li, Dit-yan Yeung

With the explosive growth of the spatiotemporal Earth observation data in the past decade, data-driven models that apply Deep Learning (DL) are demonstrating impressive potential for various Earth system forecasting tasks.

Earth Surface Forecasting Weather Forecasting

Removing Batch Normalization Boosts Adversarial Training

1 code implementation 4 Jul 2022 Haotao Wang, Aston Zhang, Shuai Zheng, Xingjian Shi, Mu Li, Zhangyang Wang

In addition, NoFrost achieves $23.56\%$ adversarial robustness against the PGD attack, improving on the $13.57\%$ robustness of BN-based AT.

Adversarial Robustness

Perceptual Quality Assessment of Virtual Reality Videos in the Wild

1 code implementation 13 Jun 2022 Wen Wen, Mu Li, Yiru Yao, Xiangjie Sui, Yabin Zhang, Long Lan, Yuming Fang, Kede Ma

Investigating how people perceive virtual reality videos in the wild (i.e., those captured by everyday users) is a crucial and challenging task in VR-related applications due to complex authentic distortions localized in space and time.

Saliency Detection Video Quality Assessment

Modeling Multi-Granularity Hierarchical Features for Relation Extraction

1 code implementation NAACL 2022 Xinnian Liang, Shuangzhi Wu, Mu Li, Zhoujun Li

In this paper, we propose a novel method to extract multi-granularity features based solely on the original input sentences.

Relation Extraction

Learning Confidence for Transformer-based Neural Machine Translation

1 code implementation ACL 2022 Yu Lu, Jiali Zeng, Jiajun Zhang, Shuangzhi Wu, Mu Li

Confidence estimation aims to quantify the confidence of the model prediction, providing an expectation of success.

Machine Translation NMT +1

Task-guided Disentangled Tuning for Pretrained Language Models

1 code implementation Findings (ACL) 2022 Jiali Zeng, Yufan Jiang, Shuangzhi Wu, Yongjing Yin, Mu Li

Pretrained language models (PLMs) trained on large-scale unlabeled corpus are typically fine-tuned on task-specific downstream datasets, which have produced state-of-the-art results on various NLP tasks.

Pseudocylindrical Convolutions for Learned Omnidirectional Image Compression

1 code implementation 25 Dec 2021 Mu Li, Kede Ma, Jinxing Li, David Zhang

We first describe parametric pseudocylindrical representation as a generalization of common pseudocylindrical map projections.

Image Compression

Benchmarking Multimodal AutoML for Tabular Data with Text Fields

1 code implementation 4 Nov 2021 Xingjian Shi, Jonas Mueller, Nick Erickson, Mu Li, Alexander J. Smola

We consider the use of automated supervised learning systems for data tables that not only contain numeric/categorical columns, but one or more text fields as well.

AutoML Benchmarking

Blending Anti-Aliasing into Vision Transformer

no code implementations NeurIPS 2021 Shengju Qian, Hao Shao, Yi Zhu, Mu Li, Jiaya Jia

In this work, we analyze the uncharted problem of aliasing in vision transformers and explore incorporating anti-aliasing properties.

Distiller: A Systematic Study of Model Distillation Methods in Natural Language Processing

no code implementations EMNLP (sustainlp) 2021 Haoyu He, Xingjian Shi, Jonas Mueller, Sheng Zha, Mu Li, George Karypis

We aim to identify how different components in the KD pipeline affect the resulting performance and how much the optimal KD pipeline varies across different datasets/tasks, such as the data augmentation policy, the loss function, and the intermediate representation for transferring the knowledge between teacher and student.

Data Augmentation Hyperparameter Optimization

Unsupervised Keyphrase Extraction by Jointly Modeling Local and Global Context

1 code implementation EMNLP 2021 Xinnian Liang, Shuangzhi Wu, Mu Li, Zhoujun Li

In terms of the local view, we first build a graph structure based on the document where phrases are regarded as vertices and the edges are similarities between vertices.

Document Embedding Keyphrase Extraction
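
The local view described in the excerpt above can be sketched in simplified form: candidate phrases are vertices and edge weights are similarities between phrase vectors. The toy vectors and the degree-centrality scoring below are illustrative assumptions, not the paper's exact boundary-aware formulation.

```python
import math

# Hedged sketch of a similarity graph over candidate phrases: vertices
# are phrases, edges are cosine similarities between (toy) phrase
# vectors, and each phrase is ranked by its total connection strength.
phrases = {
    "keyphrase extraction": [0.9, 0.1, 0.0],
    "unsupervised method":  [0.7, 0.3, 0.1],
    "weather report":       [0.0, 0.1, 0.9],  # off-topic, should rank last
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Score each vertex by the sum of its edge weights (degree centrality).
scores = {
    p: sum(cosine(vec, other) for q, other in phrases.items() if q != p)
    for p, vec in phrases.items()
}
ranked = sorted(scores, key=scores.get, reverse=True)
```

Phrases well connected to the rest of the document score high; outliers like the off-topic phrase fall to the bottom.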

Attention Calibration for Transformer in Neural Machine Translation

no code implementations ACL 2021 Yu Lu, Jiali Zeng, Jiajun Zhang, Shuangzhi Wu, Mu Li

Attention mechanisms have achieved substantial improvements in neural machine translation by dynamically selecting relevant inputs for different predictions.

Machine Translation Translation

A Unified Efficient Pyramid Transformer for Semantic Segmentation

no code implementations 29 Jul 2021 Fangrui Zhu, Yi Zhu, Li Zhang, Chongruo wu, Yanwei Fu, Mu Li

Semantic segmentation is a challenging problem due to difficulties in modeling context in complex scenes and class confusions along boundaries.

Semantic Segmentation

Dive into Deep Learning

1 code implementation 21 Jun 2021 Aston Zhang, Zachary C. Lipton, Mu Li, Alexander J. Smola

This open-source book represents our attempt to make deep learning approachable, teaching readers the concepts, the context, and the code.

Multi-Domain Recommender Systems

Multimodal AutoML on Structured Tables with Text Fields

2 code implementations ICML Workshop AutoML 2021 Xingjian Shi, Jonas Mueller, Nick Erickson, Mu Li, Alex Smola

We design automated supervised learning systems for data tables that not only contain numeric/categorical columns, but text fields as well.


CrossNorm and SelfNorm for Generalization under Distribution Shifts

1 code implementation ICCV 2021 Zhiqiang Tang, Yunhe Gao, Yi Zhu, Zhi Zhang, Mu Li, Dimitris Metaxas

Can we develop new normalization methods to improve generalization robustness under distribution shifts?

Unity of Opposites: SelfNorm and CrossNorm for Model Robustness

no code implementations1 Jan 2021 Zhiqiang Tang, Yunhe Gao, Yi Zhu, Zhi Zhang, Mu Li, Dimitris N. Metaxas

CrossNorm exchanges styles between feature channels to perform style augmentation, diversifying the content and style mixtures.

Object Recognition Unity

Improving Machine Reading Comprehension with Single-choice Decision and Transfer Learning

no code implementations 6 Nov 2020 Yufan Jiang, Shuangzhi Wu, Jing Gong, Yahui Cheng, Peng Meng, Weiliang Lin, Zhibo Chen, Mu Li

In addition, by transferring knowledge from other kinds of MRC tasks, our model achieves new state-of-the-art results in both single and ensemble settings.

AutoML Binary Classification +2

FeatGraph: A Flexible and Efficient Backend for Graph Neural Network Systems

no code implementations 26 Aug 2020 Yuwei Hu, Zihao Ye, Minjie Wang, Jiali Yu, Da Zheng, Mu Li, Zheng Zhang, Zhiru Zhang, Yida Wang

FeatGraph provides a flexible programming interface to express diverse GNN models by composing coarse-grained sparse templates with fine-grained user-defined functions (UDFs) on each vertex/edge.

CSER: Communication-efficient SGD with Error Reset

no code implementations NeurIPS 2020 Cong Xie, Shuai Zheng, Oluwasanmi Koyejo, Indranil Gupta, Mu Li, Haibin Lin

The scalability of Distributed Stochastic Gradient Descent (SGD) is today limited by communication bottlenecks.
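
One family of remedies for this communication bottleneck is sparsified gradient exchange with a local error memory, periodically reset. The sketch below illustrates that generic idea under stated assumptions (top-1 sparsification, a fixed reset period); it is not the paper's exact CSER algorithm.

```python
# Hedged sketch: compress each gradient before "communicating" it, keep
# the uncommunicated remainder in a local error memory, and reset that
# memory periodically. All specifics here are illustrative assumptions.
def topk(vec, k=1):
    """Keep only the k largest-magnitude entries; zero the rest."""
    keep = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)[:k]
    return [v if i in keep else 0.0 for i, v in enumerate(vec)]

def compressed_steps(grads, reset_every=2):
    error = [0.0] * len(grads[0])
    sent = []
    for t, g in enumerate(grads, start=1):
        acc = [e + gi for e, gi in zip(error, g)]   # add residual error
        msg = topk(acc)                              # sparse message to send
        error = [a - m for a, m in zip(acc, msg)]    # keep what wasn't sent
        if t % reset_every == 0:
            error = [0.0] * len(error)               # periodic error reset
        sent.append(msg)
    return sent

msgs = compressed_steps([[0.5, 0.1], [0.2, 0.4]])
```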

Accelerated Large Batch Optimization of BERT Pretraining in 54 minutes

1 code implementation 24 Jun 2020 Shuai Zheng, Haibin Lin, Sheng Zha, Mu Li

Using the proposed LANS method and the learning rate scheme, we scaled up the mini-batch sizes to 96K and 33K in phases 1 and 2 of BERT pretraining, respectively.

Natural Language Understanding

Nimble: Efficiently Compiling Dynamic Neural Networks for Model Inference

no code implementations 4 Jun 2020 Haichen Shen, Jared Roesch, Zhi Chen, Wei Chen, Yong Wu, Mu Li, Vin Sharma, Zachary Tatlock, Yida Wang

Modern deep neural networks increasingly make use of features such as dynamic control flow, data structures and dynamic tensor shapes.

Machine learning formation enthalpies of intermetallics

1 code implementation 26 May 2020 Zhaohan Zhang, Mu Li, Katharine Flores, Rohan Mishra

The model uses easily accessible elemental properties as descriptors and has a mean absolute error (MAE) of 0.025 eV/atom in predicting the formation enthalpy of stable binary intermetallics reported in the Materials Project database.

Materials Science
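
The MAE quoted above is simply the mean absolute difference between predicted and reference formation enthalpies. A minimal sketch, with made-up numbers for illustration:

```python
# Mean absolute error (MAE) between predicted and reference formation
# enthalpies, in eV/atom. The values below are invented for illustration.
predicted = [-0.41, -0.28, -0.55, -0.10]
reference = [-0.44, -0.25, -0.52, -0.12]

mae = sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)
print(f"MAE = {mae:.3f} eV/atom")
```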

Learning Context-Based Non-local Entropy Modeling for Image Compression

no code implementations 10 May 2020 Mu Li, Kai Zhang, WangMeng Zuo, Radu Timofte, David Zhang

To address this issue, we propose a non-local operation for context modeling by employing the global similarity within the context.

Image Compression

Improving Semantic Segmentation via Self-Training

no code implementations 30 Apr 2020 Yi Zhu, Zhongyue Zhang, Chongruo wu, Zhi Zhang, Tong He, Hang Zhang, R. Manmatha, Mu Li, Alexander Smola

In the case of semantic segmentation, this means that large amounts of pixelwise annotations are required to learn accurate models.

Domain Generalization Semantic Segmentation

AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data

8 code implementations 13 Mar 2020 Nick Erickson, Jonas Mueller, Alexander Shirkov, Hang Zhang, Pedro Larroy, Mu Li, Alexander Smola

We introduce AutoGluon-Tabular, an open-source AutoML framework that requires only a single line of Python to train highly accurate machine learning models on an unprocessed tabular dataset such as a CSV file.

Neural Architecture Search

GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing

4 code implementations 9 Jul 2019 Jian Guo, He He, Tong He, Leonard Lausen, Mu Li, Haibin Lin, Xingjian Shi, Chenguang Wang, Junyuan Xie, Sheng Zha, Aston Zhang, Hang Zhang, Zhi Zhang, Zhongyue Zhang, Shuai Zheng, Yi Zhu

We present GluonCV and GluonNLP, the deep learning toolkits for computer vision and natural language processing based on Apache MXNet (incubating).

Efficient and Effective Context-Based Convolutional Entropy Modeling for Image Compression

2 code implementations 24 Jun 2019 Mu Li, Kede Ma, Jane You, David Zhang, WangMeng Zuo

For the former, we directly apply a CCN to the binarized representation of an image to compute the Bernoulli distribution of each code for entropy estimation.

Image Compression
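
The entropy estimation mentioned in the excerpt above amounts to scoring each binary code against its predicted Bernoulli probability: the ideal code length of the bitstream is the sum of −log2 P(b_i). A hedged sketch with illustrative bits and probabilities:

```python
import math

# Given a predicted Bernoulli probability p_i that binary code b_i
# equals 1, the ideal total code length is sum_i -log2 P(b_i).
# Bits and probabilities below are illustrative, not from the paper.
bits  = [1, 0, 1, 1, 0]
probs = [0.9, 0.2, 0.8, 0.7, 0.1]   # model's P(b_i = 1)

def expected_bits(bits, probs):
    total = 0.0
    for b, p in zip(bits, probs):
        total += -math.log2(p if b == 1 else 1.0 - p)
    return total

cost = expected_bits(bits, probs)   # well under the 5 raw bits here
```

A good context model drives these probabilities toward the true statistics, shrinking the expected code length.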

Dynamic Mini-batch SGD for Elastic Distributed Training: Learning in the Limbo of Resources

2 code implementations 26 Apr 2019 Haibin Lin, Hang Zhang, Yifei Ma, Tong He, Zhi Zhang, Sheng Zha, Mu Li

One difficulty we observe is that the noise in the stochastic momentum estimation is accumulated over time and will have delayed effects when the batch size changes.

Image Classification object-detection +3

Language Models with Transformers

1 code implementation arXiv 2019 Chenguang Wang, Mu Li, Alexander J. Smola

In this paper, we explore effective Transformer architectures for language modeling, including adding additional LSTM layers to better capture the sequential context while still keeping the computation efficient.

Ranked #2 on Language Modelling on Penn Treebank (Word Level) (using extra training data)

Language Modelling Neural Architecture Search

Learning Content-Weighted Deep Image Compression

1 code implementation 1 Apr 2019 Mu Li, WangMeng Zuo, Shuhang Gu, Jane You, David Zhang

Learning-based lossy image compression usually involves the joint optimization of rate-distortion performance.

Image Compression

Bag of Tricks for Image Classification with Convolutional Neural Networks

25 code implementations CVPR 2019 Tong He, Zhi Zhang, Hang Zhang, Zhongyue Zhang, Junyuan Xie, Mu Li

Much of the recent progress made in image classification research can be credited to training procedure refinements, such as changes in data augmentations and optimization methods.

Domain Generalization General Classification +4

Bidirectional Generative Adversarial Networks for Neural Machine Translation

no code implementations CONLL 2018 Zhirui Zhang, Shujie Liu, Mu Li, Ming Zhou, Enhong Chen

To address this issue and stabilize the GAN training, in this paper, we propose a novel Bidirectional Generative Adversarial Network for Neural Machine Translation (BGAN-NMT), which aims to introduce a generator model to act as the discriminator, whereby the discriminator naturally considers the entire translation space so that the inadequate training problem can be alleviated.

Language Modelling Machine Translation +2

Approximate Distribution Matching for Sequence-to-Sequence Learning

no code implementations 24 Aug 2018 Wenhu Chen, Guanlin Li, Shujie Liu, Zhirui Zhang, Mu Li, Ming Zhou

Then, we interpret sequence-to-sequence learning as learning a transductive model to transform the source local latent distributions to match their corresponding target distributions.

Image Captioning Machine Translation +1

Style Transfer as Unsupervised Machine Translation

no code implementations 23 Aug 2018 Zhirui Zhang, Shuo Ren, Shujie Liu, Jianyong Wang, Peng Chen, Mu Li, Ming Zhou, Enhong Chen

Language style transfer rephrases text with specific stylistic attributes while preserving the original attribute-independent content.

NMT Style Transfer +2

Regularizing Neural Machine Translation by Target-bidirectional Agreement

no code implementations 13 Aug 2018 Zhirui Zhang, Shuangzhi Wu, Shujie Liu, Mu Li, Ming Zhou, Tong Xu

Although Neural Machine Translation (NMT) has achieved remarkable progress in the past several years, most NMT systems still suffer from a fundamental shortcoming as in other sequence generation tasks: errors made early in generation process are fed as inputs to the model and can be quickly amplified, harming subsequent sequence generation.

Machine Translation NMT +1

Generative Bridging Network for Neural Sequence Prediction

no code implementations NAACL 2018 Wenhu Chen, Guanlin Li, Shuo Ren, Shujie Liu, Zhirui Zhang, Mu Li, Ming Zhou

In order to alleviate data sparsity and overfitting problems in maximum likelihood estimation (MLE) for sequence prediction tasks, we propose the Generative Bridging Network (GBN), in which a novel bridge module is introduced to assist the training of the sequence prediction model (the generator network).

Abstractive Text Summarization Image Captioning +5

Triangular Architecture for Rare Language Translation

no code implementations ACL 2018 Shuo Ren, Wenhu Chen, Shujie Liu, Mu Li, Ming Zhou, Shuai Ma

Neural Machine Translation (NMT) performs poorly on the low-resource language pair $(X, Z)$, especially when $Z$ is a rare language.

Machine Translation NMT +1

Joint Training for Neural Machine Translation Models with Monolingual Data

no code implementations 1 Mar 2018 Zhirui Zhang, Shujie Liu, Mu Li, Ming Zhou, Enhong Chen

Monolingual data have been demonstrated to be helpful in improving translation quality of both statistical machine translation (SMT) systems and neural machine translation (NMT) systems, especially in resource-poor or domain adaptation tasks where parallel data are not rich enough.

Domain Adaptation Machine Translation +2

Shift-Net: Image Inpainting via Deep Feature Rearrangement

2 code implementations ECCV 2018 Zhaoyi Yan, Xiaoming Li, Mu Li, WangMeng Zuo, Shiguang Shan

To this end, the encoder feature of the known region is shifted to serve as an estimation of the missing parts.

Image Inpainting

Enlarging Context with Low Cost: Efficient Arithmetic Coding with Trimmed Convolution

no code implementations 15 Jan 2018 Mu Li, Shuhang Gu, David Zhang, WangMeng Zuo

One key issue in arithmetic encoding is to predict the probability of the current coding symbol from its context, i.e., the preceding encoded symbols, which usually can be done by building a look-up table (LUT).

Image Compression
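
The LUT baseline mentioned above can be sketched with simple counts: estimate the probability of the current binary symbol from its k preceding symbols. The smoothing scheme and toy stream below are illustrative assumptions, not the paper's construction.

```python
from collections import Counter, defaultdict

# Hedged sketch of a context look-up table for binary arithmetic coding:
# P(symbol | previous k symbols) from Laplace-smoothed counts.
def build_lut(stream, k=2):
    table = defaultdict(Counter)
    for i in range(k, len(stream)):
        context = tuple(stream[i - k:i])
        table[context][stream[i]] += 1
    return table

def prob(table, context, symbol, alphabet=(0, 1)):
    counts = table[tuple(context)]
    total = sum(counts[s] + 1 for s in alphabet)   # add-one smoothing
    return (counts[symbol] + 1) / total

stream = [0, 1, 0, 1, 0, 1, 0, 1]
lut = build_lut(stream, k=2)
p = prob(lut, [0, 1], 0)   # in this stream, "0 1" is always followed by 0
```

The paper's point is that enlarging the context k makes such a table intractable, motivating a trimmed-convolution predictor instead.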

Stack-based Multi-layer Attention for Transition-based Dependency Parsing

no code implementations EMNLP 2017 Zhirui Zhang, Shujie Liu, Mu Li, Ming Zhou, Enhong Chen

Although sequence-to-sequence (seq2seq) network has achieved significant success in many NLP tasks such as machine translation and text summarization, simply applying this approach to transition-based dependency parsing cannot yield a comparable performance gain as in other state-of-the-art methods, such as stack-LSTM and head selection.

Language Modelling Machine Translation +3

Sequence-to-Dependency Neural Machine Translation

no code implementations ACL 2017 Shuangzhi Wu, Dong-dong Zhang, Nan Yang, Mu Li, Ming Zhou

Nowadays a typical Neural Machine Translation (NMT) model generates translations from left to right as a linear sequence, during which latent syntactic structures of the target sentences are not explicitly considered.

Machine Translation NMT +1

Generative Bridging Network in Neural Sequence Prediction

no code implementations 28 Jun 2017 Wenhu Chen, Guanlin Li, Shuo Ren, Shujie Liu, Zhirui Zhang, Mu Li, Ming Zhou

In order to alleviate data sparsity and overfitting problems in maximum likelihood estimation (MLE) for sequence prediction tasks, we propose the Generative Bridging Network (GBN), in which a novel bridge module is introduced to assist the training of the sequence prediction model (the generator network).

Abstractive Text Summarization Language Modelling +2

Learning Convolutional Networks for Content-weighted Image Compression

1 code implementation CVPR 2018 Mu Li, WangMeng Zuo, Shuhang Gu, Debin Zhao, David Zhang

Therefore, the encoder, decoder, binarizer and importance map can be jointly optimized in an end-to-end manner by using a subset of the ImageNet database.

Binarization Image Compression +1

Improving Attention Modeling with Implicit Distortion and Fertility for Machine Translation

no code implementations COLING 2016 Shi Feng, Shujie Liu, Nan Yang, Mu Li, Ming Zhou, Kenny Q. Zhu

In neural machine translation, the attention mechanism facilitates the translation process by producing a soft alignment between the source sentence and the target sentence.

Machine Translation Translation

Deep Identity-aware Transfer of Facial Attributes

no code implementations 18 Oct 2016 Mu Li, WangMeng Zuo, David Zhang

In general, our model consists of a mask network and an attribute transform network which work in synergy to generate a photo-realistic facial image with the reference attribute.

Denoising Face Hallucination +1

Convolutional Network for Attribute-driven and Identity-preserving Human Face Generation

no code implementations 23 Aug 2016 Mu Li, WangMeng Zuo, David Zhang

Here we address this problem from the view of optimization, and suggest an optimization model to generate a human face with the given attributes while keeping the identity of the reference image.

Face Generation

On the Powerball Method for Optimization

no code implementations 24 Mar 2016 Ye Yuan, Mu Li, Jun Liu, Claire J. Tomlin

We propose a new method to accelerate the convergence of optimization algorithms.
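
The Powerball idea can be sketched as applying an element-wise power γ in [0, 1) to the gradient before the usual descent step, i.e. replacing each gradient entry g with sign(g)·|g|^γ. The step size, γ, and test function below are illustrative choices, not the paper's experiments.

```python
# Hedged sketch of a Powerball-style update: transform each gradient
# entry as sign(g) * |g|**gamma before the descent step.
def powerball_step(x, grad, lr=0.1, gamma=0.5):
    return [xi - lr * (1.0 if g >= 0 else -1.0) * abs(g) ** gamma
            for xi, g in zip(x, grad)]

# Minimize f(x) = x0^2 + x1^2 (gradient 2x) for a few iterations.
x = [4.0, -3.0]
for _ in range(50):
    grad = [2 * xi for xi in x]
    x = powerball_step(x, grad)
# x ends up close to the minimizer at the origin.
```

Note that with γ < 1 the powered update does not vanish as fast as the gradient near the optimum, so in practice the iterate settles into a small neighborhood of the minimizer rather than converging exactly.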

Revise Saturated Activation Functions

no code implementations 18 Feb 2016 Bing Xu, Ruitong Huang, Mu Li

In this paper, we revise two commonly used saturated functions, the logistic sigmoid and the hyperbolic tangent (tanh).

Implicit Distortion and Fertility Models for Attention-based Encoder-Decoder NMT Model

no code implementations 13 Jan 2016 Shi Feng, Shujie Liu, Mu Li, Ming Zhou

Aiming to resolve these problems, we propose new variations of attention-based encoder-decoder and compare them with other models on machine translation.

Image Captioning Machine Translation +4

Data Driven Resource Allocation for Distributed Learning

no code implementations 15 Dec 2015 Travis Dick, Mu Li, Venkata Krishna Pillutla, Colin White, Maria Florina Balcan, Alex Smola

In distributed machine learning, data is dispatched to multiple machines for processing.

MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems

2 code implementations 3 Dec 2015 Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, Zheng Zhang

This paper describes both the API design and the system implementation of MXNet, and explains how embedding of both symbolic expression and tensor operation is handled in a unified fashion.

BIG-bench Machine Learning Clustering +2

High Performance Latent Variable Models

no code implementations 21 Oct 2015 Aaron Q. Li, Amr Ahmed, Mu Li, Vanja Josifovski

Latent variable models have attracted considerable interest from industry and academia for their versatility in a wide range of applications.

Vocal Bursts Intensity Prediction

AdaDelay: Delay Adaptive Distributed Stochastic Convex Optimization

no code implementations 20 Aug 2015 Suvrit Sra, Adams Wei Yu, Mu Li, Alexander J. Smola

We study distributed stochastic convex optimization under the delayed gradient model where the server nodes perform parameter updates, while the worker nodes compute stochastic gradients.

Graph Partitioning via Parallel Submodular Approximation to Accelerate Distributed Machine Learning

no code implementations 18 May 2015 Mu Li, Dave G. Andersen, Alexander J. Smola

Distributed computing excels at processing large scale data, but the communication cost for synchronizing the shared parameters may slow down the overall performance.

BIG-bench Machine Learning Distributed Computing +1

Empirical Evaluation of Rectified Activations in Convolutional Network

2 code implementations 5 May 2015 Bing Xu, Naiyan Wang, Tianqi Chen, Mu Li

In this paper we investigate the performance of different types of rectified activation functions in convolutional neural network: standard rectified linear unit (ReLU), leaky rectified linear unit (Leaky ReLU), parametric rectified linear unit (PReLU) and a new randomized leaky rectified linear units (RReLU).

General Classification Image Classification
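
The four activations compared in the paper have simple element-wise forms. A minimal sketch in their commonly used definitions (the slope values and sampling range below are the conventional defaults, shown for illustration):

```python
import random

# The four rectified activations compared above, element-wise.
def relu(x):
    return max(0.0, x)

def leaky_relu(x, slope=0.01):
    return x if x > 0 else slope * x

def prelu(x, a):
    # 'a' is a learned per-channel slope in PReLU; here just a parameter.
    return x if x > 0 else a * x

def rrelu(x, lower=1/8, upper=1/3, training=True, rng=random):
    # RReLU samples the negative slope uniformly during training and
    # uses the fixed average slope at test time.
    a = rng.uniform(lower, upper) if training else (lower + upper) / 2
    return x if x > 0 else a * x
```

All four agree on positive inputs; they differ only in how (and whether) negative inputs leak through.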

Beyond Word-based Language Model in Statistical Machine Translation

no code implementations 5 Feb 2015 Jiajun Zhang, Shujie Liu, Mu Li, Ming Zhou, Cheng-qing Zong

The language model is one of the most important modules in statistical machine translation, and currently word-based language models dominate this community.

Language Modelling Machine Translation +1
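
The word-based language model mentioned above is, in its simplest form, a count-based n-gram model. A toy bigram sketch (the corpus is invented for illustration):

```python
from collections import Counter, defaultdict

# Count-based word bigram model: P(word | previous word) from
# co-occurrence counts over a tiny toy corpus.
corpus = "the cat sat on the mat the cat ran".split()

bigrams = defaultdict(Counter)
for prev, word in zip(corpus, corpus[1:]):
    bigrams[prev][word] += 1

def p(word, prev):
    counts = bigrams[prev]
    total = sum(counts.values())
    return counts[word] / total if total else 0.0

# "the" is followed by cat, mat, cat -> P(cat | the) = 2/3 here.
```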
