Search Results for author: Li Shen

Found 167 papers, 63 papers with code

Are Large Language Models Really Robust to Word-Level Perturbations?

no code implementations20 Sep 2023 Haoyu Wang, Guozheng Ma, Cong Yu, Ning Gui, Linrui Zhang, Zhiqi Huang, Suwei Ma, Yongzhe Chang, Sen Zhang, Li Shen, Xueqian Wang, Peilin Zhao, DaCheng Tao

To address this issue, we propose a novel rational evaluation approach that leverages pre-trained reward models as diagnostic tools to evaluate the robustness of LLMs, which we refer to as the Reward Model for Reasonable Robustness Evaluation (TREvaL).

Question Answering

FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup for Non-IID Data

no code implementations18 Sep 2023 Hao Sun, Li Shen, Shixiang Chen, Jingwei Sun, Jing Li, Guangzhong Sun, DaCheng Tao

Federated learning is an emerging distributed machine learning method that enables a large number of clients to train a model without exchanging their local data.

Federated Learning Scheduling
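
The client/server pattern described in this abstract can be sketched with a generic FedAvg-style loop (this is not FedLALR itself; the scalar model, quadratic loss, and all names are toy assumptions for illustration): clients train on data they never share, and only model weights travel to the server for averaging.

```python
# Hedged sketch of the generic FedAvg pattern: each client trains on its
# own data (never shared) and only the resulting weights are averaged by
# the server. The scalar model and quadratic loss are toy assumptions.
def local_train(w, data, lr=0.1, epochs=5):
    for _ in range(epochs):
        for x in data:
            w -= lr * 2.0 * (w - x)   # SGD step on the loss (w - x)^2
    return w

def fed_avg(global_w, client_datasets, rounds=20):
    for _ in range(rounds):
        local_ws = [local_train(global_w, d) for d in client_datasets]
        global_w = sum(local_ws) / len(local_ws)   # server-side average
    return global_w

# Two clients whose local optima differ (non-IID in miniature).
w = fed_avg(0.0, [[1.0, 1.0], [3.0, 3.0]])
```

The averaged weight settles between the two clients' local optima, which is exactly the tension (client drift under non-IID data) that client-specific learning rates aim to manage.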

Continual Learning From a Stream of APIs

no code implementations31 Aug 2023 Enneng Yang, Zhenyi Wang, Li Shen, Nan Yin, Tongliang Liu, Guibing Guo, Xingwei Wang, DaCheng Tao

Next, we train the CL model by minimizing the gap between the responses of the CL model and the black-box API on synthetic data, to transfer the API's knowledge to the CL model.

Continual Learning
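
The gap-minimization idea, training a local model to match a black-box API's responses on synthetic inputs, can be sketched in a toy setting (the linear `black_box_api`, the local linear model, and all names here are invented for illustration, not the paper's code):

```python
# Illustrative sketch (not the paper's code) of transferring a black-box
# API's knowledge: query it on synthetic inputs and minimize the squared
# gap between the local model's outputs and the API's responses.
def black_box_api(x):
    return 2.0 * x + 1.0            # stands in for an inaccessible service

def distill(synthetic_inputs, lr=0.05, epochs=200):
    a, b = 0.0, 0.0                 # local linear model y = a*x + b
    for _ in range(epochs):
        for x in synthetic_inputs:
            err = (a * x + b) - black_box_api(x)   # only outputs visible
            a -= lr * err * x       # gradient of 0.5 * err^2 w.r.t. a
            b -= lr * err           # ... and w.r.t. b
    return a, b

a, b = distill([-1.0, 0.0, 1.0, 2.0])
```

The local model recovers the API's behavior without ever seeing its parameters or training data, which is the essence of learning from a stream of APIs.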

MerA: Merging Pretrained Adapters For Few-Shot Learning

no code implementations30 Aug 2023 Shwai He, Run-Ze Fan, Liang Ding, Li Shen, Tianyi Zhou, DaCheng Tao

Adapter tuning, which updates only a few parameters, has become a mainstream method for fine-tuning pretrained language models to downstream tasks.

Few-Shot Learning MRPC

Master-slave Deep Architecture for Top-K Multi-armed Bandits with Non-linear Bandit Feedback and Diversity Constraints

1 code implementation24 Aug 2023 Hanchi Huang, Li Shen, Deheng Ye, Wei Liu

We propose a novel master-slave architecture to solve the top-K combinatorial multi-armed bandits problem with non-linear bandit feedback and diversity constraints, which, to the best of our knowledge, is the first combinatorial bandits setting to consider diversity constraints under bandit feedback.

Multi-Armed Bandits

Can Linguistic Knowledge Improve Multimodal Alignment in Vision-Language Pretraining?

1 code implementation24 Aug 2023 Fei Wang, Liang Ding, Jun Rao, Ye Liu, Li Shen, Changxing Ding

The multimedia community has shown significant interest in perceiving and representing the physical world with multimodal pretrained neural network models, and among them, vision-language pretraining (VLP) is currently the most captivating topic.

Towards Understanding the Generalizability of Delayed Stochastic Gradient Descent

no code implementations18 Aug 2023 Xiaoge Deng, Li Shen, Shengwei Li, Tao Sun, Dongsheng Li, DaCheng Tao

Stochastic gradient descent (SGD) performed in an asynchronous manner plays a crucial role in training large-scale machine learning models.

DFedADMM: Dual Constraints Controlled Model Inconsistency for Decentralized Federated Learning

no code implementations16 Aug 2023 Qinglun Li, Li Shen, Guanghao Li, Quanjun Yin, DaCheng Tao

To address the communication burden issues associated with federated learning (FL), decentralized federated learning (DFL) discards the central server and establishes a decentralized communication network, where each client communicates only with neighboring clients.

Federated Learning
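
A decentralized communication round of this kind can be sketched as gossip averaging on a ring, where each client mixes its weight only with its two neighbors (a generic illustration of serverless averaging, not the DFedADMM algorithm):

```python
# Generic gossip-averaging sketch (not DFedADMM itself): clients sit on a
# ring and, each round, every client mixes its weight only with its two
# neighbors, so no central server is required.
def gossip_round(weights):
    n = len(weights)
    return [
        (weights[(i - 1) % n] + weights[i] + weights[(i + 1) % n]) / 3.0
        for i in range(n)
    ]

w = [0.0, 3.0, 6.0, 9.0]
for _ in range(50):
    w = gossip_round(w)     # all clients drift toward the global mean
```

The mixing matrix here is doubly stochastic, so repeated rounds drive every client to the global average; the model inconsistency that DFedADMM targets is precisely the gap that remains after finitely many such rounds.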

LGViT: Dynamic Early Exiting for Accelerating Vision Transformer

no code implementations1 Aug 2023 Guanyu Xu, Jiawei Hao, Li Shen, Han Hu, Yong Luo, Hui Lin, Jialie Shen

Recently, the efficient deployment and acceleration of powerful vision transformers (ViTs) on resource-limited edge devices for providing multimedia services have become attractive tasks.

Efficient Federated Learning via Local Adaptive Amended Optimizer with Linear Speedup

no code implementations30 Jul 2023 Yan Sun, Li Shen, Hao Sun, Liang Ding, DaCheng Tao

Adaptive optimization has achieved notable success in distributed learning, while extending adaptive optimizers to federated learning (FL) suffers from severe inefficiency, including (i) rugged convergence due to inaccurate gradient estimation in the global adaptive optimizer, and (ii) client drift exacerbated by local over-fitting with the local adaptive optimizer.

Federated Learning

High-Resolution Volumetric Reconstruction for Clothed Humans

no code implementations25 Jul 2023 Sicong Tang, Guangyuan Wang, Qing Ran, Lingzhi Li, Li Shen, Ping Tan

We present a novel method for reconstructing clothed humans from a sparse set of (e.g., 1 to 6) RGB images.


GBT: Two-stage transformer framework for non-stationary time series forecasting

1 code implementation17 Jul 2023 Li Shen, Yuning Wei, Yangzhu Wang

It decouples the prediction process of TSFT into two stages, an Auto-Regression stage and a Self-Regression stage, to tackle the problem of different statistical properties between input and prediction sequences. Prediction results of the Auto-Regression stage serve as a Good Beginning, i.e., a better initialization for the inputs of the Self-Regression stage.

regression Time Series +1

A Comprehensive Survey of Forgetting in Deep Learning Beyond Continual Learning

1 code implementation16 Jul 2023 Zhenyi Wang, Enneng Yang, Li Shen, Heng Huang

Through this comprehensive survey, we aspire to uncover potential solutions by drawing upon ideas and approaches from various fields that have dealt with forgetting.

Continual Learning Federated Learning +1

Boosting Backdoor Attack with A Learnable Poisoning Sample Selection Strategy

no code implementations14 Jul 2023 Zihao Zhu, Mingda Zhang, Shaokui Wei, Li Shen, Yanbo Fan, Baoyuan Wu

To further integrate it with the normal training process, we then propose a learnable poisoning sample selection strategy to learn the mask together with the model parameters through min-max optimization. Specifically, the outer loop aims to achieve the backdoor attack goal by minimizing the loss on the selected samples, while the inner loop selects hard poisoning samples that impede this goal by maximizing the loss.

Backdoor Attack Data Poisoning
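
The min-max structure, an inner step that selects hard samples by maximizing the loss and an outer step that minimizes the loss on the selection, can be sketched with a toy scalar model (all names and the quadratic loss are illustrative, not the paper's code):

```python
# Toy sketch of a min-max selection loop: the inner step keeps the k
# samples with the largest current loss, the outer step descends on them.
# The scalar "model" and all names here are illustrative assumptions.
def per_sample_loss(w, x):
    return (w - x) ** 2

def select_hard(w, samples, k):
    # Inner maximization: keep the k currently hardest samples.
    return sorted(samples, key=lambda x: -per_sample_loss(w, x))[:k]

def train_minmax(samples, k, lr=0.1, steps=100):
    w = 0.0
    for _ in range(steps):
        chosen = select_hard(w, samples, k)
        grad = sum(2.0 * (w - x) for x in chosen) / k
        w -= lr * grad          # outer minimization on the selection
    return w

w = train_minmax([1.0, 2.0, 10.0], k=2)
```

The selection changes as the model moves, so the two loops interact: the fit is eventually pulled toward an equilibrium determined by the hardest samples rather than by the full dataset mean.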

Systematic Investigation of Sparse Perturbed Sharpness-Aware Minimization Optimizer

1 code implementation30 Jun 2023 Peng Mi, Li Shen, Tianhe Ren, Yiyi Zhou, Tianshuo Xu, Xiaoshuai Sun, Tongliang Liu, Rongrong Ji, DaCheng Tao

Sharpness-Aware Minimization (SAM) is a popular solution that smooths the loss landscape by minimizing the maximized change of training loss when adding a perturbation to the weight.
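
The perturb-then-descend structure of SAM can be sketched on a toy one-dimensional loss (a minimal illustration of the generic SAM update, not the paper's sparse-perturbation variant):

```python
# Minimal sketch of the generic SAM update on a toy 1-D loss: first
# ascend to the worst-case point within radius rho, then update the
# original weight using the gradient taken at that perturbed point.
def loss(w):
    return (w - 3.0) ** 2           # toy training loss

def grad(w):
    return 2.0 * (w - 3.0)

def sam_step(w, lr=0.1, rho=0.05):
    g = grad(w)
    eps = rho if g >= 0 else -rho   # sign(g) * rho: 1-D worst-case ascent
    return w - lr * grad(w + eps)   # descend with the perturbed gradient

w = 0.0
for _ in range(200):
    w = sam_step(w)                 # settles near the minimum at w = 3
```

Sparse SAM, as studied in this paper, perturbs only a subset of the weights rather than all of them; the two-step update itself is unchanged.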

Enhancing Adversarial Training via Reweighting Optimization Trajectory

1 code implementation25 Jun 2023 Tianjin Huang, Shiwei Liu, Tianlong Chen, Meng Fang, Li Shen, Vlado Menkovski, Lu Yin, Yulong Pei, Mykola Pechenizkiy

Despite the fact that adversarial training has become the de facto method for improving the robustness of deep neural networks, it is well-known that vanilla adversarial training suffers from daunting robust overfitting, resulting in unsatisfactory robust generalization.

Adversarial Robustness

FDNet: Focal Decomposed Network for Efficient, Robust and Practical Time Series Forecasting

1 code implementation19 Jun 2023 Li Shen, Yuning Wei, Yangzhu Wang, Huaxin Qiu

Moreover, we propose a focal input sequence decomposition method, which decomposes the input sequence in a focal manner for efficient and robust forecasting when facing the Long Sequence Time series Input (LSTI) problem.

Inductive Bias Time Series +1

Understanding How Consistency Works in Federated Learning via Stage-wise Relaxed Initialization

no code implementations9 Jun 2023 Yan Sun, Li Shen, DaCheng Tao

To alleviate the negative impact of the "client drift" and explore its substance in FL, in this paper we first design an efficient FL algorithm, FedInit, which allows employing a personalized relaxed initialization state at the beginning of each local training stage.

Federated Learning

One-step Multi-view Clustering with Diverse Representation

no code implementations8 Jun 2023 Xinhang Wan, Jiyuan Liu, Xinwang Liu, Siwei Wang, Yi Wen, Tianjiao Wan, Li Shen, En Zhu

In light of this, we propose a one-step multi-view clustering with diverse representation method, which incorporates multi-view learning and k-means into a unified framework.


CoCo: A Coupled Contrastive Framework for Unsupervised Domain Adaptive Graph Classification

no code implementations8 Jun 2023 Nan Yin, Li Shen, Mengzhu Wang, Long Lan, Zeyu Ma, Chong Chen, Xian-Sheng Hua, Xiao Luo

Although graph neural networks (GNNs) have achieved impressive results in graph classification, they often need abundant task-specific labels, which can be extensively costly to acquire.

Contrastive Learning Domain Adaptation +2

Dynamic Sparsity Is Channel-Level Sparsity Learner

no code implementations30 May 2023 Lu Yin, Gen Li, Meng Fang, Li Shen, Tianjin Huang, Zhangyang Wang, Vlado Menkovski, Xiaolong Ma, Mykola Pechenizkiy, Shiwei Liu

Dynamic sparse training (DST), as a leading sparse training approach, can train deep neural networks at high sparsity from scratch to match the performance of their dense counterparts.

Are Large Kernels Better Teachers than Transformers for ConvNets?

1 code implementation30 May 2023 Tianjin Huang, Lu Yin, Zhenyu Zhang, Li Shen, Meng Fang, Mykola Pechenizkiy, Zhangyang Wang, Shiwei Liu

We hereby carry out a first-of-its-kind study unveiling that modern large-kernel ConvNets, a compelling competitor to Vision Transformers, are remarkably more effective teachers for small-kernel ConvNets, due to more similar architectures.

Knowledge Distillation

Compact Real-time Radiance Fields with Neural Codebook

no code implementations29 May 2023 Lingzhi Li, Zhongshu Wang, Zhen Shen, Li Shen, Ping Tan

Reconstructing neural radiance fields with explicit volumetric representations, demonstrated by Plenoxels, has shown remarkable advantages on training and rendering efficiency, while grid-based representations typically induce considerable overhead for storage and transmission.

Learning to Learn from APIs: Black-Box Data-Free Meta-Learning

1 code implementation28 May 2023 Zixuan Hu, Li Shen, Zhenyi Wang, Baoyuan Wu, Chun Yuan, DaCheng Tao

Data-free meta-learning (DFML) aims to enable efficient learning of new tasks by meta-learning from a collection of pre-trained models without access to the training data.

Knowledge Distillation Meta-Learning

Learning Better with Less: Effective Augmentation for Sample-Efficient Visual Reinforcement Learning

no code implementations25 May 2023 Guozheng Ma, Linrui Zhang, Haoyu Wang, Lu Li, Zilin Wang, Zhen Wang, Li Shen, Xueqian Wang, DaCheng Tao

Taking the non-stationary nature of RL into account, we propose a RL-tailored multi-type DA fusion scheme called Cycling Augmentation (CycAug), which performs periodic cycles of different DA operations to increase type diversity while maintaining data distribution consistency.

Data Augmentation reinforcement-learning +1
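
The periodic-cycling idea behind CycAug can be sketched as rotating through a fixed list of augmentation operations (the ops below are trivial stand-ins for real image augmentations; every name here is illustrative, not the paper's API):

```python
# Minimal sketch of periodically cycling data-augmentation operations,
# as CycAug's description suggests. The ops are trivial stand-ins for
# real image augmentations; all names here are illustrative.
from itertools import cycle

def shift(obs):
    return [v + 1 for v in obs]     # stand-in for a random-shift DA op

def flip(obs):
    return obs[::-1]                # stand-in for a horizontal flip

def identity(obs):
    return list(obs)

class CyclingAugmentation:
    """Apply a different augmentation on each call, cycling in order."""
    def __init__(self, ops):
        self._ops = cycle(ops)

    def __call__(self, obs):
        return next(self._ops)(obs)

aug = CyclingAugmentation([shift, flip, identity])
out1 = aug([1.0, 2.0])   # -> [2.0, 3.0] (shift applied)
out2 = aug([1.0, 2.0])   # -> [2.0, 1.0] (flip applied)
```

Cycling deterministically, rather than sampling one op at random, increases the type diversity seen over a window of updates while keeping the per-op data distribution stable, which matches the consistency goal stated above.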

Incomplete Multimodal Learning for Complex Brain Disorders Prediction

no code implementations25 May 2023 Reza Shirkavand, Liang Zhan, Heng Huang, Li Shen, Paul M. Thompson

Especially in studies of brain diseases, research cohorts may include both neuroimaging data and genetic data, but for practical clinical diagnosis, we often need to make disease predictions only based on neuroimages.

Data Integration

Towards More Suitable Personalization in Federated Learning via Decentralized Partial Model Training

no code implementations24 May 2023 Yifan Shi, Yingqi Liu, Yan Sun, Zihao Lin, Li Shen, Xueqian Wang, DaCheng Tao

Personalized federated learning (PFL) aims to produce the best personalized model for each client in the face of a formidable problem in real FL systems: data heterogeneity.

Personalized Federated Learning

Dynamic Regularized Sharpness Aware Minimization in Federated Learning: Approaching Global Consistency and Smooth Landscape

no code implementations19 May 2023 Yan Sun, Li Shen, Shixiang Chen, Liang Ding, DaCheng Tao

In federated learning (FL), a cluster of local clients is coordinated by the global server, and the clients cooperatively train one model with privacy protection.

Federated Learning

Prompt-Tuning Decision Transformer with Preference Ranking

no code implementations16 May 2023 Shengchao Hu, Li Shen, Ya zhang, DaCheng Tao

Our work contributes to the advancement of prompt-tuning approaches in RL, providing a promising direction for optimizing large RL agents for specific preference tasks.

DaGAN++: Depth-Aware Generative Adversarial Network for Talking Head Video Generation

1 code implementation10 May 2023 Fa-Ting Hong, Li Shen, Dan Xu

In this work, we first present a novel self-supervised method for learning dense 3D facial geometry (i.e., depth) from face videos, without requiring camera parameters and 3D geometry annotations in training.

Keypoint Estimation Talking Head Generation +1

Towards the Flatter Landscape and Better Generalization in Federated Learning under Client-level Differential Privacy

1 code implementation1 May 2023 Yifan Shi, Kang Wei, Li Shen, Yingqi Liu, Xueqian Wang, Bo Yuan, DaCheng Tao

To defend against inference attacks and mitigate sensitive information leakage in Federated Learning (FL), client-level Differentially Private FL (DPFL) is the de facto standard for privacy protection, clipping local updates and adding random noise.

Federated Learning

Enhancing Fine-Tuning Based Backdoor Defense with Sharpness-Aware Minimization

no code implementations24 Apr 2023 Mingli Zhu, Shaokui Wei, Li Shen, Yanbo Fan, Baoyuan Wu

Fine-tuning based on benign data is a natural defense to erase the backdoor effect in a backdoored model.

backdoor defense

On Efficient Training of Large-Scale Deep Learning Models: A Literature Review

no code implementations7 Apr 2023 Li Shen, Yan Sun, Zhiyuan Yu, Liang Ding, Xinmei Tian, DaCheng Tao

The field of deep learning has witnessed significant progress, particularly in computer vision (CV), natural language processing (NLP), and speech.

Quantum Imitation Learning

no code implementations4 Apr 2023 Zhihao Cheng, Kaining Zhang, Li Shen, DaCheng Tao

Despite remarkable successes in solving various complex decision-making tasks, training an imitation learning (IL) algorithm with deep neural networks (DNNs) suffers from the high computation burden.

Behavioural cloning

Towards Making the Most of ChatGPT for Machine Translation

1 code implementation24 Mar 2023 Keqin Peng, Liang Ding, Qihuang Zhong, Li Shen, Xuebo Liu, Min Zhang, Yuanxin Ouyang, DaCheng Tao

We show that: 1) The performance of ChatGPT depends largely on temperature, and a lower temperature usually can achieve better performance; 2) Emphasizing the task information further improves ChatGPT's performance, particularly in complex MT tasks; 3) Introducing domain information can elicit ChatGPT's generalization ability and improve its performance in the specific domain; 4) ChatGPT tends to generate hallucinations for non-English-centric MT tasks, which can be partially addressed by our proposed prompts but still needs to be highlighted for the MT/NLP community.

Machine Translation Translation +1

Robust Generalization against Photon-Limited Corruptions via Worst-Case Sharpness Minimization

1 code implementation CVPR 2023 Zhuo Huang, Miaoxi Zhu, Xiaobo Xia, Li Shen, Jun Yu, Chen Gong, Bo Han, Bo Du, Tongliang Liu

Experimentally, we simulate photon-limited corruptions using CIFAR10/100 and ImageNet30 datasets and show that SharpDRO exhibits a strong generalization ability against severe corruptions and exceeds well-known baseline methods with large performance gains.

Architecture, Dataset and Model-Scale Agnostic Data-free Meta-Learning

1 code implementation CVPR 2023 Zixuan Hu, Li Shen, Zhenyi Wang, Tongliang Liu, Chun Yuan, DaCheng Tao

The goal of data-free meta-learning is to learn useful prior knowledge from a collection of pre-trained models without accessing their training data.


Make Landscape Flatter in Differentially Private Federated Learning

1 code implementation CVPR 2023 Yifan Shi, Yingqi Liu, Kang Wei, Li Shen, Xueqian Wang, DaCheng Tao

Specifically, DP-FedSAM integrates Sharpness Aware Minimization (SAM) optimizer to generate local flatness models with better stability and weight perturbation robustness, which results in the small norm of local updates and robustness to DP noise, thereby improving the performance.

Federated Learning

Visual Prompt Based Personalized Federated Learning

no code implementations15 Mar 2023 Guanghao Li, Wansen Wu, Yan Sun, Li Shen, Baoyuan Wu, DaCheng Tao

Then, the local model is trained on the input composed of raw data and a visual prompt to learn the distribution information contained in the prompt.

Image Classification Personalized Federated Learning

SGDA: Towards 3D Universal Pulmonary Nodule Detection via Slice Grouped Domain Attention

1 code implementation7 Mar 2023 Rui Xu, Zhi Liu, Yong Luo, Han Hu, Li Shen, Bo Du, Kaiming Kuang, Jiancheng Yang

To address this issue, we propose a slice grouped domain attention (SGDA) module to enhance the generalization capability of the pulmonary nodule detection networks.

Computed Tomography (CT)

Graph Decision Transformer

no code implementations7 Mar 2023 Shengchao Hu, Li Shen, Ya zhang, DaCheng Tao

Offline reinforcement learning (RL) is a challenging task, whose objective is to learn policies from static trajectory data without interacting with the environment.

Offline RL OpenAI Gym +1

AdaSAM: Boosting Sharpness-Aware Minimization with Adaptive Learning Rate and Momentum for Training Deep Neural Networks

no code implementations1 Mar 2023 Hao Sun, Li Shen, Qihuang Zhong, Liang Ding, Shixiang Chen, Jingwei Sun, Jing Li, Guangzhong Sun, DaCheng Tao

Integrating SAM with an adaptive learning rate and momentum acceleration, dubbed AdaSAM, has already been explored empirically to train large-scale deep neural networks, but without a theoretical guarantee, due to the triple difficulty of analyzing the coupled perturbation step, adaptive learning rate, and momentum step.

Subspace based Federated Unlearning

no code implementations24 Feb 2023 Guanghao Li, Li Shen, Yan Sun, Yue Hu, Han Hu, DaCheng Tao

Federated learning (FL) enables multiple clients to train a machine learning model collaboratively without exchanging their local data.

Federated Learning

Establishing group-level brain structural connectivity incorporating anatomical knowledge under latent space modeling

no code implementations21 Feb 2023 Selena Wang, Yiting Wang, Frederick H. Xu, Li Shen, Yize Zhao

By applying the ABC model to study brain structural connectivity stratified by sex among Alzheimer's Disease (AD) subjects and healthy controls incorporating the anatomical attributes (volume, thickness and area) on nodes, our method shows superior predictive power on out-of-sample structural connectivity and identifies meaningful sex-specific network neuromarkers for AD.

Fusion of Global and Local Knowledge for Personalized Federated Learning

1 code implementation21 Feb 2023 Tiansheng Huang, Li Shen, Yan Sun, Weiwei Lin, DaCheng Tao

Personalized federated learning, as a variant of federated learning, trains customized models for clients using their heterogeneously distributed data.

Personalized Federated Learning

FedSpeed: Larger Local Interval, Less Communication Round, and Higher Generalization Accuracy

1 code implementation21 Feb 2023 Yan Sun, Li Shen, Tiansheng Huang, Liang Ding, DaCheng Tao

Federated learning is an emerging distributed machine learning framework which jointly trains a global model via a large number of local devices with data privacy protections.

Federated Learning

Bag of Tricks for Effective Language Model Pretraining and Downstream Adaptation: A Case Study on GLUE

no code implementations18 Feb 2023 Qihuang Zhong, Liang Ding, Keqin Peng, Juhua Liu, Bo Du, Li Shen, Yibing Zhan, DaCheng Tao

This technical report briefly describes our JDExplore d-team's submission Vega v1 on the General Language Understanding Evaluation (GLUE) leaderboard, where GLUE is a collection of nine natural language understanding tasks, including question answering, linguistic acceptability, sentiment analysis, text similarity, paraphrase detection, and natural language inference.

Contrastive Learning Denoising +11

FedABC: Targeting Fair Competition in Personalized Federated Learning

no code implementations15 Feb 2023 Dui Wang, Li Shen, Yong Luo, Han Hu, Kehua Su, Yonggang Wen, DaCheng Tao

In particular, we adopt the "one-vs-all" training strategy in each client to alleviate the unfair competition between classes by constructing a personalized binary classification problem for each class.

Binary Classification Personalized Federated Learning

Enhance Local Consistency in Federated Learning: A Multi-Step Inertial Momentum Approach

no code implementations11 Feb 2023 Yixing Liu, Yan Sun, Zhengtao Ding, Li Shen, Bo Liu, DaCheng Tao

Federated learning (FL) is a collaborative distributed training paradigm over edge computing devices under the coordination of a centralized server. It is plagued by inconsistent local stationary points due to the heterogeneity of the partially participating local clients, which precipitates local client drift and leads to unstable and slow convergence, especially on heavily heterogeneous datasets.

Edge-computing Federated Learning

Improving the Model Consistency of Decentralized Federated Learning

no code implementations8 Feb 2023 Yifan Shi, Li Shen, Kang Wei, Yan Sun, Bo Yuan, Xueqian Wang, DaCheng Tao

To mitigate the privacy leakages and communication burdens of Federated Learning (FL), decentralized FL (DFL) discards the central server and each client only communicates with its neighbors in a decentralized communication network.

Federated Learning

MetaMix: Towards Corruption-Robust Continual Learning With Temporally Self-Adaptive Data Transformation

no code implementations CVPR 2023 Zhenyi Wang, Li Shen, Donglin Zhan, Qiuling Suo, Yanjun Zhu, Tiehang Duan, Mingchen Gao

To make them trustworthy and robust to corruptions deployed in safety-critical scenarios, we propose a meta-learning framework of self-adaptive data augmentation to tackle the corruption robustness in CL.

Continual Learning Data Augmentation +1

On Transforming Reinforcement Learning by Transformer: The Development Trajectory

no code implementations29 Dec 2022 Shengchao Hu, Li Shen, Ya zhang, Yixin Chen, DaCheng Tao

Transformer, originally devised for natural language processing, has also attained significant success in computer vision.

Autonomous Driving reinforcement-learning +2

SVSBI: Sequence-based virtual screening of biomolecular interactions

1 code implementation27 Dec 2022 Li Shen, Hongsong Feng, Yuchi Qiu, Guo-Wei Wei

Virtual screening (VS) is an essential technique for understanding biomolecular interactions, particularly in drug design and discovery.

Drug Discovery Molecular Docking

Rethinking the Role of Pre-Trained Networks in Source-Free Domain Adaptation

no code implementations15 Dec 2022 Wenyu Zhang, Li Shen, Chuan-Sheng Foo

We propose to distil useful target domain information through a co-learning strategy to improve target pseudolabel quality for finetuning the source model.

Representation Learning Source-Free Domain Adaptation +1

Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks

no code implementations12 Dec 2022 Linrui Zhang, Qin Zhang, Li Shen, Bo Yuan, Xueqian Wang, DaCheng Tao

Despite a large number of reinforcement learning (RL) methods focusing on safety-critical tasks, there is still a lack of high-quality evaluation of how those algorithms adhere to safety constraints at each decision step under complex and unknown dynamics.

Autonomous Driving reinforcement-learning +2

4K-NeRF: High Fidelity Neural Radiance Fields at Ultra High Resolutions

1 code implementation9 Dec 2022 Zhongshu Wang, Lingzhi Li, Zhen Shen, Li Shen, Liefeng Bo

In this paper, we present a novel and effective framework, named 4K-NeRF, to pursue high fidelity view synthesis on the challenging scenarios of ultra high resolutions, building on the methodology of neural radiance fields (NeRF).

Vocal Bursts Intensity Prediction

Compressing Volumetric Radiance Fields to 1 MB

1 code implementation CVPR 2023 Lingzhi Li, Zhen Shen, Zhongshu Wang, Li Shen, Liefeng Bo

Approximating radiance fields with volumetric grids is one of the promising directions for improving NeRF, represented by methods like Plenoxels and DVGO, which achieve super-fast training convergence and real-time rendering.

Model Compression Neural Rendering +1

AdaTask: A Task-aware Adaptive Learning Rate Approach to Multi-task Learning

no code implementations28 Nov 2022 Enneng Yang, Junwei Pan, Ximei Wang, Haibin Yu, Li Shen, Xihua Chen, Lei Xiao, Jie Jiang, Guibing Guo

In this paper, we propose to measure the task dominance degree of a parameter by the total updates of each task on this parameter.

Multi-Task Learning Recommendation Systems
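
The dominance measure described above can be sketched directly: a task's dominance on a shared parameter is its share of the total (absolute) updates applied to that parameter. The function name and the update numbers below are made up for illustration, not AdaTask's implementation.

```python
# Sketch of measuring per-task "dominance" on one shared parameter by
# accumulating the absolute updates each task contributes. Names and
# numbers are illustrative assumptions, not the paper's code.
def task_dominance(updates_per_task):
    """updates_per_task maps task name -> list of updates on one parameter."""
    totals = {t: sum(abs(u) for u in us) for t, us in updates_per_task.items()}
    norm = sum(totals.values())
    return {t: v / norm for t, v in totals.items()}

dom = task_dominance({"task_a": [0.5, -0.3], "task_b": [0.1, 0.1]})
# task_a contributed 0.8 of the total update magnitude, task_b only 0.2
```

A task whose share approaches 1 is dominating the parameter, which is the imbalance a task-aware adaptive learning rate would then correct.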

Curriculum-based Asymmetric Multi-task Reinforcement Learning

1 code implementation7 Nov 2022 Hanchi Huang, Deheng Ye, Li Shen, Wei Liu

To mitigate the negative influence of customizing the one-off training order in curriculum-based AMTL, CAMRL switches its training mode between parallel single-task RL and asymmetric multi-task RL (MTRL), according to an indicator regarding the training time, the overall performance, and the performance gap among tasks.

Multi-Task Learning reinforcement-learning +1

Streaming Radiance Fields for 3D Video Synthesis

1 code implementation26 Oct 2022 Lingzhi Li, Zhen Shen, Zhongshu Wang, Li Shen, Ping Tan

Instead of training a single model that combines all the frames, we formulate the dynamic modeling problem with an incremental learning paradigm in which per-frame model difference is trained to complement the adaption of a base model on the current frame.

Incremental Learning Model Optimization +1

Boosting the Transferability of Adversarial Attacks with Reverse Adversarial Perturbation

2 code implementations12 Oct 2022 Zeyu Qin, Yanbo Fan, Yi Liu, Li Shen, Yong Zhang, Jue Wang, Baoyuan Wu

Furthermore, RAP can be naturally combined with many existing black-box attack techniques, to further boost the transferability.

Adversarial Attack

Improving Sharpness-Aware Minimization with Fisher Mask for Better Generalization on Language Models

1 code implementation11 Oct 2022 Qihuang Zhong, Liang Ding, Li Shen, Peng Mi, Juhua Liu, Bo Du, DaCheng Tao

Fine-tuning large pretrained language models on a limited training corpus usually suffers from poor generalization.

Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach

1 code implementation11 Oct 2022 Peng Mi, Li Shen, Tianhe Ren, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji, DaCheng Tao

One of the popular solutions is Sharpness-Aware Minimization (SAM), which smooths the loss landscape via minimizing the maximized change of training loss when adding a perturbation to the weight.

Strength-Adaptive Adversarial Training

no code implementations4 Oct 2022 Chaojian Yu, Dawei Zhou, Li Shen, Jun Yu, Bo Han, Mingming Gong, Nannan Wang, Tongliang Liu

Firstly, applying a pre-specified perturbation budget to networks of various model capacities will yield divergent degrees of robustness disparity between natural and robust accuracies, which deviates from a robust network's desideratum.

Adversarial Robustness Scheduling

Tensor-Based Multi-Modality Feature Selection and Regression for Alzheimer's Disease Diagnosis

1 code implementation23 Sep 2022 Jun Yu, Zhaoming Kong, Liang Zhan, Li Shen, Lifang He

The assessment of Alzheimer's Disease (AD) and Mild Cognitive Impairment (MCI) associated with brain changes remains a challenging task.

feature selection regression

Meta-Learning with Less Forgetting on Large-Scale Non-Stationary Task Distributions

1 code implementation3 Sep 2022 Zhenyi Wang, Li Shen, Le Fang, Qiuling Suo, Donglin Zhan, Tiehang Duan, Mingchen Gao

Two key challenges arise in this more realistic setting: (i) how to use unlabeled data in the presence of a large amount of unlabeled out-of-distribution (OOD) data; and (ii) how to prevent catastrophic forgetting on previously learned task distributions due to the task distribution shift.


Respecting Time Series Properties Makes Deep Time Series Forecasting Perfect

1 code implementation22 Jul 2022 Li Shen, Yuning Wei, Yangzhu Wang

Thanks to the core idea of respecting time series properties, no matter which forecasting format is used, RTNet shows clearly superior forecasting performance compared with dozens of other SOTA time series forecasting baselines on three real-world benchmark datasets.

Time Series Time Series Forecasting

Improving Task-free Continual Learning by Distributionally Robust Memory Evolution

1 code implementation15 Jul 2022 Zhenyi Wang, Li Shen, Le Fang, Qiuling Suo, Tiehang Duan, Mingchen Gao

To address these problems, for the first time, we propose a principled memory evolution framework to dynamically evolve the memory data distribution by making the memory buffer gradually harder to be memorized with distributionally robust optimization (DRO).

Continual Learning
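
The memory-evolution idea, making the replay buffer gradually harder to memorize, can be sketched as a gradient-ascent step on the loss with respect to the stored samples themselves (a toy scalar setup; all names are illustrative, not the paper's DRO implementation):

```python
# Hedged sketch of evolving a replay buffer to be harder to memorize:
# ascend the loss with respect to each stored sample, pushing it away
# from the model's current fit. Toy scalar setup, illustrative names.
def loss(w, x):
    return (w - x) ** 2

def evolve_memory(w, memory, step=0.1):
    # d(loss)/dx = -2 * (w - x); step in that direction raises the loss.
    return [x + step * (-2.0 * (w - x)) for x in memory]

mem = evolve_memory(w=1.5, memory=[1.0, 2.0])
# each sample moved away from the model's fit, so both losses increased
```

The full framework constrains how far the memory distribution may drift (the distributionally robust part); this sketch shows only the inner "harder to memorize" step.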

Harnessing Out-Of-Distribution Examples via Augmenting Content and Style

no code implementations7 Jul 2022 Zhuo Huang, Xiaobo Xia, Li Shen, Bo Han, Mingming Gong, Chen Gong, Tongliang Liu

Machine learning models are vulnerable to Out-Of-Distribution (OOD) examples, and such a problem has drawn much attention.

Data Augmentation Disentanglement +3

Local Sample-weighted Multiple Kernel Clustering with Consensus Discriminative Graph

1 code implementation5 Jul 2022 Liang Li, Siwei Wang, Xinwang Liu, En Zhu, Li Shen, Kenli Li, Keqin Li

Multiple kernel clustering (MKC) is committed to achieving optimal information fusion from a set of base kernels.


Dynamic Contrastive Distillation for Image-Text Retrieval

no code implementations4 Jul 2022 Jun Rao, Liang Ding, Shuhan Qi, Meng Fang, Yang Liu, Li Shen, DaCheng Tao

Although vision-and-language pretraining (VLP) equipped cross-modal image-text retrieval (ITR) has achieved remarkable progress in the past two years, it suffers from a major drawback: the ever-increasing size of VLP models restricts their deployment in real-world search scenarios (where high latency is unacceptable).

Contrastive Learning Metric Learning +3

Towards Harnessing Feature Embedding for Robust Learning with Noisy Labels

no code implementations27 Jun 2022 Chuang Zhang, Li Shen, Jian Yang, Chen Gong

To exploit this effect, the model prediction-based methods have been widely adopted, which aim to exploit the outputs of DNNs in the early stage of learning to correct noisy labels.

Learning with noisy labels Memorization

SafeRL-Kit: Evaluating Efficient Reinforcement Learning Methods for Safe Autonomous Driving

1 code implementation17 Jun 2022 Linrui Zhang, Qin Zhang, Li Shen, Bo Yuan, Xueqian Wang

Safe reinforcement learning (RL) has achieved significant success on risk-sensitive tasks and shown promise in autonomous driving (AD) as well.

Autonomous Driving reinforcement-learning +2

Understanding Robust Overfitting of Adversarial Training and Beyond

1 code implementation17 Jun 2022 Chaojian Yu, Bo Han, Li Shen, Jun Yu, Chen Gong, Mingming Gong, Tongliang Liu

Here, we explore the causes of robust overfitting by comparing the data distribution of non-overfit (weak adversary) and overfitted (strong adversary) adversarial training, and observe that the distribution of the adversarial data generated by a weak adversary mainly contains small-loss data.

Adversarial Robustness Data Ablation

DisPFL: Towards Communication-Efficient Personalized Federated Learning via Decentralized Sparse Training

1 code implementation1 Jun 2022 Rong Dai, Li Shen, Fengxiang He, Xinmei Tian, DaCheng Tao

In this work, we propose a novel personalized federated learning framework in a decentralized (peer-to-peer) communication protocol named Dis-PFL, which employs personalized sparse masks to customize sparse local models on the edge.

Personalized Federated Learning

Robust Weight Perturbation for Adversarial Training

1 code implementation30 May 2022 Chaojian Yu, Bo Han, Mingming Gong, Li Shen, Shiming Ge, Bo Du, Tongliang Liu

Based on these observations, we propose a robust perturbation strategy to constrain the extent of weight perturbation.


Few-Shot Adaptation of Pre-Trained Networks for Domain Shift

1 code implementation30 May 2022 Wenyu Zhang, Li Shen, Wanyue Zhang, Chuan-Sheng Foo

Recent test-time adaptation methods update batch normalization layers of pre-trained source models deployed in new target environments with streaming data to mitigate such performance degradation.

Domain Adaptation Semantic Segmentation

Efficient-Adam: Communication-Efficient Distributed Adam

no code implementations28 May 2022 Congliang Chen, Li Shen, Wei Liu, Zhi-Quan Luo

Distributed adaptive stochastic gradient methods have been widely used for large-scale nonconvex optimization, such as training deep learning models.


MissDAG: Causal Discovery in the Presence of Missing Data with Continuous Additive Noise Models

1 code implementation27 May 2022 Erdun Gao, Ignavier Ng, Mingming Gong, Li Shen, Wei Huang, Tongliang Liu, Kun Zhang, Howard Bondell

In this paper, we develop a general method, which we call MissDAG, to perform causal discovery from data with incomplete observations.

Causal Discovery Imputation +1

Penalized Proximal Policy Optimization for Safe Reinforcement Learning

no code implementations24 May 2022 Linrui Zhang, Li Shen, Long Yang, Shixiang Chen, Bo Yuan, Xueqian Wang, DaCheng Tao

Safe reinforcement learning aims to learn the optimal policy while satisfying safety constraints, which is essential in real-world applications.

reinforcement-learning Reinforcement Learning (RL) +1

Interpretable Graph Convolutional Network of Multi-Modality Brain Imaging for Alzheimer's Disease Diagnosis

no code implementations27 Apr 2022 Houliang Zhou, Lifang He, Yu Zhang, Li Shen, Brian Chen

Identification of brain regions related to specific neurological disorders is of great importance for biomarker and diagnostic studies.

Bridging Cross-Lingual Gaps During Leveraging the Multilingual Sequence-to-Sequence Pretraining for Text Generation and Understanding

1 code implementation16 Apr 2022 Changtong Zan, Liang Ding, Li Shen, Yu Cao, Weifeng Liu, DaCheng Tao

For multilingual sequence-to-sequence pretrained language models (multilingual Seq2Seq PLMs), e.g. mBART, the self-supervised pretraining task covers a wide range of monolingual languages, e.g. 25 languages from CommonCrawl, whereas the downstream cross-lingual tasks generally involve a bilingual subset, e.g. English-German. This gives rise to a data discrepancy, namely domain discrepancy, and a cross-lingual learning objective discrepancy, namely task discrepancy, between the pretraining and finetuning stages.

Cross-Lingual Natural Language Inference Text Generation +2

Robust Unlearnable Examples: Protecting Data Against Adversarial Learning

2 code implementations28 Mar 2022 Shaopeng Fu, Fengxiang He, Yang Liu, Li Shen, DaCheng Tao

To address this concern, methods are proposed to make data unlearnable for deep learning models by adding a type of error-minimizing noise.

Fine-tuning Global Model via Data-Free Knowledge Distillation for Non-IID Federated Learning

no code implementations CVPR 2022 Lin Zhang, Li Shen, Liang Ding, DaCheng Tao, Ling-Yu Duan

Instead, we propose a data-free knowledge distillation method to fine-tune the global model in the server (FedFTG), which relieves the issue of direct model aggregation.

Federated Learning Knowledge Distillation

Depth-Aware Generative Adversarial Network for Talking Head Video Generation

1 code implementation CVPR 2022 Fa-Ting Hong, Longhao Zhang, Li Shen, Dan Xu

In a denser way, the depth is also utilized to learn 3D-aware cross-modal (i.e. appearance and depth) attention to guide the generation of motion fields for warping source image representations.

Talking Head Generation Video Generation

Don't Be So Dense: Sparse-to-Sparse GAN Training Without Sacrificing Performance

no code implementations5 Mar 2022 Shiwei Liu, Yuesong Tian, Tianlong Chen, Li Shen

Even more unconventionally, our proposed method enables directly training sparse unbalanced GANs with an extremely sparse generator from scratch.

Model Compression

Provably Efficient Convergence of Primal-Dual Actor-Critic with Nonlinear Function Approximation

no code implementations28 Feb 2022 Jing Dong, Li Shen, Yinggan Xu, Baoxiang Wang

We study the convergence of the actor-critic algorithm with nonlinear function approximation under a nonconvex-nonconcave primal-dual formulation.

Continuous Control OpenAI Gym +1

The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training

1 code implementation ICLR 2022 Shiwei Liu, Tianlong Chen, Xiaohan Chen, Li Shen, Decebal Constantin Mocanu, Zhangyang Wang, Mykola Pechenizkiy

In this paper, we focus on sparse training and highlight a perhaps counter-intuitive finding: random pruning at initialization can be quite powerful for the sparse training of modern neural networks.

Adversarial Robustness Out-of-Distribution Detection
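
The random-pruning-at-initialization baseline discussed above can be sketched in a few lines of NumPy. This is a hypothetical minimal illustration, not the paper's code: each layer keeps a uniformly random subset of weights, fixed before any training, at a target sparsity.

```python
import numpy as np

def random_prune_mask(shape, sparsity, rng):
    """Return a binary mask keeping (1 - sparsity) of the weights at random."""
    n = int(np.prod(shape))
    n_keep = n - int(round(sparsity * n))
    mask = np.zeros(n, dtype=bool)
    keep = rng.choice(n, size=n_keep, replace=False)
    mask[keep] = True
    return mask.reshape(shape)

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 128))       # dense weights at initialization
mask = random_prune_mask(w.shape, sparsity=0.9, rng=rng)
w_sparse = w * mask                  # the network is then trained with this fixed mask
print(mask.mean())                   # fraction of weights kept, ~0.1
```

In an actual sparse-training run the mask would be applied to every layer and enforced after each gradient step so that pruned weights stay zero.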

Achieving Personalized Federated Learning with Sparse Local Models

no code implementations27 Jan 2022 Tiansheng Huang, Shiwei Liu, Li Shen, Fengxiang He, Weiwei Lin, DaCheng Tao

To counter this issue, personalized FL (PFL) was proposed to produce dedicated local models for each individual user.

Personalized Federated Learning

Learning To Learn and Remember Super Long Multi-Domain Task Sequence

1 code implementation CVPR 2022 Zhenyi Wang, Li Shen, Tiehang Duan, Donglin Zhan, Le Fang, Mingchen Gao

We propose a domain shift detection technique to capture latent domain change and equip the meta optimizer with it to work in this setting.


DGL-GAN: Discriminator Guided Learning for GAN Compression

no code implementations13 Dec 2021 Yuesong Tian, Li Shen, DaCheng Tao, Zhifeng Li, Wei Liu

Generative Adversarial Networks (GANs) with high computation costs, e.g., BigGAN and StyleGAN2, have achieved remarkable results in synthesizing high-resolution, diverse images with high fidelity from random noise.

Spatial-Temporal-Fusion BNN: Variational Bayesian Feature Layer

no code implementations12 Dec 2021 Shiye Lei, Zhuozhuo Tu, Leszek Rutkowski, Feng Zhou, Li Shen, Fengxiang He, DaCheng Tao

Bayesian neural networks (BNNs) have become a principal approach to alleviate overconfident predictions in deep learning, but they often suffer from scaling issues due to a large number of distribution parameters.

Adversarial Robustness Variational Inference

FedDAG: Federated DAG Structure Learning

1 code implementation7 Dec 2021 Erdun Gao, Junjia Chen, Li Shen, Tongliang Liu, Mingming Gong, Howard Bondell

To date, most directed acyclic graphs (DAGs) structure learning approaches require data to be stored in a central server.

Causal Discovery

FedCV: A Federated Learning Framework for Diverse Computer Vision Tasks

1 code implementation22 Nov 2021 Chaoyang He, Alay Dilipbhai Shah, Zhenheng Tang, Di Fan, Adarshan Naiynar Sivashunmugam, Keerti Bhogaraju, Mita Shimpi, Li Shen, Xiaowen Chu, Mahdi Soltanolkotabi, Salman Avestimehr

To bridge the gap and facilitate the development of FL for computer vision tasks, in this work, we propose a federated learning library and benchmarking framework, named FedCV, to evaluate FL on the three most representative computer vision tasks: image classification, image segmentation, and object detection.

Benchmarking Federated Learning +5
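
Frameworks like FedCV build on the standard FedAvg aggregation step. The sketch below is a generic illustration of that step, not FedCV's code: the server averages client parameters weighted by each client's local sample count.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Weighted average of client model parameters (FedAvg aggregation).

    client_weights: list of dicts mapping parameter name -> np.ndarray
    client_sizes: number of local samples per client (aggregation weights)
    """
    total = float(sum(client_sizes))
    agg = {}
    for name in client_weights[0]:
        agg[name] = sum(
            (n / total) * w[name] for w, n in zip(client_weights, client_sizes)
        )
    return agg

# Two toy clients sharing one parameter tensor; client 2 has 3x the data.
c1 = {"w": np.array([1.0, 2.0])}
c2 = {"w": np.array([3.0, 4.0])}
global_model = fedavg([c1, c2], client_sizes=[1, 3])
print(global_model["w"])  # 0.25*c1 + 0.75*c2 -> [2.5, 3.5]
```

One communication round then consists of broadcasting `global_model`, local training on each client, and re-aggregating with this function.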

Off-policy Imitation Learning from Visual Inputs

no code implementations8 Nov 2021 Zhihao Cheng, Li Shen, DaCheng Tao

We propose OPIfVI (Off-Policy Imitation from Visual Inputs), which is composed of an off-policy learning manner, data augmentation, and encoder techniques, to tackle the mentioned challenges, respectively.

Data Augmentation Imitation Learning

Robust Unlearnable Examples: Protecting Data Privacy Against Adversarial Learning

no code implementations ICLR 2022 Shaopeng Fu, Fengxiang He, Yang Liu, Li Shen, DaCheng Tao

To address this concern, methods are proposed to make data unlearnable for deep learning models by adding a type of error-minimizing noise.

Lagrangian Generative Adversarial Imitation Learning with Safety

no code implementations29 Sep 2021 Zhihao Cheng, Li Shen, Meng Fang, Liu Liu, DaCheng Tao

Imitation Learning (IL) merely concentrates on reproducing expert behaviors and could take dangerous actions, which is unacceptable in safety-critical scenarios.

Imitation Learning

Sparse Unbalanced GAN Training with In-Time Over-Parameterization

no code implementations29 Sep 2021 Shiwei Liu, Yuesong Tian, Tianlong Chen, Li Shen

Perhaps most importantly, we find instead of inheriting parameters from expensive pre-trained GANs, directly training sparse GANs from scratch can be a much more efficient solution.

Model Compression

On Heterogeneously Distributed Data, Sparsity Matters

no code implementations29 Sep 2021 Tiansheng Huang, Shiwei Liu, Li Shen, Fengxiang He, Weiwei Lin, DaCheng Tao

Federated learning (FL) is particularly vulnerable to heterogeneously distributed data, since a common global model in FL may not adapt to the heterogeneous data distribution of each user.

Personalized Federated Learning

FLBoost: On-the-Fly Fine-tuning Boosts Federated Learning via Data-free Distillation

no code implementations29 Sep 2021 Lin Zhang, Li Shen, Liang Ding, DaCheng Tao, Lingyu Duan

Instead, we propose a new solution, dubbed FLBoost, which fine-tunes the global model on the fly in the server via data-free distillation to boost its performance and relieve the issue of direct model aggregation.

Federated Learning

On Learning to Solve Cardinality Constrained Combinatorial Optimization in One-Shot: A Re-parameterization Approach via Gumbel-Sinkhorn-TopK

no code implementations29 Sep 2021 Runzhong Wang, Li Shen, Yiting Chen, Junchi Yan, Xiaokang Yang, DaCheng Tao

Cardinality constrained combinatorial optimization requires selecting an optimal subset of $k$ elements, and it will be appealing to design data-driven algorithms that perform TopK selection over a probability distribution predicted by a neural network.

Combinatorial Optimization One-Shot Learning +1

Source-Free Few-Shot Domain Adaptation

no code implementations29 Sep 2021 Wenyu Zhang, Li Shen, Chuan-Sheng Foo, Wanyue Zhang

Test-time adaptation of pre-trained source models with streaming unlabelled target data is an attractive setting that protects the privacy of source data, but it has mini-batch size and class-distribution requirements on the streaming data which might not be desirable in practice.

Domain Adaptation

Enabling variable high spatial resolution retrieval from a long pulse BOTDA sensor

no code implementations9 Sep 2021 Zhao Ge, Li Shen, Can Zhao, Hao Wu, Zhiyong Zhao, Ming Tang

We propose a convolutional neural network (CNN) to process the data of conventional Brillouin optical time domain analysis (BOTDA) sensors, which achieves an unprecedented performance improvement that allows directly retrieving higher spatial resolution (SR) from sensing systems that use long pump pulses.


Learning Meta Representations for Agents in Multi-Agent Reinforcement Learning

no code implementations30 Aug 2021 Shenao Zhang, Lei Han, Li Shen

In multi-agent reinforcement learning, the behaviors that agents learn in a single Markov Game (MG) are typically confined to the given agent number.

Multi-agent Reinforcement Learning reinforcement-learning +1

TCCT: Tightly-Coupled Convolutional Transformer on Time Series Forecasting

2 code implementations29 Aug 2021 Li Shen, Yangzhu Wang

To address this issue, we propose the concept of the tightly-coupled convolutional Transformer (TCCT) and three TCCT architectures that apply transformed CNN architectures to the Transformer: (1) CSPAttention: through fusing CSPNet with the self-attention mechanism, the computation cost of self-attention is reduced by 30% and the memory usage by 50% while achieving equivalent or better prediction accuracy.

Time Series Time Series Forecasting

End-to-End Adaptive Monte Carlo Denoising and Super-Resolution

no code implementations16 Aug 2021 Xinyue Wei, HaoZhi Huang, Yujin Shi, Hongliang Yuan, Li Shen, Jue Wang

We show in this work that Monte Carlo path tracing can be further accelerated by joint super-resolution and denoising (SRD) in post-processing.

Denoising Super-Resolution

UniFaceGAN: A Unified Framework for Temporally Consistent Facial Video Editing

no code implementations12 Aug 2021 Meng Cao, HaoZhi Huang, Hao Wang, Xuan Wang, Li Shen, Sheng Wang, Linchao Bao, Zhifeng Li, Jiebo Luo

Compared with the state-of-the-art facial image editing methods, our framework generates video portraits that are more photo-realistic and temporally smooth.

3D Reconstruction Face Reenactment +3

S2Looking: A Satellite Side-Looking Dataset for Building Change Detection

1 code implementation20 Jul 2021 Li Shen, Yao Lu, Hao Chen, Hao Wei, Donghai Xie, Jiabao Yue, Rui Chen, Shouye Lv, Bitao Jiang

This paper therefore introduces S2Looking, a building-change-detection dataset that contains large-scale side-looking satellite images captured at various off-nadir angles.

Change Detection Management

Sparse Training via Boosting Pruning Plasticity with Neuroregeneration

2 code implementations NeurIPS 2021 Shiwei Liu, Tianlong Chen, Xiaohan Chen, Zahra Atashgahi, Lu Yin, Huanyu Kou, Li Shen, Mykola Pechenizkiy, Zhangyang Wang, Decebal Constantin Mocanu

Works on the lottery ticket hypothesis (LTH) and single-shot network pruning (SNIP) have recently drawn much attention to post-training pruning (iterative magnitude pruning) and before-training pruning (pruning at initialization).

Network Pruning Sparse Learning

Local AdaGrad-Type Algorithm for Stochastic Convex-Concave Optimization

no code implementations18 Jun 2021 Luofeng Liao, Li Shen, Jia Duan, Mladen Kolar, DaCheng Tao

Large scale convex-concave minimax problems arise in numerous applications, including game theory, robust training, and training of generative adversarial networks.

Vocal Bursts Type Prediction

Structure-Regularized Attention for Deformable Object Representation

1 code implementation12 Jun 2021 Shenao Zhang, Li Shen, Zhifeng Li, Wei Liu

Capturing contextual dependencies has proven useful to improve the representational power of deep neural networks.

Attacking Adversarial Attacks as A Defense

no code implementations9 Jun 2021 Boxi Wu, Heng Pan, Li Shen, Jindong Gu, Shuai Zhao, Zhifeng Li, Deng Cai, Xiaofei He, Wei Liu

In this work, we find that the adversarial attacks can also be vulnerable to small perturbations.

Differentiable Neural Architecture Search for Extremely Lightweight Image Super-Resolution

1 code implementation9 May 2021 Han Huang, Li Shen, Chaoyang He, Weisheng Dong, Wei Liu

Specifically, the cell-level search space is designed based on an information distillation mechanism, focusing on the combinations of lightweight operations and aiming to build a more lightweight and accurate SR structure.

Image Super-Resolution Neural Architecture Search +2

Robust Registration of Multimodal Remote Sensing Images Based on Structural Similarity

no code implementations31 Mar 2021 Yuanxin Ye, Jie Shan, Lorenzo Bruzzone, Li Shen

Moreover, a robust registration method is also proposed in this paper based on HOPCncc, which is evaluated using six pairs of multimodal remote sensing images.

Image Registration Template Matching

Towards Practical Adam: Non-Convexity, Convergence Theory, and Mini-Batch Acceleration

no code implementations14 Jan 2021 Congliang Chen, Li Shen, Fangyu Zou, Wei Liu

Adam is one of the most influential adaptive stochastic algorithms for training deep neural networks, which has been pointed out to be divergent even in the simple convex setting via a few simple counterexamples.

Stochastic Optimization

Adaptive Compact Attention For Few-shot Video-to-video Translation

no code implementations30 Nov 2020 Risheng Huang, Li Shen, Xuan Wang, Cheng Lin, Hao-Zhi Huang

This paper proposes an adaptive compact attention model for few-shot video-to-video translation.


Stochastic Client Selection for Federated Learning with Volatile Clients

no code implementations17 Nov 2020 Tiansheng Huang, Weiwei Lin, Li Shen, Keqin Li, Albert Y. Zomaya

Federated Learning (FL), arising as a privacy-preserving machine learning paradigm, has received notable attention from the public.

Fairness Federated Learning +1

Functional Connectome Fingerprint Gradients in Young Adults

no code implementations10 Nov 2020 Uttara Tipnis, Kausar Abbas, Elizabeth Tran, Enrico Amico, Li Shen, Alan D. Kaplan, Joaquín Goñi

Our differential identifiability results show that the fingerprint gradients based on genetic and environmental similarities are indeed present when comparing FCs for all parcellations and fMRI conditions.

A Distributed Training Algorithm of Generative Adversarial Networks with Quantized Gradients

no code implementations26 Oct 2020 Xiaojun Chen, Shu Yang, Li Shen, Xuanrong Pang

In this paper, we propose a distributed GANs training algorithm with quantized gradient, dubbed DQGAN, which is the first distributed training method with quantized gradient for GANs.

Improving the spatial resolution of a BOTDA sensor using deconvolution algorithm

no code implementations15 Sep 2020 Li Shen, Zhiyong Zhao, Can Zhao, Hao Wu, Chao Lu, Ming Tang

The frequency dependency of Brillouin gain temporal envelope is investigated by simulation, and its impact on the recovered results of deconvolution algorithm is thoroughly analyzed.


Network reinforcement driven drug repurposing for COVID-19 by exploiting disease-gene-drug associations

no code implementations12 Aug 2020 Yonghyun Nam, Jae-Seung Yun, Seung Mi Lee, Ji Won Park, Ziqi Chen, Brian Lee, Anurag Verma, Xia Ning, Li Shen, Dokyoon Kim

To reduce trial and error in finding treatments for COVID-19, we propose building a network-based drug repurposing framework to prioritize repurposable drugs.

Grouping effects of sparse CCA models in variable selection

no code implementations7 Aug 2020 Kefei Liu, Qi Long, Li Shen

The sparse canonical correlation analysis (SCCA) is a bi-multivariate association model that finds sparse linear combinations of two sets of variables that are maximally correlated with each other.

Variable Selection

Task-agnostic Temporally Consistent Facial Video Editing

no code implementations3 Jul 2020 Meng Cao, Hao-Zhi Huang, Hao Wang, Xuan Wang, Li Shen, Sheng Wang, Linchao Bao, Zhifeng Li, Jiebo Luo

Compared with the state-of-the-art facial image editing methods, our framework generates video portraits that are more photo-realistic and temporally smooth.

3D Reconstruction Video Editing

Real-time Universal Style Transfer on High-resolution Images via Zero-channel Pruning

no code implementations16 Jun 2020 Jie An, Tao Li, Hao-Zhi Huang, Li Shen, Xuan Wang, Yongyi Tang, Jinwen Ma, Wei Liu, Jiebo Luo

Extracting effective deep features to represent content and style information is the key to universal style transfer.

Style Transfer

AlphaGAN: Fully Differentiable Architecture Search for Generative Adversarial Networks

1 code implementation16 Jun 2020 Yuesong Tian, Li Shen, Guinan Su, Zhifeng Li, Wei Liu

To this end, we propose a fully differentiable search framework for generative adversarial networks, dubbed alphaGAN.

CPOT: Channel Pruning via Optimal Transport

no code implementations21 May 2020 Yucong Shen, Li Shen, Hao-Zhi Huang, Xuan Wang, Wei Liu

Recent advances in deep neural networks (DNNs) lead to tremendously growing network parameters, making the deployments of DNNs on platforms with limited resources extremely difficult.

Image-to-Image Translation Translation

Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks

1 code implementation ICML 2020 Zhishuai Guo, Mingrui Liu, Zhuoning Yuan, Li Shen, Wei Liu, Tianbao Yang

In this paper, we study distributed algorithms for large-scale AUC maximization with a deep neural network as a predictive model.

Distributed Optimization

Quantized Adam with Error Feedback

no code implementations29 Apr 2020 Congliang Chen, Li Shen, Hao-Zhi Huang, Wei Liu

In this paper, we present a distributed variant of adaptive stochastic gradient method for training deep neural networks in the parameter-server model.
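
The error-feedback idea behind communication-efficient optimizers of this kind can be sketched as follows. This is a generic 1-bit sign quantizer with an error residual, given as an assumed illustration rather than the paper's exact scheme: the quantization error is carried over and added to the next gradient so that compression error does not accumulate.

```python
import numpy as np

def quantize_with_error_feedback(grad, residual):
    """Sign-based 1-bit quantization with error feedback.

    The quantization error is stored in `residual` and added back to the
    next gradient, so the compression error does not accumulate over steps.
    """
    corrected = grad + residual
    scale = np.abs(corrected).mean()     # one float sent per tensor
    q = scale * np.sign(corrected)       # 1 bit per coordinate
    new_residual = corrected - q         # error fed back at the next step
    return q, new_residual

g = np.array([0.5, -0.2, 0.1, -0.8])
r = np.zeros_like(g)
q, r = quantize_with_error_feedback(g, r)
print(q)  # every entry has magnitude mean(|g|) = 0.4
```

Each worker would send only `scale` and the sign bits of `q`, while keeping `r` locally; by construction `q + r` equals the uncompressed corrected gradient.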


MiLeNAS: Efficient Neural Architecture Search via Mixed-Level Reformulation

1 code implementation CVPR 2020 Chaoyang He, Haishan Ye, Li Shen, Tong Zhang

To remedy this, this paper proposes MiLeNAS, a mixed-level reformulation for NAS that can be optimized efficiently and reliably.

Bilevel Optimization Neural Architecture Search +1

Cognitive Biomarker Prioritization in Alzheimer's Disease using Brain Morphometric Data

no code implementations18 Feb 2020 Bo Peng, Xiaohui Yao, Shannon L. Risacher, Andrew J. Saykin, Li Shen, Xia Ning

This method learns the latent scoring function that pushes the most effective cognitive assessments onto the top of the prioritization list.


Generalized Embedding Machines for Recommender Systems

no code implementations16 Feb 2020 Enneng Yang, Xin Xin, Li Shen, Guibing Guo

In this work, we propose an alternative approach to model high-order interaction signals in the embedding level, namely Generalized Embedding Machine (GEM).

Recommendation Systems

Adaptive Activation Network and Functional Regularization for Efficient and Flexible Deep Multi-Task Learning

no code implementations19 Nov 2019 Yingru Liu, Xuewen Yang, Dongliang Xie, Xin Wang, Li Shen, Hao-Zhi Huang, Niranjan Balasubramanian

In this paper, we propose a novel deep learning model called Task Adaptive Activation Network (TAAN) that can automatically learn the optimal network architecture for MTL.

Multi-Task Learning

MAP Inference via L2-Sphere Linear Program Reformulation

1 code implementation9 May 2019 Baoyuan Wu, Li Shen, Tong Zhang, Bernard Ghanem

Thus, LS-LP is equivalent to the original MAP inference problem.

Drug-drug interaction prediction based on co-medication patterns and graph matching

no code implementations22 Feb 2019 Wen-Hao Chiang, Li Shen, Lang Li, Xia Ning

Background: The problem of predicting whether a drug combination of arbitrary orders is likely to induce adverse drug reactions is considered in this manuscript.

Graph Matching

A Sufficient Condition for Convergences of Adam and RMSProp

no code implementations CVPR 2019 Fangyu Zou, Li Shen, Zequn Jie, Weizhong Zhang, Wei Liu

Adam and RMSProp are two of the most influential adaptive stochastic algorithms for training deep neural networks, which have been pointed out to be divergent even in the convex setting via a few simple counterexamples.

Stochastic Optimization
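
For reference, the Adam update analyzed in convergence results like the one above takes the standard textbook form (Kingma and Ba); the sketch below is that generic form, not code from the paper.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update with bias correction (t is the 1-indexed step count)."""
    m = beta1 * m + (1 - beta1) * grad        # first-moment EMA
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment EMA
    m_hat = m / (1 - beta1 ** t)              # bias-corrected moments
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize f(x) = x^2 from x = 1 (gradient is 2x).
x, m, v = np.array([1.0]), np.zeros(1), np.zeros(1)
for t in range(1, 2001):
    x, m, v = adam_step(x, 2 * x, m, v, t, lr=0.01)
print(float(x))  # close to the minimizer 0
```

The divergence counterexamples mentioned in the abstract concern how `v_hat` evolves across steps, which is exactly the quantity the sufficient condition constrains.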

Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks

8 code implementations NeurIPS 2018 Jie Hu, Li Shen, Samuel Albanie, Gang Sun, Andrea Vedaldi

We also propose a parametric gather-excite operator pair which yields further performance gains, relate it to the recently-introduced Squeeze-and-Excitation Networks, and analyse the effects of these changes to the CNN feature activation statistics.

A Unified Analysis of AdaGrad with Weighted Aggregation and Momentum Acceleration

no code implementations10 Aug 2018 Li Shen, Congliang Chen, Fangyu Zou, Zequn Jie, Ju Sun, Wei Liu

Integrating adaptive learning rate and momentum techniques into SGD leads to a large class of efficiently accelerated adaptive stochastic algorithms, such as AdaGrad, RMSProp, Adam, AccAdaGrad, \textit{etc}.

Stochastic Optimization

Image Registration and Predictive Modeling: Learning the Metric on the Space of Diffeomorphisms

no code implementations10 Aug 2018 Ayagoz Mussabayeva, Alexey Kroshnin, Anvar Kurmukov, Yulia Dodonova, Li Shen, Shan Cong, Lei Wang, Boris A. Gutman

We present a method for metric optimization in the Large Deformation Diffeomorphic Metric Mapping (LDDMM) framework, by treating the induced Riemannian metric on the space of diffeomorphisms as a kernel in a machine learning context.

General Classification Image Registration

Comparator Networks

no code implementations ECCV 2018 Weidi Xie, Li Shen, Andrew Zisserman

Our contributions are: (i) We propose a Deep Comparator Network (DCN) that can ingest a pair of sets (each may contain a variable number of images) as inputs, and compute a similarity between the pair; this involves attending to multiple discriminative local regions (landmarks), and comparing local descriptors between pairs of faces; (ii) To encourage high-quality representations for each set, internal competition is introduced for recalibration based on the landmark score; (iii) Inspired by image retrieval, a novel hard sample mining regime is proposed to control the sampling process, such that the DCN is complementary to the standard image classification models.

Face Recognition Image Classification +2

An Algorithmic Framework of Variable Metric Over-Relaxed Hybrid Proximal Extra-Gradient Method

no code implementations ICML 2018 Li Shen, Peng Sun, Yitong Wang, Wei Liu, Tong Zhang

Specifically, we find that a large class of primal and primal-dual operator splitting algorithms are all special cases of VMOR-HPE.

Drug Recommendation toward Safe Polypharmacy

no code implementations8 Mar 2018 Wen-Hao Chiang, Li Shen, Lang Li, Xia Ning

Adverse drug reactions (ADRs) induced from high-order drug-drug interactions (DDIs) due to polypharmacy represent a significant public health problem.

A Decomposition Algorithm for the Sparse Generalized Eigenvalue Problem

no code implementations CVPR 2019 Ganzhao Yuan, Li Shen, Wei-Shi Zheng

The sparse generalized eigenvalue problem arises in a number of standard and modern statistical learning models, including sparse principal component analysis, sparse Fisher discriminant analysis, and sparse canonical correlation analysis.

Numerical Analysis

End-to-end Training for Whole Image Breast Cancer Diagnosis using An All Convolutional Design

4 code implementations15 Nov 2017 Li Shen

We also demonstrate that a whole image model trained on DDSM can be easily transferred to INbreast without using its lesion annotations and using only a small amount of training data.

VGGFace2: A dataset for recognising faces across pose and age

18 code implementations23 Oct 2017 Qiong Cao, Li Shen, Weidi Xie, Omkar M. Parkhi, Andrew Zisserman

The dataset was collected with three goals in mind: (i) to have both a large number of identities and also a large number of images for each identity; (ii) to cover a large range of pose, age and ethnicity; and (iii) to minimize the label noise.

 Ranked #1 on Face Verification on IJB-C (training dataset metric)

Face Recognition Face Verification +1

Squeeze-and-Excitation Networks

78 code implementations CVPR 2018 Jie Hu, Li Shen, Samuel Albanie, Gang Sun, Enhua Wu

Squeeze-and-Excitation Networks formed the foundation of our ILSVRC 2017 classification submission which won first place and reduced the top-5 error to 2.251%, surpassing the winning entry of 2016 by a relative improvement of ~25%.

Image Classification
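
The squeeze-excite-scale pipeline of an SE block can be sketched in plain NumPy. This is a minimal single-example illustration under assumed weight shapes, not the paper's implementation:

```python
import numpy as np

def se_block(x, w1, w2):
    """Squeeze-and-Excitation on a feature map x of shape (C, H, W).

    Squeeze: global average pooling per channel.
    Excite: two FC layers (ReLU then sigmoid) produce per-channel gates.
    Scale: each channel is reweighted by its gate.
    """
    z = x.mean(axis=(1, 2))                   # squeeze: (C,)
    s = np.maximum(w1 @ z, 0.0)               # FC + ReLU: (C // r,)
    gates = 1.0 / (1.0 + np.exp(-(w2 @ s)))   # FC + sigmoid: (C,)
    return x * gates[:, None, None]           # scale channels

rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2                       # r is the reduction ratio
x = rng.normal(size=(C, H, W))
w1 = rng.normal(size=(C // r, C)) * 0.1       # reduction weights
w2 = rng.normal(size=(C, C // r)) * 0.1       # expansion weights
y = se_block(x, w1, w2)
print(y.shape)  # (8, 4, 4)
```

Because the gates lie in (0, 1), the block can only attenuate channels relative to the input, which is what makes the recalibration interpretable as channel-wise attention.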

Deep Learning to Improve Breast Cancer Early Detection on Screening Mammography

5 code implementations30 Aug 2017 Li Shen, Laurie R. Margolies, Joseph H. Rothstein, Eugene Fluder, Russell B. McBride, Weiva Sieh

We also demonstrate that a whole image classifier trained using our end-to-end approach on the DDSM digitized film mammograms can be transferred to INbreast FFDM images using only a subset of the INbreast data for fine-tuning and without further reliance on the availability of lesion annotations.

Breast Cancer Detection Specificity

Contour Detection from Deep Patch-level Boundary Prediction

no code implementations9 May 2017 Teck Wee Chua, Li Shen

In this paper, we present a novel approach for contour detection with Convolutional Neural Networks.

Contour Detection

Co-occurrence Feature Learning for Skeleton based Action Recognition using Regularized Deep LSTM Networks

no code implementations24 Mar 2016 Wentao Zhu, Cuiling Lan, Junliang Xing, Wen-Jun Zeng, Yanghao Li, Li Shen, Xiaohui Xie

Skeleton based action recognition distinguishes human actions using the trajectories of skeleton joints, which provide a very good representation for describing actions.

Action Recognition Skeleton Based Action Recognition +1

Shadow Optimization from Structured Deep Edge Detection

no code implementations CVPR 2015 Li Shen, Teck Wee Chua, Karianto Leman

In this paper, we present a novel learning-based framework for shadow region recovery from a single image.

Edge Detection Shadow Detection

Multi-level Discriminative Dictionary Learning towards Hierarchical Visual Categorization

no code implementations CVPR 2013 Li Shen, Shuhui Wang, Gang Sun, Shuqiang Jiang, Qingming Huang

For each internode of the hierarchical category structure, a discriminative dictionary and a set of classification models are learnt for visual categorization, and the dictionaries in different layers are learnt to exploit the discriminative visual properties of different granularity.

Dictionary Learning