Search Results for author: Li Shen

Found 233 papers, 105 papers with code

Fine-Tuning Linear Layers Only Is a Simple yet Effective Way for Task Arithmetic

1 code implementation • 9 Jul 2024 • Ruochen Jin, BoJian Hou, Jiancong Xiao, Weijie Su, Li Shen

To further understand how our method improves the disentanglement of task arithmetic, we present a comprehensive study of task arithmetic that differentiates the roles of the representation model and the task-specific models.

Classification • Disentanglement • +1
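
As background, task arithmetic edits a model by adding "task vectors", the differences between fine-tuned and pre-trained weights, to the pre-trained checkpoint. A minimal sketch, assuming PyTorch-style state dicts (the names and the scaling coefficient are illustrative, not the paper's code):

    def task_vector(pretrained: dict, finetuned: dict) -> dict:
        # A task vector is the element-wise weight difference.
        return {k: finetuned[k] - pretrained[k] for k in pretrained}

    def merge(pretrained: dict, vectors: list, coeff: float = 0.3) -> dict:
        # Adding scaled task vectors composes the corresponding tasks.
        merged = {k: v.clone() for k, v in pretrained.items()}
        for tv in vectors:
            for k in merged:
                merged[k] += coeff * tv[k]
        return merged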

Volume-optimal persistence homological scaffolds of hemodynamic networks covary with MEG theta-alpha aperiodic dynamics

1 code implementation • 6 Jul 2024 • Nghi Nguyen, Tao Hou, Enrico Amico, Jingyi Zheng, Huajun Huang, Alan D. Kaplan, Giovanni Petri, Joaquín Goñi, Yize Zhao, Duy Duong-Tran, Li Shen

Higher-order properties of functional magnetic resonance imaging (fMRI) induced connectivity have been shown to unravel many exclusive topological and dynamical insights beyond pairwise interactions.

Dude: Dual Distribution-Aware Context Prompt Learning For Large Vision-Language Model

no code implementations • 5 Jul 2024 • Duy M. H. Nguyen, An T. Le, Trung Q. Nguyen, Nghiem T. Diep, Tai Nguyen, Duy Duong-Tran, Jan Peters, Li Shen, Mathias Niepert, Daniel Sonntag

Prompt learning methods are gaining increasing attention due to their ability to customize large vision-language models to new domains using pre-trained contextual knowledge and minimal training data.

Image Augmentation • Language Modelling

Joint Admission Control and Resource Allocation of Virtual Network Embedding via Hierarchical Deep Reinforcement Learning

1 code implementation • 25 Jun 2024 • Tianfu Wang, Li Shen, Qilin Fan, Tong Xu, Tongliang Liu, Hui Xiong

Specifically, the whole VNE process is decomposed, via the HRL approach, into an upper-level policy that decides whether to admit the arriving VNR and a lower-level policy that allocates resources of the physical network to meet the VNR's requirements.

Combinatorial Optimization • Graph Neural Network • +2
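
The abstract describes a two-level hierarchical decomposition. A control-flow sketch of that decomposition (the policy objects and their methods are hypothetical placeholders, not the paper's code):

    def handle_request(vnr, upper_policy, lower_policy, physical_net):
        # Upper level: admission control decides whether to accept the VNR.
        if not upper_policy.admit(vnr, physical_net):
            return None  # request rejected
        # Lower level: allocate physical resources to meet the VNR's demands.
        return lower_policy.allocate(vnr, physical_net)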

Towards Efficient Pareto Set Approximation via Mixture of Experts Based Model Fusion

1 code implementation • 14 Jun 2024 • Anke Tang, Li Shen, Yong Luo, Shiwei Liu, Han Hu, Bo Du

Once the routers are learned and a preference vector is set, the MoE module can be unloaded, thus no additional computational cost is introduced during inference.

Multi-Task Learning

FusionBench: A Comprehensive Benchmark of Deep Model Fusion

1 code implementation • 5 Jun 2024 • Anke Tang, Li Shen, Yong Luo, Han Hu, Bo Du, DaCheng Tao

These techniques range from model ensemble methods, which combine the predictions to improve the overall performance, to model merging, which integrates different models into a single one, and model mixing methods, which upscale or recombine the components of the original models.

Image Classification • text-classification • +2

AlignIQL: Policy Alignment in Implicit Q-Learning through Constrained Optimization

1 code implementation • 28 May 2024 • Longxiang He, Li Shen, Junbo Tan, Xueqian Wang

IDQL reinterprets IQL as an actor-critic method and derives the weights of the implicit policy; however, this weight only holds for the optimal value function.

D4RL • Offline RL • +2

Decentralized Directed Collaboration for Personalized Federated Learning

no code implementations • CVPR 2024 • Yingqi Liu, Yifan Shi, Qinglun Li, Baoyuan Wu, Xueqian Wang, Li Shen

To avoid the central failure and communication bottleneck in the server-based FL, we concentrate on the Decentralized Personalized Federated Learning (DPFL) that performs distributed model training in a Peer-to-Peer (P2P) manner.

Personalized Federated Learning

HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning

1 code implementation • 28 May 2024 • Shengchao Hu, Ziqing Fan, Li Shen, Ya Zhang, Yanfeng Wang, DaCheng Tao

However, variations in task content and complexity pose significant challenges in policy formulation, necessitating judicious parameter sharing and management of conflicting gradients for optimal policy performance.

Management • Meta-Learning • +1

Q-value Regularized Transformer for Offline Reinforcement Learning

1 code implementation • 27 May 2024 • Shengchao Hu, Ziqing Fan, Chaoqin Huang, Li Shen, Ya Zhang, Yanfeng Wang, DaCheng Tao

Recent advancements in offline reinforcement learning (RL) have underscored the capabilities of Conditional Sequence Modeling (CSM), a paradigm that learns the action distribution based on history trajectory and target returns for each state.

D4RL • Offline RL • +3

Task Groupings Regularization: Data-Free Meta-Learning with Heterogeneous Pre-trained Models

no code implementations • 26 May 2024 • Yongxian Wei, Zixuan Hu, Li Shen, Zhenyi Wang, Yu Li, Chun Yuan, DaCheng Tao

Based on our findings, we propose Task Groupings Regularization, a novel approach that benefits from model heterogeneity by grouping and aligning conflicting tasks.

Meta-Learning

A Huber Loss Minimization Approach to Mean Estimation under User-level Differential Privacy

no code implementations • 22 May 2024 • Puning Zhao, Lifeng Lai, Li Shen, Qingming Li, Jiafei Wu, Zhe Liu

We provide a theoretical analysis of our approach, which gives the noise strength needed for privacy protection, as well as a bound on the mean squared error.
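
For reference, the standard Huber loss that the title refers to blends squared error (for small residuals) with absolute error (for large ones), which is what makes the estimator robust to outliers; here \delta is the transition threshold:

    \ell_\delta(x) = \begin{cases} \tfrac{1}{2}x^2, & |x| \le \delta \\ \delta\left(|x| - \tfrac{\delta}{2}\right), & |x| > \delta \end{cases}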

Is Mamba Compatible with Trajectory Optimization in Offline Reinforcement Learning?

no code implementations • 20 May 2024 • Yang Dai, Oubo Ma, Longfei Zhang, Xingxing Liang, Shengchao Hu, Mengzhu Wang, Shouling Ji, Jincai Huang, Li Shen

Transformer-based trajectory optimization methods have demonstrated exceptional performance in offline Reinforcement Learning (offline RL), yet they pose challenges due to substantial parameter size and limited scalability, which is particularly critical in sequential decision-making scenarios where resources are constrained, such as robots and drones with limited computational power.

Atari Games • Offline RL • +1

Subject-Adaptive Transfer Learning Using Resting State EEG Signals for Cross-Subject EEG Motor Imagery Classification

1 code implementation • 17 May 2024 • Sion An, Myeongkyun Kang, Soopil Kim, Philip Chikontwe, Li Shen, Sang Hyun Park

Electroencephalography (EEG) motor imagery (MI) classification is a fundamental yet challenging task due to the variation of signals between individuals, i.e., inter-subject variability.

EEG • Motor Imagery • +1

Learning Multi-Agent Communication from Graph Modeling Perspective

1 code implementation • 14 May 2024 • Shengchao Hu, Li Shen, Ya Zhang, DaCheng Tao

In numerous artificial intelligence applications, the collaborative efforts of multiple intelligent agents are imperative for the successful attainment of target objectives.

DALK: Dynamic Co-Augmentation of LLMs and KG to answer Alzheimer's Disease Questions with Scientific Literature

1 code implementation • 8 May 2024 • Dawei Li, Shu Yang, Zhen Tan, Jae Young Baik, Sukwon Yun, Joseph Lee, Aaron Chacko, BoJian Hou, Duy Duong-Tran, Ying Ding, Huan Liu, Li Shen, Tianlong Chen

With a synergized framework of LLM and KG mutually enhancing each other, we first leverage LLM to construct an evolving AD-specific knowledge graph (KG) sourced from AD-related scientific literature, and then we utilize a coarse-to-fine sampling method with a novel self-aware knowledge retrieval approach to select appropriate knowledge from the KG to augment LLM inference capabilities.

Question Answering

Source-Free Domain Adaptation Guided by Vision and Vision-Language Pre-Training

no code implementations • 5 May 2024 • Wenyu Zhang, Li Shen, Chuan-Sheng Foo

Despite having diverse features important for generalization, the pre-trained feature extractor can overfit to the source data distribution during source training and forget relevant target domain knowledge.

Language Modelling • Representation Learning • +2

FREE: Faster and Better Data-Free Meta-Learning

no code implementations • CVPR 2024 • Yongxian Wei, Zixuan Hu, Zhenyi Wang, Li Shen, Chun Yuan, DaCheng Tao

Data-Free Meta-Learning (DFML) aims to extract knowledge from a collection of pre-trained models without requiring the original data, presenting practical benefits in contexts constrained by data privacy concerns.

Meta-Learning

Federated Learning with Only Positive Labels by Exploring Label Correlations

no code implementations • 24 Apr 2024 • Xuming An, Dui Wang, Li Shen, Yong Luo, Han Hu, Bo Du, Yonggang Wen, DaCheng Tao

Specifically, FedALC estimates the label correlations in the class embedding learning for different label pairs and utilizes them to improve the model training.

Federated Learning • Multi-Label Classification

Continuous Spiking Graph Neural Networks

no code implementations • 2 Apr 2024 • Nan Yin, Mengzhu Wang, Li Shen, Hitesh Laxmichand Patel, Baopu Li, Bin Gu, Huan Xiong

Inspired by recent spiking neural networks (SNNs), which emulate a biological inference process and provide an energy-efficient neural architecture, we incorporate the SNNs with CGNNs in a unified framework, named Continuous Spiking Graph Neural Networks (COS-GNN).

Heterogeneous Federated Learning with Splited Language Model

no code implementations • 24 Mar 2024 • Yifan Shi, Yuhui Zhang, Ziyue Huang, Xiaofeng Yang, Li Shen, Wei Chen, Xueqian Wang

Federated Split Learning (FSL) is a promising distributed learning paradigm in practice, which gathers the strengths of both Federated Learning (FL) and Split Learning (SL) paradigms, to ensure model privacy while diminishing the resource overhead of each client, especially on large transformer models in a resource-constrained environment, e.g., the Internet of Things (IoT).

Federated Learning • Language Modelling

Building Accurate Translation-Tailored LLMs with Language Aware Instruction Tuning

1 code implementation • 21 Mar 2024 • Changtong Zan, Liang Ding, Li Shen, Yibing Zhan, Weifeng Liu, DaCheng Tao

In this work, we design a two-stage fine-tuning algorithm to improve the instruction-following ability (especially the translation direction) of LLMs.

In-Context Learning • Instruction Following • +1

A Unified and General Framework for Continual Learning

1 code implementation • 20 Mar 2024 • Zhenyi Wang, Yan Li, Li Shen, Heng Huang

Extensive experiments on CL benchmarks and theoretical analysis demonstrate the effectiveness of the proposed refresh learning.

Continual Learning

Communication-Efficient Distributed Learning with Local Immediate Error Compensation

no code implementations • 19 Feb 2024 • Yifei Cheng, Li Shen, Linli Xu, Xun Qian, Shiwei Wu, Yiming Zhou, Tie Zhang, DaCheng Tao, Enhong Chen

However, existing compression methods either perform only unidirectional compression in one iteration with a higher communication cost, or perform bidirectional compression with a slower convergence rate.

Revisiting Knowledge Distillation for Autoregressive Language Models

no code implementations • 19 Feb 2024 • Qihuang Zhong, Liang Ding, Li Shen, Juhua Liu, Bo Du, DaCheng Tao

Knowledge distillation (KD) is a common approach to compress a teacher model to reduce its inference cost and memory footprint, by training a smaller student model.

Knowledge Distillation
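
As a refresher on the setup the abstract describes, the standard token-level KD objective for autoregressive language models matches the student's next-token distribution to the teacher's softened one. A generic sketch (not necessarily this paper's exact objective):

    import torch.nn.functional as F

    def kd_loss(student_logits, teacher_logits, T: float = 2.0):
        # logits have shape (batch, seq_len, vocab); T is the temperature.
        log_p_student = F.log_softmax(student_logits / T, dim=-1)
        p_teacher = F.softmax(teacher_logits / T, dim=-1)
        # KL divergence between teacher and student distributions,
        # rescaled by T^2 as in standard distillation.
        return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)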

Step-On-Feet Tuning: Scaling Self-Alignment of LLMs via Bootstrapping

no code implementations • 12 Feb 2024 • Haoyu Wang, Guozheng Ma, Ziqiao Meng, Zeyu Qin, Li Shen, Zhong Zhang, Bingzhe Wu, Liu Liu, Yatao Bian, Tingyang Xu, Xueqian Wang, Peilin Zhao

To further exploit the capabilities of bootstrapping, we investigate and adjust the training order of data, which yields improved performance of the model.

In-Context Learning

Representation Surgery for Multi-Task Model Merging

1 code implementation • 5 Feb 2024 • Enneng Yang, Li Shen, Zhenyi Wang, Guibing Guo, Xiaojun Chen, Xingwei Wang, DaCheng Tao

That is, there is a significant discrepancy in the representation distribution between the merged and individual models, resulting in poor performance of the merged MTL model.

Computational Efficiency • Multi-Task Learning

Merging Multi-Task Models via Weight-Ensembling Mixture of Experts

1 code implementation • 1 Feb 2024 • Anke Tang, Li Shen, Yong Luo, Nan Yin, Lefei Zhang, DaCheng Tao

A notable challenge is mitigating the interference between parameters of different models, which can substantially deteriorate performance.

Task Arithmetic

Multimodal Neurodegenerative Disease Subtyping Explained by ChatGPT

no code implementations • 31 Jan 2024 • Diego Machado Reyes, Hanqing Chao, Juergen Hahn, Li Shen, Pingkun Yan

Thus, we propose a multimodal framework that uses early-stage indicators such as imaging, genetics and clinical assessments to classify AD patients into subtypes at early stages.

FreeStyle: Free Lunch for Text-guided Style Transfer using Diffusion Models

no code implementations • 28 Jan 2024 • Feihong He, Gang Li, Mengyuan Zhang, Leilei Yan, Lingyu Si, Fanzhang Li, Li Shen

In the decoder, we further modulate features from the dual streams based on a given content image and the corresponding style text prompt for precise style transfer.

Decoder • Style Transfer

Solving Continual Offline Reinforcement Learning with Decision Transformer

no code implementations • 16 Jan 2024 • Kaixin Huang, Li Shen, Chen Zhao, Chun Yuan, DaCheng Tao

We aim to investigate whether Decision Transformer (DT), another offline RL paradigm, can serve as a more suitable offline continuous learner to address these issues.

Offline RL • reinforcement-learning • +1

OOP: Object-Oriented Programming Evaluation Benchmark for Large Language Models

1 code implementation • 12 Jan 2024 • Shuai Wang, Liang Ding, Li Shen, Yong Luo, Bo Du, DaCheng Tao

Advancing automated programming necessitates robust and comprehensive code generation benchmarks, yet current evaluation frameworks largely neglect object-oriented programming (OOP) in favor of functional programming (FP), e.g., HumanEval and MBPP.

Code Generation

Siamese Networks with Soft Labels for Unsupervised Lesion Detection and Patch Pretraining on Screening Mammograms

no code implementations • 10 Jan 2024 • Kevin Van Vorst, Li Shen

Self-supervised learning has become a popular way to pretrain a deep learning model and then transfer it to perform downstream tasks.

Lesion Detection • Self-Supervised Learning

Your Transferability Barrier is Fragile: Free-Lunch for Transferring the Non-Transferable Learning

no code implementations • CVPR 2024 • Ziming Hong, Li Shen, Tongliang Liu

Motivated by these findings, we uncover the potential risks of NTL by proposing a simple but effective method (dubbed TransNTL) to recover the target-domain performance with few source-domain data.

Sheared Backpropagation for Fine-tuning Foundation Models

no code implementations • CVPR 2024 • Zhiyuan Yu, Li Shen, Liang Ding, Xinmei Tian, Yixin Chen, DaCheng Tao

To address these challenges, we introduce PreBackRazor, a novel activation pruning scheme offering both computational and memory efficiency through a sparsified backpropagation strategy, which judiciously avoids unnecessary activation pruning, storage, and gradient computation.

Benchmarking

Neural Network Approximation for Pessimistic Offline Reinforcement Learning

no code implementations • 19 Dec 2023 • Di Wu, Yuling Jiao, Li Shen, Haizhao Yang, Xiliang Lu

In this paper, we establish a non-asymptotic estimation error of pessimistic offline RL using general neural network approximation with $\mathcal{C}$-mixing data regarding the structure of networks, the dimension of datasets, and the concentrability of data coverage, under mild assumptions.

Offline RL • reinforcement-learning • +1

Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion

1 code implementation • 11 Dec 2023 • Anke Tang, Li Shen, Yong Luo, Liang Ding, Han Hu, Bo Du, DaCheng Tao

At the upper level, we focus on learning a shared Concrete mask to identify the subspace, while at the inner level, model merging is performed to maximize the performance of the merged model.

Meta-Learning • Task Arithmetic

Take an Irregular Route: Enhance the Decoder of Time-Series Forecasting Transformer

1 code implementation • 10 Dec 2023 • Li Shen, Yuning Wei, Yangzhu Wang, Hongguang Li

With the development of Internet of Things (IoT) systems, a precise long-term forecasting method is requisite for decision makers to evaluate current statuses and formulate future policies.

Decoder • Time Series • +1

Embodied Multi-Modal Agent trained by an LLM from a Parallel TextWorld

1 code implementation • CVPR 2024 • Yijun Yang, Tianyi Zhou, Kanxue Li, Dapeng Tao, Lusong Li, Li Shen, Xiaodong He, Jing Jiang, Yuhui Shi

While large language models (LLMs) excel in a simulated world of texts, they struggle to interact with the more realistic world without perceptions of other modalities such as visual or audio signals.

Imitation Learning

Task-Distributionally Robust Data-Free Meta-Learning

no code implementations • 23 Nov 2023 • Zixuan Hu, Li Shen, Zhenyi Wang, Yongxian Wei, Baoyuan Wu, Chun Yuan, DaCheng Tao

TDS leads to a biased meta-learner because of the skewed task distribution towards newly generated tasks.

Meta-Learning • Model Selection

Rethinking SIGN Training: Provable Nonconvex Acceleration without First- and Second-Order Gradient Lipschitz

no code implementations • 23 Oct 2023 • Tao Sun, Congliang Chen, Peng Qiao, Li Shen, Xinwang Liu, Dongsheng Li

Sign-based stochastic methods have gained attention due to their ability to achieve robust performance despite using only the sign information for parameter updates.

Zero-Shot Sharpness-Aware Quantization for Pre-trained Language Models

no code implementations • 20 Oct 2023 • Miaoxi Zhu, Qihuang Zhong, Li Shen, Liang Ding, Juhua Liu, Bo Du, DaCheng Tao

The key algorithm in solving ZSAQ is the SAM-SGA optimization, which aims to improve the quantization accuracy and model generalization via optimizing a minimax problem.

Language Modelling • Quantization

Merging Experts into One: Improving Computational Efficiency of Mixture of Experts

1 code implementation • 15 Oct 2023 • Shwai He, Run-Ze Fan, Liang Ding, Li Shen, Tianyi Zhou, DaCheng Tao

Although a sparse Mixture of Experts (MoE) can reduce the cost by activating a small subset of parameters (e.g., one expert) for each input, its computation escalates significantly if increasing the number of activated experts, limiting its practical utility.

Computational Efficiency

Enhancing Column Generation by Reinforcement Learning-Based Hyper-Heuristic for Vehicle Routing and Scheduling Problems

no code implementations • 15 Oct 2023 • Kuan Xu, Li Shen, Lindong Liu

In addition, we specify RLHH to solve two typical combinatorial optimization problems: Vehicle Routing Problem with Time Windows (VRPTW) and Bus Driver Scheduling Problem (BDSP).

Combinatorial Optimization • Scheduling

Diversifying the Mixture-of-Experts Representation for Language Models with Orthogonal Optimizer

no code implementations • 15 Oct 2023 • Boan Liu, Liang Ding, Li Shen, Keqin Peng, Yu Cao, Dazhao Cheng, DaCheng Tao

The Mixture of Experts (MoE) has emerged as a highly successful technique in deep learning, based on the principle of divide-and-conquer to maximize model capacity without significant additional computational cost.

Diversity • Question Answering

Learn From Model Beyond Fine-Tuning: A Survey

1 code implementation • 12 Oct 2023 • Hongling Zheng, Li Shen, Anke Tang, Yong Luo, Han Hu, Bo Du, DaCheng Tao

LFM focuses on the research, modification, and design of FM based on the model interface, so as to better understand the model structure and weights (in a black box environment), and to generalize the model to downstream tasks.

Meta-Learning • Model Editing

Debias the Training of Diffusion Models

no code implementations • 12 Oct 2023 • Hu Yu, Li Shen, Jie Huang, Man Zhou, Hongsheng Li, Feng Zhao

Diffusion models have demonstrated compelling generation quality by optimizing the variational lower bound through a simple denoising score matching loss.

Denoising

Revisiting Plasticity in Visual Reinforcement Learning: Data, Modules and Training Stages

1 code implementation • 11 Oct 2023 • Guozheng Ma, Lu Li, Sen Zhang, Zixuan Liu, Zhen Wang, Yixin Chen, Li Shen, Xueqian Wang, DaCheng Tao

Plasticity, the ability of a neural network to evolve with new data, is crucial for high-performance and sample-efficient visual reinforcement learning (VRL).

Data Augmentation • reinforcement-learning

ChatGPT for Computational Topology

1 code implementation • 11 Oct 2023 • Jian Liu, Li Shen, Guo-Wei Wei

This work serves as an initial step towards effectively transforming pure mathematical theories into practical computational tools, with the ultimate goal of enabling real applications across diverse fields.

Topological Data Analysis

DiffCPS: Diffusion Model based Constrained Policy Search for Offline Reinforcement Learning

1 code implementation • 9 Oct 2023 • Longxiang He, Li Shen, Linrui Zhang, Junbo Tan, Xueqian Wang

Constrained policy search (CPS) is a fundamental problem in offline reinforcement learning, which is generally solved by advantage weighted regression (AWR).

D4RL • Offline RL • +1

Asymmetrically Decentralized Federated Learning

no code implementations • 8 Oct 2023 • Qinglun Li, Miao Zhang, Nan Yin, Quanjun Yin, Li Shen

To further improve algorithm performance and alleviate local heterogeneous overfitting in Federated Learning (FL), our algorithm combines the Sharpness Aware Minimization (SAM) optimizer and local momentum.

Federated Learning

Parameter Efficient Multi-task Model Fusion with Partial Linearization

1 code implementation • 7 Oct 2023 • Anke Tang, Li Shen, Yong Luo, Yibing Zhan, Han Hu, Bo Du, Yixin Chen, DaCheng Tao

We demonstrate that our partial linearization technique enables a more effective fusion of multiple tasks into a single model, outperforming standard adapter tuning and task arithmetic alone.

Task Arithmetic

Which mode is better for federated learning? Centralized or Decentralized

no code implementations • 5 Oct 2023 • Yan Sun, Li Shen, DaCheng Tao

Both centralized and decentralized approaches have shown excellent performance and great application value in federated learning (FL).

Federated Learning • valid

Efficient Federated Prompt Tuning for Black-box Large Pre-trained Models

no code implementations • 4 Oct 2023 • Zihao Lin, Yan Sun, Yifan Shi, Xueqian Wang, Lifu Huang, Li Shen, DaCheng Tao

With the rapid development of pre-trained models (PTMs), the efficient tuning of these models for diverse downstream applications has emerged as a pivotal research concern.

AdaMerging: Adaptive Model Merging for Multi-Task Learning

1 code implementation • 4 Oct 2023 • Enneng Yang, Zhenyi Wang, Li Shen, Shiwei Liu, Guibing Guo, Xingwei Wang, DaCheng Tao

This approach aims to autonomously learn the coefficients for model merging, either in a task-wise or layer-wise manner, without relying on the original training data.

Task Arithmetic

Towards Stable Backdoor Purification through Feature Shift Tuning

1 code implementation • NeurIPS 2023 • Rui Min, Zeyu Qin, Li Shen, Minhao Cheng

Our analysis shows that with the low poisoning rate, the entanglement between backdoor and clean features undermines the effect of tuning-based defenses.

Unlikelihood Tuning on Negative Samples Amazingly Improves Zero-Shot Translation

1 code implementation • 28 Sep 2023 • Changtong Zan, Liang Ding, Li Shen, Yibin Lei, Yibing Zhan, Weifeng Liu, DaCheng Tao

Zero-shot translation (ZST), which is generally based on a multilingual neural machine translation model, aims to translate between unseen language pairs in training data.

Machine Translation • Navigate • +2

Deep Model Fusion: A Survey

1 code implementation • 27 Sep 2023 • Weishi Li, Yong Peng, Miao Zhang, Liang Ding, Han Hu, Li Shen

Specifically, we categorize existing deep model fusion methods into four classes: (1) "Mode connectivity", which connects solutions in weight space via a path of non-increasing loss in order to obtain a better initialization for model fusion; (2) "Alignment", which matches units between neural networks to create better conditions for fusion; (3) "Weight average", a classical model fusion method that averages the weights of multiple models to obtain more accurate results closer to the optimal solution; and (4) "Ensemble learning", which combines the outputs of diverse models, a foundational technique for improving the accuracy and robustness of the final model.

Ensemble Learning
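
Of the four categories, "weight average" is the simplest to illustrate. A minimal sketch, assuming the models share an architecture and their weights are given as PyTorch-style state dicts (names are illustrative):

    def average_weights(state_dicts: list) -> dict:
        # Uniform average of aligned weights across models.
        avg = {k: v.clone() for k, v in state_dicts[0].items()}
        for sd in state_dicts[1:]:
            for k in avg:
                avg[k] += sd[k]
        return {k: v / len(state_dicts) for k, v in avg.items()}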

FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup for Non-IID Data

no code implementations • 18 Sep 2023 • Hao Sun, Li Shen, Shixiang Chen, Jingwei Sun, Jing Li, Guangzhong Sun, DaCheng Tao

Federated learning is an emerging distributed machine learning method that enables a large number of clients to train a model without exchanging their local data.

Federated Learning • Scheduling

Continual Learning From a Stream of APIs

no code implementations • 31 Aug 2023 • Enneng Yang, Zhenyi Wang, Li Shen, Nan Yin, Tongliang Liu, Guibing Guo, Xingwei Wang, DaCheng Tao

Next, we train the CL model by minimizing the gap between the responses of the CL model and the black-box API on synthetic data, to transfer the API's knowledge to the CL model.

Continual Learning

MerA: Merging Pretrained Adapters For Few-Shot Learning

no code implementations • 30 Aug 2023 • Shwai He, Run-Ze Fan, Liang Ding, Li Shen, Tianyi Zhou, DaCheng Tao

Adapter tuning, which updates only a few parameters, has become a mainstream method for fine-tuning pretrained language models to downstream tasks.

Few-Shot Learning • MRPC

Master-slave Deep Architecture for Top-K Multi-armed Bandits with Non-linear Bandit Feedback and Diversity Constraints

1 code implementation • 24 Aug 2023 • Hanchi Huang, Li Shen, Deheng Ye, Wei Liu

We propose a novel master-slave architecture to solve the top-$K$ combinatorial multi-armed bandits problem with non-linear bandit feedback and diversity constraints, which, to the best of our knowledge, is the first combinatorial bandits setting considering diversity constraints under bandit feedback.

Diversity • Multi-Armed Bandits

Can Linguistic Knowledge Improve Multimodal Alignment in Vision-Language Pretraining?

1 code implementation • 24 Aug 2023 • Fei Wang, Liang Ding, Jun Rao, Ye Liu, Li Shen, Changxing Ding

The multimedia community has shown significant interest in perceiving and representing the physical world with multimodal pretrained neural network models, and among them, vision-language pretraining (VLP) is currently the most captivating topic.

Attribute • Negation • +1

Towards Understanding the Generalizability of Delayed Stochastic Gradient Descent

no code implementations • 18 Aug 2023 • Xiaoge Deng, Li Shen, Shengwei Li, Tao Sun, Dongsheng Li, DaCheng Tao

Stochastic gradient descent (SGD) performed in an asynchronous manner plays a crucial role in training large-scale machine learning models.

DFedADMM: Dual Constraints Controlled Model Inconsistency for Decentralized Federated Learning

no code implementations • 16 Aug 2023 • Qinglun Li, Li Shen, Guanghao Li, Quanjun Yin, DaCheng Tao

To address the communication burden issues associated with federated learning (FL), decentralized federated learning (DFL) discards the central server and establishes a decentralized communication network, where each client communicates only with neighboring clients.

Federated Learning

LGViT: Dynamic Early Exiting for Accelerating Vision Transformer

1 code implementation • 1 Aug 2023 • Guanyu Xu, Jiawei Hao, Li Shen, Han Hu, Yong Luo, Hui Lin, Jialie Shen

Recently, the efficient deployment and acceleration of powerful vision transformers (ViTs) on resource-limited edge devices for providing multimedia services have become attractive tasks.

Efficient Federated Learning via Local Adaptive Amended Optimizer with Linear Speedup

no code implementations • 30 Jul 2023 • Yan Sun, Li Shen, Hao Sun, Liang Ding, DaCheng Tao

Adaptive optimization has achieved notable success for distributed learning, while extending adaptive optimizers to federated learning (FL) suffers from severe inefficiency, including (i) rugged convergence due to inaccurate gradient estimation in the global adaptive optimizer, and (ii) client drift exacerbated by local over-fitting with the local adaptive optimizer.

Federated Learning

High-Resolution Volumetric Reconstruction for Clothed Humans

no code implementations • 25 Jul 2023 • Sicong Tang, Guangyuan Wang, Qing Ran, Lingzhi Li, Li Shen, Ping Tan

We present a novel method for reconstructing clothed humans from a sparse set of, e.g., 1 to 6 RGB images.

Quantization

GBT: Two-stage transformer framework for non-stationary time series forecasting

1 code implementation • 17 Jul 2023 • Li Shen, Yuning Wei, Yangzhu Wang

It decouples the prediction process of TSFT into two stages, an Auto-Regression stage and a Self-Regression stage, to tackle the problem of different statistical properties between input and prediction sequences. Prediction results of the Auto-Regression stage serve as a Good Beginning, i.e., a better initialization for the inputs of the Self-Regression stage.

Decoder • regression • +2

A Comprehensive Survey of Forgetting in Deep Learning Beyond Continual Learning

1 code implementation • 16 Jul 2023 • Zhenyi Wang, Enneng Yang, Li Shen, Heng Huang

Through this comprehensive survey, we aspire to uncover potential solutions by drawing upon ideas and approaches from various fields that have dealt with forgetting.

Continual Learning • Federated Learning • +1

Boosting Backdoor Attack with A Learnable Poisoning Sample Selection Strategy

no code implementations • 14 Jul 2023 • Zihao Zhu, Mingda Zhang, Shaokui Wei, Li Shen, Yanbo Fan, Baoyuan Wu

To further integrate it with the normal training process, we then propose a learnable poisoning sample selection strategy to learn the mask together with the model parameters through a min-max optimization. Specifically, the outer loop aims to achieve the backdoor attack goal by minimizing the loss based on the selected samples, while the inner loop selects hard poisoning samples that impede this goal by maximizing the loss.

Backdoor Attack • Data Poisoning
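
The inner loop of the min-max formulation above selects the poisoning samples that are hardest to fit. A schematic of that selection step (tensor shapes and the hard-selection heuristic are illustrative, not the authors' implementation):

    import torch

    def select_hard_poisons(per_sample_losses: torch.Tensor, k: int) -> torch.Tensor:
        # Inner maximization: keep the k candidates with the largest loss.
        mask = torch.zeros_like(per_sample_losses)
        mask[per_sample_losses.topk(k).indices] = 1.0
        return mask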

Systematic Investigation of Sparse Perturbed Sharpness-Aware Minimization Optimizer

1 code implementation • 30 Jun 2023 • Peng Mi, Li Shen, Tianhe Ren, Yiyi Zhou, Tianshuo Xu, Xiaoshuai Sun, Tongliang Liu, Rongrong Ji, DaCheng Tao

Sharpness-Aware Minimization (SAM) is a popular solution that smooths the loss landscape by minimizing the maximized change of training loss when adding a perturbation to the weight.
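
For context, the objective described here is the standard minimax formulation from the SAM literature (not specific to this paper): perturb the weights within a radius \rho and minimize the worst-case loss,

    \min_{w} \max_{\|\epsilon\|_2 \le \rho} L(w + \epsilon),

where in practice the inner maximizer is approximated by a single gradient-normalized step, \hat{\epsilon} = \rho \, \nabla L(w) / \|\nabla L(w)\|_2.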

Enhancing Adversarial Training via Reweighting Optimization Trajectory

1 code implementation • 25 Jun 2023 • Tianjin Huang, Shiwei Liu, Tianlong Chen, Meng Fang, Li Shen, Vlado Menkovski, Lu Yin, Yulong Pei, Mykola Pechenizkiy

Despite the fact that adversarial training has become the de facto method for improving the robustness of deep neural networks, it is well-known that vanilla adversarial training suffers from daunting robust overfitting, resulting in unsatisfactory robust generalization.

Adversarial Robustness

FDNet: Focal Decomposed Network for Efficient, Robust and Practical Time Series Forecasting

1 code implementation • 19 Jun 2023 • Li Shen, Yuning Wei, Yangzhu Wang, Huaxin Qiu

Moreover, we propose focal input sequence decomposition method which decomposes input sequence in a focal manner for efficient and robust forecasting when facing Long Sequence Time series Input (LSTI) problem.

Inductive Bias • Time Series • +1

CoCo: A Coupled Contrastive Framework for Unsupervised Domain Adaptive Graph Classification

no code implementations • 8 Jun 2023 • Nan Yin, Li Shen, Mengzhu Wang, Long Lan, Zeyu Ma, Chong Chen, Xian-Sheng Hua, Xiao Luo

Although graph neural networks (GNNs) have achieved impressive results in graph classification, they often need abundant task-specific labels, which can be extensively costly to acquire.

Contrastive Learning • Domain Adaptation • +2

One-step Multi-view Clustering with Diverse Representation

no code implementations • 8 Jun 2023 • Xinhang Wan, Jiyuan Liu, Xinwang Liu, Siwei Wang, Yi Wen, Tianjiao Wan, Li Shen, En Zhu

In light of this, we propose a one-step multi-view clustering with diverse representation method, which incorporates multi-view learning and $k$-means into a unified framework.

Clustering • MULTI-VIEW LEARNING • +1

Are Large Kernels Better Teachers than Transformers for ConvNets?

1 code implementation • 30 May 2023 • Tianjin Huang, Lu Yin, Zhenyu Zhang, Li Shen, Meng Fang, Mykola Pechenizkiy, Zhangyang Wang, Shiwei Liu

We hereby carry out a first-of-its-kind study unveiling that modern large-kernel ConvNets, a compelling competitor to Vision Transformers, are remarkably more effective teachers for small-kernel ConvNets, due to more similar architectures.

Knowledge Distillation

Compact Real-time Radiance Fields with Neural Codebook

no code implementations • 29 May 2023 • Lingzhi Li, Zhongshu Wang, Zhen Shen, Li Shen, Ping Tan

Reconstructing neural radiance fields with explicit volumetric representations, demonstrated by Plenoxels, has shown remarkable advantages on training and rendering efficiency, while grid-based representations typically induce considerable overhead for storage and transmission.

Learning to Learn from APIs: Black-Box Data-Free Meta-Learning

1 code implementation • 28 May 2023 • Zixuan Hu, Li Shen, Zhenyi Wang, Baoyuan Wu, Chun Yuan, DaCheng Tao

Data-free meta-learning (DFML) aims to enable efficient learning of new tasks by meta-learning from a collection of pre-trained models without access to the training data.

Few-Shot Learning • Knowledge Distillation

Incomplete Multimodal Learning for Complex Brain Disorders Prediction

no code implementations • 25 May 2023 • Reza Shirkavand, Liang Zhan, Heng Huang, Li Shen, Paul M. Thompson

Especially in studies of brain diseases, research cohorts may include both neuroimaging data and genetic data, but for practical clinical diagnosis, we often need to make disease predictions only based on neuroimages.

Data Integration

Towards More Suitable Personalization in Federated Learning via Decentralized Partial Model Training

no code implementations • 24 May 2023 • Yifan Shi, Yingqi Liu, Yan Sun, Zihao Lin, Li Shen, Xueqian Wang, DaCheng Tao

Personalized federated learning (PFL) aims to produce the best personalized model for each client, in order to cope with a fundamental problem in real FL systems: data heterogeneity.

Personalized Federated Learning

Dynamic Regularized Sharpness Aware Minimization in Federated Learning: Approaching Global Consistency and Smooth Landscape

1 code implementation • 19 May 2023 • Yan Sun, Li Shen, Shixiang Chen, Liang Ding, DaCheng Tao

In federated learning (FL), a cluster of local clients are coordinated by the global server and cooperatively train one model with privacy protection.

Federated Learning

Prompt-Tuning Decision Transformer with Preference Ranking

no code implementations • 16 May 2023 • Shengchao Hu, Li Shen, Ya Zhang, DaCheng Tao

Our work contributes to the advancement of prompt-tuning approaches in RL, providing a promising direction for optimizing large RL agents for specific preference tasks.

DaGAN++: Depth-Aware Generative Adversarial Network for Talking Head Video Generation

1 code implementation • 10 May 2023 • Fa-Ting Hong, Li Shen, Dan Xu

In this work, firstly, we present a novel self-supervised method for learning dense 3D facial geometry (i.e., depth) from face videos, without requiring camera parameters and 3D geometry annotations in training.

Generative Adversarial Network • Keypoint Estimation • +2

Towards the Flatter Landscape and Better Generalization in Federated Learning under Client-level Differential Privacy

1 code implementation • 1 May 2023 • Yifan Shi, Kang Wei, Li Shen, Yingqi Liu, Xueqian Wang, Bo Yuan, DaCheng Tao

To defend against inference attacks and mitigate sensitive information leakage in Federated Learning (FL), client-level Differentially Private FL (DPFL) is the de facto standard for privacy protection, clipping local updates and adding random noise.

Federated Learning

Evaluate Geometry of Radiance Fields with Low-frequency Color Prior

1 code implementation • 10 Apr 2023 • Qihang Fang, Yafei Song, Keqiang Li, Li Shen, Huaiyu Wu, Gang Xiong, Liefeng Bo

From this insight, given a reconstructed density field and observation images, we design a closed-form method to approximate the color field with low-frequency spherical harmonics, and compute the inverse mean residual color.

3D Reconstruction • Novel View Synthesis

On Efficient Training of Large-Scale Deep Learning Models: A Literature Review

no code implementations • 7 Apr 2023 • Li Shen, Yan Sun, Zhiyuan Yu, Liang Ding, Xinmei Tian, DaCheng Tao

The field of deep learning has witnessed significant progress, particularly in computer vision (CV), natural language processing (NLP), and speech.

Quantum Imitation Learning

no code implementations • 4 Apr 2023 • Zhihao Cheng, Kaining Zhang, Li Shen, DaCheng Tao

Despite remarkable successes in solving various complex decision-making tasks, training an imitation learning (IL) algorithm with deep neural networks (DNNs) suffers from a high computation burden.

Behavioural cloning

Towards Making the Most of ChatGPT for Machine Translation

1 code implementation • 24 Mar 2023 • Keqin Peng, Liang Ding, Qihuang Zhong, Li Shen, Xuebo Liu, Min Zhang, Yuanxin Ouyang, DaCheng Tao

We show that: 1) The performance of ChatGPT depends largely on temperature, and a lower temperature usually can achieve better performance; 2) Emphasizing the task information can further improve ChatGPT's performance, particularly in complex MT tasks; 3) Introducing domain information can elicit ChatGPT's generalization ability and improve its performance in the specific domain; 4) ChatGPT tends to generate hallucinations for non-English-centric MT tasks, which can be partially addressed by our proposed prompts but still need to be highlighted for the MT/NLP community.

In-Context Learning • Machine Translation • +2

Robust Generalization against Photon-Limited Corruptions via Worst-Case Sharpness Minimization

3 code implementations • CVPR 2023 • Zhuo Huang, Miaoxi Zhu, Xiaobo Xia, Li Shen, Jun Yu, Chen Gong, Bo Han, Bo Du, Tongliang Liu

Experimentally, we simulate photon-limited corruptions using CIFAR10/100 and ImageNet30 datasets and show that SharpDRO exhibits a strong generalization ability against severe corruptions and exceeds well-known baseline methods with large performance gains.

Architecture, Dataset and Model-Scale Agnostic Data-free Meta-Learning

1 code implementation • CVPR 2023 • Zixuan Hu, Li Shen, Zhenyi Wang, Tongliang Liu, Chun Yuan, DaCheng Tao

The goal of data-free meta-learning is to learn useful prior knowledge from a collection of pre-trained models without accessing their training data.

Meta-Learning

Make Landscape Flatter in Differentially Private Federated Learning

1 code implementation • CVPR 2023 • Yifan Shi, Yingqi Liu, Kang Wei, Li Shen, Xueqian Wang, DaCheng Tao

Specifically, DP-FedSAM integrates Sharpness Aware Minimization (SAM) optimizer to generate local flatness models with better stability and weight perturbation robustness, which results in the small norm of local updates and robustness to DP noise, thereby improving the performance.

Federated Learning
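
Client-level DP in FL is typically enforced by norm-clipping each local update and adding Gaussian noise calibrated to the clip bound. A generic clip-and-noise sketch (an illustrative recipe, not the exact DP-FedSAM algorithm):

    import torch

    def privatize_update(update: dict, clip_c: float, sigma: float) -> dict:
        # Clip the whole update to L2 norm at most clip_c, then add noise.
        total_norm = torch.cat([v.flatten() for v in update.values()]).norm()
        scale = min(1.0, clip_c / (float(total_norm) + 1e-12))
        return {k: v * scale + sigma * clip_c * torch.randn_like(v)
                for k, v in update.items()}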

Visual Prompt Based Personalized Federated Learning

no code implementations • 15 Mar 2023 • Guanghao Li, Wansen Wu, Yan Sun, Li Shen, Baoyuan Wu, DaCheng Tao

Then, the local model is trained on the input composed of raw data and a visual prompt to learn the distribution information contained in the prompt.

Image Classification • Personalized Federated Learning

SGDA: Towards 3D Universal Pulmonary Nodule Detection via Slice Grouped Domain Attention

1 code implementation • 7 Mar 2023 • Rui Xu, Zhi Liu, Yong Luo, Han Hu, Li Shen, Bo Du, Kaiming Kuang, Jiancheng Yang

To address this issue, we propose a slice grouped domain attention (SGDA) module to enhance the generalization capability of the pulmonary nodule detection networks.

Computed Tomography (CT)

Graph Decision Transformer

no code implementations • 7 Mar 2023 • Shengchao Hu, Li Shen, Ya Zhang, DaCheng Tao

Offline reinforcement learning (RL) is a challenging task, whose objective is to learn policies from static trajectory data without interacting with the environment.

Offline RL • OpenAI Gym • +1

AdaSAM: Boosting Sharpness-Aware Minimization with Adaptive Learning Rate and Momentum for Training Deep Neural Networks

no code implementations • 1 Mar 2023 • Hao Sun, Li Shen, Qihuang Zhong, Liang Ding, Shixiang Chen, Jingwei Sun, Jing Li, Guangzhong Sun, DaCheng Tao

Integrating SAM with an adaptive learning rate and momentum acceleration, dubbed AdaSAM, has already been explored empirically to train large-scale deep neural networks, but without a theoretical guarantee, owing to the triple difficulty of analyzing the coupled perturbation step, adaptive learning rate, and momentum step.

Subspace based Federated Unlearning

no code implementations • 24 Feb 2023 • Guanghao Li, Li Shen, Yan Sun, Yue Hu, Han Hu, DaCheng Tao

Federated learning (FL) enables multiple clients to train a machine learning model collaboratively without exchanging their local data.

Federated Learning

Fusion of Global and Local Knowledge for Personalized Federated Learning

1 code implementation • 21 Feb 2023 • Tiansheng Huang, Li Shen, Yan Sun, Weiwei Lin, DaCheng Tao

Personalized federated learning, as a variant of federated learning, trains customized models for clients using their heterogeneously distributed data.

Personalized Federated Learning

Establishing group-level brain structural connectivity incorporating anatomical knowledge under latent space modeling

no code implementations • 21 Feb 2023 • Selena Wang, Yiting Wang, Frederick H. Xu, Li Shen, Yize Zhao

By applying the ABC model to study brain structural connectivity stratified by sex among Alzheimer's Disease (AD) subjects and healthy controls incorporating the anatomical attributes (volume, thickness and area) on nodes, our method shows superior predictive power on out-of-sample structural connectivity and identifies meaningful sex-specific network neuromarkers for AD.

FedSpeed: Larger Local Interval, Less Communication Round, and Higher Generalization Accuracy

1 code implementation • 21 Feb 2023 • Yan Sun, Li Shen, Tiansheng Huang, Liang Ding, DaCheng Tao

Federated learning is an emerging distributed machine learning framework that jointly trains a global model via a large number of local devices with data privacy protection.

Federated Learning

Bag of Tricks for Effective Language Model Pretraining and Downstream Adaptation: A Case Study on GLUE

no code implementations • 18 Feb 2023 • Qihuang Zhong, Liang Ding, Keqin Peng, Juhua Liu, Bo Du, Li Shen, Yibing Zhan, DaCheng Tao

This technical report briefly describes our JDExplore d-team's submission Vega v1 on the General Language Understanding Evaluation (GLUE) leaderboard, where GLUE is a collection of nine natural language understanding tasks, including question answering, linguistic acceptability, sentiment analysis, text similarity, paraphrase detection, and natural language inference.

Contrastive Learning • Denoising • +12

FedABC: Targeting Fair Competition in Personalized Federated Learning

no code implementations • 15 Feb 2023 • Dui Wang, Li Shen, Yong Luo, Han Hu, Kehua Su, Yonggang Wen, DaCheng Tao

In particular, we adopt the ``one-vs-all'' training strategy in each client to alleviate the unfair competition between classes by constructing a personalized binary classification problem for each class.

Binary Classification • Personalized Federated Learning

Robust Generalization against Corruptions via Worst-Case Sharpness Minimization

no code implementations • journal 2023 • Zhuo Huang, Xiaobo Xia, Li Shen, Jun Yu, Chen Gong, Bo Han, Tongliang Liu

Robust generalization aims to deal with the most challenging data distributions which are rarely presented in training set and contain severe noise corruptions.

Enhance Local Consistency in Federated Learning: A Multi-Step Inertial Momentum Approach

no code implementations • 11 Feb 2023 • Yixing Liu, Yan Sun, Zhengtao Ding, Li Shen, Bo Liu, DaCheng Tao

Federated learning (FL), as a collaborative distributed training paradigm over edge computing devices under the coordination of a centralized server, is plagued by inconsistent local stationary points due to the heterogeneity of the partially participating local clients, which precipitates local client-drift problems and leads to unstable and slow convergence, especially on highly heterogeneous datasets.

Edge-computing • Federated Learning

Improving the Model Consistency of Decentralized Federated Learning

no code implementations • 8 Feb 2023 • Yifan Shi, Li Shen, Kang Wei, Yan Sun, Bo Yuan, Xueqian Wang, DaCheng Tao

To mitigate the privacy leakages and communication burdens of Federated Learning (FL), decentralized FL (DFL) discards the central server and each client only communicates with its neighbors in a decentralized communication network.

Federated Learning

MetaMix: Towards Corruption-Robust Continual Learning With Temporally Self-Adaptive Data Transformation

no code implementations • CVPR 2023 • Zhenyi Wang, Li Shen, Donglin Zhan, Qiuling Suo, Yanjun Zhu, Tiehang Duan, Mingchen Gao

To make CL models deployed in safety-critical scenarios trustworthy and robust to corruptions, we propose a meta-learning framework of self-adaptive data augmentation to tackle corruption robustness in CL.

Continual Learning • Data Augmentation • +1

Data Augmented Flatness-aware Gradient Projection for Continual Learning

no code implementations • ICCV 2023 • Enneng Yang, Li Shen, Zhenyi Wang, Shiwei Liu, Guibing Guo, Xingwei Wang

In this paper, we first revisit the gradient projection method from the perspective of the flatness of the loss surface, and find that unflatness of the loss surface leads to catastrophic forgetting of old tasks when the projection constraint is relaxed to improve the performance on new tasks.

Continual Learning

Global Balanced Experts for Federated Long-Tailed Learning

1 code implementation • ICCV 2023 • Yaopei Zeng, Lei Liu, Li Liu, Li Shen, Shaoguo Liu, Baoyuan Wu

In particular, a proxy is derived from the accumulated gradients uploaded by the clients after local training, and is shared by all clients as the class prior for re-balance training.

Federated Learning • Privacy Preserving

On Transforming Reinforcement Learning by Transformer: The Development Trajectory

no code implementations • 29 Dec 2022 • Shengchao Hu, Li Shen, Ya Zhang, Yixin Chen, DaCheng Tao

Transformer, originally devised for natural language processing, has also attained significant success in computer vision.

Autonomous Driving • reinforcement-learning • +2

SVSBI: Sequence-based virtual screening of biomolecular interactions

1 code implementation • 27 Dec 2022 • Li Shen, Hongsong Feng, Yuchi Qiu, Guo-Wei Wei

Virtual screening (VS) is an essential technique for understanding biomolecular interactions, particularly in drug design and discovery.

Drug Discovery • Molecular Docking

Rethinking the Role of Pre-Trained Networks in Source-Free Domain Adaptation

no code implementations • ICCV 2023 • Wenyu Zhang, Li Shen, Chuan-Sheng Foo

We propose to distil useful target domain information through a co-learning strategy to improve target pseudo-label quality for fine-tuning the source model.

Representation Learning • Source-Free Domain Adaptation • +1

Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks

no code implementations • 12 Dec 2022 • Linrui Zhang, Qin Zhang, Li Shen, Bo Yuan, Xueqian Wang, DaCheng Tao

Despite a large number of reinforcement learning (RL) methods focusing on safety-critical tasks, there is still a lack of high-quality evaluation of algorithms that adhere to safety constraints at each decision step under complex and unknown dynamics.

Autonomous Driving • reinforcement-learning • +2

4K-NeRF: High Fidelity Neural Radiance Fields at Ultra High Resolutions

1 code implementation • 9 Dec 2022 • Zhongshu Wang, Lingzhi Li, Zhen Shen, Li Shen, Liefeng Bo

In this paper, we present a novel and effective framework, named 4K-NeRF, to pursue high fidelity view synthesis on the challenging scenarios of ultra high resolutions, building on the methodology of neural radiance fields (NeRF).

4k • Decoder • +1

Compressing Volumetric Radiance Fields to 1 MB

1 code implementation • CVPR 2023 • Lingzhi Li, Zhen Shen, Zhongshu Wang, Li Shen, Liefeng Bo

Approximating radiance fields with volumetric grids is one of the promising directions for improving NeRF, represented by methods like Plenoxels and DVGO, which achieve super-fast training convergence and real-time rendering.

Model Compression • Neural Rendering • +1

AdaTask: A Task-aware Adaptive Learning Rate Approach to Multi-task Learning

no code implementations • 28 Nov 2022 • Enneng Yang, Junwei Pan, Ximei Wang, Haibin Yu, Li Shen, Xihua Chen, Lei Xiao, Jie Jiang, Guibing Guo

In this paper, we propose to measure the task dominance degree of a parameter by the total updates of each task on this parameter.

Multi-Task Learning • Recommendation Systems
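
The dominance measure described above can be tracked with a simple accumulator per (task, parameter) pair. A hedged sketch (the names and the normalization are illustrative; the paper's exact accumulation may differ):

    from collections import defaultdict

    class UpdateTracker:
        def __init__(self):
            # (task_id, param_name) -> accumulated update magnitude
            self.totals = defaultdict(float)

        def record(self, task_id, param_name, update_norm: float):
            self.totals[(task_id, param_name)] += update_norm

        def dominance(self, param_name, task_ids):
            # Share of a parameter's total updates contributed by each task.
            sums = {t: self.totals[(t, param_name)] for t in task_ids}
            z = sum(sums.values()) or 1.0
            return {t: s / z for t, s in sums.items()}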

Curriculum-based Asymmetric Multi-task Reinforcement Learning

1 code implementation • 7 Nov 2022 • Hanchi Huang, Deheng Ye, Li Shen, Wei Liu

To mitigate the negative influence of customizing the one-off training order in curriculum-based AMTL, CAMRL switches its training mode between parallel single-task RL and asymmetric multi-task RL (MTRL), according to an indicator regarding the training time, the overall performance, and the performance gap among tasks.

Multi-Task Learning • reinforcement-learning • +1

Streaming Radiance Fields for 3D Video Synthesis

1 code implementation • 26 Oct 2022 • Lingzhi Li, Zhen Shen, Zhongshu Wang, Li Shen, Ping Tan

Instead of training a single model that combines all the frames, we formulate the dynamic modeling problem with an incremental learning paradigm in which per-frame model differences are trained to complement the adaptation of a base model to the current frame.

Incremental Learning • Model Optimization • +1

Boosting the Transferability of Adversarial Attacks with Reverse Adversarial Perturbation

3 code implementations • 12 Oct 2022 • Zeyu Qin, Yanbo Fan, Yi Liu, Li Shen, Yong Zhang, Jue Wang, Baoyuan Wu

Furthermore, RAP can be naturally combined with many existing black-box attack techniques, to further boost the transferability.

Adversarial Attack

Improving Sharpness-Aware Minimization with Fisher Mask for Better Generalization on Language Models

1 code implementation • 11 Oct 2022 • Qihuang Zhong, Liang Ding, Li Shen, Peng Mi, Juhua Liu, Bo Du, DaCheng Tao

Fine-tuning large pretrained language models on a limited training corpus usually suffers from poor generalization.

Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach

2 code implementations • 11 Oct 2022 • Peng Mi, Li Shen, Tianhe Ren, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji, DaCheng Tao

One of the popular solutions is Sharpness-Aware Minimization (SAM), which smooths the loss landscape via minimizing the maximized change of training loss when adding a perturbation to the weight.

Strength-Adaptive Adversarial Training

no code implementations • 4 Oct 2022 • Chaojian Yu, Dawei Zhou, Li Shen, Jun Yu, Bo Han, Mingming Gong, Nannan Wang, Tongliang Liu

Firstly, applying a pre-specified perturbation budget to networks of various model capacities will yield divergent degrees of robustness disparity between natural and robust accuracies, which deviates from the robust network's desideratum.

Adversarial Robustness • Scheduling

Tensor-Based Multi-Modality Feature Selection and Regression for Alzheimer's Disease Diagnosis

1 code implementation • 23 Sep 2022 • Jun Yu, Zhaoming Kong, Liang Zhan, Li Shen, Lifang He

The assessment of Alzheimer's Disease (AD) and Mild Cognitive Impairment (MCI) associated with brain changes remains a challenging task.

feature selection • regression

Meta-Learning with Less Forgetting on Large-Scale Non-Stationary Task Distributions

1 code implementation • 3 Sep 2022 • Zhenyi Wang, Li Shen, Le Fang, Qiuling Suo, Donglin Zhan, Tiehang Duan, Mingchen Gao

Two key challenges arise in this more realistic setting: (i) how to use unlabeled data in the presence of a large amount of unlabeled out-of-distribution (OOD) data; and (ii) how to prevent catastrophic forgetting on previously learned task distributions due to the task distribution shift.

Meta-Learning

Respecting Time Series Properties Makes Deep Time Series Forecasting Perfect

1 code implementation • 22 Jul 2022 • Li Shen, Yuning Wei, Yangzhu Wang

Thanks to the core idea of respecting time series properties, no matter in which forecasting format, RTNet shows clearly superior forecasting performance compared with dozens of other SOTA time series forecasting baselines on three real-world benchmark datasets.

Time Series • Time Series Forecasting

Improving Task-free Continual Learning by Distributionally Robust Memory Evolution

1 code implementation • 15 Jul 2022 • Zhenyi Wang, Li Shen, Le Fang, Qiuling Suo, Tiehang Duan, Mingchen Gao

To address these problems, for the first time, we propose a principled memory evolution framework to dynamically evolve the memory data distribution by making the memory buffer gradually harder to be memorized with distributionally robust optimization (DRO).

Continual Learning

Harnessing Out-Of-Distribution Examples via Augmenting Content and Style

1 code implementation • 7 Jul 2022 • Zhuo Huang, Xiaobo Xia, Li Shen, Bo Han, Mingming Gong, Chen Gong, Tongliang Liu

Machine learning models are vulnerable to Out-Of-Distribution (OOD) examples, and such a problem has drawn much attention.

Data Augmentation • Disentanglement • +3

Online Bilevel Optimization: Regret Analysis of Online Alternating Gradient Methods

1 code implementation • 6 Jul 2022 • Davoud Ataee Tarzanagh, Parvin Nazari, BoJian Hou, Li Shen, Laura Balzano

This paper introduces online bilevel optimization, in which a sequence of time-varying bilevel problems is revealed one after the other.

Bilevel Optimization

Local Sample-weighted Multiple Kernel Clustering with Consensus Discriminative Graph

1 code implementation • 5 Jul 2022 • Liang Li, Siwei Wang, Xinwang Liu, En Zhu, Li Shen, Kenli Li, Keqin Li

Multiple kernel clustering (MKC) is committed to achieving optimal information fusion from a set of base kernels.

Clustering

Dynamic Contrastive Distillation for Image-Text Retrieval

no code implementations • 4 Jul 2022 • Jun Rao, Liang Ding, Shuhan Qi, Meng Fang, Yang Liu, Li Shen, DaCheng Tao

Although vision-and-language pretraining (VLP) equipped cross-modal image-text retrieval (ITR) has achieved remarkable progress in the past two years, it suffers from a major drawback: the ever-increasing size of VLP models restricts their deployment in real-world search scenarios (where high latency is unacceptable).

Contrastive Learning • Image-text Retrieval • +3

Towards Harnessing Feature Embedding for Robust Learning with Noisy Labels

no code implementations • 27 Jun 2022 • Chuang Zhang, Li Shen, Jian Yang, Chen Gong

To exploit this effect, the model prediction-based methods have been widely adopted, which aim to exploit the outputs of DNNs in the early stage of learning to correct noisy labels.

Learning with noisy labels • Memorization

SafeRL-Kit: Evaluating Efficient Reinforcement Learning Methods for Safe Autonomous Driving

1 code implementation • 17 Jun 2022 • Linrui Zhang, Qin Zhang, Li Shen, Bo Yuan, Xueqian Wang

Safe reinforcement learning (RL) has achieved significant success on risk-sensitive tasks and shown promise in autonomous driving (AD) as well.

Autonomous Driving • reinforcement-learning • +2

Understanding Robust Overfitting of Adversarial Training and Beyond

1 code implementation • 17 Jun 2022 • Chaojian Yu, Bo Han, Li Shen, Jun Yu, Chen Gong, Mingming Gong, Tongliang Liu

Here, we explore the causes of robust overfitting by comparing the data distribution of non-overfit (weak adversary) and overfitted (strong adversary) adversarial training, and observe that the adversarial data generated by a weak adversary mainly contains small-loss data.

Adversarial Robustness • Data Ablation

DisPFL: Towards Communication-Efficient Personalized Federated Learning via Decentralized Sparse Training

1 code implementation • 1 Jun 2022 • Rong Dai, Li Shen, Fengxiang He, Xinmei Tian, DaCheng Tao

In this work, we propose a novel personalized federated learning framework in a decentralized (peer-to-peer) communication protocol named Dis-PFL, which employs personalized sparse masks to customize sparse local models on the edge.

Personalized Federated Learning

Robust Weight Perturbation for Adversarial Training

1 code implementation • 30 May 2022 • Chaojian Yu, Bo Han, Mingming Gong, Li Shen, Shiming Ge, Bo Du, Tongliang Liu

Based on these observations, we propose a robust perturbation strategy to constrain the extent of weight perturbation.

Classification

Few-Shot Adaptation of Pre-Trained Networks for Domain Shift

1 code implementation • 30 May 2022 • Wenyu Zhang, Li Shen, Wanyue Zhang, Chuan-Sheng Foo

Recent test-time adaptation methods update batch normalization layers of pre-trained source models deployed in new target environments with streaming data to mitigate such performance degradation.

domain classification • Semantic Segmentation • +1

Efficient-Adam: Communication-Efficient Distributed Adam

no code implementations • 28 May 2022 • Congliang Chen, Li Shen, Wei Liu, Zhi-Quan Luo

Distributed adaptive stochastic gradient methods have been widely used for large-scale nonconvex optimization, such as training deep learning models.

Quantization

MissDAG: Causal Discovery in the Presence of Missing Data with Continuous Additive Noise Models

1 code implementation • 27 May 2022 • Erdun Gao, Ignavier Ng, Mingming Gong, Li Shen, Wei Huang, Tongliang Liu, Kun Zhang, Howard Bondell

In this paper, we develop a general method, which we call MissDAG, to perform causal discovery from data with incomplete observations.

Causal Discovery • Imputation • +1