Search Results for author: DaCheng Tao

Found 613 papers, 255 papers with code

LTF: A Label Transformation Framework for Correcting Label Shift

no code implementations ICML 2020 Jiaxian Guo, Mingming Gong, Tongliang Liu, Kun Zhang, DaCheng Tao

Distribution shift is a major obstacle to the deployment of current deep learning models on real-world problems.

Hallucinating Visual Instances in Total Absentia

no code implementations ECCV 2020 Jiayan Qiu, Yiding Yang, Xinchao Wang, DaCheng Tao

This seemingly minor difference in fact makes HVITA a much more challenging task, as the restoration algorithm would have to not only infer the category of the object in total absentia, but also hallucinate an object whose appearance is consistent with the background.

Image Inpainting

Label-Noise Robust Domain Adaptation

no code implementations ICML 2020 Xiyu Yu, Tongliang Liu, Mingming Gong, Kun Zhang, Kayhan Batmanghelich, DaCheng Tao

Domain adaptation aims to correct the classifiers when faced with distribution shift between source (training) and target (test) domains.

Denoising Domain Adaptation

Polysemy Deciphering Network for Human-Object Interaction Detection

1 code implementation ECCV 2020 Xubin Zhong, Changxing Ding, Xian Qu, DaCheng Tao

First, PD-Net augments human pose and spatial features for HOI detection using language priors, enabling the verb classifiers to receive language hints that reduce the intra-class variation of the same verb.

Human-Object Interaction Detection Scene Understanding

Deep Streaming Label Learning

1 code implementation ICML 2020 Zhen Wang, Liu Liu, DaCheng Tao

In order to fill in these research gaps, we propose a novel deep neural network (DNN) based framework, Deep Streaming Label Learning (DSLL), to classify instances with newly emerged labels effectively.

Multi-Label Learning

Redistributing Low-Frequency Words: Making the Most of Monolingual Data in Non-Autoregressive Translation

1 code implementation ACL 2022 Liang Ding, Longyue Wang, Shuming Shi, DaCheng Tao, Zhaopeng Tu

In this work, we provide an appealing alternative for NAT – monolingual KD, which trains NAT student on external monolingual data with AT teacher trained on the original bilingual data.

Knowledge Distillation Translation +1

On Dropping Clusters to Regularize Graph Convolutional Neural Networks

no code implementations ECCV 2020 Xikun Zhang, Chang Xu, DaCheng Tao

Dropout has been widely adopted to regularize graph convolutional networks (GCNs) by randomly zeroing entries of the node feature vectors and obtains promising performance on various tasks.

Action Recognition Skeleton Based Action Recognition

Divide, Conquer, and Combine: Mixture of Semantic-Independent Experts for Zero-Shot Dialogue State Tracking

no code implementations 1 Jun 2023 Qingyue Wang, Liang Ding, Yanan Cao, Yibing Zhan, Zheng Lin, Shi Wang, DaCheng Tao, Li Guo

Zero-shot transfer learning for Dialogue State Tracking (DST) helps to handle a variety of task-oriented dialogue domains without the cost of collecting in-domain data.

Cocktail: Mixing Multi-Modality Controls for Text-Conditional Image Generation

no code implementations 1 Jun 2023 Minghui Hu, Jianbin Zheng, Daqing Liu, Chuanxia Zheng, Chaoyue Wang, DaCheng Tao, Tat-Jen Cham

In this work, we propose Cocktail, a pipeline to mix various modalities into one embedding, amalgamated with a generalized ControlNet (gControlNet), a controllable normalisation (ControlNorm), and a spatial guidance sampling method, to actualize multi-modal and spatially-refined control for text-conditional diffusion models.

DeepSolo++: Let Transformer Decoder with Explicit Points Solo for Text Spotting

1 code implementation 31 May 2023 Maoyuan Ye, Jing Zhang, Shanshan Zhao, Juhua Liu, Tongliang Liu, Bo Du, DaCheng Tao

On the other hand, based on the extensibility of DeepSolo, we launch DeepSolo++ for multilingual text spotting, making a further step to let Transformer decoder with explicit points solo for multilingual text detection, recognition, and script identification all at once.

Learning to Learn from APIs: Black-Box Data-Free Meta-Learning

1 code implementation 28 May 2023 Zixuan Hu, Li Shen, Zhenyi Wang, Baoyuan Wu, Chun Yuan, DaCheng Tao

Data-free meta-learning (DFML) aims to enable efficient learning of new tasks by meta-learning from a collection of pre-trained models without access to the training data.

Knowledge Distillation Meta-Learning

Learning Better with Less: Effective Augmentation for Sample-Efficient Visual Reinforcement Learning

no code implementations 25 May 2023 Guozheng Ma, Linrui Zhang, Haoyu Wang, Lu Li, Zilin Wang, Zhen Wang, Li Shen, Xueqian Wang, DaCheng Tao

Taking the non-stationary nature of RL into account, we propose an RL-tailored multi-type DA fusion scheme called Cycling Augmentation (CycAug), which performs periodic cycles of different DA operations to increase type diversity while maintaining data distribution consistency.

Data Augmentation reinforcement-learning +1

Self-Evolution Learning for Discriminative Language Model Pretraining

no code implementations 24 May 2023 Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du, DaCheng Tao

Masked language modeling, widely used in discriminative language model (e.g., BERT) pretraining, commonly adopts a random masking strategy.

Language Modelling Masked Language Modeling

Revisiting Token Dropping Strategy in Efficient BERT Pretraining

no code implementations 24 May 2023 Qihuang Zhong, Liang Ding, Juhua Liu, Xuebo Liu, Min Zhang, Bo Du, DaCheng Tao

Token dropping is a recently-proposed strategy to speed up the pretraining of masked language models, such as BERT, by skipping the computation of a subset of the input tokens at several middle layers.

Towards More Suitable Personalization in Federated Learning via Decentralized Partial Model Training

no code implementations 24 May 2023 Yifan Shi, Yingqi Liu, Yan Sun, Zihao Lin, Li Shen, Xueqian Wang, DaCheng Tao

Personalized federated learning (PFL) aims to produce the best personalized model for each client in order to cope with a seemingly insurmountable problem in real FL systems: data heterogeneity.

Personalized Federated Learning

Improving Heterogeneous Model Reuse by Density Estimation

1 code implementation 23 May 2023 Anke Tang, Yong Luo, Han Hu, Fengxiang He, Kehua Su, Bo Du, Yixin Chen, DaCheng Tao

This paper studies multiparty learning, aiming to learn a model using the private data of different participants.

Density Estimation Selection bias

Self-Evolution Learning for Mixup: Enhance Data Augmentation on Few-Shot Text Classification Tasks

no code implementations 22 May 2023 Haoqi Zheng, Qihuang Zhong, Liang Ding, Zhiliang Tian, Xin Niu, Dongsheng Li, DaCheng Tao

However, most mixup methods do not consider the varying degree of learning difficulty in different stages of training, and they generate new samples with one-hot labels, resulting in model overconfidence.

Data Augmentation Few-Shot Text Classification +2

VanillaNet: the Power of Minimalism in Deep Learning

1 code implementation 22 May 2023 Hanting Chen, Yunhe Wang, Jianyuan Guo, DaCheng Tao

In this study, we introduce VanillaNet, a neural network architecture that embraces elegance in design.

Philosophy

Dynamic Regularized Sharpness Aware Minimization in Federated Learning: Approaching Global Consistency and Smooth Landscape

no code implementations 19 May 2023 Yan Sun, Li Shen, Shixiang Chen, Liang Ding, DaCheng Tao

In federated learning (FL), a cluster of local clients are chaired under the coordination of the global server and cooperatively train one model with privacy protection.

Federated Learning

Prompt-Tuning Decision Transformer with Preference Ranking

no code implementations 16 May 2023 Shengchao Hu, Li Shen, Ya Zhang, DaCheng Tao

Our work contributes to the advancement of prompt-tuning approaches in RL, providing a promising direction for optimizing large RL agents for specific preference tasks.

MMoT: Mixture-of-Modality-Tokens Transformer for Composed Multimodal Conditional Image Synthesis

no code implementations 10 May 2023 Jianbin Zheng, Daqing Liu, Chaoyue Wang, Minghui Hu, Zuopeng Yang, Changxing Ding, DaCheng Tao

To this end, we propose to generate images conditioned on the compositions of multimodal control signals, where modalities are imperfectly complementary, i.e., composed multimodal conditional image synthesis (CMCIS).

Image Generation

Revolutionizing Agrifood Systems with Artificial Intelligence: A Survey

no code implementations 3 May 2023 Tao Chen, Liang Lv, Di Wang, Jing Zhang, Yue Yang, Zeyang Zhao, Chen Wang, Xiaowei Guo, Hao Chen, Qingye Wang, Yufei Xu, Qiming Zhang, Bo Du, Liangpei Zhang, DaCheng Tao

With the world population rapidly increasing, transforming our agrifood systems to be more productive, efficient, safe, and sustainable is crucial to mitigate potential food shortages.

Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model

2 code implementations 3 May 2023 Di Wang, Jing Zhang, Bo Du, DaCheng Tao, Liangpei Zhang

The success of the Segment Anything Model (SAM) demonstrates the significance of data-centric machine learning.

Instance Segmentation object-detection +2

Scalable Mask Annotation for Video Text Spotting

1 code implementation 2 May 2023 Haibin He, Jing Zhang, Mengyang Xu, Juhua Liu, Bo Du, DaCheng Tao

Video text spotting refers to localizing, recognizing, and tracking textual elements such as captions, logos, license plates, signs, and other forms of text within consecutive video frames.

Text Spotting

Towards the Flatter Landscape and Better Generalization in Federated Learning under Client-level Differential Privacy

1 code implementation 1 May 2023 Yifan Shi, Kang Wei, Li Shen, Yingqi Liu, Xueqian Wang, Bo Yuan, DaCheng Tao

To defend against inference attacks and mitigate sensitive information leakage in Federated Learning (FL), client-level Differentially Private FL (DPFL) is the de-facto standard for privacy protection by clipping local updates and adding random noise.

Federated Learning

Deep Graph Reprogramming

no code implementations CVPR 2023 Yongcheng Jing, Chongbin Yuan, Li Ju, Yiding Yang, Xinchao Wang, DaCheng Tao

In this paper, we explore a novel model reusing task tailored for graph neural networks (GNNs), termed as "deep graph reprogramming".

3D Object Recognition Action Recognition +1

Compositional 3D Human-Object Neural Animation

no code implementations 27 Apr 2023 Zhi Hou, Baosheng Yu, DaCheng Tao

Human-object interactions (HOIs) are crucial for human-centric scene understanding applications such as human-centric visual generation, AR/VR, and robotics.

Human-Object Interaction Detection Scene Understanding

Segment Anything in Non-Euclidean Domains: Challenges and Opportunities

no code implementations 23 Apr 2023 Yongcheng Jing, Xinchao Wang, DaCheng Tao

The recent work known as Segment Anything (SA) has made significant strides in pushing the boundaries of semantic segmentation into the era of foundation models.

Image Inpainting object-detection +2

HKNAS: Classification of Hyperspectral Imagery Based on Hyper Kernel Neural Architecture Search

1 code implementation 23 Apr 2023 Di Wang, Bo Du, Liangpei Zhang, DaCheng Tao

Recent neural architecture search (NAS) based approaches have made great progress in hyperspectral image (HSI) classification tasks.

Neural Architecture Search

DCN-T: Dual Context Network with Transformer for Hyperspectral Image Classification

2 code implementations 19 Apr 2023 Di Wang, Jing Zhang, Bo Du, Liangpei Zhang, DaCheng Tao

Hyperspectral image (HSI) classification is challenging due to spatial variability caused by complex imaging conditions.

Hyperspectral Image Classification Image Generation

Event-based Simultaneous Localization and Mapping: A Comprehensive Survey

1 code implementation 19 Apr 2023 Kunping Huang, Sen Zhang, Jing Zhang, DaCheng Tao

This paper presents a timely and comprehensive review of event-based vSLAM algorithms that exploit the benefits of asynchronous and irregular event streams for localization and mapping tasks.

Motion Compensation Simultaneous Localization and Mapping

UVA: Towards Unified Volumetric Avatar for View Synthesis, Pose rendering, Geometry and Texture Editing

no code implementations 14 Apr 2023 Jinlong Fan, Jing Zhang, DaCheng Tao

Experiments on multiple human avatars demonstrate that our UVA achieves competitive results in novel view synthesis and novel pose rendering while enabling local and independent editing of geometry and appearance.

Novel View Synthesis

Deep Image Matting: A Comprehensive Survey

1 code implementation 10 Apr 2023 Jizhizi Li, Jing Zhang, DaCheng Tao

Image matting refers to extracting a precise alpha matte from natural images, and it plays a critical role in various downstream applications, such as image editing.

Image Matting Referring Image Matting

On Efficient Training of Large-Scale Deep Learning Models: A Literature Review

no code implementations 7 Apr 2023 Li Shen, Yan Sun, Zhiyuan Yu, Liang Ding, Xinmei Tian, DaCheng Tao

The field of deep learning has witnessed significant progress, particularly in computer vision (CV), natural language processing (NLP), and speech.

Quantum Imitation Learning

no code implementations 4 Apr 2023 Zhihao Cheng, Kaining Zhang, Li Shen, DaCheng Tao

Despite remarkable successes in solving various complex decision-making tasks, training an imitation learning (IL) algorithm with deep neural networks (DNNs) suffers from the high computation burden.

Behavioural cloning

VTAE: Variational Transformer Autoencoder with Manifolds Learning

no code implementations 3 Apr 2023 Pourya Shamsolmoali, Masoumeh Zareapoor, Huiyu Zhou, DaCheng Tao, Xuelong Li

This weak projection, however, can be addressed by a Riemannian metric, and we show that geodesics computation and accurate interpolations between data samples on the Riemannian manifold can substantially improve the performance of deep generative models.

Representation Learning

BEVSimDet: Simulated Multi-modal Distillation in Bird's-Eye View for Multi-view 3D Object Detection

1 code implementation 29 Mar 2023 Haimei Zhao, Qiming Zhang, Shanshan Zhao, Jing Zhang, DaCheng Tao

In this paper, we approach this challenge from the perspective of both architecture design and knowledge distillation and present a new simulated multi-modal 3D object detection method named BEVSimDet.

3D Object Detection Knowledge Distillation +1

Vision Transformer with Quadrangle Attention

1 code implementation 27 Mar 2023 Qiming Zhang, Jing Zhang, Yufei Xu, DaCheng Tao

Window-based attention has become a popular choice in vision transformers due to its superior performance, lower computational complexity, and smaller memory footprint.

object-detection Object Detection +2

Towards Making the Most of ChatGPT for Machine Translation

1 code implementation 24 Mar 2023 Keqin Peng, Liang Ding, Qihuang Zhong, Li Shen, Xuebo Liu, Min Zhang, Yuanxin Ouyang, DaCheng Tao

We show that: 1) The performance of ChatGPT depends largely on temperature, and a lower temperature usually achieves better performance; 2) Emphasizing the task information further improves ChatGPT's performance, particularly in complex MT tasks; 3) Introducing domain information can elicit ChatGPT's generalization ability and improve its performance in the specific domain; 4) ChatGPT tends to generate hallucinations for non-English-centric MT tasks, which can be partially addressed by our proposed prompts but still needs to be highlighted for the MT/NLP community.

Machine Translation Translation +1

Error Analysis Prompting Enables Human-Like Translation Evaluation in Large Language Models: A Case Study on ChatGPT

1 code implementation 24 Mar 2023 Qingyu Lu, Baopu Qiu, Liang Ding, Liping Xie, DaCheng Tao

Our results indicate that by combining Chain-of-Thoughts and Error Analysis, a new prompting method called Error Analysis Prompting, LLMs like ChatGPT can generate human-like MT evaluations at both the system and segment level.

Machine Translation Natural Language Understanding +3

Make Landscape Flatter in Differentially Private Federated Learning

1 code implementation CVPR 2023 Yifan Shi, Yingqi Liu, Kang Wei, Li Shen, Xueqian Wang, DaCheng Tao

To defend against inference attacks and mitigate sensitive information leakage in Federated Learning (FL), client-level Differentially Private FL (DPFL) is the de-facto standard for privacy protection by clipping local updates and adding random noise.

Federated Learning

Architecture, Dataset and Model-Scale Agnostic Data-free Meta-Learning

1 code implementation CVPR 2023 Zixuan Hu, Li Shen, Zhenyi Wang, Tongliang Liu, Chun Yuan, DaCheng Tao

The goal of data-free meta-learning is to learn useful prior knowledge from a collection of pre-trained models without accessing their training data.

Meta-Learning

Deep Learning for Camera Calibration and Beyond: A Survey

1 code implementation 19 Mar 2023 Kang Liao, Lang Nie, Shujuan Huang, Chunyu Lin, Jing Zhang, Yao Zhao, Moncef Gabbouj, DaCheng Tao

In this paper, we provide a comprehensive survey of learning-based camera calibration techniques, by analyzing their strengths and limitations.

Camera Calibration

Sensitivity-Aware Visual Parameter-Efficient Tuning

1 code implementation 15 Mar 2023 Haoyu He, Jianfei Cai, Jing Zhang, DaCheng Tao, Bohan Zhuang

Visual Parameter-Efficient Tuning (VPET) has become a powerful alternative to full fine-tuning for adapting pre-trained vision models to downstream tasks: it tunes only a small number of parameters while freezing the vast majority of them, easing the storage burden and optimization difficulty.

Visual Prompt Based Personalized Federated Learning

no code implementations 15 Mar 2023 Guanghao Li, Wansen Wu, Yan Sun, Li Shen, Baoyuan Wu, DaCheng Tao

Then, the local model is trained on the input composed of raw data and a visual prompt to learn the distribution information contained in the prompt.

Image Classification Personalized Federated Learning

Upcycling Models under Domain and Category Shift

2 code implementations CVPR 2023 Sanqing Qu, Tianpei Zou, Florian Roehrbein, Cewu Lu, Guang Chen, DaCheng Tao, Changjun Jiang

We examine the superiority of our GLC on multiple benchmarks with different category shift scenarios, including partial-set, open-set, and open-partial-set DA.

Source-Free Domain Adaptation Universal Domain Adaptation +1

Centroid-centered Modeling for Efficient Vision Transformer Pre-training

no code implementations 8 Mar 2023 Xin Yan, Zuchao Li, Lefei Zhang, Bo Du, DaCheng Tao

Our proposed approach, CCViT, leverages k-means clustering to obtain centroids for image modeling without supervised training of a tokenizer model.

Semantic Segmentation

Graph Decision Transformer

no code implementations 7 Mar 2023 Shengchao Hu, Li Shen, Ya Zhang, DaCheng Tao

Offline reinforcement learning (RL) is a challenging task, whose objective is to learn policies from static trajectory data without interacting with the environment.

Offline RL OpenAI Gym +1

ESceme: Vision-and-Language Navigation with Episodic Scene Memory

1 code implementation 2 Mar 2023 Qi Zheng, Daqing Liu, Chaoyue Wang, Jing Zhang, Dadong Wang, DaCheng Tao

Vision-and-language navigation (VLN) simulates a visual agent that follows natural-language navigation instructions in real-world scenes.

Vision and Language Navigation

AdaSAM: Boosting Sharpness-Aware Minimization with Adaptive Learning Rate and Momentum for Training Deep Neural Networks

no code implementations 1 Mar 2023 Hao Sun, Li Shen, Qihuang Zhong, Liang Ding, Shixiang Chen, Jingwei Sun, Jing Li, Guangzhong Sun, DaCheng Tao

Integrating SAM with adaptive learning rate and momentum acceleration, dubbed AdaSAM, has already been explored empirically to train large-scale deep neural networks without theoretical guarantee due to the triple difficulties in analyzing the coupled perturbation step, adaptive learning rate and momentum step.

Subspace based Federated Unlearning

no code implementations 24 Feb 2023 Guanghao Li, Li Shen, Yan Sun, Yue Hu, Han Hu, DaCheng Tao

Federated learning (FL) enables multiple clients to train a machine learning model collaboratively without exchanging their local data.

Federated Learning

Learning to Generalize Provably in Learning to Optimize

1 code implementation 22 Feb 2023 Junjie Yang, Tianlong Chen, Mingkang Zhu, Fengxiang He, DaCheng Tao, Yingbin Liang, Zhangyang Wang

While the optimizer generalization has been recently studied, the optimizee generalization (or learning to generalize) has not been rigorously studied in the L2O context, which is the aim of this paper.

Fusion of Global and Local Knowledge for Personalized Federated Learning

1 code implementation 21 Feb 2023 Tiansheng Huang, Li Shen, Yan Sun, Weiwei Lin, DaCheng Tao

Personalized federated learning, as a variant of federated learning, trains customized models for clients using their heterogeneously distributed data.

Personalized Federated Learning

FedSpeed: Larger Local Interval, Less Communication Round, and Higher Generalization Accuracy

no code implementations 21 Feb 2023 Yan Sun, Li Shen, Tiansheng Huang, Liang Ding, DaCheng Tao

Federated learning is an emerging distributed machine learning framework which jointly trains a global model via a large number of local devices with data privacy protections.

Federated Learning

Pseudo Contrastive Learning for Graph-based Semi-supervised Learning

no code implementations 19 Feb 2023 Weigang Lu, Ziyu Guan, Wei Zhao, Yaming Yang, Yuanhai Lv, Baosheng Yu, DaCheng Tao

Pseudo Labeling is a technique used to improve the performance of semi-supervised Graph Neural Networks (GNNs) by generating additional pseudo-labels based on confident predictions.

Contrastive Learning Data Augmentation

X-Adv: Physical Adversarial Object Attacks against X-ray Prohibited Item Detection

1 code implementation 19 Feb 2023 Aishan Liu, Jun Guo, Jiakai Wang, Siyuan Liang, Renshuai Tao, Wenbo Zhou, Cong Liu, Xianglong Liu, DaCheng Tao

In this paper, we take the first step toward the study of adversarial attacks targeted at X-ray prohibited item detection, and reveal the serious threats posed by such attacks in this safety-critical scenario.

Adversarial Attack

Can ChatGPT Understand Too? A Comparative Study on ChatGPT and Fine-tuned BERT

1 code implementation 19 Feb 2023 Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du, DaCheng Tao

Recently, ChatGPT has attracted great attention, as it can generate fluent and high-quality responses to human inquiries.

Question Answering Sentiment Analysis

Bag of Tricks for Effective Language Model Pretraining and Downstream Adaptation: A Case Study on GLUE

no code implementations 18 Feb 2023 Qihuang Zhong, Liang Ding, Keqin Peng, Juhua Liu, Bo Du, Li Shen, Yibing Zhan, DaCheng Tao

This technical report briefly describes our JDExplore d-team's submission Vega v1 on the General Language Understanding Evaluation (GLUE) leaderboard, where GLUE is a collection of nine natural language understanding tasks, including question answering, linguistic acceptability, sentiment analysis, text similarity, paraphrase detection, and natural language inference.

Contrastive Learning Denoising +11

Deep Learning for Event-based Vision: A Comprehensive Survey and Benchmarks

no code implementations 17 Feb 2023 Xu Zheng, Yexin Liu, Yunfan Lu, Tongyan Hua, Tianbo Pan, Weiming Zhang, DaCheng Tao, Lin Wang

Being capable of capturing information in challenging visual conditions, event cameras have the potential to overcome the limitations of frame-based cameras in the computer vision and robotics community.

Event-based vision Image Reconstruction +3

FedABC: Targeting Fair Competition in Personalized Federated Learning

no code implementations 15 Feb 2023 Dui Wang, Li Shen, Yong Luo, Han Hu, Kehua Su, Yonggang Wen, DaCheng Tao

In particular, we adopt the "one-vs-all" training strategy in each client to alleviate the unfair competition between classes by constructing a personalized binary classification problem for each class.

Binary Classification Personalized Federated Learning

Enhance Local Consistency in Federated Learning: A Multi-Step Inertial Momentum Approach

no code implementations 11 Feb 2023 Yixing Liu, Yan Sun, Zhengtao Ding, Li Shen, Bo Liu, DaCheng Tao

Federated learning (FL), as a collaborative distributed training paradigm with several edge computing devices under the coordination of a centralized server, is plagued by inconsistent local stationary points due to the heterogeneity of the local partial participation clients, which precipitates the local client-drifts problems and sparks off the unstable and slow convergence, especially on the aggravated heterogeneous dataset.

Edge-computing Federated Learning

PointWavelet: Learning in Spectral Domain for 3D Point Cloud Analysis

no code implementations 10 Feb 2023 Cheng Wen, Jianzhi Long, Baosheng Yu, DaCheng Tao

In this paper, we introduce a new method, PointWavelet, to explore local graphs in the spectral domain via a learnable graph wavelet transform.

Autonomous Driving Point Cloud Classification

Improving the Model Consistency of Decentralized Federated Learning

no code implementations 8 Feb 2023 Yifan Shi, Li Shen, Kang Wei, Yan Sun, Bo Yuan, Xueqian Wang, DaCheng Tao

To mitigate the privacy leakages and communication burdens of Federated Learning (FL), decentralized FL (DFL) discards the central server and each client only communicates with its neighbors in a decentralized communication network.

Federated Learning

AniPixel: Towards Animatable Pixel-Aligned Human Avatar

no code implementations 7 Feb 2023 Jinlong Fan, Jing Zhang, Zhi Hou, DaCheng Tao

In this paper, we propose AniPixel, a novel animatable and generalizable human avatar reconstruction method that leverages pixel-aligned features for body geometry prediction and RGB color blending.

Neural Collapse Inspired Feature-Classifier Alignment for Few-Shot Class Incremental Learning

1 code implementation 6 Feb 2023 Yibo Yang, Haobo Yuan, Xiangtai Li, Zhouchen Lin, Philip Torr, DaCheng Tao

In this paper, we deal with this misalignment dilemma in FSCIL inspired by the recently discovered phenomenon named neural collapse, which reveals that the last-layer features of the same class will collapse into a vertex, and the vertices of all classes are aligned with the classifier prototypes, which are formed as a simplex equiangular tight frame (ETF).

class-incremental learning Few-Shot Class-Incremental Learning +1

Domain Re-Modulation for Few-Shot Generative Domain Adaptation

no code implementations 6 Feb 2023 Yi Wu, Ziqiang Li, Chaoyue Wang, Heliang Zheng, Shanshan Zhao, Bin Li, DaCheng Tao

Specifically, DoRM freezes the source generator and introduces new mapping and affine modules (M&A modules) to capture the attributes of the target domain during GDA.

Domain Adaptation

Neural Collapse Inspired Feature-Classifier Alignment for Few-Shot Class-Incremental Learning

1 code implementation ICLR 2023 Yibo Yang, Haobo Yuan, Xiangtai Li, Zhouchen Lin, Philip Torr, DaCheng Tao

In this paper, we deal with this misalignment dilemma in FSCIL inspired by the recently discovered phenomenon named neural collapse, which reveals that the last-layer features of the same class will collapse into a vertex, and the vertices of all classes are aligned with the classifier prototypes, which are formed as a simplex equiangular tight frame (ETF).

Ranked #2 on Few-Shot Class-Incremental Learning on CUB-200-2011 (Average Accuracy metric)

class-incremental learning Few-Shot Class-Incremental Learning +1

Global Nash Equilibrium in Non-convex Multi-player Game: Theory and Algorithms

no code implementations 19 Jan 2023 Guanpu Chen, Gehui Xu, Fengxiang He, Yiguang Hong, Leszek Rutkowski, DaCheng Tao

This paper takes conjugate transformation to the formulation of non-convex multi-player games, and casts the complementary problem into a variational inequality (VI) problem with a continuous pseudo-gradient mapping.

A Survey of Self-Supervised Learning from Multiple Perspectives: Algorithms, Theory, Applications and Future Trends

1 code implementation 13 Jan 2023 Jie Gui, Tuo Chen, Qiong Cao, Zhenan Sun, Hao Luo, DaCheng Tao

To avoid the expensive cost incurred by collecting and labeling too many examples, as a subset of unsupervised learning, self-supervised learning (SSL) was proposed to learn good features from many unlabeled examples without any human-annotated labels.

Self-Supervised Learning

A Comprehensive Survey of Dataset Distillation

1 code implementation 13 Jan 2023 Shiye Lei, DaCheng Tao

Dataset distillation, a dataset reduction method, addresses this problem by synthesizing a small typical dataset from substantial data and has attracted much attention from the deep learning community.

Meta-Learning

PanopticPartFormer++: A Unified and Decoupled View for Panoptic Part Segmentation

1 code implementation 3 Jan 2023 Xiangtai Li, Shilin Xu, Yibo Yang, Haobo Yuan, Guangliang Cheng, Yunhai Tong, Zhouchen Lin, Ming-Hsuan Yang, DaCheng Tao

Third, inspired by Mask2Former, based on our meta-architecture, we propose Panoptic-PartFormer++ and design a new part-whole cross-attention scheme to boost part segmentation qualities further.

Panoptic Segmentation

From Images to Textual Prompts: Zero-Shot Visual Question Answering With Frozen Large Language Models

no code implementations CVPR 2023 Jiaxian Guo, Junnan Li, Dongxu Li, Anthony Meng Huat Tiong, Boyang Li, DaCheng Tao, Steven Hoi

To address this issue, we propose Img2Prompt, a plug-and-play module that provides the prompts that can bridge the aforementioned modality and task disconnections, so that LLMs can perform zero-shot VQA tasks without end-to-end training.

Question Answering Visual Question Answering

Leverage Interactive Affinity for Affordance Learning

no code implementations CVPR 2023 Hongchen Luo, Wei Zhai, Jing Zhang, Yang Cao, DaCheng Tao

Perceiving potential "action possibilities" (i.e., affordance) regions of images and learning interactive functionalities of objects from human demonstration is a challenging task due to the diversity of human-object interactions.

Human-Object Interaction Detection

Exploring the Relationship Between Architectural Design and Adversarially Robust Generalization

no code implementations CVPR 2023 Aishan Liu, Shiyu Tang, Siyuan Liang, Ruihao Gong, Boxi Wu, Xianglong Liu, DaCheng Tao

In particular, we comprehensively evaluated 20 most representative adversarially trained architectures on ImageNette and CIFAR-10 datasets towards multiple l_p-norm adversarial attacks.

Learnable Skeleton-Aware 3D Point Cloud Sampling

no code implementations CVPR 2023 Cheng Wen, Baosheng Yu, DaCheng Tao

In this paper, we introduce a new skeleton-aware learning-to-sample method by learning object skeletons as the prior knowledge to preserve the object geometry and topology information during sampling.

Point Cloud Classification Retrieval

On Transforming Reinforcement Learning by Transformer: The Development Trajectory

no code implementations 29 Dec 2022 Shengchao Hu, Li Shen, Ya Zhang, Yixin Chen, DaCheng Tao

Transformer, originally devised for natural language processing, has also attested significant success in computer vision.

Autonomous Driving reinforcement-learning +2

Demystify Problem-Dependent Power of Quantum Neural Networks on Multi-Class Classification

no code implementations 29 Dec 2022 Yuxuan Du, Yibo Yang, DaCheng Tao, Min-Hsiu Hsieh

Using these findings, we propose a method that uses loss dynamics to probe whether a QC may be more effective than a classical classifier on a particular learning task.

Multi-class Classification

From Images to Textual Prompts: Zero-shot VQA with Frozen Large Language Models

1 code implementation 21 Dec 2022 Jiaxian Guo, Junnan Li, Dongxu Li, Anthony Meng Huat Tiong, Boyang Li, DaCheng Tao, Steven C. H. Hoi

To address this issue, we propose Img2Prompt, a plug-and-play module that provides prompts that can bridge the aforementioned modality and task disconnections, so that LLMs can perform zero-shot VQA tasks without end-to-end training.

Question Answering Visual Question Answering

Toward Human-Like Evaluation for Natural Language Generation with Error Analysis

1 code implementation 20 Dec 2022 Qingyu Lu, Liang Ding, Liping Xie, Kanjian Zhang, Derek F. Wong, DaCheng Tao

To this end, we augment BARTScore by incorporating the human-like error analysis strategies, namely BARTScore++, where the final score consists of both the evaluations of major errors and minor errors.

Language Modelling Machine Translation +2

Original or Translated? On the Use of Parallel Data for Translation Quality Estimation

no code implementations 20 Dec 2022 Baopu Qiu, Liang Ding, Di wu, Lin Shang, Yibing Zhan, DaCheng Tao

Machine Translation Quality Estimation (QE) is the task of evaluating translation output in the absence of human-written references.

Data Augmentation Machine Translation +1

ContraFeat: Contrasting Deep Features for Semantic Discovery

no code implementations 14 Dec 2022 Xinqi Zhu, Chang Xu, DaCheng Tao

In this paper, we propose a model that automates this process and achieves state-of-the-art semantic discovery performance.

Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks

no code implementations 12 Dec 2022 Linrui Zhang, Qin Zhang, Li Shen, Bo Yuan, Xueqian Wang, DaCheng Tao

Despite a large number of reinforcement learning (RL) methods focusing on safety-critical tasks, there is still a lack of high-quality evaluation of those algorithms that adhere to safety constraints at each decision step under complex and unknown dynamics.

Autonomous Driving reinforcement-learning +2

Diff-Font: Diffusion Model for Robust One-Shot Font Generation

1 code implementation 12 Dec 2022 Haibin He, Xinyuan Chen, Chaoyue Wang, Juhua Liu, Bo Du, DaCheng Tao, Yu Qiao

Specifically, a large stroke-wise dataset is constructed, and a stroke-wise diffusion model is proposed to preserve the structure and the completion of each generated character.

Font Generation

ViTPose+: Vision Transformer Foundation Model for Generic Body Pose Estimation

1 code implementation 7 Dec 2022 Yufei Xu, Jing Zhang, Qiming Zhang, DaCheng Tao

In this paper, we show the surprisingly good properties of plain vision transformers for body pose estimation from various aspects, namely simplicity in model structure, scalability in model size, flexibility in training paradigm, and transferability of knowledge between models, through a simple baseline model dubbed ViTPose.

Animal Pose Estimation Keypoint Detection

Learning to Learn Better for Video Object Segmentation

1 code implementation 5 Dec 2022 Meng Lan, Jing Zhang, Lefei Zhang, DaCheng Tao

Recently, the joint learning framework (JOINT) integrates matching based transductive reasoning and online inductive learning to achieve accurate and robust semi-supervised video object segmentation (SVOS).

Semantic Segmentation Semi-Supervised Video Object Segmentation +1

Improving Simultaneous Machine Translation with Monolingual Data

1 code implementation 2 Dec 2022 Hexuan Deng, Liang Ding, Xuebo Liu, Meishan Zhang, DaCheng Tao, Min Zhang

Preliminary experiments on En-Zh and En-Ja news domain corpora demonstrate that monolingual data can significantly improve translation quality (e.g., +3.15 BLEU on En-Zh).

Knowledge Distillation Machine Translation +2

AL-iGAN: An Active Learning Framework for Tunnel Geological Reconstruction Based on TBM Operational Data

no code implementations 2 Dec 2022 Hao Wang, Lixue Liu, Xueguan Song, Chao Zhang, DaCheng Tao

In tunnel boring machine (TBM) underground projects, an accurate description of the rock-soil types distributed in the tunnel can decrease the construction risk (e.g., surface settlement and landslides) and improve the efficiency of construction.

Active Learning

Unified Discrete Diffusion for Simultaneous Vision-Language Generation

1 code implementation 27 Nov 2022 Minghui Hu, Chuanxia Zheng, Heliang Zheng, Tat-Jen Cham, Chaoyue Wang, Zuopeng Yang, DaCheng Tao, Ponnuthurai N. Suganthan

The recently developed discrete diffusion models perform extraordinarily well in the text-to-image task, showing significant promise for handling the multi-modality signals.

multimodal generation Text Generation +1

Knowledge-Aware Federated Active Learning with Non-IID Data

1 code implementation 24 Nov 2022 Yu-Tong Cao, Jingya Wang, Ye Shi, Baosheng Yu, DaCheng Tao

In this paper, we propose a federated active learning paradigm to efficiently learn a global model with limited annotation budget while protecting data privacy in a decentralized learning way.

Active Learning Federated Learning

1st Workshop on Maritime Computer Vision (MaCVi) 2023: Challenge Results

no code implementations 24 Nov 2022 Benjamin Kiefer, Matej Kristan, Janez Perš, Lojze Žust, Fabio Poiesi, Fabio Augusto de Alcantara Andrade, Alexandre Bernardino, Matthew Dawkins, Jenni Raitoharju, Yitong Quan, Adem Atmaca, Timon Höfer, Qiming Zhang, Yufei Xu, Jing Zhang, DaCheng Tao, Lars Sommer, Raphael Spraul, Hangyue Zhao, Hongpu Zhang, Yanyun Zhao, Jan Lukas Augustin, Eui-ik Jeon, Impyeong Lee, Luca Zedda, Andrea Loddo, Cecilia Di Ruberto, Sagar Verma, Siddharth Gupta, Shishir Muralidhara, Niharika Hegde, Daitao Xing, Nikolaos Evangeliou, Anthony Tzes, Vojtěch Bartl, Jakub Špaňhel, Adam Herout, Neelanjan Bhowmik, Toby P. Breckon, Shivanand Kundargi, Tejas Anvekar, Chaitra Desai, Ramesh Ashok Tabib, Uma Mudengudi, Arpita Vats, Yang song, Delong Liu, Yonglin Li, Shuman Li, Chenhao Tan, Long Lan, Vladimir Somers, Christophe De Vleeschouwer, Alexandre Alahi, Hsiang-Wei Huang, Cheng-Yen Yang, Jenq-Neng Hwang, Pyong-Kun Kim, Kwangju Kim, Kyoungoh Lee, Shuai Jiang, Haiwen Li, Zheng Ziqiang, Tuan-Anh Vu, Hai Nguyen-Truong, Sai-Kit Yeung, Zhuang Jia, Sophia Yang, Chih-Chung Hsu, Xiu-Yu Hou, Yu-An Jhang, Simon Yang, Mau-Tsuen Yang

The 1st Workshop on Maritime Computer Vision (MaCVi) 2023 focused on maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicles (USV), and organized several subchallenges in this domain: (i) UAV-based Maritime Object Detection, (ii) UAV-based Maritime Object Tracking, (iii) USV-based Maritime Obstacle Segmentation and (iv) USV-based Maritime Obstacle Detection.

object-detection Object Detection +1

Responsible Active Learning via Human-in-the-loop Peer Study

no code implementations 24 Nov 2022 Yu-Tong Cao, Jingya Wang, Baosheng Yu, DaCheng Tao

To further enhance the active learner via large-scale unlabelled data, we introduce multiple peer students into the active learner which is trained by a novel learning paradigm, including the In-Class Peer Study on labelled data and the Out-of-Class Peer Study on unlabelled data.

Active Learning

Cross-Modal Contrastive Learning for Robust Reasoning in VQA

1 code implementation 21 Nov 2022 Qi Zheng, Chaoyue Wang, Daqing Liu, Dadong Wang, DaCheng Tao

For each positive pair, we regard the images from different graphs as negative samples and deduce a multi-positive version of contrastive learning.

Contrastive Learning Question Answering +1

Adaptive Edge-to-Edge Interaction Learning for Point Cloud Analysis

no code implementations 20 Nov 2022 Shanshan Zhao, Mingming Gong, Xi Li, DaCheng Tao

To explore the role of the relation between edges, this paper proposes a novel Adaptive Edge-to-Edge Interaction Learning module, which aims to enhance the point-to-point relation through modelling the edge-to-edge interaction in the local region adaptively.

Semantic Segmentation

DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting

2 code implementations CVPR 2023 Maoyuan Ye, Jing Zhang, Shanshan Zhao, Juhua Liu, Tongliang Liu, Bo Du, DaCheng Tao

In this paper, we present DeepSolo, a simple DETR-like baseline that lets a single Decoder with Explicit Points Solo for text detection and recognition simultaneously.

Scene Text Detection Text Matching +1

PAD-Net: An Efficient Framework for Dynamic Networks

no code implementations 10 Nov 2022 Shwai He, Liang Ding, Daize Dong, Boan Liu, Fuqiang Yu, DaCheng Tao

The main contributions of our work are challenging the basic commonsense in dynamic networks and proposing a partially dynamic network, namely PAD-Net, to transform the redundant dynamic parameters into static ones.

Image Classification

Unifying Flow, Stereo and Depth Estimation

1 code implementation 10 Nov 2022 Haofei Xu, Jing Zhang, Jianfei Cai, Hamid Rezatofighi, Fisher Yu, DaCheng Tao, Andreas Geiger

We present a unified formulation and model for three motion and 3D perception tasks: optical flow, rectified stereo matching and unrectified stereo depth estimation from posed images.

Optical Flow Estimation Stereo Depth Estimation +1

Rethinking Hierarchies in Pre-trained Plain Vision Transformer

no code implementations 3 Nov 2022 Yufei Xu, Jing Zhang, Qiming Zhang, DaCheng Tao

Self-supervised pre-training vision transformer (ViT) via masked image modeling (MIM) has been proven very effective.

Adversarial Auto-Augment with Label Preservation: A Representation Learning Principle Guided Approach

1 code implementation 2 Nov 2022 Kaiwen Yang, Yanchao Sun, Jiahao Su, Fengxiang He, Xinmei Tian, Furong Huang, Tianyi Zhou, DaCheng Tao

In experiments, we show that our method consistently brings non-trivial improvements to the three aforementioned learning tasks in both efficiency and final performance, whether or not it is combined with strong pre-defined augmentations, e.g., on medical images when domain knowledge is unavailable and the existing augmentation techniques perform poorly.

Data Augmentation Representation Learning

TASA: Deceiving Question Answering Models by Twin Answer Sentences Attack

1 code implementation 27 Oct 2022 Yu Cao, Dianqi Li, Meng Fang, Tianyi Zhou, Jun Gao, Yibing Zhan, DaCheng Tao

We present Twin Answer Sentences Attack (TASA), an adversarial attack method for question answering (QA) models that produces fluent and grammatical adversarial contexts while maintaining gold answers.

Adversarial Attack Question Answering

Identifiability and Asymptotics in Learning Homogeneous Linear ODE Systems from Discrete Observations

no code implementations 12 Oct 2022 Yuanyuan Wang, Wei Huang, Mingming Gong, Xi Geng, Tongliang Liu, Kun Zhang, DaCheng Tao

This paper derives a sufficient condition for the identifiability of homogeneous linear ODE systems from a sequence of equally-spaced error-free observations sampled from a single trajectory.

Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach

1 code implementation 11 Oct 2022 Peng Mi, Li Shen, Tianhe Ren, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji, DaCheng Tao

One of the popular solutions is Sharpness-Aware Minimization (SAM), which smooths the loss landscape via minimizing the maximized change of training loss when adding a perturbation to the weight.

Improving Sharpness-Aware Minimization with Fisher Mask for Better Generalization on Language Models

1 code implementation 11 Oct 2022 Qihuang Zhong, Liang Ding, Li Shen, Peng Mi, Juhua Liu, Bo Du, DaCheng Tao

Fine-tuning large pretrained language models on a limited training corpus usually suffers from poor generalization.

Benefits of Permutation-Equivariance in Auction Mechanisms

no code implementations 11 Oct 2022 Tian Qin, Fengxiang He, Dingfeng Shi, Wenbing Huang, DaCheng Tao

Designing an incentive-compatible auction mechanism that maximizes the auctioneer's revenue while minimizing the bidders' ex-post regret is an important yet intricate problem in economics.

A Comprehensive Survey of Data Augmentation in Visual Reinforcement Learning

1 code implementation 10 Oct 2022 Guozheng Ma, Zhen Wang, Zhecheng Yuan, Xueqian Wang, Bo Yuan, DaCheng Tao

Visual reinforcement learning (RL), which makes decisions directly from high-dimensional visual inputs, has demonstrated significant potential in various domains.

Data Augmentation reinforcement-learning +1

SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters

1 code implementation 9 Oct 2022 Shwai He, Liang Ding, Daize Dong, Miao Zhang, DaCheng Tao

Adapter Tuning, which freezes the pretrained language models (PLMs) and only fine-tunes a few extra modules, becomes an appealing efficient alternative to the full model fine-tuning.

Network Pruning

Bridged Transformer for Vision and Point Cloud 3D Object Detection

1 code implementation CVPR 2022 Yikai Wang, TengQi Ye, Lele Cao, Wenbing Huang, Fuchun Sun, Fengxiang He, DaCheng Tao

Recently, there has been a trend of leveraging multiple sources of input data, such as complementing the 3D point cloud with 2D images that often have richer color and less noise.

3D Object Detection object-detection

Alternating Differentiation for Optimization Layers

1 code implementation 3 Oct 2022 Haixiang Sun, Ye Shi, Jingya Wang, Hoang Duong Tuan, H. Vincent Poor, DaCheng Tao

In this paper, we developed a new framework, named Alternating Differentiation (Alt-Diff), that differentiates optimization problems (here, specifically in the form of convex optimization problems with polyhedral constraints) in a fast and recursive way.

Exploring the Relationship between Architecture and Adversarially Robust Generalization

no code implementations 28 Sep 2022 Aishan Liu, Shiyu Tang, Siyuan Liang, Ruihao Gong, Boxi Wu, Xianglong Liu, DaCheng Tao

In particular, we comprehensively evaluated 20 of the most representative adversarially trained architectures on the ImageNette and CIFAR-10 datasets against multiple $\ell_p$-norm adversarial attacks.

Shuffle-QUDIO: accelerate distributed VQE with trainability enhancement and measurement reduction

no code implementations 26 Sep 2022 Yang Qian, Yuxuan Du, DaCheng Tao

To gain such computational advantages on large-scale problems, a feasible solution is the QUantum DIstributed Optimization (QUDIO) scheme, which partitions the original problem into $K$ subproblems and allocates them to $K$ quantum machines followed by the parallel optimization.

Distributed Optimization

Towards Robust Referring Image Segmentation

no code implementations 20 Sep 2022 Jianzong Wu, Xiangtai Li, Xia Li, Henghui Ding, Yunhai Tong, DaCheng Tao

Our proposed RefSegformer achieves the new state-of-the-art results on three regular RIS datasets and three R-RIS datasets, which serves as a new solid baseline for further research.

Image Segmentation Semantic Segmentation

Vega-MT: The JD Explore Academy Translation System for WMT22

1 code implementation 20 Sep 2022 Changtong Zan, Keqin Peng, Liang Ding, Baopu Qiu, Boan Liu, Shwai He, Qingyu Lu, Zheng Zhang, Chuang Liu, Weifeng Liu, Yibing Zhan, DaCheng Tao

As for model sizes, we scale the Transformer-Big up to an extremely large model with nearly 4.7 billion parameters, to fully enhance the model capacity for our Vega-MT.

Data Augmentation Machine Translation +1

On Robust Cross-View Consistency in Self-Supervised Monocular Depth Estimation

1 code implementation 19 Sep 2022 Haimei Zhao, Jing Zhang, Zhuo Chen, Bo Yuan, DaCheng Tao

Compared with the photometric consistency loss as well as the rigid point cloud alignment loss, the proposed DFA and VDA losses are more robust owing to the strong representation power of deep features as well as the high tolerance of voxel density to the aforementioned challenges.

Monocular Depth Estimation

Not All Instances Contribute Equally: Instance-adaptive Class Representation Learning for Few-Shot Visual Recognition

no code implementations 7 Sep 2022 Mengya Han, Yibing Zhan, Yong Luo, Bo Du, Han Hu, Yonggang Wen, DaCheng Tao

To address the above issues, we propose a novel metric-based meta-learning framework termed instance-adaptive class representation learning network (ICRL-Net) for few-shot visual recognition.

Meta-Learning Representation Learning

Super-model ecosystem: A domain-adaptation perspective

no code implementations 30 Aug 2022 Fengxiang He, DaCheng Tao

We model the super-model paradigm as a two-stage diffusion process: (1) in the pre-training stage, the model parameter diffuses from random initials and converges to a steady distribution; and (2) in the fine-tuning stage, the model parameter is transported to another steady distribution.

Domain Adaptation

Symmetric Pruning in Quantum Neural Networks

no code implementations 30 Aug 2022 Xinbiao Wang, Junyu Liu, Tongliang Liu, Yong Luo, Yuxuan Du, DaCheng Tao

To fill this knowledge gap, here we propose the effective quantum neural tangent kernel (EQNTK) and connect this concept with over-parameterization theory to quantify the convergence of QNNs towards the global optima.

Grounded Affordance from Exocentric View

2 code implementations 28 Aug 2022 Hongchen Luo, Wei Zhai, Jing Zhang, Yang Cao, DaCheng Tao

Due to the diversity of interactive affordance, the uniqueness of different individuals leads to diverse interactions, which makes it difficult to establish an explicit link between object parts and affordance labels.

Human-Object Interaction Detection Transfer Learning

Hierarchical Perceptual Noise Injection for Social Media Fingerprint Privacy Protection

1 code implementation 23 Aug 2022 Simin Li, Huangxinxin Xu, Jiakai Wang, Aishan Liu, Fazhi He, Xianglong Liu, DaCheng Tao

The threat of fingerprint leakage from social media raises a strong desire for anonymizing shared images while maintaining image qualities, since fingerprints act as a lifelong individual biometric password.

Adversarial Attack

PANDA: Prompt Transfer Meets Knowledge Distillation for Efficient Model Adaptation

no code implementations 22 Aug 2022 Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du, DaCheng Tao

In response to these problems, we propose a new metric to accurately predict the prompt transferability (regarding (i)), and a novel PoT approach (namely PANDA) that leverages the knowledge distillation technique to transfer the "knowledge" from the source prompt to the target prompt in a subtle manner and alleviate the catastrophic forgetting effectively (regarding (ii)).

Knowledge Distillation Transfer Learning

Domain-Specific Risk Minimization for Out-of-Distribution Generalization

1 code implementation 18 Aug 2022 Yi-Fan Zhang, Jindong Wang, Jian Liang, Zhang Zhang, Baosheng Yu, Liang Wang, DaCheng Tao, Xing Xie

Our bound motivates two strategies to reduce the gap: the first one is ensembling multiple classifiers to enrich the hypothesis space, then we propose effective gap estimation methods for guiding the selection of a better hypothesis for the target.

Domain Generalization Out-of-Distribution Generalization

Advancing Plain Vision Transformer Towards Remote Sensing Foundation Model

2 code implementations 8 Aug 2022 Di Wang, Qiming Zhang, Yufei Xu, Jing Zhang, Bo Du, DaCheng Tao, Liangpei Zhang

Large-scale vision foundation models have made significant progress in visual tasks on natural images, with vision transformers being the primary choice due to their good scalability and representation ability.

Aerial Scene Classification Few-Shot Learning +2

Balancing Stability and Plasticity through Advanced Null Space in Continual Learning

no code implementations 25 Jul 2022 Yajing Kong, Liu Liu, Zhen Wang, DaCheng Tao

Continual learning is a learning paradigm that learns tasks sequentially under resource constraints, in which the key challenge is the stability-plasticity dilemma, i.e., it is difficult to simultaneously have the stability to prevent catastrophic forgetting of old tasks and the plasticity to learn new tasks well.

Continual Learning

Learning Graph Neural Networks for Image Style Transfer

no code implementations 24 Jul 2022 Yongcheng Jing, Yining Mao, Yiding Yang, Yibing Zhan, Mingli Song, Xinchao Wang, DaCheng Tao

To this end, we develop an elaborated GNN model with content and style local patches as the graph vertices.

Image Stylization

Online Continual Learning with Contrastive Vision Transformer

no code implementations 24 Jul 2022 Zhen Wang, Liu Liu, Yajing Kong, Jiaxian Guo, DaCheng Tao

Based on the learnable focuses, we design a focal contrastive loss to rebalance contrastive learning between new and past classes and consolidate previously learned representations.

Continual Learning Contrastive Learning

JPerceiver: Joint Perception Network for Depth, Pose and Layout Estimation in Driving Scenes

1 code implementation 16 Jul 2022 Haimei Zhao, Jing Zhang, Sen Zhang, DaCheng Tao

A naive way is to accomplish them independently in a sequential or parallel manner, but there are many drawbacks, i.e., 1) the depth and VO results suffer from the inherent scale ambiguity issue; 2) the BEV layout is directly predicted from the front-view image without using any depth-related information, although the depth map contains useful geometry clues for inferring scene layouts.

Autonomous Driving Depth Estimation +3

ST-P3: End-to-end Vision-based Autonomous Driving via Spatial-Temporal Feature Learning

1 code implementation 15 Jul 2022 Shengchao Hu, Li Chen, Penghao Wu, Hongyang Li, Junchi Yan, DaCheng Tao

In particular, we propose a spatial-temporal feature learning scheme towards a set of more representative features for perception, prediction and planning tasks simultaneously, which is called ST-P3.

Autonomous Driving Future prediction

Transformer-based Context Condensation for Boosting Feature Pyramids in Object Detection

no code implementations 14 Jul 2022 Zhe Chen, Jing Zhang, Yufei Xu, DaCheng Tao

Current object detectors typically have a feature pyramid (FP) module for multi-level feature fusion (MFF) which aims to mitigate the gap between features from different levels and form a comprehensive object representation to achieve better detection performance.

object-detection Object Detection

DPText-DETR: Towards Better Scene Text Detection with Dynamic Points in Transformer

3 code implementations 10 Jul 2022 Maoyuan Ye, Jing Zhang, Shanshan Zhao, Juhua Liu, Bo Du, DaCheng Tao

However, these methods built upon the detection transformer framework might achieve sub-optimal training efficiency and performance due to coarse positional query modeling. In addition, the point label form exploited in previous works implies the reading order of humans, which, from our observation, impedes detection robustness.

Inductive Bias Scene Text Detection

GFNet: Geometric Flow Network for 3D Point Cloud Semantic Segmentation

1 code implementation 6 Jul 2022 Haibo Qiu, Baosheng Yu, DaCheng Tao

However, recent projection-based methods for point cloud semantic segmentation usually utilize a vanilla late fusion strategy for the predictions of different views, failing to explore the complementary information from a geometric perspective during the representation learning.

LIDAR Semantic Segmentation Representation Learning +1

Dynamic Contrastive Distillation for Image-Text Retrieval

no code implementations 4 Jul 2022 Jun Rao, Liang Ding, Shuhan Qi, Meng Fang, Yang Liu, Li Shen, DaCheng Tao

Although the vision-and-language pretraining (VLP) equipped cross-modal image-text retrieval (ITR) has achieved remarkable progress in the past two years, it suffers from a major drawback: the ever-increasing size of VLP models restricts its deployment to real-world search scenarios (where the high latency is unacceptable).

Contrastive Learning Metric Learning +3

Topology-aware Generalization of Decentralized SGD

1 code implementation 25 Jun 2022 Tongtian Zhu, Fengxiang He, Lan Zhang, Zhengyang Niu, Mingli Song, DaCheng Tao

Our theory indicates that the generalizability of D-SGD is positively correlated with the spectral gap, and can explain why consensus control in the initial training phase can ensure better generalization.

CLAMP: Prompt-based Contrastive Learning for Connecting Language and Animal Pose

no code implementations CVPR 2023 Xu Zhang, Wen Wang, Zhe Chen, Yufei Xu, Jing Zhang, DaCheng Tao

Motivated by the progress of visual-language research, we propose that pre-trained language models (e.g., CLIP) can facilitate animal pose estimation by providing rich prior knowledge for describing animal keypoints in text.

Animal Pose Estimation Contrastive Learning

Variational Distillation for Multi-View Learning

3 code implementations 20 Jun 2022 Xudong Tian, Zhizhong Zhang, Cong Wang, Wensheng Zhang, Yanyun Qu, Lizhuang Ma, Zongze Wu, Yuan Xie, DaCheng Tao

Information Bottleneck (IB) based multi-view learning provides an information theoretic principle for seeking shared information contained in heterogeneous data descriptions.

MULTI-VIEW LEARNING Representation Learning

EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm

1 code implementation 19 Jun 2022 Jiangning Zhang, Xiangtai Li, Yabiao Wang, Chengjie Wang, Yibo Yang, Yong liu, DaCheng Tao

Motivated by biological evolution, this paper explains the rationality of Vision Transformer by analogy with the proven practical Evolutionary Algorithm (EA) and derives that both have consistent mathematical formulation.

Image Classification

Boosting Factorization Machines via Saliency-Guided Mixup

1 code implementation 17 Jun 2022 Chenwang Wu, Defu Lian, Yong Ge, Min Zhou, Enhong Chen, DaCheng Tao

Second, considering that MixFM may generate redundant or even detrimental instances, we further put forward a novel Factorization Machine powered by Saliency-guided Mixup (denoted as SMFM).

Recommendation Systems

A Survey on Gradient Inversion: Attacks, Defenses and Future Directions

no code implementations 15 Jun 2022 Rui Zhang, Song Guo, Junxiao Wang, Xin Xie, DaCheng Tao

In particular, we dig out some critical ingredients from the iteration-based attacks, including data initialization, model training and gradient matching.

APT-36K: A Large-scale Benchmark for Animal Pose Estimation and Tracking

4 code implementations 12 Jun 2022 Yuxiang Yang, Junjie Yang, Yufei Xu, Jing Zhang, Long Lan, DaCheng Tao

Based on APT-36K, we benchmark several representative models on the following three tracks: (1) supervised animal pose estimation on a single frame under intra- and inter-domain transfer learning settings, (2) inter-species domain generalization test for unseen animals, and (3) animal pose estimation with animal tracking.

Animal Pose Estimation Domain Generalization +1

Toward Real-world Single Image Deraining: A New Benchmark and Beyond

1 code implementation 11 Jun 2022 Wei Li, Qiming Zhang, Jing Zhang, Zhen Huang, Xinmei Tian, DaCheng Tao

To address these issues, we establish a new high-quality dataset named RealRain-1k, consisting of 1,120 high-resolution paired clean and rainy images with low- and high-density rain streaks, respectively.

Domain Generalization Image Restoration +2

Referring Image Matting

1 code implementation CVPR 2023 Jizhizi Li, Jing Zhang, DaCheng Tao

Different from conventional image matting, which either requires user-defined scribbles/trimap to extract a specific foreground object or directly extracts all the foreground objects in the image indiscriminately, we introduce a new task named Referring Image Matting (RIM) in this paper, which aims to extract the meticulous alpha matte of the specific object that best matches the given natural language description, thus enabling a more natural and simpler instruction for image matting.

Domain Generalization Image Matting +4

A Relational Intervention Approach for Unsupervised Dynamics Generalization in Model-Based Reinforcement Learning

1 code implementation ICLR 2022 Jixian Guo, Mingming Gong, DaCheng Tao

However, because environments are not labelled, the extracted information inevitably contains redundant information unrelated to the dynamics in transition segments and thus fails to maintain a crucial property of $Z$: $Z$ should be similar in the same environment and dissimilar in different ones.

Model-based Reinforcement Learning reinforcement-learning +1

Recent Advances for Quantum Neural Networks in Generative Learning

no code implementations 7 Jun 2022 Jinkai Tian, Xiaoyu Sun, Yuxuan Du, Shanshan Zhao, Qing Liu, Kaining Zhang, Wei Yi, Wanrong Huang, Chaoyue Wang, Xingyao Wu, Min-Hsiu Hsieh, Tongliang Liu, Wenjing Yang, DaCheng Tao

Due to the intrinsic probabilistic nature of quantum mechanics, it is reasonable to postulate that quantum generative learning models (QGLMs) may surpass their classical counterparts.

BIG-bench Machine Learning Quantum Machine Learning

Understanding deep learning via decision boundary

no code implementations 3 Jun 2022 Shiye Lei, Fengxiang He, Yancheng Yuan, DaCheng Tao

From the theoretical view, two lower bounds based on algorithm DB variability are proposed and do not explicitly depend on the sample size.

Modeling Image Composition for Complex Scene Generation

1 code implementation CVPR 2022 Zuopeng Yang, Daqing Liu, Chaoyue Wang, Jie Yang, DaCheng Tao

Compared to existing CNN-based and Transformer-based generation models, which entangle modeling at the pixel-level & patch-level and the object-level & patch-level respectively, the proposed focal attention predicts the current patch token by focusing only on its highly related tokens that are specified by the spatial layout, thereby achieving disambiguation during training.

Layout-to-Image Generation Scene Generation

DisPFL: Towards Communication-Efficient Personalized Federated Learning via Decentralized Sparse Training

1 code implementation 1 Jun 2022 Rong Dai, Li Shen, Fengxiang He, Xinmei Tian, DaCheng Tao

In this work, we propose a novel personalized federated learning framework in a decentralized (peer-to-peer) communication protocol named Dis-PFL, which employs personalized sparse masks to customize sparse local models on the edge.

Personalized Federated Learning