Search Results for author: DaCheng Tao

Found 740 papers, 351 papers with code

LTF: A Label Transformation Framework for Correcting Label Shift

no code implementations ICML 2020 Jiaxian Guo, Mingming Gong, Tongliang Liu, Kun Zhang, DaCheng Tao

Distribution shift is a major obstacle to the deployment of current deep learning models on real-world problems.

On Dropping Clusters to Regularize Graph Convolutional Neural Networks

no code implementations ECCV 2020 Xikun Zhang, Chang Xu, DaCheng Tao

Dropout has been widely adopted to regularize graph convolutional networks (GCNs) by randomly zeroing entries of the node feature vectors, achieving promising performance on various tasks.

Action Recognition Skeleton Based Action Recognition
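The regularization the snippet describes, randomly zeroing entries of the node feature matrix, can be sketched in a few lines of NumPy. This is an illustrative version of standard inverted dropout on node features, not the paper's cluster-dropping method or its code:

```python
import numpy as np

def feature_dropout(node_features, p=0.5, rng=None):
    """Randomly zero entries of a (num_nodes, feature_dim) matrix.

    Standard inverted dropout: surviving entries are scaled by 1/(1-p)
    so the expected value of each feature is unchanged.
    """
    rng = rng or np.random.default_rng()
    mask = rng.random(node_features.shape) >= p  # keep with prob 1-p
    return node_features * mask / (1.0 - p)

X = np.ones((4, 3))                                   # toy node feature matrix
X_drop = feature_dropout(X, p=0.5, rng=np.random.default_rng(0))
# surviving entries become 2.0 (= 1.0 / (1 - 0.5)); dropped entries are 0.0
```

At inference time the mask is skipped entirely; scaling the survivors by 1/(1-p) keeps training-time activations on the same expected scale as test-time ones.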

Hallucinating Visual Instances in Total Absentia

no code implementations ECCV 2020 Jiayan Qiu, Yiding Yang, Xinchao Wang, DaCheng Tao

This seemingly minor difference in fact makes HVITA a much more challenging task, as the restoration algorithm must not only infer the category of the object in total absentia, but also hallucinate an object whose appearance is consistent with the background.

Hallucination Image Inpainting +1

Deep Streaming Label Learning

1 code implementation ICML 2020 Zhen Wang, Liu Liu, DaCheng Tao

In order to fill in these research gaps, we propose a novel deep neural network (DNN) based framework, Deep Streaming Label Learning (DSLL), to classify instances with newly emerged labels effectively.

Multi-Label Learning

Label-Noise Robust Domain Adaptation

no code implementations ICML 2020 Xiyu Yu, Tongliang Liu, Mingming Gong, Kun Zhang, Kayhan Batmanghelich, DaCheng Tao

Domain adaptation aims to correct the classifiers when faced with distribution shift between source (training) and target (test) domains.

Denoising Domain Adaptation

Redistributing Low-Frequency Words: Making the Most of Monolingual Data in Non-Autoregressive Translation

1 code implementation ACL 2022 Liang Ding, Longyue Wang, Shuming Shi, DaCheng Tao, Zhaopeng Tu

In this work, we provide an appealing alternative for NAT: monolingual KD, which trains the NAT student on external monolingual data with an AT teacher trained on the original bilingual data.

Knowledge Distillation Translation +1

Polysemy Deciphering Network for Human-Object Interaction Detection

1 code implementation ECCV 2020 Xubin Zhong, Changxing Ding, Xian Qu, DaCheng Tao

First, PD-Net augments human pose and spatial features for HOI detection using language priors, enabling the verb classifiers to receive language hints that reduce the intra-class variation of the same verb.

Human-Object Interaction Detection Object +1

A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models

1 code implementation 20 Jun 2024 Xincheng Shuai, Henghui Ding, Xingjun Ma, RongCheng Tu, Yu-Gang Jiang, DaCheng Tao

Image editing aims to edit a given synthetic or real image to meet users' specific requirements.

Video Editing

PoseBench: Benchmarking the Robustness of Pose Estimation Models under Corruptions

no code implementations 20 Jun 2024 Sihan Ma, Jing Zhang, Qiong Cao, DaCheng Tao

We evaluated 60 representative models, including top-down, bottom-up, heatmap-based, regression-based, and classification-based methods, across three datasets for human and animal pose estimation.

Animal Pose Estimation Autonomous Driving +1

HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model

1 code implementation 17 Jun 2024 Di Wang, Meiqi Hu, Yao Jin, Yuchun Miao, Jiaqi Yang, Yichu Xu, Xiaolei Qin, Jiaqi Ma, Lingyu Sun, Chenxing Li, Chuan Fu, Hongruixuan Chen, Chengxi Han, Naoto Yokoya, Jing Zhang, Minqiang Xu, Lin Liu, Lefei Zhang, Chen Wu, Bo Du, DaCheng Tao, Liangpei Zhang

To tackle the spectral and spatial redundancy challenges in HSIs, we introduce a novel sparse sampling attention (SSA) mechanism, which effectively promotes the learning of diverse contextual features and serves as the basic block of HyperSIGMA.

Aligning Large Language Models from Self-Reference AI Feedback with one General Principle

1 code implementation 17 Jun 2024 Rong Bao, Rui Zheng, Shihan Dou, Xiao Wang, Enyu Zhou, Bo Wang, Qi Zhang, Liang Ding, DaCheng Tao

In aligning large language models (LLMs), utilizing feedback from existing advanced AI rather than humans is an important method to scale supervisory signals.


Unveiling the Safety of GPT-4o: An Empirical Study using Jailbreak Attacks

1 code implementation 10 Jun 2024 Zonghao Ying, Aishan Liu, Xianglong Liu, DaCheng Tao

The recent release of GPT-4o has garnered widespread attention due to its powerful general capabilities.

Uncertainty Aware Learning for Language Model Alignment

no code implementations 7 Jun 2024 Yikun Wang, Rui Zheng, Liang Ding, Qi Zhang, Dahua Lin, DaCheng Tao

As instruction-tuned large language models (LLMs) evolve, aligning pretrained foundation models presents increasing challenges.

GSM8K Language Modelling

Revisiting Catastrophic Forgetting in Large Language Model Tuning

1 code implementation 7 Jun 2024 Hongyu Li, Liang Ding, Meng Fang, DaCheng Tao

Catastrophic Forgetting (CF) means models forgetting previously acquired knowledge when learning new data.

Language Modelling Large Language Model

Jailbreak Vision Language Models via Bi-Modal Adversarial Prompt

1 code implementation 6 Jun 2024 Zonghao Ying, Aishan Liu, Tianyuan Zhang, Zhengmin Yu, Siyuan Liang, Xianglong Liu, DaCheng Tao

To address this limitation, this paper introduces the Bi-Modal Adversarial Prompt Attack (BAP), which executes jailbreaks by optimizing textual and visual prompts cohesively.

Language Modelling Large Language Model

FusionBench: A Comprehensive Benchmark of Deep Model Fusion

1 code implementation 5 Jun 2024 Anke Tang, Li Shen, Yong Luo, Han Hu, Bo Du, DaCheng Tao

These techniques range from model ensemble methods, which combine the predictions to improve the overall performance, to model merging, which integrates different models into a single one, and model mixing methods, which upscale or recombine the components of the original models.

Image Classification text-classification +2

LanEvil: Benchmarking the Robustness of Lane Detection to Environmental Illusions

no code implementations 3 Jun 2024 Tianyuan Zhang, Lu Wang, Hainan Li, Yisong Xiao, Siyuan Liang, Aishan Liu, Xianglong Liu, DaCheng Tao

For the first time, this paper studies the potential threats caused by these environmental illusions to LD and establishes the first comprehensive benchmark LanEvil for evaluating the robustness of LD against this natural corruption.

Autonomous Driving Benchmarking +1

HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning

1 code implementation 28 May 2024 Shengchao Hu, Ziqing Fan, Li Shen, Ya Zhang, Yanfeng Wang, DaCheng Tao

However, variations in task content and complexity pose significant challenges in policy formulation, necessitating judicious parameter sharing and management of conflicting gradients for optimal policy performance.

Management Meta-Learning +1

Q-value Regularized Transformer for Offline Reinforcement Learning

1 code implementation 27 May 2024 Shengchao Hu, Ziqing Fan, Chaoqin Huang, Li Shen, Ya Zhang, Yanfeng Wang, DaCheng Tao

Recent advancements in offline reinforcement learning (RL) have underscored the capabilities of Conditional Sequence Modeling (CSM), a paradigm that learns the action distribution based on history trajectory and target returns for each state.

D4RL Offline RL +3
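The conditional sequence modeling (CSM) paradigm mentioned above learns an action distribution conditioned on the history trajectory and target returns. A schematic of the Decision-Transformer-style data layout this implies (illustrative only; the function name and token format are assumptions, not the paper's QT implementation):

```python
import numpy as np

def build_csm_sequence(states, actions, rewards):
    """Interleave (return-to-go, state, action) triples, the token layout
    used by Decision-Transformer-style conditional sequence models."""
    rtg = np.cumsum(rewards[::-1])[::-1]     # return-to-go at each timestep
    seq = []
    for g, s, a in zip(rtg, states, actions):
        seq.extend([("rtg", float(g)), ("state", s), ("action", a)])
    return seq

# a toy 3-step trajectory: returns-to-go are [3.0, 2.0, 2.0]
seq = build_csm_sequence([0, 1, 2], [1, 0, 1], [1.0, 0.0, 2.0])
```

At inference, the model is prompted with a desired return-to-go and the current state, and autoregressively emits the next action token.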

Task Groupings Regularization: Data-Free Meta-Learning with Heterogeneous Pre-trained Models

no code implementations 26 May 2024 Yongxian Wei, Zixuan Hu, Li Shen, Zhenyi Wang, Yu Li, Chun Yuan, DaCheng Tao

Based on our findings, we propose Task Groupings Regularization, a novel approach that benefits from model heterogeneity by grouping and aligning conflicting tasks.


Learning Multi-Agent Communication from Graph Modeling Perspective

1 code implementation 14 May 2024 Shengchao Hu, Li Shen, Ya Zhang, DaCheng Tao

In numerous artificial intelligence applications, the collaborative efforts of multiple intelligent agents are imperative for the successful attainment of target objectives.

Separable Power of Classical and Quantum Learning Protocols Through the Lens of No-Free-Lunch Theorem

no code implementations 12 May 2024 Xinbiao Wang, Yuxuan Du, Kecheng Liu, Yong Luo, Bo Du, DaCheng Tao

The No-Free-Lunch (NFL) theorem, which quantifies problem- and data-independent generalization errors regardless of the optimization process, provides a foundational framework for comprehending diverse learning protocols' potential.

Attribute Quantum Machine Learning

LLM-QBench: A Benchmark Towards the Best Practice for Post-training Quantization of Large Language Models

1 code implementation 9 May 2024 Ruihao Gong, Yang Yong, Shiqiao Gu, Yushi Huang, Yunchen Zhang, Xianglong Liu, DaCheng Tao

Recent advancements in large language models (LLMs) are propelling us toward artificial general intelligence, thanks to their remarkable emergent abilities and reasoning capabilities.

Benchmarking Computational Efficiency +1

Improving Complex Reasoning over Knowledge Graph with Logic-Aware Curriculum Tuning

no code implementations 2 May 2024 Tianle Xia, Liang Ding, Guojia Wan, Yibing Zhan, Bo Du, DaCheng Tao

Specifically, we augment the arbitrary first-order logical queries via binary tree decomposition, to stimulate the reasoning capability of LLMs.

Knowledge Graphs Logical Reasoning +2

FREE: Faster and Better Data-Free Meta-Learning

no code implementations CVPR 2024 Yongxian Wei, Zixuan Hu, Zhenyi Wang, Li Shen, Chun Yuan, DaCheng Tao

Data-Free Meta-Learning (DFML) aims to extract knowledge from a collection of pre-trained models without requiring the original data, presenting practical benefits in contexts constrained by data privacy concerns.


3AM: An Ambiguity-Aware Multi-Modal Machine Translation Dataset

1 code implementation 29 Apr 2024 Xinyu Ma, Xuebo Liu, Derek F. Wong, Jun Rao, Bei Li, Liang Ding, Lidia S. Chao, DaCheng Tao, Min Zhang

Experimental results show that MMT models trained on our dataset exhibit a greater ability to exploit visual information than those trained on other MMT datasets.

Multimodal Machine Translation Sentence +2

Federated Learning with Only Positive Labels by Exploring Label Correlations

no code implementations 24 Apr 2024 Xuming An, Dui Wang, Li Shen, Yong Luo, Han Hu, Bo Du, Yonggang Wen, DaCheng Tao

Specifically, FedALC estimates the label correlations in the class embedding learning for different label pairs and utilizes them to improve model training.

Federated Learning Multi-Label Classification

Achieving >97% on GSM8K: Deeply Understanding the Problems Makes LLMs Better Solvers for Math Word Problems

1 code implementation 23 Apr 2024 Qihuang Zhong, Kang Wang, Ziyang Xu, Juhua Liu, Liang Ding, Bo Du, DaCheng Tao

To this end, we propose a simple-yet-effective method, namely Deeply Understanding the Problems (DUP), to improve the LLMs' math problem-solving ability by addressing semantic misunderstanding errors.

Ranked #1 on Math Word Problem Solving on SVAMP (Accuracy metric)

Arithmetic Reasoning GSM8K +2

A Survey on Self-Evolution of Large Language Models

1 code implementation 22 Apr 2024 Zhengwei Tao, Ting-En Lin, Xiancai Chen, Hangyu Li, Yuchuan Wu, Yongbin Li, Zhi Jin, Fei Huang, DaCheng Tao, Jingren Zhou

To address this issue, self-evolution approaches that enable LLM to autonomously acquire, refine, and learn from experiences generated by the model itself are rapidly growing.

Learning Feature Inversion for Multi-class Anomaly Detection under General-purpose COCO-AD Benchmark

1 code implementation 16 Apr 2024 Jiangning Zhang, Chengjie Wang, Xiangtai Li, Guanzhong Tian, Zhucun Xue, Yong Liu, Guansong Pang, DaCheng Tao

Moreover, current metrics such as AU-ROC have nearly reached saturation on simple datasets, which prevents a comprehensive evaluation of different methods.

Anomaly Detection object-detection +2

UniMix: Towards Domain Adaptive and Generalizable LiDAR Semantic Segmentation in Adverse Weather

no code implementations CVPR 2024 Haimei Zhao, Jing Zhang, Zhuo Chen, Shanshan Zhao, DaCheng Tao

We devote UniMix to two main setups: 1) unsupervised domain adaptation, adapting the model from the clear weather source domain to the adverse weather target domain; 2) domain generalization, learning a model that generalizes well to unseen scenes in adverse weather.

Autonomous Driving Domain Generalization +2

Deepfake Generation and Detection: A Benchmark and Survey

1 code implementation 26 Mar 2024 Gan Pei, Jiangning Zhang, Menghan Hu, Zhenyu Zhang, Chengjie Wang, Yunsheng Wu, Guangtao Zhai, Jian Yang, Chunhua Shen, DaCheng Tao

Deepfake is a technology dedicated to creating highly realistic facial images and videos under specific conditions, which has significant application potential in fields such as entertainment, movie production, and digital human creation.

Attribute Face Reenactment +2

Object Detectors in the Open Environment: Challenges, Solutions, and Outlook

1 code implementation 24 Mar 2024 Siyuan Liang, Wei Wang, Ruoyu Chen, Aishan Liu, Boxi Wu, Ee-Chien Chang, Xiaochun Cao, DaCheng Tao

This paper aims to bridge this gap by conducting a comprehensive review and analysis of object detectors in open environments.

Incremental Learning Object

Contact-aware Human Motion Generation from Textual Descriptions

no code implementations 23 Mar 2024 Sihan Ma, Qiong Cao, Jing Zhang, DaCheng Tao

This paper addresses the problem of generating 3D interactive human motion from text.

Motion Synthesis

Building Accurate Translation-Tailored LLMs with Language Aware Instruction Tuning

1 code implementation 21 Mar 2024 Changtong Zan, Liang Ding, Li Shen, Yibing Zhan, Weifeng Liu, DaCheng Tao

In this work, we design a two-stage fine-tuning algorithm to improve the instruction-following ability (especially the translation direction) of LLMs.

In-Context Learning Instruction Following +1

Local-consistent Transformation Learning for Rotation-invariant Point Cloud Analysis

1 code implementation CVPR 2024 Yiyang Chen, Lunhao Duan, Shanshan Zhao, Changxing Ding, DaCheng Tao

Equipped with LCRF and RPR, our LocoTrans is capable of learning local-consistent transformation and preserving local geometry, which benefits rotation invariance learning.

Take Care of Your Prompt Bias! Investigating and Mitigating Prompt Bias in Factual Knowledge Extraction

1 code implementation 15 Mar 2024 Ziyang Xu, Keqin Peng, Liang Ding, DaCheng Tao, Xiliang Lu

Experiments across various prompts, PLMs, and benchmarks show that our approach can not only correct the overfitted performance caused by prompt bias, but also significantly improve the prompt retrieval capability (up to 10% absolute performance gain).

When ControlNet Meets Inexplicit Masks: A Case Study of ControlNet on its Contour-following Ability

no code implementations 1 Mar 2024 Wenjie Xuan, Yufei Xu, Shanshan Zhao, Chaoyue Wang, Juhua Liu, Bo Du, DaCheng Tao

Subsequently, to enhance controllability with inexplicit masks, an advanced Shape-aware ControlNet consisting of a deterioration estimator and a shape-prior modulation block is devised.

Trajectory Consistency Distillation: Improved Latent Consistency Distillation by Semi-Linear Consistency Function with Trajectory Mapping

1 code implementation 29 Feb 2024 Jianbin Zheng, Minghui Hu, Zhongyi Fan, Chaoyue Wang, Changxing Ding, DaCheng Tao, Tat-Jen Cham

Consequently, we introduce Trajectory Consistency Distillation (TCD), which encompasses trajectory consistency function and strategic stochastic sampling.

Image Generation

Healthcare Copilot: Eliciting the Power of General LLMs for Medical Consultation

no code implementations 20 Feb 2024 Zhiyao Ren, Yibing Zhan, Baosheng Yu, Liang Ding, DaCheng Tao

The copilot framework, which aims to enhance and tailor large language models (LLMs) for specific complex tasks without requiring fine-tuning, is gaining increasing attention from the community.

A Survey on Knowledge Distillation of Large Language Models

1 code implementation 20 Feb 2024 Xiaohan Xu, Ming Li, Chongyang Tao, Tao Shen, Reynold Cheng, Jinyang Li, Can Xu, DaCheng Tao, Tianyi Zhou

In the era of Large Language Models (LLMs), Knowledge Distillation (KD) emerges as a pivotal methodology for transferring advanced capabilities from leading proprietary LLMs, such as GPT-4, to their open-source counterparts like LLaMA and Mistral.

Data Augmentation Knowledge Distillation +1

Towards Theoretical Understandings of Self-Consuming Generative Models

no code implementations 19 Feb 2024 Shi Fu, Sen Zhang, Yingjie Wang, Xinmei Tian, DaCheng Tao

This paper tackles the emerging challenge of training generative models within a self-consuming loop, wherein successive generations of models are recursively trained on mixtures of real and synthetic data from previous generations.

ROSE Doesn't Do That: Boosting the Safety of Instruction-Tuned Large Language Models with Reverse Prompt Contrastive Decoding

no code implementations 19 Feb 2024 Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du, DaCheng Tao

With the development of instruction-tuned large language models (LLMs), improving the safety of LLMs has become more critical.

Revisiting Knowledge Distillation for Autoregressive Language Models

no code implementations 19 Feb 2024 Qihuang Zhong, Liang Ding, Li Shen, Juhua Liu, Bo Du, DaCheng Tao

Knowledge distillation (KD) is a common approach to compress a teacher model to reduce its inference cost and memory footprint, by training a smaller student model.

Knowledge Distillation
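The teacher-to-student compression described here is conventionally trained with a temperature-softened KL objective. A minimal NumPy sketch of classic Hinton-style distillation (the paper proposes a revised variant; this shows only the standard loss it builds on):

```python
import numpy as np

def softmax(logits, T=1.0):
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)    # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 so gradient magnitudes stay comparable across T."""
    p = softmax(teacher_logits, T)           # soft teacher targets
    q = softmax(student_logits, T)
    kl = (p * (np.log(p) - np.log(q))).sum(axis=-1)
    return float(kl.mean() * T * T)

teacher = np.array([[4.0, 1.0, 0.0]])
student = np.array([[2.0, 2.0, 0.0]])
loss = kd_loss(student, teacher)             # shrinks as the student matches
```

In practice this term is mixed with the ordinary cross-entropy on hard labels, trading off imitation of the teacher against fitting the data directly.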

DB-LLM: Accurate Dual-Binarization for Efficient LLMs

no code implementations 19 Feb 2024 Hong Chen, Chengtao Lv, Liang Ding, Haotong Qin, Xiabin Zhou, Yifu Ding, Xuebo Liu, Min Zhang, Jinyang Guo, Xianglong Liu, DaCheng Tao

Large language models (LLMs) have significantly advanced the field of natural language processing, while the expensive memory and computation consumption impede their practical deployment.

Binarization Computational Efficiency +1

Communication-Efficient Distributed Learning with Local Immediate Error Compensation

no code implementations 19 Feb 2024 Yifei Cheng, Li Shen, Linli Xu, Xun Qian, Shiwei Wu, Yiming Zhou, Tie Zhang, DaCheng Tao, Enhong Chen

However, existing compression methods either perform only unidirectional compression in one iteration with higher communication cost, or bidirectional compression with slower convergence rate.

Continual Learning on Graphs: Challenges, Solutions, and Opportunities

1 code implementation 18 Feb 2024 Xikun Zhang, Dongjin Song, DaCheng Tao

To bridge the gap, we provide a comprehensive review of existing continual graph learning (CGL) algorithms by elucidating the different task settings and categorizing the existing methods based on their characteristics.

Continual Learning Graph Learning

InfoRM: Mitigating Reward Hacking in RLHF via Information-Theoretic Reward Modeling

no code implementations 14 Feb 2024 Yuchun Miao, Sen Zhang, Liang Ding, Rong Bao, Lefei Zhang, DaCheng Tao

Inspired by this finding, we propose the Cluster Separation Index (CSI), which quantifies deviations in the IB latent space, as an indicator of reward overoptimization to facilitate the development of online mitigation strategies.

Confronting Reward Overoptimization for Diffusion Models: A Perspective of Inductive and Primacy Biases

1 code implementation 13 Feb 2024 Ziyi Zhang, Sen Zhang, Yibing Zhan, Yong Luo, Yonggang Wen, DaCheng Tao

Then, we surprisingly discover that dormant neurons in our critic model act as a regularization against reward overoptimization while active neurons reflect primacy bias.

Denoising Inductive Bias

Large Language Models as an Indirect Reasoner: Contrapositive and Contradiction for Automated Reasoning

no code implementations 6 Feb 2024 Yanfang Zhang, Yiliu Sun, Yibing Zhan, Dapeng Tao, DaCheng Tao, Chen Gong

The experimental results on popular LLMs, such as GPT-3.5-turbo and Gemini-pro, show that our IR method enhances the overall accuracy of factual reasoning by 27.33% and of mathematical proof by 31.43%, compared with traditional DR methods.

A Survey on Transformer Compression

no code implementations 5 Feb 2024 Yehui Tang, Yunhe Wang, Jianyuan Guo, Zhijun Tu, Kai Han, Hailin Hu, DaCheng Tao

Model compression methods reduce the memory and computational cost of Transformer, which is a necessary step to implement large language/vision models on practical devices.

Knowledge Distillation Model Compression +1

Poisson Process for Bayesian Optimization

no code implementations 5 Feb 2024 Xiaoxing Wang, Jiaxing Li, Chao Xue, Wei Liu, Weifeng Liu, Xiaokang Yang, Junchi Yan, DaCheng Tao

Bayesian Optimization (BO) is a sample-efficient black-box optimizer, and extensive methods have been proposed to model the response of the black-box function through a probabilistic surrogate, including the Tree-structured Parzen Estimator (TPE), random forest (SMAC), and Gaussian process (GP).

Bayesian Optimization Hyperparameter Optimization +2

Representation Surgery for Multi-Task Model Merging

1 code implementation 5 Feb 2024 Enneng Yang, Li Shen, Zhenyi Wang, Guibing Guo, Xiaojun Chen, Xingwei Wang, DaCheng Tao

That is, there is a significant discrepancy in the representation distribution between the merged and individual models, resulting in poor performance of merged MTL.

Computational Efficiency Multi-Task Learning

FreDF: Learning to Forecast in Frequency Domain

1 code implementation 4 Feb 2024 Hao Wang, Licheng Pan, Zhichao Chen, Degui Yang, Sen Zhang, Yifei Yang, Xinggao Liu, Haoxuan Li, DaCheng Tao

Time series modeling is uniquely challenged by the presence of autocorrelation in both historical and label sequences.

Time Series
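One concrete reading of "forecasting in the frequency domain" is to score predictions against labels on their FFT spectra rather than step by step in time. A hedged sketch of such an objective (illustrative; not FreDF's exact loss):

```python
import numpy as np

def frequency_domain_loss(pred, target):
    """Mean absolute error between the rFFT spectra of two real-valued
    sequences. Comparing whole spectra sidesteps the step-wise label
    autocorrelation that complicates direct time-domain regression."""
    return float(np.abs(np.fft.rfft(pred) - np.fft.rfft(target)).mean())

t = np.linspace(0.0, 1.0, 64, endpoint=False)
target = np.sin(2 * np.pi * 3 * t)
pred = np.sin(2 * np.pi * 3 * t + 0.1)      # slightly phase-shifted forecast
loss = frequency_domain_loss(pred, target)   # small but nonzero
```

Such a term is typically blended with an ordinary time-domain loss rather than used alone.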

Merging Multi-Task Models via Weight-Ensembling Mixture of Experts

1 code implementation 1 Feb 2024 Anke Tang, Li Shen, Yong Luo, Nan Yin, Lefei Zhang, DaCheng Tao

A notable challenge is mitigating the interference between parameters of different models, which can substantially deteriorate performance.

Fine-Grained Zero-Shot Learning: Advances, Challenges, and Prospects

1 code implementation 31 Jan 2024 Jingcai Guo, Zhijie Rao, Zhi Chen, Jingren Zhou, DaCheng Tao

To enrich the literature of this domain and provide a sound basis for its future development, in this paper, we present a broad review of recent advances for fine-grained analysis in ZSL.

Zero-Shot Learning

Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation

1 code implementation 31 Jan 2024 Maoyuan Ye, Jing Zhang, Juhua Liu, Chenyu Liu, BaoCai Yin, Cong Liu, Bo Du, DaCheng Tao

In terms of the AMG mode, Hi-SAM segments text stroke foreground masks initially, then samples foreground points for hierarchical text mask generation and achieves layout analysis in passing.

Hierarchical Text Segmentation Segmentation +1

Topology-aware Embedding Memory for Continual Learning on Expanding Networks

no code implementations 24 Jan 2024 Xikun Zhang, Dongjin Song, Yixin Chen, DaCheng Tao

Memory replay based techniques have shown great success for continual learning with incrementally accumulated Euclidean data.

Continual Learning

TD^2-Net: Toward Denoising and Debiasing for Dynamic Scene Graph Generation

no code implementations 23 Jan 2024 Xin Lin, Chong Shi, Yibing Zhan, Zuopeng Yang, Yaqi Wu, DaCheng Tao

To address the above problems, in this paper, we introduce a network named TD$^2$-Net that aims at denoising and debiasing for dynamic SGG.

Denoising Graph Generation +3

Revisiting Demonstration Selection Strategies in In-Context Learning

1 code implementation 22 Jan 2024 Keqin Peng, Liang Ding, Yancheng Yuan, Xuebo Liu, Min Zhang, Yuanxin Ouyang, DaCheng Tao

In this work, we first revisit the factors contributing to this variance from both data and model aspects, and find that the choice of demonstration is both data- and model-dependent.

In-Context Learning

Solving Continual Offline Reinforcement Learning with Decision Transformer

no code implementations 16 Jan 2024 Kaixin Huang, Li Shen, Chen Zhao, Chun Yuan, DaCheng Tao

We aim to investigate whether Decision Transformer (DT), another offline RL paradigm, can serve as a more suitable offline continual learner to address these issues.

Offline RL reinforcement-learning +1

GoMatching: A Simple Baseline for Video Text Spotting via Long and Short Term Matching

1 code implementation 13 Jan 2024 Haibin He, Maoyuan Ye, Jing Zhang, Juhua Liu, DaCheng Tao

In response to this issue, we propose to efficiently turn an off-the-shelf query-based image text spotter into a specialist on video and present a simple baseline termed GoMatching, which focuses the training efforts on tracking while maintaining strong recognition performance.

Text Detection Text Spotting

Intention Analysis Makes LLMs A Good Jailbreak Defender

1 code implementation 12 Jan 2024 Yuqi Zhang, Liang Ding, Lefei Zhang, DaCheng Tao

Aligning large language models (LLMs) with human values, particularly in the face of complex and stealthy jailbreak attacks, presents a formidable challenge.

OOP: Object-Oriented Programming Evaluation Benchmark for Large Language Models

1 code implementation 12 Jan 2024 Shuai Wang, Liang Ding, Li Shen, Yong Luo, Bo Du, DaCheng Tao

Advancing automated programming necessitates robust and comprehensive code generation benchmarks, yet current evaluation frameworks largely neglect object-oriented programming (OOP) in favor of functional programming (FP), e.g., HumanEval and MBPP.

Code Generation

Hi-Map: Hierarchical Factorized Radiance Field for High-Fidelity Monocular Dense Mapping

no code implementations 6 Jan 2024 Tongyan Hua, Haotian Bai, Zidong Cao, Ming Liu, DaCheng Tao, Lin Wang

In this paper, we introduce Hi-Map, a novel monocular dense mapping approach based on Neural Radiance Field (NeRF).

Depth Estimation

Sheared Backpropagation for Fine-tuning Foundation Models

no code implementations CVPR 2024 Zhiyuan Yu, Li Shen, Liang Ding, Xinmei Tian, Yixin Chen, DaCheng Tao

To address these challenges, we introduce PreBackRazor, a novel activation pruning scheme offering both computational and memory efficiency through a sparsified backpropagation strategy, which judiciously avoids unnecessary activation pruning, storage, and gradient computation.


PanGu-$π$: Enhancing Language Model Architectures via Nonlinearity Compensation

no code implementations 27 Dec 2023 Yunhe Wang, Hanting Chen, Yehui Tang, Tianyu Guo, Kai Han, Ying Nie, Xutao Wang, Hailin Hu, Zheyuan Bai, Yun Wang, Fangcheng Liu, Zhicheng Liu, Jianyuan Guo, Sinan Zeng, Yinchen Zhang, Qinghua Xu, Qun Liu, Jun Yao, Chao Xu, DaCheng Tao

We then demonstrate that the proposed approach is significantly effective for enhancing the model nonlinearity through carefully designed ablations; thus, we present a new efficient model architecture for modern LLMs, namely PanGu-$\pi$.

Language Modelling

Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion

1 code implementation 11 Dec 2023 Anke Tang, Li Shen, Yong Luo, Liang Ding, Han Hu, Bo Du, DaCheng Tao

At the upper level, we focus on learning a shared Concrete mask to identify the subspace, while at the inner level, model merging is performed to maximize the performance of the merged model.


Good Questions Help Zero-Shot Image Reasoning

1 code implementation 4 Dec 2023 Kaiwen Yang, Tao Shen, Xinmei Tian, Xiubo Geng, Chongyang Tao, DaCheng Tao, Tianyi Zhou

QVix enables a wider exploration of visual scenes, improving the LVLMs' reasoning accuracy and depth in tasks such as visual question answering and visual entailment.

Fine-Grained Image Classification Question Answering +2

Disentangled Interaction Representation for One-Stage Human-Object Interaction Detection

no code implementations 4 Dec 2023 Xubin Zhong, Changxing Ding, Yupeng Hu, DaCheng Tao

In this paper, we improve the performance of one-stage methods by enabling them to extract disentangled interaction representations.

Decoder Human-Object Interaction Detection +1

HandRefiner: Refining Malformed Hands in Generated Images by Diffusion-based Conditional Inpainting

1 code implementation 29 Nov 2023 Wenquan Lu, Yufei Xu, Jing Zhang, Chaoyue Wang, DaCheng Tao

Given a generated failed image due to malformed hands, we utilize ControlNet modules to re-inject such correct hand information.

One More Step: A Versatile Plug-and-Play Module for Rectifying Diffusion Schedule Flaws and Enhancing Low-Frequency Controls

no code implementations CVPR 2024 Minghui Hu, Jianbin Zheng, Chuanxia Zheng, Chaoyue Wang, DaCheng Tao, Tat-Jen Cham

By integrating a compact network and incorporating an additional simple yet effective step during inference, OMS elevates image fidelity and harmonizes the dichotomy between training and inference, while preserving original model parameters.


Task-Distributionally Robust Data-Free Meta-Learning

no code implementations 23 Nov 2023 Zixuan Hu, Li Shen, Zhenyi Wang, Yongxian Wei, Baoyuan Wu, Chun Yuan, DaCheng Tao

TDS leads to a biased meta-learner because of the skewed task distribution towards newly generated tasks.

Meta-Learning Model Selection

DA-STC: Domain Adaptive Video Semantic Segmentation via Spatio-Temporal Consistency

1 code implementation 22 Nov 2023 Zhe Zhang, Gaochang Wu, Jing Zhang, Chunhua Shen, DaCheng Tao, Tianyou Chai

To solve the challenge, we propose a novel DA-STC method for domain adaptive video semantic segmentation, which incorporates a bidirectional multi-level spatio-temporal fusion module and a category-aware spatio-temporal feature alignment module to facilitate consistent learning for domain-invariant features.

Representation Learning Segmentation +2

Optical Quantum Sensing for Agnostic Environments via Deep Learning

no code implementations 13 Nov 2023 Zeqiao Zhou, Yuxuan Du, Xu-Fei Yin, Shanshan Zhao, Xinmei Tian, DaCheng Tao

DQS incorporates two essential components: a Graph Neural Network (GNN) predictor and a trigonometric interpolation algorithm.

Graph Neural Network

Multimodal deep representation learning for quantum cross-platform verification

no code implementations 7 Nov 2023 Yang Qian, Yuxuan Du, Zhenliang He, Min-Hsiu Hsieh, DaCheng Tao

Cross-platform verification, a critical undertaking in the realm of early-stage quantum computing, endeavors to characterize the similarity of two imperfect quantum devices executing identical algorithms, utilizing minimal measurements.

Representation Learning

Zero-Shot Sharpness-Aware Quantization for Pre-trained Language Models

no code implementations 20 Oct 2023 Miaoxi Zhu, Qihuang Zhong, Li Shen, Liang Ding, Juhua Liu, Bo Du, DaCheng Tao

The key algorithm in solving ZSAQ is the SAM-SGA optimization, which aims to improve the quantization accuracy and model generalization via optimizing a minimax problem.

Language Modelling Quantization

Stochastic Optimization for Non-convex Problem with Inexact Hessian Matrix, Gradient, and Function

no code implementations 18 Oct 2023 Liu Liu, Xuanqing Liu, Cho-Jui Hsieh, DaCheng Tao

In this paper, we explore a family of stochastic TR and ARC methods that can simultaneously provide inexact computations of the Hessian matrix, gradient, and function values.

Second-order methods Stochastic Optimization

Diversifying the Mixture-of-Experts Representation for Language Models with Orthogonal Optimizer

no code implementations 15 Oct 2023 Boan Liu, Liang Ding, Li Shen, Keqin Peng, Yu Cao, Dazhao Cheng, DaCheng Tao

The Mixture of Experts (MoE) has emerged as a highly successful technique in deep learning, based on the principle of divide-and-conquer to maximize model capacity without significant additional computational cost.

Question Answering

Merging Experts into One: Improving Computational Efficiency of Mixture of Experts

1 code implementation 15 Oct 2023 Shwai He, Run-Ze Fan, Liang Ding, Li Shen, Tianyi Zhou, DaCheng Tao

Although a sparse Mixture of Experts (MoE) can reduce the cost by activating a small subset of parameters (e.g., one expert) for each input, its computation escalates significantly as the number of activated experts increases, limiting its practical utility.

Computational Efficiency
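The cost trade-off described in this abstract can be illustrated with a minimal top-k routing sketch. This is a hypothetical toy, not the paper's code: `moe_forward`, the gating matrix, and the random toy experts are all illustrative names, and each activated expert costs one matrix multiply, so compute grows linearly with `k`.

```python
import numpy as np

rng = np.random.default_rng(0)

def moe_forward(x, gate_w, experts, k=1):
    # Gate scores decide which experts handle this input.
    scores = x @ gate_w                      # shape: (num_experts,)
    top_k = np.argsort(scores)[-k:]          # indices of the k activated experts
    weights = np.exp(scores[top_k] - scores[top_k].max())
    weights /= weights.sum()                 # softmax over activated experts only
    # Cost grows linearly with k: one matmul per activated expert.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top_k))

d, num_experts = 8, 4
x = rng.normal(size=d)
gate_w = rng.normal(size=(d, num_experts))
experts = [rng.normal(size=(d, d)) for _ in range(num_experts)]

y_sparse = moe_forward(x, gate_w, experts, k=1)  # one expert: one matmul
y_dense = moe_forward(x, gate_w, experts, k=4)   # all experts: dense-cost forward
```

With `k=1` the forward pass is sparse; with `k=num_experts` it matches the cost of a dense layer of the same total size, which is the escalation the abstract refers to.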

Learn From Model Beyond Fine-Tuning: A Survey

1 code implementation 12 Oct 2023 Hongling Zheng, Li Shen, Anke Tang, Yong Luo, Han Hu, Bo Du, DaCheng Tao

LFM focuses on the research, modification, and design of FM based on the model interface, so as to better understand the model structure and weights (in a black box environment), and to generalize the model to downstream tasks.

Meta-Learning Model Editing

Revisiting Plasticity in Visual Reinforcement Learning: Data, Modules and Training Stages

1 code implementation 11 Oct 2023 Guozheng Ma, Lu Li, Sen Zhang, Zixuan Liu, Zhen Wang, Yixin Chen, Li Shen, Xueqian Wang, DaCheng Tao

Plasticity, the ability of a neural network to evolve with new data, is crucial for high-performance and sample-efficient visual reinforcement learning (VRL).

Data Augmentation reinforcement-learning

PointHR: Exploring High-Resolution Architectures for 3D Point Cloud Segmentation

1 code implementation 11 Oct 2023 Haibo Qiu, Baosheng Yu, Yixin Chen, DaCheng Tao

Significant progress has been made recently in point cloud segmentation utilizing an encoder-decoder framework, which initially encodes point clouds into low-resolution representations and subsequently decodes high-resolution predictions.

Decoder Point Cloud Segmentation +1

Parameter Efficient Multi-task Model Fusion with Partial Linearization

1 code implementation 7 Oct 2023 Anke Tang, Li Shen, Yong Luo, Yibing Zhan, Han Hu, Bo Du, Yixin Chen, DaCheng Tao

We demonstrate that our partial linearization technique enables a more effective fusion of multiple tasks into a single model, outperforming standard adapter tuning and task arithmetic alone.
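The "task arithmetic" baseline this abstract compares against can be sketched in a few lines. This is a generic illustration of task-vector merging, not the paper's partial-linearization method: `merge_by_task_arithmetic` and the toy weight vectors are hypothetical names for illustration.

```python
import numpy as np

def merge_by_task_arithmetic(theta0, finetuned, coeffs):
    # A task vector is the weight delta a fine-tuned model adds
    # on top of the shared pretrained initialization theta0.
    merged = theta0.copy()
    for theta_i, lam in zip(finetuned, coeffs):
        merged += lam * (theta_i - theta0)   # scaled task vector
    return merged

theta0 = np.zeros(4)                          # pretrained weights
finetuned = [np.array([1.0, 0.0, 0.0, 0.0]),  # model fine-tuned on task 1
             np.array([0.0, 2.0, 0.0, 0.0])]  # model fine-tuned on task 2
merged = merge_by_task_arithmetic(theta0, finetuned, coeffs=[0.5, 0.5])
# merged is [0.5, 1.0, 0.0, 0.0]
```

Fusion methods like the one in this paper refine how these deltas are computed or combined so that merging several tasks into a single model degrades each task less.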

Which mode is better for federated learning? Centralized or Decentralized

no code implementations 5 Oct 2023 Yan Sun, Li Shen, DaCheng Tao

Both centralized and decentralized approaches have shown excellent performance and great application value in federated learning (FL).

Federated Learning valid

Efficient Federated Prompt Tuning for Black-box Large Pre-trained Models

no code implementations 4 Oct 2023 Zihao Lin, Yan Sun, Yifan Shi, Xueqian Wang, Lifu Huang, Li Shen, DaCheng Tao

With the blowout development of pre-trained models (PTMs), the efficient tuning of these models for diverse downstream applications has emerged as a pivotal research concern.

AdaMerging: Adaptive Model Merging for Multi-Task Learning

1 code implementation 4 Oct 2023 Enneng Yang, Zhenyi Wang, Li Shen, Shiwei Liu, Guibing Guo, Xingwei Wang, DaCheng Tao

This approach aims to autonomously learn the coefficients for model merging, either in a task-wise or layer-wise manner, without relying on the original training data.

Multi-Task Learning

One for All: Towards Training One Graph Model for All Classification Tasks

1 code implementation 29 Sep 2023 Hao Liu, Jiarui Feng, Lecheng Kong, Ningyue Liang, DaCheng Tao, Yixin Chen, Muhan Zhang

For in-context learning on graphs, OFA introduces a novel graph prompting paradigm that appends prompting substructures to the input graph, which enables it to address varied tasks without fine-tuning.

Graph Classification Graph Learning +3

Unlikelihood Tuning on Negative Samples Amazingly Improves Zero-Shot Translation

1 code implementation 28 Sep 2023 Changtong Zan, Liang Ding, Li Shen, Yibin Lei, Yibing Zhan, Weifeng Liu, DaCheng Tao

Zero-shot translation (ZST), which is generally based on a multilingual neural machine translation model, aims to translate between unseen language pairs in training data.

Machine Translation Navigate +2

CINFormer: Transformer network with multi-stage CNN feature injection for surface defect segmentation

no code implementations 22 Sep 2023 Xiaoheng Jiang, Kaiyi Guo, Yang Lu, Feng Yan, Hao Liu, Jiale Cao, Mingliang Xu, DaCheng Tao

To address these issues, we propose a transformer network with multi-stage CNN (Convolutional Neural Network) feature injection for surface defect segmentation, which is a UNet-like structure named CINFormer.

Defect Detection

Global Context Aggregation Network for Lightweight Saliency Detection of Surface Defects

no code implementations 22 Sep 2023 Feng Yan, Xiaoheng Jiang, Yang Lu, Lisha Cui, Shupan Li, Jiale Cao, Mingliang Xu, DaCheng Tao

To this end, we develop a Global Context Aggregation Network (GCANet) for lightweight saliency detection of surface defects on the encoder-decoder structure.

Decoder Defect Detection +1

Graph Contrastive Learning Meets Graph Meta Learning: A Unified Method for Few-shot Node Tasks

1 code implementation 19 Sep 2023 Hao Liu, Jiarui Feng, Lecheng Kong, DaCheng Tao, Yixin Chen, Muhan Zhang

In our study, we first identify two crucial advantages of contrastive learning compared to meta learning, including (1) the comprehensive utilization of graph nodes and (2) the power of graph augmentations.

CoLA Contrastive Learning +3

FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup for Non-IID Data

no code implementations 18 Sep 2023 Hao Sun, Li Shen, Shixiang Chen, Jingwei Sun, Jing Li, Guangzhong Sun, DaCheng Tao

Federated learning is an emerging distributed machine learning method that enables a large number of clients to train a model without exchanging their local data.

Federated Learning Scheduling

DARC: Distribution-Aware Re-Coloring Model for Generalizable Nucleus Segmentation

1 code implementation 1 Sep 2023 Shengcong Chen, Changxing Ding, DaCheng Tao, Hao Chen

Second, we propose a new instance normalization method that is robust to the variation in foreground-background ratios.


Continual Learning From a Stream of APIs

no code implementations 31 Aug 2023 Enneng Yang, Zhenyi Wang, Li Shen, Nan Yin, Tongliang Liu, Guibing Guo, Xingwei Wang, DaCheng Tao

Next, we train the CL model by minimizing the gap between the responses of the CL model and the black-box API on synthetic data, to transfer the API's knowledge to the CL model.

Continual Learning

CktGNN: Circuit Graph Neural Network for Electronic Design Automation

1 code implementation 31 Aug 2023 Zehao Dong, Weidong Cao, Muhan Zhang, DaCheng Tao, Yixin Chen, Xuan Zhang

The electronic design automation of analog circuits has been a longstanding challenge in the integrated circuit field due to the huge design space and complex design trade-offs among circuit specifications.

Bayesian Optimization Graph Learning +1

MerA: Merging Pretrained Adapters For Few-Shot Learning

no code implementations 30 Aug 2023 Shwai He, Run-Ze Fan, Liang Ding, Li Shen, Tianyi Zhou, DaCheng Tao

Adapter tuning, which updates only a few parameters, has become a mainstream method for fine-tuning pretrained language models to downstream tasks.

Few-Shot Learning MRPC

Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models

no code implementations 29 Aug 2023 Qingyue Wang, Liang Ding, Yanan Cao, Zhiliang Tian, Shi Wang, DaCheng Tao, Li Guo

We evaluate our method on both open and closed LLMs, and the experiments on the widely-used public dataset show that our method can generate more consistent responses in a long-context conversation.

16k 8k +1

ShadowNet for Data-Centric Quantum System Learning

no code implementations 22 Aug 2023 Yuxuan Du, Yibo Yang, Tongliang Liu, Zhouchen Lin, Bernard Ghanem, DaCheng Tao

Understanding the dynamics of large quantum systems is hindered by the curse of dimensionality.

Quantum State Tomography

Towards Understanding the Generalizability of Delayed Stochastic Gradient Descent

no code implementations 18 Aug 2023 Xiaoge Deng, Li Shen, Shengwei Li, Tao Sun, Dongsheng Li, DaCheng Tao

Stochastic gradient descent (SGD) performed in an asynchronous manner plays a crucial role in training large-scale machine learning models.

DFedADMM: Dual Constraints Controlled Model Inconsistency for Decentralized Federated Learning

no code implementations 16 Aug 2023 Qinglun Li, Li Shen, Guanghao Li, Quanjun Yin, DaCheng Tao

To address the communication burden issues associated with federated learning (FL), decentralized federated learning (DFL) discards the central server and establishes a decentralized communication network, where each client communicates only with neighboring clients.

Federated Learning

Deformable Mixer Transformer with Gating for Multi-Task Learning of Dense Prediction

1 code implementation 10 Aug 2023 Yangyang Xu, Yibo Yang, Bernard Ghanem, Lefei Zhang, Du Bo, DaCheng Tao

In this work, we present a novel MTL model by combining both merits of deformable CNN and query-based Transformer with shared gating for multi-task learning of dense prediction.

Decoder Multi-Task Learning

Cross-modal & Cross-domain Learning for Unsupervised LiDAR Semantic Segmentation

no code implementations 5 Aug 2023 Yiyang Chen, Shanshan Zhao, Changxing Ding, Liyao Tang, Chaoyue Wang, DaCheng Tao

In recent years, cross-modal domain adaptation has been studied on the paired 2D image and 3D LiDAR data to ease the labeling costs for 3D LiDAR semantic segmentation (3DLSS) in the target domain.

Domain Adaptation LIDAR Semantic Segmentation +1

Neural Collapse Terminus: A Unified Solution for Class Incremental Learning and Its Variants

2 code implementations 3 Aug 2023 Yibo Yang, Haobo Yuan, Xiangtai Li, Jianlong Wu, Lefei Zhang, Zhouchen Lin, Philip Torr, DaCheng Tao, Bernard Ghanem

Beyond the normal case, long-tail class incremental learning and few-shot class incremental learning are also proposed to consider the data imbalance and data scarcity, respectively, which are common in real-world implementations and further exacerbate the well-known problem of catastrophic forgetting.

Few-Shot Class-Incremental Learning Incremental Learning

Efficient Federated Learning via Local Adaptive Amended Optimizer with Linear Speedup

no code implementations 30 Jul 2023 Yan Sun, Li Shen, Hao Sun, Liang Ding, DaCheng Tao

Adaptive optimization has achieved notable success for distributed learning, but extending adaptive optimizers to federated learning (FL) suffers from severe inefficiency, including (i) rugged convergence due to inaccurate gradient estimation in the global adaptive optimizer, and (ii) client drifts exacerbated by local over-fitting with the local adaptive optimizer.

Federated Learning

PNT-Edge: Towards Robust Edge Detection with Noisy Labels by Learning Pixel-level Noise Transitions

1 code implementation 26 Jul 2023 Wenjie Xuan, Shanshan Zhao, Yu Yao, Juhua Liu, Tongliang Liu, Yixin Chen, Bo Du, DaCheng Tao

Exploiting the estimated noise transitions, our model, named PNT-Edge, is able to fit the prediction to clean labels.

Edge Detection

Understanding Deep Neural Networks via Linear Separability of Hidden Layers

no code implementations 26 Jul 2023 Chao Zhang, Xinyu Chen, Wensheng Li, Lixue Liu, Wei Wu, DaCheng Tao

In this paper, we measure the linear separability of hidden layer outputs to study the characteristics of deep neural networks.

Patch-Wise Point Cloud Generation: A Divide-and-Conquer Approach

1 code implementation 22 Jul 2023 Cheng Wen, Baosheng Yu, Rao Fu, DaCheng Tao

A generative model for high-fidelity point clouds is of great importance in synthesizing 3D environments for applications such as autonomous driving and robotics.

Autonomous Driving Point Cloud Generation

Heterogeneous Federated Learning: State-of-the-art and Research Challenges

2 code implementations 20 Jul 2023 Mang Ye, Xiuwen Fang, Bo Du, Pong C. Yuen, DaCheng Tao

Therefore, a systematic survey of the research challenges and the state-of-the-art on this topic is essential.

Federated Learning

Image Captions are Natural Prompts for Text-to-Image Models

1 code implementation 17 Jul 2023 Shiye Lei, Hao Chen, Sen Zhang, Bo Zhao, DaCheng Tao

With the rapid development of Artificial Intelligence Generated Content (AIGC), it has become common practice in many learning tasks to train or fine-tune large models on synthetic data due to the data-scarcity and privacy leakage problems.

Image Captioning Image Generation

Systematic Investigation of Sparse Perturbed Sharpness-Aware Minimization Optimizer

1 code implementation 30 Jun 2023 Peng Mi, Li Shen, Tianhe Ren, Yiyi Zhou, Tianshuo Xu, Xiaoshuai Sun, Tongliang Liu, Rongrong Ji, DaCheng Tao

Sharpness-Aware Minimization (SAM) is a popular solution that smooths the loss landscape by minimizing the maximized change of training loss when adding a perturbation to the weight.
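The two-step procedure described in this abstract (maximize the loss change under a bounded weight perturbation, then descend) can be sketched on a toy problem. This is a minimal illustration of the standard SAM update, not the paper's sparse-perturbation variant; `sam_step`, the learning rate, and the quadratic toy loss are all illustrative assumptions.

```python
import numpy as np

def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One Sharpness-Aware Minimization step:
    1) ascend to the (first-order) worst-case perturbation inside
       an L2 ball of radius rho,
    2) descend using the gradient evaluated at the perturbed point."""
    g = grad_fn(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # worst-case perturbation
    g_sharp = grad_fn(w + eps)                   # gradient at perturbed weights
    return w - lr * g_sharp

# Toy quadratic loss L(w) = 0.5 * ||w||^2, whose gradient is w itself.
grad_fn = lambda w: w
w = np.array([1.0, -2.0])
for _ in range(100):
    w = sam_step(w, grad_fn)
# w has converged to a small neighborhood of the minimum at the origin
```

Sparse-perturbation variants, as studied in this paper, restrict `eps` to a subset of coordinates rather than perturbing every weight.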

GraMMaR: Ground-aware Motion Model for 3D Human Motion Reconstruction

1 code implementation 29 Jun 2023 Sihan Ma, Qiong Cao, Hongwei Yi, Jing Zhang, DaCheng Tao

Demystifying complex human-ground interactions is essential for accurate and realistic 3D human motion reconstruction from RGB videos, as it ensures consistency between the humans and the ground plane.

FHA-Kitchens: A Novel Dataset for Fine-Grained Hand Action Recognition in Kitchen Scenes

1 code implementation 19 Jun 2023 Ting Zhe, YongQian Li, Jing Zhang, Yong Luo, Han Hu, Bo Du, Yonggang Wen, DaCheng Tao

We represent the action information in each hand interaction region as a triplet, resulting in a total of 878 action triplets.

Action Recognition Domain Generalization +3

Structured Cooperative Learning with Graphical Model Priors

1 code implementation 16 Jun 2023 Shuangtong Li, Tianyi Zhou, Xinmei Tian, DaCheng Tao

We propose "Structured Cooperative Learning (SCooL)", in which a cooperation graph across devices is generated by a graphical-model prior, so as to automatically coordinate mutual learning between devices.

Stochastic Block Model Variational Inference

Learning Geometric Transformation for Point Cloud Completion

2 code implementations International Journal of Computer Vision 2023 Shengping Zhang, Xianzhu Liu, Haozhe Xie, Liqiang Nie, Huiyu Zhou, DaCheng Tao, Xuelong Li

It exploits the repetitive geometric structures in common 3D objects to recover the complete shapes, which contains three sub-networks: geometric patch network, structure transformation network, and detail refinement network.

Decoder Point Cloud Completion

Transition Role of Entangled Data in Quantum Machine Learning

1 code implementation 6 Jun 2023 Xinbiao Wang, Yuxuan Du, Zhuozhuo Tu, Yong Luo, Xiao Yuan, DaCheng Tao

Recent progress has highlighted its positive impact on learning quantum dynamics, wherein the integration of entanglement into quantum operations or measurements of quantum machine learning (QML) models leads to substantial reductions in training data size, surpassing a specified prediction error threshold.

Quantum Machine Learning

Extending the Design Space of Graph Neural Networks by Rethinking Folklore Weisfeiler-Lehman

1 code implementation NeurIPS 2023 Jiarui Feng, Lecheng Kong, Hao Liu, DaCheng Tao, Fuhai Li, Muhan Zhang, Yixin Chen

We theoretically prove that even if we fix the space complexity to $O(n^k)$ (for any $k\geq 2$) in $(k, t)$-FWL, we can construct an expressiveness hierarchy up to solving the graph isomorphism problem.

Graph Regression

MotionTrack: Learning Motion Predictor for Multiple Object Tracking

no code implementations 5 Jun 2023 Changcheng Xiao, Qiong Cao, Yujie Zhong, Long Lan, Xiang Zhang, Zhigang Luo, DaCheng Tao

This challenge arises from two main factors: the insufficient discriminability of ReID features and the predominant utilization of linear motion models in MOT.

motion prediction Multi-Object Tracking +2

Decentralized SGD and Average-direction SAM are Asymptotically Equivalent

1 code implementation 5 Jun 2023 Tongtian Zhu, Fengxiang He, KaiXuan Chen, Mingli Song, DaCheng Tao

Decentralized stochastic gradient descent (D-SGD) allows collaborative learning on massive devices simultaneously without the control of a central server.

Unsupervised Dense Retrieval with Relevance-Aware Contrastive Pre-Training

1 code implementation 5 Jun 2023 Yibin Lei, Liang Ding, Yu Cao, Changtong Zan, Andrew Yates, DaCheng Tao

Dense retrievers have achieved impressive performance, but their demand for abundant training data limits their application scenarios.

Contrastive Learning Retrieval

Collect-and-Distribute Transformer for 3D Point Cloud Analysis

1 code implementation 2 Jun 2023 Haibo Qiu, Baosheng Yu, DaCheng Tao

In this paper, we propose a new transformer network equipped with a collect-and-distribute mechanism to communicate short- and long-range contexts of point clouds, which we refer to as CDFormer.

Point Cloud Classification Position

Cocktail: Mixing Multi-Modality Controls for Text-Conditional Image Generation

no code implementations 1 Jun 2023 Minghui Hu, Jianbin Zheng, Daqing Liu, Chuanxia Zheng, Chaoyue Wang, DaCheng Tao, Tat-Jen Cham

In this work, we propose Cocktail, a pipeline to mix various modalities into one embedding, amalgamated with a generalized ControlNet (gControlNet), a controllable normalisation (ControlNorm), and a spatial guidance sampling method, to actualize multi-modal and spatially-refined control for text-conditional diffusion models.

Conditional Image Generation

Divide, Conquer, and Combine: Mixture of Semantic-Independent Experts for Zero-Shot Dialogue State Tracking

no code implementations 1 Jun 2023 Qingyue Wang, Liang Ding, Yanan Cao, Yibing Zhan, Zheng Lin, Shi Wang, DaCheng Tao, Li Guo

Zero-shot transfer learning for Dialogue State Tracking (DST) helps to handle a variety of task-oriented dialogue domains without the cost of collecting in-domain data.

Dialogue State Tracking Transfer Learning

DeepSolo++: Let Transformer Decoder with Explicit Points Solo for Multilingual Text Spotting

1 code implementation 31 May 2023 Maoyuan Ye, Jing Zhang, Shanshan Zhao, Juhua Liu, Tongliang Liu, Bo Du, DaCheng Tao

In this paper, we present DeepSolo++, a simple DETR-like baseline that lets a single decoder with explicit points solo for text detection, recognition, and script identification simultaneously.

Decoder Scene Text Detection +2

Learning to Learn from APIs: Black-Box Data-Free Meta-Learning

1 code implementation 28 May 2023 Zixuan Hu, Li Shen, Zhenyi Wang, Baoyuan Wu, Chun Yuan, DaCheng Tao

Data-free meta-learning (DFML) aims to enable efficient learning of new tasks by meta-learning from a collection of pre-trained models without access to the training data.

Few-Shot Learning Knowledge Distillation

Self-Evolution Learning for Discriminative Language Model Pretraining

1 code implementation 24 May 2023 Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du, DaCheng Tao

Masked language modeling, widely used in discriminative language model (e.g., BERT) pretraining, commonly adopts a random masking strategy.

Language Modelling Masked Language Modeling +1
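The random masking strategy this abstract refers to can be shown in a few lines. This is a generic BERT-style sketch, not the paper's self-evolution method: `random_mask`, the 15% rate, and the `MASK_ID` value are illustrative assumptions (real pipelines also sometimes substitute random tokens or keep the original instead of always inserting [MASK]).

```python
import numpy as np

rng = np.random.default_rng(0)
MASK_ID = 103  # hypothetical [MASK] token id, as in common BERT vocabularies

def random_mask(token_ids, mask_prob=0.15):
    """Select each token independently with probability mask_prob and
    replace it with [MASK]; the selected positions become the targets
    the model must recover during pretraining."""
    ids = np.array(token_ids)
    selected = rng.random(ids.shape) < mask_prob   # uniform random selection
    masked = np.where(selected, MASK_ID, ids)
    return masked, selected

tokens = list(range(1000, 1020))       # a toy sequence of 20 token ids
masked, targets = random_mask(tokens)
```

Because the selection is uniform, easy and hard tokens are masked with equal probability at every training stage, which is the limitation papers like this one aim to improve on.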

Revisiting Token Dropping Strategy in Efficient BERT Pretraining

1 code implementation 24 May 2023 Qihuang Zhong, Liang Ding, Juhua Liu, Xuebo Liu, Min Zhang, Bo Du, DaCheng Tao

Token dropping is a recently-proposed strategy to speed up the pretraining of masked language models, such as BERT, by skipping the computation of a subset of the input tokens at several middle layers.

Towards More Suitable Personalization in Federated Learning via Decentralized Partial Model Training

no code implementations 24 May 2023 Yifan Shi, Yingqi Liu, Yan Sun, Zihao Lin, Li Shen, Xueqian Wang, DaCheng Tao

Personalized federated learning (PFL) aims to produce the best personalized model for each client in the face of an insurmountable problem in real FL systems: data heterogeneity.

Personalized Federated Learning

Improving Heterogeneous Model Reuse by Density Estimation

1 code implementation 23 May 2023 Anke Tang, Yong Luo, Han Hu, Fengxiang He, Kehua Su, Bo Du, Yixin Chen, DaCheng Tao

This paper studies multiparty learning, aiming to learn a model using the private data of different participants.

Density Estimation Selection bias

VanillaNet: the Power of Minimalism in Deep Learning

4 code implementations NeurIPS 2023 Hanting Chen, Yunhe Wang, Jianyuan Guo, DaCheng Tao

In this study, we introduce VanillaNet, a neural network architecture that embraces elegance in design.


Self-Evolution Learning for Mixup: Enhance Data Augmentation on Few-Shot Text Classification Tasks

no code implementations 22 May 2023 Haoqi Zheng, Qihuang Zhong, Liang Ding, Zhiliang Tian, Xin Niu, Dongsheng Li, DaCheng Tao

However, most mixup methods do not consider the varying degree of learning difficulty at different stages of training, and they generate new samples with one-hot labels, resulting in model overconfidence.

Data Augmentation Few-Shot Text Classification +1
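For reference, the vanilla mixup the abstract critiques interpolates both inputs and one-hot labels. This is a minimal sketch of standard mixup, not the paper's self-evolution variant; the function name, `alpha`, and the toy examples are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def mixup(x1, y1, x2, y2, alpha=0.2):
    """Standard mixup: draw lambda from Beta(alpha, alpha) and linearly
    interpolate a pair of examples and their one-hot labels, producing
    a soft (non-one-hot) training target."""
    lam = rng.beta(alpha, alpha)
    x = lam * x1 + (1 - lam) * x2
    y = lam * y1 + (1 - lam) * y2
    return x, y

x_a, y_a = np.ones(4), np.array([1.0, 0.0])   # class-0 example
x_b, y_b = np.zeros(4), np.array([0.0, 1.0])  # class-1 example
x_mix, y_mix = mixup(x_a, y_a, x_b, y_b)      # soft label [lam, 1 - lam]
```

Note the mixed label is a proper distribution over the two classes; methods like the one above replace or reweight these targets to account for how hard each sample is at the current training stage.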

Dynamic Regularized Sharpness Aware Minimization in Federated Learning: Approaching Global Consistency and Smooth Landscape

1 code implementation 19 May 2023 Yan Sun, Li Shen, Shixiang Chen, Liang Ding, DaCheng Tao

In federated learning (FL), a cluster of local clients are chaired under the coordination of the global server and cooperatively train one model with privacy protection.

Federated Learning

Prompt-Tuning Decision Transformer with Preference Ranking

no code implementations 16 May 2023 Shengchao Hu, Li Shen, Ya Zhang, DaCheng Tao

Our work contributes to the advancement of prompt-tuning approaches in RL, providing a promising direction for optimizing large RL agents for specific preference tasks.

MMoT: Mixture-of-Modality-Tokens Transformer for Composed Multimodal Conditional Image Synthesis

no code implementations 10 May 2023 Jianbin Zheng, Daqing Liu, Chaoyue Wang, Minghui Hu, Zuopeng Yang, Changxing Ding, DaCheng Tao

To this end, we propose to generate images conditioned on the compositions of multimodal control signals, where modalities are imperfectly complementary, i.e., composed multimodal conditional image synthesis (CMCIS).

Image Generation

Revolutionizing Agrifood Systems with Artificial Intelligence: A Survey

no code implementations 3 May 2023 Tao Chen, Liang Lv, Di Wang, Jing Zhang, Yue Yang, Zeyang Zhao, Chen Wang, Xiaowei Guo, Hao Chen, Qingye Wang, Yufei Xu, Qiming Zhang, Bo Du, Liangpei Zhang, DaCheng Tao

With the world population rapidly increasing, transforming our agrifood systems to be more productive, efficient, safe, and sustainable is crucial to mitigate potential food shortages.

SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model

2 code implementations NeurIPS 2023 Di Wang, Jing Zhang, Bo Du, Minqiang Xu, Lin Liu, DaCheng Tao, Liangpei Zhang

In this study, we leverage SAM and existing RS object detection datasets to develop an efficient pipeline for generating a large-scale RS segmentation dataset, dubbed SAMRS.

Instance Segmentation Object +4

Scalable Mask Annotation for Video Text Spotting

1 code implementation 2 May 2023 Haibin He, Jing Zhang, Mengyang Xu, Juhua Liu, Bo Du, DaCheng Tao

Video text spotting refers to localizing, recognizing, and tracking textual elements such as captions, logos, license plates, signs, and other forms of text within consecutive video frames.

Text Spotting

Towards the Flatter Landscape and Better Generalization in Federated Learning under Client-level Differential Privacy

1 code implementation 1 May 2023 Yifan Shi, Kang Wei, Li Shen, Yingqi Liu, Xueqian Wang, Bo Yuan, DaCheng Tao

To defend the inference attacks and mitigate the sensitive information leakages in Federated Learning (FL), client-level Differentially Private FL (DPFL) is the de-facto standard for privacy protection by clipping local updates and adding random noise.

Federated Learning

Deep Graph Reprogramming

no code implementations CVPR 2023 Yongcheng Jing, Chongbin Yuan, Li Ju, Yiding Yang, Xinchao Wang, DaCheng Tao

In this paper, we explore a novel model reusing task tailored for graph neural networks (GNNs), termed as "deep graph reprogramming".

3D Object Recognition Action Recognition +1