Search Results for author: Bo Du

Found 221 papers, 141 papers with code

FuzzyLight: A Robust Two-Stage Fuzzy Approach for Traffic Signal Control Works in Real Cities

no code implementations • 27 Jan 2025 • Mingyuan Li, Jiahao Wang, Bo Du, Jun Shen, Qiang Wu

FuzzyLight offers several key contributions: (1) It employs fuzzy logic and compressed sensing to address sensor noise and enhances the efficiency of TSP decisions.

Reinforcement Learning (RL) Traffic Signal Control

UniUIR: Considering Underwater Image Restoration as An All-in-One Learner

no code implementations • 22 Jan 2025 • Xu Zhang, Huan Zhang, Guoli Wang, Qian Zhang, Lefei Zhang, Bo Du

Existing underwater image restoration (UIR) methods generally only handle color distortion or jointly address color and haze issues, but they often overlook the more complex degradations that can occur in underwater scenes.

Depth Estimation Depth Prediction +2

MIFNet: Learning Modality-Invariant Features for Generalizable Multimodal Image Matching

no code implementations • 20 Jan 2025 • Yepeng Liu, Zhichao Sun, Baosheng Yu, Yitian Zhao, Bo Du, Yongchao Xu, Jun Cheng

Extending such methods to multimodal image matching often requires well-aligned multimodal data to learn modality-invariant descriptors.

Keypoint Detection Zero-shot Generalization

Merging Models on the Fly Without Retraining: A Sequential Approach to Scalable Continual Model Merging

1 code implementation • 16 Jan 2025 • Anke Tang, Enneng Yang, Li Shen, Yong Luo, Han Hu, Bo Du, DaCheng Tao

In this study, we propose a training-free projection-based continual merging method that processes models sequentially through orthogonal projections of weight matrices and adaptive scaling mechanisms.

MambaHSI: Spatial-Spectral Mamba for Hyperspectral Image Classification

1 code implementation • 9 Jan 2025 • Yapeng Li, Yong Luo, Lefei Zhang, Zengmao Wang, Bo Du

To remedy these drawbacks, we propose a novel HSI classification model based on a Mamba model, named MambaHSI, which can simultaneously model long-range interaction of the whole image and integrate spatial and spectral information in an adaptive manner.

Classification Hyperspectral Image Classification +1

Color Correction Meets Cross-Spectral Refinement: A Distribution-Aware Diffusion for Underwater Image Restoration

no code implementations • 8 Jan 2025 • Laibin Chang, Yunke Wang, Bo Du, Chang Xu

For the sacrificed image details caused by underwater scattering, we further present the Cross-Spectral Detail Refinement (CSDR) to enhance the high-frequency details, which are integrated with the low-frequency signal as input conditions for guiding the diffusion.

Computational Efficiency Denoising +3

Efficient Relational Context Perception for Knowledge Graph Completion

no code implementations • 31 Dec 2024 • Wenkai Tu, Guojia Wan, Zhengchun Shang, Bo Du

These approaches also assign a single static embedding to each entity and relation, disregarding the fact that entities and relations can exhibit different behaviors in varying graph contexts.

Knowledge Graph Embedding Link Prediction +2

Detect Changes like Humans: Incorporating Semantic Priors for Improved Change Detection

no code implementations • 22 Dec 2024 • Yuhang Gan, Wenjie Xuan, Zhiming Luo, Lei Fang, Zengmao Wang, Juhua Liu, Bo Du

Thus, these methods primarily emphasize the difference-aware features between bi-temporal images and neglect the semantic understanding of the changed landscapes, which undermines the accuracy in the presence of noise and illumination variations.

Change Detection Semantic Segmentation

Mesoscopic Insights: Orchestrating Multi-scale & Hybrid Architecture for Image Manipulation Localization

2 code implementations • 18 Dec 2024 • Xuekang Zhu, Xiaochen Ma, Lei Su, Zhuohang Jiang, Bo Du, Xiwen Wang, Zeyu Lei, Wentao Feng, Chi-Man Pun, Jizhe Zhou

Inspired by this, our paper explores how to simultaneously construct mesoscopic representations of micro and macro information for IML and introduces the Mesorch architecture to orchestrate both.

Image Manipulation Image Manipulation Localization

CogNav: Cognitive Process Modeling for Object Goal Navigation with LLMs

no code implementations • 11 Dec 2024 • Yihan Cao, Jiazhao Zhang, Zhinan Yu, Shuzhen Liu, Zheng Qin, Qin Zou, Bo Du, Kai Xu

Inspired by neuroscientific evidence that humans consistently update their cognitive states while searching for objects in unseen environments, we present CogNav, which attempts to model this cognitive process with the help of large language models.

Large Language Model

CellSeg1: Robust Cell Segmentation with One Training Image

1 code implementation • 2 Dec 2024 • Peilin Zhou, Bo Du, Yongchao Xu

We introduce CellSeg1, a practical solution for segmenting cells of arbitrary morphology and modality with a few dozen cell annotations in 1 image.

Cell Segmentation Segmentation

Privacy-Preserving Federated Foundation Model for Generalist Ultrasound Artificial Intelligence

no code implementations • 25 Nov 2024 • Yuncheng Jiang, Chun-Mei Feng, Jinke Ren, Jun Wei, Zixun Zhang, Yiwen Hu, Yunbi Liu, Rui Sun, Xuemei Tang, Juan Du, Xiang Wan, Yong Xu, Bo Du, Xin Gao, Guangyu Wang, Shaohua Zhou, Shuguang Cui, Rick Siow Mong Goh, Yong Liu, Zhen Li

Notably, UltraFedFM surpasses the diagnostic accuracy of mid-level ultrasonographers and matches the performance of expert-level sonographers in the joint diagnosis of 8 common systemic diseases.

Federated Learning Lesion Segmentation +1

Aligning Few-Step Diffusion Models with Dense Reward Difference Learning

1 code implementation • 18 Nov 2024 • Ziyi Zhang, Li Shen, Sen Zhang, Deheng Ye, Yong Luo, Miaojing Shi, Bo Du, DaCheng Tao

Experimental results demonstrate that SDPO consistently outperforms prior methods in reward-based alignment across diverse step configurations, underscoring its robust step generalization capabilities.

Denoising

Learn from Downstream and Be Yourself in Multimodal Large Language Model Fine-Tuning

2 code implementations • 17 Nov 2024 • Wenke Huang, Jian Liang, Zekun Shi, Didi Zhu, Guancheng Wan, He Li, Bo Du, DaCheng Tao, Mang Ye

To balance the trade-off between generalization and specialization, we propose measuring the parameter importance for both pre-trained and fine-tuning distributions, based on frozen pre-trained weight magnitude and accumulated fine-tuning gradient values.

Image Captioning Language Modeling +5

Stability and Generalization for Distributed SGDA

no code implementations • 14 Nov 2024 • Miaoxi Zhu, Yan Sun, Li Shen, Bo Du, DaCheng Tao

Our theoretical results reveal the trade-off between the generalization gap and optimization error and suggest hyperparameter choices to obtain the optimal population risk.

CrossEarth: Geospatial Vision Foundation Model for Domain Generalizable Remote Sensing Semantic Segmentation

1 code implementation • 30 Oct 2024 • Ziyang Gong, Zhixiang Wei, Di Wang, Xianzheng Ma, Hongruixuan Chen, Yuru Jia, Yupeng Deng, Zhenming Ji, Xiangwei Zhu, Naoto Yokoya, Jing Zhang, Bo Du, Liangpei Zhang

The field of Remote Sensing Domain Generalization (RSDG) has emerged as a critical and valuable research frontier, focusing on developing models that generalize effectively across diverse scenarios.

Domain Generalization Segmentation +1

Efficient and Effective Weight-Ensembling Mixture of Experts for Multi-Task Model Merging

no code implementations • 29 Oct 2024 • Li Shen, Anke Tang, Enneng Yang, Guibing Guo, Yong Luo, Lefei Zhang, Xiaochun Cao, Bo Du, DaCheng Tao

Building on WEMoE, we further introduce an efficient-and-effective WEMoE (E-WEMoE) method, whose core mechanism involves eliminating non-essential elements in the critical modules of WEMoE and implementing shared routing across multiple MoE modules, thereby significantly reducing the trainable parameters, the overall parameter count, and the computational overhead of the model merged by WEMoE.

Task Arithmetic

What If the Input is Expanded in OOD Detection?

1 code implementation • 24 Oct 2024 • Boxuan Zhang, Jianing Zhu, Zengmao Wang, Tongliang Liu, Bo Du, Bo Han

Based on that, we formalize a new scoring method, namely, Confidence aVerage (CoVer), which can capture the dynamic differences by simply averaging the scores obtained from different corrupted inputs and the original ones, making the OOD and ID distributions more separable in detection tasks.

Out of Distribution (OOD) Detection
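
The averaging idea in the CoVer snippet above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of maximum softmax probability as the base score, and the toy corruption functions are all assumptions made for illustration.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def max_softmax(logits: Vector) -> float:
    """Maximum softmax probability, used here as an assumed base confidence score."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]  # shift by the max for numerical stability
    return max(exps) / sum(exps)


def averaged_confidence(
    x: Vector,
    model: Callable[[Vector], Vector],
    corruptions: List[Callable[[Vector], Vector]],
    score: Callable[[Vector], float] = max_softmax,
) -> float:
    """Average the confidence score over the original input and its corrupted views.

    `model` maps an input to logits; `corruptions` is a hypothetical list of
    input-corruption functions. A higher average is read as "more in-distribution".
    """
    views = [x] + [corrupt(x) for corrupt in corruptions]
    scores = [score(model(view)) for view in views]
    return sum(scores) / len(scores)
```

A caller would pass a trained classifier as `model` and a few image corruptions as `corruptions`, then threshold the returned average to separate ID from OOD inputs; the threshold itself is a tuning choice not covered here.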

Text-Guided Multi-Property Molecular Optimization with a Diffusion Language Model

no code implementations • 17 Oct 2024 • Yida Xiong, Kun Li, Weiwei Liu, Jia Wu, Bo Du, Shirui Pan, Wenbin Hu

TransDLM leverages standardized chemical nomenclature as semantic representations of molecules and implicitly embeds property requirements into textual descriptions, thereby preventing error propagation during the diffusion process.

Drug Discovery Language Modeling +2

Learning from Imperfect Data: Towards Efficient Knowledge Distillation of Autoregressive Language Models for Text-to-SQL

no code implementations • 15 Oct 2024 • Qihuang Zhong, Kunfeng Chen, Liang Ding, Juhua Liu, Bo Du, DaCheng Tao

Large Language Models (LLMs) have shown promising performance in text-to-SQL, which involves translating natural language questions into SQL queries.

Knowledge Distillation Text-To-SQL

Tracking Everything in Robotic-Assisted Surgery

no code implementations • 29 Sep 2024 • Bohan Zhan, Wang Zhao, Yi Fang, Bo Du, Francisco Vasconcelos, Danail Stoyanov, Daniel S. Elson, Baoru Huang

However, its efficacy in surgical scenarios remains untested, largely due to the lack of a comprehensive surgical tracking dataset for evaluation.

Benchmarking

Shape-intensity knowledge distillation for robust medical image segmentation

1 code implementation • 26 Sep 2024 • Wenhui Dong, Bo Du, Yongchao Xu

In this paper, we propose a novel approach to incorporate joint shape-intensity prior information into the segmentation network.

Image Segmentation Knowledge Distillation +3

Progressive Retinal Image Registration via Global and Local Deformable Transformations

1 code implementation • 2 Sep 2024 • Yepeng Liu, Baosheng Yu, Tian Chen, Yuliang Gu, Bo Du, Yongchao Xu, Jun Cheng

For that, we use a keypoint detector and a deformation network called GAMorph to estimate the global transformation and local deformable transformation, respectively.

Image Registration

SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models

1 code implementation • 19 Aug 2024 • Anke Tang, Li Shen, Yong Luo, Shuai Xie, Han Hu, Lefei Zhang, Bo Du, DaCheng Tao

Deep model training on extensive datasets is increasingly becoming cost-prohibitive, prompting the widespread adoption of deep model fusion techniques to leverage knowledge from pre-existing models.

Image Classification Text Generation

Fragment-Masked Molecular Optimization

no code implementations • 17 Aug 2024 • Kun Li, Xiantao Cai, Jia Wu, Bo Du, Wenbin Hu

Molecular optimization is a crucial aspect of drug discovery, aimed at refining molecular structures to enhance drug efficacy and minimize side effects, ultimately accelerating the overall drug development process.

Drug Discovery

Co-Fix3D: Enhancing 3D Object Detection with Collaborative Refinement

1 code implementation • 15 Aug 2024 • Wenxuan Li, Qin Zou, Chi Chen, Bo Du, Long Chen, Jian Zhou, Hongkai Yu

3D object detection in driving scenarios faces the challenge of complex road environments, which can lead to the loss or incompleteness of key features, thereby affecting perception performance.

3D Object Detection Autonomous Driving +1

FADE: A Dataset for Detecting Falling Objects around Buildings in Video

1 code implementation • 11 Aug 2024 • Zhigang Tu, Zitao Gao, Zhengbo Zhang, Chunluan Zhou, Junsong Yuan, Bo Du

Falling objects from buildings can cause severe injuries to pedestrians due to the great impact force they exert.

Moving Object Detection Object +2

TextIM: Part-aware Interactive Motion Synthesis from Text

no code implementations • 6 Aug 2024 • Siyuan Fan, Bo Du, Xiantao Cai, Bo Peng, Longling Sun

In this work, we propose TextIM, a novel framework for synthesizing TEXT-driven human Interactive Motions, with a focus on the precise alignment of part-level semantics.

Motion Synthesis

SAT3D: Image-driven Semantic Attribute Transfer in 3D

no code implementations • 3 Aug 2024 • Zhijun Zhai, Zengmao Wang, Xiaoxiao Long, Kaixuan Zhou, Bo Du

In this paper, we propose an image-driven Semantic Attribute Transfer method in 3D (SAT3D) by editing semantic attributes from a reference image.

Attribute Reading Comprehension

M^3:Manipulation Mask Manufacturer for Arbitrary-Scale Super-Resolution Mask

no code implementations • 4 Jul 2024 • Xinyu Yang, Xiaochen Ma, Xuekang Zhu, Bo Du, Lei Su, Bingkui Tong, Zeyu Lei, Jizhe Zhou

Additionally, we created the Manipulation Mask Manufacturer Dataset (MMMD), a dataset that covers a wide range of manipulation techniques.

Change Detection Image Forensics +3

Iterative Data Generation with Large Language Models for Aspect-based Sentiment Analysis

no code implementations • 29 Jun 2024 • Qihuang Zhong, Haiyun Li, Luyao Zhuang, Juhua Liu, Bo Du

Aspect-based Sentiment Analysis (ABSA) is an important sentiment analysis task, which aims to determine the sentiment polarity towards an aspect in a sentence.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +6

Federated Graph Semantic and Structural Learning

1 code implementation • 27 Jun 2024 • Wenke Huang, Guancheng Wan, Mang Ye, Bo Du

First, for node-level semantics, we find that contrasting nodes from distinct classes is beneficial to provide a well-performing discrimination.

Graph Learning Graph Neural Network

EmoLLM: Multimodal Emotional Understanding Meets Large Language Models

1 code implementation • 24 Jun 2024 • Qu Yang, Mang Ye, Bo Du

Experimental results demonstrate that EmoLLM significantly elevates multimodal emotional understanding performance, with an average improvement of 12.1% across multiple foundation models on EmoBench.

Emotional Intelligence

MAC: A Benchmark for Multiple Attributes Compositional Zero-Shot Learning

no code implementations • 18 Jun 2024 • Shuo Xu, Sai Wang, Xinyue Hu, Yutian Lin, Bo Du, Yu Wu

Compositional Zero-Shot Learning (CZSL) aims to learn semantic primitives (attributes and objects) from seen compositions and recognize unseen attribute-object compositions.

Attribute Compositional Zero-Shot Learning

HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model

1 code implementation • 17 Jun 2024 • Di Wang, Meiqi Hu, Yao Jin, Yuchun Miao, Jiaqi Yang, Yichu Xu, Xiaolei Qin, Jiaqi Ma, Lingyu Sun, Chenxing Li, Chuan Fu, Hongruixuan Chen, Chengxi Han, Naoto Yokoya, Jing Zhang, Minqiang Xu, Lin Liu, Lefei Zhang, Chen Wu, Bo Du, DaCheng Tao, Liangpei Zhang

To tackle the spectral and spatial redundancy challenges in HSIs, we introduce a novel sparse sampling attention (SSA) mechanism, which effectively promotes the learning of diverse contextual features and serves as the basic block of HyperSIGMA.

model

Bypass Back-propagation: Optimization-based Structural Pruning for Large Language Models via Policy Gradient

no code implementations • 15 Jun 2024 • Yuan Gao, Zujing Liu, Weizhong Zhang, Bo Du, Gui-Song Xia

We instead propose a novel optimization-based structural pruning that learns the pruning masks in a probabilistic space directly by optimizing the loss of the pruned model.

Network Pruning

Towards Efficient Pareto Set Approximation via Mixture of Experts Based Model Fusion

1 code implementation • 14 Jun 2024 • Anke Tang, Li Shen, Yong Luo, Shiwei Liu, Han Hu, Bo Du

Once the routers are learned and a preference vector is set, the MoE module can be unloaded, thus no additional computational cost is introduced during inference.

Multi-Task Learning

FusionBench: A Comprehensive Benchmark of Deep Model Fusion

1 code implementation • 5 Jun 2024 • Anke Tang, Li Shen, Yong Luo, Han Hu, Bo Du, DaCheng Tao

These techniques range from model ensemble methods, which combine the predictions to improve the overall performance, to model merging, which integrates different models into a single one, and model mixing methods, which upscale or recombine the components of the original models.

Image Classification model +4

MOKD: Cross-domain Finetuning for Few-shot Classification via Maximizing Optimized Kernel Dependence

1 code implementation • 29 May 2024 • Hongduan Tian, Feng Liu, Tongliang Liu, Bo Du, Yiu-ming Cheung, Bo Han

In cross-domain few-shot classification, the nearest centroid classifier (NCC) aims to learn representations to construct a metric space where few-shot classification can be performed by measuring the similarities between samples and the prototype of each class.

Cross-Domain Few-Shot
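
As an illustration of the nearest centroid classification step described in the snippet above, here is a minimal sketch. It assumes embeddings are already computed, uses cosine similarity as the similarity measure, and takes each class prototype to be the mean of that class's support embeddings; the function names are hypothetical and this is not the MOKD method itself.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def mean_vector(vectors: List[Vector]) -> List[float]:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_centroid(query: Vector, support: Dict[str, List[Vector]]) -> str:
    """Return the label whose prototype (mean support embedding) is most similar to the query."""
    prototypes = {label: mean_vector(examples) for label, examples in support.items()}
    return max(prototypes, key=lambda label: cosine_similarity(query, prototypes[label]))
```

In a few-shot episode, `support` would hold the handful of labeled embeddings per class and `query` an unlabeled embedding; the quality of the result depends entirely on the learned representation, which is what the paper targets.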

Vertical Federated Learning for Effectiveness, Security, Applicability: A Survey

1 code implementation • 25 May 2024 • Mang Ye, Wei Shen, Bo Du, Eduard Snezhko, Vassili Kovalev, Pong C. Yuen

Vertical Federated Learning (VFL) is a privacy-preserving distributed learning paradigm where different parties collaboratively learn models using partitioned features of shared samples, without leaking private data.

Privacy Preserving Survey +1

Regressor-free Molecule Generation to Support Drug Response Prediction

no code implementations • 23 May 2024 • Kun Li, Xiuwen Gong, Shirui Pan, Jia Wu, Bo Du, Wenbin Hu

As a result, we introduce regressor-free guidance molecule generation to ensure sampling within a more effective space and support DRP.

Common Sense Reasoning Drug Response Prediction +2

A Cross-Field Fusion Strategy for Drug-Target Interaction Prediction

no code implementations • 23 May 2024 • Hongzhi Zhang, Xiuwen Gong, Shirui Pan, Jia Wu, Bo Du, Wenbin Hu

In the drug development engineering field, predicting novel drug-target interactions is extremely crucial. However, although existing methods have achieved high accuracy levels in predicting known drugs and drug targets, they fail to utilize global protein information during DTI prediction.

Drug Discovery Prediction

Spatial-aware Attention Generative Adversarial Network for Semi-supervised Anomaly Detection in Medical Image

1 code implementation • 21 May 2024 • Zerui Zhang, Zhichao Sun, Zelong Liu, Bo Du, Rui Yu, Zhou Zhao, Yongchao Xu

Medical anomaly detection is a critical research area aimed at recognizing abnormal images to aid in diagnosis. Most existing methods adopt synthetic anomalies and image restoration on normal samples to detect anomalies.

Generative Adversarial Network Image Restoration +2

Hi-GMAE: Hierarchical Graph Masked Autoencoders

1 code implementation • 17 May 2024 • Chuang Liu, Zelin Yao, Yibing Zhan, Xueqi Ma, Dapeng Tao, Jia Wu, Wenbin Hu, Shirui Pan, Bo Du

To ensure masking uniformity of subgraphs across these scales, we propose a novel coarse-to-fine strategy that initiates masking at the coarsest scale and progressively back-projects the mask to the finer scales.

Graph Neural Network Self-Supervised Learning

LeMeViT: Efficient Vision Transformer with Learnable Meta Tokens for Remote Sensing Image Interpretation

1 code implementation • 16 May 2024 • Wentao Jiang, Jing Zhang, Di Wang, Qiming Zhang, Zengmao Wang, Bo Du

Experimental results in classification and dense prediction tasks show that LeMeViT has a significant $1.7 \times$ speedup, fewer parameters, and competitive performance compared to the baseline models, and achieves a better trade-off between efficiency and performance.

Separable Power of Classical and Quantum Learning Protocols Through the Lens of No-Free-Lunch Theorem

no code implementations • 12 May 2024 • Xinbiao Wang, Yuxuan Du, Kecheng Liu, Yong Luo, Bo Du, DaCheng Tao

The No-Free-Lunch (NFL) theorem, which quantifies problem- and data-independent generalization errors regardless of the optimization process, provides a foundational framework for comprehending diverse learning protocols' potential.

Attribute Quantum Machine Learning

Exploring Text-Guided Single Image Editing for Remote Sensing Images

1 code implementation • 9 May 2024 • Fangzhou Han, Lingyu Si, Hongwei Dong, Lamei Zhang, Hao Chen, Bo Du

However, the equally important area of remote sensing image (RSI) editing has not received sufficient attention.

Image Generation

All in One Framework for Multimodal Re-identification in the Wild

no code implementations • CVPR 2024 • He Li, Mang Ye, Ming Zhang, Bo Du

In Re-identification (ReID), recent advancements yield noteworthy progress in both unimodal and cross-modal retrieval tasks.

Cross-Modal Retrieval Domain Generalization +1

Improving Complex Reasoning over Knowledge Graph with Logic-Aware Curriculum Tuning

no code implementations • 2 May 2024 • Tianle Xia, Liang Ding, Guojia Wan, Yibing Zhan, Bo Du, DaCheng Tao

Specifically, we augment the arbitrary first-order logical queries via binary tree decomposition, to stimulate the reasoning capability of LLMs.

Knowledge Graphs Logical Reasoning +2

Visual Mamba: A Survey and New Outlooks

1 code implementation • 29 Apr 2024 • Rui Xu, Shu Yang, Yihui Wang, Yu Cai, Bo Du, Hao Chen

Mamba, a recent selective structured state space model, excels in long sequence modeling, which is vital in the large model era.

Mamba Survey

RFL-CDNet: Towards Accurate Change Detection via Richer Feature Learning

1 code implementation • 27 Apr 2024 • Yuhang Gan, Wenjie Xuan, Hang Chen, Juhua Liu, Bo Du

The C2FG module aims to seamlessly integrate the side prediction from the previous coarse scale into the current fine-scale prediction in a coarse-to-fine manner, while the LF module assumes that the contribution of each stage and each spatial location is independent, thus designing a learnable module to fuse multiple predictions.

Change Detection

Federated Learning with Only Positive Labels by Exploring Label Correlations

no code implementations • 24 Apr 2024 • Xuming An, Dui Wang, Li Shen, Yong Luo, Han Hu, Bo Du, Yonggang Wen, DaCheng Tao

Specifically, FedALC estimates the label correlations in the class embedding learning for different label pairs and utilizes it to improve the model training.

Federated Learning Multi-Label Classification +1

Achieving >97% on GSM8K: Deeply Understanding the Problems Makes LLMs Better Solvers for Math Word Problems

1 code implementation • 23 Apr 2024 • Qihuang Zhong, Kang Wang, Ziyang Xu, Juhua Liu, Liang Ding, Bo Du

To this end, we propose a simple-yet-effective method, namely Deeply Understanding the Problems (DUP), to improve the LLMs' math problem-solving ability by addressing semantic misunderstanding errors.

 Ranked #1 on Math Word Problem Solving on SVAMP (Accuracy metric)

Arithmetic Reasoning GSM8K +2

Soft-Prompting with Graph-of-Thought for Multi-modal Representation Learning

1 code implementation • 6 Apr 2024 • Juncheng Yang, Zuchao Li, Shuai Xie, Wei Yu, Shijun Li, Bo Du

It is a step-by-step linear reasoning process that adjusts the length of the chain to improve the performance of generated prompts.

Domain Generalization Image Retrieval +4

Improving Bird's Eye View Semantic Segmentation by Task Decomposition

no code implementations • CVPR 2024 • Tianhao Zhao, Yongcan Chen, Yu Wu, Tianyang Liu, Bo Du, Peilun Xiao, Shi Qiu, Hongda Yang, Guozhen Li, Yi Yang, Yutian Lin

In the first stage, we train a BEV autoencoder to reconstruct the BEV segmentation maps given corrupted noisy latent representation, which urges the decoder to learn fundamental knowledge of typical BEV patterns.

Autonomous Driving Bird's-Eye View Semantic Segmentation +2

A Universal Knowledge Embedded Contrastive Learning Framework for Hyperspectral Image Classification

1 code implementation • 2 Apr 2024 • Quanwei Liu, Yanni Dong, Tao Huang, Lefei Zhang, Bo Du

Therefore, we propose a universal knowledge embedded contrastive learning framework (KnowCL) for supervised, unsupervised, and semisupervised HSI classification, which largely closes the gap in HSI classification between pocket models and standard vision backbones.

Classification Contrastive Learning +1

WIA-LD2ND: Wavelet-based Image Alignment for Self-supervised Low-Dose CT Denoising

1 code implementation • 18 Mar 2024 • Haoyu Zhao, Yuliang Gu, Zhou Zhao, Bo Du, Yongchao Xu, Rui Yu

Second, to better capture high-frequency components and detailed information, Frequency-Aware Multi-scale Loss (FAM) is proposed by effectively utilizing multi-scale feature space.

Image Denoising

Online GNN Evaluation Under Test-time Graph Distribution Shifts

1 code implementation • 15 Mar 2024 • Xin Zheng, Dongjin Song, Qingsong Wen, Bo Du, Shirui Pan

This enables the effective evaluation of the well-trained GNNs' ability to capture test node semantics and structural representations, making it an expressive metric for estimating the generalization error in online GNN evaluation.

Multi-modal Auto-regressive Modeling via Visual Words

1 code implementation • 12 Mar 2024 • Tianshuo Peng, Zuchao Li, Lefei Zhang, Hai Zhao, Ping Wang, Bo Du

Large Language Models (LLMs), benefiting from the auto-regressive modelling approach performed on massive unannotated text corpora, demonstrate powerful perceptual and reasoning capabilities.

Visual Question Answering

When ControlNet Meets Inexplicit Masks: A Case Study of ControlNet on its Contour-following Ability

1 code implementation • 1 Mar 2024 • Wenjie Xuan, Yufei Xu, Shanshan Zhao, Chaoyue Wang, Juhua Liu, Bo Du, DaCheng Tao

Subsequently, to enhance controllability with inexplicit masks, an advanced Shape-aware ControlNet consisting of a deterioration estimator and a shape-prior modulation block is devised.

Boosting Semi-Supervised Object Detection in Remote Sensing Images With Active Teaching

no code implementations • 29 Feb 2024 • Boxuan Zhang, Zengmao Wang, Bo Du

The lack of object-level annotations poses a significant challenge for object detection in remote sensing images (RSIs).

Active Learning Object +3

Revisiting Knowledge Distillation for Autoregressive Language Models

no code implementations • 19 Feb 2024 • Qihuang Zhong, Liang Ding, Li Shen, Juhua Liu, Bo Du, DaCheng Tao

Knowledge distillation (KD) is a common approach to compress a teacher model to reduce its inference cost and memory footprint, by training a smaller student model.

Knowledge Distillation
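
For readers unfamiliar with the setup summarized above, the classic temperature-softened distillation loss can be sketched as follows. This is the standard textbook formulation, offered as an assumption-laden illustration rather than the method the paper proposes; the function names and the default temperature are illustrative choices.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Softmax over logits divided by a temperature (higher temperature gives a softer distribution)."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    exps = [math.exp(value - peak) for value in scaled]  # shift by the max for stability
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    student_logits: Sequence[float],
    teacher_logits: Sequence[float],
    temperature: float = 2.0,
) -> float:
    """KL(teacher || student) on temperature-softened distributions, scaled by T^2.

    Toy version for a single prediction; it does not guard against extreme
    logits that would underflow a student probability to zero.
    """
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    kl = sum(p * math.log(p / q) for p, q in zip(teacher, student) if p > 0.0)
    return temperature ** 2 * kl
```

For an autoregressive language model, a loss of this kind would typically be computed per token over the vocabulary and averaged across the sequence, usually alongside the ordinary next-token cross-entropy.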

ROSE Doesn't Do That: Boosting the Safety of Instruction-Tuned Large Language Models with Reverse Prompt Contrastive Decoding

no code implementations • 19 Feb 2024 • Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du, DaCheng Tao

With the development of instruction-tuned large language models (LLMs), improving the safety of LLMs has become more critical.

Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation

1 code implementation • 31 Jan 2024 • Maoyuan Ye, Jing Zhang, Juhua Liu, Chenyu Liu, BaoCai Yin, Cong Liu, Bo Du, DaCheng Tao

We use this TS model to iteratively generate the pixel-level text labels in a semi-automatic manner, unifying labels across the four text hierarchies in the HierText dataset.

Hierarchical Text Segmentation parameter-efficient fine-tuning +2

Visual Imitation Learning with Calibrated Contrastive Representation

no code implementations • 21 Jan 2024 • Yunke Wang, Linwei Tao, Bo Du, Yutian Lin, Chang Xu

Adversarial Imitation Learning (AIL) allows the agent to reproduce expert behavior with low-dimensional states and actions.

Contrastive Learning Imitation Learning

MAEDiff: Masked Autoencoder-enhanced Diffusion Models for Unsupervised Anomaly Detection in Brain Images

no code implementations • 19 Jan 2024 • Rui Xu, Yunke Wang, Bo Du

To address these two issues, we propose a novel Masked Autoencoder-enhanced Diffusion Model (MAEDiff) for unsupervised anomaly detection in brain images.

Unsupervised Anomaly Detection

Remote Sensing ChatGPT: Solving Remote Sensing Tasks with ChatGPT and Visual Models

1 code implementation • 17 Jan 2024 • HaoNan Guo, Xin Su, Chen Wu, Bo Du, Liangpei Zhang, Deren Li

Recently, the flourishing large language models (LLMs), especially ChatGPT, have shown exceptional performance in language understanding, reasoning, and interaction, attracting users and researchers from multiple fields and domains.

Task Planning

Transformer for Object Re-Identification: A Survey

1 code implementation • 13 Jan 2024 • Mang Ye, Shuoyi Chen, Chenyue Li, Wei-Shi Zheng, David Crandall, Bo Du

Object Re-identification (Re-ID) aims to identify specific objects across different times and scenes, which is a widely researched task in computer vision.

Object Survey

GoMatching: A Simple Baseline for Video Text Spotting via Long and Short Term Matching

1 code implementation • 13 Jan 2024 • Haibin He, Maoyuan Ye, Jing Zhang, Juhua Liu, Bo Du, DaCheng Tao

In response to this issue, we propose to efficiently turn an off-the-shelf query-based image text spotter into a specialist on video and present a simple baseline termed GoMatching, which focuses the training efforts on tracking while maintaining strong recognition performance.

Text Detection Text Spotting

OOP: Object-Oriented Programming Evaluation Benchmark for Large Language Models

1 code implementation • 12 Jan 2024 • Shuai Wang, Liang Ding, Li Shen, Yong Luo, Bo Du, DaCheng Tao

Advancing automated programming necessitates robust and comprehensive code generation benchmarks, yet current evaluation frameworks largely neglect object-oriented programming (OOP) in favor of functional programming (FP), e.g., HumanEval and MBPP.

Code Generation HumanEval

ChangeCLIP: Remote sensing change detection with multimodal vision-language representation learning

1 code implementation • journal 2024 • Sijun Dong, Libo Wang, Bo Du, Xiaoliang Meng

Following this trend, in this study, we introduce ChangeCLIP, a novel framework that leverages robust semantic information from image-text pairs, specifically tailored for Remote Sensing Change Detection (RSCD).

Change Detection Decoder +2

Improving Generalized Zero-Shot Learning by Exploring the Diverse Semantics from External Class Names

1 code implementation • CVPR 2024 • Yapeng Li, Yong Luo, Zengmao Wang, Bo Du

This motivates us to study GZSL in the more practical setting where unseen classes can be either similar or dissimilar to seen classes.

Generalized Zero-Shot Learning Test unseen

XAI for In-hospital Mortality Prediction via Multimodal ICU Data

1 code implementation • 29 Dec 2023 • Xingqiao Li, Jindong Gu, Zhiyong Wang, Yancheng Yuan, Bo Du, Fengxiang He

To address this issue, this paper proposes an eXplainable Multimodal Mortality Predictor (X-MMP) approaching an efficient, explainable AI solution for predicting in-hospital mortality via multimodal ICU data.

Decision Making Mortality Prediction

Joint Learning Neuronal Skeleton and Brain Circuit Topology with Permutation Invariant Encoders for Neuron Classification

1 code implementation • 22 Dec 2023 • Minghui Liao, Guojia Wan, Bo Du

The Skeleton Encoder integrates the local information of neurons in a bottom-up manner, with a one-dimensional convolution over the point data of the neural skeleton; the Connectome Encoder uses a graph neural network to capture the topological information of the neural circuit; finally, the Readout Layer fuses these two kinds of information and outputs classification results.

Graph Neural Network

Sparse is Enough in Fine-tuning Pre-trained Large Language Models

1 code implementation • 19 Dec 2023 • Weixi Song, Zuchao Li, Lefei Zhang, Hai Zhao, Bo Du

With the prevalence of the pre-training-fine-tuning paradigm, how to efficiently adapt the pre-trained model to downstream tasks has been an intriguing issue.

Language Modelling Large Language Model +1

Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion

1 code implementation • 11 Dec 2023 • Anke Tang, Li Shen, Yong Luo, Liang Ding, Han Hu, Bo Du, DaCheng Tao

At the upper level, we focus on learning a shared Concrete mask to identify the subspace, while at the inner level, model merging is performed to maximize the performance of the merged model.

Meta-Learning Task Arithmetic

Exploring Sparsity in Graph Transformers

no code implementations • 9 Dec 2023 • Chuang Liu, Yibing Zhan, Xueqi Ma, Liang Ding, Dapeng Tao, Jia Wu, Wenbin Hu, Bo Du

Graph Transformers (GTs) have achieved impressive results on various graph-related tasks.

Adapting Vision Transformer for Efficient Change Detection

no code implementations8 Dec 2023 Yang Zhao, Yuxiang Zhang, Yanni Dong, Bo Du

Most change detection models based on vision transformers currently follow a "pretraining then fine-tuning" strategy.

Change Detection

UniGS: Unified Representation for Image Generation and Segmentation

1 code implementation CVPR 2024 Lu Qi, Lehan Yang, Weidong Guo, Yu Xu, Bo Du, Varun Jampani, Ming-Hsuan Yang

On the other hand, the progressive dichotomy module can efficiently decode the synthesized colormap to high-quality entity-level masks in a depth-first binary search without knowing the cluster numbers.

Image Generation Segmentation

Careful Selection and Thoughtful Discarding: Graph Explicit Pooling Utilizing Discarded Nodes

no code implementations21 Nov 2023 Chuang Liu, Wenhang Yu, Kuang Gao, Xueqi Ma, Yibing Zhan, Jia Wu, Bo Du, Wenbin Hu

Graph pooling has been increasingly recognized as crucial for Graph Neural Networks (GNNs) to facilitate hierarchical graph representation learning.

Graph Representation Learning

Learning transformer-based heterogeneously salient graph representation for multimodal remote sensing image classification

no code implementations17 Nov 2023 Jiaqi Yang, Bo Du, Liangpei Zhang

Data collected by different modalities can provide a wealth of complementary information, such as hyperspectral image (HSI) to offer rich spectral-spatial properties, synthetic aperture radar (SAR) to provide structural information about the Earth's surface, and light detection and ranging (LiDAR) to cover altitude information about ground elevation.

Image Classification Remote Sensing Image Classification

Federated Learning for Generalization, Robustness, Fairness: A Survey and Benchmark

1 code implementation12 Nov 2023 Wenke Huang, Mang Ye, Zekun Shi, Guancheng Wan, He Li, Bo Du, Qiang Yang

In this survey, we provide a systematic overview of the important and recent developments of research on federated learning.

Fairness Federated Learning +2

Rotation Invariant Transformer for Recognizing Object in UAVs

3 code implementations ACM Multimedia 2022 Shuoyi Chen, Mang Ye, Bo Du

Existing methods are usually designed for city cameras and are incapable of handling the rotation issue in UAV scenarios.

Object Person Re-Identification +1

Zero-Shot Sharpness-Aware Quantization for Pre-trained Language Models

no code implementations20 Oct 2023 Miaoxi Zhu, Qihuang Zhong, Li Shen, Liang Ding, Juhua Liu, Bo Du, DaCheng Tao

The key algorithm in solving ZSAQ is the SAM-SGA optimization, which aims to improve the quantization accuracy and model generalization via optimizing a minimax problem.

Language Modeling Language Modelling +1

Learn From Model Beyond Fine-Tuning: A Survey

1 code implementation12 Oct 2023 Hongling Zheng, Li Shen, Anke Tang, Yong Luo, Han Hu, Bo Du, DaCheng Tao

LFM focuses on the research, modification, and design of FM based on the model interface, so as to better understand the model structure and weights (in a black box environment), and to generalize the model to downstream tasks.

Meta-Learning model +2

Imitation Learning from Purified Demonstrations

1 code implementation11 Oct 2023 Yunke Wang, Minjing Dong, Yukun Zhao, Bo Du, Chang Xu

In the first step, we apply a forward diffusion process to smooth potential noises in imperfect demonstrations by introducing additional noise.

Imitation Learning MuJoCo +1
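
The forward diffusion step mentioned in the snippet above can be illustrated with a minimal sketch. It assumes the common DDPM-style closed form x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps with a linear noise schedule; the snippet does not state which formulation or schedule the paper uses, and the names `alpha_bar`, `forward_diffuse`, and `demo_action` are hypothetical.

```python
import math
import random


def alpha_bar(t, num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_s) over the first t steps.

    The linear beta schedule and its endpoints are assumptions for
    illustration, not values taken from the paper.
    """
    prod = 1.0
    for s in range(t):
        beta = beta_start + (beta_end - beta_start) * s / (num_steps - 1)
        prod *= 1.0 - beta
    return prod


def forward_diffuse(x0, t, rng=None, **schedule):
    """Sample x_t from N(sqrt(abar_t) * x0, (1 - abar_t) * I)."""
    rng = rng or random.Random()
    abar = alpha_bar(t, **schedule)
    scale, sigma = math.sqrt(abar), math.sqrt(1.0 - abar)
    # Shrink the clean sample and add Gaussian noise of matching variance.
    return [scale * v + sigma * rng.gauss(0.0, 1.0) for v in x0]


# Hypothetical demonstration action, perturbed for 50 diffusion steps.
demo_action = [0.5, -0.2, 0.1]
noised = forward_diffuse(demo_action, t=50, rng=random.Random(0))
```

Larger `t` moves the sample closer to pure Gaussian noise, so in a sketch like this the number of steps controls how strongly an imperfect demonstration is smoothed.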

Parameter Efficient Multi-task Model Fusion with Partial Linearization

1 code implementation7 Oct 2023 Anke Tang, Li Shen, Yong Luo, Yibing Zhan, Han Hu, Bo Du, Yixin Chen, DaCheng Tao

We demonstrate that our partial linearization technique enables a more effective fusion of multiple tasks into a single model, outperforming standard adapter tuning and task arithmetic alone.

parameter-efficient fine-tuning Task Arithmetic

Exchange means change: an unsupervised single-temporal change detection framework based on intra- and inter-image patch exchange

1 code implementation1 Oct 2023 Hongruixuan Chen, Jian Song, Chen Wu, Bo Du, Naoto Yokoya

Change detection (CD) is a critical task in studying the dynamics of ecosystems and human activities using multi-temporal remote sensing images.

Change Detection Image Enhancement +1

Generalizable Heterogeneous Federated Cross-Correlation and Instance Similarity Learning

2 code implementations28 Sep 2023 Wenke Huang, Mang Ye, Zekun Shi, Bo Du

Federated learning is an important privacy-preserving multi-party learning paradigm, involving collaborative learning with others and local updating on private data.

Domain Generalization Federated Learning +1

BenchTemp: A General Benchmark for Evaluating Temporal Graph Neural Networks

1 code implementation31 Aug 2023 Qiang Huang, Jiawei Jiang, Xi Susie Rao, Ce Zhang, Zhichao Han, Zitao Zhang, Xin Wang, Yongjun He, Quanqing Xu, Yang Zhao, Chuang Hu, Shuo Shang, Bo Du

To handle graphs in which features or connectivities are evolving over time, a series of temporal graph neural networks (TGNNs) have been proposed.

Diversity Link Prediction +1

SAAN: Similarity-aware attention flow network for change detection with VHR remote sensing images

no code implementations28 Aug 2023 HaoNan Guo, Xin Su, Chen Wu, Bo Du, Liangpei Zhang

These CD methods, however, still perform far from satisfactorily as we observe that 1) deep encoder layers focus on irrelevant background regions and 2) the models' confidence in the change regions is inconsistent at different decoder stages.

Change Detection Decoder +1

Enhancing Visually-Rich Document Understanding via Layout Structure Modeling

1 code implementation15 Aug 2023 Qiwei Li, Zuchao Li, Xiantao Cai, Bo Du, Hai Zhao

In this paper, we propose GraphLayoutLM, a novel document understanding model that leverages the modeling of layout structure graph to inject document layout knowledge into the model.

document understanding

Rethinking the Localization in Weakly Supervised Object Localization

no code implementations11 Aug 2023 Rui Xu, Yong Luo, Han Hu, Bo Du, Jialie Shen, Yonggang Wen

Weakly supervised object localization (WSOL) is one of the most popular and challenging tasks in computer vision.

Object Weakly-Supervised Object Localization

Scale-aware Test-time Click Adaptation for Pulmonary Nodule and Mass Segmentation

1 code implementation28 Jul 2023 Zhihao LI, Jiancheng Yang, Yongchao Xu, Li Zhang, Wenhui Dong, Bo Du

Extensive experiments on both open-source and in-house datasets consistently demonstrate the effectiveness of the proposed method over some CNN and Transformer-based segmentation methods.

Image Segmentation Management +4

IML-ViT: Benchmarking Image Manipulation Localization by Vision Transformer

1 code implementation27 Jul 2023 Xiaochen Ma, Bo Du, Zhuohang Jiang, Xia Du, Ahmed Y. Al Hammadi, Jizhe Zhou

We term this simple but effective ViT paradigm IML-ViT, which has significant potential to become a new benchmark for IML.

Benchmarking Image Manipulation +1

PNT-Edge: Towards Robust Edge Detection with Noisy Labels by Learning Pixel-level Noise Transitions

1 code implementation26 Jul 2023 Wenjie Xuan, Shanshan Zhao, Yu Yao, Juhua Liu, Tongliang Liu, Yixin Chen, Bo Du, DaCheng Tao

Exploiting the estimated noise transitions, our model, named PNT-Edge, is able to fit the prediction to clean labels.

Edge Detection

DeepCL: Deep Change Feature Learning on Remote Sensing Images in the Metric Space

1 code implementation23 Jul 2023 HaoNan Guo, Bo Du, Chen Wu, Chengxi Han, Liangpei Zhang

To address these issues, we complement the strong temporal modeling ability of metric learning with the prominent fitting ability of segmentation and propose a deep change feature learning (DeepCL) framework for robust and explainable CD.

Change Detection Earth Observation +1

Building-road Collaborative Extraction from Remotely Sensed Images via Cross-Interaction

no code implementations23 Jul 2023 HaoNan Guo, Xin Su, Chen Wu, Bo Du, Liangpei Zhang

Compared with many existing methods that train each task individually, the proposed collaborative extraction method can utilize the complementary advantages between buildings and roads by the proposed inter-task and inter-scale feature interactions, and automatically select the optimal reception field for different tasks.

Expediting Building Footprint Extraction from High-resolution Remote Sensing Images via progressive lenient supervision

1 code implementation23 Jul 2023 HaoNan Guo, Bo Du, Chen Wu, Xin Su, Liangpei Zhang

The efficacy of building footprint segmentation from remotely sensed images has been hindered by model transfer effectiveness.

Decoder Segmentation

Heterogeneous Federated Learning: State-of-the-art and Research Challenges

2 code implementations20 Jul 2023 Mang Ye, Xiuwen Fang, Bo Du, Pong C. Yuen, DaCheng Tao

Therefore, a systematic survey on this topic about the research challenges and state-of-the-art is essential.

Federated Learning Survey

Bidirectional Looking with A Novel Double Exponential Moving Average to Adaptive and Non-adaptive Momentum Optimizers

1 code implementation2 Jul 2023 Yineng Chen, Zuchao Li, Lefei Zhang, Bo Du, Hai Zhao

SGD and Adam are two classical and effective optimizers on which researchers have proposed many variants, such as SGDM and RAdam.

On Exploring Node-feature and Graph-structure Diversities for Node Drop Graph Pooling

1 code implementation22 Jun 2023 Chuang Liu, Yibing Zhan, Baosheng Yu, Liu Liu, Bo Du, Wenbin Hu, Tongliang Liu

A pooling operation is essential for effective graph-level representation learning, and node drop pooling has become a mainstream graph pooling technique.

Graph Classification Representation Learning

FSUIE: A Novel Fuzzy Span Mechanism for Universal Information Extraction

1 code implementation19 Jun 2023 Tianshuo Peng, Zuchao Li, Lefei Zhang, Bo Du, Hai Zhao

To address these deficiencies, we propose the Fuzzy Span Universal Information Extraction (FSUIE) framework.

UIE

Symmetric Uncertainty-Aware Feature Transmission for Depth Super-Resolution

1 code implementation1 Jun 2023 Wuxuan Shi, Mang Ye, Bo Du

(2) For the cross-modality gap, we propose a novel Symmetric Uncertainty scheme to remove parts of RGB information harmful to the recovery of HR depth maps.

Super-Resolution

DeepSolo++: Let Transformer Decoder with Explicit Points Solo for Multilingual Text Spotting

1 code implementation31 May 2023 Maoyuan Ye, Jing Zhang, Shanshan Zhao, Juhua Liu, Tongliang Liu, Bo Du, DaCheng Tao

In this paper, we present DeepSolo++, a simple DETR-like baseline that lets a single decoder with explicit points solo for text detection, recognition, and script identification simultaneously.

Decoder Scene Text Detection +2

AIMS: All-Inclusive Multi-Level Segmentation

1 code implementation28 May 2023 Lu Qi, Jason Kuen, Weidong Guo, Jiuxiang Gu, Zhe Lin, Bo Du, Yu Xu, Ming-Hsuan Yang

Despite the progress of image segmentation for accurate visual entity segmentation, completing the diverse requirements of image editing applications for different-level region-of-interest selections remains unsolved.

Image Segmentation Segmentation +1

Self-Evolution Learning for Discriminative Language Model Pretraining

1 code implementation24 May 2023 Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du, DaCheng Tao

Masked language modeling, widely used in discriminative language model (e.g., BERT) pretraining, commonly adopts a random masking strategy.

Language Modeling Language Modelling +3

Revisiting Token Dropping Strategy in Efficient BERT Pretraining

1 code implementation24 May 2023 Qihuang Zhong, Liang Ding, Juhua Liu, Xuebo Liu, Min Zhang, Bo Du, DaCheng Tao

Token dropping is a recently-proposed strategy to speed up the pretraining of masked language models, such as BERT, by skipping the computation of a subset of the input tokens at several middle layers.
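
The idea of skipping a subset of tokens at several middle layers can be sketched as follows. This is a toy illustration, not the paper's implementation: the per-token importance `scores`, the `keep_ratio`, and the function name `run_with_token_dropping` are assumptions, and each layer is modelled as a plain function over a list of token states.

```python
def run_with_token_dropping(tokens, layers, scores, drop_from, drop_to, keep_ratio=0.5):
    """Run `layers` over `tokens`, updating only the kept tokens in
    layers with index in [drop_from, drop_to)."""
    n_keep = max(1, int(len(tokens) * keep_ratio))
    # Indices of the highest-scoring tokens, restored to sequence order.
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:n_keep])

    hidden = list(tokens)
    for idx, layer in enumerate(layers):
        if drop_from <= idx < drop_to:
            # Middle layers: compute only on the kept tokens and write the
            # results back; dropped tokens keep their previous state.
            updated = layer([hidden[i] for i in keep])
            for pos, value in zip(keep, updated):
                hidden[pos] = value
        else:
            # First and last layers see the full sequence.
            hidden = layer(hidden)
    return hidden


def add_one(states):
    """Toy stand-in for a transformer layer."""
    return [s + 1 for s in states]


result = run_with_token_dropping(
    tokens=[0, 0, 0, 0],
    layers=[add_one, add_one, add_one],
    scores=[0.9, 0.1, 0.8, 0.2],
    drop_from=1,
    drop_to=2,
)
# result == [3, 2, 3, 2]: tokens 1 and 3 skipped the middle layer.
```

The saving in this sketch comes from the middle layer processing only half of the sequence, while the full sequence is restored before the final layer.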

Improving Heterogeneous Model Reuse by Density Estimation

1 code implementation23 May 2023 Anke Tang, Yong Luo, Han Hu, Fengxiang He, Kehua Su, Bo Du, Yixin Chen, DaCheng Tao

This paper studies multiparty learning, aiming to learn a model using the private data of different participants.

Density Estimation model +1

Empowering Agrifood System with Artificial Intelligence: A Survey of the Progress, Challenges and Opportunities

1 code implementation3 May 2023 Tao Chen, Liang Lv, Di Wang, Jing Zhang, Yue Yang, Zeyang Zhao, Chen Wang, Xiaowei Guo, Hao Chen, Qingye Wang, Yufei Xu, Qiming Zhang, Bo Du, Liangpei Zhang, DaCheng Tao

With the world population rapidly increasing, transforming our agrifood systems to be more productive, efficient, safe, and sustainable is crucial to mitigate potential food shortages.

Survey

SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model

2 code implementations NeurIPS 2023 Di Wang, Jing Zhang, Bo Du, Minqiang Xu, Lin Liu, DaCheng Tao, Liangpei Zhang

In this study, we leverage SAM and existing RS object detection datasets to develop an efficient pipeline for generating a large-scale RS segmentation dataset, dubbed SAMRS.

Instance Segmentation Object +4

Scalable Mask Annotation for Video Text Spotting

1 code implementation2 May 2023 Haibin He, Jing Zhang, Mengyang Xu, Juhua Liu, Bo Du, DaCheng Tao

Video text spotting refers to localizing, recognizing, and tracking textual elements such as captions, logos, license plates, signs, and other forms of text within consecutive video frames.

Text Spotting

HKNAS: Classification of Hyperspectral Imagery Based on Hyper Kernel Neural Architecture Search

1 code implementation23 Apr 2023 Di Wang, Bo Du, Liangpei Zhang, DaCheng Tao

Recent neural architecture search (NAS) based approaches have made great progress in hyperspectral image (HSI) classification tasks.

Neural Architecture Search

DCN-T: Dual Context Network with Transformer for Hyperspectral Image Classification

2 code implementations19 Apr 2023 Di Wang, Jing Zhang, Bo Du, Liangpei Zhang, DaCheng Tao

Hyperspectral image (HSI) classification is challenging due to spatial variability caused by complex imaging conditions.

Hyperspectral Image Classification Image Generation

Dsfer-Net: A Deep Supervision and Feature Retrieval Network for Bitemporal Change Detection Using Modern Hopfield Networks

1 code implementation3 Apr 2023 Shizhen Chang, Michael Kopp, Pedram Ghamisi, Bo Du

Based on the sequential geographical information of the bitemporal images, we designed a feature retrieval module to extract difference features and leverage discriminative information in a deeply supervised manner.

Change Detection Retrieval

Unsupervised Cross-domain Pulmonary Nodule Detection without Source Data

1 code implementation3 Apr 2023 Rui Xu, Yong Luo, Bo Du

Cross-domain pulmonary nodule detection suffers from performance degradation due to a large shift of data distributions between the source and target domain.

Contrastive Learning object-detection +2

EMS-Net: Efficient Multi-Temporal Self-Attention For Hyperspectral Change Detection

no code implementations24 Mar 2023 Meiqi Hu, Chen Wu, Bo Du

Hyperspectral change detection plays an essential role in monitoring dynamic urban development and in detecting fine object evolution and alteration.

Change Detection Clustering

Robust Generalization against Photon-Limited Corruptions via Worst-Case Sharpness Minimization

4 code implementations CVPR 2023 Zhuo Huang, Miaoxi Zhu, Xiaobo Xia, Li Shen, Jun Yu, Chen Gong, Bo Han, Bo Du, Tongliang Liu

Experimentally, we simulate photon-limited corruptions using CIFAR10/100 and ImageNet30 datasets and show that SharpDRO exhibits a strong generalization ability against severe corruptions and exceeds well-known baseline methods with large performance gains.