Search Results for author: Lei Zhu

Found 213 papers, 105 papers with code

AdvMIM: Adversarial Masked Image Modeling for Semi-Supervised Medical Image Segmentation

no code implementations 25 Jun 2025 Lei Zhu, Jun Zhou, Rick Siow Mong Goh, Yong liu

To this end, we propose to construct an auxiliary masked domain from original domain with masked image modeling and train the transformer to predict the entire segmentation mask with masked inputs to increase supervision signal.

Image Segmentation Segmentation +2

Generalizing Vision-Language Models to Novel Domains: A Comprehensive Survey

no code implementations 23 Jun 2025 Xinyao Li, Jingjing Li, Fengling Li, Lei Zhu, Yang Yang, Heng Tao Shen

Popular benchmarks for VLM generalization are further introduced with thorough performance comparisons among the reviewed methods.

Benchmarking Survey +1

Long Coalition Leads to Shrink? The Roles of Tipping and Technology-Sharing in Climate Clubs

no code implementations 19 Jun 2025 Lei Zhu, Zhihao Yan, Hongbo Duan, Yongyang Cai, Xiaobing Zhang

This framework highlights the critical role of technology-sharing in fostering long-term climate cooperation under climate tipping uncertainties.

PosterCraft: Rethinking High-Quality Aesthetic Poster Generation in a Unified Framework

no code implementations 12 Jun 2025 Sixiang Chen, Jianyu Lai, Jialin Gao, Tian Ye, Haoyu Chen, Hengyu Shi, Shitong Shao, Yunlong Lin, Song Fei, Zhaohu Xing, Yeying Jin, Junfeng Luo, Xiaoming Wei, Lei Zhu

Generating aesthetic posters is more challenging than simple design images: it requires not only precise text rendering but also the seamless integration of abstract artistic content, striking layouts, and overall stylistic harmony.

Time-Lapse Video-Based Embryo Grading via Complementary Spatial-Temporal Pattern Mining

no code implementations 5 Jun 2025 Yong Sun, Yipeng Wang, Junyu Shi, Zhiyuan Zhang, Yanmei Xiao, Lei Zhu, Manxi Jiang, Qiang Nie

To bridge this gap, we propose a new task called Video-Based Embryo Grading - the first paradigm that directly utilizes full-length time-lapse monitoring (TLM) videos to predict embryologists' overall quality assessments.

PhotoArtAgent: Intelligent Photo Retouching with Language Model-Based Artist Agents

no code implementations 29 May 2025 Haoyu Chen, Keda Tao, Yizao Wang, Xinlei Wang, Lei Zhu, Jinjin Gu

Photo retouching is integral to photographic art, extending far beyond simple technical fixes to heighten emotional expression and narrative depth.

Language Modeling Language Modelling +1

MoESD: Unveil Speculative Decoding's Potential for Accelerating Sparse MoE

no code implementations 26 May 2025 Zongle Huang, Lei Zhu, Zongyuan Zhan, Ting Hu, Weikai Mao, Xianzhi Yu, Yongpan Liu, Tianyu Zhang

In this work, we first demonstrate that, under medium batch sizes, MoE surprisingly benefits more from SD than dense models.

Mixture-of-Experts

Faster and Better LLMs via Latency-Aware Test-Time Scaling

no code implementations 26 May 2025 Zili Wang, Tianyu Zhang, Haoli Bai, Lu Hou, Xianzhi Yu, Wulong Liu, Shiming Xiang, Lei Zhu

By integrating these two approaches and allocating computational resources properly to each, our latency-optimal TTS enables a 32B model to reach 82.3% accuracy on MATH-500 within 1 minute and a smaller 3B model to achieve 72.4% within 10 seconds.

Math

Semantic-enhanced Co-attention Prompt Learning for Non-overlapping Cross-Domain Recommendation

1 code implementation 25 May 2025 Lei Guo, Chenlong Song, Feng Guo, Xiaohui Han, Xiaojun Chang, Lei Zhu

Given the above challenges, we introduce the prompt learning technique for Many-to-one Non-overlapping Cross-domain Sequential Recommendation (MNCSR) and propose a Text-enhanced Co-attention Prompt Learning Paradigm (TCPLP).

Prompt Learning Sequential Recommendation +1

Rethinking Graph Out-Of-Distribution Generalization: A Learnable Random Walk Perspective

no code implementations 9 May 2025 Henan Sun, Xunkai Li, Lei Zhu, Junyi Han, Guang Zeng, RongHua Li, Guoren Wang

In this paper, we advocate the learnable random walk (LRW) perspective as the instantiation of invariant knowledge, and propose LRW-OOD to realize graph OOD generalization learning.

Density Estimation Out-of-Distribution Generalization

Novel Extraction of Discriminative Fine-Grained Feature to Improve Retinal Vessel Segmentation

1 code implementation 6 May 2025 Shuang Zeng, Chee Hong Lee, Micky C Nnamdi, Wenqi Shi, J Ben Tamo, Lei Zhu, Hangzhou He, Xinliang Zhang, Qian Chen, May D. Wang, Yanye Lu, Qiushi Ren

AttUKAN achieves F1 scores of 82.50%, 81.14%, 81.34%, 80.21% and 80.09%, along with MIoU scores of 70.24%, 68.64%, 68.59%, 67.21% and 66.94% in the above datasets, which are the highest compared to 11 networks for retinal vessel segmentation.

Kolmogorov-Arnold Networks Retinal Vessel Segmentation +1

Turbo2K: Towards Ultra-Efficient and High-Quality 2K Video Synthesis

no code implementations 20 Apr 2025 Jingjing Ren, Wenbo Li, Zhongdao Wang, Haoze Sun, Bangzhen Liu, Haoyu Chen, Jiaqi Xu, Aoxue Li, Shifeng Zhang, Bin Shao, Yong Guo, Lei Zhu

Compared to existing methods, Turbo2K is up to 20$\times$ faster for inference, making high-resolution video generation more scalable and practical for real-world applications.

2k Knowledge Distillation +2

An Empirical Study of GPT-4o Image Generation Capabilities

1 code implementation 8 Apr 2025 Sixiang Chen, Jinbin Bai, Zhuoran Zhao, Tian Ye, Qingyu Shi, Donghao Zhou, Wenhao Chai, Xin Lin, Jianzong Wu, Chao Tang, Shilin Xu, Tao Zhang, Haobo Yuan, Yikang Zhou, Wei Chow, Linfeng Li, Xiangtai Li, Lei Zhu, Lu Qi

The landscape of image generation has rapidly evolved, from early GAN-based approaches to diffusion models and, most recently, to unified generative architectures that seek to bridge understanding and generation tasks.

Benchmarking Image Generation +3

Federated Semantic Learning for Privacy-preserving Cross-domain Recommendation

1 code implementation 29 Mar 2025 Ziang Lu, Lei Guo, Xu Yu, Zhiyong Cheng, Xiaohui Han, Lei Zhu

In the evolving landscape of recommender systems, the challenge of effectively conducting privacy-preserving Cross-Domain Recommendation (CDR), especially under strict non-overlapping constraints, has emerged as a key focus.

Privacy Preserving Recommendation Systems

POSTA: A Go-to Framework for Customized Artistic Poster Generation

no code implementations CVPR 2025 Haoyu Chen, Xiaojie Xu, Wenbo Li, Jingjing Ren, Tian Ye, Songhua Liu, Ying-Cong Chen, Lei Zhu, Xinchao Wang

To train our models, we develop the PosterArt dataset, comprising high-quality artistic posters annotated with layout, typography, and pixel-level stylized text segmentation.

Text Segmentation

Exploiting Inherent Class Label: Towards Robust Scribble Supervised Semantic Segmentation

2 code implementations 18 Mar 2025 Xinliang Zhang, Lei Zhu, Shuang Zeng, Hangzhou He, Ourui Fu, Zhengjian Yao, Zhaoheng Xie, Yanye Lu

Scribble-based weakly supervised semantic segmentation leverages only a few annotated pixels as labels to train a segmentation model, presenting significant potential for reducing the human labor involved in the annotation process.

Segmentation Weakly supervised Semantic Segmentation +1

RoGSplat: Learning Robust Generalizable Human Gaussian Splatting from Sparse Multi-View Images

1 code implementation CVPR 2025 Junjin Xiao, Qing Zhang, Yonewei Nie, Lei Zhu, Wei-Shi Zheng

To account for possible misalignment between SMPL model and images, we propose to predict image-aligned 3D prior points by leveraging both pixel-level features and voxel-level features, from which we regress the coarse Gaussians.

Novel View Synthesis

Federated Mixture-of-Expert for Non-Overlapped Cross-Domain Sequential Recommendation

no code implementations 17 Mar 2025 Yu Liu, Hanbin Jiang, Lei Zhu, Yu Zhang, Yuqi Mao, Jiangxia Cao, Shuchao Pang

In the real world, users always have multiple interests while surfing different services to enrich their daily lives, e.g., watching hot short videos/live streamings.

Federated Learning Privacy Preserving +1

MagicInfinite: Generating Infinite Talking Videos with Your Words and Voice

no code implementations 7 Mar 2025 Hongwei Yi, Tian Ye, Shitong Shao, Xuancheng Yang, Jiantong Zhao, Hanzhong Guo, Terrance Wang, Qingyu Yin, Zeke Xie, Lei Zhu, Wei Li, Michael Lingelbach, Daquan Zhou

We present MagicInfinite, a novel diffusion Transformer (DiT) framework that overcomes traditional portrait animation limitations, delivering high-fidelity results across diverse character types: realistic humans, full-body figures, and stylized anime characters.

Denoising Portrait Animation +1

Partially Supervised Unpaired Multi-Modal Learning for Label-Efficient Medical Image Segmentation

no code implementations 7 Mar 2025 Lei Zhu, Yanyu Xu, Huazhu Fu, Xinxing Xu, Rick Siow Mong Goh, Yong liu

Specifically, our framework consists of a compact segmentation network with modality specific normalization layers for learning with partially labeled unpaired multi-modal data.

Image Segmentation Medical Image Analysis +5

Scalable Reinforcement Learning for Virtual Machine Scheduling

no code implementations 1 Mar 2025 Junjie Sheng, Jiehao Wu, Haochuan Cui, Yiqiu Hu, Wenli Zhou, Lei Zhu, Qian Peng, Wenhao Li, Xiangfeng Wang

This paper introduces a scalable RL framework, called Cluster Value Decomposition Reinforcement Learning (CVD-RL), to surmount the scalability hurdles inherent in large-scale VMS.

Cloud Computing reinforcement-learning +3

TEASER: Token Enhanced Spatial Modeling for Expressions Reconstruction

no code implementations 16 Feb 2025 Yunfei Liu, Lei Zhu, Lijian Lin, Ye Zhu, Ailing Zhang, Yu Li

3D facial reconstruction from a single in-the-wild image is a crucial task in human-centered computer vision tasks.

Provable Ordering and Continuity in Vision-Language Pretraining for Generalizable Embodied Agents

1 code implementation 3 Feb 2025 Zhizhen Zhang, Lei Zhu, Zhen Fang, Zi Huang, Yadan Luo

Pre-training vision-language representations on human action videos has emerged as a promising approach to reduce reliance on large-scale expert demonstrations for training embodied agents.

Contrastive Learning Imitation Learning

Learning Semantic Facial Descriptors for Accurate Face Animation

no code implementations 29 Jan 2025 Lei Zhu, Yuanqi Chen, Xiaohang Liu, Thomas H. Li, Ge Li

Our approach successfully addresses the issue of model-based methods' limitations in high-fidelity identity and the challenges faced by model-free methods in accurate motion transfer.

Toward Model-centric Heterogeneous Federated Graph Learning: A Knowledge-driven Approach

no code implementations 22 Jan 2025 Huilin Lai, Guang Zeng, Xunkai Li, Xudong Shen, Yinlin Zhu, Ye Luo, Jianwei Lu, Lei Zhu

Federated graph learning (FGL) has emerged as a promising paradigm for collaborative machine learning, enabling multiple parties to jointly train models while preserving the privacy of raw graph data.

Diversity Graph Learning +1

V2C-CBM: Building Concept Bottlenecks with Vision-to-Concept Tokenizer

1 code implementation 9 Jan 2025 Hangzhou He, Lei Zhu, Xinliang Zhang, Shuang Zeng, Qian Chen, Yanye Lu

Concept Bottleneck Models (CBMs) offer inherent interpretability by initially translating images into human-comprehensible concepts, followed by a linear combination of these concepts for classification.

Detect Any Mirrors: Boosting Learning Reliability on Large-Scale Unlabeled Data with an Iterative Data Engine

1 code implementation CVPR 2025 Zhaohu Xing, Lihao Liu, Yijun Yang, Hongqiu Wang, Tian Ye, Sixiang Chen, Wenxue Li, Guang Liu, Lei Zhu

To effectively exploit this unlabeled dataset, we propose the first semi-supervised framework (namely an iterative data engine) consisting of four steps: (1) mirror detection model training, (2) pseudo label prediction, (3) dual guidance scoring, and (4) selection of highly reliable pseudo labels.

Mirror Detection Pseudo Label
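
The four steps named in the summary above (train, predict pseudo labels, score, select) can be sketched as a generic self-training loop. This is a minimal illustration under stated assumptions, not the paper's implementation: `run_data_engine`, `train`, `predict`, `score`, `threshold`, and `rounds` are hypothetical names, and a single scalar reliability score stands in for the paper's dual guidance scoring.

```python
from typing import Any, Callable, List, Sequence, Tuple


def run_data_engine(
    labeled: Sequence[Tuple[Any, Any]],
    unlabeled: Sequence[Any],
    train: Callable[[List[Tuple[Any, Any]]], Any],
    predict: Callable[[Any, Any], Any],
    score: Callable[[Any, Any, Any], float],
    threshold: float = 0.9,
    rounds: int = 3,
) -> Tuple[Any, List[Tuple[Any, Any]]]:
    """Hypothetical sketch of an iterative pseudo-labelling loop."""
    pool = list(labeled)          # samples currently trusted for training
    remaining = list(unlabeled)   # samples still waiting for a reliable label
    model = train(pool)           # (1) model training
    for _ in range(rounds):
        if not remaining:
            break
        # (2) pseudo label prediction
        preds = [(x, predict(model, x)) for x in remaining]
        # (3) scoring; one scalar here stands in for a richer scoring scheme
        scored = [(x, y, score(model, x, y)) for x, y in preds]
        # (4) keep only pseudo labels whose score clears the threshold
        keep = [(x, y) for x, y, s in scored if s >= threshold]
        if not keep:
            break
        pool.extend(keep)
        remaining = [x for x, _, s in scored if s < threshold]
        model = train(pool)       # retrain on the enlarged pool
    return model, pool
```

Any trainer, predictor, and scorer can be plugged in as callables; the loop stops early once no remaining sample clears the threshold.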

DRIVE: Dual-Robustness via Information Variability and Entropic Consistency in Source-Free Unsupervised Domain Adaptation

no code implementations 24 Nov 2024 Ruiqiang Xiao, Songning Lai, Yijun Yang, Jiemin Wu, Yutao Yue, Lei Zhu

The adaptation process has two stages: the first aligns the models on stable features using a mutual information consistency loss, and the second dynamically adjusts the perturbation level based on the loss from the first stage, encouraging the model to explore a broader range of the target domain while preserving existing performance.

Autonomous Driving Unsupervised Domain Adaptation

Revisiting the Integration of Convolution and Attention for Vision Backbone

1 code implementation 21 Nov 2024 Lei Zhu, Xinjiang Wang, Wayne Zhang, Rynson W. H. Lau

Specifically, in each layer, we use two different ways to represent an image: a fine-grained regular grid and a coarse-grained set of semantic slots.

Weakly supervised Semantic Segmentation Weakly-Supervised Semantic Segmentation

Federated Domain Generalization via Prompt Learning and Aggregation

1 code implementation 15 Nov 2024 Shuai Gong, Chaoran Cui, Chunyun Zhang, Wenna Wang, Xiushan Nie, Lei Zhu

Specifically, we propose a novel FedDG framework through Prompt Learning and AggregatioN (PLAN), which comprises two training stages to collaboratively generate local prompts and global prompts at each federated round.

Domain Generalization Privacy Preserving +2

UIFormer: A Unified Transformer-based Framework for Incremental Few-Shot Object Detection and Instance Segmentation

no code implementations 13 Nov 2024 ChengYuan Zhang, Yilin Zhang, Lei Zhu, Deyin Liu, Lin Wu, Bo Li, Shichao Zhang, Mohammed Bennamoun, Farid Boussaid

This paper introduces a novel framework for unified incremental few-shot object detection (iFSOD) and instance segmentation (iFSIS) using the Transformer architecture.

Decoder Few-Shot Object Detection +5

NeuroClips: Towards High-fidelity and Smooth fMRI-to-Video Reconstruction

1 code implementation 25 Oct 2024 Zixuan Gong, Guangyin Bao, Qi Zhang, Zhongwei Wan, Duoqian Miao, Shoujin Wang, Lei Zhu, Changwei Wang, Rongtao Xu, Liang Hu, Ke Liu, Yu Zhang

We contend that the key to addressing these challenges lies in accurately decoding both high-level semantics and low-level perception flows, as perceived by the brain in response to video stimuli.

SSIM Video Reconstruction

Deep Class-guided Hashing for Multi-label Cross-modal Retrieval

1 code implementation 20 Oct 2024 Hao Chen, Lei Zhu, Xinghui Zhu

Deep hashing, due to its low cost and efficient retrieval advantages, is widely valued in cross-modal retrieval.

Cross-Modal Retrieval Deep Hashing

A New Perspective to Boost Performance Fairness for Medical Federated Learning

1 code implementation 12 Oct 2024 Yunlu Yan, Lei Zhu, Yuexiang Li, Xinxing Xu, Rick Siow Mong Goh, Yong liu, Salman Khan, Chun-Mei Feng

However, existing fair FL methods ignore the specific characteristics of medical FL applications, i.e., domain shift among the datasets from different hospitals.

Fairness Federated Learning +3

Semi-Supervised Video Desnowing Network via Temporal Decoupling Experts and Distribution-Driven Contrastive Regularization

1 code implementation 10 Oct 2024 Hongtao Wu, Yijun Yang, Angelica I Aviles-Rivero, Jingjing Ren, Sixiang Chen, Haoyu Chen, Lei Zhu

Specifically, we construct a real-world dataset with 85 snowy videos, and then present a Semi-supervised Video Desnowing Network (SemiVDN) equipped by a novel Distribution-driven Contrastive Regularization.

Ranked #2 on Snow Removal on RVSD (using extra training data)

Snow Removal

Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis

1 code implementation 10 Oct 2024 Jinbin Bai, Tian Ye, Wei Chow, Enxin Song, Qing-Guo Chen, Xiangtai Li, Zhen Dong, Lei Zhu, Shuicheng Yan

We present Meissonic, which elevates non-autoregressive masked image modeling (MIM) text-to-image to a level comparable with state-of-the-art diffusion models like SDXL.

Feature Compression Image Generation

BadCM: Invisible Backdoor Attack Against Cross-Modal Learning

1 code implementation 3 Oct 2024 Zheng Zhang, Xu Yuan, Lei Zhu, Jingkuan Song, Liqiang Nie

In this paper, we introduce a novel bilateral backdoor to fill in the missing pieces of the puzzle in the cross-modal backdoor and propose a generalized invisible backdoor framework against cross-modal learning (BadCM).

Backdoor Attack Cross-Modal Retrieval +1

Teaching Tailored to Talent: Adverse Weather Restoration via Prompt Pool and Depth-Anything Constraint

no code implementations 24 Sep 2024 Sixiang Chen, Tian Ye, Kai Zhang, Zhaohu Xing, Yunlong Lin, Lei Zhu

Recent advancements in adverse weather restoration have shown potential, yet the unpredictable and varied combinations of weather degradations in the real world pose significant challenges.

Computational Efficiency Prompt Learning

Diff-VPS: Video Polyp Segmentation via a Multi-task Diffusion Network with Adversarial Temporal Reasoning

1 code implementation 11 Sep 2024 Yingling Lu, Yijun Yang, Zhaohu Xing, Qiong Wang, Lei Zhu

We incorporate multi-task supervision into diffusion models to promote the discrimination of diffusion models on pixel-by-pixel segmentation.

Segmentation Video Polyp Segmentation

Serp-Mamba: Advancing High-Resolution Retinal Vessel Segmentation with Selective State-Space Model

no code implementations 6 Sep 2024 Hongqiu Wang, Yixian Chen, Wu Chen, Huihui Xu, Haoyu Zhao, Bin Sheng, Huazhu Fu, Guang Yang, Lei Zhu

Based on the above observations, we first devise a Serpentine Interwoven Adaptive (SIA) scan mechanism, which scans UWF-SLO images along curved vessel structures in a snake-like crawling manner.

Mamba Retinal Vessel Segmentation

SegTalker: Segmentation-based Talking Face Generation with Mask-guided Local Editing

no code implementations 5 Sep 2024 Lingyu Xiong, Xize Cheng, Jintao Tan, Xianjia Wu, Xiandong Li, Lei Zhu, Fei Ma, Minglei Li, Huang Xu, Zhihu Hu

Ultimately, we inject the previously generated talking segmentation and style codes into a mask-guided StyleGAN to synthesize video frame.

Facial Editing Segmentation +1

Timeline and Boundary Guided Diffusion Network for Video Shadow Detection

1 code implementation 21 Aug 2024 Haipeng Zhou, Honqiu Wang, Tian Ye, Zhaohu Xing, Jun Ma, Ping Li, Qiong Wang, Lei Zhu

Moreover, we are the first to introduce the Diffusion model for VSD in which we explore a Space-Time Encoded Embedding (STEE) to inject the temporal guidance for Diffusion to conduct shadow detection.

Shadow Detection Video Shadow Detection

Language-Driven Interactive Shadow Detection

1 code implementation 16 Aug 2024 Hongqiu Wang, Wei Wang, Haipeng Zhou, Huihui Xu, Shaozhi Wu, Lei Zhu

Based on this dataset, we propose a Referring Shadow-Track Memory Network (RSM-Net) for addressing the RVSD task.

Descriptive Shadow Detection +2

Landmark-guided Diffusion Model for High-fidelity and Temporally Coherent Talking Head Generation

no code implementations 3 Aug 2024 Jintao Tan, Xize Cheng, Lingyu Xiong, Lei Zhu, Xiandong Li, Xianjia Wu, Kai Gong, Minglei Li, Yi Cai

Audio-driven talking head generation is a significant and challenging task applicable to various fields such as virtual avatars, film production, and online conferences.

Denoising Talking Head Generation

RainMamba: Enhanced Locality Learning with State Space Models for Video Deraining

1 code implementation 31 Jul 2024 Hongtao Wu, Yijun Yang, Huihui Xu, Weiming Wang, Jinni Zhou, Lei Zhu

Recently, the linear-complexity operator of the state space models (SSMs) has contrarily facilitated efficient long-term temporal modeling, which is crucial for rain streaks and raindrops removal in videos.

Optical Flow Estimation Rain Removal +3

RestoreAgent: Autonomous Image Restoration Agent via Multimodal Large Language Models

no code implementations 25 Jul 2024 Haoyu Chen, Wenbo Li, Jinjin Gu, Jingjing Ren, Sixiang Chen, Tian Ye, Renjing Pei, Kaiwen Zhou, Fenglong Song, Lei Zhu

RestoreAgent autonomously assesses the type and extent of degradation in input images and performs restoration through (1) determining the appropriate restoration tasks, (2) optimizing the task sequence, (3) selecting the most suitable models, and (4) executing the restoration.

Image Restoration Low-Light Image Enhancement

AGLLDiff: Guiding Diffusion Models Towards Unsupervised Training-free Real-world Low-light Image Enhancement

no code implementations 20 Jul 2024 Yunlong Lin, Tian Ye, Sixiang Chen, Zhenqi Fu, Yingying Wang, Wenhao Chai, Zhaohu Xing, Lei Zhu, Xinghao Ding

Existing low-light image enhancement (LIE) methods have achieved noteworthy success in solving synthetic distortions, yet they often fall short in practical applications.

Attribute Low-Light Image Enhancement

UltraPixel: Advancing Ultra-High-Resolution Image Synthesis to New Peaks

no code implementations 2 Jul 2024 Jingjing Ren, Wenbo Li, Haoyu Chen, Renjing Pei, Bin Shao, Yong Guo, Long Peng, Fenglong Song, Lei Zhu

Ultra-high-resolution image generation poses great challenges, such as increased semantic planning complexity and detail synthesis difficulties, alongside substantial training resource demands.

Computational Efficiency Denoising +1

Low-Rank Mixture-of-Experts for Continual Medical Image Segmentation

no code implementations 19 Jun 2024 Qian Chen, Lei Zhu, Hangzhou He, Xinliang Zhang, Shuang Zeng, Qiushi Ren, Yanye Lu

However, the incorrect pseudo-labels may corrupt the learned feature and lead to a new problem that the better the model is trained on the old task, the poorer the model performs on the new tasks.

Continual Learning Image Segmentation +3

ViDSOD-100: A New Dataset and a Baseline Model for RGB-D Video Salient Object Detection

1 code implementation 18 Jun 2024 Junhao Lin, Lei Zhu, Jiaxing Shen, Huazhu Fu, Qing Zhang, Liansheng Wang

However, existing salient object detection (SOD) works focus only on either static RGB-D images or RGB videos, ignoring the combination of RGB-D and video information.

object-detection Salient Object Detection +4

Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99%

1 code implementation 17 Jun 2024 Lei Zhu, Fangyun Wei, Yanye Lu, Dong Chen

We demonstrate the superior performance of our model over its counterparts across a variety of tasks, including image reconstruction, image classification, auto-regressive image generation using GPT, and image creation with diffusion- and flow-based generative models.

image-classification Image Classification +3

Decision Boundary-aware Knowledge Consolidation Generates Better Instance-Incremental Learner

no code implementations 5 Jun 2024 Qiang Nie, WeiFu Fu, Yuhuan Lin, Jialin Li, Yifeng Zhou, Yong liu, Lei Zhu, Chengjie Wang

Two issues have to be tackled in the new IIL setting: 1) the notorious catastrophic forgetting because of no access to old data, and 2) broadening the existing decision boundary to new observations because of concept drift.

class-incremental learning Class Incremental Learning +2

SuperCLUE-Fin: Graded Fine-Grained Analysis of Chinese LLMs on Diverse Financial Tasks and Applications

no code implementations 29 Apr 2024 Liang Xu, Lei Zhu, Yaotong Wu, Hang Xue

The SuperCLUE-Fin (SC-Fin) benchmark is a pioneering evaluation framework tailored for Chinese-native financial large language models (FLMs).

Computational Efficiency Logical Reasoning +1

MindTuner: Cross-Subject Visual Decoding with Visual Fingerprint and Semantic Correction

no code implementations 19 Apr 2024 Zixuan Gong, Qi Zhang, Guangyin Bao, Lei Zhu, Ke Liu, Liang Hu, Duoqian Miao

Decoding natural visual scenes from brain activity has flourished, with extensive research on single-subject tasks but far less on cross-subject tasks.

Image Reconstruction Text Retrieval

DragTraffic: Interactive and Controllable Traffic Scene Generation for Autonomous Driving

no code implementations 19 Apr 2024 Sheng Wang, Ge Sun, Fulong Ma, Tianshuai Hu, Qiang Qin, Yongkang Song, Lei Zhu, Junwei Liang

Inspired by DragGAN in image generation, we propose DragTraffic, a generalized, interactive, and controllable traffic scene generation framework based on conditional diffusion.

Autonomous Driving Diversity +2

Disentangled Cascaded Graph Convolution Networks for Multi-Behavior Recommendation

1 code implementation 17 Apr 2024 Zhiyong Cheng, Jianhua Dong, Fan Liu, Lei Zhu, Xun Yang, Meng Wang

Furthermore, these models overlook the personalized nature of user behavioral preferences by employing uniform transformation networks for all users and items.

Recommendation Systems

Dynamic Backtracking in GFlowNets: Enhancing Decision Steps with Reward-Dependent Adjustment Mechanisms

no code implementations 8 Apr 2024 Shuai Guo, Jielei Chu, Lei Zhu, Zhaoyu Li, Tianrui Li

This paper introduces a novel variant of GFNs, the Dynamic Backtracking GFN (DB-GFN), which improves the adaptability of decision-making steps through a reward-based dynamic backtracking mechanism.

Decision Making

Inverse Rendering of Glossy Objects via the Neural Plenoptic Function and Radiance Fields

no code implementations CVPR 2024 Haoyuan Wang, WenBo Hu, Lei Zhu, Rynson W. H. Lau

Our method has two stages: the geometry of the target object and the pre-filtered environmental radiance fields are reconstructed in the first stage, and materials of the target object are estimated in the second stage with the proposed NeP and material-aware cone sampling strategy.

Inverse Rendering NeRF +1

Analytic-Splatting: Anti-Aliased 3D Gaussian Splatting via Analytic Integration

no code implementations 17 Mar 2024 Zhihao Liang, Qi Zhang, WenBo Hu, Ying Feng, Lei Zhu, Kui Jia

This is because 3DGS treats each pixel as an isolated, single point rather than as an area, causing insensitivity to changes in the footprints of pixels.

3DGS
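
The point-versus-area distinction in the summary above can be illustrated in one dimension. The sketch below compares sampling an unnormalised Gaussian at the pixel centre with integrating it over the unit-width pixel window, using the exact closed form based on the error function. It is a simplified illustration under stated assumptions, not necessarily the paper's formulation; the function names are hypothetical.

```python
import math


def gaussian_point_sample(u: float, mu: float, sigma: float) -> float:
    """Value of exp(-(x - mu)^2 / (2 sigma^2)) at the pixel centre x = u."""
    return math.exp(-0.5 * ((u - mu) / sigma) ** 2)


def gaussian_pixel_integral(u: float, mu: float, sigma: float) -> float:
    """Integral of the same Gaussian over the pixel window [u - 0.5, u + 0.5]."""
    scale = sigma * math.sqrt(math.pi / 2.0)
    hi = (u + 0.5 - mu) / (sigma * math.sqrt(2.0))
    lo = (u - 0.5 - mu) / (sigma * math.sqrt(2.0))
    return scale * (math.erf(hi) - math.erf(lo))
```

For a Gaussian much wider than a pixel the two values nearly coincide, while for a Gaussian much narrower than a pixel the point sample at the centre stays at 1 and the window integral shrinks toward `sigma * sqrt(2 * pi)`, which is the footprint sensitivity that point sampling misses.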

Genuine Knowledge from Practice: Diffusion Test-Time Adaptation for Video Adverse Weather Removal

1 code implementation CVPR 2024 Yijun Yang, Hongtao Wu, Angelica I. Aviles-Rivero, Yulun Zhang, Jing Qin, Lei Zhu

Although ViWS-Net is proposed to remove adverse weather conditions in videos with a single set of pre-trained weights, it is seriously blinded by seen weather at train-time and degenerates when coming to unseen weather during test-time.

Test-time Adaptation

Beyond Text: Frozen Large Language Models in Visual Signal Comprehension

1 code implementation CVPR 2024 Lei Zhu, Fangyun Wei, Yanye Lu

To achieve this, we present the Vision-to-Language Tokenizer, abbreviated as V2T Tokenizer, which transforms an image into a "foreign language" with the combined aid of an encoder-decoder, the LLM vocabulary, and a CLIP model.

Deblurring Decoder +7

Agile Multi-Source-Free Domain Adaptation

1 code implementation 8 Mar 2024 Xinyao Li, Jingjing Li, Fengling Li, Lei Zhu, Ke Lu

Efficiently utilizing rich knowledge in pretrained models has become a critical topic in the era of large models.

Source-Free Domain Adaptation Specificity

Low-Res Leads the Way: Improving Generalization for Super-Resolution by Self-Supervised Learning

no code implementations CVPR 2024 Haoyu Chen, Wenbo Li, Jinjin Gu, Jingjing Ren, Haoze Sun, Xueyi Zou, Zhensong Zhang, Youliang Yan, Lei Zhu

Leveraging unseen LR images for self-supervised learning guides the model to adapt its modeling space to the target domain, facilitating fine-tuning of SR models without requiring paired high-resolution (HR) images.

Image Super-Resolution Self-Supervised Learning

Domain-Agnostic Mutual Prompting for Unsupervised Domain Adaptation

no code implementations CVPR 2024 Zhekai Du, Xinyao Li, Fengling Li, Ke Lu, Lei Zhu, Jingjing Li

Specifically, the image contextual information is utilized to prompt the language branch in a domain-agnostic and instance-conditioned way.

Transfer Learning Unsupervised Domain Adaptation

Scribble Hides Class: Promoting Scribble-Based Weakly-Supervised Semantic Segmentation with Its Class Label

1 code implementation 27 Feb 2024 Xinliang Zhang, Lei Zhu, Hangzhou He, Lujia Jin, Yanye Lu

In this study, we propose a class-driven scribble promotion network, which utilizes both scribble annotations and pseudo-labels informed by image-level classes and global semantics for supervision.

Segmentation Weakly supervised Semantic Segmentation +1

RelayAttention for Efficient Large Language Model Serving with Long System Prompts

1 code implementation 22 Feb 2024 Lei Zhu, Xinjiang Wang, Wayne Zhang, Rynson W. H. Lau

To eliminate such a redundancy, we propose RelayAttention, an attention algorithm that allows reading these hidden states from DRAM exactly once for a batch of input tokens.

Language Modeling Language Modelling +1

Data and Physics driven Deep Learning Models for Fast MRI Reconstruction: Fundamentals and Methodologies

no code implementations 29 Jan 2024 Jiahao Huang, Yinzhe Wu, Fanwen Wang, Yingying Fang, Yang Nan, Cagan Alkan, Daniel Abraham, Congyu Liao, Lei Xu, Zhifan Gao, Weiwen Wu, Lei Zhu, Zhaolin Chen, Peter Lally, Neal Bangerter, Kawin Setsompop, Yike Guo, Daniel Rueckert, Ge Wang, Guang Yang

Magnetic Resonance Imaging (MRI) is a pivotal clinical diagnostic tool, yet its extended scanning times often compromise patient comfort and image quality, especially in volumetric, temporal and quantitative scans.

Diagnostic Federated Learning +1

Vivim: a Video Vision Mamba for Medical Video Segmentation

1 code implementation 25 Jan 2024 Yijun Yang, Zhaohu Xing, Chunwang Huang, Lei Zhu

To this end, this paper presents a Video Vision Mamba-based framework, dubbed as Vivim, for medical video segmentation tasks.

Lesion Segmentation Mamba +5

SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation

1 code implementation 24 Jan 2024 Zhaohu Xing, Tian Ye, Yijun Yang, Guang Liu, Lei Zhu

Our SegMamba, in contrast to Transformer-based methods, excels in whole volume feature modeling from a state space model standpoint, maintaining superior processing speed, even with volume features at a resolution of $64\times 64\times 64$.

Image Segmentation Mamba +2

SuperCLUE-Math6: Graded Multi-Step Math Reasoning Benchmark for LLMs in Chinese

1 code implementation 22 Jan 2024 Liang Xu, Hang Xue, Lei Zhu, Kangkang Zhao

We introduce SuperCLUE-Math6 (SC-Math6), a new benchmark dataset to evaluate the mathematical reasoning abilities of Chinese language models.

Diversity GSM8K +2

EPA: Neural Collapse Inspired Robust Out-of-Distribution Detector

no code implementations 3 Jan 2024 Jiawei Zhang, Yufan Chen, Cheng Jin, Lei Zhu, Yuantao Gu

Out-of-distribution (OOD) detection plays a crucial role in ensuring the security of neural networks.

Out of Distribution (OOD) Detection

Learning Diffusion Texture Priors for Image Restoration

no code implementations CVPR 2024 Tian Ye, Sixiang Chen, Wenhao Chai, Zhaohu Xing, Jing Qin, Ge Lin, Lei Zhu

When adopting diffusion models for image restoration, the crucial challenge lies in how to preserve high-level image fidelity in the randomness diffusion process and generate accurate background structures and realistic texture details.

Image Generation Image Restoration

Towards Flexible, Scalable, and Adaptive Multi-Modal Conditioned Face Synthesis

no code implementations 26 Dec 2023 Jingjing Ren, Cheng Xu, Haoyu Chen, Xinran Qin, Lei Zhu

Recent progress in multi-modal conditioned face synthesis has enabled the creation of visually striking and accurately aligned facial images.

Denoising Face Generation

Lite-Mind: Towards Efficient and Robust Brain Representation Network

1 code implementation 6 Dec 2023 Zixuan Gong, Qi Zhang, Guangyin Bao, Lei Zhu, Yu Zhang, Ke Liu, Liang Hu, Duoqian Miao

The limited data availability and the low signal-to-noise ratio of fMRI signals lead to the challenging task of fMRI-to-image retrieval.

Brain Decoding Image Retrieval +3

GDTS: Goal-Guided Diffusion Model with Tree Sampling for Multi-Modal Pedestrian Trajectory Prediction

no code implementations25 Nov 2023 Ge Sun, Sheng Wang, Lei Zhu, Ming Liu, Jun Ma

To address these challenges and facilitate the use of diffusion models in multi-modal trajectory prediction, we propose GDTS, a novel Goal-Guided Diffusion Model with Tree Sampling for multi-modal trajectory prediction.

Autonomous Driving Denoising +3

SC-Safety: A Multi-round Open-ended Question Adversarial Safety Benchmark for Large Language Models in Chinese

no code implementations9 Oct 2023 Liang Xu, Kangkang Zhao, Lei Zhu, Hang Xue

To systematically assess the safety of Chinese LLMs, we introduce SuperCLUE-Safety (SC-Safety) - a multi-round adversarial benchmark with 4912 open-ended questions covering more than 20 safety sub-dimensions.

Model Selection Natural Language Understanding

Shifting More Attention to Breast Lesion Segmentation in Ultrasound Videos

1 code implementation3 Oct 2023 Junhao Lin, Qian Dai, Lei Zhu, Huazhu Fu, Qiong Wang, Weibin Li, Wenhao Rao, Xiaoyang Huang, Liansheng Wang

We also devise a localization-based contrastive loss to reduce the lesion location distance between neighboring video frames within the same video and enlarge the location distances between frames from different ultrasound videos.

Lesion Segmentation Segmentation +1

Video Adverse-Weather-Component Suppression Network via Weather Messenger and Adversarial Backpropagation

1 code implementation ICCV 2023 Yijun Yang, Angelica I. Aviles-Rivero, Huazhu Fu, Ye Liu, Weiming Wang, Lei Zhu

In this work, we propose the first framework for restoring videos from all adverse weather conditions by developing a video adverse-weather-component suppression network (ViWS-Net).

Decoder

Multi-level Asymmetric Contrastive Learning for Volumetric Medical Image Segmentation Pre-training

no code implementations21 Sep 2023 Shuang Zeng, Lei Zhu, Xinliang Zhang, Qian Chen, Hangzhou He, Lujia Jin, Zifeng Tian, Qiushi Ren, Zhaoheng Xie, Yanye Lu

Moreover, we develop a multi-level contrastive learning strategy that integrates correspondences across feature-level, image-level, and pixel-level representations to ensure the encoder and decoder capture comprehensive details from representations of varying scales and granularities during the pre-training phase.

Contrastive Learning Decoder +4

Towards Self-Adaptive Pseudo-Label Filtering for Semi-Supervised Learning

no code implementations18 Sep 2023 Lei Zhu, Zhanghan Ke, Rynson Lau

In this work, we observe that the distribution gap between the confidence values of correct and incorrect pseudo labels emerges at the very beginning of the training, which can be utilized to filter pseudo labels.

Pseudo Label Pseudo Label Filtering

Cross-Modal Retrieval: A Systematic Review of Methods and Future Directions

1 code implementation28 Aug 2023 Tianshi Wang, Fengling Li, Lei Zhu, Jingjing Li, Zheng Zhang, Heng Tao Shen

With the exponential surge in diverse multi-modal data, traditional uni-modal retrieval methods struggle to meet the needs of users seeking access to data across various modalities.

Cross-Modal Retrieval Retrieval

Sparse Sampling Transformer with Uncertainty-Driven Ranking for Unified Removal of Raindrops and Rain Streaks

1 code implementation ICCV 2023 Sixiang Chen, Tian Ye, Jinbin Bai, ErKang Chen, Jun Shi, Lei Zhu

In the real world, image degradations caused by rain often exhibit a combination of rain streaks and raindrops, thereby increasing the challenges of recovering the underlying clean image.

Rain Removal

Federated Pseudo Modality Generation for Incomplete Multi-Modal MRI Reconstruction

no code implementations20 Aug 2023 Yunlu Yan, Chun-Mei Feng, Yuexiang Li, Rick Siow Mong Goh, Lei Zhu

In this paper, we propose a novel communication-efficient federated learning framework, namely Fed-PMG, to address the missing modality challenge in federated multi-modal MRI reconstruction.

Federated Learning MRI Reconstruction

Rethinking Client Drift in Federated Learning: A Logit Perspective

no code implementations20 Aug 2023 Yunlu Yan, Chun-Mei Feng, Mang Ye, WangMeng Zuo, Ping Li, Rick Siow Mong Goh, Lei Zhu, C. L. Philip Chen

Concretely, FedCSD introduces a class prototype similarity distillation to align the local logits with the refined global logits that are weighted by the similarity between local logits and the global prototype.

Federated Learning

Video-Instrument Synergistic Network for Referring Video Instrument Segmentation in Robotic Surgery

no code implementations18 Aug 2023 Hongqiu Wang, Lei Zhu, Guang Yang, Yike Guo, Shichen Zhang, Bo Xu, Yueming Jin

Our method is verified on these datasets, and experimental results show that VIS-Net can significantly outperform existing state-of-the-art referring segmentation methods.

Robot Navigation Segmentation

Branches Mutual Promotion for End-to-End Weakly Supervised Semantic Segmentation

no code implementations9 Aug 2023 Lei Zhu, Hangzhou He, Xinliang Zhang, Qian Chen, Shuang Zeng, Qiushi Ren, Yanye Lu

Existing methods adopt an online-trained classification branch to provide pseudo annotations for supervising the segmentation branch.

Classification Segmentation +3

SuperCLUE: A Comprehensive Chinese Large Language Model Benchmark

no code implementations27 Jul 2023 Liang Xu, Anqi Li, Lei Zhu, Hang Xue, Changtai Zhu, Kangkang Zhao, Haonan He, Xuanwei Zhang, Qiyue Kang, Zhenzhong Lan

We fill this gap by proposing a comprehensive Chinese benchmark SuperCLUE, named after another popular Chinese LLM benchmark CLUE.

Language Modeling Language Modelling +2

A Simple Data Augmentation for Feature Distribution Skewed Federated Learning

no code implementations CVPR 2025 Yunlu Yan, Huazhu Fu, Yuexiang Li, Jinheng Xie, Jun Ma, Guang Yang, Lei Zhu

In this paper, we focus on the feature distribution skewed FL scenario, a common non-IID situation in real-world applications where data from different clients exhibit varying underlying distributions.

Data Augmentation Federated Learning

Cross-Modal Vertical Federated Learning for MRI Reconstruction

no code implementations5 Jun 2023 Yunlu Yan, Hong Wang, Yawen Huang, Nanjun He, Lei Zhu, Yuexiang Li, Yong Xu, Yefeng Zheng

To this end, we formulate this practical-yet-challenging cross-modal vertical federated learning task, in which shape data from multiple hospitals have different modalities with a small amount of multi-modality data collected from the same individuals.

Disentanglement MRI Reconstruction +1

Dynamic Interactive Relation Capturing via Scene Graph Learning for Robotic Surgical Report Generation

no code implementations5 Jun 2023 Hongqiu Wang, Yueming Jin, Lei Zhu

For robot-assisted surgery, an accurate surgical report reflects clinical operations during surgery and helps document entry tasks, post-operative analysis and follow-up treatment.

Graph Learning Relation

Identity-Guided Collaborative Learning for Cloth-Changing Person Reidentification

no code implementations10 Apr 2023 Zan Gao, Shenxun Wei, Weili Guan, Lei Zhu, Meng Wang, Shenyong Chen

Moreover, human semantic information and pedestrian identity information are not fully explored.

Automated Prompting for Non-overlapping Cross-domain Sequential Recommendation

no code implementations9 Apr 2023 Lei Guo, Chunxiao Wang, Xinhua Wang, Lei Zhu, Hongzhi Yin

Cross-domain Recommendation (CR) has been extensively studied in recent years to alleviate the data sparsity issue in recommender systems by utilizing different domain information.

Prompt Learning Sequential Recommendation

Multi-Behavior Recommendation with Cascading Graph Convolution Networks

1 code implementation28 Mar 2023 Zhiyong Cheng, Sai Han, Fan Liu, Lei Zhu, Zan Gao, Yuxin Peng

Most existing multi-behavior models fail to capture such dependencies in a behavior chain for embedding learning.

Masked Image Training for Generalizable Deep Image Denoising

1 code implementation CVPR 2023 Haoyu Chen, Jinjin Gu, Yihao Liu, Salma Abdel Magid, Chao Dong, Qiong Wang, Hanspeter Pfister, Lei Zhu

To address this issue, we present a novel approach to enhance the generalization performance of denoising networks, known as masked training.

Deep Learning Image Denoising

Neural Preset for Color Style Transfer

1 code implementation CVPR 2023 Zhanghan Ke, Yuhao Liu, Lei Zhu, Nanxuan Zhao, Rynson W. H. Lau

In this paper, we present a Neural Preset technique to address the limitations of existing color style transfer methods, including visual artifacts, vast memory requirement, and slow style switching speed.

4k Color Normalization +4

Distribution Aligned Diffusion and Prototype-guided network for Unsupervised Domain Adaptive Segmentation

1 code implementation22 Mar 2023 Haipeng Zhou, Lei Zhu, Yuyin Zhou

In order to explore its potential further, we have taken a step forward and considered a more complex scenario in the medical image domain, specifically, under an unsupervised adaptation condition.

DiffMIC: Dual-Guidance Diffusion Network for Medical Image Classification

1 code implementation19 Mar 2023 Yijun Yang, Huazhu Fu, Angelica I. Aviles-Rivero, Carola-Bibiane Schönlieb, Lei Zhu

However, while a substantial amount of diffusion-based research has focused on generative tasks, few studies have applied diffusion models to general medical image classification.

Diabetic Retinopathy Grading image-classification +4

HybridMIM: A Hybrid Masked Image Modeling Framework for 3D Medical Image Segmentation

1 code implementation18 Mar 2023 Zhaohu Xing, Lei Zhu, Lequan Yu, Zhiheng Xing, Liang Wan

Masked image modeling (MIM) with transformer backbones has recently been exploited as a powerful self-supervised pre-training technique.

Contrastive Learning Image Segmentation +3

Diff-UNet: A Diffusion Embedded Network for Volumetric Segmentation

1 code implementation18 Mar 2023 Zhaohu Xing, Liang Wan, Huazhu Fu, Guang Yang, Lei Zhu

Our experimental results also indicate the universality and effectiveness of the proposed model.

Denoising Segmentation

Learning Physical-Spatio-Temporal Features for Video Shadow Removal

no code implementations16 Mar 2023 Zhihao Chen, Liang Wan, Yefan Xiao, Lei Zhu, Huazhu Fu

Then, we develop a progressive aggregation module to enhance the spatial and temporal characteristics of feature maps, and effectively integrate the three kinds of features.

Shadow Removal Video Restoration

GeoSpark: Sparking up Point Cloud Segmentation with Geometry Clue

no code implementations14 Mar 2023 Zhening Huang, Xiaoyang Wu, Hengshuang Zhao, Lei Zhu, Shujun Wang, Georgios Hadjidemetriou, Ioannis Brilakis

For feature aggregation, it improves feature modeling by allowing the network to learn from both local points and neighboring geometry partitions, resulting in an enlarged data-tailored receptive field.

Point Cloud Segmentation

A Comprehensive Survey on Source-free Domain Adaptation

no code implementations23 Feb 2023 Zhiqi Yu, Jingjing Li, Zhekai Du, Lei Zhu, Heng Tao Shen

Over the past decade, domain adaptation has become a widely studied branch of transfer learning that aims to improve performance on target domains by leveraging knowledge from the source domain.

Source-Free Domain Adaptation Survey +1

One-Pot Multi-Frame Denoising

no code implementations18 Feb 2023 Lujia Jin, Shi Zhao, Lei Zhu, Qian Chen, Yanye Lu

Therefore, it is necessary to avoid the restriction of clean labels and make full use of noisy data for model training.

Denoising Diversity

Learning to Control and Coordinate Mixed Traffic Through Robot Vehicles at Complex and Unsignalized Intersections

2 code implementations12 Jan 2023 Dawei Wang, Weizi Li, Lei Zhu, Jia Pan

We propose a decentralized multi-agent reinforcement learning approach for the control and coordination of mixed traffic by RVs at real-world, complex intersections -- an open challenge to date.

Multi-agent Reinforcement Learning Traffic Signal Control

ReAssigner: A Plug-and-Play Virtual Machine Scheduling Intensifier for Heterogeneous Requests

no code implementations29 Nov 2022 Haochuan Cui, Junjie Sheng, Bo Jin, Yiqiu Hu, Li Su, Lei Zhu, Wenli Zhou, Xiangfeng Wang

With the rapid development of cloud computing, virtual machine scheduling has become one of the most important but challenging issues for the cloud computing community, especially for practical heterogeneous request sequences.

Cloud Computing Scheduling

Who is Gambling? Finding Cryptocurrency Gamblers Using Multi-modal Retrieval Methods

1 code implementation27 Nov 2022 Zhengjie Huang, Zhenguang Liu, Jianhai Chen, Qinming He, Shuang Wu, Lei Zhu, Meng Wang

Meanwhile, decentralized applications have also attracted intense attention from the online gambling community, with more and more decentralized gambling platforms created through the help of smart contracts.

Retrieval

CAMO-MOT: Combined Appearance-Motion Optimization for 3D Multi-Object Tracking with Camera-LiDAR Fusion

no code implementations6 Sep 2022 Li Wang, Xinyu Zhang, Wenyuan Qin, Xiaoyu Li, Lei Yang, Zhiwei Li, Lei Zhu, Hong Wang, Jun Li, Huaping Liu

As such, we propose a novel camera-LiDAR fusion 3D MOT framework based on the Combined Appearance-Motion Optimization (CAMO-MOT), which uses both camera and LiDAR data and significantly reduces tracking failures caused by occlusion and false detection.

3D Multi-Object Tracking Autonomous Driving +2

Joint Prediction of Meningioma Grade and Brain Invasion via Task-Aware Contrastive Learning

1 code implementation4 Sep 2022 Tianling Liu, Wennan Liu, Lequan Yu, Liang Wan, Tong Han, Lei Zhu

Preoperative and noninvasive prediction of the meningioma grade is important in clinical practice, as it directly influences the clinical decision making.

Contrastive Learning Decision Making +2

NestedFormer: Nested Modality-Aware Transformer for Brain Tumor Segmentation

1 code implementation31 Aug 2022 Zhaohu Xing, Lequan Yu, Liang Wan, Tong Han, Lei Zhu

Multi-modal MR imaging is routinely used in clinical practice to diagnose and investigate brain tumors by providing rich complementary information.

Brain Tumor Segmentation Decoder +3

Bagging Regional Classification Activation Maps for Weakly Supervised Object Localization

1 code implementation16 Jul 2022 Lei Zhu, Qian Chen, Lujia Jin, Yunfei You, Yanye Lu

Classification activation map (CAM), utilizing the classification structure to generate pixel-wise localization maps, is a crucial mechanism for weakly supervised object localization (WSOL).

Object Weakly-Supervised Object Localization

Harmonizer: Learning to Perform White-Box Image and Video Harmonization

1 code implementation4 Jul 2022 Zhanghan Ke, Chunyi Sun, Lei Zhu, Ke Xu, Rynson W. H. Lau

Unlike prior methods that are based on black-box autoencoders, Harmonizer contains a neural network for filter argument prediction and several white-box filters (based on the predicted arguments) for image harmonization.

Image Harmonization Video Harmonization

A New Dataset and A Baseline Model for Breast Lesion Detection in Ultrasound Videos

2 code implementations1 Jul 2022 Zhi Lin, Junhao Lin, Lei Zhu, Huazhu Fu, Jing Qin, Liansheng Wang

Moreover, we learn video-level features to classify the breast lesions of the original video as benign or malignant lesions to further enhance the final breast lesion detection performance in ultrasound videos.

Lesion Classification Lesion Detection

Time Interval-enhanced Graph Neural Network for Shared-account Cross-domain Sequential Recommendation

1 code implementation16 Jun 2022 Lei Guo, Jinyu Zhang, Li Tang, Tong Chen, Lei Zhu, Hongzhi Yin

Shared-account Cross-domain Sequential Recommendation (SCSR) task aims to recommend the next item via leveraging the mixed user behaviors in multiple domains.

Graph Neural Network Representation Learning +2

Copy Motion From One to Another: Fake Motion Video Generation

no code implementations3 May 2022 Zhenguang Liu, Sifan Wu, Chejian Xu, Xiang Wang, Lei Zhu, Shuang Wu, Fuli Feng

3) To enhance texture details, we encode facial features with geometric guidance and employ local GANs to refine the face, feet, and hands.

Video Generation

RSCFed: Random Sampling Consensus Federated Semi-supervised Learning

1 code implementation CVPR 2022 Xiaoxiao Liang, Yiqun Lin, Huazhu Fu, Lei Zhu, Xiaomeng Li

In this paper, we present a Random Sampling Consensus Federated learning, namely RSCFed, by considering the uneven reliability among models from fully-labeled clients, fully-unlabeled clients or partially labeled clients.

Federated Learning

Multi-modal learning for predicting the genotype of glioma

no code implementations21 Mar 2022 Yiran Wei, Xi Chen, Lei Zhu, Lipei Zhang, Carola-Bibiane Schönlieb, Stephen J. Price, Chao Li

In this study, we propose a multi-modal learning framework using three separate encoders to extract features of focal tumor image, tumor geometrics and global brain networks.

Clinical Knowledge Diffusion MRI +1

BoostMIS: Boosting Medical Image Semi-supervised Learning with Adaptive Pseudo Labeling and Informative Active Annotation

1 code implementation CVPR 2022 Wenqiao Zhang, Lei Zhu, James Hallinan, Andrew Makmur, Shengyu Zhang, Qingpeng Cai, Beng Chin Ooi

In this paper, we propose a novel semi-supervised learning (SSL) framework named BoostMIS that combines adaptive pseudo labeling and informative active annotation to unleash the potential of medical image SSL models: (1) BoostMIS can adaptively leverage the cluster assumption and consistency regularization of the unlabeled data according to the current learning status.

Active Learning

Weakly Supervised Object Localization as Domain Adaption

1 code implementation CVPR 2022 Lei Zhu, Qi She, Qian Chen, Yunfei You, Boyu Wang, Yanye Lu

To avoid this problem, this work provides a novel perspective that models WSOL as a domain adaption (DA) task, where the score estimator trained on the source/image domain is tested on the target/pixel domain to locate objects.

Classification Domain Adaptation +2

Content-Noise Complementary Learning for Medical Image Denoising

2 code implementations IEEE Transactions on Medical Imaging 2022 Mufeng Geng, Xiangxi Meng, Jiangyuan Yu, Lei Zhu, Lujia Jin, Zhe Jiang, Bin Qiu, Hui Li, Hanjing Kong, Jianmin Yuan, Kun Yang, Hongming Shan, Hongbin Han, Zhi Yang, Qiushi Ren, Yanye Lu

In this study, we propose a simple yet effective strategy, the content-noise complementary learning (CNCL) strategy, in which two deep learning predictors are used to learn the respective content and noise of the image dataset complementarily.

Generative Adversarial Network Image Denoising +1

Motion Prediction via Joint Dependency Modeling in Phase Space

no code implementations7 Jan 2022 Pengxiang Su, Zhenguang Liu, Shuang Wu, Lei Zhu, Yifang Yin, Xuanjing Shen

In this paper, we introduce a novel convolutional neural model to effectively leverage explicit prior knowledge of motion anatomy, and simultaneously capture both spatial and temporal information of joint trajectory dynamics.

Anatomy global-optimization +2

Boosting RGB-D Saliency Detection by Leveraging Unlabeled RGB Images

1 code implementation1 Jan 2022 Xiaoqiang Wang, Lei Zhu, Siliang Tang, Huazhu Fu, Ping Li, Fei Wu, Yi Yang, Yueting Zhuang

The depth estimation branch is trained with RGB-D images and then used to estimate the pseudo depth maps for all unlabeled RGB images to form the paired data.

Depth Estimation object-detection +3

Distinguishing Unseen From Seen for Generalized Zero-Shot Learning

no code implementations CVPR 2022 Hongzu Su, Jingjing Li, Zhi Chen, Lei Zhu, Ke Lu

In this paper, we present a novel method which leverages both visual and semantic modalities to distinguish seen and unseen categories.

Generalized Zero-Shot Learning

Background-aware Classification Activation Map for Weakly Supervised Object Localization

1 code implementation29 Dec 2021 Lei Zhu, Qi She, Qian Chen, Xiangxi Meng, Mufeng Geng, Lujia Jin, Zhe Jiang, Bin Qiu, Yunfei You, Yibao Zhang, Qiushi Ren, Yanye Lu

In our B-CAM, two image-level features, aggregated by pixel-level features of potential background and object locations, are used to purify the object feature from the object-related background and to represent the feature of the pure-background sample, respectively.

Classification Object +1

VMAgent: Scheduling Simulator for Reinforcement Learning

2 code implementations9 Dec 2021 Junjie Sheng, Shengliang Cai, Haochuan Cui, Wenhao Li, Yun Hua, Bo Jin, Wenli Zhou, Yiqiu Hu, Lei Zhu, Qian Peng, Hongyuan Zha, Xiangfeng Wang

A novel simulator called VMAgent is introduced to help RL researchers better explore new methods, especially for virtual machine scheduling.

Cloud Computing reinforcement-learning +3

Network-wide Multi-step Traffic Volume Prediction using Graph Convolutional Gated Recurrent Neural Network

1 code implementation22 Nov 2021 Lei Lin, Weizi Li, Lei Zhu

For instance, our model reduces MAE by 25.3%, RMSE by 29.2%, and MAPE by 20.2%, compared to the state-of-the-art Diffusion Convolutional Recurrent Neural Network (DCRNN) model using the hourly dataset.

Fast Camouflaged Object Detection via Edge-based Reversible Re-calibration Network

1 code implementation5 Nov 2021 Ge-Peng Ji, Lei Zhu, Mingchen Zhuge, Keren Fu

Camouflaged Object Detection (COD) aims to detect objects with similar patterns (e.g., texture, intensity, colour, etc.) to their surroundings, and recently has attracted growing research interest.

Camouflaged Object Segmentation Image Segmentation +3

Domain Adaptive Semantic Segmentation without Source Data

1 code implementation13 Oct 2021 Fuming You, Jingjing Li, Lei Zhu, Ke Lu, Zhi Chen, Zi Huang

To address these problems, we investigate domain adaptive semantic segmentation without source data, which assumes that the model is pre-trained on the source domain and then adapted to the target domain without accessing source data anymore.

Segmentation Semantic Segmentation

Boundary-aware Transformers for Skin Lesion Segmentation

1 code implementation8 Oct 2021 Jiacheng Wang, Lan Wei, Liansheng Wang, Qichao Zhou, Lei Zhu, Jing Qin

Skin lesion segmentation from dermoscopy images is of great importance for improving the quantitative analysis of skin cancer.

Inductive Bias Lesion Segmentation +2

HCDG: A Hierarchical Consistency Framework for Domain Generalization on Medical Image Segmentation

1 code implementation13 Sep 2021 Yijun Yang, Shujun Wang, Lei Zhu, Lequan Yu

Particularly, for the Extrinsic Consistency, we leverage the knowledge across multiple source domains to enforce data-level consistency.

Data Augmentation Domain Generalization +4

Towards Robust Cross-domain Image Understanding with Unsupervised Noise Removal

no code implementations9 Sep 2021 Lei Zhu, Zhaojing Luo, Wei Wang, Meihui Zhang, Gang Chen, Kaiping Zheng

In multimedia analysis, domain adaptation studies the problem of cross-domain knowledge transfer from a label-rich source domain to a label-scarce target domain, thus potentially alleviating the annotation requirement for deep learning models.

Domain Adaptation Transfer Learning

MT-ORL: Multi-Task Occlusion Relationship Learning

1 code implementation ICCV 2021 Panhe Feng, Qi She, Lei Zhu, Jiaxin Li, Lin Zhang, Zijian Feng, Changhu Wang, Chunpeng Li, Xuejing Kang, Anlong Ming

Retrieving occlusion relations among objects in a single image is challenging due to the sparsity of boundaries in the image.

Decoder

From Synthetic to Real: Image Dehazing Collaborating with Unlabeled Real Data

1 code implementation6 Aug 2021 Ye Liu, Lei Zhu, Shunda Pei, Huazhu Fu, Jing Qin, Qing Zhang, Liang Wan, Wei Feng

Our DID-Net predicts the three component maps by progressively integrating features across scales, and refines each map by passing it through an independent refinement network.

Image Dehazing Single Image Dehazing

Unifying Nonlocal Blocks for Neural Networks

1 code implementation ICCV 2021 Lei Zhu, Qi She, Duo Li, Yanye Lu, Xuejing Kang, Jie Hu, Changhu Wang

The nonlocal-based blocks are designed for capturing long-range spatial-temporal dependencies in computer vision tasks.

Action Recognition image-classification +3

Adversarial Energy Disaggregation for Non-intrusive Load Monitoring

no code implementations2 Aug 2021 Zhekai Du, Jingjing Li, Lei Zhu, Ke Lu, Heng Tao Shen

Energy disaggregation, also known as non-intrusive load monitoring (NILM), addresses the problem of separating the whole-home electricity usage into appliance-specific individual consumptions, which is a typical application of data analysis.

Non-Intrusive Load Monitoring

Bayesian Statistics Guided Label Refurbishment Mechanism: Mitigating Label Noise in Medical Image Classification

1 code implementation23 Jun 2021 Mengdi Gao, Ximeng Feng, Mufeng Geng, Zhe Jiang, Lei Zhu, Xiangxi Meng, Chuanqing Zhou, Qiushi Ren, Yanye Lu

BLRM utilizes maximum a posteriori probability (MAP) in the Bayesian statistics and the exponentially time-weighted technique to selectively correct the labels of noisy images.

image-classification Image Classification +1

Cross-Domain Gradient Discrepancy Minimization for Unsupervised Domain Adaptation

2 code implementations CVPR 2021 Zhekai Du, Jingjing Li, Hongzu Su, Lei Zhu, Ke Lu

Previous bi-classifier adversarial learning methods only focus on the similarity between the outputs of two distinct classifiers.

Clustering Self-Supervised Learning +1

UGRec: Modeling Directed and Undirected Relations for Recommendation

1 code implementation10 May 2021 Xinxiao Zhao, Zhiyong Cheng, Lei Zhu, Jiecai Zheng, Xueqing Li

In particular, for a directed relation, we transform the head and tail entities into the corresponding relation space to model their relation; and for an undirected co-occurrence relation, we project head and tail entities into a unique hyperplane in the entity space to minimize their distance.

Attribute Collaborative Filtering +2

DA-GCN: A Domain-aware Attentive Graph Convolution Network for Shared-account Cross-domain Sequential Recommendation

no code implementations7 May 2021 Lei Guo, Li Tang, Tong Chen, Lei Zhu, Quoc Viet Hung Nguyen, Hongzhi Yin

Shared-account Cross-domain Sequential Recommendation (SCSR) is the task of recommending the next item based on a sequence of recorded user behaviors, where multiple users share a single account, and their behaviors are available in multiple domains.

Sequential Recommendation Transfer Learning

Global Guidance Network for Breast Lesion Segmentation in Ultrasound Images

no code implementations5 Apr 2021 Cheng Xue, Lei Zhu, Huazhu Fu, Xiaowei Hu, Xiaomeng Li, Hai Zhang, Pheng Ann Heng

The BD modules learn an additional breast lesion boundary map to enhance the boundary quality of the refined segmentation result.

Boundary Detection Image Segmentation +3

Learning the Superpixel in a Non-iterative and Lifelong Manner

1 code implementation CVPR 2021 Lei Zhu, Qi She, Bin Zhang, Yanye Lu, Zhilin Lu, Duo Li, Jie Hu

Superpixels are generated by automatically clustering pixels in an image into hundreds of compact partitions, and are widely used to perceive object contours for their excellent contour adherence.

Clustering Lifelong learning

Triple-cooperative Video Shadow Detection

1 code implementation CVPR 2021 Zhihao Chen, Liang Wan, Lei Zhu, Jia Shen, Huazhu Fu, Wennan Liu, Jing Qin

The bottleneck is the lack of a well-established dataset with high-quality annotations for video shadow detection.

Saliency Detection Semantic Segmentation +3

Feature-level Attentive ICF for Recommendation

1 code implementation22 Feb 2021 Zhiyong Cheng, Fan Liu, Shenghan Mei, Yangyang Guo, Lei Zhu, Liqiang Nie

To demonstrate the effectiveness of our method, we design a light attention neural network to integrate both item-level and feature-level attention for neural ICF models.

Collaborative Filtering Recommendation Systems

Interest-aware Message-Passing GCN for Recommendation

1 code implementation19 Feb 2021 Fan Liu, Zhiyong Cheng, Lei Zhu, Zan Gao, Liqiang Nie

To form the subgraphs, we design an unsupervised subgraph generation module, which can effectively identify users with common interests by exploiting both user feature and graph structure.

Deep Texture-Aware Features for Camouflaged Object Detection

no code implementations5 Feb 2021 Jingjing Ren, Xiaowei Hu, Lei Zhu, Xuemiao Xu, Yangyang Xu, Weiming Wang, Zijun Deng, Pheng-Ann Heng

Camouflaged object detection is a challenging task that aims to identify objects having similar texture to the surroundings.

Object object-detection +1

A Unified Framework to Analyze and Design the Nonlocal Blocks for Neural Networks

no code implementations1 Jan 2021 Lei Zhu, Qi She, Changhu Wang

When choosing the Chebyshev graph filter, a generalized formulation can be derived to explain the existing nonlocal-based blocks (e.g., nonlocal block, nonlocal stage, double attention block) and used to analyze their irrationality.

Action Recognition Fine-Grained Image Classification +1

Mitigating Intensity Bias in Shadow Detection via Feature Decomposition and Reweighting

no code implementations ICCV 2021 Lei Zhu, Ke Xu, Zhanghan Ke, Rynson W.H. Lau

These two phenomena reveal that deep shadow detectors heavily depend on the intensity cue, which we refer to as intensity bias.

Shadow Detection

MLCask: Efficient Management of Component Evolution in Collaborative Data Analytics Pipelines

no code implementations17 Oct 2020 Zhaojing Luo, Sai Ho Yeung, Meihui Zhang, Kaiping Zheng, Lei Zhu, Gang Chen, Feiyi Fan, Qian Lin, Kee Yuan Ngiam, Beng Chin Ooi

In this paper, we identify two main challenges that arise during the deployment of machine learning pipelines, and address them with the design of versioning for an end-to-end analytics system MLCask.

BIG-bench Machine Learning Management

Learning to Detect Specular Highlights from Real-world Images

no code implementations10 Oct 2020 Gang Fu, Qing Zhang, QiFeng Lin, Lei Zhu, Chunxia Xiao

Specular highlight detection is a challenging problem, and has many applications such as shiny object detection and light source estimation.

Highlight Detection object-detection +1

Dual-level Semantic Transfer Deep Hashing for Efficient Social Image Retrieval

1 code implementation10 Jun 2020 Lei Zhu, Hui Cui, Zhiyong Cheng, Jingjing Li, Zheng Zhang

Specifically, we design a complementary dual-level semantic transfer mechanism to efficiently discover the potential semantics of tags and seamlessly transfer them into binary hash codes.

Deep Hashing Image Retrieval +1

Constrained Multi-shape Evolution for Overlapping Cytoplasm Segmentation

no code implementations8 Apr 2020 Youyi Song, Lei Zhu, Baiying Lei, Bin Sheng, Qi Dou, Jing Qin, Kup-Sze Choi

In the shape evolution, we compensate intensity deficiency for the segmentation by introducing not only the modeled local shape priors but also global shape priors (clump-level) modeled by considering mutual shape constraints of cytoplasms in the clump.

Task-adaptive Asymmetric Deep Cross-modal Hashing

no code implementations1 Apr 2020 Fengling Li, Tong Wang, Lei Zhu, Zheng Zhang, Xinhua Wang

Unlike previous cross-modal hashing approaches, our learning framework jointly optimizes the semantic preserving that transforms deep features of multimedia data into binary hash codes and the semantic regression that directly regresses the query modality representation to an explicit label.

Cross-Modal Retrieval Retrieval

Multi-Feature Discrete Collaborative Filtering for Fast Cold-start Recommendation

no code implementations24 Mar 2020 Yang Xu, Lei Zhu, Zhiyong Cheng, Jingjing Li, Jiande Sun

Additionally, we develop a fast discrete optimization algorithm to directly compute the binary hash codes with simple operations.

Collaborative Filtering Quantization

A^2-GCN: An Attribute-aware Attentive GCN Model for Recommendation

no code implementations20 Mar 2020 Fan Liu, Zhiyong Cheng, Lei Zhu, Chenghao Liu, Liqiang Nie

Considering the fact that for different users, the attributes of an item have different influence on their preference for this item, we design a novel attention mechanism to filter the message passed from an item to a target user by considering the attribute information.

Attribute Recommendation Systems

Neural Networks Weights Quantization: Target None-retraining Ternary (TNT)

no code implementations18 Dec 2019 Tianyu Zhang, Lei Zhu, Qian Zhao, Kilho Shin

Quantization of weights of deep neural networks (DNN) has proven to be an effective solution for implementing DNNs on edge devices such as mobiles, ASICs and FPGAs, because such devices lack sufficient resources to support computation involving millions of high precision weights and multiply-accumulate operations.

Quantization

DDNet: Dual-path Decoder Network for Occlusion Relationship Reasoning

no code implementations26 Nov 2019 Panhe Feng, Xuejing Kang, Lizhu Ye, Lei Zhu, Chunpeng Li, Anlong Ming

Besides, considering the restriction that occlusion orientation representation imposes on occlusion orientation learning, we design a new orthogonal representation for occlusion orientation and propose the Orthogonal Orientation Regression loss, which removes the mismatch between occlusion representation and learning and further promotes occlusion orientation learning.

Decoder regression

A Spectral Nonlocal Block for Neural Networks

no code implementations4 Nov 2019 Lei Zhu, Qi She, Lidan Zhang, Ping Guo

The nonlocal-based blocks are designed for capturing long-range spatial-temporal dependencies in computer vision tasks.

Action Recognition Fine-Grained Image Classification +4

CANet: Cross-disease Attention Network for Joint Diabetic Retinopathy and Diabetic Macular Edema Grading

1 code implementation4 Nov 2019 Xiaomeng Li, Xiao-Wei Hu, Lequan Yu, Lei Zhu, Chi-Wing Fu, Pheng-Ann Heng

In this paper, we present a novel cross-disease attention network (CANet) to jointly grade DR and DME by exploring the internal relationship between the diseases with only image-level supervision.

Distribution Matching Prototypical Network for Unsupervised Domain Adaptation

no code implementations25 Sep 2019 Lei Zhu, Wei Wang, Mei Hui Zhang, Beng Chin Ooi, Chang Yao

State-of-the-art Unsupervised Domain Adaptation (UDA) methods learn transferable features by minimizing the feature distribution discrepancy between the source and target domains.

Unsupervised Domain Adaptation

Spectral Nonlocal Block for Neural Network

no code implementations25 Sep 2019 Lei Zhu, Qi She, Lidan Zhang, Ping Guo

The nonlocal network is designed for capturing long-range spatial-temporal dependencies in several computer vision tasks.

Video Classification

Alleviating Feature Confusion for Generative Zero-shot Learning

1 code implementation17 Sep 2019 Jingjing Li, Mengmeng Jing, Ke Lu, Lei Zhu, Yang Yang, Zi Huang

An inevitable issue of such a paradigm is that the synthesized unseen features are prone to seen references and incapable of reflecting the novelty and diversity of real unseen instances.

Generalized Zero-Shot Learning

Cycle-consistent Conditional Adversarial Transfer Networks

1 code implementation17 Sep 2019 Jingjing Li, Erpeng Chen, Zhengming Ding, Lei Zhu, Ke Lu, Zi Huang

Domain adaptation investigates the problem of cross-domain knowledge transfer where the labeled source domain and unlabeled target domain have distinctive data distributions.

Domain Adaptation Transfer Learning

Personalized Hashtag Recommendation for Micro-videos

1 code implementation27 Aug 2019 Yinwei Wei, Zhiyong Cheng, Xuzheng Yu, Zhou Zhao, Lei Zhu, Liqiang Nie

The hashtags that a user provides for a post (e.g., a micro-video) are the ones that, in her mind, best describe the post content she is interested in.

Enhancing Underexposed Photos using Perceptually Bidirectional Similarity

no code implementations25 Jul 2019 Qing Zhang, Yongwei Nie, Lei Zhu, Chunxia Xiao, Wei-Shi Zheng

To obtain high-quality results free of these artifacts, we present a novel underexposed photo enhancement approach that is able to maintain the perceptual consistency.

Video Enhancement

Probabilistic Multilayer Regularization Network for Unsupervised 3D Brain Image Registration

no code implementations3 Jul 2019 Lihao Liu, Xiaowei Hu, Lei Zhu, Pheng-Ann Heng

This paper presents a novel framework for unsupervised 3D brain image registration by capturing the feature-level transformation relationships between the unaligned image and reference image.

Image Registration

Deep Attentive Features for Prostate Segmentation in 3D Transrectal Ultrasound

1 code implementation3 Jul 2019 Yi Wang, Haoran Dou, Xiao-Wei Hu, Lei Zhu, Xin Yang, Ming Xu, Jing Qin, Pheng-Ann Heng, Tianfu Wang, Dong Ni

Our attention module utilizes the attention mechanism to selectively leverage the multilevel features integrated from different layers to refine the features at each individual layer, suppressing the non-prostate noise at shallow layers of the CNN and increasing more prostate details into features at deep layers.

Image Segmentation Medical Image Segmentation +2

From Zero-Shot Learning to Cold-Start Recommendation

1 code implementation20 Jun 2019 Jingjing Li, Mengmeng Jing, Ke Lu, Lei Zhu, Yang Yang, Zi Huang

This work, for the first time, formulates CSR as a ZSL problem, and a tailor-made ZSL method is proposed to handle CSR.

Decoder Recommendation Systems +1

PAC-GAN: An Effective Pose Augmentation Scheme for Unsupervised Cross-View Person Re-identification

no code implementations5 Jun 2019 Chengyuan Zhang, Lei Zhu, Shichao Zhang

In this paper, we introduce a novel unsupervised pose augmentation cross-view person Re-Id scheme called PAC-GAN to overcome these limitations.

Cross-Modal Person Re-Identification Generative Adversarial Network +2

Fusion-supervised Deep Cross-modal Hashing

no code implementations25 Apr 2019 Li Wang, Lei Zhu, En Yu, Jiande Sun, Huaxiang Zhang

Deep hashing has recently received attention in cross-modal retrieval for its impressive advantages.

Cross-Modal Retrieval Deep Hashing

Exploring Auxiliary Context: Discrete Semantic Transfer Hashing for Scalable Image Retrieval

no code implementations25 Apr 2019 Lei Zhu, Zi Huang, Zhihui Li, Liang Xie, Heng Tao Shen

To address the problem, in this paper, we propose a novel hashing approach, dubbed Discrete Semantic Transfer Hashing (DSTH).

Content-Based Image Retrieval Retrieval

Discrete Optimal Graph Clustering

1 code implementation25 Apr 2019 Yudong Han, Lei Zhu, Zhiyong Cheng, Jingjing Li, Xiaobai Liu

2) the relaxing process of cluster labels may cause significant information loss.

Clustering Graph Clustering +1
