Search Results for author: Zhiwei Xiong

Found 118 papers, 48 papers with code

Photon-Efficient 3D Imaging with A Non-Local Neural Network

no code implementations ECCV 2020 Jiayong Peng, Zhiwei Xiong, Xin Huang, Zheng-Ping Li, Dong Liu, Feihu Xu

Photon-efficient imaging has enabled a number of applications relying on single-photon sensors that can capture a 3D image with as few as one photon per pixel.

Spatial Hierarchy Aware Residual Pyramid Network for Time-of-Flight Depth Denoising

1 code implementation ECCV 2020 Guanting Dong, Yueyi Zhang, Zhiwei Xiong

In this paper, we propose a Spatial Hierarchy Aware Residual Pyramid Network, called SHARP-Net, to remove the depth noise by fully exploiting the geometry information of the scene on different scales.

Denoising

Drive-R1: Bridging Reasoning and Planning in VLMs for Autonomous Driving with Reinforcement Learning

no code implementations 23 Jun 2025 Yue Li, Meng Tian, Dechang Zhu, Jiangtong Zhu, Zhenyu Lin, Zhiwei Xiong, Xinhai Zhao

However, we identify two critical challenges in this direction: (1) VLMs tend to learn shortcuts by relying heavily on history input information, achieving seemingly strong planning results without genuinely understanding the visual inputs; and (2) the chain-of-thought (COT) reasoning processes are always misaligned with the motion planning outcomes, and how to effectively leverage the complex reasoning capability to enhance planning remains largely underexplored.

Autonomous Driving Motion Planning

Text-Queried Audio Source Separation via Hierarchical Modeling

no code implementations 27 May 2025 Xinlei Yin, Xiulian Peng, Xue Jiang, Zhiwei Xiong, Yan Lu

We first perform global-semantic separation through a global semantic feature space aligned with text queries.

Audio Source Separation Natural Language Queries

Event-Enhanced Blurry Video Super-Resolution

1 code implementation 17 Apr 2025 Dachun Kai, Yueyi Zhang, Jin Wang, Zeyu Xiao, Zhiwei Xiong, Xiaoyan Sun

In this paper, we tackle the task of blurry video super-resolution (BVSR), aiming to generate high-resolution (HR) videos from low-resolution (LR) and blurry inputs.

Deblurring Motion Estimation +2

Fine-Grained Evaluation of Large Vision-Language Models in Autonomous Driving

1 code implementation 27 Mar 2025 Yue Li, Meng Tian, Zhenyu Lin, Jiangtong Zhu, Dechang Zhu, Haiqiang Liu, Zining Wang, Yueyi Zhang, Zhiwei Xiong, Xinhai Zhao

To further exploit the cognitive and reasoning interactions among the 5 domains for AD understanding, we start from a small-scale VLM and train the DS models on individual domain datasets (collected from 1.4M DS QAs across public sources).

Attribute Autonomous Driving +3

All-in-One Image Compression and Restoration

1 code implementation 5 Feb 2025 Huimin Zeng, Jiacheng Li, Ziqiang Zheng, Zhiwei Xiong

Visual images corrupted by various types and levels of degradations are commonly encountered in practical image compression.

All Image Compression +1

QMamba: Post-Training Quantization for Vision State Space Models

no code implementations 23 Jan 2025 Yinglong Li, Xiaoyu Liu, Jiacheng Li, Ruikang Xu, Yinda Chen, Zhiwei Xiong

State Space Models (SSMs), as key components of Mamba, have gained increasing attention for vision models recently, thanks to their efficient long sequence modeling capability.

Quantization State Space Models

InsTex: Indoor Scenes Stylized Texture Synthesis

no code implementations 22 Jan 2025 Yunfan Zhang, Zhiwei Xiong, Zhiqi Shen, Guosheng Lin, Hao Wang, Nicolas Vun

Generating high-quality textures for 3D scenes is crucial for applications in interior design, gaming, and augmented/virtual reality (AR/VR).

Texture Synthesis

Plug-and-Play Versatile Compressed Video Enhancement

no code implementations CVPR 2025 Huimin Zeng, Jiacheng Li, Zhiwei Xiong

As a widely adopted technique in data transmission, video compression effectively reduces the size of files, making it possible for real-time cloud computing.

Cloud Computing Video Compression +1

S2D-LFE: Sparse-to-Dense Light Field Event Generation

1 code implementation CVPR 2025 Yutong Liu, Wenming Weng, Yueyi Zhang, Zhiwei Xiong

For the first time to our knowledge, S2D-LFE enables controllable novel view synthesis only from sparse-view light field event (LFE) data, and addresses three critical challenges for the LFE generation task: simplicity, controllability, and consistency.

Novel View Synthesis

TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction

no code implementations 22 Dec 2024 Xuying Zhang, Yutong Liu, Yangguang Li, Renrui Zhang, Yufei Liu, Kai Wang, Wanli Ouyang, Zhiwei Xiong, Peng Gao, Qibin Hou, Ming-Ming Cheng

We present TAR3D, a novel framework that consists of a 3D-aware Vector Quantized-Variational AutoEncoder (VQ-VAE) and a Generative Pre-trained Transformer (GPT) to generate high-quality 3D assets.

Image to 3D Text to 3D

Event-boosted Deformable 3D Gaussians for Dynamic Scene Reconstruction

no code implementations 25 Nov 2024 Wenhao Xu, Wenming Weng, Yueyi Zhang, Ruikang Xu, Zhiwei Xiong

To address this, we introduce the first approach combining event cameras, which capture high-temporal-resolution, continuous motion data, with deformable 3D-GS for dynamic scene reconstruction.

3D Reconstruction Dynamic Reconstruction

Accelerated Multi-Contrast MRI Reconstruction via Frequency and Spatial Mutual Learning

1 code implementation 21 Sep 2024 Qi Chen, Xiaohan Xing, Zhen Chen, Zhiwei Xiong

To exploit complementary information from the auxiliary modality, we propose a Cross-Modal Selective fusion (CMS-fusion) module that selectively incorporates the frequency and spatial features from the auxiliary modality to enhance the corresponding branch of the target modality.

MRI Reconstruction

Generalizable Non-Line-of-Sight Imaging with Learnable Physical Priors

no code implementations 21 Sep 2024 Shida Sun, Yue Li, Yueyi Zhang, Zhiwei Xiong

Non-line-of-sight (NLOS) imaging, recovering the hidden volume from indirect reflections, has attracted increasing attention due to its potential applications.

DRExplainer: Quantifiable Interpretability in Drug Response Prediction with Directed Graph Convolutional Network

1 code implementation 22 Aug 2024 Haoyuan Shi, Tao Xu, Xiaodi Li, Qian Gao, Zhiwei Xiong, Junfeng Xia, Zhenyu Yue

DRExplainer constructs a directed bipartite network integrating multi-omics profiles of cell lines, the chemical structure of drugs and known drug response to achieve directed prediction.

Decision Making Drug Response Prediction +1

LeRF: Learning Resampling Function for Adaptive and Efficient Image Interpolation

1 code implementation 13 Jul 2024 Jiacheng Li, Chang Chen, Fenglong Song, Youliang Yan, Zhiwei Xiong

Image resampling is a basic technique that is widely employed in daily applications, such as camera photo editing.

CEIA: CLIP-Based Event-Image Alignment for Open-World Event-Based Understanding

no code implementations 9 Jul 2024 Wenhao Xu, Wenming Weng, Yueyi Zhang, Zhiwei Xiong

In response to this challenge, CEIA learns to align event and image data as an alternative instead of directly aligning event and text data.

Contrastive Learning Domain Adaptation +3

Mamba-based Light Field Super-Resolution with Efficient Subspace Scanning

no code implementations 23 Jun 2024 Ruisheng Gao, Zeyu Xiao, Zhiwei Xiong

Transformer-based methods have demonstrated impressive performance in 4D light field (LF) super-resolution by effectively modeling long-range spatial-angular correlations, but their quadratic complexity hinders the efficient processing of high resolution 4D inputs, resulting in slow inference speed and high memory cost.

Mamba Super-Resolution

Diffusion-Promoted HDR Video Reconstruction

no code implementations 12 Jun 2024 Yuanshen Guan, Ruikang Xu, Mingde Yao, Ruisheng Gao, Lizhi Wang, Zhiwei Xiong

In this paper, we propose a diffusion-promoted method for HDR video reconstruction, termed HDR-V-Diff, which incorporates a diffusion model to capture the HDR distribution.

Video Reconstruction

TokenUnify: Scalable Autoregressive Visual Pre-training with Mixture Token Prediction

1 code implementation 27 May 2024 Yinda Chen, Haoyuan Shi, Xiaoyu Liu, Te Shi, Ruobing Zhang, Dong Liu, Zhiwei Xiong, Feng Wu

Autoregressive next-token prediction is a standard pretraining method for large-scale language models, but its application to vision tasks is hindered by the non-sequential nature of image data, leading to cumulative errors.

Mamba Prediction +1

UniCompress: Enhancing Multi-Data Medical Image Compression with Knowledge Distillation

no code implementations 27 May 2024 Runzhao Yang, Yinda Chen, Zhihong Zhang, Xiaoyu Liu, Zongren Li, Kunlun He, Zhiwei Xiong, Jinli Suo, Qionghai Dai

In the field of medical image compression, Implicit Neural Representation (INR) networks have shown remarkable versatility due to their flexible compression ratios, yet they are constrained by a one-to-one fitting approach that results in lengthy encoding times.

Image Compression Knowledge Distillation +1

Incorporating Degradation Estimation in Light Field Spatial Super-Resolution

no code implementations 11 May 2024 Zeyu Xiao, Zhiwei Xiong

Recent advancements in light field super-resolution (SR) have yielded impressive results.

Super-Resolution

Multi-modal Learnable Queries for Image Aesthetics Assessment

no code implementations 2 May 2024 Zhiwei Xiong, Yunfan Zhang, Zhiqi Shen, Peiran Ren, Han Yu

Instead of directly extracting aesthetic features solely from the image, user comments associated with an image could potentially provide complementary knowledge that is useful for IAA.

Event-assisted Low-Light Video Object Segmentation

1 code implementation CVPR 2024 Hebei Li, Jin Wang, Jiahui Yuan, Yue Li, Wenming Weng, Yansong Peng, Yueyi Zhang, Zhiwei Xiong, Xiaoyan Sun

In the realm of video object segmentation (VOS), the challenge of operating under low-light conditions persists, resulting in notably degraded image quality and compromised accuracy when comparing query and memory frames for similarity computation.

Object Semantic Segmentation +2

BIMCV-R: A Landmark Dataset for 3D CT Text-Image Retrieval

no code implementations 24 Mar 2024 Yinda Chen, Che Liu, Xiaoyu Liu, Rossella Arcucci, Zhiwei Xiong

The burgeoning integration of 3D medical imaging into healthcare has led to a substantial increase in the workload of medical professionals.

Diagnostic Image to text +2

Towards Generalizable Tumor Synthesis

1 code implementation CVPR 2024 Qi Chen, Xiaoxi Chen, Haorui Song, Zhiwei Xiong, Alan Yuille, Chen Wei, Zongwei Zhou

Tumor synthesis enables the creation of artificial tumors in medical images, facilitating the training of AI models for tumor detection and segmentation.

Computed Tomography (CT)

Physics-Inspired Degradation Models for Hyperspectral Image Fusion

no code implementations 4 Feb 2024 Jie Lian, Lizhi Wang, Lin Zhu, Renwei Dian, Zhiwei Xiong, Hua Huang

To fill this gap, we propose physics-inspired degradation models (PIDM) to model the degradation of LR-HSI and HR-MSI, which comprises a spatial degradation network (SpaDN) and a spectral degradation network (SpeDN).

Diffusion-based Light Field Synthesis

no code implementations 1 Feb 2024 Ruisheng Gao, Yutong Liu, Zeyu Xiao, Zhiwei Xiong

Light fields (LFs), conducive to comprehensive scene radiance recorded across angular dimensions, find wide applications in 3D reconstruction, virtual reality, and computational photography. However, the LF acquisition is inevitably time-consuming and resource-intensive due to the mainstream acquisition strategy involving manual capture or laborious software synthesis. Given such a challenge, we introduce LFdiff, a straightforward yet effective diffusion-based generative framework tailored for LF synthesis, which adopts only a single RGB image as input. LFdiff leverages disparity estimated by a monocular depth estimation network and incorporates two distinctive components: a novel condition scheme and a noise estimation network tailored for LF data. Specifically, we design a position-aware warping condition scheme, enhancing inter-view geometry learning via a robust conditional signal. We then propose DistgUnet, a disentanglement-based noise estimation network, to harness comprehensive LF representations. Extensive experiments demonstrate that LFdiff excels in synthesizing visually pleasing and disparity-controllable light fields with enhanced generalization capability. Additionally, comprehensive results affirm the broad applicability of the generated LF data, spanning applications like LF super-resolution and refocusing.

3D Reconstruction Disentanglement +3

Style-Consistent 3D Indoor Scene Synthesis with Decoupled Objects

no code implementations 24 Jan 2024 Yunfan Zhang, Hong Huang, Zhiwei Xiong, Zhiqi Shen, Guosheng Lin, Hao Wang, Nicholas Vun

The core strength of our pipeline lies in its ability to generate 3D scenes that are not only visually impressive but also exhibit features like photorealism, multi-view consistency, and diversity.

Diversity Indoor Scene Synthesis

Graph Relation Distillation for Efficient Biomedical Instance Segmentation

2 code implementations 12 Jan 2024 Xiaoyu Liu, Yueyi Zhang, Zhiwei Xiong, Wei Huang, Bo Hu, Xiaoyan Sun, Feng Wu

IGD constructs a graph representing instance features and relations, transferring these two types of knowledge by enforcing instance graph consistency.

Instance Segmentation Knowledge Distillation +2

Learning Multimodal Volumetric Features for Large-Scale Neuron Tracing

1 code implementation 5 Jan 2024 Qihua Chen, Xuejin Chen, Chenxuan Wang, Yixiong Liu, Zhiwei Xiong, Feng Wu

In this work, we aim to reduce human workload by predicting connectivity between over-segmented neuron pieces, taking both microscopy image and 3D morphology features into account, similar to human proofreading workflow.

Contrastive Learning Image Segmentation +1

Look-Up Table Compression for Efficient Image Restoration

1 code implementation CVPR 2024 Yinglong Li, Jiacheng Li, Zhiwei Xiong

In this work we propose a novel LUT compression framework to achieve a better trade-off between storage size and performance for LUT-based image restoration models.

Computational Efficiency Image Restoration +1

Cross-Dimension Affinity Distillation for 3D EM Neuron Segmentation

1 code implementation CVPR 2024 Xiaoyu Liu, Miaomiao Cai, Yinda Chen, Yueyi Zhang, Te Shi, Ruobing Zhang, Xuejin Chen, Zhiwei Xiong

Recent advancements utilize 3D CNNs to predict a 3D affinity map with improved accuracy but suffer from two challenges: high computational cost and limited input size especially for practical deployment for large-scale EM volumes.

Segmentation Transfer Learning

Learning Large-Factor EM Image Super-Resolution with Generative Priors

1 code implementation CVPR 2024 Jiateng Shou, Zeyu Xiao, Shiyu Deng, Wei Huang, Peiyao Shi, Ruobing Zhang, Zhiwei Xiong, Feng Wu

As the mainstream technique for capturing images of biological specimens at nanometer resolution, electron microscopy (EM) is extremely time-consuming for scanning wide field-of-view (FOV) specimens.

Image Super-Resolution Video Super-Resolution

CBQ: Cross-Block Quantization for Large Language Models

no code implementations 13 Dec 2023 Xin Ding, Xiaoyu Liu, Zhijun Tu, Yun Zhang, Wei Li, Jie Hu, Hanting Chen, Yehui Tang, Zhiwei Xiong, Baoqun Yin, Yunhe Wang

Post-training quantization (PTQ) has played a key role in compressing large language models (LLMs) with ultra-low costs.

Quantization

Neural Degradation Representation Learning for All-In-One Image Restoration

1 code implementation 19 Oct 2023 Mingde Yao, Ruikang Xu, Yuanshen Guan, Jie Huang, Zhiwei Xiong

To this end, we propose to learn a neural degradation representation (NDR) that captures the underlying characteristics of various degradations.

All Representation Learning +1

Self-Supervised Neuron Segmentation with Multi-Agent Reinforcement Learning

1 code implementation 6 Oct 2023 Yinda Chen, Wei Huang, Shenglong Zhou, Qi Chen, Zhiwei Xiong

By extracting semantic information from unlabeled data, self-supervised methods can improve the performance of downstream tasks, among which the mask image model (MIM) has been widely used due to its simplicity and effectiveness in recovering original information from masked images.

Multi-agent Reinforcement Learning reinforcement-learning +3

Image Aesthetics Assessment via Learnable Queries

no code implementations 6 Sep 2023 Zhiwei Xiong, Yunfan Zhang, Zhiqi Shen, Peiran Ren, Han Yu

Image aesthetics assessment (IAA) aims to estimate the aesthetics of images.

Domain Adaptive Synapse Detection with Weak Point Annotations

no code implementations 31 Aug 2023 Qi Chen, Wei Huang, Yueyi Zhang, Zhiwei Xiong

In the second stage, we improve model generalizability on target data by regenerating square masks to get high-quality pseudo labels.

Segmentation

Generalized Lightness Adaptation with Channel Selective Normalization

1 code implementation ICCV 2023 Mingde Yao, Jie Huang, Xin Jin, Ruikang Xu, Shenglong Zhou, Man Zhou, Zhiwei Xiong

Existing methods typically work well on their trained lightness conditions but perform poorly in unknown ones due to their limited generalization ability.

Image Retouching Low-Light Image Enhancement +1

Mutual-Guided Dynamic Network for Image Fusion

1 code implementation 24 Aug 2023 Yuanshen Guan, Ruikang Xu, Mingde Yao, Lizhi Wang, Zhiwei Xiong

Image fusion aims to generate a high-quality image from multiple images captured under varying conditions.

Feature Decoupling-Recycling Network for Fast Interactive Segmentation

no code implementations 7 Aug 2023 Huimin Zeng, Weinong Wang, Xin Tao, Zhiwei Xiong, Yu-Wing Tai, Wenjie Pei

First, our model decouples the learning of source image semantics from the encoding of user guidance to process two types of input domains separately.

Image Segmentation Interactive Segmentation +3

Deep Multi-Threshold Spiking-UNet for Image Processing

1 code implementation 20 Jul 2023 Hebei Li, Yueyi Zhang, Zhiwei Xiong, Xiaoyan Sun

Furthermore, we adopt a flow-based training method to fine-tune the converted models, reducing time steps while preserving performance.

Denoising Image Segmentation +1

Stimulating Diffusion Model for Image Denoising via Adaptive Embedding and Ensembling

1 code implementation 8 Jul 2023 Tong Li, Hansen Feng, Lizhi Wang, Zhiwei Xiong, Hua Huang

Image denoising is a fundamental problem in computational photography, where achieving high perception with low distortion is highly demanding.

Color Image Denoising +1

Generative Text-Guided 3D Vision-Language Pretraining for Unified Medical Image Segmentation

no code implementations 7 Jun 2023 Yinda Chen, Che Liu, Wei Huang, Sibo Cheng, Rossella Arcucci, Zhiwei Xiong

To address these challenges, we present Generative Text-Guided 3D Vision-Language Pretraining for Unified Medical Image Segmentation (GTGM), a framework that extends VLP to 3D medical images without relying on paired textual descriptions.

Computed Tomography (CT) Contrastive Learning +4

Toward Real-World Light Field Super-Resolution

1 code implementation 30 May 2023 Zeyu Xiao, Ruisheng Gao, Yutong Liu, Yueyi Zhang, Zhiwei Xiong

Deep learning has opened up new possibilities for light field super-resolution (SR), but existing methods trained on synthetic datasets with simple degradations (e.g., bicubic downsampling) suffer from poor performance when applied to complex real-world scenarios.

Super-Resolution

A Dive into SAM Prior in Image Restoration

no code implementations 23 May 2023 Zeyu Xiao, Jiawang Bai, Zhihe Lu, Zhiwei Xiong

This motivates the investigation and incorporation of prior knowledge in order to effectively constrain the solution space and enhance the quality of the restored images.

Color Image Denoising Image Denoising +2

Can SAM Boost Video Super-Resolution?

no code implementations 11 May 2023 Zhihe Lu, Zeyu Xiao, Jiawang Bai, Zhiwei Xiong, Xinchao Wang

To use the SAM-based prior, we propose a simple yet effective module -- SAM-guidEd refinEment Module (SEEM), which can enhance both alignment and fusion procedures by the utilization of semantic information.

Optical Flow Estimation Video Super-Resolution

Region-Aware Portrait Retouching with Sparse Interactive Guidance

1 code implementation 8 Apr 2023 Huimin Zeng, Jie Huang, Jiacheng Li, Zhiwei Xiong

Specifically, we propose a region-aware retouching framework with two branches: an automatic branch and an interactive branch.

Why is the winner the best?

no code implementations CVPR 2023 Matthias Eisenmann, Annika Reinke, Vivienn Weru, Minu Dietlinde Tizabi, Fabian Isensee, Tim J. Adler, Sharib Ali, Vincent Andrearczyk, Marc Aubreville, Ujjwal Baid, Spyridon Bakas, Niranjan Balu, Sophia Bano, Jorge Bernal, Sebastian Bodenstedt, Alessandro Casella, Veronika Cheplygina, Marie Daum, Marleen de Bruijne, Adrien Depeursinge, Reuben Dorent, Jan Egger, David G. Ellis, Sandy Engelhardt, Melanie Ganz, Noha Ghatwary, Gabriel Girard, Patrick Godau, Anubha Gupta, Lasse Hansen, Kanako Harada, Mattias Heinrich, Nicholas Heller, Alessa Hering, Arnaud Huaulmé, Pierre Jannin, Ali Emre Kavur, Oldřich Kodym, Michal Kozubek, Jianning Li, Hongwei Li, Jun Ma, Carlos Martín-Isla, Bjoern Menze, Alison Noble, Valentin Oreiller, Nicolas Padoy, Sarthak Pati, Kelly Payette, Tim Rädsch, Jonathan Rafael-Patiño, Vivek Singh Bawa, Stefanie Speidel, Carole H. Sudre, Kimberlin Van Wijnen, Martin Wagner, Donglai Wei, Amine Yamlahi, Moi Hoon Yap, Chun Yuan, Maximilian Zenk, Aneeq Zia, David Zimmerer, Dogu Baran Aydogan, Binod Bhattarai, Louise Bloch, Raphael Brüngel, Jihoon Cho, Chanyeol Choi, Qi Dou, Ivan Ezhov, Christoph M. Friedrich, Clifton Fuller, Rebati Raman Gaire, Adrian Galdran, Álvaro García Faura, Maria Grammatikopoulou, SeulGi Hong, Mostafa Jahanifar, Ikbeom Jang, Abdolrahim Kadkhodamohammadi, Inha Kang, Florian Kofler, Satoshi Kondo, Hugo Kuijf, Mingxing Li, Minh Huan Luu, Tomaž Martinčič, Pedro Morais, Mohamed A. Naser, Bruno Oliveira, David Owen, Subeen Pang, Jinah Park, Sung-Hong Park, Szymon Płotka, Elodie Puybareau, Nasir Rajpoot, Kanghyun Ryu, Numan Saeed, Adam Shephard, Pengcheng Shi, Dejan Štepec, Ronast Subedi, Guillaume Tochon, Helena R. Torres, Helene Urien, João L. Vilaça, Kareem Abdul Wahid, Haojie Wang, Jiacheng Wang, Liansheng Wang, Xiyue Wang, Benedikt Wiestler, Marek Wodzinski, Fangfang Xia, Juanying Xie, Zhiwei Xiong, Sen Yang, Yanwu Yang, Zixuan Zhao, Klaus Maier-Hein, Paul F. Jäger, Annette Kopp-Schneider, Lena Maier-Hein

The "typical" lead of a winning team is a computer scientist with a doctoral degree, five years of experience in biomedical image analysis, and four years of experience in deep learning.

Benchmarking Multi-Task Learning

Toward DNN of LUTs: Learning Efficient Image Restoration with Multiple Look-Up Tables

1 code implementation 25 Mar 2023 Jiacheng Li, Chang Chen, Zhen Cheng, Zhiwei Xiong

However, the size of a single LUT grows exponentially with the increase of its indexing capacity, which restricts its receptive field and thus the performance.

Demosaicking Denoising +1

Toward RAW Object Detection: A New Benchmark and a New Model

no code implementations CVPR 2023 Ruikang Xu, Chang Chen, Jingyang Peng, Cheng Li, Yibin Huang, Fenglong Song, Youliang Yan, Zhiwei Xiong

In many computer vision applications (e.g., robotics and autonomous driving), high dynamic range (HDR) data is necessary for object detection algorithms to handle a variety of lighting conditions, such as strong glare.

Autonomous Driving Object +2

Camouflaged Instance Segmentation via Explicit De-Camouflaging

no code implementations CVPR 2023 Naisong Luo, Yuwen Pan, Rui Sun, Tianzhu Zhang, Zhiwei Xiong, Feng Wu

To address these challenges, we propose a novel De-camouflaging Network (DCNet) including a pixel-level camouflage decoupling module and an instance-level camouflage suppression module.

Instance Segmentation Segmentation +1

Event-Based Blurry Frame Interpolation Under Blind Exposure

1 code implementation CVPR 2023 Wenming Weng, Yueyi Zhang, Zhiwei Xiong

Therefore, we first propose an exposure estimation strategy guided by event streams to estimate the lost exposure prior, making the blind exposure problem well-posed.

Zero-Shot Dual-Lens Super-Resolution

1 code implementation CVPR 2023 Ruikang Xu, Mingde Yao, Zhiwei Xiong

To overcome these two challenges, we propose a degradation-invariant alignment method and a degradation-aware training strategy to fully exploit the information within a single dual-lens pair.

Super-Resolution

Style Projected Clustering for Domain Generalized Semantic Segmentation

no code implementations CVPR 2023 Wei Huang, Chang Chen, Yong Li, Jiacheng Li, Cheng Li, Fenglong Song, Youliang Yan, Zhiwei Xiong

In contrast to existing methods, we instead utilize the difference between images to build a better representation space, where the distinct style features are extracted and stored as the bases of representation.

Clustering Semantic Segmentation

Adaptive Template Transformer for Mitochondria Segmentation in Electron Microscopy Images

no code implementations ICCV 2023 Yuwen Pan, Naisong Luo, Rui Sun, Meng Meng, Tianzhu Zhang, Zhiwei Xiong, Yongdong Zhang

Mitochondria, as tiny structures within the cell, are of significant importance to study cell functions for biological and clinical analysis.

Unsupervised Video Deraining with An Event Camera

1 code implementation ICCV 2023 Jin Wang, Wenming Weng, Yueyi Zhang, Zhiwei Xiong

Current unsupervised video deraining methods are inefficient in modeling the intricate spatio-temporal properties of rain, which leads to unsatisfactory results.

Contrastive Learning Rain Removal +1

Learning Cross-Representation Affinity Consistency for Sparsely Supervised Biomedical Instance Segmentation

1 code implementation ICCV 2023 Xiaoyu Liu, Wei Huang, Zhiwei Xiong, Shenglong Zhou, Yueyi Zhang, Xuejin Chen, Zheng-Jun Zha, Feng Wu

Sparse instance-level supervision has recently been explored to address insufficient annotation in biomedical instance segmentation; compared to common semi-supervision, it makes crowded instances easier to annotate and better preserves instance completeness for 3D volumetric datasets. In this paper, we propose a sparsely supervised biomedical instance segmentation framework via cross-representation affinity consistency regularization.

Instance Segmentation Pseudo Label +1

NLOST: Non-Line-of-Sight Imaging With Transformer

no code implementations CVPR 2023 Yue Li, Jiayong Peng, Juntian Ye, Yueyi Zhang, Feihu Xu, Zhiwei Xiong

Specifically, after extracting the shallow features with the assistance of physics-based priors, we design two spatial-temporal self attention encoders to explore both local and global correlations within 3D NLOS data by splitting or downsampling the features into different scales, respectively.

Decoder

CutMIB: Boosting Light Field Super-Resolution via Multi-View Image Blending

1 code implementation CVPR 2023 Zeyu Xiao, Yutong Liu, Ruisheng Gao, Zhiwei Xiong

For the first time in light field SR, we propose a potent DA strategy called CutMIB to improve the performance of existing light field SR networks while keeping their structures unchanged.

Data Augmentation Denoising +1

Hierarchical Prompt Learning for Multi-Task Learning

no code implementations CVPR 2023 Yajing Liu, Yuning Lu, Hao Liu, Yaozu An, Zhuoran Xu, Zhuokun Yao, Baofeng Zhang, Zhiwei Xiong, Chenguang Gui

Considering this, we present Hierarchical Prompt (HiPro) learning, a simple and effective method for jointly adapting a pre-trained VLM to multiple downstream tasks.

Multi-Task Learning Prompt Learning

Learning Sample Relationship for Exposure Correction

no code implementations CVPR 2023 Jie Huang, Feng Zhao, Man Zhou, Jie Xiao, Naishan Zheng, Kaiwen Zheng, Zhiwei Xiong

The exposure correction task aims to correct underexposed and overexposed images to normal exposure within a single network.

Exposure Correction Task 2

A Soma Segmentation Benchmark in Full Adult Fly Brain

1 code implementation CVPR 2023 Xiaoyu Liu, Bo Hu, Mingxing Li, Wei Huang, Yueyi Zhang, Zhiwei Xiong

Finally, we provide quantitative and qualitative benchmark comparisons on the test set to validate the superiority of the proposed method, as well as preliminary statistics of the reconstructed somas in the full adult fly brain from the biological perspective.

Depth Estimation From Indoor Panoramas With Neural Scene Representation

1 code implementation CVPR 2023 Wenjie Chang, Yueyi Zhang, Zhiwei Xiong

Depth estimation from indoor panoramas is challenging due to the equirectangular distortions of panoramas and inaccurate matching.

Depth Estimation Position

Progressive Spatio-Temporal Alignment for Efficient Event-Based Motion Estimation

1 code implementation CVPR 2023 Xueyan Huang, Yueyi Zhang, Zhiwei Xiong

In addition, a dynamic batch size strategy is applied to adaptively adjust the batch size so that all events in the batch are consistent with the current motion model.

Event-based Motion Estimation Motion Estimation

Towards Real World HDRTV Reconstruction: A Data Synthesis-based Approach

no code implementations 6 Nov 2022 Zhen Cheng, Tao Wang, Yong Li, Fenglong Song, Chang Chen, Zhiwei Xiong

To solve this problem, we propose a learning-based data synthesis approach to learn the properties of real-world SDRTVs by integrating several tone mapping priors into both network structures and loss functions.

Tone Mapping

TridentSE: Guiding Speech Enhancement with 32 Global Tokens

no code implementations 24 Oct 2022 Dacheng Yin, Zhiyuan Zhao, Chuanxin Tang, Zhiwei Xiong, Chong Luo

In this paper, we present TridentSE, a novel architecture for speech enhancement, which is capable of efficiently capturing both global information and local details.

Speech Enhancement

Model-Guided Multi-Contrast Deep Unfolding Network for MRI Super-resolution Reconstruction

1 code implementation 15 Sep 2022 Gang Yang, Li Zhang, Man Zhou, Aiping Liu, Xun Chen, Zhiwei Xiong, Feng Wu

Interpretable neural network models are of significant interest since they enhance the trustworthiness required in clinical practice when dealing with medical images.

Super-Resolution

RetrieverTTS: Modeling Decomposed Factors for Text-Based Speech Insertion

no code implementations 28 Jun 2022 Dacheng Yin, Chuanxin Tang, Yanqing Liu, Xiaoqiang Wang, Zhiyuan Zhao, Yucheng Zhao, Zhiwei Xiong, Sheng Zhao, Chong Luo

In the proposed paradigm, global and local factors in speech are explicitly decomposed and separately manipulated to achieve high speaker similarity and continuous prosody.

Sentence

Degradation-agnostic Correspondence from Resolution-asymmetric Stereo

no code implementations CVPR 2022 Xihao Chen, Zhiwei Xiong, Zhen Cheng, Jiayong Peng, Yueyi Zhang, Zheng-Jun Zha

Interestingly, we find that, although a stereo matching network trained with the photometric loss is not optimal, its feature extractor can produce degradation-agnostic and matching-specific features.

Stereo Matching

aiWave: Volumetric Image Compression with 3-D Trained Affine Wavelet-like Transform

no code implementations 11 Mar 2022 Dongmei Xue, Haichuan Ma, Li Li, Dong Liu, Zhiwei Xiong

Volumetric image compression has become an urgent task to effectively transmit and store images produced in biological research and clinical practice.

Image Compression

Retriever: Learning Content-Style Representation as a Token-Level Bipartite Graph

2 code implementations ICLR 2022 Dacheng Yin, Xuanchi Ren, Chong Luo, Yuwang Wang, Zhiwei Xiong, Wenjun Zeng

Last, an innovative link attention module serves as the decoder to reconstruct data from the decomposed content and style, with the help of the linking keys.

Decoder Quantization +2

Retinal Vessel Segmentation with Pixel-wise Adaptive Filters

1 code implementation 3 Feb 2022 Mingxing Li, Shenglong Zhou, Chang Chen, Yueyi Zhang, Dong Liu, Zhiwei Xiong

Accurate retinal vessel segmentation is challenging because of the complex texture of retinal vessels and low imaging contrast.

Retinal Vessel Segmentation Segmentation

Contextual Outpainting With Object-Level Contrastive Learning

no code implementations CVPR 2022 Jiacheng Li, Chang Chen, Zhiwei Xiong

To model the contextual correlation between foreground and background contents, we incorporate an object-level contrastive loss to regularize the learning of cross-modal representations of foreground contents and the corresponding background semantic layout, facilitating accurate semantic reasoning.

Contrastive Learning Image Outpainting +1

Exploiting Rigidity Constraints for LiDAR Scene Flow Estimation

no code implementations CVPR 2022 Guanting Dong, Yueyi Zhang, HanLin Li, Xiaoyan Sun, Zhiwei Xiong

Previous LiDAR scene flow estimation methods, especially recurrent neural networks, usually suffer from structure distortion in challenging cases, such as sparse reflection and motion occlusions.

Autonomous Driving Scene Flow Estimation

Exposure Normalization and Compensation for Multiple-Exposure Correction

no code implementations CVPR 2022 Jie Huang, Yajing Liu, Xueyang Fu, Man Zhou, Yang Wang, Feng Zhao, Zhiwei Xiong

However, the procedures of correcting underexposure and overexposure to normal exposures are much different from each other, leading to large discrepancies for the network in correcting multiple exposures, thus resulting in poor performance.

Exposure Correction Image Enhancement

Continuous Spectral Reconstruction from RGB Images via Implicit Neural Representation

no code implementations 24 Dec 2021 Ruikang Xu, Mingde Yao, Chang Chen, Lizhi Wang, Zhiwei Xiong

In this paper, we propose Neural Spectral Reconstruction (NeSR) to lift this limitation, by introducing a novel continuous spectral representation.

Spectral Reconstruction

Attribute Artifacts Removal for Geometry-based Point Cloud Compression

no code implementations 1 Dec 2021 Xihua Sheng, Li Li, Dong Liu, Zhiwei Xiong

In this paper, we propose a Multi-Scale Graph Attention Network (MS-GAT) to remove the artifacts of point cloud attributes compressed by G-PCC.

Attribute Graph Attention +2

Contrastive Learning for Mitochondria Segmentation

no code implementations 25 Sep 2021 Zhili Li, Xuejin Chen, Jie Zhao, Zhiwei Xiong

However, due to the image degradation during the imaging process, the large variety of mitochondrial structures, as well as the presence of noise, artifacts and other sub-cellular structures, mitochondria segmentation is very challenging.

Contrastive Learning Segmentation

iWave3D: End-to-end Brain Image Compression with Trainable 3-D Wavelet Transform

no code implementations 18 Sep 2021 Dongmei Xue, Haichuan Ma, Li Li, Dong Liu, Zhiwei Xiong

With the rapid development of whole brain imaging technology, a large number of brain images have been produced, which puts forward a great demand for efficient brain image compression methods.

Image Compression

Towards Non-Line-of-Sight Photography

no code implementations 16 Sep 2021 Jiayong Peng, Fangzhou Mu, Ji Hyun Nam, Siddeshwar Raghavan, Yin Li, Andreas Velten, Zhiwei Xiong

Non-line-of-sight (NLOS) imaging is based on capturing the multi-bounce indirect reflections from the hidden objects.

3D geometry

Unfolding Taylor's Approximations for Image Restoration

no code implementations NeurIPS 2021 Man Zhou, Zeyu Xiao, Xueyang Fu, Aiping Liu, Gang Yang, Zhiwei Xiong

Deep learning provides a new avenue for image restoration, which demands a delicate balance between fine-grained details and high-level contextualized information during recovering the latent clear image.

Image Restoration

Light Field Super-Resolution With Zero-Shot Learning

no code implementations CVPR 2021 Zhen Cheng, Zhiwei Xiong, Chang Chen, Dong Liu, Zheng-Jun Zha

To fill this gap, we propose a zero-shot learning framework for light field SR, which learns a mapping to super-resolve the reference view with examples extracted solely from the input low-resolution light field itself.

Super-Resolution Zero-Shot Learning

Space-Time Distillation for Video Super-Resolution

no code implementations CVPR 2021 Zeyu Xiao, Xueyang Fu, Jie Huang, Zhen Cheng, Zhiwei Xiong

In this paper, we aim to improve the performance of compact VSR networks without changing their original architectures, through a knowledge distillation approach that transfers knowledge from a complicated VSR network to a compact one.

Knowledge Distillation Video Super-Resolution

Unsupervised Visual Representation Learning by Tracking Patches in Video

1 code implementation CVPR 2021 Guangting Wang, Yizhou Zhou, Chong Luo, Wenxuan Xie, Wenjun Zeng, Zhiwei Xiong

The proxy task is to estimate the position and size of the image patch in a sequence of video frames, given only the target bounding box in the first frame.

Action Classification Action Recognition +1

Advanced Deep Networks for 3D Mitochondria Instance Segmentation

1 code implementation 16 Apr 2021 Mingxing Li, Chang Chen, Xiaoyu Liu, Wei Huang, Yueyi Zhang, Zhiwei Xiong

Mitochondria instance segmentation from electron microscopy (EM) images has seen notable progress since the introduction of deep learning methods.

3D Instance Segmentation Denoising +2

Phoneme-based Distribution Regularization for Speech Enhancement

no code implementations 8 Apr 2021 Yajing Liu, Xiulian Peng, Zhiwei Xiong, Yan Lu

Specifically, we propose a phoneme-based distribution regularization (PbDr) for speech enhancement, which incorporates frame-wise phoneme information into speech enhancement network in a conditional manner.

Speech Enhancement

Synergy Between Semantic Segmentation and Image Denoising via Alternate Boosting

no code implementations 24 Feb 2021 Shunxin Xu, Ke Sun, Dong Liu, Zhiwei Xiong, Zheng-Jun Zha

We observe that not only denoising helps combat the drop of segmentation accuracy due to noise, but also pixel-wise semantic information boosts the capability of denoising.

Image Denoising Segmentation +1

Event-Based Video Reconstruction Using Transformer

1 code implementation ICCV 2021 Wenming Weng, Yueyi Zhang, Zhiwei Xiong

Event cameras, which output events by detecting spatio-temporal brightness changes, bring a novel paradigm to image sensors with high dynamic range and low latency.

Event-based Object Segmentation Event-Based Video Reconstruction +1

Camera Trace Erasing

1 code implementation CVPR 2020 Chang Chen, Zhiwei Xiong, Xiaoming Liu, Feng Wu

To reconcile these two demands, we propose Siamese Trace Erasing (SiamTE), in which a novel hybrid loss is designed on the basis of Siamese architecture for network training.

Is There Tradeoff between Spatial and Temporal in Video Super-Resolution?

no code implementations 13 Mar 2020 Haochen Zhang, Dong Liu, Zhiwei Xiong

Recent advances of deep learning lead to great success of image and video super-resolution (SR) methods that are based on convolutional neural networks (CNN).

Video Super-Resolution

PHASEN: A Phase-and-Harmonics-Aware Speech Enhancement Network

3 code implementations Applications of Artificial Intelligence Conference 2019 Dacheng Yin, Chong Luo, Zhiwei Xiong, Wen-Jun Zeng

We discover that the two streams should communicate with each other, and this is crucial to phase prediction.

Sound Audio and Speech Processing

On The Classification-Distortion-Perception Tradeoff

no code implementations NeurIPS 2019 Dong Liu, Haochen Zhang, Zhiwei Xiong

In this paper, we extend the previous perception-distortion tradeoff to the case of classification-distortion-perception (CDP) tradeoff, where we introduced the classification error rate of the restored signal in addition to distortion and perceptual difference.

Classification General Classification

Camera Lens Super-Resolution

1 code implementation CVPR 2019 Chang Chen, Zhiwei Xiong, Xinmei Tian, Zheng-Jun Zha, Feng Wu

Existing methods for single image super-resolution (SR) are typically evaluated with synthetic degradation models such as bicubic or Gaussian downsampling.

Image Super-Resolution

Two-Stream Action Recognition-Oriented Video Super-Resolution

1 code implementation ICCV 2019 Haochen Zhang, Dong Liu, Zhiwei Xiong

Tailored for two-stream action recognition networks, we propose two video SR methods for the spatial and temporal streams respectively.

Action Recognition Optical Flow Estimation +3

CA3Net: Contextual-Attentional Attribute-Appearance Network for Person Re-Identification

no code implementations 19 Nov 2018 Jiawei Liu, Zheng-Jun Zha, Hongtao Xie, Zhiwei Xiong, Yongdong Zhang

An appearance network is developed to learn appearance features from the full body, horizontal and vertical body parts of pedestrians with spatial dependencies among body parts.

Attribute Multi-Task Learning +1

Snapshot Hyperspectral Light Field Imaging

no code implementations CVPR 2017 Zhiwei Xiong, Lizhi Wang, Huiqun Li, Dong Liu, Feng Wu

This paper presents the first snapshot hyperspectral light field imager in practice.

MARLow: A Joint Multiplanar Autoregressive and Low-Rank Approach for Image Completion

no code implementations 3 May 2016 Mading Li, Jiaying Liu, Zhiwei Xiong, Xiaoyan Sun, Zongming Guo

In this paper, we propose a novel multiplanar autoregressive (AR) model to exploit the correlation in cross-dimensional planes of a similar patch group collected in an image, which has long been neglected by previous AR models.

High-Speed Hyperspectral Video Acquisition With a Dual-Camera Architecture

no code implementations CVPR 2015 Lizhi Wang, Zhiwei Xiong, Dahua Gao, Guangming Shi, Wen-Jun Zeng, Feng Wu

We propose a novel dual-camera design to acquire 4D high-speed hyperspectral (HSHS) videos with high spatial and spectral resolution.

Vocal Bursts Intensity Prediction

Depth Acquisition from Density Modulated Binary Patterns

no code implementations CVPR 2013 Zhe Yang, Zhiwei Xiong, Yueyi Zhang, Jiao Wang, Feng Wu

First, we propose an algorithm to design the patterns to carry more phase information without compromising the depth reconstruction from a single captured image as with Kinect.
