Search Results for author: Yanning Zhang

Found 127 papers, 37 papers with code

C3L: Content Correlated Vision-Language Instruction Tuning Data Generation via Contrastive Learning

no code implementations21 May 2024 Ji Ma, Wei Suo, Peng Wang, Yanning Zhang

Vision-Language Instruction Tuning (VLIT) is a critical training phase for Large Vision-Language Models (LVLMs).

Dual-Modal Prompting for Sketch-Based Image Retrieval

no code implementations29 Apr 2024 Liying Gao, Bingliang Jiao, Peng Wang, Shizhou Zhang, Hanwang Zhang, Yanning Zhang

In this study, we aim to tackle two major challenges of this task simultaneously: i) zero-shot, dealing with unseen categories, and ii) fine-grained, referring to intra-category instance-level retrieval.

Retrieval Sketch-Based Image Retrieval

NTIRE 2024 Challenge on Low Light Image Enhancement: Methods and Results

3 code implementations22 Apr 2024 Xiaoning Liu, Zongwei Wu, Ao Li, Florin-Alexandru Vasluianu, Yulun Zhang, Shuhang Gu, Le Zhang, Ce Zhu, Radu Timofte, Zhi Jin, Hongjun Wu, Chenxi Wang, Haitao Ling, Yuanhao Cai, Hao Bian, Yuxin Zheng, Jing Lin, Alan Yuille, Ben Shao, Jin Guo, Tianli Liu, Mohao Wu, Yixu Feng, Shuo Hou, Haotian Lin, Yu Zhu, Peng Wu, Wei Dong, Jinqiu Sun, Yanning Zhang, Qingsen Yan, Wenbin Zou, Weipeng Yang, Yunxiang Li, Qiaomu Wei, Tian Ye, Sixiang Chen, Zhao Zhang, Suiyi Zhao, Bo wang, Yan Luo, Zhichao Zuo, Mingshen Wang, Junhu Wang, Yanyan Wei, Xiaopeng Sun, Yu Gao, Jiancheng Huang, Hongming Chen, Xiang Chen, Hui Tang, Yuanbin Chen, Yuanbo Zhou, Xinwei Dai, Xintao Qiu, Wei Deng, Qinquan Gao, Tong Tong, Mingjia Li, Jin Hu, Xinyu He, Xiaojie Guo, sabarinathan, K Uma, A Sasithradevi, B Sathya Bama, S. Mohamed Mansoor Roomi, V. Srivatsav, Jinjuan Wang, Long Sun, Qiuying Chen, Jiahong Shao, Yizhi Zhang, Marcos V. Conde, Daniel Feijoo, Juan C. Benito, Alvaro García, Jaeho Lee, Seongwan Kim, Sharif S M A, Nodirkhuja Khujaev, Roman Tsoy, Ali Murtaza, Uswah Khairuddin, Ahmad 'Athif Mohd Faudzi, Sampada Malagi, Amogh Joshi, Nikhil Akalwadi, Chaitra Desai, Ramesh Ashok Tabib, Uma Mudenagudi, Wenyi Lian, Wenjing Lian, Jagadeesh Kalyanshetti, Vijayalaxmi Ashok Aralikatti, Palani Yashaswini, Nitish Upasi, Dikshit Hegde, Ujwala Patil, Sujata C, Xingzhuo Yan, Wei Hao, Minghan Fu, Pooja Choksy, Anjali Sarvaiya, Kishor Upla, Kiran Raja, Hailong Yan, Yunkai Zhang, Baiang Li, Jingyi Zhang, Huan Zheng

This paper reviews the NTIRE 2024 low light image enhancement challenge, highlighting the proposed solutions and results.

4k Low-Light Image Enhancement +1

CRNet: A Detail-Preserving Network for Unified Image Restoration and Enhancement Task

1 code implementation22 Apr 2024 Kangzhen Yang, Tao Hu, Kexin Dai, Genggeng Chen, Yu Cao, Wei Dong, Peng Wu, Yanning Zhang, Qingsen Yan

In real-world scenarios, images captured often suffer from blurring, noise, and other forms of image degradation, and due to sensor limitations, people usually can only obtain low dynamic range images.

Deblurring Denoising +2

GoMVS: Geometrically Consistent Cost Aggregation for Multi-View Stereo

1 code implementation11 Apr 2024 Jiang Wu, Rui Li, Haofei Xu, Wenxun Zhao, Yu Zhu, Jinqiu Sun, Yanning Zhang

More specifically, we correspond and propagate adjacent costs to the reference pixel by leveraging the local geometric smoothness in conjunction with surface normals.

Generating Content for HDR Deghosting from Frequency View

no code implementations1 Apr 2024 Tao Hu, Qingsen Yan, Yuankai Qi, Yanning Zhang

To address this challenge, we propose the Low-Frequency aware Diffusion (LF-Diff) model for ghost-free HDR imaging.

HDR Reconstruction regression

A self-supervised CNN for image watermark removal

1 code implementation9 Mar 2024 Chunwei Tian, Menghua Zheng, Tiancai Jiao, WangMeng Zuo, Yanning Zhang, Chia-Wen Lin

Popular convolutional neural networks mainly use paired images in a supervised way for image watermark removal.

Perceptive self-supervised learning network for noisy image watermark removal

1 code implementation4 Mar 2024 Chunwei Tian, Menghua Zheng, Bo Li, Yanning Zhang, Shichao Zhang, David Zhang

Specifically, mentioned paired watermark images are obtained in a self supervised way, and paired noisy images (i. e., noisy and reference images) are obtained in a supervised way.

Self-Supervised Learning

Semi-Supervised Semantic Segmentation Based on Pseudo-Labels: A Survey

no code implementations4 Mar 2024 Lingyan Ran, YaLi Li, Guoqiang Liang, Yanning Zhang

Semantic segmentation is an important and popular research area in computer vision that focuses on classifying pixels in an image based on their semantics.

Image Segmentation Pseudo Label +2

A Heterogeneous Dynamic Convolutional Neural Network for Image Super-resolution

1 code implementation24 Feb 2024 Chunwei Tian, Xuanyu Zhang, Jia Ren, WangMeng Zuo, Yanning Zhang, Chia-Wen Lin

The lower network utilizes a symmetric architecture to enhance relations of different layers to mine more structural information, which is complementary with a upper network for image super-resolution.

Image Super-Resolution

You Only Need One Color Space: An Efficient Network for Low-light Image Enhancement

1 code implementation8 Feb 2024 Yixu Feng, Cheng Zhang, Pei Wang, Peng Wu, Qingsen Yan, Yanning Zhang

Further, we design a novel Color and Intensity Decoupling Network (CIDNet) with two branches dedicated to processing the decoupled image brightness and color in the HVI space.

Low-light Image Deblurring and Enhancement Low-Light Image Enhancement

Instance by Instance: An Iterative Framework for Multi-instance 3D Registration

no code implementations6 Feb 2024 Xinyue Cao, Xiyu Zhang, Yuxin Cheng, Zhaoshuai Qi, Yanning Zhang, Jiaqi Yang

Multi-instance registration is a challenging problem in computer vision and robotics, where multiple instances of an object need to be registered in a standard coordinate system.

Boosting Multi-view Stereo with Late Cost Aggregation

1 code implementation22 Jan 2024 Jiang Wu, Rui Li, Yu Zhu, Wenxun Zhao, Jinqiu Sun, Yanning Zhang

To address this challenge, we present a late aggregation approach that allows for aggregating pairwise costs throughout the network feed-forward process, achieving accurate estimations with only minor changes of the plain CasMVSNet.

Blocking Geometric Matching

CrossDiff: Exploring Self-Supervised Representation of Pansharpening via Cross-Predictive Diffusion Model

no code implementations10 Jan 2024 Yinghui Xing, Litao Qu, Shizhou Zhang, Kai Zhang, Yanning Zhang

Fusion of a panchromatic (PAN) image and corresponding multispectral (MS) image is also known as pansharpening, which aims to combine abundant spatial details of PAN and spectral information of MS. Due to the absence of high-resolution MS images, available deep-learning-based methods usually follow the paradigm of training at reduced resolution and testing at both reduced and full resolution.


DifAugGAN: A Practical Diffusion-style Data Augmentation for GAN-based Single Image Super-resolution

no code implementations30 Nov 2023 Axi Niu, Kang Zhang, Joshua Tian Jin Tee, Trung X. Pham, Jinqiu Sun, Chang D. Yoo, In So Kweon, Yanning Zhang

It is well known the adversarial optimization of GAN-based image super-resolution (SR) methods makes the preceding SR model generate unpleasant and undesirable artifacts, leading to large distortion.

Attribute Data Augmentation +1

Open-Vocabulary Video Anomaly Detection

no code implementations13 Nov 2023 Peng Wu, Xuerong Zhou, Guansong Pang, Yujia Sun, Jing Liu, Peng Wang, Yanning Zhang

Particularly, we devise a semantic knowledge injection module to introduce semantic knowledge from large language models for the detection task, and design a novel anomaly synthesis module to generate pseudo unseen anomaly videos with the help of large vision generation models for the classification task.

Anomaly Detection Video Anomaly Detection

Multiple Object Tracking based on Occlusion-Aware Embedding Consistency Learning

no code implementations5 Nov 2023 Yaoqi Hu, Axi Niu, Yu Zhu, Qingsen Yan, Jinqiu Sun, Yanning Zhang

The OPM predicts occlusion information for each true detection, facilitating the selection of valid samples for consistency learning of the track's visual embedding.

Multiple Object Tracking Object +1

Towards High-quality HDR Deghosting with Conditional Diffusion Models

no code implementations2 Nov 2023 Qingsen Yan, Tao Hu, Yuan Sun, Hao Tang, Yu Zhu, Wei Dong, Luc van Gool, Yanning Zhang

To address this challenge, we formulate the HDR deghosting problem as an image generation that leverages LDR features as the diffusion model's condition, consisting of the feature condition generator and the noise predictor.

Denoising Image Generation

Adapt Anything: Tailor Any Image Classifiers across Domains And Categories Using Text-to-Image Diffusion Models

no code implementations25 Oct 2023 WeiJie Chen, Haoyu Wang, Shicai Yang, Lei Zhang, Wei Wei, Yanning Zhang, Luojun Lin, Di Xie, Yueting Zhuang

Such a one-for-all adaptation paradigm allows us to adapt anything in the world using only one text-to-image generator as well as the corresponding unlabeled target data.

Domain Adaptation Image Classification

A cross Transformer for image denoising

1 code implementation16 Oct 2023 Chunwei Tian, Menghua Zheng, WangMeng Zuo, Shichao Zhang, Yanning Zhang, Chia-Wen Ling

To avoid loss of key information, PB uses three heterogeneous networks to implement multiple interactions of multi-level features to broadly search for extra information for improving the adaptability of an obtained denoiser for complex scenes.

Image Denoising

Human-centric Behavior Description in Videos: New Benchmark and Model

no code implementations4 Oct 2023 Lingru Zhou, Yiqi Gao, Manqing Zhang, Peng Wu, Peng Wang, Yanning Zhang

To address this challenge, we construct a human-centric video surveillance captioning dataset, which provides detailed descriptions of the dynamic behaviors of 7, 820 individuals.

Video Captioning

Ground-to-Aerial Person Search: Benchmark Dataset and Approach

1 code implementation24 Aug 2023 Shizhou Zhang, Qingchun Yang, De Cheng, Yinghui Xing, Guoqiang Liang, Peng Wang, Yanning Zhang

In this work, we construct a large-scale dataset for Ground-to-Aerial Person Search, named G2APS, which contains 31, 770 images of 260, 559 annotated bounding boxes for 2, 644 identities appearing in both of the UAVs and ground surveillance cameras.

Knowledge Distillation Person Search

VadCLIP: Adapting Vision-Language Models for Weakly Supervised Video Anomaly Detection

1 code implementation22 Aug 2023 Peng Wu, Xuerong Zhou, Guansong Pang, Lingru Zhou, Qingsen Yan, Peng Wang, Yanning Zhang

With the benefit of dual branch, VadCLIP achieves both coarse-grained and fine-grained video anomaly detection by transferring pre-trained knowledge from CLIP to WSVAD task.

Anomaly Detection Binary Classification +1

Learning multi-domain feature relation for visible and Long-wave Infrared image patch matching

no code implementations9 Aug 2023 Xiuwei Zhang, Yanping Li, Zhaoshuai Qi, Yi Sun, Yanning Zhang

Recently, learning-based algorithms have achieved promising performance on cross-spectral image patch matching, which, however, is still far from satisfactory for practical application.

Patch Matching Relation

Induction Network: Audio-Visual Modality Gap-Bridging for Self-Supervised Sound Source Localization

1 code implementation9 Aug 2023 Tianyu Liu, Peng Zhang, Wei Huang, Yufei zha, Tao You, Yanning Zhang

By decoupling the gradients of visual and audio modalities, the discriminative visual representations of sound sources can be learned with the designed Induction Vector in a bootstrap manner, which also enables the audio modality to be aligned with the visual modality consistently.

Contrastive Learning

All-in-one Multi-degradation Image Restoration Network via Hierarchical Degradation Representation

no code implementations6 Aug 2023 Cheng Zhang, Yu Zhu, Qingsen Yan, Jinqiu Sun, Yanning Zhang

To address this issue, we propose a novel All-in-one Multi-degradation Image Restoration Network (AMIRNet) that can effectively capture and utilize accurate degradation representation for image restoration.

Contrastive Learning Deblurring +3

Towards Video Anomaly Retrieval from Video Anomaly Detection: New Benchmarks and Model

1 code implementation24 Jul 2023 Peng Wu, Jing Liu, Xiangteng He, Yuxin Peng, Peng Wang, Yanning Zhang

In this context, we propose a novel task called Video Anomaly Retrieval (VAR), which aims to pragmatically retrieve relevant anomalous videos by cross-modalities, e. g., language descriptions and synchronous audios.

Anomaly Detection Retrieval +2

Pre-train, Adapt and Detect: Multi-Task Adapter Tuning for Camouflaged Object Detection

no code implementations20 Jul 2023 Yinghui Xing, Dexuan Kong, Shizhou Zhang, Geng Chen, Lingyan Ran, Peng Wang, Yanning Zhang

Camouflaged object detection (COD), aiming to segment camouflaged objects which exhibit similar patterns with the background, is a challenging task.

Multi-Task Learning object-detection +1

VS-TransGRU: A Novel Transformer-GRU-based Framework Enhanced by Visual-Semantic Fusion for Egocentric Action Anticipation

no code implementations8 Jul 2023 Congqi Cao, Ze Sun, Qinyi Lv, Lingtong Min, Yanning Zhang

Egocentric action anticipation is a challenging task that aims to make advanced predictions of future actions from current and historical observations in the first-person view.

Action Anticipation Decoder

ACDMSR: Accelerated Conditional Diffusion Models for Single Image Super-Resolution

no code implementations3 Jul 2023 Axi Niu, Pham Xuan Trung, Kang Zhang, Jinqiu Sun, Yu Zhu, In So Kweon, Yanning Zhang

To speed up inference and further enhance the performance, our research revisits diffusion models in image super-resolution and proposes a straightforward yet significant diffusion model-based super-resolution method called ACDMSR (accelerated conditional diffusion model for image super-resolution).

Denoising Image Super-Resolution +1

Learning from Multi-Perception Features for Real-Word Image Super-resolution

no code implementations26 May 2023 Axi Niu, Kang Zhang, Trung X. Pham, Pei Wang, Jinqiu Sun, In So Kweon, Yanning Zhang

Currently, there are two popular approaches for addressing real-world image super-resolution problems: degradation-estimation-based and blind-based methods.

Image Super-Resolution

A New Comprehensive Benchmark for Semi-supervised Video Anomaly Detection and Anticipation

no code implementations CVPR 2023 Congqi Cao, Yue Lu, Peng Wang, Yanning Zhang

At present, it is the largest semi-supervised VAD dataset with the largest number of scenes and classes of anomalies, the longest duration, and the only one considering the scene-dependent anomaly.

Anomaly Detection Video Anomaly Detection

3D Registration with Maximal Cliques

1 code implementation CVPR 2023 Xiyu Zhang, Jiaqi Yang, Shikun Zhang, Yanning Zhang

The key insight is to loosen the previous maximum clique constraint, and mine more local consensus information in a graph for accurate pose hypotheses generation: 1) A compatibility graph is constructed to render the affinity relationship between initial correspondences.

Point Cloud Registration

Learning to Fuse Monocular and Multi-view Cues for Multi-frame Depth Estimation in Dynamic Scenes

1 code implementation CVPR 2023 Rui Li, Dong Gong, Wei Yin, Hao Chen, Yu Zhu, Kaixuan Wang, Xiaozhi Chen, Jinqiu Sun, Yanning Zhang

To let the geometric perception learned from multi-view cues in static areas propagate to the monocular representation in dynamic areas and let monocular cues enhance the representation of multi-view cost volume, we propose a cross-cue fusion (CCF) module, which includes the cross-cue attention (CCA) to encode the spatially non-local relative intra-relations from each source to enhance the representation of the other.

Autonomous Driving Depth Estimation

A Unified HDR Imaging Method with Pixel and Patch Level

no code implementations CVPR 2023 Qingsen Yan, Weiye Chen, Song Zhang, Yu Zhu, Jinqiu Sun, Yanning Zhang

The proposed HyHDRNet consists of a content alignment subnetwork and a Transformer-based fusion subnetwork.

MixCycle: Mixup Assisted Semi-Supervised 3D Single Object Tracking with Cycle Consistency

1 code implementation ICCV 2023 Qiao Wu, Jiaqi Yang, Kun Sun, Chu'ai Zhang, Yanning Zhang, Mathieu Salzmann

Specifically, we introduce two cycle-consistency strategies for supervision: 1) Self tracking cycles, which leverage labels to help the model converge better in the early stages of training; 2) forward-backward cycles, which strengthen the tracker's robustness to motion variations and the template noise caused by the template update strategy.

3D Single Object Tracking Data Augmentation +1

Co-Occurrence Matters: Learning Action Relation for Temporal Action Localization

no code implementations15 Mar 2023 Congqi Cao, Yizhe WANG, Yue Lu, Xin Zhang, Yanning Zhang

Existing works in this field mainly suffer from two weaknesses: (1) They often neglect the multi-label case and only focus on temporal modeling.

Relation Temporal Action Localization

PSNet: a deep learning model based digital phase shifting algorithm from a single fringe image

no code implementations14 Mar 2023 Zhaoshuai Qi, Xiaojun Liu, Xiaolin Liu, Jiaqi Yang, Yanning Zhang

As the gold standard for phase retrieval, phase-shifting algorithm (PS) has been widely used in optical interferometry, fringe projection profilometry, etc.


GRAN: Ghost Residual Attention Network for Single Image Super Resolution

no code implementations28 Feb 2023 Axi Niu, Pei Wang, Yu Zhu, Jinqiu Sun, Qingsen Yan, Yanning Zhang

GRAB consists of the Ghost Module and Channel and Spatial Attention Module (CSAM) to alleviate the generation of redundant features.

Image Super-Resolution

New Insights on Relieving Task-Recency Bias for Online Class Incremental Learning

1 code implementation16 Feb 2023 Guoqiang Liang, Zhaojie Chen, Zhaoqiang Chen, Shiyu Ji, Yanning Zhang

In all settings, the online class incremental learning (OCIL), where incoming samples from data stream can be used only once, is more challenging and can be encountered more frequently in real world.

Class Incremental Learning Incremental Learning +1

Take a Prior from Other Tasks for Severe Blur Removal

no code implementations14 Feb 2023 Pei Wang, Danna Xue, Yu Zhu, Jinqiu Sun, Qingsen Yan, Sung-Eui Yoon, Yanning Zhang

For general scene deblurring, the feature space of the blurry image and corresponding sharp image under the high-level vision task is closer, which inspires us to rely on other tasks (e. g. classification) to learn a comprehensive prior in severe blur removal cases.

Deblurring Image Deblurring +1

MS-DETR: Multispectral Pedestrian Detection Transformer with Loosely Coupled Fusion and Modality-Balanced Optimization

1 code implementation1 Feb 2023 Yinghui Xing, Song Wang, Shizhou Zhang, Guoqiang Liang, Xiuwei Zhang, Yanning Zhang

Most of the available multispectral pedestrian detectors are based on non-end-to-end detectors, while in this paper, we propose MultiSpectral pedestrian DEtection TRansformer (MS-DETR), an end-to-end multispectral pedestrian detector, which extends DETR into the field of multi-modal detection.

Decoder Pedestrian Detection

Revisiting Prototypical Network for Cross Domain Few-Shot Learning

1 code implementation CVPR 2023 Fei Zhou, Peng Wang, Lei Zhang, Wei Wei, Yanning Zhang

Prototypical Network is a popular few-shot solver that aims at establishing a feature metric generalizable to novel few-shot classification (FSC) tasks using deep neural networks.

cross-domain few-shot learning Knowledge Distillation

Weakly Supervised Video Anomaly Detection Based on Cross-Batch Clustering Guidance

no code implementations16 Dec 2022 Congqi Cao, Xin Zhang, Shizhou Zhang, Peng Wang, Yanning Zhang

To enhance the discriminative power of features, we propose a batch clustering based loss to encourage a clustering branch to generate distinct normal and abnormal clusters based on a batch of data.

Anomaly Detection Clustering +1

Generalizable Person Re-Identification via Viewpoint Alignment and Fusion

no code implementations5 Dec 2022 Bingliang Jiao, Lingqiao Liu, Liying Gao, Guosheng Lin, Ruiqi Wu, Shizhou Zhang, Peng Wang, Yanning Zhang

The key insight of this design is that the cross-attention mechanism in the transformer could be an ideal solution to align the discriminative texture clues from the original image with the canonical view image, which could compensate for the low-quality texture information of the canonical view image.

Domain Generalization Generalizable Person Re-identification +1

A heterogeneous group CNN for image super-resolution

1 code implementation26 Sep 2022 Chunwei Tian, Yanning Zhang, WangMeng Zuo, Chia-Wen Lin, David Zhang, Yixuan Yuan

To prevent loss of original information, a multi-level enhancement mechanism guides a CNN to achieve a symmetric architecture for promoting expressive ability of HGSRCNN.

Image Super-Resolution

Multi-stage image denoising with the wavelet transform

1 code implementation26 Sep 2022 Chunwei Tian, Menghua Zheng, WangMeng Zuo, Bob Zhang, Yanning Zhang, David Zhang

In this paper, we propose a multi-stage image denoising CNN with the wavelet transform (MWDCNN) via three stages, i. e., a dynamic convolutional block (DCB), two cascaded wavelet transform and enhancement blocks (WEBs) and a residual block (RB).

Image Denoising

Context Recovery and Knowledge Retrieval: A Novel Two-Stream Framework for Video Anomaly Detection

1 code implementation7 Sep 2022 Congqi Cao, Yue Lu, Yanning Zhang

For the context recovery stream, we propose a spatiotemporal U-Net which can fully utilize the motion information to predict the future frame.

Anomaly Detection Retrieval +1

Dual Modality Prompt Tuning for Vision-Language Pre-Trained Model

1 code implementation17 Aug 2022 Yinghui Xing, Qirui Wu, De Cheng, Shizhou Zhang, Guoqiang Liang, Peng Wang, Yanning Zhang

To make the final image feature concentrate more on the target visual concept, a Class-Aware Visual Prompt Tuning (CAVPT) scheme is further proposed in our DPT, where the class-aware visual prompt is generated dynamically by performing the cross attention between text prompts features and image patch token embeddings to encode both the downstream task-related information and visual instance information.

General Knowledge Language Modelling +1

PC-GANs: Progressive Compensation Generative Adversarial Networks for Pan-sharpening

no code implementations29 Jul 2022 Yinghui Xing, Shuyuan Yang, Song Wang, Yan Zhang, Yanning Zhang

Most of the available deep learning-based pan-sharpening methods sharpen the multispectral images through a one-step scheme, which strongly depends on the reconstruction ability of the network.

Generative Adversarial Network Pansharpening

Pansharpening via Frequency-Aware Fusion Network with Explicit Similarity Constraints

1 code implementation18 Jul 2022 Yinghui Xing, Yan Zhang, Houjun He, Xiuwei Zhang, Yanning Zhang

The process of fusing a high spatial resolution (HR) panchromatic (PAN) image and a low spatial resolution (LR) multispectral (MS) image to obtain an HRMS image is known as pansharpening.


SlimSeg: Slimmable Semantic Segmentation with Boundary Supervision

no code implementations13 Jul 2022 Danna Xue, Fei Yang, Pei Wang, Luis Herranz, Jinqiu Sun, Yu Zhu, Yanning Zhang

Accurate semantic segmentation models typically require significant computational resources, inhibiting their use in practical applications.

Knowledge Distillation Segmentation +1

Going the Extra Mile in Face Image Quality Assessment: A Novel Database and Model

no code implementations11 Jul 2022 Shaolin Su, Hanhe Lin, Vlad Hosu, Oliver Wiedemann, Jinqiu Sun, Yu Zhu, Hantao Liu, Yanning Zhang, Dietmar Saupe

An accurate computational model for image quality assessment (IQA) benefits many vision applications, such as image filtering, image processing, and image generation.

Face Image Quality Face Image Quality Assessment +4

NTIRE 2022 Challenge on High Dynamic Range Imaging: Methods and Results

no code implementations25 May 2022 Eduardo Pérez-Pellitero, Sibi Catley-Chandar, Richard Shaw, Aleš Leonardis, Radu Timofte, Zexin Zhang, Cen Liu, Yunbo Peng, Yue Lin, Gaocheng Yu, Jin Zhang, Zhe Ma, Hongbin Wang, Xiangyu Chen, Xintao Wang, Haiwei Wu, Lin Liu, Chao Dong, Jiantao Zhou, Qingsen Yan, Song Zhang, Weiye Chen, Yuhang Liu, Zhen Zhang, Yanning Zhang, Javen Qinfeng Shi, Dong Gong, Dan Zhu, Mengdi Sun, Guannan Chen, Yang Hu, Haowei Li, Baozhu Zou, Zhen Liu, Wenjie Lin, Ting Jiang, Chengzhi Jiang, Xinpeng Li, Mingyan Han, Haoqiang Fan, Jian Sun, Shuaicheng Liu, Juan Marín-Vega, Michael Sloth, Peter Schneider-Kamp, Richard Röttger, Chunyang Li, Long Bao, Gang He, Ziyao Xu, Li Xu, Gen Zhan, Ming Sun, Xing Wen, Junlin Li, Shuang Feng, Fei Lei, Rui Liu, Junxiang Ruan, Tianhong Dai, Wei Li, Zhan Lu, Hengyan Liu, Peian Huang, Guangyu Ren, Yonglin Luo, Chang Liu, Qiang Tu, Fangya Li, Ruipeng Gang, Chenghua Li, Jinjing Li, Sai Ma, Chenming Liu, Yizhen Cao, Steven Tel, Barthelemy Heyrman, Dominique Ginhac, Chul Lee, Gahyeon Kim, Seonghyun Park, An Gia Vien, Truong Thanh Nhat Mai, Howoon Yoon, Tu Vo, Alexander Holston, Sheir Zaheer, Chan Y. Park

The challenge is composed of two tracks with an emphasis on fidelity and complexity constraints: In Track 1, participants are asked to optimize objective fidelity scores while imposing a low-complexity constraint (i. e. solutions can not exceed a given number of operations).

Image Restoration Vocal Bursts Intensity Prediction

Exploring and Evaluating Image Restoration Potential in Dynamic Scenes

1 code implementation CVPR 2022 Cheng Zhang, Shaolin Su, Yu Zhu, Qingsen Yan, Jinqiu Sun, Yanning Zhang

In this paper, to better study an image's potential value that can be explored for restoration, we propose a novel concept, referring to image restoration potential (IRP).

Image Restoration

An Audio-Visual Attention Based Multimodal Network for Fake Talking Face Videos Detection

no code implementations10 Mar 2022 Ganglai Wang, Peng Zhang, Lei Xie, Wei Huang, Yufei zha, Yanning Zhang

DeepFake based digital facial forgery is threatening the public media security, especially when lip manipulation has been used in talking face generation, the difficulty of fake video detection is further improved.

Decision Making Face Detection +2

Audio-visual speech separation based on joint feature representation with cross-modal attention

no code implementations5 Mar 2022 Junwen Xiong, Peng Zhang, Lei Xie, Wei Huang, Yufei zha, Yanning Zhang

Multi-modal based speech separation has exhibited a specific advantage on isolating the target character in multi-talker noisy environments.

Optical Flow Estimation Speech Separation

Adaptive Graph Convolutional Networks for Weakly Supervised Anomaly Detection in Videos

no code implementations14 Feb 2022 Congqi Cao, Xin Zhang, Shizhou Zhang, Peng Wang, Yanning Zhang

For weakly supervised anomaly detection, most existing work is limited to the problem of inadequate video representation due to the inability of modeling long-term contextual information.

Graph Learning Supervised Anomaly Detection +1

Fast Adversarial Training with Noise Augmentation: A Unified Perspective on RandStart and GradAlign

no code implementations11 Feb 2022 Axi Niu, Kang Zhang, Chaoning Zhang, Chenshuang Zhang, In So Kweon, Chang D. Yoo, Yanning Zhang

The former works only for a relatively small perturbation 8/255 with the l_\infty constraint, and GradAlign improves it by extending the perturbation size to 16/255 (with the l_\infty constraint) but at the cost of being 3 to 4 times slower.

Data Augmentation

Multi-Domain Joint Training for Person Re-Identification

no code implementations6 Jan 2022 Lu Yang, Lingqiao Liu, Yunlong Wang, Peng Wang, Yanning Zhang

Our discovery is that training with such an adaptive model can better benefit from more training samples.

Person Re-Identification

Learnable Locality-Sensitive Hashing for Video Anomaly Detection

no code implementations15 Nov 2021 Yue Lu, Congqi Cao, Yanning Zhang

In this paper, we propose a novel distance-based VAD method to take advantage of all the available normal data efficiently and flexibly.

Abnormal Event Detection In Video Video Anomaly Detection

NAS-FCOS: Efficient Search for Object Detection Architectures

1 code implementation24 Oct 2021 Ning Wang, Yang Gao, Hao Chen, Peng Wang, Zhi Tian, Chunhua Shen, Yanning Zhang

Neural Architecture Search (NAS) has shown great potential in effectively reducing manual effort in network design by automatically discovering optimal architectures.

Neural Architecture Search Object +2

Text-based Person Search in Full Images via Semantic-Driven Proposal Generation

1 code implementation27 Sep 2021 Shizhou Zhang, De Cheng, Wenlong Luo, Yinghui Xing, Duo Long, Hao Li, Kai Niu, Guoqiang Liang, Yanning Zhang

Finding target persons in full scene images with a query of text description has important practical applications in intelligent video surveillance. However, different from the real-world scenarios where the bounding boxes are not available, existing text-based person retrieval methods mainly focus on the cross modal matching between the query text descriptions and the gallery of cropped pedestrian images.

Person Search Retrieval +3

Unsupervised Cross-Modal Distillation for Thermal Infrared Tracking

1 code implementation31 Jul 2021 Jingxian Sun, Lichao Zhang, Yufei zha, Abel Gonzalez-Garcia, Peng Zhang, Wei Huang, Yanning Zhang

To solve this problem, we propose to distill representations of the TIR modality from the RGB modality with Cross-Modal Distillation (CMD) on a large amount of unlabeled paired RGB-TIR data.

Transfer Learning

Unsupervised Video Summarization with a Convolutional Attentive Adversarial Network

no code implementations24 May 2021 Guoqiang Liang, Yanbing Lv, Shucheng Li, Shizhou Zhang, Yanning Zhang

Specifically, the generator employs a fully convolutional sequence network to extract global representation of a video, and an attention-based network to output normalized importance scores.

Generative Adversarial Network Unsupervised Video Summarization

Center Prediction Loss for Re-identification

no code implementations30 Apr 2021 Lu Yang, Yunlong Wang, Lingqiao Liu, Peng Wang, Lu Chi, Zehuan Yuan, Changhu Wang, Yanning Zhang

In this paper, we propose a new loss based on center predictivity, that is, a sample must be positioned in a location of the feature space such that from it we can roughly predict the location of the center of same-class samples.

Dynamic Image Restoration and Fusion Based on Dynamic Degradation

no code implementations26 Apr 2021 Aiqing Fang, Xinbo Zhao, Jiaqi Yang, Yanning Zhang

In addition, a dynamic degradation kernel is proposed to improve the robustness of image restoration and fusion.

Image Restoration

Efficient Spatialtemporal Context Modeling for Action Recognition

no code implementations20 Mar 2021 Congqi Cao, Yue Lu, Yifan Zhang, Dongmei Jiang, Yanning Zhang

Inspired from 2D criss-cross attention used in segmentation task, we propose a recurrent 3D criss-cross attention (RCCA-3D) module to model the dense long-range spatiotemporal contextual information in video for action recognition.

Action Recognition Relation

Pluggable Weakly-Supervised Cross-View Learning for Accurate Vehicle Re-Identification

no code implementations9 Mar 2021 Lu Yang, Hongbang Liu, Jinghao Zhou, Lingqiao Liu, Lei Zhang, Peng Wang, Yanning Zhang

Learning cross-view consistent feature representation is the key for accurate vehicle Re-identification (ReID), since the visual appearance of vehicles changes significantly under different viewpoints.

Vehicle Re-Identification

Learning Depth via Leveraging Semantics: Self-supervised Monocular Depth Estimation with Both Implicit and Explicit Semantic Guidance

no code implementations11 Feb 2021 Rui Li, Xiantuo He, Danna Xue, Shaolin Su, Qing Mao, Yu Zhu, Jinqiu Sun, Yanning Zhang

While the mappings between image and pixel-wise depth are well-studied in current methods, the correlation between image, depth and scene semantics, however, is less considered.

Monocular Depth Estimation

Non-uniform Motion Deblurring with Blurry Component Divided Guidance

no code implementations15 Jan 2021 Pei Wang, Wei Sun, Qingsen Yan, Axi Niu, Rui Li, Yu Zhu, Jinqiu Sun, Yanning Zhang

To tackle the above problems, we present a deep two-branch network to deal with blurry images via a component divided module, which divides an image into two components based on the representation of blurry degree.

Blind Image Deblurring Decoder +2

Semantic-Guided Representation Enhancement for Self-supervised Monocular Trained Depth Estimation

no code implementations15 Dec 2020 Rui Li, Qing Mao, Pei Wang, Xiantuo He, Yu Zhu, Jinqiu Sun, Yanning Zhang

Based on this framework, we enhance the local feature representation by sampling and feeding the point-based features that locate on the semantic edges to an individual Semantic-guided Edge Enhancement module (SEEM), which is specifically designed for promoting depth estimation on the challenging semantic borders.

Depth Estimation Semantic Segmentation

Meta-Generating Deep Attentive Metric for Few-shot Classification

no code implementations3 Dec 2020 Lei Zhang, Fei Zhou, Wei Wei, Yanning Zhang

To mitigate this problem, we present a novel deep metric meta-generation method that turns to an orthogonal direction, ie, learning to adaptively generate a specific metric for a new FSL task based on the task description (eg, a few labelled samples).

Classification Few-Shot Learning +1

Unsupervised Alternating Optimization for Blind Hyperspectral Imagery Super-resolution

no code implementations3 Dec 2020 Jiangtao Nie, Lei Zhang, Wei Wei, Zhiqiang Lang, Yanning Zhang

One of the main reason comes from the fact that the predefined degeneration models (e. g. blur in spatial domain) utilized by most HSI SR methods often exist great discrepancy with the real one, which results in these deep models overfit and ultimately degrade their performance on real data.

Meta-Learning Super-Resolution

On Efficient and Robust Metrics for RANSAC Hypotheses and 3D Rigid Registration

no code implementations10 Nov 2020 Jiaqi Yang, Zhiqiang Huang, Siwen Quan, Qian Zhang, Yanning Zhang, Zhiguo Cao

This paper focuses on developing efficient and robust evaluation metrics for RANSAC hypotheses to achieve accurate 3D rigid registration.

Few-shot Action Recognition with Implicit Temporal Alignment and Pair Similarity Optimization

no code implementations13 Oct 2020 Congqi Cao, Yajuan Li, Qinyi Lv, Peng Wang, Yanning Zhang

Few-shot learning aims to recognize instances from novel classes with few labeled samples, which has great value in research and application.

Few-Shot action recognition Few Shot Action Recognition +3

AE-Netv2: Optimization of Image Fusion Efficiency and Network Architecture

no code implementations5 Oct 2020 Aiqing Fang, Xinbo Zhao, Jiaqi Yang, Beibei Qin, Yanning Zhang

Finally, we explore the commonness and characteristics of different image fusion tasks, which provides a research basis for further research on the continuous learning characteristics of human brain in the field of image fusion.

AE-Net: Autonomous Evolution Image Fusion Method Inspired by Human Cognitive Mechanism

no code implementations17 Jul 2020 Aiqing Fang, Xinbo Zhao, Jiaqi Yang, Shihao Cao, Yanning Zhang

Firstly, the relationship between human brain cognitive mechanism and image fusion task is analyzed and a physical model is established to simulate human brain cognitive mechanism.

IllumiNet: Transferring Illumination from Planar Surfaces to Virtual Objects in Augmented Reality

no code implementations12 Jul 2020 Di Xu, Zhen Li, Yanning Zhang, Qi Cao

This paper presents an illumination estimation method for virtual objects in real environment by learning.

A Robust Attentional Framework for License Plate Recognition in the Wild

no code implementations6 Jun 2020 Linjiang Zhang, Peng Wang, Hui Li, Zhen Li, Chunhua Shen, Yanning Zhang

On the other hand, the 2D attentional based license plate recognizer with an Xception-based CNN encoder is capable of recognizing license plates with different patterns under various scenarios accurately and robustly.

Image Generation License Plate Recognition

Attention-based network for low-light image enhancement

no code implementations20 May 2020 Cheng Zhang, Qingsen Yan, Yu Zhu, Xianjun Li, Jinqiu Sun, Yanning Zhang

Extensive experiments demonstrate the superiority of the proposed network in terms of suppressing the chromatic aberration and noise artifacts in enhancement, especially when the low-light image has severe noise.

Denoising Low-Light Image Enhancement

Learning to Compare Relation: Semantic Alignment for Few-Shot Learning

no code implementations29 Feb 2020 Congqi Cao, Yanning Zhang

First, we introduce a semantic alignment loss to align the relation statistics of the features from samples that belong to the same category.

Few-Shot Learning Metric Learning +1

Learning to Zoom-in via Learning to Zoom-out: Real-world Super-resolution by Generating and Adapting Degradation

no code implementations8 Jan 2020 Dong Gong, Wei Sun, Qinfeng Shi, Anton Van Den Hengel, Yanning Zhang

Most learning-based super-resolution (SR) methods aim to recover high-resolution (HR) image from a given low-resolution (LR) image via learning on LR-HR image pairs.


Cross-Modal Image Fusion Theory Guided by Subjective Visual Attention

no code implementations23 Dec 2019 Aiqing Fang, Xinbo Zhao, Yanning Zhang

In order to improve the robustness and contextual awareness of image fusion tasks, we proposed a multi-task auxiliary learning image fusion theory guided by subjective attention.

Auxiliary Learning

A Cross-Modal Image Fusion Method Guided by Human Visual Characteristics

no code implementations18 Dec 2019 Aiqing Fang, Xinbo Zhao, Jiaqi Yang, Yanning Zhang

The characteristics of feature selection, nonlinear combination and multi-task auxiliary learning mechanism of the human visual perception system play an important role in real-world scenarios, but the research of image fusion theory based on the characteristics of human visual perception is less.

Auxiliary Learning feature selection

Person Re-identification in Aerial Imagery

1 code implementation14 Aug 2019 Shizhou Zhang, Qi Zhang, Yifei Yang, Xing Wei, Peng Wang, Bingliang Jiao, Yanning Zhang

Our method can learn a discriminative and compact feature representation for ReID in aerial imagery and can be trained in an end-to-end fashion efficiently.

object-detection Object Detection +1

A Performance Evaluation of Correspondence Grouping Methods for 3D Rigid Data Matching

no code implementations5 Jul 2019 Jiaqi Yang, Ke Xian, Peng Wang, Yanning Zhang

Seeking consistent point-to-point correspondences between 3D rigid data (point clouds, meshes, or depth maps) is a fundamental problem in 3D computer vision.

3D Object Recognition Point Cloud Registration +1

Evaluating Local Geometric Feature Representations for 3D Rigid Data Matching

no code implementations29 Jun 2019 Jiaqi Yang, Siwen Quan, Peng Wang, Yanning Zhang

The outcomes present interesting findings that may shed new light on this community and provide complementary perspectives to existing evaluations on the topic of local geometric feature description.

Object Recognition Point Cloud Registration +1

Vehicle Re-identification in Aerial Imagery: Dataset and Approach

no code implementations ICCV 2019 Peng Wang, Bingliang Jiao, Lu Yang, Yifei Yang, Shizhou Zhang, Wei Wei, Yanning Zhang

It is capable of explicitly detecting discriminative parts for each specific vehicle and significantly outperforms the evaluated baselines and state-of-the-art vehicle ReID approaches.

Vehicle Re-Identification

Pixel-aware Deep Function-mixture Network for Spectral Super-Resolution

no code implementations24 Mar 2019 Lei Zhang, Zhiqiang Lang, Peng Wang, Wei Wei, Shengcai Liao, Ling Shao, Yanning Zhang

To address this problem, we propose a pixel-aware deep function-mixture network for SSR, which is composed of a new class of modules, termed function-mixture (FM) blocks.

Spectral Super-Resolution Super-Resolution

MPTV: Matching Pursuit Based Total Variation Minimization for Image Deconvolution

no code implementations12 Oct 2018 Dong Gong, Mingkui Tan, Qinfeng Shi, Anton Van Den Hengel, Yanning Zhang

Compared to existing methods, MPTV is less sensitive to the choice of the trade-off parameter between data fitting and regularization.

Image Deconvolution

A Pulmonary Nodule Detection Model Based on Progressive Resolution and Hierarchical Saliency

no code implementations2 Jul 2018 Jun-Jie Zhang, Yong Xia, Yanning Zhang

Detection of pulmonary nodules on chest CT is an essential step in the early diagnosis of lung cancer, which is critical for best patient care.

Accurate Spectral Super-resolution from Single RGB Image Using Multi-scale CNN

no code implementations10 Jun 2018 Yiqi Yan, Lei Zhang, Jun Li, Wei Wei, Yanning Zhang

Different from traditional hyperspectral super-resolution approaches that focus on improving the spatial resolution, spectral super-resolution aims at producing a high-resolution hyperspectral image from the RGB observation with super-resolution in spectral domain.

Spectral Reconstruction Spectral Super-Resolution +1

Adaptive Importance Learning for Improving Lightweight Image Super-resolution Network

no code implementations5 Jun 2018 Lei Zhang, Peng Wang, Chunhua Shen, Lingqiao Liu, Wei Wei, Yanning Zhang, Anton Van Den Hengel

In this study, we revisit this problem from an orthog- onal view, and propose a novel learning strategy to maxi- mize the pixel-wise fitting capacity of a given lightweight network architecture.

Image Super-Resolution

Learning Deep Gradient Descent Optimization for Image Deconvolution

1 code implementation10 Apr 2018 Dong Gong, Zhen Zhang, Qinfeng Shi, Anton Van Den Hengel, Chunhua Shen, Yanning Zhang

Extensive experiments on synthetic benchmarks and challenging real-world images demonstrate that the proposed deep optimization method is effective and robust to produce favorable results as well as practical for real-world image deblurring applications.

Blind Image Deblurring Image Deblurring +1

Significantly Fast and Robust Fuzzy C-MeansClustering Algorithm Based on MorphologicalReconstruction and Membership Filtering

no code implementations IEEE 2018 Tao Lei, Xiaohong Jia, Yanning Zhang, Lifeng He, Hongy-ing Meng, Senior Member, and Asoke K. Nandi, Fellow, IEEE

However, the introduction oflocal spatial information often leads to a high computationalcomplexity, arising out of an iterative calculation of the distancebetween pixels within local spatial neighbors and clusteringcenters.

Clustering Image Segmentation +1

Self-Paced Kernel Estimation for Robust Blind Image Deblurring

no code implementations ICCV 2017 Dong Gong, Mingkui Tan, Yanning Zhang, Anton Van Den Hengel, Qinfeng Shi

Rather than attempt to identify outliers to the model a priori, we instead propose to sequentially identify inliers, and gradually incorporate them into the estimation process.

Blind Image Deblurring Image Deblurring

Beyond Low Rank: A Data-Adaptive Tensor Completion Method

no code implementations3 Aug 2017 Lei Zhang, Wei Wei, Qinfeng Shi, Chunhua Shen, Anton Van Den Hengel, Yanning Zhang

The prior for the non-low-rank structure is established based on a mixture of Gaussians which is shown to be flexible enough, and powerful enough, to inform the completion process for a variety of real tensor data.

From Motion Blur to Motion Flow: a Deep Learning Solution for Removing Heterogeneous Motion Blur

no code implementations CVPR 2017 Dong Gong, Jie Yang, Lingqiao Liu, Yanning Zhang, Ian Reid, Chunhua Shen, Anton Van Den Hengel, Qinfeng Shi

The critical observation underpinning our approach is thus that learning the motion flow instead allows the model to focus on the cause of the blur, irrespective of the image content.

Tensor Power Iteration for Multi-Graph Matching

no code implementations CVPR 2016 Xinchu Shi, Haibin Ling, Weiming Hu, Junliang Xing, Yanning Zhang

Due to its wide range of applications, matching between two graphs has been extensively studied and remains an active topic.

Graph Matching

Blind Image Deconvolution by Automatic Gradient Activation

no code implementations CVPR 2016 Dong Gong, Mingkui Tan, Yanning Zhang, Anton Van Den Hengel, Qinfeng Shi

We show here that a subset of the image gradients are adequate to estimate the blur kernel robustly, no matter the gradient image is sparse or not.

Image Deconvolution

Hyperspectral Compressive Sensing Using Manifold-Structured Sparsity Prior

no code implementations ICCV 2015 Lei Zhang, Wei Wei, Yanning Zhang, Fei Li, Chunhua Shen, Qinfeng Shi

To reconstruct hyperspectral image (HSI) accurately from a few noisy compressive measurements, we present a novel manifold-structured sparsity prior based hyperspectral compressive sensing (HCS) method in this study.

Compressive Sensing

Modeling Deformable Gradient Compositions for Single-Image Super-Resolution

no code implementations CVPR 2015 Yu Zhu, Yanning Zhang, Boyan Bonev, Alan L. Yuille

Based on the fact that singular primitive patches are more invariant to the scale change (i. e. have less ambiguity across different scales), we represent the non-singular primitives as compositions of singular ones, each of which is allowed some deformation.

Image Super-Resolution

Reweighted Laplace Prior Based Hyperspectral Compressive Sensing for Unknown Sparsity

no code implementations CVPR 2015 Lei Zhang, Wei Wei, Yanning Zhang, Chunna Tian, Fei Li

To address this problem, a novel reweighted Laplace prior based hyperspectral compressive sensing method is proposed in this study.

Compressive Sensing Noise Estimation

Constraint Reduction using Marginal Polytope Diagrams for MAP LP Relaxations

no code implementations17 Dec 2013 Zhen Zhang, Qinfeng Shi, Yanning Zhang, Chunhua Shen, Anton Van Den Hengel

We show that using Marginal Polytope Diagrams allows the number of constraints to be reduced without loosening the LP relaxations.

Part-Based Visual Tracking with Online Latent Structural Learning

no code implementations CVPR 2013 Rui Yao, Qinfeng Shi, Chunhua Shen, Yanning Zhang, Anton Van Den Hengel

Despite many advances made in the area, deformable targets and partial occlusions continue to represent key problems in visual tracking.

Structured Prediction Visual Tracking

Multi-image Blind Deblurring Using a Coupled Adaptive Sparse Prior

no code implementations CVPR 2013 Haichao Zhang, David Wipf, Yanning Zhang

This paper presents a robust algorithm for estimating a single latent sharp image given multiple blurry and/or noisy observations.


Cannot find the paper you are looking for? You can Submit a new open access paper.