Search Results for author: Feng Zheng

Found 64 papers, 31 papers with code

Enabling Deep Residual Networks for Weakly Supervised Object Detection

no code implementations ECCV 2020 Yunhang Shen, Rongrong Ji, Yan Wang, Zhiwei Chen, Feng Zheng, Feiyue Huang, Yunsheng Wu

Weakly supervised object detection (WSOD) has attracted extensive research attention due to its great flexibility of exploiting large-scale image-level annotation for detector training.

object-detection Weakly Supervised Object Detection

Accelerating Vision-Language Pretraining with Free Language Modeling

no code implementations 24 Mar 2023 Teng Wang, Yixiao Ge, Feng Zheng, Ran Cheng, Ying Shan, XiaoHu Qie, Ping Luo

FLM successfully frees the prediction rate from the tie-up with the corruption rate while allowing the corruption spans to be customized for each token to be predicted.

Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline

no code implementations 22 Mar 2023 Tiantian Geng, Teng Wang, Jinming Duan, Runmin Cong, Feng Zheng

To better adapt to real-life applications, in this paper we focus on the task of dense-localizing audio-visual events, which aims to jointly localize and recognize all audio-visual events occurring in an untrimmed video.

audio-visual event localization

Learning Grounded Vision-Language Representation for Versatile Understanding in Untrimmed Videos

no code implementations 11 Mar 2023 Teng Wang, Jinrui Zhang, Feng Zheng, Wenhao Jiang, Ran Cheng, Ping Luo

TEG learns to adaptively ground the possible event proposals given a set of sentences by estimating the cross-modal distance in a joint semantic space.

Dense Video Captioning Text Generation

On the Stability and Generalization of Triplet Learning

no code implementations 20 Feb 2023 Jun Chen, Hong Chen, Xue Jiang, Bin Gu, Weifu Li, Tieliang Gong, Feng Zheng

Triplet learning, i.e., learning from triplet data, has attracted much attention in computer vision tasks with an extremely large number of categories, e.g., face recognition and person re-identification.

Face Recognition Metric Learning +1

IM-IAD: Industrial Image Anomaly Detection Benchmark in Manufacturing

1 code implementation 31 Jan 2023 Guoyang Xie, Jinbao Wang, Jiaqi Liu, Jiayi Lyu, Yong Liu, Chengjie Wang, Feng Zheng, Yaochu Jin

We realize that the lack of actual IM settings most probably hinders the development and usage of these methods in real-world applications.

Anomaly Detection Continual Learning +1

Pushing the Limits of Fewshot Anomaly Detection in Industry Vision: Graphcore

no code implementations 28 Jan 2023 Guoyang Xie, Jingbao Wang, Jiaqi Liu, Feng Zheng, Yaochu Jin

Besides, we provide a novel model, GraphCore, via VIIFs that enables fast unsupervised FSAD training and improves the performance of anomaly detection.

Anomaly Detection

Deep Industrial Image Anomaly Detection: A Survey

1 code implementation 27 Jan 2023 Jiaqi Liu, Guoyang Xie, Jingbao Wang, Shangnian Li, Chengjie Wang, Feng Zheng, Yaochu Jin

In this paper, we provide a comprehensive review of deep learning-based image anomaly detection techniques, from the perspectives of neural network architectures, levels of supervision, loss functions, metrics and datasets.

Anomaly Detection

Learning Dual-Fused Modality-Aware Representations for RGBD Tracking

no code implementations 6 Nov 2022 Shang Gao, Jinyu Yang, Zhe Li, Feng Zheng, Aleš Leonardis, Jingkuan Song

However, some existing RGBD trackers use the two modalities separately and thus some particularly useful shared information between them is ignored.

Object Tracking

Does Thermal Really Always Matter for RGB-T Salient Object Detection?

1 code implementation 9 Oct 2022 Runmin Cong, Kepu Zhang, Chen Zhang, Feng Zheng, Yao Zhao, Qingming Huang, Sam Kwong

In addition, considering the role of thermal modality, we set up different cross-modality interaction mechanisms in the encoding phase and the decoding phase.

object-detection Object Detection +2

Deep Manifold Hashing: A Divide-and-Conquer Approach for Semi-Paired Unsupervised Cross-Modal Retrieval

no code implementations 26 Sep 2022 Yufeng Shi, Xinge You, Jiamiao Xu, Feng Zheng, Qinmu Peng, Weihua Ou

Hashing that projects data into binary codes has shown extraordinary talents in cross-modal retrieval due to its low storage usage and high query speed.

Cross-Modal Retrieval Retrieval

Multi-modal Segment Assemblage Network for Ad Video Editing with Importance-Coherence Reward

1 code implementation 25 Sep 2022 Yunlong Tang, Siting Xu, Teng Wang, Qin Lin, Qinglin Lu, Feng Zheng

The existing method performs well at video segmentation stages but suffers from the problems of dependencies on extra cumbersome models and poor performance at the segment assemblage stage.

Video Editing Video Segmentation +1

Pose-Aided Video-based Person Re-Identification via Recurrent Graph Convolutional Network

no code implementations 23 Sep 2022 Honghu Pan, Qiao Liu, Yongyong Chen, Yunqi He, Yuan Zheng, Feng Zheng, Zhenyu He

Finally, we propose a dual-attention method consisting of node-attention and time-attention to obtain the temporal graph representation from the node embeddings, where the self-attention mechanism is employed to learn the importance of each node and each frame.

Retrieval Video-Based Person Re-Identification +1

T-Person-GAN: Text-to-Person Image Generation with Identity-Consistency and Manifold Mix-Up

1 code implementation 18 Aug 2022 Lin Wu, Yang Wang, Feng Zheng, Qi Tian, Meng Wang

Our architecture is orthogonal to StackGAN++, and focuses on person image generation, with all of them together to enrich the spectrum of GANs for the image generation task.

Text-to-Image Generation

Prompting for Multi-Modal Tracking

no code implementations 29 Jul 2022 Jinyu Yang, Zhe Li, Feng Zheng, Aleš Leonardis, Jingkuan Song

Multi-modal tracking gains attention due to its ability to be more accurate and robust in complex scenarios compared to traditional RGB-based tracking.

Exploiting Context Information for Generic Event Boundary Captioning

1 code implementation 3 Jul 2022 Jinrui Zhang, Teng Wang, Feng Zheng, Ran Cheng, Ping Luo

Previous methods only process the information of a single boundary at a time, which lacks utilization of video context information.

Boundary Captioning

VLMixer: Unpaired Vision-Language Pre-training via Cross-Modal CutMix

1 code implementation 17 Jun 2022 Teng Wang, Wenhao Jiang, Zhichao Lu, Feng Zheng, Ran Cheng, Chengguo Yin, Ping Luo

Existing vision-language pre-training (VLP) methods primarily rely on paired image-text datasets, which are either annotated with enormous human labor or crawled from the internet and then cleaned with elaborate data cleaning techniques.

Contrastive Learning Data Augmentation +1

Semantic-Aware Pretraining for Dense Video Captioning

no code implementations 13 Apr 2022 Teng Wang, Zhu Liu, Feng Zheng, Zhichao Lu, Ran Cheng, Ping Luo

This report describes the details of our approach for the event dense-captioning task in ActivityNet Challenge 2021.

Dense Captioning Dense Video Captioning

RGBD Object Tracking: An In-depth Review

1 code implementation 26 Mar 2022 Jinyu Yang, Zhe Li, Song Yan, Feng Zheng, Aleš Leonardis, Joni-Kristian Kämäräinen, Ling Shao

Particularly, we are the first to provide depth quality evaluation and analysis of tracking results in depth-friendly scenarios in RGBD tracking.

Object Tracking

Error-based Knockoffs Inference for Controlled Feature Selection

no code implementations 9 Mar 2022 Xuebin Zhao, Hong Chen, Yingjie Wang, Weifu Li, Tieliang Gong, Yulong Wang, Feng Zheng

Recently, the scheme of model-X knockoffs was proposed as a promising solution to address controlled feature selection under high-dimensional finite-sample settings.

Feature Importance

Skating-Mixer: Long-Term Sport Audio-Visual Modeling with MLPs

1 code implementation 8 Mar 2022 Jingfei Xia, Mingchen Zhuge, Tiantian Geng, Shun Fan, Yuantai Wei, Zhenyu He, Feng Zheng

Figure skating scoring is challenging because it requires judging the technical moves of the players as well as their coordination with the background music.

Representation Learning

Cross-Modality Neuroimage Synthesis: A Survey

no code implementations 14 Feb 2022 Guoyang Xie, Jinbao Wang, Yawen Huang, Jiayi Lyu, Feng Zheng, Yefeng Zheng, Yaochu Jin

In this paper, we are the first to comprehensively approach the cross-modality neuroimage synthesis task from different perspectives, which include the level of supervision (especially weakly-supervised and unsupervised), loss functions, evaluation metrics, the range of modality synthesis, datasets (aligned, private and public) and the synthesis-based downstream tasks.

Image Generation

A Survey of Visual Sensory Anomaly Detection

1 code implementation 14 Feb 2022 Xi Jiang, Guoyang Xie, Jinbao Wang, Yong Liu, Chengjie Wang, Feng Zheng, Yaochu Jin

In this survey, we are the first to provide a comprehensive review of visual sensory AD and categorize it into three levels according to the form of anomalies.

Anomaly Detection

FedMed-ATL: Misaligned Unpaired Brain Image Synthesis via Affine Transform Loss

1 code implementation 29 Jan 2022 Jinbao Wang, Guoyang Xie, Yawen Huang, Yefeng Zheng, Yaochu Jin, Feng Zheng

The proposed method demonstrates advanced performance in the quality of its synthesized results under a severely misaligned and unpaired data setting, as well as better stability than other GAN-based algorithms.

Data Augmentation Image Generation +1

FedMed-GAN: Federated Domain Translation on Unsupervised Cross-Modality Brain Image Synthesis

1 code implementation 22 Jan 2022 Guoyang Xie, Jinbao Wang, Yawen Huang, Yuexiang Li, Yefeng Zheng, Feng Zheng, Yaochu Jin

There is a clear need to launch federated learning and facilitate the integration of dispersed data from different institutions.

Federated Learning Image Generation +1

GuidedMix-Net: Semi-supervised Semantic Segmentation by Using Labeled Images as Reference

no code implementations 28 Dec 2021 Peng Tu, Yawen Huang, Feng Zheng, Zhenyu He, Liujun Cao, Ling Shao

In this paper, we propose a novel method for semi-supervised semantic segmentation named GuidedMix-Net, by leveraging labeled information to guide the learning of unlabeled instances.

Semi-Supervised Semantic Segmentation

Benchmarks for Corruption Invariant Person Re-identification

1 code implementation 1 Nov 2021 Minghui Chen, Zhiqiang Wang, Feng Zheng

When deploying a person re-identification (ReID) model in safety-critical applications, it is pivotal to understand the robustness of the model against a diverse array of image corruptions.

 Ranked #1 on Cross-Modal Person Re-Identification on RegDB-C (mINP (Visible to Thermal) metric)

Cross-Modal Person Re-Identification Generalizable Person Re-identification

DepthTrack: Unveiling the Power of RGBD Tracking

1 code implementation 31 Aug 2021 Song Yan, Jinyu Yang, Jani Käpylä, Feng Zheng, Aleš Leonardis, Joni-Kristian Kämäräinen

RGBD (RGB plus depth) object tracking is gaining momentum as RGBD sensors have become popular in many application fields such as robotics. However, the best RGBD trackers are extensions of the state-of-the-art deep RGB trackers.

Object Tracking

An Information-theoretic Perspective of Hierarchical Clustering

no code implementations 13 Aug 2021 YiCheng Pan, Feng Zheng, Bingchen Fan

In this paper, we investigate hierarchical clustering from the information-theoretic perspective and formulate a new objective function.

Saliency-Associated Object Tracking

1 code implementation ICCV 2021 Zikun Zhou, Wenjie Pei, Xin Li, Hongpeng Wang, Feng Zheng, Zhenyu He

A potential limitation of such trackers is that not all patches are equally informative for tracking.

Association Object Tracking

FREE: Feature Refinement for Generalized Zero-Shot Learning

1 code implementation ICCV 2021 Shiming Chen, Wenjie Wang, Beihao Xia, Qinmu Peng, Xinge You, Feng Zheng, Ling Shao

FREE employs a feature refinement (FR) module that incorporates semantic→visual mapping into a unified generative model to refine the visual features of seen and unseen class samples.

Generalized Zero-Shot Learning

WeClick: Weakly-Supervised Video Semantic Segmentation with Click Annotations

no code implementations 7 Jul 2021 Peidong Liu, Zibin He, Xiyu Yan, Yong Jiang, Shutao Xia, Feng Zheng, Maowei Hu

In this work, we propose an effective weakly-supervised video semantic segmentation pipeline with click annotations, called WeClick, for saving laborious annotating effort by segmenting an instance of the semantic class with only a single click.

Knowledge Distillation Model Compression +2

GuidedMix-Net: Learning to Improve Pseudo Masks Using Labeled Images as Reference

1 code implementation 29 Jun 2021 Peng Tu, Yawen Huang, Rongrong Ji, Feng Zheng, Ling Shao

To take advantage of the labeled examples and guide unlabeled data learning, we further propose a mask generation module to generate high-quality pseudo masks for the unlabeled data.

Semi-Supervised Semantic Segmentation

Brain Image Synthesis With Unsupervised Multivariate Canonical CSCℓ₄Net

no code implementations CVPR 2021 Yawen Huang, Feng Zheng, Danyang Wang, Weilin Huang, Matthew R. Scott, Ling Shao

Recent advances in neuroscience have highlighted the effectiveness of multi-modal medical data for investigating certain pathologies and understanding human cognition.

Image Generation

Learning 3D Shape Feature for Texture-Insensitive Person Re-Identification

1 code implementation CVPR 2021 Jiaxing Chen, Xinyang Jiang, Fudong Wang, Jun Zhang, Feng Zheng, Xing Sun, Wei-Shi Zheng

In this paper, rather than relying on texture based information, we propose to improve the robustness of person ReID against clothing texture by exploiting the information of a person's 3D shape.

3D Reconstruction Person Re-Identification

Brain Image Synthesis with Unsupervised Multivariate Canonical CSCℓ₄Net

no code implementations 22 Mar 2021 Yawen Huang, Feng Zheng, Danyang Wang, Weilin Huang, Matthew R. Scott, Ling Shao

Recent advances in neuroscience have highlighted the effectiveness of multi-modal medical data for investigating certain pathologies and understanding human cognition.

Image Generation

Tiny Adversarial Multi-Objective Oneshot Neural Architecture Search

no code implementations 28 Feb 2021 Guoyang Xie, Jinbao Wang, Guo Yu, Feng Zheng, Yaochu Jin

Our work focuses on how to improve the robustness of tiny neural networks without seriously degrading clean accuracy under mobile-level resources.

Neural Architecture Search

A Bayesian Federated Learning Framework with Online Laplace Approximation

no code implementations 3 Feb 2021 Liangxi Liu, Feng Zheng, Hong Chen, Guo-Jun Qi, Heng Huang, Ling Shao

On the client side, a prior loss that uses the global posterior probabilistic parameters delivered from the server is designed to guide the local training.

Federated Learning

DepthTrack: Unveiling the Power of RGBD Tracking

1 code implementation ICCV 2021 Song Yan, Jinyu Yang, Jani Käpylä, Feng Zheng, Aleš Leonardis, Joni-Kristian Kämäräinen

This can be explained by the fact that there are no sufficiently large RGBD datasets to 1) train "deep depth trackers" and to 2) challenge RGB trackers with sequences for which the depth cue is essential.

Object Tracking

One for More: Selecting Generalizable Samples for Generalizable ReID Model

1 code implementation 10 Dec 2020 Enwei Zhang, Xinyang Jiang, Hao Cheng, AnCong Wu, Fufu Yu, Ke Li, Xiaowei Guo, Feng Zheng, Wei-Shi Zheng, Xing Sun

Current training objectives of existing person Re-IDentification (ReID) models only ensure that the loss of the model decreases on the selected training batch, with no regard to the performance on samples outside the batch.

Person Re-Identification

Multi-task Additive Models for Robust Estimation and Automatic Structure Discovery

no code implementations NeurIPS 2020 Yingjie Wang, Hong Chen, Feng Zheng, Chen Xu, Tieliang Gong, Yanhong Chen

For high-dimensional observations in real environments, e.g., Coronal Mass Ejections (CMEs) data, the learning performance of previous methods may be seriously degraded due to complex non-Gaussian noise and insufficient prior knowledge of the variable structure.

Additive models Bilevel Optimization +1

A Parallel Down-Up Fusion Network for Salient Object Detection in Optical Remote Sensing Images

no code implementations 2 Oct 2020 Chongyi Li, Runmin Cong, Chunle Guo, Hua Li, Chunjie Zhang, Feng Zheng, Yao Zhao

In this paper, we propose a novel Parallel Down-up Fusion network (PDF-Net) for SOD in optical RSIs, which takes full advantage of the in-path low- and high-level features and cross-path multi-resolution features to distinguish diversely scaled salient objects and suppress the cluttered backgrounds.

object-detection Object Detection +1

Devil's in the Details: Aligning Visual Clues for Conditional Embedding in Person Re-Identification

1 code implementation 11 Sep 2020 Fufu Yu, Xinyang Jiang, Yifei Gong, Shizhen Zhao, Xiaowei Guo, Wei-Shi Zheng, Feng Zheng, Xing Sun

Secondly, the Conditional Feature Embedding requires the overall feature of a query image to be dynamically adjusted based on the gallery image it matches, while most of the existing methods ignore the reference images.

Person Re-Identification

LSOTB-TIR: A Large-Scale High-Diversity Thermal Infrared Object Tracking Benchmark

1 code implementation 3 Aug 2020 Qiao Liu, Xin Li, Zhenyu He, Chenglong Li, Jun Li, Zikun Zhou, Di Yuan, Jing Li, Kai Yang, Nana Fan, Feng Zheng

We evaluate and analyze more than 30 trackers on LSOTB-TIR to provide a series of baselines, and the results show that deep trackers achieve promising performance.

Thermal Infrared Object Tracking

Dual Distribution Alignment Network for Generalizable Person Re-Identification

1 code implementation 27 Jul 2020 Peixian Chen, Pingyang Dai, Jianzhuang Liu, Feng Zheng, Qi Tian, Rongrong Ji

Domain generalization (DG) serves as a promising solution to handle person Re-Identification (Re-ID), which trains the model using labels from the source domain alone, and then directly adopts the trained model to the target domain without model updating.

Domain Generalization Generalizable Person Re-identification

3dDepthNet: Point Cloud Guided Depth Completion Network for Sparse Depth and Single Color Image

no code implementations 20 Mar 2020 Rui Xiang, Feng Zheng, Huapeng Su, Zhe Zhang

In this paper, we propose an end-to-end deep learning network named 3dDepthNet, which produces an accurate dense depth image from a single pair of sparse LiDAR depth and color image for robotics and autonomous driving tasks.

Autonomous Driving Depth Completion +1

Viewpoint-Aware Loss with Angular Regularization for Person Re-Identification

1 code implementation 3 Dec 2019 Zhihui Zhu, Xinyang Jiang, Feng Zheng, Xiaowei Guo, Feiyue Huang, Wei-Shi Zheng, Xing Sun

Instead of one subspace for each viewpoint, our method projects the feature from different viewpoints into a unified hypersphere and effectively models the feature distribution on both the identity-level and the viewpoint-level.

Ranked #5 on Person Re-Identification on Market-1501 (using extra training data)

Person Re-Identification

Rethinking Temporal Fusion for Video-based Person Re-identification on Semantic and Time Aspect

2 code implementations 28 Nov 2019 Xinyang Jiang, Yifei Gong, Xiaowei Guo, Qize Yang, Feiyue Huang, Wei-Shi Zheng, Feng Zheng, Xing Sun

Recently, the research interest of person re-identification (ReID) has gradually turned to video-based methods, which acquire a person representation by aggregating frame features of an entire video.

Video-Based Person Re-Identification

Supervised Online Hashing via Similarity Distribution Learning

no code implementations 31 May 2019 Mingbao Lin, Rongrong Ji, Shen Chen, Feng Zheng, Xiaoshuai Sun, Baochang Zhang, Liujuan Cao, Guodong Guo, Feiyue Huang

In this paper, we propose to model the similarity distributions between the input data and the hashing codes, upon which a novel supervised online hashing method, dubbed as Similarity Distribution based Online Hashing (SDOH), is proposed, to keep the intrinsic semantic relationship in the produced Hamming space.


Pyramidal Person Re-IDentification via Multi-Loss Dynamic Training

1 code implementation CVPR 2019 Feng Zheng, Cheng Deng, Xing Sun, Xinyang Jiang, Xiaowei Guo, Zongqiao Yu, Feiyue Huang, Rongrong Ji

Most existing Re-IDentification (Re-ID) methods are highly dependent on precise bounding boxes that enable images to be aligned with each other.

Person Re-Identification

Trifo-VIO: Robust and Efficient Stereo Visual Inertial Odometry using Points and Lines

no code implementations 6 Mar 2018 Feng Zheng, Grace Tsai, Zhe Zhang, Shaoshan Liu, Chen-Chi Chu, Hongbing Hu

In this paper, we present the Trifo Visual Inertial Odometry (Trifo-VIO), a tightly-coupled filtering-based stereo VIO system using both points and lines.

PIRVS: An Advanced Visual-Inertial SLAM System with Flexible Sensor Fusion and Hardware Co-Design

no code implementations 2 Oct 2017 Zhe Zhang, Shaoshan Liu, Grace Tsai, Hongbing Hu, Chen-Chi Chu, Feng Zheng

In this paper, we present the PerceptIn Robotics Vision System (PIRVS), visual-inertial computing hardware with an embedded simultaneous localization and mapping (SLAM) algorithm.

Simultaneous Localization and Mapping

Dual-reference Face Retrieval

no code implementations 2 Jun 2017 BingZhang Hu, Feng Zheng, Ling Shao

Face retrieval has received much attention over the past few decades, and many efforts have been made in retrieving face images against pose, illumination, and expression variations.

