no code implementations • 18 Apr 2024 • Zhong Wang, Zengyu Wan, Han Han, Bohao Liao, Yuliang Wu, Wei Zhai, Yang Cao, Zheng-Jun Zha
Event-based eye tracking has shown great promise with the high temporal resolution and low redundancy provided by the event camera.
no code implementations • 17 Apr 2024 • Zuowen Wang, Chang Gao, Zongwei Wu, Marcos V. Conde, Radu Timofte, Shih-Chii Liu, Qinyu Chen, Zheng-Jun Zha, Wei Zhai, Han Han, Bohao Liao, Yuliang Wu, Zengyu Wan, Zhong Wang, Yang Cao, Ganchao Tan, Jinze Chen, Yan Ru Pei, Sasskia Brüers, Sébastien Crouzet, Douglas McLelland, Oliver Coenen, Baoheng Zhang, Yizhao Gao, Jingyuan Li, Hayden Kwok-Hay So, Philippe Bich, Chiara Boretti, Luciano Prono, Mircea Lică, David Dinucu-Jianu, Cătălin Grîu, Xiaopeng Lin, Hongwei Ren, Bojun Cheng, Xinan Zhang, Valentin Vial, Anthony Yezzi, James Tsai
This survey reviews the AIS 2024 Event-Based Eye Tracking (EET) Challenge.
no code implementations • 22 Mar 2024 • Qiang Zhang, Jiawei Liu, Fanrui Zhang, Xiaoling Zhu, Zheng-Jun Zha
Existing key node identification methods usually consider node influence only from the propagation structure perspective and have insufficient generalization ability to unknown scenarios.
no code implementations • 22 Mar 2024 • Fanrui Zhang, Jiawei Liu, Qiang Zhang, Xiaoling Zhu, Zheng-Jun Zha
In this work, we propose a novel Hierarchical Information Enhancement Network (HIENet) for cascade prediction.
no code implementations • 19 Mar 2024 • Zhipeng Huang, Zhizheng Zhang, Zheng-Jun Zha, Yan Lu, Baining Guo
The development of Large Vision-Language Models (LVLMs) is striving to catch up with the success of Large Language Models (LLMs), yet it faces more challenges to be resolved.
no code implementations • 19 Mar 2024 • Zhipeng Huang, Zhizheng Zhang, Yiting Lu, Zheng-Jun Zha, Zhibo Chen, Baining Guo
In this paper, we explore this question and provide the answer "Yes!".
no code implementations • 14 Mar 2024 • Yuliang Wu, Ganchao Tan, Jinze Chen, Wei Zhai, Yang Cao, Zheng-Jun Zha
In this paper, we propose AsynHDR, a Pixel-Asynchronous HDR imaging system, based on key insights into the challenges in HDR imaging and the unique event-generating mechanism of Dynamic Vision Sensors (DVS).
no code implementations • 3 Mar 2024 • Hongjian Liu, Qingsong Xie, Zhijie Deng, Chen Chen, Shixiang Tang, Fueyang Fu, Zheng-Jun Zha, Haonan Lu
In contrast to vanilla consistency distillation (CD) which distills the ordinary differential equation solvers-based sampling process of a pretrained teacher model into a student, SCott explores the possibility and validates the efficacy of integrating stochastic differential equation (SDE) solvers into CD to fully unleash the potential of the teacher.
no code implementations • 14 Dec 2023 • Yuhang Yang, Wei Zhai, Hongchen Luo, Yang Cao, Zheng-Jun Zha
Such approaches underexploit certain correlations between the interaction counterparts (human and object), and struggle to address the uncertainty in interactions.
no code implementations • 12 Dec 2023 • Jie Xiao, Kai Zhu, Han Zhang, Zhiheng Liu, Yujun Shen, Yu Liu, Xueyang Fu, Zheng-Jun Zha
Consistency Models (CMs) have shown promise in creating visual content efficiently and with high quality.
no code implementations • 8 Dec 2023 • Xi Wang, Xueyang Fu, Peng-Tao Jiang, Jie Huang, Mi Zhou, Bo Li, Zheng-Jun Zha
The former facilitates channel-dependent degradation removal operation, allowing the network to tailor responses to various adverse weather types; the latter, by integrating Fourier's global properties into channel-independent content features, enhances network capacity for consistent global content reconstruction.
1 code implementation • 29 Nov 2023 • Yurui Zhu, Xueyang Fu, Peng-Tao Jiang, Hao Zhang, Qibin Sun, Jinwei Chen, Zheng-Jun Zha, Bo Li
This research focuses on the issue of single-image reflection removal (SIRR) in real-world conditions, examining it from two angles: the collection pipeline of real reflection pairs and the perception of real reflection locations.
1 code implementation • ICCV 2023 • Yunbin Tu, Liang Li, Li Su, Zheng-Jun Zha, Chenggang Yan, Qingming Huang
Change captioning aims to describe the difference between a pair of similar images.
2 code implementations • 22 Sep 2023 • Wei Zhai, Pingyu Wu, Kai Zhu, Yang Cao, Feng Wu, Zheng-Jun Zha
In addition, our method also achieves state-of-the-art weakly supervised semantic segmentation performance on the PASCAL VOC 2012 and MS COCO 2014 datasets.
1 code implementation • 5 Sep 2023 • Yuxiang Yang, Yingqi Deng, Jing Zhang, Jiahao Nie, Zheng-Jun Zha
The spatial information indicating objects' spatial adjacency across consecutive frames is crucial for effective object tracking.
no code implementations • ICCV 2023 • Kecheng Zheng, Wei Wu, Ruili Feng, Kai Zhu, Jiawei Liu, Deli Zhao, Zheng-Jun Zha, Wei Chen, Yujun Shen
To bring the useful knowledge back into light, we first identify a set of parameters that are important to a given downstream task, then attach a binary mask to each parameter, and finally optimize these masks on the downstream data with the parameters frozen.
2 code implementations • ICCV 2023 • Zhipeng Huang, Zhizheng Zhang, Cuiling Lan, Zheng-Jun Zha, Yan Lu, Baining Guo
With this insight, we propose the Adaptive Frequency Filtering (AFF) token mixer.
no code implementations • 28 Jun 2023 • Jiawei Liu, Jingyi Xie, Fanrui Zhang, Qiang Zhang, Zheng-Jun Zha
The explosive growth of rumors with text and images on social media platforms has drawn great attention.
no code implementations • 21 Jun 2023 • Yukun Huang, Jianan Wang, Yukai Shi, Xianbiao Qi, Zheng-Jun Zha, Lei Zhang
Text-to-image diffusion models pre-trained on billions of image-text pairs have recently enabled text-to-3D content creation by optimizing a randomly initialized Neural Radiance Fields (NeRF) with score distillation.
1 code implementation • CVPR 2023 • Yucheng Zhao, Chong Luo, Chuanxin Tang, Dongdong Chen, Noel Codella, Zheng-Jun Zha
We believe that the concept of streaming video model and the implementation of S-ViT are solid steps towards a unified deep learning architecture for video understanding.
1 code implementation • ICCV 2023 • Yuhang Yang, Wei Zhai, Hongchen Luo, Yang Cao, Jiebo Luo, Zheng-Jun Zha
Comprehensive experiments on PIAD demonstrate the reliability of the proposed task and the superiority of our method.
1 code implementation • ICCV 2023 • Pingyu Wu, Wei Zhai, Yang Cao, Jiebo Luo, Zheng-Jun Zha
Specifically, a spatial token is first introduced in the input space to aggregate representations for the localization task.
1 code implementation • ICCV 2023 • Xiaoyu Liu, Wei Huang, Zhiwei Xiong, Shenglong Zhou, Yueyi Zhang, Xuejin Chen, Zheng-Jun Zha, Feng Wu
Sparse instance-level supervision has recently been explored to address insufficient annotation in biomedical instance segmentation; compared to common semi-supervision, it makes crowded instances easier to annotate and better preserves instance completeness for 3D volumetric datasets. In this paper, we propose a sparsely supervised biomedical instance segmentation framework via cross-representation affinity consistency regularization.
no code implementations • CVPR 2023 • Yang Wang, Long Peng, Liang Li, Yang Cao, Zheng-Jun Zha
To this end, we inject the addition/difference operation into the convolution process and devise a Contrast Aware (CA) unit and a Detail Aware (DA) unit to facilitate the statistical and structural regularities modeling.
no code implementations • ICCV 2023 • Kai Zhu, Kecheng Zheng, Ruili Feng, Deli Zhao, Yang Cao, Zheng-Jun Zha
Non-exemplar class-incremental learning aims to recognize both the old and new classes without access to old class samples.
no code implementations • CVPR 2023 • Dong Li, Jiaying Zhu, Menglu Wang, Jiawei Liu, Xueyang Fu, Zheng-Jun Zha
In the second step, guided by the learnable edges, a region message passing controller is devised to weaken the message passing between the forged and authentic regions.
no code implementations • CVPR 2023 • Kunyu Wang, Xueyang Fu, Yukun Huang, Chengzhi Cao, Gege Shi, Zheng-Jun Zha
This loss enables the network to concentrate on extracting domain-invariant spectrum and domain-specific spectrum, so as to achieve better disentangling results.
1 code implementation • ICCV 2023 • Zhenhuan Liu, Liang Li, Jiayu Xiao, Zheng-Jun Zha, Qingming Huang
The experiments demonstrate the effectiveness of our method in preserving the diversity of the source domain and generating high-fidelity target images.
no code implementations • CVPR 2023 • Chengzhi Cao, Xueyang Fu, Hongjian Liu, Yukun Huang, Kunyu Wang, Jiebo Luo, Zheng-Jun Zha
Video-based person re-identification (Re-ID) is a prominent computer vision topic due to its wide range of video surveillance applications.
Representation Learning • Video-Based Person Re-Identification
no code implementations • CVPR 2023 • Ruili Feng, Kecheng Zheng, Kai Zhu, Yujun Shen, Jian Zhao, Yukun Huang, Deli Zhao, Jingren Zhou, Michael Jordan, Zheng-Jun Zha
Through investigating the properties of the problem solution, we confirm that neural dependency is guaranteed by a redundant logit covariance matrix, which condition is easily met given massive categories, and that neural dependency is highly sparse, implying that one category correlates to only a few others.
1 code implementation • 18 Jul 2022 • Xuejing Liu, Liang Li, Shuhui Wang, Zheng-Jun Zha, Zechao Li, Qi Tian, Qingming Huang
Second, most previous weakly supervised REG methods ignore the discriminative location and context of the referent, causing difficulties in distinguishing the target from other same-category objects.
no code implementations • 15 Jul 2022 • Naishan Zheng, Jie Huang, Qi Zhu, Man Zhou, Feng Zhao, Zheng-Jun Zha
Low-light image enhancement is an inherently subjective process whose targets vary with the user's aesthetic.
no code implementations • 13 Jun 2022 • Ruili Feng, Kecheng Zheng, Yukun Huang, Deli Zhao, Michael Jordan, Zheng-Jun Zha
By virtue of our numerical tools, we provide the first empirical analysis of the per-layer behavior of network rank in practical settings, i.e., ResNets, deep MLPs, and Transformers on ImageNet.
no code implementations • 10 Jun 2022 • Jingyi Xie, Jiawei Liu, Zheng-Jun Zha
LNMT leverages unlabeled news and feedback comments of users to enlarge the amount of training data and facilitates model training by generating refined labels as weak supervision.
1 code implementation • CVPR 2022 • Shaofei Cai, Liang Li, Xinzhe Han, Jiebo Luo, Zheng-Jun Zha, Qingming Huang
However, the currently used graph search space overemphasizes learning node features and neglects mining hierarchical relational information.
Ranked #2 on Link Prediction on TSP/HCP Benchmark set
no code implementations • 21 May 2022 • Ruili Feng, Jie Xiao, Kecheng Zheng, Deli Zhao, Jingren Zhou, Qibin Sun, Zheng-Jun Zha
Humans can extrapolate well, generalize daily knowledge to unseen scenarios, and raise and answer counterfactual questions.
no code implementations • CVPR 2022 • Xihao Chen, Zhiwei Xiong, Zhen Cheng, Jiayong Peng, Yueyi Zhang, Zheng-Jun Zha
Interestingly, we find that, although a stereo matching network trained with the photometric loss is not optimal, its feature extractor can produce degradation-agnostic and matching-specific features.
1 code implementation • 2 Apr 2022 • Zhenhuan Liu, Liang Li, Huajie Jiang, Xin Jin, Dandan Tu, Shuhui Wang, Zheng-Jun Zha
Furthermore, we devise the spatio-temporal correlative map as a style-independent, global-aware regularization on the perceptual motion consistency.
no code implementations • 24 Mar 2022 • Kecheng Zheng, Yang Cao, Kai Zhu, Ruijing Zhao, Zheng-Jun Zha
However, its generalization performance to heterogeneous tasks is inferior to other architectures (e.g., CNNs and transformers) due to the extensive retention of domain information.
no code implementations • 22 Mar 2022 • Jinze Chen, Yang Wang, Yang Cao, Feng Wu, Zheng-Jun Zha
Dynamic Vision Sensors (DVS) can asynchronously output events reflecting the apparent motion of objects with microsecond resolution, and show great application potential in monitoring and other fields.
1 code implementation • 18 Mar 2022 • Yangyang Li, Wei Zhai, Yang Cao, Zheng-Jun Zha
However, these methods struggle with 1) efficiently generating camouflage images using foreground and background with arbitrary structure; and 2) camouflaging foreground objects to regions with multiple appearances (e.g., the junction of the vegetation and the mountains), which limits their practical application.
2 code implementations • CVPR 2022 • Kai Zhu, Wei Zhai, Yang Cao, Jiebo Luo, Zheng-Jun Zha
Non-exemplar class-incremental learning aims to recognize both the old and new classes when old class samples cannot be saved.
2 code implementations • CVPR 2022 • Jiayu Xiao, Liang Li, Chaofei Wang, Zheng-Jun Zha, Qingming Huang
A feasible solution is to start with a GAN well-trained on a large-scale source domain and adapt it to the target domain with a few samples, termed few-shot generative model adaption.
no code implementations • 3 Mar 2022 • Jiawei Liu, Zhipeng Huang, Liang Li, Kecheng Zheng, Zheng-Jun Zha
In this paper, we propose a novel Debiased Batch Normalization via Gaussian Process approach (GDNorm) for generalizable person re-identification, which models the feature statistic estimation from BN layers as a dynamically self-refining Gaussian process to alleviate the bias toward unseen domains and improve generalization.
Generalizable Person Re-identification • Representation Learning
no code implementations • 3 Mar 2022 • Zhipeng Huang, Jiawei Liu, Liang Li, Kecheng Zheng, Zheng-Jun Zha
RGB-infrared person re-identification is an emerging cross-modality re-identification task, which is very challenging due to significant modality discrepancy between RGB and infrared images.
2 code implementations • AAAI 2022 • Yurui Zhu, Zeyu Xiao, Yanchi Fang, Xueyang Fu, Zhiwei Xiong, Zheng-Jun Zha
To address these issues, we first propose a new shadow illumination model for the shadow removal task.
2 code implementations • CVPR 2022 • Yurui Zhu, Jie Huang, Xueyang Fu, Feng Zhao, Qibin Sun, Zheng-Jun Zha
Shadow removal, which aims to restore the background in the shadow regions, is challenging due to the highly ill-posed nature.
no code implementations • CVPR 2022 • Wei Wu, Jiawei Liu, Kecheng Zheng, Qibin Sun, Zheng-Jun Zha
Image-to-video person re-identification aims to retrieve the same pedestrian as the image-based query from a video-based gallery set.
Image-To-Video Person Re-Identification • reinforcement-learning
no code implementations • CVPR 2022 • Ganchao Tan, Yang Wang, Han Han, Yang Cao, Feng Wu, Zheng-Jun Zha
To recognize words from the event data, we propose a novel Multi-grained Spatio-Temporal Features Perceived Network (MSTP) to perceive fine-grained spatio-temporal features from microsecond time-resolved event data.
no code implementations • CVPR 2022 • Zhipeng Huang, Zhizheng Zhang, Cuiling Lan, Wenjun Zeng, Peng Chu, Quanzeng You, Jiang Wang, Zicheng Liu, Zheng-Jun Zha
In this paper, to address more practical scenarios, we propose a new task, Lifelong Unsupervised Domain Adaptive (LUDA) person ReID.
Domain Adaptive Person Re-Identification • Knowledge Distillation
1 code implementation • 27 Nov 2021 • Kecheng Zheng, Jiawei Liu, Wei Wu, Liang Li, Zheng-Jun Zha
The calibrated person representation is subtly decomposed into the identity-relevant feature, domain feature, and the remaining entangled one.
Domain Generalization • Generalizable Person Re-identification
1 code implementation • CVPR 2022 • Yaya Shi, Xu Yang, Haiyang Xu, Chunfeng Yuan, Bing Li, Weiming Hu, Zheng-Jun Zha
The datasets will be released to facilitate the development of video captioning metrics.
no code implementations • 3 Sep 2021 • Shaofei Cai, Liang Li, Xinzhe Han, Zheng-Jun Zha, Qingming Huang
Recently, researchers have studied neural architecture search (NAS) to reduce the dependence on human expertise and explore better GNN architectures, but they over-emphasize entity features and ignore latent relation information concealed in the edges.
1 code implementation • 30 Aug 2021 • Yucheng Zhao, Guangting Wang, Chuanxin Tang, Chong Luo, Wenjun Zeng, Zheng-Jun Zha
Convolutional neural networks (CNN) are the dominant deep neural network (DNN) architecture for computer vision.
no code implementations • 26 Aug 2021 • Hao Wang, Zheng-Jun Zha, Liang Li, Xuejin Chen, Jiebo Luo
We propose a novel MultiModulation Network (M2N) to learn the above correlation and leverage it as semantic guidance to modulate the related auditory, visual, and fused features.
1 code implementation • ICCV 2021 • Heliang Zheng, Huan Yang, Jianlong Fu, Zheng-Jun Zha, Jiebo Luo
The reference space is optimized to capture deep image priors that are useful for quality assessment.
no code implementations • ICCV 2021 • Yucheng Zhao, Guangting Wang, Chong Luo, Wenjun Zeng, Zheng-Jun Zha
In this paper, we propose a novel contrastive mask prediction (CMP) task for visual representation learning and design a mask contrast (MaskCo) framework to implement the idea.
no code implementations • 31 Jul 2021 • Kecheng Zheng, Cuiling Lan, Wenjun Zeng, Jiawei Liu, Zhizheng Zhang, Zheng-Jun Zha
Occluded person re-identification (ReID) aims to match person images with occlusion.
1 code implementation • 27 Jul 2021 • Wen Wang, Yang Cao, Jing Zhang, Fengxiang He, Zheng-Jun Zha, Yonggang Wen, DaCheng Tao
In DQFA, a novel domain query is used to aggregate and align global context from the token sequence of both domains.
1 code implementation • CVPR 2021 • Kai Zhu, Yang Cao, Wei Zhai, Jie Cheng, Zheng-Jun Zha
Few-shot class-incremental learning aims to recognize new classes from few samples without forgetting the old classes.
2 code implementations • 7 Jul 2021 • Zehui Chen, Chenhongyi Yang, Qiaofei Li, Feng Zhao, Zheng-Jun Zha, Feng Wu
Extensive experiments on the MS COCO benchmark show that our approach can lead to 2.0 mAP, 2.4 mAP, and 2.2 mAP absolute improvements on RetinaNet, FCOS, and ATSS baselines with negligible extra overhead.
no code implementations • CVPR 2021 • Man Zhou, Jie Xiao, Yifan Chang, Xueyang Fu, Aiping Liu, Jinshan Pan, Zheng-Jun Zha
The proposed model is capable of achieving superior performance on both inhomogeneous and incremental datasets, and is promising for highly compact systems to gradually learn myriad regularities of the different types of rain streaks.
no code implementations • CVPR 2021 • Zhen Cheng, Zhiwei Xiong, Chang Chen, Dong Liu, Zheng-Jun Zha
To fill this gap, we propose a zero-shot learning framework for light field SR, which learns a mapping to super-resolve the reference view with examples extracted solely from the input low-resolution light field itself.
no code implementations • CVPR 2021 • Hao Wang, Zheng-Jun Zha, Liang Li, Dong Liu, Jiebo Luo
In particular, for cross-modal interaction, the sentence-level query interacts with the whole moment while the word-level query interacts with content and boundary, in a coarse-to-fine manner.
no code implementations • 7 May 2021 • Jiawei Liu, Zhipeng Huang, Kecheng Zheng, Dong Liu, Xiaoyan Sun, Zheng-Jun Zha
It describes the unseen target domain as a combination of the known source ones, and explicitly learns domain-specific representation with target distribution to improve the model's generalization by a meta-learning pipeline.
no code implementations • CVPR 2021 • Jiawei Liu, Zheng-Jun Zha, Wei Wu, Kecheng Zheng, Qibin Sun
The key factor for video person re-identification is to effectively exploit both spatial and temporal clues from video sequences.
Ranked #10 on Video Deinterlacing on MSU Deinterlacer Benchmark
no code implementations • 29 Mar 2021 • Rui Zhao, Kecheng Zheng, Zheng-Jun Zha, Hongtao Xie, Jiebo Luo
The cross-modal memory module is employed to record the instance embeddings of all the datasets for global negative mining.
1 code implementation • CVPR 2021 • Shaofei Cai, Liang Li, Jincan Deng, Beichen Zhang, Zheng-Jun Zha, Li Su, Qingming Huang
Inspired by the strong searching capability of neural architecture search (NAS) in CNN, this paper proposes Graph Neural Architecture Search (GNAS) with novel-designed search space.
1 code implementation • CVPR 2021 • Kecheng Zheng, Wu Liu, Lingxiao He, Tao Mei, Jiebo Luo, Zheng-Jun Zha
In this paper, we propose a Group-aware Label Transfer (GLT) algorithm, which enables the online interaction and mutual promotion of pseudo-label prediction and representation learning.
no code implementations • 24 Feb 2021 • Shunxin Xu, Ke Sun, Dong Liu, Zhiwei Xiong, Zheng-Jun Zha
We observe that not only denoising helps combat the drop of segmentation accuracy due to noise, but also pixel-wise semantic information boosts the capability of denoising.
no code implementations • 3 Feb 2021 • Yucheng Zhao, Dacheng Yin, Chong Luo, Zhiyuan Zhao, Chuanxin Tang, Wenjun Zeng, Zheng-Jun Zha
This paper presents a self-supervised learning framework, named MGF, for general-purpose speech representation learning.
no code implementations • 28 Jan 2021 • Yizhou Zhou, Chong Luo, Xiaoyan Sun, Zheng-Jun Zha, Wenjun Zeng
We believe that VAE$^2$ is also applicable to other stochastic sequence prediction problems where the training data lack stochasticity.
no code implementations • ICCV 2021 • Yukun Huang, Xueyang Fu, Zheng-Jun Zha
In unconstrained real-world surveillance scenarios, person re-identification (Re-ID) models usually suffer from different low-level perceptual variations, e.g., cross-resolution and insufficient lighting.
no code implementations • ICCV 2021 • Jie Xiao, Man Zhou, Xueyang Fu, Aiping Liu, Zheng-Jun Zha
Equipped with our NR algorithm, the deep model can be trained on a list of synthetic rainy datasets by overcoming catastrophic forgetting, making it a general-version de-raining network.
no code implementations • ICCV 2021 • Yao Li, Xueyang Fu, Zheng-Jun Zha
However, real noisy images in practice are mostly of high resolution rather than small cropped patches, and vanilla training strategies ignore the cross-patch contextual dependency in the whole image.
no code implementations • ICCV 2021 • Xueyang Fu, Xi Wang, Aiping Liu, Junwei Han, Zheng-Jun Zha
Specifically, we design a variational model to formulate the image de-blocking problem and propose two prior terms for the image content and gradient, respectively.
1 code implementation • 16 Dec 2020 • Kecheng Zheng, Cuiling Lan, Wenjun Zeng, Zhizheng Zhang, Zheng-Jun Zha
Based on this finding, we propose to exploit the uncertainty (measured by consistency levels) to evaluate the reliability of the pseudo-label of a sample and incorporate the uncertainty to re-weight its contribution within various ReID losses, including the identity (ID) classification loss per sample, the triplet loss, and the contrastive loss.
1 code implementation • NeurIPS 2020 • Heliang Zheng, Jianlong Fu, Yanhong Zeng, Jiebo Luo, Zheng-Jun Zha
Such a model disentangles latent factors according to the semantics of feature channels by channel-/group-wise fusion of latent codes and feature channels.
no code implementations • NeurIPS 2020 • Shaobo Min, Hongtao Xie, Hantao Yao, Xuran Deng, Zheng-Jun Zha, Yongdong Zhang
In this paper, we introduce a new task, named Hierarchical Granularity Transfer Learning (HGTL), to recognize sub-level categories with basic-level annotations and semantic descriptions for hierarchical categories.
no code implementations • 10 Oct 2020 • Kecheng Zheng, Wu Liu, Jiawei Liu, Zheng-Jun Zha, Tao Mei
This hard selection strategy is able to fuse the strong-relevant multi-modality features for alleviating the problem of matching redundancy.
Ranked #16 on Text based Person Retrieval on CUHK-PEDES
no code implementations • 9 Sep 2020 • Jiawei Liu, Xierong Zhu, Zheng-Jun Zha
TALNet simultaneously exploits human attributes and appearance to learn comprehensive and effective pedestrian representations from videos.
1 code implementation • 31 Aug 2020 • Yuhang Li, Xuejin Chen, Binxin Yang, Zihan Chen, Zhihua Cheng, Zheng-Jun Zha
In this paper, we explore the task of generating photo-realistic face images from hand-drawn sketches.
1 code implementation • 10 Aug 2020 • Jing Zhang, Yang Cao, Zheng-Jun Zha, DaCheng Tao
To address this issue, we propose a novel synthetic method called 3R to simulate nighttime hazy images from daytime clear images, which first reconstructs the scene geometry, then simulates the light rays and object reflectance, and finally renders the haze effects.
1 code implementation • 17 Jul 2020 • Ganchao Tan, Daqing Liu, Meng Wang, Zheng-Jun Zha
However, existing visual reasoning methods designed for visual question answering are not appropriate to video captioning, for it requires more complex visual reasoning on videos over both space and time, and dynamic module composition along the generation process.
no code implementations • 9 May 2020 • Jun He, Richang Hong, Xueliang Liu, Mingliang Xu, Zheng-Jun Zha, Meng Wang
Metric-based few-shot learning methods concentrate on learning a transferable feature embedding that generalizes well from seen categories to unseen categories under the supervision of a limited number of labelled instances.
no code implementations • 12 Apr 2020 • Kai Zhu, Wei Zhai, Zheng-Jun Zha, Yang Cao
Few-shot segmentation aims at assigning a category label to each image pixel with few annotated samples.
no code implementations • 10 Apr 2020 • Rui Zhao, Kecheng Zheng, Zheng-Jun Zha
Existing dominant approaches for cross-modal video-text retrieval task are to learn a joint embedding space to measure the cross-modal similarity.
no code implementations • 10 Apr 2020 • Jiawei Liu, Zheng-Jun Zha, Xierong Zhu, Na Jiang
Person re-identification aims at identifying a certain pedestrian across non-overlapping camera networks.
no code implementations • CVPR 2020 • Yukun Huang, Zheng-Jun Zha, Xueyang Fu, Richang Hong, Liang Li
Person re-identification (Re-ID) in real-world scenarios usually suffers from various degradation factors, e.g., low resolution, weak illumination, blurring, and adverse weather.
1 code implementation • CVPR 2020 • Yuxin Wang, Hongtao Xie, Zheng-Jun Zha, Mengting Xing, Zilong Fu, Yongdong Zhang
Then a novel Local Orthogonal Texture-aware Module (LOTM) models the local texture information of proposal features in two orthogonal directions and represents text region with a set of contour points.
1 code implementation • CVPR 2020 • Beichen Zhang, Liang Li, Shijie Yang, Shuhui Wang, Zheng-Jun Zha, Qingming Huang
In this paper, we propose a state relabeling adversarial active learning model (SRAAL), that leverages both the annotation and the labeled/unlabeled state information for deriving the most informative unlabeled samples.
no code implementations • CVPR 2020 • Yizhou Zhou, Xiaoyan Sun, Chong Luo, Zheng-Jun Zha, Wen-Jun Zeng
Based on the probability space, we further generate new fusion strategies which achieve the state-of-the-art performance on four well-known action recognition datasets.
1 code implementation • CVPR 2020 • Dechao Meng, Liang Li, Xuejing Liu, Yadong Li, Shijie Yang, Zheng-Jun Zha, Xingyu Gao, Shuhui Wang, Qingming Huang
Vehicle Re-Identification aims to find images of the same vehicle from various views in the cross-camera scenario.
1 code implementation • CVPR 2020 • Dan Guo, Hui Wang, Hanwang Zhang, Zheng-Jun Zha, Meng Wang
Visual dialog is a challenging task that requires the comprehension of the semantic dependencies among implicit visual and textual contexts.
Ranked #12 on Visual Dialog on VisDial v0.9 val
1 code implementation • 30 Mar 2020 • Shaobo Min, Hantao Yao, Hongtao Xie, Zheng-Jun Zha, Yongdong Zhang
In this paper, we propose an efficient Multi-Objective Matrix Normalization (MOMN) method that can simultaneously normalize a bilinear representation in terms of square-root, low-rank, and sparsity.
1 code implementation • CVPR 2020 • Shaobo Min, Hantao Yao, Hongtao Xie, Chaoqun Wang, Zheng-Jun Zha, Yongdong Zhang
Recent methods focus on learning a unified semantic-aligned visual representation to transfer knowledge between two domains, while ignoring the effect of semantic-free visual representation in alleviating the biased recognition problem.
no code implementations • CVPR 2020 • Ziqi Zhang, Yaya Shi, Chunfeng Yuan, Bing Li, Peijin Wang, Weiming Hu, Zheng-Jun Zha
In this paper, we propose a complete video captioning system including both a novel model and an effective training strategy.
Ranked #9 on Video Captioning on VATEX (using extra training data)
no code implementations • 17 Dec 2019 • Zhao Zhang, Yulin Sun, Yang Wang, Zheng-Jun Zha, Shuicheng Yan, Meng Wang
To address this issue, we propose a novel generalized end-to-end representation learning architecture, dubbed Convolutional Dictionary Pair Learning Network (CDPL-Net) in this paper, which integrates the learning schemes of the CNN and dictionary pair learning into a unified framework.
no code implementations • 13 Dec 2019 • Jialing Lyu, Weichao Qiu, Xinyue Wei, Yi Zhang, Alan Yuille, Zheng-Jun Zha
This can explain why an activity classification model usually fails to generalize to datasets it is not trained on.
no code implementations • 13 Dec 2019 • Yan Zhang, Zhao Zhang, Zheng Zhang, Mingbo Zhao, Li Zhang, Zheng-Jun Zha, Meng Wang
In this paper, we investigate the unsupervised deep representation learning issue and technically propose a novel framework called Deep Self-representative Concept Factorization Network (DSCF-Net), for clustering deep features.
1 code implementation • NeurIPS 2019 • Kecheng Zheng, Zheng-Jun Zha, Wei Wei
Abstraction reasoning is a long-standing challenge in artificial intelligence.
no code implementations • 26 Nov 2019 • Yang Wang, Yang Cao, Zheng-Jun Zha, Jing Zhang, Zhiwei Xiong, Wei Zhang, Feng Wu
Contrast enhancement and noise removal are coupled problems for low-light image enhancement.
1 code implementation • NeurIPS 2019 • Heliang Zheng, Jianlong Fu, Zheng-Jun Zha, Jiebo Luo
However, the computational cost to learn pairwise interactions between deep feature channels is prohibitively expensive, which restricts the use of this powerful transformation in deep neural networks.
no code implementations • 20 Oct 2019 • Yuhang Li, Xuejin Chen, Feng Wu, Zheng-Jun Zha
The large-scale discriminator enforces the completeness of global structures and the small-scale discriminator encourages fine details, thereby enhancing the realism of generated face images.
1 code implementation • 5 Sep 2019 • Xuejing Liu, Liang Li, Shuhui Wang, Zheng-Jun Zha, Li Su, Qingming Huang
Weakly supervised referring expression grounding (REG) aims at localizing the referential entity in an image according to a linguistic query, where the mapping between the image region (proposal) and the query is unknown in the training stage.
1 code implementation • ICCV 2019 • Xuejing Liu, Liang Li, Shuhui Wang, Zheng-Jun Zha, Dechao Meng, Qingming Huang
It builds the correspondence between image region proposal and query in an adaptive manner: adaptive grounding and collaborative reconstruction.
no code implementations • 21 Aug 2019 • Zhao Zhang, Lei Wang, Sheng Li, Yang Wang, Zheng Zhang, Zheng-Jun Zha, Meng Wang
Specifically, AS-LRC performs the latent decomposition of given data into a low-rank reconstruction by a block-diagonal codes matrix, a group sparse locality-adaptive salient feature part and a sparse error part.
1 code implementation • 12 Aug 2019 • Shaobo Min, Hantao Yao, Hongtao Xie, Zheng-Jun Zha, Yongdong Zhang
In contrast to previous methods, the DSEN decomposes the domain-shared projection function into one domain-invariant and two domain-specific sub-functions to explore the similarities and differences between two domains.
no code implementations • 4 Aug 2019 • Zhao Zhang, Jiahuan Ren, Sheng Li, Richang Hong, Zheng-Jun Zha, Meng Wang
Leveraging the Frobenius-norm based latent low-rank representation model, rBDLR jointly learns the coding coefficients and salient features, and improves the results by enhancing the robustness to outliers and errors in given data, preserving local information of salient features adaptively and ensuring the block-diagonal structures of the coefficients.
1 code implementation • 13 Jul 2019 • Xiaotian Chen, Xuejin Chen, Zheng-Jun Zha
We propose a Residual Pyramid Decoder (RPD) which expresses global scene structure in upper levels to represent layouts, and local structure in lower levels to present shape details.
Ranked #55 on Monocular Depth Estimation on NYU-Depth V2
no code implementations • ACL 2019 • Jianxing Yu, Zheng-Jun Zha, Jian Yin
This paper focuses on the topic of inferential machine comprehension, which aims to fully understand the meaning of a given text in order to answer generic questions, especially those requiring reasoning skills.
1 code implementation • 23 Jun 2019 • Yizhou Zhou, Xiaoyan Sun, Chong Luo, Zheng-Jun Zha, Wen-Jun Zeng
Accordingly, a hybrid network representation is presented which enables us to leverage the Variational Dropout so that the approximation of the posterior distribution becomes fully gradient-based and highly efficient.
no code implementations • 9 Jun 2019 • Daqing Liu, Hanwang Zhang, Zheng-Jun Zha, Meng Wang, Qianru Sun
In this paper, we alleviate the missing-annotation problem and enable the joint reasoning by leveraging the language scene graph which covers both labeled referent and unlabeled contexts (other objects, attributes, and relationships).
1 code implementation • 6 Jun 2019 • Zheng-Jun Zha, Daqing Liu, Hanwang Zhang, Yongdong Zhang, Feng Wu
With the maturity of visual detection techniques, we are more ambitious in describing visual content with open-vocabulary, fine-grained and free-form language, i.e., the task of image captioning.
no code implementations • 16 May 2019 • Kai Zhu, Wei Zhai, Zheng-Jun Zha, Yang Cao
In this paper, we tackle one-shot texture retrieval: given an example of a new reference texture, detect and segment all the pixels of the same texture category within an arbitrary image.
no code implementations • 8 May 2019 • Liang Sun, Bing Li, Chunfeng Yuan, Zheng-Jun Zha, Weiming Hu
Inspired by the fact that different modalities in videos carry complementary information, we propose a Multimodal Semantic Attention Network (MSAN), which is a new encoder-decoder framework incorporating multimodal semantic attributes for video captioning.
1 code implementation • CVPR 2019 • Chang Chen, Zhiwei Xiong, Xinmei Tian, Zheng-Jun Zha, Feng Wu
Existing methods for single image super-resolution (SR) are typically evaluated with synthetic degradation models such as bicubic or Gaussian downsampling.
1 code implementation • CVPR 2019 • Heliang Zheng, Jianlong Fu, Zheng-Jun Zha, Jiebo Luo
Learning subtle yet discriminative features (e.g., beak and eyes for a bird) plays a significant role in fine-grained image recognition.
Ranked #1 on Fine-Grained Image Classification on iNaturalist
Fine-Grained Image Classification • Fine-Grained Image Recognition
no code implementations • ICCV 2019 • Tianhao Yang, Zheng-Jun Zha, Hanwang Zhang
We study the multi-round response generation in visual dialog, where a response is generated according to a visually grounded conversational history.
Ranked #10 on Visual Dialog on VisDial v0.9 val
no code implementations • ICCV 2019 • Daqing Liu, Hanwang Zhang, Feng Wu, Zheng-Jun Zha
In particular, we develop a novel modular network called Neural Module Tree network (NMTree) that regularizes the visual grounding along the dependency parsing tree of the sentence, where each node is a neural module that calculates visual attention according to its linguistic feature, and the grounding score is accumulated in a bottom-up direction as needed.
no code implementations • 19 Nov 2018 • Jiawei Liu, Zheng-Jun Zha, Hongtao Xie, Zhiwei Xiong, Yongdong Zhang
An appearance network is developed to learn appearance features from the full body, horizontal and vertical body parts of pedestrians with spatial dependencies among body parts.
no code implementations • ECCV 2018 • Jiafan Zhuang, Saihui Hou, Zilei Wang, Zheng-Jun Zha
License plate recognition (LPR) is a fundamental component of various intelligent transport systems and is expected to be both accurate and efficient.
1 code implementation • 16 Aug 2018 • Daqing Liu, Zheng-Jun Zha, Hanwang Zhang, Yongdong Zhang, Feng Wu
To fill the gap, we propose a Context-Aware Visual Policy network (CAVP) for sequence-level image captioning.
no code implementations • 31 Jul 2018 • Shaobo Min, Xuejin Chen, Zheng-Jun Zha, Feng Wu, Yongdong Zhang
Learning-based methods suffer from a deficiency of clean annotations, especially in biomedical segmentation.
no code implementations • CVPR 2018 • Yizhou Zhou, Xiaoyan Sun, Zheng-Jun Zha, Wen-Jun Zeng
Recent attempts use 3D convolutional neural networks (CNNs) to explore spatio-temporal information for human action recognition.
1 code implementation • 28 Feb 2018 • Dong Liu, Ke Sun, Zhangyang Wang, Runsheng Liu, Zheng-Jun Zha
We propose an interpretable deep structure namely Frank-Wolfe Network (F-W Net), whose architecture is inspired by unrolling and truncating the Frank-Wolfe algorithm for solving an $L_p$-norm constrained problem with $p\geq 1$.
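For readers unfamiliar with the algorithm the entry above refers to, the following is a minimal sketch of the classic Frank-Wolfe iteration truncated to a fixed number of steps. It is not the F-W Net architecture from the paper: the quadratic objective, the restriction to $p = 1$, the step-size schedule $2/(k+2)$, and the function names `lmo_l1` and `frank_wolfe_l1` are all assumptions chosen for illustration.

```python
def lmo_l1(grad, radius):
    """Linear minimization oracle for the L1 ball.

    Returns argmin over ||s||_1 <= radius of <grad, s>, which is a signed
    vertex of the ball along the coordinate with the largest |gradient|.
    """
    i = max(range(len(grad)), key=lambda j: abs(grad[j]))
    s = [0.0] * len(grad)
    if grad[i] != 0.0:
        s[i] = -radius if grad[i] > 0 else radius
    return s


def frank_wolfe_l1(b, radius, num_steps):
    """Run num_steps Frank-Wolfe iterations on
    min 0.5 * ||x - b||^2  subject to  ||x||_1 <= radius.

    The objective is an illustrative assumption; stopping after a fixed
    num_steps mirrors the idea of truncating the algorithm.
    """
    x = [0.0] * len(b)  # the origin is always feasible
    for k in range(num_steps):
        grad = [xi - bi for xi, bi in zip(x, b)]  # gradient of the quadratic
        s = lmo_l1(grad, radius)                  # best feasible vertex
        gamma = 2.0 / (k + 2.0)                   # standard step size
        # Convex combination of feasible points, so x stays feasible.
        x = [(1.0 - gamma) * xi + gamma * si for xi, si in zip(x, s)]
    return x


if __name__ == "__main__":
    print(frank_wolfe_l1([0.5, -2.0, 1.0], radius=1.5, num_steps=50))
```

Each iteration only needs a gradient and a linear minimization over the constraint set, with no projection step, which is what makes the loop easy to unroll into a fixed number of layers.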
no code implementations • 21 Feb 2017 • Wei Zhang, Shengnan Hu, Kan Liu, Zheng-Jun Zha
This paper presents a novel approach for video-based person re-identification using multiple Convolutional Neural Networks (CNNs).
no code implementations • CVPR 2016 • Chenyi Lei, Dong Liu, Weiping Li, Zheng-Jun Zha, Houqiang Li
In many image-related tasks, learning expressive and discriminative representations of images is essential, and deep learning has been studied for automating the learning of such representations.