1 code implementation • 18 Dec 2024 • Jiankun Zhu, Sicheng Zhao, Jing Jiang, Wenbo Tang, Zhaopan Xu, Tingting Han, Pengfei Xu, Hongxun Yao
To reduce reliance on data labeling, domain adaptation offers an alternative solution by adapting models trained on labeled source data to unlabeled target data.
1 code implementation • International Journal of Computer Vision (IJCV) 2024 • Xianzhu Liu, Haozhe Xie, Shengping Zhang, Hongxun Yao, Rongrong Ji, Liqiang Nie, DaCheng Tao
Semantic scene completion (SSC) aims to simultaneously perform scene completion (SC) and predict semantic categories of a 3D scene from a single depth and/or RGB image.
Ranked #1 on 3D Semantic Scene Completion on NYUv2
no code implementations • 5 Sep 2024 • Xi Chen, Haosen Yang, Sheng Jin, Xiatian Zhu, Hongxun Yao
To fully exploit pre-trained knowledge while minimizing training overhead, we freeze both foundation models and focus optimization solely on a lightweight transformer decoder for mask proposal generation, which is the performance bottleneck.
1 code implementation • 29 Aug 2024 • Jing Jiang, Sicheng Zhao, Jiankun Zhu, Wenbo Tang, Zhaopan Xu, Jidong Yang, Guoping Liu, Tengfei Xing, Pengfei Xu, Hongxun Yao
To address these two issues, we propose a novel framework, Deformation Transform Aligner for Panoramic Semantic Segmentation (DTA4PASS), which converts all pinhole images in the source domains into distorted images and aligns the source distorted and panoramic images with the target panoramic images.
1 code implementation • CVPR 2024 • Huicong Zhang, Haozhe Xie, Hongxun Yao
Specifically, BSSTNet (1) uses a longer temporal window in the transformer, leveraging information from more distant frames to restore the blurry pixels in the current frame.
Ranked #1 on Deblurring on DVD
1 code implementation • 28 May 2024 • Ziheng Qin, Zhaopan Xu, Yukun Zhou, Zangwei Zheng, Zebang Cheng, Hao Tang, Lei Shang, Baigui Sun, Xiaojiang Peng, Radu Timofte, Hongxun Yao, Kai Wang, Yang You
To tackle this challenge, we propose InfoGrowth, an efficient online algorithm for data cleaning and selection, resulting in a growing dataset that keeps up to date with awareness of cleanliness and diversity.
1 code implementation • 17 Mar 2024 • Xi Chen, Haosen Yang, Huicong Zhang, Hongxun Yao, Xiatian Zhu
Source-free unsupervised domain adaptation (SFUDA) aims to enable the utilization of a pre-trained source model in an unlabeled target domain without access to source data.
no code implementations • CVPR 2024 • Tingting Zheng, Kui Jiang, Hongxun Yao
It involves instance sampling, feature representation, and decision-making.
1 code implementation • 3 Mar 2024 • Qinglin Liu, Shengping Zhang, Quanling Meng, Bineng Zhong, Peiqiang Liu, Hongxun Yao
Finally, an instance matting network decodes the image features and united semantics guidance to predict all instance-level alpha mattes.
no code implementations • 7 Dec 2023 • Yufan Chen, Lizhen Wang, Qijing Li, Hongjiang Xiao, Shengping Zhang, Hongxun Yao, Yebin Liu
In response to these challenges, we propose MonoGaussianAvatar (Monocular Gaussian Point-based Head Avatar), a novel approach that harnesses 3D Gaussian point representation coupled with a Gaussian deformation field to learn explicit head avatars from monocular portrait videos.
1 code implementation • CVPR 2023 • Shuo Yang, Zhaopan Xu, Kai Wang, Yang You, Hongxun Yao, Tongliang Liu, Min Xu
As one of the most fundamental techniques in multimodal learning, cross-modal matching aims to project various sensory modalities into a shared feature space.
Cross-modal retrieval with noisy correspondence
Image-text matching
no code implementations • 26 Feb 2023 • Wei Yu, Kuiyuan Yang, Yalong Bai, Hongxun Yao, Yong Rui
The image and query are mapped to a common vector space via these two parts respectively, and image-query similarity is naturally defined as an inner product of their mappings in the space.
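The inner-product similarity described above can be illustrated with a minimal sketch. The listing does not say what form the two mapping parts take, so the linear maps here (`image_weights`, `query_weights`) and the helper names are assumptions made only for illustration, not the authors' model.

```python
def linear_map(weights, vector):
    """Apply a matrix (list of rows) to a vector; a stand-in for a learned mapping."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def image_query_similarity(image_features, query_features,
                           image_weights, query_weights):
    """Map image and query into a common space, then take their inner product.

    Both weight matrices are hypothetical placeholders for the two parts
    of the model; they must produce embeddings of the same length.
    """
    image_embedding = linear_map(image_weights, image_features)
    query_embedding = linear_map(query_weights, query_features)
    return sum(a * b for a, b in zip(image_embedding, query_embedding))


# Usage with identity mappings, so the score is the plain dot product.
identity = [[1.0, 0.0], [0.0, 1.0]]
score = image_query_similarity([1.0, 2.0], [3.0, 4.0], identity, identity)
```

With identity mappings the score reduces to the dot product of the raw feature vectors; in a trained model the two mappings would be learned so that matching image-query pairs score higher.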
1 code implementation • 9 Oct 2022 • Haosen Yang, Deng Huang, Bin Wen, Jiannan Wu, Hongxun Yao, Yi Jiang, Xiatian Zhu, Zehuan Yuan
As a result, our model can effectively extract both static appearance and dynamic motion, leading to superior spatiotemporal representation learning capability.
1 code implementation • 22 Jul 2022 • Huicong Zhang, Haozhe Xie, Hongxun Yao
The key success factor of the video deblurring methods is to compensate for the blurry pixels of the mid-frame with the sharp pixels of the adjacent video frames.
Ranked #6 on Deblurring on DVD
1 code implementation • 15 Dec 2021 • Haosen Yang, Wenhao Wu, Lining Wang, Sheng Jin, Boyang Xia, Hongxun Yao, Hujie Huang
To evaluate the confidence of proposals, existing works typically predict an action score for each proposal, supervised by the temporal Intersection-over-Union (tIoU) between the proposal and the ground truth.
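As a rough illustration of the tIoU quantity mentioned above, the sketch below computes it for two temporal segments. The `(start, end)` tuple representation and the function name `temporal_iou` are assumptions for this example; the snippet does not show how the paper itself represents proposals.

```python
def temporal_iou(proposal, ground_truth):
    """Temporal IoU between two (start, end) segments on the same time axis."""
    p_start, p_end = proposal
    g_start, g_end = ground_truth
    # Overlap length is zero when the segments do not intersect.
    intersection = max(0.0, min(p_end, g_end) - max(p_start, g_start))
    union = (p_end - p_start) + (g_end - g_start) - intersection
    # Guard against degenerate zero-length segments.
    return intersection / union if union > 0 else 0.0


# A proposal covering 0-10 s against a ground-truth action at 5-15 s.
overlap = temporal_iou((0.0, 10.0), (5.0, 15.0))
```

Here the segments overlap for 5 s and span 15 s in total, so the tIoU is one third; a proposal identical to the ground truth scores 1, and disjoint segments score 0.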
no code implementations • 25 May 2021 • Lining Wang, Haosen Yang, Wenhao Wu, Hongxun Yao, Hujie Huang
Conventionally, the temporal action proposal generation (TAPG) task is divided into two main sub-tasks: boundary prediction and proposal confidence prediction, which rely on the frame-level dependencies and proposal-level relationships separately.
1 code implementation • CVPR 2021 • Haozhe Xie, Hongxun Yao, Shangchen Zhou, Shengping Zhang, Wenxiu Sun
For the current query frame, the query regions are tracked and predicted based on the optical flow estimated from the previous frame.
3 code implementations • 22 Jun 2020 • Haozhe Xie, Hongxun Yao, Shengping Zhang, Shangchen Zhou, Wenxiu Sun
A multi-scale context-aware fusion module is then introduced to adaptively select high-quality reconstructions for different parts from all coarse 3D volumes to obtain a fused 3D volume.
Ranked #3 on 3D Object Reconstruction on Data3D−R2N2
1 code implementation • ECCV 2020 • Haozhe Xie, Hongxun Yao, Shangchen Zhou, Jiageng Mao, Shengping Zhang, Wenxiu Sun
In particular, we devise two novel differentiable layers, named Gridding and Gridding Reverse, to convert between point clouds and 3D grids without losing structural information.
Ranked #1 on Point Cloud Completion on Completion3D
no code implementations • 18 Mar 2020 • Xinjie Feng, Hongxun Yao, Yuankai Qi, Jun Zhang, Shengping Zhang
Different from previous transformer-based models [56, 34], which just use the decoder of the transformer to decode the convolutional attention, the proposed method uses convolutional feature maps as the word embedding input to the transformer.
no code implementations • 8 Mar 2020 • Shuo Yang, Wei Yu, Ying Zheng, Hongxun Yao, Tao Mei
To solve this new problem, we propose a hierarchical adaptive semantic-visual tree (ASVT) to depict the architecture of merchandise categories, which evaluates semantic similarities between different semantic levels and visual similarities within the same semantic class simultaneously.
no code implementations • 20 Nov 2019 • Sheng Jin, Shangchen Zhou, Yao Liu, Chao Chen, Xiaoshuai Sun, Hongxun Yao, Xian-Sheng Hua
In this paper, we propose a novel Semi-supervised Self-pace Adversarial Hashing method, named SSAH, to solve the above problems in a unified framework.
1 code implementation • 18 Oct 2019 • Haozhe Xie, Hongxun Yao, Shangchen Zhou, Shengping Zhang, Xiaoshuai Sun, Wenxiu Sun
Inferring the 3D shape of an object from an RGB image has shown impressive results; however, existing methods rely primarily on recognizing the most similar 3D model from the training set to solve the problem.
no code implementations • 14 Oct 2019 • Ying Zheng, Hongxun Yao, Xiaoshuai Sun
First, we propose a homogeneous transformation method to address the problem of domain adaptation.
no code implementations • 14 Oct 2019 • Ying Zheng, Hongxun Yao, Xiaoshuai Sun, Shengping Zhang, Sicheng Zhao, Fatih Porikli
Conventional methods for this task often rely on the availability of the temporal order of sketch strokes, additional cues acquired from different modalities, and supervised augmentation of sketch datasets with real images, which limits the applicability and feasibility of these methods in real scenarios.
5 code implementations • ICCV 2019 • Haozhe Xie, Hongxun Yao, Xiaoshuai Sun, Shangchen Zhou, Shengping Zhang
Then, a context-aware fusion module is introduced to adaptively select high-quality reconstructions for each part (e.g., table legs) from different coarse 3D volumes to obtain a fused 3D volume.
Ranked #4 on 3D Object Reconstruction on Data3D−R2N2
no code implementations • 4 Jul 2018 • Sheng Jin, Hongxun Yao, Xiaoshuai Sun, Shangchen Zhou, Lei Zhang, Xian-Sheng Hua
As the core of DSaH, the saliency loss guides the attention network to mine discriminative regions from pairs of images.
no code implementations • ICCV 2017 • Xin Sun, Ngai-Man Cheung, Hongxun Yao, Yiluan Guo
Part-based trackers are effective in exploiting local details of the target object for robust tracking.
no code implementations • CVPR 2016 • Yuankai Qi, Shengping Zhang, Lei Qin, Hongxun Yao, Qingming Huang, Jongwoo Lim, Ming-Hsuan Yang
In recent years, several methods have been developed to utilize hierarchical features learned from a deep convolutional neural network (CNN) for visual tracking.
no code implementations • 20 Dec 2014 • Wei Yu, Kuiyuan Yang, Yalong Bai, Hongxun Yao, Yong Rui
Convolutional Neural Networks (CNNs) have achieved error rates comparable to those of well-trained humans on the ILSVRC2014 image classification task.
no code implementations • CVPR 2013 • Xiaoshuai Sun, Xin-Jing Wang, Hongxun Yao, Lei Zhang
In this paper, we propose a computational model of visual representativeness by integrating cognitive theories of representativeness heuristics with computer vision and machine learning techniques.