Search Results for author: Hongxun Yao

Found 25 papers, 11 papers with code

Uncertainty-Aware Pseudo-Label Filtering for Source-Free Unsupervised Domain Adaptation

1 code implementation • 17 Mar 2024 • Xi Chen, Haosen Yang, Huicong Zhang, Hongxun Yao, Xiatian Zhu

Source-free unsupervised domain adaptation (SFUDA) aims to enable the utilization of a pre-trained source model in an unlabeled target domain without access to source data.

Contrastive Learning, Memorization, +3 more

Dynamic Policy-Driven Adaptive Multi-Instance Learning for Whole Slide Image Classification

no code implementations • 9 Mar 2024 • Tingting Zheng, Kui Jiang, Hongxun Yao

It involves instance sampling, feature representation, and decision-making.

Contrastive Learning, Decision Making, +1 more

End-to-End Human Instance Matting

1 code implementation • 3 Mar 2024 • Qinglin Liu, Shengping Zhang, Quanling Meng, Bineng Zhong, Peiqiang Liu, Hongxun Yao

Finally, an instance matting network decodes the image features and united semantics guidance to predict all instance-level alpha mattes.

Image Matting, Instance Segmentation, +1 more

MonoGaussianAvatar: Monocular Gaussian Point-based Head Avatar

no code implementations • 7 Dec 2023 • Yufan Chen, Lizhen Wang, Qijing Li, Hongjiang Xiao, Shengping Zhang, Hongxun Yao, Yebin Liu

In response to these challenges, we propose MonoGaussianAvatar (Monocular Gaussian Point-based Head Avatar), a novel approach that harnesses 3D Gaussian point representation coupled with a Gaussian deformation field to learn explicit head avatars from monocular portrait videos.

BiCro: Noisy Correspondence Rectification for Multi-modality Data via Bi-directional Cross-modal Similarity Consistency

1 code implementation • CVPR 2023 • Shuo Yang, Zhaopan Xu, Kai Wang, Yang You, Hongxun Yao, Tongliang Liu, Min Xu

As one of the most fundamental techniques in multimodal learning, cross-modal matching aims to project various sensory modalities into a shared feature space.

Image-text matching, Text Matching

Learning cross space mapping via DNN using large scale click-through logs

no code implementations • 26 Feb 2023 • Wei Yu, Kuiyuan Yang, Yalong Bai, Hongxun Yao, Yong Rui

The image and query are mapped to a common vector space via these two parts respectively, and image-query similarity is naturally defined as an inner product of their mappings in the space.

Image Classification, Image Retrieval, +1 more
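
As a minimal sketch of the similarity described in this entry, assuming the image and the query have already been mapped into the common vector space, the inner product can be used to rank images for a query. The vectors and the `rank_images` helper below are hypothetical illustrations and are not taken from the paper; the DNN mappings themselves are not shown.

```python
def inner_product(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rank_images(query_vec, image_vecs):
    """Return image names sorted by decreasing inner-product similarity.

    query_vec:  the query embedding (assumed already mapped to the shared space).
    image_vecs: dict of image name -> image embedding in the same space.
    """
    scores = {name: inner_product(query_vec, vec)
              for name, vec in image_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical 2-D embeddings, for illustration only.
ranking = rank_images([1.0, 0.0], {"a": [0.9, 0.1], "b": [0.1, 0.9]})
```

Here `ranking` places image `a` before image `b`, because its embedding has the larger inner product with the query vector.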

Self-supervised Video Representation Learning with Motion-Aware Masked Autoencoders

1 code implementation • 9 Oct 2022 • Haosen Yang, Deng Huang, Bin Wen, Jiannan Wu, Hongxun Yao, Yi Jiang, Xiatian Zhu, Zehuan Yuan

As a result, our model can effectively extract both static appearance and dynamic motion spontaneously, leading to superior spatiotemporal representation learning capability.

Representation Learning, Semantic Segmentation, +2 more

Spatio-Temporal Deformable Attention Network for Video Deblurring

1 code implementation • 22 Jul 2022 • Huicong Zhang, Haozhe Xie, Hongxun Yao

The key success factor of the video deblurring methods is to compensate for the blurry pixels of the mid-frame with the sharp pixels of the adjacent video frames.

Deblurring

Temporal Action Proposal Generation with Background Constraint

1 code implementation • 15 Dec 2021 • Haosen Yang, Wenhao Wu, Lining Wang, Sheng Jin, Boyang Xia, Hongxun Yao, Hujie Huang

To evaluate the confidence of proposals, existing works typically predict action scores for proposals, supervised by the temporal Intersection-over-Union (tIoU) between each proposal and the ground truth.

Temporal Action Proposal Generation
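
For readers unfamiliar with tIoU, the following is a minimal sketch of the standard temporal Intersection-over-Union between two segments. It assumes each segment is a `(start, end)` pair in seconds; the `temporal_iou` name is a hypothetical helper and is not taken from the paper's code.

```python
def temporal_iou(proposal, ground_truth):
    """tIoU of two temporal segments given as (start, end) pairs."""
    p_start, p_end = proposal
    g_start, g_end = ground_truth
    # Length of the overlap; zero when the segments do not intersect.
    intersection = max(0.0, min(p_end, g_end) - max(p_start, g_start))
    # Union = sum of both lengths minus the overlap counted twice.
    union = (p_end - p_start) + (g_end - g_start) - intersection
    return intersection / union if union > 0 else 0.0


# A proposal covering 0-10 s against a ground-truth action at 5-15 s.
score = temporal_iou((0.0, 10.0), (5.0, 15.0))
```

In this example the segments overlap for 5 s out of a 15 s union, so `score` is one third.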

Temporal Action Proposal Generation with Transformers

no code implementations • 25 May 2021 • Lining Wang, Haosen Yang, Wenhao Wu, Hongxun Yao, Hujie Huang

Conventionally, the temporal action proposal generation (TAPG) task is divided into two main sub-tasks: boundary prediction and proposal confidence prediction, which rely on the frame-level dependencies and proposal-level relationships separately.

Temporal Action Proposal Generation

Efficient Regional Memory Network for Video Object Segmentation

1 code implementation • CVPR 2021 • Haozhe Xie, Hongxun Yao, Shangchen Zhou, Shengping Zhang, Wenxiu Sun

For the current query frame, the query regions are tracked and predicted based on the optical flow estimated from the previous frame.

Ranked #9 on Semi-Supervised Video Object Segmentation on DAVIS (no YouTube-VOS training)

Object, One-shot visual object segmentation, +3 more

Pix2Vox++: Multi-scale Context-aware 3D Object Reconstruction from Single and Multiple Images

3 code implementations • 22 Jun 2020 • Haozhe Xie, Hongxun Yao, Shengping Zhang, Shangchen Zhou, Wenxiu Sun

A multi-scale context-aware fusion module is then introduced to adaptively select high-quality reconstructions for different parts from all coarse 3D volumes to obtain a fused 3D volume.

Ranked #3 on 3D Object Reconstruction on Data3D−R2N2

3D Object Reconstruction

GRNet: Gridding Residual Network for Dense Point Cloud Completion

1 code implementation • ECCV 2020 • Haozhe Xie, Hongxun Yao, Shangchen Zhou, Jiageng Mao, Shengping Zhang, Wenxiu Sun

In particular, we devise two novel differentiable layers, named Gridding and Gridding Reverse, to convert between point clouds and 3D grids without losing structural information.

Ranked #3 on Point Cloud Completion on Completion3D

Point Cloud Completion

Scene Text Recognition via Transformer

no code implementations • 18 Mar 2020 • Xinjie Feng, Hongxun Yao, Yuankai Qi, Jun Zhang, Shengping Zhang

Different from previous transformer-based models [56, 34], which just use the decoder of the transformer to decode the convolutional attention, the proposed method uses convolutional feature maps as word embedding input to the transformer.

Scene Text Recognition

Adaptive Semantic-Visual Tree for Hierarchical Embeddings

no code implementations • 8 Mar 2020 • Shuo Yang, Wei Yu, Ying Zheng, Hongxun Yao, Tao Mei

To solve this new problem, we propose a hierarchical adaptive semantic-visual tree (ASVT) to depict the architecture of merchandise categories, which evaluates semantic similarities between different semantic levels and visual similarities within the same semantic class simultaneously.

Image Retrieval, Retrieval

SSAH: Semi-supervised Adversarial Deep Hashing with Self-paced Hard Sample Generation

no code implementations • 20 Nov 2019 • Sheng Jin, Shangchen Zhou, Yao Liu, Chao Chen, Xiaoshuai Sun, Hongxun Yao, Xian-Sheng Hua

In this paper, we propose a novel Semi-supervised Self-pace Adversarial Hashing method, named SSAH, to solve the above problems in a unified framework.

Deep Hashing, Generative Adversarial Network

Toward 3D Object Reconstruction from Stereo Images

1 code implementation • 18 Oct 2019 • Haozhe Xie, Hongxun Yao, Shangchen Zhou, Shengping Zhang, Xiaoshuai Sun, Wenxiu Sun

Inferring the 3D shape of an object from an RGB image has shown impressive results; however, existing methods rely primarily on recognizing the most similar 3D model from the training set to solve the problem.

3D Object Reconstruction, Benchmarking, +1 more

Sketch-Specific Data Augmentation for Freehand Sketch Recognition

no code implementations • 14 Oct 2019 • Ying Zheng, Hongxun Yao, Xiaoshuai Sun, Shengping Zhang, Sicheng Zhao, Fatih Porikli

Conventional methods for this task often rely on the availability of the temporal order of sketch strokes, additional cues acquired from different modalities and supervised augmentation of sketch datasets with real images, which also limit the applicability and feasibility of these methods in real scenarios.

Data Augmentation, Retrieval, +2 more

Deep Semantic Parsing of Freehand Sketches with Homogeneous Transformation, Soft-Weighted Loss, and Staged Learning

no code implementations • 14 Oct 2019 • Ying Zheng, Hongxun Yao, Xiaoshuai Sun

First, we propose a homogeneous transformation method to address the problem of domain adaptation.

Domain Adaptation, Retrieval, +2 more

Pix2Vox: Context-aware 3D Reconstruction from Single and Multi-view Images

5 code implementations • ICCV 2019 • Haozhe Xie, Hongxun Yao, Xiaoshuai Sun, Shangchen Zhou, Shengping Zhang

Then, a context-aware fusion module is introduced to adaptively select high-quality reconstructions for each part (e.g., table legs) from different coarse 3D volumes to obtain a fused 3D volume.

Ranked #4 on 3D Object Reconstruction on Data3D−R2N2

3D Object Reconstruction, 3D Reconstruction, +1 more

Deep Saliency Hashing

no code implementations • 4 Jul 2018 • Sheng Jin, Hongxun Yao, Xiaoshuai Sun, Shangchen Zhou, Lei Zhang, Xian-Sheng Hua

As the core of DSaH, the saliency loss guides the attention network to mine discriminative regions from pairs of images.

Deep Hashing, Quantization

Non-Rigid Object Tracking via Deformable Patches Using Shape-Preserved KCF and Level Sets

no code implementations • ICCV 2017 • Xin Sun, Ngai-Man Cheung, Hongxun Yao, Yiluan Guo

Part-based trackers are effective in exploiting local details of the target object for robust tracking.

Object Tracking

Hedged Deep Tracking

no code implementations • CVPR 2016 • Yuankai Qi, Shengping Zhang, Lei Qin, Hongxun Yao, Qingming Huang, Jongwoo Lim, Ming-Hsuan Yang

In recent years, several methods have been developed to utilize hierarchical features learned from a deep convolutional neural network (CNN) for visual tracking.

Visual Tracking

Visualizing and Comparing Convolutional Neural Networks

no code implementations • 20 Dec 2014 • Wei Yu, Kuiyuan Yang, Yalong Bai, Hongxun Yao, Yong Rui

Convolutional Neural Networks (CNNs) have achieved error rates comparable to those of well-trained humans on the ILSVRC2014 image classification task.

Classification, General Classification, +1 more

Exploring Implicit Image Statistics for Visual Representativeness Modeling

no code implementations • CVPR 2013 • Xiaoshuai Sun, Xin-Jing Wang, Hongxun Yao, Lei Zhang

In this paper, we propose a computational model of visual representativeness by integrating cognitive theories of representativeness heuristics with computer vision and machine learning techniques.

Image Retrieval
