no code implementations • 30 Mar 2023 • Binbin Li, Xinyu Du, Yao Hu, Hao Yu, Wende Zhang
Online camera-to-ground calibration generates a non-rigid body transformation between the camera and the road surface in real time.
1 code implementation • 23 Jan 2023 • Keyan Chen, XiaoLong Jiang, Yao Hu, Xu Tang, Yan Gao, Jianqi Chen, Weidi Xie
In this paper, we consider the problem of simultaneously detecting objects and inferring their visual attributes in an image, even for those with no manual annotations provided at the training stage, resembling an open-vocabulary scenario.
Ranked #1 on Open Vocabulary Attribute Detection on OVAD benchmark (using extra training data)
no code implementations • 1 Apr 2022 • Baiqi Cui, Shaohui Zhang, Yechao Wang, Yao Hu, Qun Hao
Fourier ptychography (FP), as a computational imaging method, is a powerful tool to improve imaging resolution.
no code implementations • 4 Mar 2022 • Guocheng Zhou, Shaohui Zhang, Yao Hu, Lei Cao, Yong Huang, Qun Hao
Fourier ptychography has attracted wide attention for its large space-bandwidth product and its capability for quantitative phase measurement.
no code implementations • 2 Feb 2022 • Yan Gao, Qimeng Wang, Xu Tang, Haochen Wang, Fei Ding, Jing Li, Yao Hu
Prior works propose to predict the Intersection-over-Union (IoU) between bounding boxes and their corresponding ground truths to improve NMS, but accurately predicting IoU remains a challenging problem.
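For readers unfamiliar with the quantity being predicted, the following is a minimal sketch of how IoU is computed for two axis-aligned boxes. The function name `box_iou` and the `(x1, y1, x2, y2)` corner format are assumptions made for illustration; they are not taken from the paper.

```python
def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes, each given as (x1, y1, x2, y2).

    The corner format is an assumption of this sketch.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) boxes.
    return inter / union if union > 0 else 0.0
```

In a detector, this value would be computed between a predicted box and its matched ground-truth box; NMS variants that rely on a predicted IoU try to estimate it without access to the ground truth.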
no code implementations • 15 Nov 2021 • Jiyang Qi, Yan Gao, Yao Hu, Xinggang Wang, Xiaoyu Liu, Xiang Bai, Serge Belongie, Alan Yuille, Philip H. S. Torr, Song Bai
To promote the development of occlusion understanding, we collect a large-scale dataset called OVIS for video instance segmentation in the occluded scenario.
no code implementations • 28 Jul 2021 • Yao Hu, Guohua Geng, Kang Li, Wei Zhou, Xingxing Hao, Xin Cao
Then we present supervised segmentation and unsupervised reconstruction networks to learn the characteristics of 3D point clouds.
no code implementations • 26 Jul 2021 • Wentian Zhao, Yao Hu, HeDa Wang, Xinxiao Wu, Jiebo Luo
Entity-aware image captioning aims to describe named entities and events related to the image by utilizing the background knowledge in the associated article.
1 code implementation • 18 Jun 2021 • Xiaolong Liu, Qimeng Wang, Yao Hu, Xu Tang, Shiwei Zhang, Song Bai, Xiang Bai
Temporal action detection (TAD) aims to determine the semantic label and the temporal interval of every action instance in an untrimmed video.
Ranked #3 on Temporal Action Localization on HACS
1 code implementation • ICCV 2021 • Hao Fang, Daoxin Zhang, Yi Zhang, Minghao Chen, Jiawei Li, Yao Hu, Deng Cai, Xiaofei He
In this paper, we study the Salient Object Ranking (SOR) task, which aims to assign a ranking order to each detected object according to its visual saliency.
1 code implementation • 5 Jun 2021 • Pan Li, Maofei Que, Zhichao Jiang, Yao Hu, Alexander Tuzhilin
Classical recommender system methods typically face the filter bubble problem when users only receive recommendations of their familiar items, making them bored and dissatisfied.
1 code implementation • 5 Jun 2021 • Pan Li, Zhichao Jiang, Maofei Que, Yao Hu, Alexander Tuzhilin
While several cross domain sequential recommendation models have been proposed to leverage information from a source domain to improve CTR predictions in a target domain, they did not take into account bidirectional latent relations of user preferences across source-target domain pairs.
1 code implementation • CVPR 2021 • Haochen Wang, XiaoLong Jiang, Haibing Ren, Yao Hu, Song Bai
In this work we present SwiftNet for real-time semi-supervised video object segmentation (one-shot VOS), which reports 77.8% J&F and 70 FPS on the DAVIS 2017 validation dataset, leading all present solutions in overall accuracy and speed performance.
Semantic Segmentation
Semi-Supervised Video Object Segmentation
1 code implementation • 2 Feb 2021 • Jiyang Qi, Yan Gao, Yao Hu, Xinggang Wang, Xiaoyu Liu, Xiang Bai, Serge Belongie, Alan Yuille, Philip H. S. Torr, Song Bai
On the OVIS dataset, the highest AP achieved by state-of-the-art algorithms is only 16.3, which reveals that we are still at a nascent stage for understanding objects, instances, and videos in a real-world scenario.
Ranked #16 on Video Instance Segmentation on OVIS validation
1 code implementation • 11 Jan 2021 • Tun Zhu, Daoxin Zhang, Yao Hu, Tianran Wang, XiaoLong Jiang, Jianke Zhu, Jiawei Li
Alongside the prevalence of mobile videos, the general public leans towards consuming vertical videos on hand-held devices.
1 code implementation • CVPR 2021 • Xiaolong Liu, Yao Hu, Song Bai, Fei Ding, Xiang Bai, Philip H. S. Torr
Current developments in temporal event or action localization usually target actions captured by a single camera.
Ranked #2 on Temporal Action Localization on MUSES
1 code implementation • 15 Dec 2020 • Han Zhang, Wenhao Zheng, Charley Chen, Kevin Gao, Yao Hu, Ling Huang, Wei Xu
Meanwhile, such applications usually require modeling the intrinsic clusters in high-dimensional data, which usually displays heterogeneous statistical patterns as the patterns of different clusters may appear in different dimensions.
1 code implementation • 1 Dec 2020 • Yao Hu, Guohua Geng, Kang Li, Wei Zhou
Then we present supervised segmentation and unsupervised reconstruction networks to learn the characteristics of 3D point clouds.
1 code implementation • 15 Oct 2020 • Jia Guo, Minghao Chen, Yao Hu, Chen Zhu, Xiaofei He, Deng Cai
We investigate this problem by studying the gap of confidence between teacher and student.
no code implementations • 30 Jul 2020 • He Huang, Yuanwei Chen, Wei Tang, Wenhao Zheng, Qing-Guo Chen, Yao Hu, Philip Yu
On the other hand, there is a large semantic gap between seen and unseen classes in the existing multi-label classification datasets.
no code implementations • 8 Jan 2020 • Shu-Ting Shi, Wenhao Zheng, Jun Tang, Qing-Guo Chen, Yao Hu, Jianke Zhu, Ming Li
Click-through rate (CTR) prediction is an essential task in industrial applications such as video recommendation.
1 code implementation • 21 Oct 2019 • Jialiang Zhang, Lixiang Lin, Yang Li, Yun-chen Chen, Jianke Zhu, Yao Hu, Steven C. H. Hoi
To tackle this critical problem, we propose an attribute-aware pedestrian detector to explicitly model people's semantic attributes in a high-level feature detection fashion.
no code implementations • 19 Oct 2019 • Shuai Zhao, Boxi Wu, Wenqing Chu, Yao Hu, Deng Cai
Inspired by the widely-used structural similarity (SSIM) index in image quality assessment, we use the linear correlation between two images to quantify their structural similarity.
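As a rough illustration of what "linear correlation between two images" can mean, the sketch below computes the standard Pearson correlation coefficient over all pixels of two equally sized grayscale images. This is an assumption about the general idea only; the snippet above does not give the paper's exact formulation, and the name `linear_correlation` and the nested-list image format are hypothetical.

```python
import math


def linear_correlation(img_a, img_b):
    """Pearson correlation between two grayscale images.

    Images are nested lists of rows (an assumption of this sketch).
    Returns a value in [-1, 1], or 0.0 if either image is constant.
    """
    a = [float(p) for row in img_a for p in row]
    b = [float(p) for row in img_b for p in row]
    if not a or len(a) != len(b):
        raise ValueError("images must be non-empty and have the same number of pixels")

    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)

    # Covariance and variances, left unnormalized because the 1/N factors cancel.
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)

    denom = math.sqrt(var_a * var_b)
    return cov / denom if denom > 0 else 0.0
```

Because the coefficient is invariant to positive affine changes in intensity, an image and a brightened, contrast-stretched copy of it score close to 1, while an intensity-inverted copy scores close to -1.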
2 code implementations • 24 Jun 2019 • Sihui Luo, Xinchao Wang, Gongfan Fang, Yao Hu, Dapeng Tao, Mingli Song
An increasing number of well-trained deep networks have been released online by researchers and developers, enabling the community to reuse them in a plug-and-play way without accessing the training annotations.
no code implementations • 30 May 2019 • Shiyin Lu, Guanghui Wang, Yao Hu, Lijun Zhang
In this paper, we study the multi-objective bandits (MOB) problem, where a learner repeatedly selects one arm to play and then receives a reward vector consisting of multiple objectives.
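The interaction protocol described above can be sketched as a small simulation. This is only an illustration of the setting, not the paper's algorithm: the learner here picks arms uniformly at random, rewards are assumed to be independent Bernoulli draws per objective, and the names `play_uniform` and `arm_means` are hypothetical.

```python
import random


def play_uniform(arm_means, rounds, seed=0):
    """Simulate a multi-objective bandit with a uniformly random learner.

    arm_means[i][j] is the (assumed Bernoulli) mean of objective j for arm i.
    Returns per-arm pull counts and per-arm estimated mean reward vectors.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    n_obj = len(arm_means[0])
    counts = [0] * n_arms
    estimates = [[0.0] * n_obj for _ in range(n_arms)]

    for _ in range(rounds):
        # The learner selects one arm to play (uniformly at random in this sketch).
        arm = rng.randrange(n_arms)
        # The environment returns a reward vector with one entry per objective.
        reward = [1.0 if rng.random() < p else 0.0 for p in arm_means[arm]]
        counts[arm] += 1
        # Incremental update of the running mean for each objective.
        for j in range(n_obj):
            estimates[arm][j] += (reward[j] - estimates[arm][j]) / counts[arm]

    return counts, estimates
```

A real MOB learner would replace the uniform choice with a selection rule that uses the estimated reward vectors, for example by comparing arms across objectives rather than on a single scalar.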