no code implementations • 30 May 2019 • Shiyin Lu, Guanghui Wang, Yao Hu, Lijun Zhang
In this paper, we study the multi-objective bandits (MOB) problem, where a learner repeatedly selects one arm to play and then receives a reward vector consisting of multiple objectives.
2 code implementations • 24 Jun 2019 • Sihui Luo, Xinchao Wang, Gongfan Fang, Yao Hu, Dapeng Tao, Mingli Song
An increasing number of well-trained deep networks have been released online by researchers and developers, enabling the community to reuse them in a plug-and-play way without accessing the training annotations.
no code implementations • 19 Oct 2019 • Shuai Zhao, Boxi Wu, Wenqing Chu, Yao Hu, Deng Cai
Inspired by the widely-used structural similarity (SSIM) index in image quality assessment, we use the linear correlation between two images to quantify their structural similarity.
1 code implementation • 21 Oct 2019 • Jialiang Zhang, Lixiang Lin, Yang Li, Yun-chen Chen, Jianke Zhu, Yao Hu, Steven C. H. Hoi
To tackle this critical problem, we propose an attribute-aware pedestrian detector to explicitly model people's semantic attributes in a high-level feature detection fashion.
no code implementations • 8 Jan 2020 • Shu-Ting Shi, Wenhao Zheng, Jun Tang, Qing-Guo Chen, Yao Hu, Jianke Zhu, Ming Li
Click-through rate (CTR) prediction is an essential task in industrial applications such as video recommendation.
no code implementations • 30 Jul 2020 • He Huang, Yuanwei Chen, Wei Tang, Wenhao Zheng, Qing-Guo Chen, Yao Hu, Philip Yu
On the other hand, there is a large semantic gap between seen and unseen classes in the existing multi-label classification datasets.
1 code implementation • 15 Oct 2020 • Jia Guo, Minghao Chen, Yao Hu, Chen Zhu, Xiaofei He, Deng Cai
We investigate this problem by study the gap of confidence between teacher and student.
1 code implementation • 1 Dec 2020 • Yao Hu, Guohua Geng, Kang Li, Wei Zhou
Then we present a supervised segmentation and unsupervised reconstruction networks to learn the characteristics of 3D point clouds.
1 code implementation • 15 Dec 2020 • Han Zhang, Wenhao Zheng, Charley Chen, Kevin Gao, Yao Hu, Ling Huang, Wei Xu
Meanwhile, such applications usually require modeling the intrinsic clusters in high-dimensional data, which usually displays heterogeneous statistical patterns as the patterns of different clusters may appear in different dimensions.
1 code implementation • CVPR 2021 • Xiaolong Liu, Yao Hu, Song Bai, Fei Ding, Xiang Bai, Philip H. S. Torr
Current developments in temporal event or action localization usually target actions captured by a single camera.
Ranked #2 on Temporal Action Localization on MUSES
1 code implementation • 11 Jan 2021 • Tun Zhu, Daoxin Zhang, Yao Hu, Tianran Wang, XiaoLong Jiang, Jianke Zhu, Jiawei Li
Alongside the prevalence of mobile videos, the general public leans towards consuming vertical videos on hand-held devices.
2 code implementations • 2 Feb 2021 • Jiyang Qi, Yan Gao, Yao Hu, Xinggang Wang, Xiaoyu Liu, Xiang Bai, Serge Belongie, Alan Yuille, Philip H. S. Torr, Song Bai
On the OVIS dataset, the highest AP achieved by state-of-the-art algorithms is only 16. 3, which reveals that we are still at a nascent stage for understanding objects, instances, and videos in a real-world scenario.
Ranked #39 on Video Instance Segmentation on OVIS validation
1 code implementation • CVPR 2021 • Haochen Wang, XiaoLong Jiang, Haibing Ren, Yao Hu, Song Bai
In this work we present SwiftNet for real-time semisupervised video object segmentation (one-shot VOS), which reports 77. 8% J &F and 70 FPS on DAVIS 2017 validation dataset, leading all present solutions in overall accuracy and speed performance.
1 code implementation • 5 Jun 2021 • Pan Li, Maofei Que, Zhichao Jiang, Yao Hu, Alexander Tuzhilin
Classical recommender system methods typically face the filter bubble problem when users only receive recommendations of their familiar items, making them bored and dissatisfied.
1 code implementation • 5 Jun 2021 • Pan Li, Zhichao Jiang, Maofei Que, Yao Hu, Alexander Tuzhilin
While several cross domain sequential recommendation models have been proposed to leverage information from a source domain to improve CTR predictions in a target domain, they did not take into account bidirectional latent relations of user preferences across source-target domain pairs.
1 code implementation • ICCV 2021 • Hao Fang, Daoxin Zhang, Yi Zhang, Minghao Chen, Jiawei Li, Yao Hu, Deng Cai, Xiaofei He
In this paper, we study the Salient Object Ranking (SOR) task, which manages to assign a ranking order of each detected object according to its visual saliency.
1 code implementation • 18 Jun 2021 • Xiaolong Liu, Qimeng Wang, Yao Hu, Xu Tang, Shiwei Zhang, Song Bai, Xiang Bai
Temporal action detection (TAD) aims to determine the semantic label and the temporal interval of every action instance in an untrimmed video.
Ranked #8 on Temporal Action Localization on HACS
no code implementations • 26 Jul 2021 • Wentian Zhao, Yao Hu, HeDa Wang, Xinxiao wu, Jiebo Luo
Entity-aware image captioning aims to describe named entities and events related to the image by utilizing the background knowledge in the associated article.
no code implementations • 28 Jul 2021 • Yao Hu, Guohua Geng, Kang Li, Wei Zhou, Xingxing Hao, Xin Cao
Then we present a supervised segmentation and unsupervised reconstruction networks to learn the characteristics of 3D point clouds.
no code implementations • 15 Nov 2021 • Jiyang Qi, Yan Gao, Yao Hu, Xinggang Wang, Xiaoyu Liu, Xiang Bai, Serge Belongie, Alan Yuille, Philip H. S. Torr, Song Bai
To promote the development of occlusion understanding, we collect a large-scale dataset called OVIS for video instance segmentation in the occluded scenario.
no code implementations • 2 Feb 2022 • Yan Gao, Qimeng Wang, Xu Tang, Haochen Wang, Fei Ding, Jing Li, Yao Hu
Prior works propose to predict Intersection-over-Union (IoU) between bounding boxes and corresponding ground-truths to improve NMS, while accurately predicting IoU is still a challenging problem.
no code implementations • 4 Mar 2022 • Guocheng Zhou, Shaohui Zhang, Yao Hu, Lei Cao, Yong Huang, Qun Hao
Fourier ptychography has attracted a wide range of focus for its ability of large space-bandwidth-produce, and quantative phase measurement.
no code implementations • 1 Apr 2022 • Baiqi Cui, Shaohui Zhang, Yechao Wang, Yao Hu, Qun Hao
Fourier ptychography (FP), as a computational imaging method, is a powerful tool to improve imaging resolution.
1 code implementation • CVPR 2023 • Keyan Chen, XiaoLong Jiang, Yao Hu, Xu Tang, Yan Gao, Jianqi Chen, Weidi Xie
In this paper, we consider the problem of simultaneously detecting objects and inferring their visual attributes in an image, even for those with no manual annotations provided at the training stage, resembling an open-vocabulary scenario.
Ranked #1 on Open Vocabulary Attribute Detection on OVAD benchmark (using extra training data)
no code implementations • 30 Mar 2023 • Binbin Li, Xinyu Du, Yao Hu, Hao Yu, Wende Zhang
Online camera-to-ground calibration is to generate a non-rigid body transformation between the camera and the road surface in a real-time manner.
1 code implementation • ICCV 2023 • Haochen Wang, Cilin Yan, Shuai Wang, XiaoLong Jiang, Xu Tang, Yao Hu, Weidi Xie, Efstratios Gavves
Video Instance Segmentation (VIS) aims at segmenting and categorizing objects in videos from a closed set of training categories, lacking the generalization ability to handle novel categories in real-world videos.
no code implementations • 14 Apr 2023 • Jie Guo, Qimeng Wang, Yan Gao, XiaoLong Jiang, Xu Tang, Yao Hu, Baochang Zhang
CLIP (Contrastive Language-Image Pretraining) is well-developed for open-vocabulary zero-shot image-level recognition, while its applications in pixel-level tasks are less investigated, where most efforts directly adopt CLIP features without deliberative adaptations.
1 code implementation • 23 Apr 2023 • Cilin Yan, Haochen Wang, Jie Liu, XiaoLong Jiang, Yao Hu, Xu Tang, Guoliang Kang, Efstratios Gavves
Click-based interactive segmentation aims to generate target masks via human clicking, which facilitates efficient pixel-level annotation and image editing.
1 code implementation • 17 May 2023 • Bohan Zeng, Shanglin Li, Xuhui Liu, Sicheng Gao, XiaoLong Jiang, Xu Tang, Yao Hu, Jianzhuang Liu, Baochang Zhang
Brain signal visualization has emerged as an active research area, serving as a critical interface between the human visual system and computer vision models.
no code implementations • 11 Jul 2023 • Zhouhong Gu, Lin Zhang, Jiangjie Chen, Haoning Ye, Xiaoxuan Zhu, Zihan Li, Zheyu Ye, Yan Gao, Yao Hu, Yanghua Xiao, Hongwei Feng
We introduces the DetectBench, a reading comprehension dataset designed to assess a model's ability to jointly ability in key information detection and multi-hop reasoning when facing complex and implicit information.
1 code implementation • 26 Dec 2023 • Yuxuan Zhang, Yiren Song, Jiaming Liu, Rui Wang, Jinpeng Yu, Hao Tang, Huaxia Li, Xu Tang, Yao Hu, Han Pan, Zhongliang Jing
Recent advancements in subject-driven image generation have led to zero-shot generation, yet precise selection and focus on crucial subject representations remain challenging.
no code implementations • 28 Dec 2023 • Shanglin Li, Bohan Zeng, Yutang Feng, Sicheng Gao, Xuhui Liu, Jiaming Liu, Li Lin, Xu Tang, Yao Hu, Jianzhuang Liu, Baochang Zhang
We then propose a Region-IoU scheme for precise image layer extraction from an off-the-shelf segment model.
1 code implementation • 15 Jan 2024 • Qixun Wang, Xu Bai, Haofan Wang, Zekui Qin, Anthony Chen, Huaxia Li, Xu Tang, Yao Hu
There has been significant progress in personalized image synthesis with methods such as Textual Inversion, DreamBooth, and LoRA.
no code implementations • 4 Mar 2024 • Chao Zhang, Shiwei Wu, Haoxin Zhang, Tong Xu, Yan Gao, Yao Hu, Di wu, Enhong Chen
Indeed, learning to generate hashtags/categories can potentially enhance note embeddings, both of which compress key note information into limited content.
no code implementations • 12 Mar 2024 • Yuxuan Zhang, Lifu Wei, Qing Zhang, Yiren Song, Jiaming Liu, Huaxia Li, Xu Tang, Yao Hu, Haibo Zhao
Current makeup transfer methods are limited to simple makeup styles, making them difficult to apply in real-world scenarios.
no code implementations • 16 Mar 2024 • Rui Wang, Hailong Guo, Jiaming Liu, Huaxia Li, Haibo Zhao, Xu Tang, Yao Hu, Hao Tang, Peipei Li
In this paper, we introduce StableGarment, a unified framework to tackle garment-centric(GC) generation tasks, including GC text-to-image, controllable GC text-to-image, stylized GC text-to-image, and robust virtual try-on.
1 code implementation • 20 Mar 2024 • Zhouhong Gu, Xiaoxuan Zhu, Haoran Guo, Lin Zhang, Yin Cai, Hao Shen, Jiangjie Chen, Zheyu Ye, Yifei Dai, Yan Gao, Yao Hu, Hongwei Feng, Yanghua Xiao
By configuring specific environmental settings within Agent Group Chat, we are able to assess whether agents exhibit behaviors that align with human expectations.