Search Results for author: Yao Hu

Found 37 papers, 20 papers with code

Multi-Objective Generalized Linear Bandits

no code implementations 30 May 2019 Shiyin Lu, Guanghui Wang, Yao Hu, Lijun Zhang

In this paper, we study the multi-objective bandits (MOB) problem, where a learner repeatedly selects one arm to play and then receives a reward vector consisting of multiple objectives.

Multi-Armed Bandits

Knowledge Amalgamation from Heterogeneous Networks by Common Feature Learning

2 code implementations 24 Jun 2019 Sihui Luo, Xinchao Wang, Gongfan Fang, Yao Hu, Dapeng Tao, Mingli Song

An increasing number of well-trained deep networks have been released online by researchers and developers, enabling the community to reuse them in a plug-and-play way without accessing the training annotations.

Correlation Maximized Structural Similarity Loss for Semantic Segmentation

no code implementations 19 Oct 2019 Shuai Zhao, Boxi Wu, Wenqing Chu, Yao Hu, Deng Cai

Inspired by the widely-used structural similarity (SSIM) index in image quality assessment, we use the linear correlation between two images to quantify their structural similarity.

Generative Adversarial Network Image Quality Assessment +2
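The linear correlation mentioned in the abstract above can be illustrated with a plain Pearson correlation between two flattened images. This is only a minimal sketch, not the paper's loss: the function name `pearson_correlation` and the handling of constant images are assumptions made for illustration.

```python
import math


def pearson_correlation(a, b):
    """Pearson (linear) correlation between two equally sized, flattened images.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    # Covariance and variances (unnormalized; the 1/n factors cancel in the ratio).
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Assumption: a constant image is treated as uncorrelated with anything.
        return 0.0
    return cov / math.sqrt(var_a * var_b)
```

A value near 1 means the two pixel patterns vary together linearly, and a value near -1 means they vary in opposite directions.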

Attribute-aware Pedestrian Detection in a Crowd

1 code implementation 21 Oct 2019 Jialiang Zhang, Lixiang Lin, Yang Li, Yun-chen Chen, Jianke Zhu, Yao Hu, Steven C. H. Hoi

To tackle this critical problem, we propose an attribute-aware pedestrian detector to explicitly model people's semantic attributes in a high-level feature detection fashion.

Attribute Pedestrian Detection

Multi-label Zero-shot Classification by Learning to Transfer from External Knowledge

no code implementations 30 Jul 2020 He Huang, Yuanwei Chen, Wei Tang, Wenhao Zheng, Qing-Guo Chen, Yao Hu, Philip Yu

On the other hand, there is a large semantic gap between seen and unseen classes in the existing multi-label classification datasets.

Classification General Classification +3

Reducing the Teacher-Student Gap via Spherical Knowledge Disitllation

1 code implementation 15 Oct 2020 Jia Guo, Minghao Chen, Yao Hu, Chen Zhu, Xiaofei He, Deng Cai

We investigate this problem by studying the gap of confidence between teacher and student.

Knowledge Distillation

Unsupervised Segmentation for Terracotta Warrior Point Cloud (SRG-Net)

1 code implementation 1 Dec 2020 Yao Hu, Guohua Geng, Kang Li, Wei Zhou

Then we present supervised segmentation and unsupervised reconstruction networks to learn the characteristics of 3D point clouds.

Clustering Segmentation

Modeling Heterogeneous Statistical Patterns in High-dimensional Data by Adversarial Distributions: An Unsupervised Generative Framework

1 code implementation 15 Dec 2020 Han Zhang, Wenhao Zheng, Charley Chen, Kevin Gao, Yao Hu, Ling Huang, Wei Xu

Meanwhile, such applications usually require modeling the intrinsic clusters in high-dimensional data, which usually displays heterogeneous statistical patterns as the patterns of different clusters may appear in different dimensions.

Anomaly Detection Fraud Detection

Horizontal-to-Vertical Video Conversion

1 code implementation 11 Jan 2021 Tun Zhu, Daoxin Zhang, Yao Hu, Tianran Wang, XiaoLong Jiang, Jianke Zhu, Jiawei Li

Alongside the prevalence of mobile videos, the general public leans towards consuming vertical videos on hand-held devices.

Boundary Detection Multi-Object Tracking

Occluded Video Instance Segmentation: A Benchmark

2 code implementations 2 Feb 2021 Jiyang Qi, Yan Gao, Yao Hu, Xinggang Wang, Xiaoyu Liu, Xiang Bai, Serge Belongie, Alan Yuille, Philip H. S. Torr, Song Bai

On the OVIS dataset, the highest AP achieved by state-of-the-art algorithms is only 16.3, which reveals that we are still at a nascent stage for understanding objects, instances, and videos in a real-world scenario.

Instance Segmentation Segmentation +3

SwiftNet: Real-time Video Object Segmentation

1 code implementation CVPR 2021 Haochen Wang, XiaoLong Jiang, Haibing Ren, Yao Hu, Song Bai

In this work we present SwiftNet for real-time semi-supervised video object segmentation (one-shot VOS), which reports 77.8% J&F and 70 FPS on DAVIS 2017 validation dataset, leading all present solutions in overall accuracy and speed performance.

Object Segmentation +3

PURS: Personalized Unexpected Recommender System for Improving User Satisfaction

1 code implementation 5 Jun 2021 Pan Li, Maofei Que, Zhichao Jiang, Yao Hu, Alexander Tuzhilin

Classical recommender system methods typically face the filter bubble problem when users only receive recommendations of their familiar items, making them bored and dissatisfied.

Recommendation Systems

Dual Attentive Sequential Learning for Cross-Domain Click-Through Rate Prediction

1 code implementation 5 Jun 2021 Pan Li, Zhichao Jiang, Maofei Que, Yao Hu, Alexander Tuzhilin

While several cross domain sequential recommendation models have been proposed to leverage information from a source domain to improve CTR predictions in a target domain, they did not take into account bidirectional latent relations of user preferences across source-target domain pairs.

Click-Through Rate Prediction Sequential Recommendation

Salient Object Ranking with Position-Preserved Attention

1 code implementation ICCV 2021 Hao Fang, Daoxin Zhang, Yi Zhang, Minghao Chen, Jiawei Li, Yao Hu, Deng Cai, Xiaofei He

In this paper, we study the Salient Object Ranking (SOR) task, which aims to assign a ranking order to each detected object according to its visual saliency.

Image Cropping Instance Segmentation +7

End-to-end Temporal Action Detection with Transformer

1 code implementation 18 Jun 2021 Xiaolong Liu, Qimeng Wang, Yao Hu, Xu Tang, Shiwei Zhang, Song Bai, Xiang Bai

Temporal action detection (TAD) aims to determine the semantic label and the temporal interval of every action instance in an untrimmed video.

Action Detection Temporal Action Localization +1

Boosting Entity-aware Image Captioning with Multi-modal Knowledge Graph

no code implementations 26 Jul 2021 Wentian Zhao, Yao Hu, HeDa Wang, Xinxiao Wu, Jiebo Luo

Entity-aware image captioning aims to describe named entities and events related to the image by utilizing the background knowledge in the associated article.

Graph Attention Image Captioning +1

Unsupervised Segmentation for Terracotta Warrior with Seed-Region-Growing CNN(SRG-Net)

no code implementations 28 Jul 2021 Yao Hu, Guohua Geng, Kang Li, Wei Zhou, Xingxing Hao, Xin Cao

Then we present supervised segmentation and unsupervised reconstruction networks to learn the characteristics of 3D point clouds.

Segmentation

Occluded Video Instance Segmentation: Dataset and ICCV 2021 Challenge

no code implementations 15 Nov 2021 Jiyang Qi, Yan Gao, Yao Hu, Xinggang Wang, Xiaoyu Liu, Xiang Bai, Serge Belongie, Alan Yuille, Philip H. S. Torr, Song Bai

To promote the development of occlusion understanding, we collect a large-scale dataset called OVIS for video instance segmentation in the occluded scenario.

Instance Segmentation Object Recognition +3

Decoupled IoU Regression for Object Detection

no code implementations 2 Feb 2022 Yan Gao, Qimeng Wang, Xu Tang, Haochen Wang, Fei Ding, Jing Li, Yao Hu

Prior works propose to predict Intersection-over-Union (IoU) between bounding boxes and corresponding ground-truths to improve NMS, while accurately predicting IoU is still a challenging problem.

Object object-detection +2
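For readers unfamiliar with the Intersection-over-Union (IoU) quantity named in the abstract above, the standard geometric definition can be sketched as follows. This is a minimal illustration that computes IoU between two known boxes; it is not the paper's decoupled IoU regression, and the function name `box_iou` and the `(x1, y1, x2, y2)` corner format are assumptions.

```python
def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2) corners.

    Hypothetical helper for illustration; assumes x1 <= x2 and y1 <= y2.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    # Degenerate (zero-area) boxes yield an IoU of 0 rather than a division error.
    return inter / union if union > 0 else 0.0
```

For example, two 2x2 boxes offset by one unit in each direction share an overlap of area 1 and a union of area 7, so their IoU is 1/7.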

Parallel Fourier Ptychography reconstruction

no code implementations 4 Mar 2022 Guocheng Zhou, Shaohui Zhang, Yao Hu, Lei Cao, Yong Huang, Qun Hao

Fourier ptychography has attracted wide attention for its large space-bandwidth product and quantitative phase measurement.

OvarNet: Towards Open-vocabulary Object Attribute Recognition

1 code implementation CVPR 2023 Keyan Chen, XiaoLong Jiang, Yao Hu, Xu Tang, Yan Gao, Jianqi Chen, Weidi Xie

In this paper, we consider the problem of simultaneously detecting objects and inferring their visual attributes in an image, even for those with no manual annotations provided at the training stage, resembling an open-vocabulary scenario.

 Ranked #1 on Open Vocabulary Attribute Detection on OVAD benchmark (using extra training data)

Attribute Knowledge Distillation +5

Online Camera-to-ground Calibration for Autonomous Driving

no code implementations 30 Mar 2023 Binbin Li, Xinyu Du, Yao Hu, Hao Yu, Wende Zhang

Online camera-to-ground calibration is to generate a non-rigid body transformation between the camera and the road surface in a real-time manner.

Autonomous Driving

Towards Open-Vocabulary Video Instance Segmentation

1 code implementation ICCV 2023 Haochen Wang, Cilin Yan, Shuai Wang, XiaoLong Jiang, Xu Tang, Yao Hu, Weidi Xie, Efstratios Gavves

Video Instance Segmentation (VIS) aims at segmenting and categorizing objects in videos from a closed set of training categories, lacking the generalization ability to handle novel categories in real-world videos.

Instance Segmentation Segmentation +3

MVP-SEG: Multi-View Prompt Learning for Open-Vocabulary Semantic Segmentation

no code implementations 14 Apr 2023 Jie Guo, Qimeng Wang, Yan Gao, XiaoLong Jiang, Xu Tang, Yao Hu, Baochang Zhang

CLIP (Contrastive Language-Image Pretraining) is well-developed for open-vocabulary zero-shot image-level recognition, while its applications in pixel-level tasks are less investigated, where most efforts directly adopt CLIP features without deliberative adaptations.

GPR Open Vocabulary Semantic Segmentation +3

PiClick: Picking the desired mask in click-based interactive segmentation

1 code implementation 23 Apr 2023 Cilin Yan, Haochen Wang, Jie Liu, XiaoLong Jiang, Yao Hu, Xu Tang, Guoliang Kang, Efstratios Gavves

Click-based interactive segmentation aims to generate target masks via human clicking, which facilitates efficient pixel-level annotation and image editing.

Interactive Segmentation Segmentation

Controllable Mind Visual Diffusion Model

1 code implementation 17 May 2023 Bohan Zeng, Shanglin Li, Xuhui Liu, Sicheng Gao, XiaoLong Jiang, Xu Tang, Yao Hu, Jianzhuang Liu, Baochang Zhang

Brain signal visualization has emerged as an active research area, serving as a critical interface between the human visual system and computer vision models.

Attribute Image Generation

Piecing Together Clues: A Benchmark for Evaluating the Detective Skills of Large Language Models

no code implementations 11 Jul 2023 Zhouhong Gu, Lin Zhang, Jiangjie Chen, Haoning Ye, Xiaoxuan Zhu, Zihan Li, Zheyu Ye, Yan Gao, Yao Hu, Yanghua Xiao, Hongwei Feng

We introduce DetectBench, a reading comprehension dataset designed to assess a model's joint ability in key information detection and multi-hop reasoning when facing complex and implicit information.

Common Sense Reasoning Decision Making +2

SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation

1 code implementation 26 Dec 2023 Yuxuan Zhang, Yiren Song, Jiaming Liu, Rui Wang, Jinpeng Yu, Hao Tang, Huaxia Li, Xu Tang, Yao Hu, Han Pan, Zhongliang Jing

Recent advancements in subject-driven image generation have led to zero-shot generation, yet precise selection and focus on crucial subject representations remain challenging.

Image Generation

ZONE: Zero-Shot Instruction-Guided Local Editing

no code implementations 28 Dec 2023 Shanglin Li, Bohan Zeng, Yutang Feng, Sicheng Gao, Xuhui Liu, Jiaming Liu, Li Lin, Xu Tang, Yao Hu, Jianzhuang Liu, Baochang Zhang

We then propose a Region-IoU scheme for precise image layer extraction from an off-the-shelf segment model.

Image Generation

InstantID: Zero-shot Identity-Preserving Generation in Seconds

1 code implementation 15 Jan 2024 Qixun Wang, Xu Bai, Haofan Wang, Zekui Qin, Anthony Chen, Huaxia Li, Xu Tang, Yao Hu

There has been significant progress in personalized image synthesis with methods such as Textual Inversion, DreamBooth, and LoRA.

Diffusion Personalization Tuning Free Image Generation

NoteLLM: A Retrievable Large Language Model for Note Recommendation

no code implementations 4 Mar 2024 Chao Zhang, Shiwei Wu, Haoxin Zhang, Tong Xu, Yan Gao, Yao Hu, Di Wu, Enhong Chen

Indeed, learning to generate hashtags/categories can potentially enhance note embeddings, both of which compress key note information into limited content.

Contrastive Learning Language Modelling +1

Stable-Makeup: When Real-World Makeup Transfer Meets Diffusion Model

no code implementations 12 Mar 2024 Yuxuan Zhang, Lifu Wei, Qing Zhang, Yiren Song, Jiaming Liu, Huaxia Li, Xu Tang, Yao Hu, Haibo Zhao

Current makeup transfer methods are limited to simple makeup styles, making them difficult to apply in real-world scenarios.

Text-to-Image Generation

StableGarment: Garment-Centric Generation via Stable Diffusion

no code implementations 16 Mar 2024 Rui Wang, Hailong Guo, Jiaming Liu, Huaxia Li, Haibo Zhao, Xu Tang, Yao Hu, Hao Tang, Peipei Li

In this paper, we introduce StableGarment, a unified framework to tackle garment-centric (GC) generation tasks, including GC text-to-image, controllable GC text-to-image, stylized GC text-to-image, and robust virtual try-on.

Denoising Image Generation +1

Agent Group Chat: An Interactive Group Chat Simulacra For Better Eliciting Collective Emergent Behavior

1 code implementation 20 Mar 2024 Zhouhong Gu, Xiaoxuan Zhu, Haoran Guo, Lin Zhang, Yin Cai, Hao Shen, Jiangjie Chen, Zheyu Ye, Yifei Dai, Yan Gao, Yao Hu, Hongwei Feng, Yanghua Xiao

By configuring specific environmental settings within Agent Group Chat, we are able to assess whether agents exhibit behaviors that align with human expectations.
