no code implementations • 18 Jun 2024 • Ziyu Ma, Chenhui Gou, Hengcan Shi, Bin Sun, Shutao Li, Hamid Rezatofighi, Jianfei Cai
Specifically, DrVideo first transforms a long video into a coarse text-based long document to initially retrieve key frames and then updates the documents with the augmented key frame information.
no code implementations • 6 Apr 2024 • Duy-Tho Le, Hengcan Shi, Jianfei Cai, Hamid Rezatofighi
Diffusion models have recently gained prominence as powerful deep generative models, demonstrating unmatched performance across various domains.
no code implementations • CVPR 2024 • Duy-Tho Le, Chenhui Gou, Stavya Datta, Hengcan Shi, Ian Reid, Jianfei Cai, Hamid Rezatofighi
JRDB-PanoTrack includes (1) various data involving indoor and outdoor crowded scenes, as well as comprehensive 2D and 3D synchronized data modalities; (2) high-quality 2D spatial panoptic segmentation and temporal tracking annotations, with additional 3D label projections for further spatial understanding; (3) diverse object classes for closed- and open-world recognition benchmarks, with OSPA-based metrics for evaluation.
no code implementations • 12 Mar 2024 • Xuhua Ren, Hengcan Shi, Jin Li
In this paper, we propose a novel open-vocabulary text recognition framework, Pseudo-OCR, to recognize OOV words.
no code implementations • 17 Jul 2023 • Hengcan Shi, Munawar Hayat, Jianfei Cai
We present a UOVN training mechanism to reduce such gaps.
1 code implementation • 10 Jul 2023 • Yicheng Wu, Zhonghua Wu, Hengcan Shi, Bjoern Picker, Winston Chong, Jianfei Cai
Moreover, a simple and effective relation regularization is proposed to enforce the longitudinal relations among the three outputs and thereby improve model learning.
no code implementations • 7 Jul 2023 • Hengcan Shi, Munawar Hayat, Jianfei Cai
However, they only use pairs of nouns and individual objects in VL data, while these data usually contain much more information, such as scene graphs, which are also crucial for OV detection.
no code implementations • 18 Jan 2023 • Son Duy Dao, Hengcan Shi, Dinh Phung, Jianfei Cai
Recent mask proposal models have significantly improved the performance of zero-shot semantic segmentation.
no code implementations • CVPR 2023 • Hengcan Shi, Munawar Hayat, Jianfei Cai
Effectively encoding multi-scale contextual information is crucial for accurate semantic segmentation.
no code implementations • 18 Jan 2022 • Hengcan Shi, Munawar Hayat, Jianfei Cai
To avoid the laborious annotation in conventional referring grounding, unpaired referring grounding is introduced, where the training data only contains a number of images and queries without correspondences.
no code implementations • CVPR 2022 • Hengcan Shi, Munawar Hayat, Yicheng Wu, Jianfei Cai
Firstly, we analyze CLIP for unsupervised open-category proposal generation and design an objectness score based on our empirical analysis on proposal selection.
1 code implementation • 31 Dec 2021 • Duy-Tho Le, Hengcan Shi, Hamid Rezatofighi, Jianfei Cai
Efficiently and accurately detecting people from 3D point cloud data is of great importance in many robotic and autonomous driving applications.
1 code implementation • 21 Apr 2021 • Tingtian Li, Zixun Sun, Haoruo Zhang, Jin Li, Ziming Wu, Hui Zhan, Yipeng Yu, Hengcan Shi
In this paper, we also investigate the widely added voice-overs in short videos and propose a novel framework to retrieve BGM for fine-grained short videos.
1 code implementation • CVPR 2019 • Hengcan Shi, Hongliang Li, Qingbo Wu, Zichen Song
On the one hand, the integrated classification model contains multiple classifiers, not only the general classifier but also a refinement classifier to distinguish the confusing categories.
Ranked #1 on Scene Segmentation on SUN-RGBD
no code implementations • ECCV 2018 • Hengcan Shi, Hongliang Li, Fanman Meng, Qingbo Wu
On the other hand, the relationships among different image regions are not considered either, even though they are important for eliminating undesired foreground objects according to the specific query.