no code implementations • 3 Nov 2024 • Zhenyu Wang, YaLi Li, Hengshuang Zhao, Shengjin Wang
The current trend in computer vision is to use one universal model to address a wide variety of tasks.
no code implementations • 14 Sep 2024 • ZhiYu Zhang, Da Liu, Shengqiang Liu, Anna Wang, Jie Gao, YaLi Li
Contrastive learning has become one of the most impressive approaches for multi-modal representation learning.
no code implementations • 6 Aug 2024 • Jichuan Zhang, YaLi Li, Xin Liu, Shengjin Wang
Non-exemplar class-incremental learning (NECIL) aims to resist catastrophic forgetting without saving old class samples.
no code implementations • 9 Apr 2024 • Shijie Rao, Kaiyu Cui, Yidong Huang, Jiawei Yang, YaLi Li, Shengjin Wang, Xue Feng, Fang Liu, Wei Zhang
The inverse design methods proposed for these subwavelength structures are vital to the development of new photonic devices.
1 code implementation • 28 Mar 2024 • Zhenyu Wang, YaLi Li, Taichi Liu, Hengshuang Zhao, Shengjin Wang
Specifically, we propose the cycle-modality propagation, aimed at propagating knowledge bridging 2D and 3D modalities, to support the aforementioned functionalities.
no code implementations • 4 Mar 2024 • Lingyan Ran, YaLi Li, Guoqiang Liang, Yanning Zhang
Semantic segmentation is an important and popular research area in computer vision that focuses on classifying pixels in an image based on their semantics.
no code implementations • CVPR 2024 • YuAn Wang, YaLi Li, Shengjin Wang
Specifically, the Position Adaptive Geometric Exploring (PAGE) unearths underlying information of 3D objects from the perspectives of geometric details and spatial relationships.
no code implementations • CVPR 2024 • Eastman Z Y Wu, YaLi Li, YuAn Wang, Shengjin Wang
To address these issues, we propose a hybrid learning method based on pose-aware HOI feature refinement.
1 code implementation • NeurIPS 2023 • Zhenyu Wang, YaLi Li, Xi Chen, Hengshuang Zhao, Shengjin Wang
In this paper, we propose Uni3DETR, a unified 3D detector that addresses indoor and outdoor 3D detection within the same framework.
2 code implementations • ICCV 2023 • Zhaopeng Dou, Zhongdao Wang, YaLi Li, Shengjin Wang
To overcome the barriers of data and annotation, we propose to utilize large-scale unsupervised data for training.
1 code implementation • CVPR 2023 • Zhenyu Wang, YaLi Li, Xi Chen, Ser-Nam Lim, Antonio Torralba, Hengshuang Zhao, Shengjin Wang
In this paper, we formally address universal object detection, which aims to detect every scene and predict every category.
no code implementations • 7 Nov 2022 • Zhongdao Wang, Zhaopeng Dou, Jingwei Zhang, Liang Zheng, Yifan Sun, YaLi Li, Shengjin Wang
In this paper, we are interested in learning a generalizable person re-identification (re-ID) representation from unlabeled videos.
1 code implementation • 24 Oct 2022 • Zhaopeng Dou, Zhongdao Wang, Weihua Chen, YaLi Li, Shengjin Wang
(3) the data uncertainty and the model uncertainty are jointly learned in a unified network, and they serve as two fundamental criteria for the reliability assessment: if a probe is high-quality (low data uncertainty) and the model is confident in the prediction of the probe (low model uncertainty), the final ranking will be assessed as reliable.
1 code implementation • 20 Oct 2022 • Xin Liu, Zhongdao Wang, YaLi Li, Shengjin Wang
To cope with this issue, we propose Maximum Entropy Coding (MEC), a more principled objective that explicitly optimizes on the structure of the representation, so that the learned representation is less biased and thus generalizes better to unseen downstream tasks.
1 code implementation • 19 Oct 2022 • Xin Liu, Xiaofei Shao, Bo Wang, YaLi Li, Shengjin Wang
First, unlike previous methods, we leverage convolutional neural networks as well as graph neural networks in a complementary way for geometric representation learning.
no code implementations • 27 Jul 2022 • Yixuan Fan, Zhaopeng Dou, YaLi Li, Shengjin Wang
Furthermore, we focus on representation learning for portrait interpretation and propose a baseline that reflects our systematic perspective.
1 code implementation • 22 Jun 2022 • Yuhao Lu, Beixing Deng, Zhenyu Wang, Peiyuan Zhi, YaLi Li, Shengjin Wang
6-DoF grasp pose detection for multiple grasps and multiple objects is a challenging task in the field of intelligent robotics.
1 code implementation • CVPR 2022 • Zhenyu Wang, YaLi Li, Shengjin Wang
We construct a framework for semi-supervised instance segmentation by assigning pixel-level pseudo labels.
no code implementations • CVPR 2022 • YaLi Li, Shengjin Wang
In this paper, we propose a novel approach to combine decision trees and deep neural networks in an end-to-end learning manner for object detection.
no code implementations • CVPR 2022 • Dongchen Lu, Dongmei Li, YaLi Li, Shengjin Wang
By proposing the orientation-sensitive heatmap, OSKDet can implicitly learn the shape and direction of rotated targets and has stronger modeling capability for rotated representations, which improves localization accuracy and yields high-quality detection results.
no code implementations • NeurIPS 2021 • Zhenyu Wang, YaLi Li, Ye Guo, Shengjin Wang
To combat noisy labeling, we propose noise-resistant semi-supervised learning by quantifying the region uncertainty.
1 code implementation • 20 Jul 2021 • Qinglin Zhang, Qian Chen, YaLi Li, Jiaqing Liu, Wen Wang
Evaluations are conducted on the English Wiki-727K document segmentation benchmark, a Chinese Wikipedia-based document segmentation dataset we created, and an in-house Chinese spoken document dataset.
no code implementations • CVPR 2021 • Miao Hu, YaLi Li, Lu Fang, Shengjin Wang
Learning pyramidal feature representations is crucial for recognizing object instances at different scales.
no code implementations • 19 Jun 2021 • Jingtao Xu, YaLi Li, Shengjin Wang
In this paper, we propose a novel Adaptive Zoom (AdaZoom) network as a selective magnifier with flexible shape and focal length to adaptively zoom the focus regions for object detection.
no code implementations • 7 May 2021 • Miao Hu, YaLi Li, Lu Fang, Shengjin Wang
Learning pyramidal feature representations is crucial for recognizing object instances at different scales.
no code implementations • CVPR 2021 • Zhenyu Wang, YaLi Li, Ye Guo, Lu Fang, Shengjin Wang
In this paper, we delve into semi-supervised object detection where unlabeled images are leveraged to break through the upper bound of fully-supervised object detection models.
no code implementations • ICCV 2021 • Jiahe Shi, YaLi Li, Shengjin Wang
Human-oriented image captioning with both high diversity and accuracy is a challenging task in vision+language modeling.
no code implementations • ICCV 2021 • Xuege Hou, YaLi Li, Shengjin Wang
To quantitatively measure the degree of disentanglement, we verify that mutual information can serve as a metric.