no code implementations • 4 Mar 2024 • Qiushan Guo, Shalini De Mello, Hongxu Yin, Wonmin Byeon, Ka Chun Cheung, Yizhou Yu, Ping Luo, Sifei Liu
Vision language models (VLMs) have advanced rapidly through the integration of large language models (LLMs) with image-text pairs, yet they struggle with detailed regional visual understanding because of the vision encoder's limited spatial awareness and the use of coarse-grained training data that lacks detailed, region-specific captions.
no code implementations • 29 Jan 2024 • Xiaoyu Shi, Zhaoyang Huang, Fu-Yun Wang, Weikang Bian, Dasong Li, Yi Zhang, Manyuan Zhang, Ka Chun Cheung, Simon See, Hongwei Qin, Jifeng Dai, Hongsheng Li
For the first stage, we propose a diffusion-based motion field predictor, which focuses on deducing the trajectories of the reference image's pixels.
no code implementations • 26 Jan 2024 • Xingzhi Zhou, Zhiliang Tian, Ka Chun Cheung, Simon See, Nevin L. Zhang
Test-time domain adaptation effectively adjusts the source domain model to accommodate unseen domain shifts in a target domain during inference.
1 code implementation • ICCV 2023 • Ziyuan Luo, Qing Guo, Ka Chun Cheung, Simon See, Renjie Wan
Neural Radiance Fields (NeRF) have the potential to be a major representation of media.
1 code implementation • ICCV 2023 • Xuesong Chen, Shaoshuai Shi, Chao Zhang, Benjin Zhu, Qiang Wang, Ka Chun Cheung, Simon See, Hongsheng Li
3D multi-object tracking (MOT) is vital for many applications including autonomous driving vehicles and service robots.
1 code implementation • ICCV 2023 • Xiaoyu Shi, Zhaoyang Huang, Weikang Bian, Dasong Li, Manyuan Zhang, Ka Chun Cheung, Simon See, Hongwei Qin, Jifeng Dai, Hongsheng Li
We first propose a TRi-frame Optical Flow (TROF) module that estimates bi-directional optical flows for the center frame in a three-frame manner.
1 code implementation • CVPR 2023 • Xiaoyu Shi, Zhaoyang Huang, Dasong Li, Manyuan Zhang, Ka Chun Cheung, Simon See, Hongwei Qin, Jifeng Dai, Hongsheng Li
FlowFormer introduces a transformer architecture into optical flow estimation and achieves state-of-the-art performance.
no code implementations • 16 Nov 2022 • Yihang Gao, Ka Chun Cheung, Michael K. Ng
Physics-informed neural networks (PINNs) have attracted significant attention for solving partial differential equations (PDEs) in recent years because they alleviate the curse of dimensionality that appears in traditional methods.
no code implementations • 22 Oct 2022 • Dongkyu Lee, Zhiliang Tian, Yingxiu Zhao, Ka Chun Cheung, Nevin L. Zhang
The question is answered in our work with the concept of model calibration; we view a teacher model not only as a source of knowledge but also as a gauge to detect miscalibration of a student.
no code implementations • 22 Oct 2022 • Dongkyu Lee, Ka Chun Cheung, Nevin L. Zhang
Furthermore, inspired by recent work in bridging label smoothing and knowledge distillation, our work utilizes self-knowledge as a prior label distribution in softening target labels, and presents theoretical support for the regularization effect by knowledge distillation and the dynamic smoothing parameter.
no code implementations • 19 Sep 2022 • Zhaoyang Huang, Xiaokun Pan, Weihong Pan, Weikang Bian, Yan Xu, Ka Chun Cheung, Guofeng Zhang, Hongsheng Li
We tackle the problem of estimating correspondences from a general marker, such as a movie poster, to an image that captures such a marker.
1 code implementation • 10 Aug 2022 • Dasong Li, Yi Zhang, Ka Chun Cheung, Xiaogang Wang, Hongwei Qin, Hongsheng Li
With the integration, MSDI-Net can handle various and complicated blurry patterns adaptively.
Ranked #13 on Image Deblurring on GoPro
1 code implementation • CVPR 2023 • Dasong Li, Xiaoyu Shi, Yi Zhang, Ka Chun Cheung, Simon See, Xiaogang Wang, Hongwei Qin, Hongsheng Li
In this study, we propose a simple yet effective framework for video restoration.
Ranked #1 on Deblurring on GoPro (using extra training data)
1 code implementation • 12 May 2022 • Xuesong Chen, Shaoshuai Shi, Benjin Zhu, Ka Chun Cheung, Hang Xu, Hongsheng Li
Accurate and reliable 3D detection is vital for many applications including autonomous driving vehicles and service robots.
1 code implementation • 30 Mar 2022 • Zhaoyang Huang, Xiaoyu Shi, Chao Zhang, Qiang Wang, Ka Chun Cheung, Hongwei Qin, Jifeng Dai, Hongsheng Li
We introduce optical Flow transFormer, dubbed FlowFormer, a transformer-based neural network architecture for learning optical flow.
Ranked #1 on Optical Flow Estimation on Sintel-final
no code implementations • 29 Sep 2021 • Dongkyu Lee, Ka Chun Cheung, Nevin Zhang
Overconfidence has been shown to impair generalization and calibration of a neural network.
no code implementations • 27 Apr 2021 • Yixiao Ge, Xiao Zhang, Ching Lam Choi, Ka Chun Cheung, Peipei Zhao, Feng Zhu, Xiaogang Wang, Rui Zhao, Hongsheng Li
In this way, our BAKE framework achieves online knowledge ensembling across multiple samples with only a single network.
no code implementations • 7 Apr 2021 • Zhaoyang Huang, Xiaokun Pan, Runsen Xu, Yan Xu, Ka Chun Cheung, Guofeng Zhang, Hongsheng Li
However, local image contents are inevitably ambiguous and error-prone during the cross-image feature matching process, which hinders downstream tasks.
1 code implementation • 20 Nov 2019 • Shaohuai Shi, Xiaowen Chu, Ka Chun Cheung, Simon See
Distributed stochastic gradient descent (SGD) algorithms are widely deployed in training large-scale deep learning models, while the communication overhead among workers becomes the new system bottleneck.