no code implementations • 28 Jan 2024 • Yu-Jhe Li, Yan Xu, Rawal Khirodkar, Jinhyung Park, Kris Kitani
To evaluate our proposed pipeline, we collect three sets of RGBD videos recorded from multiple sparse-view depth cameras, with ground-truth 3D poses manually annotated.
no code implementations • CVPR 2024 • Jinhyung Park, Yu-Jhe Li, Kris Kitani
While recent depth completion methods have achieved remarkable results filling in relatively dense depth maps (e.g., projected 64-line LiDAR on KITTI or 500 sampled points on NYUv2) with RGB guidance, their performance on very sparse input (e.g., 4-line LiDAR or 32 depth point measurements) is unverified.
no code implementations • 21 Mar 2023 • Yu-Jhe Li, Tao Xu, Ji Hou, Bichen Wu, Xiaoliang Dai, Albert Pumarola, Peizhao Zhang, Peter Vajda, Kris Kitani
We note that the novelty of our model lies in introducing contrastive learning while training the diffusion prior, which enables the generation of a valid view-invariant latent code.
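The idea of contrastive learning over latent codes can be illustrated with an InfoNCE-style loss: codes of the same object from different views are pulled together, codes of other objects pushed apart. The sketch below is a minimal, framework-free illustration; the function name and scalar similarity are assumptions, not the paper's implementation.

```python
import math

def info_nce(anchor, positive, negatives, tau=0.1):
    """InfoNCE-style contrastive loss over 1-D latent codes (illustrative).

    anchor/positive should be codes of the same instance (e.g., two views);
    negatives are codes of other instances. Lower loss means the anchor is
    closer to its positive than to the negatives.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    def cos(u, v):
        nu = math.sqrt(dot(u, u)) or 1.0
        nv = math.sqrt(dot(v, v)) or 1.0
        return dot(u, v) / (nu * nv)

    # Temperature-scaled similarities: positive first, then negatives.
    logits = [cos(anchor, positive) / tau] + [cos(anchor, n) / tau for n in negatives]

    # Numerically stable cross-entropy with the positive as the target class.
    m = max(logits)
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return -(logits[0] - log_denom)
```

With a view-invariant prior, two views of the same object yield nearly identical codes, so the loss approaches zero; mismatched codes are penalized.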
1 code implementation • CVPR 2023 • Yu-Jhe Li, Shawn Hunt, Jinhyung Park, Matthew O’Toole, Kris Kitani
We also propose a hybrid super-resolution model (Hybrid-SR) combining our ADC-SR with a standard RAD super-resolution model, and show that performance can be improved by a large margin.
no code implementations • 12 Nov 2022 • Yu-Jhe Li, Tao Xu, Bichen Wu, Ningyuan Zheng, Xiaoliang Dai, Albert Pumarola, Peizhao Zhang, Peter Vajda, Kris Kitani
In the first stage, we introduce a base encoder that converts the input image to a latent code.
no code implementations • 16 Mar 2022 • Takehiko Ohkawa, Yu-Jhe Li, Qichen Fu, Ryosuke Furuta, Kris M. Kitani, Yoichi Sato
We aim to improve the performance of regressing hand keypoints and segmenting pixel-level hand masks under new imaging conditions (e.g., outdoors) when we only have labeled images taken under very different conditions (e.g., indoors).
no code implementations • CVPR 2022 • Yu-Jhe Li, Jinhyung Park, Matthew O'Toole, Kris Kitani
To mitigate this problem, we propose the Self-Training Multimodal Vehicle Detection Network (ST-MVDNet) which leverages a Teacher-Student mutual learning framework and a simulated sensor noise model used in strong data augmentation for Lidar and Radar.
2 code implementations • CVPR 2022 • Yu-Jhe Li, Xiaoliang Dai, Chih-Yao Ma, Yen-Cheng Liu, Kan Chen, Bichen Wu, Zijian He, Kris Kitani, Peter Vajda
To mitigate this problem, we propose a teacher-student framework named Adaptive Teacher (AT) which leverages domain adversarial learning and weak-strong data augmentation to address the domain gap.
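The Teacher-Student mutual learning shared by these frameworks typically keeps the teacher as an exponential moving average (EMA) of the student, with the student trained on strongly augmented inputs against teacher pseudo-labels from weakly augmented ones. A minimal sketch of the EMA update, with illustrative names and plain Python lists standing in for model weights:

```python
def ema_update(teacher_params, student_params, alpha=0.999):
    """One EMA step of a mean-teacher scheme (illustrative sketch).

    Each teacher weight is blended toward the corresponding student weight:
        t_new = alpha * t + (1 - alpha) * s
    A high alpha keeps the teacher a slowly moving, more stable model,
    which in turn yields more reliable pseudo-labels for the student.
    """
    return [alpha * t + (1.0 - alpha) * s
            for t, s in zip(teacher_params, student_params)]
```

In a full pipeline this update runs once per training step, after the student's gradient update; the teacher itself receives no gradients.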
no code implementations • 29 Sep 2021 • Yu-Jhe Li, Xiaoliang Dai, Chih-Yao Ma, Yen-Cheng Liu, Kan Chen, Bichen Wu, Zijian He, Kris M. Kitani, Peter Vajda
This enables the student model to capture domain-invariant features.
no code implementations • CVPR 2021 • Yan Xu, Yu-Jhe Li, Xinshuo Weng, Kris Kitani
We address the problem of estimating the 3D pose of a network of cameras for large-environment wide-baseline scenarios, e.g., cameras for construction sites, sports stadiums, and public spaces.
no code implementations • ICCV 2021 • Yu-Jhe Li, Xinshuo Weng, Yan Xu, Kris M. Kitani
We propose an inter-tracklet (person-to-person) attention mechanism that learns a representation for a target tracklet while taking into account other tracklets across multiple views.
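At its core, attending over other tracklets can be expressed as scaled dot-product attention: the target tracklet feature queries all tracklet features and aggregates them by softmax weight. The sketch below is a single-head, framework-free simplification with hypothetical names, not the paper's architecture.

```python
import math

def inter_tracklet_attend(target, tracklets):
    """Refine a target tracklet feature by attending over all tracklets.

    Single-head scaled dot-product attention (illustrative): the target
    acts as the query, and each tracklet feature serves as both key and
    value. Returns the attention-weighted sum of tracklet features.
    """
    d = len(target)
    # Similarity of the target to every tracklet, scaled by sqrt(d).
    scores = [sum(q * k for q, k in zip(target, trk)) / math.sqrt(d)
              for trk in tracklets]
    # Numerically stable softmax over the scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    # Weighted sum of tracklet features gives the refined representation.
    return [sum(w * trk[i] for w, trk in zip(weights, tracklets))
            for i in range(d)]
```

Features of tracklets similar to the target dominate the weighted sum, so the refined representation pools evidence from the same identity seen in other views.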
no code implementations • 21 Oct 2020 • Jia-Wei Yan, Ci-Siang Lin, Fu-En Yang, Yu-Jhe Li, Yu-Chiang Frank Wang
Learning interpretable and interpolatable latent representations has been an emerging research direction, allowing researchers to understand and utilize the derived latent space for further applications such as visual synthesis or recognition.
no code implementations • 2 Oct 2020 • Chih-Ting Liu, Yu-Jhe Li, Shao-Yi Chien, Yu-Chiang Frank Wang
As a result, our approach is able to augment the labeled training data in the semi-supervised setting.
no code implementations • 2 Jun 2020 • Yen-Ting Liu, Yu-Jhe Li, Yu-Chiang Frank Wang
Video summarization is a challenging task in computer vision that aims at identifying highlight frames or shots in a lengthy input video.
no code implementations • 16 Mar 2020 • Yu-Jhe Li, Zhengyi Luo, Xinshuo Weng, Kris M. Kitani
To tackle the re-ID problem in the context of clothing changes, we propose a novel representation learning model which is able to generate a body shape feature representation without being affected by clothing color or patterns.
no code implementations • 19 Feb 2020 • Yu-Jhe Li, Yun-Chun Chen, Yen-Yu Lin, Yu-Chiang Frank Wang
Person re-identification (re-ID) aims at matching images of the same person across camera views.
no code implementations • ICCV 2019 • Yu-Jhe Li, Ci-Siang Lin, Yan-Bo Lin, Yu-Chiang Frank Wang
Person re-identification (re-ID) aims at recognizing the same person from images taken across different cameras.
Ranked #17 on Unsupervised Domain Adaptation on Market to Duke
no code implementations • ICCV 2019 • Yu-Jhe Li, Yun-Chun Chen, Yen-Yu Lin, Xiaofei Du, Yu-Chiang Frank Wang
Person re-identification (re-ID) aims at matching images of the same identity across camera views.
no code implementations • 25 Jul 2019 • Yun-Chun Chen, Yu-Jhe Li, Xiaofei Du, Yu-Chiang Frank Wang
Moreover, the extension of our model for semi-supervised re-ID further confirms the scalability of our proposed method for real-world scenarios and applications.
2 code implementations • 20 Feb 2019 • Yan-Bo Lin, Yu-Jhe Li, Yu-Chiang Frank Wang
Audio-visual event localization requires one to identify the event which is both visible and audible in a video (either at a frame or video level).
4 code implementations • 5 May 2018 • Yu-Jhe Li, Hsin-Yu Chang, Yu-Jing Lin, Po-Wei Wu, Yu-Chiang Frank Wang
Deep reinforcement learning has shown success in game playing.
no code implementations • 25 Apr 2018 • Yu-Jhe Li, Fu-En Yang, Yen-Cheng Liu, Yu-Ying Yeh, Xiaofei Du, Yu-Chiang Frank Wang
Person re-identification (Re-ID) aims at recognizing the same person from images taken across different cameras.
Ranked #20 on Unsupervised Domain Adaptation on Duke to Market
no code implementations • 9 Feb 2018 • Yun-Chun Chen, Yu-Jhe Li, Aragorn Tseng, Tsungnan Lin
We also conduct a partial flow experiment which shows the feasibility of real-time detection and a zero-shot learning experiment which justifies the generalization capability of deep learning in cyber security.