no code implementations • 24 Apr 2025 • Xu Ma, Peize Sun, Haoyu Ma, Hao Tang, Chih-Yao Ma, Jialiang Wang, Kunpeng Li, Xiaoliang Dai, Yujun Shi, Xuan Ju, Yushi Hu, Artsiom Sanakoyeu, Felix Juefei-Xu, Ji Hou, Junjiao Tian, Tao Xu, Tingbo Hou, Yen-Cheng Liu, Zecheng He, Zijian He, Matt Feiszli, Peizhao Zhang, Peter Vajda, Sam Tsai, Yun Fu
A primary limitation is the substantial number of image tokens required for AR models, which constrains both training and inference efficiency, as well as image resolution.
no code implementations • CVPR 2025 • Hongjie Wang, Chih-Yao Ma, Yen-Cheng Liu, Ji Hou, Tao Xu, Jialiang Wang, Felix Juefei-Xu, Yaqiao Luo, Peizhao Zhang, Tingbo Hou, Peter Vajda, Niraj K. Jha, Xiaoliang Dai
We propose a Linear-complexity text-to-video Generation (LinGen) framework whose cost scales linearly in the number of pixels.
2 code implementations • 17 Oct 2024 • Adam Polyak, Amit Zohar, Andrew Brown, Andros Tjandra, Animesh Sinha, Ann Lee, Apoorv Vyas, Bowen Shi, Chih-Yao Ma, Ching-Yao Chuang, David Yan, Dhruv Choudhary, Dingkang Wang, Geet Sethi, Guan Pang, Haoyu Ma, Ishan Misra, Ji Hou, Jialiang Wang, Kiran Jagadeesh, Kunpeng Li, Luxin Zhang, Mannat Singh, Mary Williamson, Matt Le, Matthew Yu, Mitesh Kumar Singh, Peizhao Zhang, Peter Vajda, Quentin Duval, Rohit Girdhar, Roshan Sumbaly, Sai Saketh Rambhatla, Sam Tsai, Samaneh Azadi, Samyak Datta, Sanyuan Chen, Sean Bell, Sharadh Ramaswamy, Shelly Sheynin, Siddharth Bhattacharya, Simran Motwani, Tao Xu, Tianhe Li, Tingbo Hou, Wei-Ning Hsu, Xi Yin, Xiaoliang Dai, Yaniv Taigman, Yaqiao Luo, Yen-Cheng Liu, Yi-Chiao Wu, Yue Zhao, Yuval Kirstain, Zecheng He, Zijian He, Albert Pumarola, Ali Thabet, Artsiom Sanakoyeu, Arun Mallya, Baishan Guo, Boris Araya, Breena Kerr, Carleigh Wood, Ce Liu, Cen Peng, Dimitry Vengertsev, Edgar Schonfeld, Elliot Blanchard, Felix Juefei-Xu, Fraylie Nord, Jeff Liang, John Hoffman, Jonas Kohler, Kaolin Fire, Karthik Sivakumar, Lawrence Chen, Licheng Yu, Luya Gao, Markos Georgopoulos, Rashel Moritz, Sara K. Sampson, Shikai Li, Simone Parmeggiani, Steve Fine, Tara Fowler, Vladan Petrovic, Yuming Du
Our models set a new state-of-the-art on multiple tasks: text-to-video synthesis, video personalization, video editing, video-to-audio generation, and text-to-audio generation.
2 code implementations • NeurIPS 2023 • Junjiao Tian, Yen-Cheng Liu, James Seale Smith, Zsolt Kira
FTP can be combined with existing optimizers such as AdamW and used in a plug-and-play fashion.
1 code implementation • 21 Apr 2023 • Harsh Maheshwari, Yen-Cheng Liu, Zsolt Kira
We create the first benchmark for semi-supervised multi-modal semantic segmentation and also report the robustness to missing modalities.
Ranked #1 on Semantic Segmentation on SUN-RGBD (Mean IoU (test) metric, using extra training data)
RGBD Semantic Segmentation, Robust Semi-Supervised RGBD Semantic Segmentation, +3
2 code implementations • CVPR 2023 • Junjiao Tian, Xiaoliang Dai, Chih-Yao Ma, Zecheng He, Yen-Cheng Liu, Zsolt Kira
To solve this problem, we propose Trainable Projected Gradient Method (TPGM) to automatically learn the constraint imposed for each layer for a fine-grained fine-tuning regularization.
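The per-layer constraint can be pictured as a projection step: after each optimizer update, the fine-tuned weights are pulled back onto a ball centered at the pretrained weights, with one radius per layer. A minimal NumPy sketch, where the function name and a hard L2 ball with a fixed `radius` are illustrative assumptions standing in for the constraint TPGM learns:

```python
import numpy as np

def project_layer(w_finetuned, w_pretrained, radius):
    """Project fine-tuned weights onto an L2 ball of radius `radius`
    centered at the pretrained weights (one constraint per layer).
    `radius` is a stand-in for the per-layer value TPGM would learn."""
    diff = w_finetuned - w_pretrained
    norm = np.linalg.norm(diff)
    if norm <= radius:
        return w_finetuned  # already inside the constraint set
    return w_pretrained + diff * (radius / norm)  # pull back onto the ball
```

A small radius keeps the layer close to its pretrained weights (strong regularization); a large radius leaves fine-tuning effectively unconstrained.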
no code implementations • 7 Oct 2022 • Yen-Cheng Liu, Chih-Yao Ma, Junjiao Tian, Zijian He, Zsolt Kira
Specifically, Polyhistor achieves competitive accuracy compared to the state-of-the-art while only using ~10% of their trainable parameters.
no code implementations • 29 Aug 2022 • Yen-Cheng Liu, Chih-Yao Ma, Xiaoliang Dai, Junjiao Tian, Peter Vajda, Zijian He, Zsolt Kira
To address this problem, we consider online and offline OOD detection modules, which are integrated with SSOD methods.
1 code implementation • CVPR 2022 • Yen-Cheng Liu, Chih-Yao Ma, Zsolt Kira
In this paper, we present Unbiased Teacher v2, which shows the generalization of the SS-OD method to anchor-free detectors and also introduces a Listen2Student mechanism for the unsupervised regression loss.
2 code implementations • CVPR 2022 • Yu-Jhe Li, Xiaoliang Dai, Chih-Yao Ma, Yen-Cheng Liu, Kan Chen, Bichen Wu, Zijian He, Kris Kitani, Peter Vajda
To mitigate this problem, we propose a teacher-student framework named Adaptive Teacher (AT) which leverages domain adversarial learning and weak-strong data augmentation to address the domain gap.
no code implementations • 29 Sep 2021 • Yu-Jhe Li, Xiaoliang Dai, Chih-Yao Ma, Yen-Cheng Liu, Kan Chen, Bichen Wu, Zijian He, Kris M. Kitani, Peter Vajda
This enables the student model to capture domain-invariant features.
no code implementations • 29 Sep 2021 • Zhuoran Yu, Yen-Cheng Liu, Chih-Yao Ma, Zsolt Kira
Inspired by the fact that teacher/student pseudo-labeling approaches result in a weak and sparse gradient signal due to the difficulty of confidence-thresholding, CrossMatch leverages multi-scale feature extraction in object detection.
no code implementations • 1 Jul 2021 • Nathaniel Glaser, Yen-Cheng Liu, Junjiao Tian, Zsolt Kira
In this paper, we address the multi-robot collaborative perception problem, specifically in the context of multi-view infilling for distributed semantic segmentation.
no code implementations • 1 Jul 2021 • Nathaniel Glaser, Yen-Cheng Liu, Junjiao Tian, Zsolt Kira
In this paper, we address bandwidth-limited and obstruction-prone collaborative perception, specifically in the context of multi-agent semantic segmentation.
4 code implementations • ICLR 2021 • Yen-Cheng Liu, Chih-Yao Ma, Zijian He, Chia-Wen Kuo, Kan Chen, Peizhao Zhang, Bichen Wu, Zsolt Kira, Peter Vajda
To address this, we introduce Unbiased Teacher, a simple yet effective approach that jointly trains a student and a gradually progressing teacher in a mutually-beneficial manner.
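A "gradually progressing teacher" of this kind is commonly maintained as an exponential moving average (EMA) of the student's weights, so pseudo-labels come from a slowly evolving model rather than the noisy student. A minimal sketch (the function name and `alpha` value are illustrative):

```python
def ema_update(teacher, student, alpha=0.999):
    """Move each teacher parameter a small step toward the student:
    teacher <- alpha * teacher + (1 - alpha) * student."""
    return [alpha * t + (1 - alpha) * s for t, s in zip(teacher, student)]
```

With `alpha` close to 1, the teacher changes slowly and smooths out the student's update noise.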
no code implementations • NeurIPS 2020 • Junjiao Tian, Yen-Cheng Liu, Nathan Glaser, Yen-Chang Hsu, Zsolt Kira
Neural networks can perform poorly when the training label distribution is heavily imbalanced, as well as when the testing data differs from the training distribution.
Ranked #31 on Long-tail Learning on CIFAR-100-LT (ρ=10)
2 code implementations • CVPR 2020 • Yen-Cheng Liu, Junjiao Tian, Nathaniel Glaser, Zsolt Kira
While significant advances have been made for single-agent perception, many applications require multiple sensing agents and cross-agent communication due to benefits such as coverage and robustness.
1 code implementation • 21 Mar 2020 • Yen-Cheng Liu, Junjiao Tian, Chih-Yao Ma, Nathan Glaser, Chia-Wen Kuo, Zsolt Kira
In this paper, we propose the problem of collaborative perception, where robots can combine their local observations with those of neighboring agents in a learnable way to improve accuracy on a perception task.
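One common way to realize such a learnable combination is attention-weighted pooling over neighbors' features. The sketch below (function name and dot-product scoring are assumptions for illustration, not necessarily the paper's mechanism) fuses an agent's own feature vector with those received from neighbors:

```python
import numpy as np

def fuse_observations(own, neighbors):
    """Attention-style fusion: score each feature (own and neighbors')
    against the agent's own feature, softmax the scores, and return
    the weighted sum."""
    feats = [own] + list(neighbors)
    scores = np.array([own @ f for f in feats])
    weights = np.exp(scores - scores.max())  # stable softmax
    weights /= weights.sum()
    return sum(w * f for w, f in zip(weights, feats))
```

With no neighbors the agent falls back to its own observation; informative neighbors receive higher weight than unrelated ones.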
Multi-agent Reinforcement Learning, Reinforcement Learning, +2
no code implementations • 6 Nov 2019 • Junjiao Tian, Wesley Cheung, Nathan Glaser, Yen-Cheng Liu, Zsolt Kira
Specifically, we analyze a number of uncertainty measures, each of which captures a different aspect of uncertainty, and we propose a novel way to fuse degraded inputs by scaling modality-specific output softmax probabilities.
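The fusion idea can be sketched as scaling each modality's softmax output by a scalar confidence and renormalizing; the function name and the way confidences are supplied are assumptions for illustration, not the paper's specific uncertainty measures:

```python
import numpy as np

def fuse_softmax(probs_by_modality, confidences):
    """Weight each modality's class-probability vector by a scalar
    confidence, sum the weighted vectors, and renormalize so the
    result is again a valid distribution."""
    fused = sum(c * p for c, p in zip(confidences, probs_by_modality))
    return fused / fused.sum()
```

A degraded modality gets a low confidence and therefore contributes little to the fused prediction.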
13 code implementations • ICLR 2019 • Wei-Yu Chen, Yen-Cheng Liu, Zsolt Kira, Yu-Chiang Frank Wang, Jia-Bin Huang
Few-shot classification aims to learn a classifier that recognizes classes unseen during training, given only limited labeled examples.
3 code implementations • 30 Oct 2018 • Yen-Chang Hsu, Yen-Cheng Liu, Anita Ramasamy, Zsolt Kira
Continual learning has received a great deal of attention recently with several approaches being proposed.
1 code implementation • NeurIPS 2018 • Alexander H. Liu, Yen-Cheng Liu, Yu-Ying Yeh, Yu-Chiang Frank Wang
We present a novel and unified deep learning framework which is capable of learning domain-invariant representation from data across multiple domains.
no code implementations • 25 Apr 2018 • Yu-Jhe Li, Fu-En Yang, Yen-Cheng Liu, Yu-Ying Yeh, Xiaofei Du, Yu-Chiang Frank Wang
Person re-identification (Re-ID) aims at recognizing the same person from images taken across different cameras.
Ranked #20 on Unsupervised Domain Adaptation on Duke to Market
no code implementations • CVPR 2018 • Yen-Cheng Liu, Yu-Ying Yeh, Tzu-Chien Fu, Sheng-De Wang, Wei-Chen Chiu, Yu-Chiang Frank Wang
While representation learning aims to derive interpretable features for describing visual data, representation disentanglement goes further, producing features in which particular image attributes can be identified and manipulated.