no code implementations • 24 Nov 2022 • Kevin Lin, Chung-Ching Lin, Lin Liang, Zicheng Liu, Lijuan Wang
We present Mesh Pre-Training (MPT), a new pre-training framework that leverages 3D mesh data such as MoCap data for human pose and mesh reconstruction from a single image.
1 code implementation • 14 Jun 2022 • Linjie Li, Zhe Gan, Kevin Lin, Chung-Ching Lin, Zicheng Liu, Ce Liu, Lijuan Wang
In this work, we explore a unified VidL framework LAVENDER, where Masked Language Modeling (MLM) is used as the common interface for all pre-training and downstream tasks.
no code implementations • CVPR 2022 • Chung-Ching Lin, Kevin Lin, Linjie Li, Lijuan Wang, Zicheng Liu
The model design provides a natural mechanism for visual and semantic representations to be learned in a shared knowledge space, whereby it encourages the learned visual embedding to be discriminative and more semantically consistent.
Ranked #2 on
Zero-Shot Action Recognition
on ActivityNet
no code implementations • 8 Jan 2022 • Chung-Ching Lin, Chase Puglisi, Veljko Boljanovic, Han Yan, Erfan Ghaderi, Jayce Gaddis, Qiuyan Xu, Sreeni Poolakkal, Danijela Cabric, Subhanshu Gupta
Initial access in millimeter-wave (mmW) wireless is critical toward successful realization of the fifth-generation (5G) wireless networks and beyond.
no code implementations • 30 Nov 2021 • Chung-Ching Lin, Veljko Boljanovic, Han Yan, Erfan Ghaderi, Mohammad Ali Mokri, Jayce Jeron Gaddis, Aditya Wadaskar, Chase Puglisi, Soumen Mohapatra, Qiuyan Xu, Sreeni Poolakkal, Deukhyoun Heo, Subhanshu Gupta, Danijela Cabric
The decadal research in integrated true-time-delay arrays have seen organic growth enabling realization of wideband beamformers for large arrays with wide aperture widths.
1 code implementation • CVPR 2022 • Kevin Lin, Linjie Li, Chung-Ching Lin, Faisal Ahmed, Zhe Gan, Zicheng Liu, Yumao Lu, Lijuan Wang
Based on this model architecture, we show that video captioning can benefit significantly from more densely sampled video frames as opposed to previous successes with sparsely sampled video frames for video-and-language understanding tasks (e. g., video question answering).
no code implementations • 29 Sep 2021 • Tsun-An Hsieh, Cheng Yu, Ying Hung, Chung-Ching Lin, Yu Tsao
Accordingly, we propose Mutual Information Continuity-constrained Estimator (MICE).
no code implementations • 2 Jun 2021 • Chung-Ching Lin, Chase Puglisi, Veljko Boljanovic, Soumen Mohapatra, Han Yan, Erfan Ghaderi, Deukhyoun Heo, Danijela Cabric, Subhanshu Gupta
In this work, we demonstrate a true-time-delay (TTD) array with digitally reconfigurable delay elements enabling both fast beam-training at the receiver with wideband data communications.
no code implementations • ICLR 2021 • Bowen Pan, Rameswar Panda, Camilo Fosco, Chung-Ching Lin, Alex Andonian, Yue Meng, Kate Saenko, Aude Oliva, Rogerio Feris
An inherent property of real-world videos is the high correlation of information across frames which can translate into redundancy in either temporal or spatial feature maps of the models, or both.
no code implementations • ICLR 2021 • Yue Meng, Rameswar Panda, Chung-Ching Lin, Prasanna Sattigeri, Leonid Karlinsky, Kate Saenko, Aude Oliva, Rogerio Feris
Temporal modelling is the key for efficient video action recognition.
1 code implementation • ECCV 2020 • Yue Meng, Chung-Ching Lin, Rameswar Panda, Prasanna Sattigeri, Leonid Karlinsky, Aude Oliva, Kate Saenko, Rogerio Feris
Specifically, given a video frame, a policy network is used to decide what input resolution should be used for processing by the action recognition model, with the goal of improving both accuracy and efficiency.
no code implementations • 17 Jul 2020 • Veljko Boljanovic, Han Yan, Chung-Ching Lin, Soumen Mohapatra, Deukhyoun Heo, Subhanshu Gupta, Danijela Cabric
We also propose a suitable algorithm that requires a single pilot to achieve high-accuracy estimation of angle of arrival.
no code implementations • CVPR 2020 • Chung-Ching Lin, Ying Hung, Rogerio Feris, Linglin He
We propose a modified variational autoencoder (VAE) architecture built on top of Mask R-CNN for instance-level video segmentation and tracking.
no code implementations • CVPR 2018 • Chung-Ching Lin, Ying Hung
This paper presents a prior-less method for tracking and clustering an unknown number of human faces and maintaining their individual identities in unconstrained videos.
1 code implementation • 30 May 2018 • Noel C. F. Codella, Chung-Ching Lin, Allan Halpern, Michael Hind, Rogerio Feris, John R. Smith
Quantitative relevance of results, according to non-expert similarity, as well as localized image regions, are also significantly improved.
no code implementations • 26 Aug 2017 • Karthikeyan Natesan Ramamurthy, Chung-Ching Lin, Aleksandr Aravkin, Sharath Pankanti, Raphael Viguier
The runtime of our implementation scales linearly with the number of observed points.
no code implementations • CVPR 2015 • Chung-Ching Lin, Sharathchandra U. Pankanti, Karthikeyan Natesan Ramamurthy, Aleksandr Y. Aravkin
Computing the warp is fully automated and uses a combination of local homography and global similarity transformations, both of which are estimated with respect to the target.