2 code implementations • 5 Dec 2024 • Zhijian Liu, Ligeng Zhu, Baifeng Shi, Zhuoyang Zhang, Yuming Lou, Shang Yang, Haocheng Xi, Shiyi Cao, Yuxian Gu, Dacheng Li, Xiuyu Li, Yunhao Fang, Yukang Chen, Cheng-Yu Hsieh, De-An Huang, An-Chieh Cheng, Vishwesh Nath, Jinyi Hu, Sifei Liu, Ranjay Krishna, Daguang Xu, Xiaolong Wang, Pavlo Molchanov, Jan Kautz, Hongxu Yin, Song Han, Yao Lu
This paper introduces NVILA, a family of open VLMs designed to optimize both efficiency and accuracy.
Ranked #6 on
Video Question Answering
on NExT-QA
no code implementations • 5 Dec 2024 • An-Chieh Cheng, Yandong Ji, Zhaojing Yang, Zaitian Gongye, Xueyan Zou, Jan Kautz, Erdem Biyik, Hongxu Yin, Sifei Liu, Xiaolong Wang
This paper proposes to solve the problem of Vision-and-Language Navigation with legged robots, which not only provides a flexible way for humans to command but also allows the robot to navigate through more challenging and cluttered scenes.
1 code implementation • 3 Jun 2024 • An-Chieh Cheng, Hongxu Yin, Yang Fu, Qiushan Guo, Ruihan Yang, Jan Kautz, Xiaolong Wang, Sifei Liu
Vision Language Models (VLMs) have demonstrated remarkable performance in 2D vision and language tasks.
no code implementations • 4 May 2023 • An-Chieh Cheng, Xueting Li, Sifei Liu, Xiaolong Wang
This allows the texture to be disentangled from the underlying shape and transferable to other shapes that share the same UV space, i. e., from the same category.
1 code implementation • 5 Apr 2022 • An-Chieh Cheng, Xueting Li, Sifei Liu, Min Sun, Ming-Hsuan Yang
With the capacity of modeling long-range dependencies in sequential data, transformers have shown remarkable performances in a variety of generative tasks such as image, audio, and text generation.
no code implementations • NeurIPS 2021 • An-Chieh Cheng, Xueting Li, Min Sun, Ming-Hsuan Yang, Sifei Liu
We propose a canonical point autoencoder (CPAE) that predicts dense correspondences between 3D shapes of the same category.
no code implementations • NeurIPS 2020 • Hung-Jen Chen, An-Chieh Cheng, Da-Cheng Juan, Wei Wei, Min Sun
To preserve the knowledge we learn from previous instances, we proposed a method to protect the path by restricting the gradient updates of one instance from overriding past updates calculated from previous instances if these instances are not similar.
2 code implementations • 26 Nov 2018 • An-Chieh Cheng, Chieh Hubert Lin, Da-Cheng Juan, Wei Wei, Min Sun
Conventional Neural Architecture Search (NAS) aims at finding a single architecture that achieves the best performance, which usually optimizes task related learning objectives such as accuracy.
no code implementations • 9 Sep 2018 • Hsuan-Kung Yang, An-Chieh Cheng, Kuan-Wei Ho, Tsu-Jui Fu, Chun-Yi Lee
The additional depth prediction path supplements the relationship prediction model in a way that bounding boxes or segmentation masks are unable to deliver.
no code implementations • 29 Aug 2018 • An-Chieh Cheng, Jin-Dong Dong, Chi-Hung Hsu, Shu-Huan Chang, Min Sun, Shih-Chieh Chang, Jia-Yu Pan, Yu-Ting Chen, Wei Wei, Da-Cheng Juan
Recent breakthroughs in Neural Architectural Search (NAS) have achieved state-of-the-art performance in many tasks such as image classification and language understanding.
no code implementations • ECCV 2018 • Jin-Dong Dong, An-Chieh Cheng, Da-Cheng Juan, Wei Wei, Min Sun
We propose DPP-Net: Device-aware Progressive Search for Pareto-optimal Neural Architectures, optimizing for both device-related (e. g., inference time and memory usage) and device-agnostic (e. g., accuracy and model size) objectives.