no code implementations • 13 Dec 2023 • Xijun Wang, Junbang Liang, Chun-Kai Wang, Kenan Deng, Yu Lou, Ming Lin, Shan Yang
In this work, we propose an efficient Video-Language Alignment (ViLA) network.
Ranked #1 on Video Question Answering on STAR Benchmark
no code implementations • 6 Oct 2023 • Muhammad Osama Khan, Junbang Liang, Chun-Kai Wang, Shan Yang, Yu Lou
Furthermore, via experiments on the NYUv2 and IBims-1 datasets, we demonstrate that these enhanced representations translate to performance improvements in both the in-distribution and out-of-distribution settings.
Ranked #11 on Monocular Depth Estimation on NYU-Depth V2
no code implementations • 14 Feb 2023 • Erich Liang, Kenan Deng, Xi Zhang, Chun-Kai Wang
Recent advances in neural implicit surfaces for multi-view 3D reconstruction primarily focus on improving large-scale surface reconstruction accuracy, but often produce over-smoothed geometries that lack fine surface details.
no code implementations • 1 Oct 2021 • Xi Zhang, Chun-Kai Wang, Kenan Deng, Tomas Yago-Vicente, Himanshu Arora
In addition to using learnt robust features, our approach learns an additional ranking function to estimate the final layout instead of using optimization.