no code implementations • 14 Mar 2024 • Brandon McKinzie, Zhe Gan, Jean-Philippe Fauconnier, Sam Dodge, BoWen Zhang, Philipp Dufter, Dhruti Shah, Xianzhi Du, Futang Peng, Floris Weers, Anton Belyi, Haotian Zhang, Karanjeet Singh, Doug Kang, Ankur Jain, Hongyu Hè, Max Schwarzer, Tom Gunter, Xiang Kong, Aonan Zhang, Jianyu Wang, Chong Wang, Nan Du, Tao Lei, Sam Wiseman, Guoli Yin, Mark Lee, ZiRui Wang, Ruoming Pang, Peter Grasch, Alexander Toshev, Yinfei Yang
Through careful and comprehensive ablations of the image encoder, the vision language connector, and various pre-training data choices, we identified several crucial design lessons.
Ranked #42 on Visual Question Answering on MM-Vet
1 code implementation • 26 May 2021 • Chun-Ta Lu, Yun Zeng, Da-Cheng Juan, Yicheng Fan, Zhe Li, Jan Dlabal, Yi-Ting Chen, Arjun Gopalan, Allan Heydon, Chun-Sung Ferng, Reah Miyara, Ariel Fuxman, Futang Peng, Zhen Li, Tom Duerig, Andrew Tomkins
In this work, we propose CARLS, a novel framework for augmenting the capacity of existing deep learning frameworks by enabling multiple components -- model trainers, knowledge makers and knowledge banks -- to concertedly work together in an asynchronous fashion across hardware platforms.
no code implementations • 8 Mar 2020 • Yang Feng, Futang Peng, Xu Zhang, Wei Zhu, Shanfeng Zhang, Howard Zhou, Zhen Li, Tom Duerig, Shih-Fu Chang, Jiebo Luo
Therefore, we propose to distill the knowledge in multiple specialists into a universal embedding to solve this problem.
1 code implementation • 14 Feb 2019 • Da-Cheng Juan, Chun-Ta Lu, Zhen Li, Futang Peng, Aleksei Timofeev, Yi-Ting Chen, Yaxi Gao, Tom Duerig, Andrew Tomkins, Sujith Ravi
Learning image representations to capture fine-grained semantics has been a challenging and important task enabling many applications such as image search and clustering.
Ranked #11 on Image Classification on iNaturalist