1 code implementation • 7 Mar 2024 • Kaiwen Cai, Zhekai Duan, Gaowen Liu, Charles Fleming, Chris Xiaoxuan Lu
Recent advancements in Vision-Language (VL) models have sparked interest in their deployment on edge devices, yet challenges in handling diverse visual modalities, manual annotation, and computational constraints remain.
no code implementations • 31 May 2023 • Ruimin Gao, Hao Zou, Zhekai Duan
In computer vision, basic blocks are built around different matrix operations, and models based on each kind of block have achieved good results.
1 code implementation • 28 May 2023 • Haobo Yang, Wenyu Wang, Ze Cao, Zhekai Duan, Xuchen Liu
Our work presents a new lens to understand these models better by focusing on their handling of visual illusions, a complex interplay of perception and logic.