no code implementations • 18 Aug 2024 • Chaofan Tao, Gukyeong Kwon, Varad Gunjal, Hao Yang, Zhaowei Cai, Yonatan Dukler, Ashwin Swaminathan, R. Manmatha, Colin Jon Taylor, Stefano Soatto
The benchmark is constructed by generating negative texts with incorrect action descriptions for a given video, and the model is expected to pair the positive text with its corresponding video.
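A minimal sketch of such a pairing evaluation, assuming hypothetical embed_video/embed_text encoders (stubbed with random features here) and cosine similarity as the matching score; none of these names come from the paper:

```python
# Sketch of the video-text pairing protocol; encoders are random stubs.
import numpy as np

rng = np.random.default_rng(0)

def embed_video(video_id):   # hypothetical encoder stub
    return rng.normal(size=128)

def embed_text(caption):     # hypothetical encoder stub
    return rng.normal(size=128)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def correctly_paired(video_id, positive, negatives):
    """Pass if the positive caption outscores every
    action-corrupted negative caption for this video."""
    v = embed_video(video_id)
    pos = cosine(v, embed_text(positive))
    return all(pos > cosine(v, embed_text(n)) for n in negatives)

print(correctly_paired("vid_0",
                       "a person opens the door",
                       ["a person closes the door",
                        "a person kicks the door"]))
```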
no code implementations • 30 May 2023 • Xingyu Fu, Sheng Zhang, Gukyeong Kwon, Pramuditha Perera, Henghui Zhu, Yuhao Zhang, Alexander Hanbo Li, William Yang Wang, Zhiguo Wang, Vittorio Castelli, Patrick Ng, Dan Roth, Bing Xiang
The open-ended Visual Question Answering (VQA) task requires AI models to jointly reason over visual and natural language inputs using world knowledge.
no code implementations • 3 Aug 2022 • Gukyeong Kwon, Zhaowei Cai, Avinash Ravichandran, Erhan Bas, Rahul Bhotika, Stefano Soatto
Instead of developing masked language modeling (MLM) and masked image modeling (MIM) independently, we propose joint masked vision and language modeling, where the masked signal of one modality is reconstructed with the help of the other modality.
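A toy sketch of the joint-masking idea, where masked image patches are reconstructed by a transformer that also sees the unmasked text; all shapes and module sizes are illustrative, not the paper's configuration:

```python
# Joint masked modeling sketch: masked patches reconstructed with text help.
import torch
import torch.nn as nn

d = 64
patch_proj = nn.Linear(48, d)              # 4x4x3 pixel patches -> embeddings
tok_embed = nn.Embedding(1000, d)          # toy vocabulary
joint = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True),
    num_layers=2)
img_head = nn.Linear(d, 48)                # reconstruct raw patch pixels

patches = torch.randn(2, 16, 48)           # (batch, patches, pixels)
tokens = torch.randint(0, 1000, (2, 8))    # (batch, tokens)

# Mask some image patches; intact text tokens provide the cross-modal help.
img_mask = torch.zeros(2, 16, dtype=torch.bool)
img_mask[:, :4] = True
x_img = patch_proj(patches.masked_fill(img_mask.unsqueeze(-1), 0.0))
h = joint(torch.cat([x_img, tok_embed(tokens)], dim=1))

# MIM loss on the masked patches only, conditioned on the text.
recon = img_head(h[:, :16])
mim_loss = ((recon - patches) ** 2)[img_mask].mean()
print(mim_loss.item())
```

The symmetric MLM direction, reconstructing masked tokens with image help, follows the same pattern with a token-prediction head.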
no code implementations • 23 Jun 2022 • Yash-yee Logan, Ryan Benkert, Ahmad Mustafa, Gukyeong Kwon, Ghassan AlRegib
For this purpose, we propose a framework that incorporates clinical insights into the sample selection process of active learning and can be combined with existing algorithms.
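One way to read this is as re-weighting a standard acquisition function with a clinically informed prior; the weighting below is an illustrative stand-in, not the paper's exact formulation:

```python
# Sketch: inject a clinical prior into uncertainty-based active learning.
import numpy as np

rng = np.random.default_rng(1)
probs = rng.dirichlet(np.ones(3), size=100)     # model predictions on the pool
clinical_weight = rng.uniform(0.5, 1.5, 100)    # e.g., a disease-stage prior

entropy = -(probs * np.log(probs)).sum(axis=1)  # standard uncertainty score
score = clinical_weight * entropy               # insight-adjusted acquisition

query = np.argsort(score)[-10:]                 # send top-10 for labeling
print(query)
```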
no code implementations • 12 Apr 2022 • Zhaowei Cai, Gukyeong Kwon, Avinash Ravichandran, Erhan Bas, Zhuowen Tu, Rahul Bhotika, Stefano Soatto
In this paper, we study the challenging instance-wise vision-language tasks, where the free-form language is required to align with the objects instead of the whole image.
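A toy sketch of what instance-level alignment means in practice: score per-object region features against a free-form phrase embedding rather than one global image embedding (the features here are random stand-ins):

```python
# Instance-wise alignment sketch: match a phrase to detected regions.
import torch
import torch.nn.functional as F

regions = torch.randn(5, 64)   # per-object features from a detector
phrase = torch.randn(64)       # free-form language embedding

scores = F.cosine_similarity(regions, phrase.unsqueeze(0))
best = scores.argmax().item()  # region best aligned with the phrase
print(best, scores[best].item())
```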
1 code implementation • 8 Mar 2022 • Gukyeong Kwon, Ghassan AlRegib
Moreover, the two-stream autoencoder serves as a unified framework for the gating model and the unseen expert, which makes the proposed method computationally efficient.
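A toy sketch of this dual role, assuming reconstruction error gates between a seen-class expert and the unseen route; the threshold and shapes are illustrative only:

```python
# One autoencoder as both gate (via reconstruction error) and unseen expert.
import torch
import torch.nn as nn

enc = nn.Linear(32, 8)
dec = nn.Linear(8, 32)
seen_clf = nn.Linear(32, 5)                 # expert for the seen classes

def predict(x, tau=1.0):
    z = enc(x)
    err = ((dec(z) - x) ** 2).mean()
    if err < tau:                           # familiar input -> seen expert
        return "seen", seen_clf(x).argmax().item()
    # large mismatch -> route to the unseen expert, here the same
    # autoencoder's latent code (e.g., matched against class attributes)
    return "unseen", z.detach()

print(predict(torch.randn(32)))
```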
no code implementations • 13 Aug 2020 • Gukyeong Kwon, Mohit Prabhushankar, Dogancan Temel, Ghassan AlRegib
To articulate the significance of the model perspective in novelty detection, we utilize backpropagated gradients.
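A minimal sketch of the idea on an autoencoder: backpropagate the reconstruction loss and use the gradient magnitude as the novelty score, since novel inputs demand larger model updates (sizes are toy):

```python
# Gradient-based novelty score: larger required update -> more novel.
import torch
import torch.nn as nn

ae = nn.Sequential(nn.Linear(16, 4), nn.ReLU(), nn.Linear(4, 16))

def novelty_score(x):
    ae.zero_grad()
    loss = ((ae(x) - x) ** 2).mean()
    loss.backward()
    return sum(p.grad.norm() for p in ae.parameters()).item()

print(novelty_score(torch.randn(16)))
```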
3 code implementations • 1 Aug 2020 • Mohit Prabhushankar, Gukyeong Kwon, Dogancan Temel, Ghassan AlRegib
Current modes of visual explanations answer questions of the form 'Why P?'.
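For context, the 'Why P?' form corresponds to gradient-weighted saliency in the style of Grad-CAM; a toy version on a random network, purely for illustration:

```python
# Toy Grad-CAM-style "Why P?" map: weight feature maps by the gradient
# of the top class score. Network and input are random placeholders.
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 8, 3, padding=1)
head = nn.Linear(8, 10)

x = torch.randn(1, 3, 16, 16)
fmap = conv(x)
fmap.retain_grad()
logits = head(fmap.mean(dim=(2, 3)))        # global-average-pooled logits
logits[0, logits.argmax()].backward()       # "Why P?" for predicted class P

weights = fmap.grad.mean(dim=(2, 3), keepdim=True)
cam = torch.relu((weights * fmap).sum(dim=1))  # class activation map
print(cam.shape)
```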
2 code implementations • ECCV 2020 • Gukyeong Kwon, Mohit Prabhushankar, Dogancan Temel, Ghassan AlRegib
Anomalies require more drastic model updates to fully represent them compared to normal data.
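A sketch of that intuition: score a test sample by how far its backpropagated gradient deviates from the average gradient direction observed on normal training data (the stand-in below uses a single reference gradient):

```python
# Anomaly score from gradient deviation: drastic updates -> anomalous.
import torch
import torch.nn as nn
import torch.nn.functional as F

ae = nn.Sequential(nn.Linear(16, 4), nn.ReLU(), nn.Linear(4, 16))

def flat_grad(x):
    ae.zero_grad()
    ((ae(x) - x) ** 2).mean().backward()
    return torch.cat([p.grad.flatten() for p in ae.parameters()])

avg_grad = flat_grad(torch.randn(16))   # stand-in for the training average
score = 1 - F.cosine_similarity(flat_grad(torch.randn(16)), avg_grad, dim=0)
print(score.item())                     # higher -> more anomalous
```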
no code implementations • ICLR 2020 • Gukyeong Kwon, Mohit Prabhushankar, Dogancan Temel, Ghassan AlRegib
To complement the learned information from activation-based representation, we propose utilizing a gradient-based representation that explicitly focuses on missing information.
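A minimal sketch of the complement: concatenate the latent activations (what the model captured) with backpropagated gradients (what it failed to capture) into one representation; the shapes are illustrative:

```python
# Joint activation + gradient representation sketch.
import torch
import torch.nn as nn

enc = nn.Linear(16, 4)
dec = nn.Linear(4, 16)

x = torch.randn(16)
z = enc(x)                                   # activation-based part
loss = ((dec(z) - x) ** 2).mean()
grads = torch.autograd.grad(loss, dec.parameters())
rep = torch.cat([z.detach()] + [g.flatten() for g in grads])
print(rep.shape)                             # combined representation
```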
no code implementations • 25 Sep 2019 • Mohit Prabhushankar, Gukyeong Kwon, Dogancan Temel, Ghassan AlRegib
Such a positioning scheme is based on a data point’s second-order property.
2 code implementations • 27 Aug 2019 • Gukyeong Kwon, Mohit Prabhushankar, Dogancan Temel, Ghassan AlRegib
In this paper, we utilize weight gradients from backpropagation to characterize the representation space learned by deep learning algorithms.
no code implementations • 17 Feb 2019 • Mohit Prabhushankar, Gukyeong Kwon, Dogancan Temel, Ghassan AlRegib
In this paper, we generate and control semantically interpretable filters that are directly learned from natural images in an unsupervised fashion.
2 code implementations • 12 Dec 2018 • Mohammed A. Aabed, Gukyeong Kwon, Ghassan AlRegib
This is a full-reference tempospatial approach that considers both the temporal and spatial power spectral density (PSD) characteristics of the video.
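A minimal sketch of comparing spatial and temporal PSDs between a reference and a distorted clip; the pooling into a single score is illustrative, not the paper's exact measure:

```python
# Tempospatial PSD comparison sketch (full-reference).
import numpy as np

rng = np.random.default_rng(2)
ref = rng.normal(size=(8, 32, 32))             # (frames, H, W) reference
dst = ref + 0.1 * rng.normal(size=ref.shape)   # distorted clip

def spatial_psd(clip):
    return np.abs(np.fft.fft2(clip, axes=(1, 2))) ** 2

def temporal_psd(clip):
    return np.abs(np.fft.fft(clip, axis=0)) ** 2

score = (np.mean(np.abs(spatial_psd(ref) - spatial_psd(dst))) +
         np.mean(np.abs(temporal_psd(ref) - temporal_psd(dst))))
print(f"tempospatial PSD difference: {score:.4f}")
```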
1 code implementation • 7 Dec 2017 • Dogancan Temel, Gukyeong Kwon, Mohit Prabhushankar, Ghassan AlRegib
We benchmark the performance of existing solutions in real-world scenarios and analyze the performance variation with respect to challenging conditions.