no code implementations • 24 Mar 2024 • Xiaoyu Zhu, Junwei Liang, Po-Yao Huang, Alex Hauptmann
The second is a Masked Consistency Learning module to learn class-discriminative representations.
1 code implementation • 19 May 2022 • Liangke Gui, Yingshan Chang, Qiuyuan Huang, Subhojit Som, Alex Hauptmann, Jianfeng Gao, Yonatan Bisk
Vision-Language Transformers can be learned without low-level human labels (e.g. class labels, bounding boxes, etc.).
1 code implementation • NAACL 2022 • Liangke Gui, Borui Wang, Qiuyuan Huang, Alex Hauptmann, Yonatan Bisk, Jianfeng Gao
The primary focus of recent work with large-scale transformers has been on optimizing the amount of information packed into the model's parameters.
no code implementations • 1 May 2021 • Xiangtan Lin, Pengzhen Ren, Yun Xiao, Xiaojun Chang, Alex Hauptmann
This paper surveys recent work on image-based and text-based person search from the perspective of challenges and solutions.
no code implementations • 17 Mar 2021 • Xiaojun Chang, Pengzhen Ren, Pengfei Xu, Zhihui Li, Xiaojiang Chen, Alex Hauptmann
For example, given an image, we want to not only detect and recognize objects in the image, but also know the relationship between objects (visual relationship detection), and generate a text description (image captioning) based on the image content.
1 code implementation • 28th International Joint Conference on Artificial Intelligence 2019 • Anurag Kumar, Ankit Shah, Alex Hauptmann, Bhiksha Raj
In the last couple of years, weakly labeled learning for sound events has turned out to be an exciting approach for audio event detection.