Search Results for author: Alex Hauptmann

Found 6 papers, 3 papers with code

Training Vision-Language Transformers from Captions

1 code implementation • 19 May 2022 • Liangke Gui, Yingshan Chang, Qiuyuan Huang, Subhojit Som, Alex Hauptmann, Jianfeng Gao, Yonatan Bisk

Vision-Language Transformers can be learned without low-level human labels (e.g., class labels, bounding boxes, etc.).

KAT: A Knowledge Augmented Transformer for Vision-and-Language

1 code implementation • NAACL 2022 • Liangke Gui, Borui Wang, Qiuyuan Huang, Alex Hauptmann, Yonatan Bisk, Jianfeng Gao

The primary focus of recent work with large-scale transformers has been on optimizing the amount of information packed into the model's parameters.

Answer Generation • Retrieval • +1

Person Search Challenges and Solutions: A Survey

no code implementations • 1 May 2021 • Xiangtan Lin, Pengzhen Ren, Yun Xiao, Xiaojun Chang, Alex Hauptmann

This paper surveyed the recent works on image-based and text-based person search from the perspective of challenges and solutions.

Person Search • Text based Person Search

A Comprehensive Survey of Scene Graphs: Generation and Application

no code implementations • 17 Mar 2021 • Xiaojun Chang, Pengzhen Ren, Pengfei Xu, Zhihui Li, Xiaojiang Chen, Alex Hauptmann

For example, given an image, we want to not only detect and recognize objects in the image, but also know the relationship between objects (visual relationship detection), and generate a text description (image captioning) based on the image content.

Image Captioning • Question Answering • +4
