Search Results for author: Alex Hauptmann

Found 5 papers, 3 papers with code

Training Vision-Language Transformers from Captions Alone

1 code implementation 19 May 2022 Liangke Gui, Qiuyuan Huang, Alex Hauptmann, Yonatan Bisk, Jianfeng Gao

We show that Vision-Language Transformers can be learned without human labels (e.g., class labels, bounding boxes, etc.).

KAT: A Knowledge Augmented Transformer for Vision-and-Language

1 code implementation NAACL 2022 Liangke Gui, Borui Wang, Qiuyuan Huang, Alex Hauptmann, Yonatan Bisk, Jianfeng Gao

The primary focus of recent work with large-scale transformers has been on optimizing the amount of information packed into the model's parameters.

Answer Generation, Retrieval, +2

Person Search Challenges and Solutions: A Survey

no code implementations 1 May 2021 Xiangtan Lin, Pengzhen Ren, Yun Xiao, Xiaojun Chang, Alex Hauptmann

This paper surveys recent work on image-based and text-based person search from the perspective of challenges and solutions.

Person Search, Text based Person Search

A Comprehensive Survey of Scene Graphs: Generation and Application

no code implementations 17 Mar 2021 Xiaojun Chang, Pengzhen Ren, Pengfei Xu, Zhihui Li, Xiaojiang Chen, Alex Hauptmann

For example, given an image, we want to not only detect and recognize objects in the image, but also know the relationship between objects (visual relationship detection), and generate a text description (image captioning) based on the image content.

Image Captioning, Question Answering, +4

Learning Sound Events From Webly Labeled Data

1 code implementation 25 Nov 2018 Anurag Kumar, Ankit Shah, Alex Hauptmann, Bhiksha Raj

In the last couple of years, weakly labeled learning for sound events has turned out to be an exciting approach for audio event detection.

Event Detection, Sound Event Detection, +1
