1 code implementation • NeurIPS 2019 • Christopher Thomas, Adriana Kovashka
We collect a dataset of over one million unique images and associated news articles from left- and right-leaning news sources, and develop a method to predict the image's political leaning.
1 code implementation • 22 Oct 2022 • Long Chen, Yulei Niu, Brian Chen, Xudong Lin, Guangxing Han, Christopher Thomas, Hammad Ayyubi, Heng Ji, Shih-Fu Chang
Specifically, given an article and a relevant video, WSAG aims to localize all "groundable" sentences in the video, and these sentences may be at different semantic scales.
1 code implementation • 29 Mar 2022 • Christopher Thomas, YiPeng Zhang, Shih-Fu Chang
In this paper, we propose an extension of this task, where the goal is to predict the logical relationship of fine-grained knowledge elements within a piece of text to an image.
no code implementations • CVPR 2017 • Zaeem Hussain, Mingda Zhang, Xiaozhong Zhang, Keren Ye, Christopher Thomas, Zuha Agha, Nathan Ong, Adriana Kovashka
There is more to images than their objective physical content: for example, advertisements are created to persuade a viewer to take a certain action.
no code implementations • 1 Jun 2016 • Christopher Thomas
In this technical report, we present our publicly downloadable implementation of the SALICON saliency model.
no code implementations • CVPR 2016 • Christopher Thomas, Adriana Kovashka
To explore the feasibility of current computer vision techniques to address this problem, we created a new dataset of over 180,000 images taken by 41 well-known photographers.
no code implementations • 25 Jul 2018 • Christopher Thomas, Adriana Kovashka
We show how our model can be used to produce visually distinct faces which appear to be from a fixed ad topic category.
no code implementations • 28 Dec 2018 • Christopher Thomas, Adriana Kovashka
To do so, we introduce a complementary training modality constructed to be similar in artistic style to the target domain, and enforce that the network learns features that are invariant between the two training modalities.
no code implementations • ECCV 2020 • Christopher Thomas, Adriana Kovashka
The abundance of multimodal data (e.g., social media posts) has inspired interest in cross-modal retrieval methods.
no code implementations • 15 Sep 2020 • Christopher Thomas, Thilo Womelsdorf
Oscillations in the local field potential (LFP) of the brain are key signatures of neural information processing.
no code implementations • 3 Dec 2020 • Christopher Thomas, Yale Song, Adriana Kovashka
We study the problem of animating images by transferring spatio-temporal visual effects (such as melting) from a collection of videos.
no code implementations • ACL 2021 • Yi Fung, Christopher Thomas, Revanth Gangi Reddy, Sandeep Polisetty, Heng Ji, Shih-Fu Chang, Kathleen McKeown, Mohit Bansal, Avi Sil
To defend against machine-generated fake news, an effective mechanism is urgently needed.
no code implementations • Findings (EMNLP) 2021 • Brian Chen, Xudong Lin, Christopher Thomas, Manling Li, Shoya Yoshida, Lovish Chum, Heng Ji, Shih-Fu Chang
We introduce the new task of Video MultiMedia Event Extraction (Video M2E2) and propose two novel components to build the first system towards this task.
no code implementations • 14 Jun 2022 • Hammad A. Ayyubi, Christopher Thomas, Lovish Chum, Rahul Lokesh, Long Chen, Yulei Niu, Xudong Lin, Xuande Feng, Jaywon Koo, Sounak Ray, Shih-Fu Chang
To support research on this task, we introduce the Multimodal Hierarchical Events (MultiHiEve) dataset.
no code implementations • IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2022 • Christopher Thomas, Adriana Kovashka
Existing cross-modal retrieval methods assume a straightforward relationship where images and text contain portrayals or mentions of the same objects.
Ranked #1 on Cross-Modal Retrieval on COCO 2014 (using extra training data)
no code implementations • 29 May 2023 • Mingyang Zhou, Yi R. Fung, Long Chen, Christopher Thomas, Heng Ji, Shih-Fu Chang
Building cross-modal intelligence that can understand charts and communicate the salient information hidden behind them is an appealing challenge in the vision and language (V+L) community.