no code implementations • CVPR 2016 • Christopher Thomas, Adriana Kovashka
To explore the feasibility of current computer vision techniques to address this problem, we created a new dataset of over 180,000 images taken by 41 well-known photographers.
no code implementations • 1 Jun 2016 • Christopher Thomas
In this technical report, we present our publicly downloadable implementation of the SALICON saliency model.
no code implementations • CVPR 2017 • Zaeem Hussain, Mingda Zhang, Xiaozhong Zhang, Keren Ye, Christopher Thomas, Zuha Agha, Nathan Ong, Adriana Kovashka
There is more to images than their objective physical content: for example, advertisements are created to persuade a viewer to take a certain action.
no code implementations • 25 Jul 2018 • Christopher Thomas, Adriana Kovashka
We show how our model can be used to produce visually distinct faces which appear to be from a fixed ad topic category.
no code implementations • 28 Dec 2018 • Christopher Thomas, Adriana Kovashka
To do so, we introduce a complementary training modality constructed to be similar in artistic style to the target domain, and enforce that the network learns features that are invariant between the two training modalities.
1 code implementation • NeurIPS 2019 • Christopher Thomas, Adriana Kovashka
We collect a dataset of over one million unique images and associated news articles from left- and right-leaning news sources, and develop a method to predict the image's political leaning.
no code implementations • ECCV 2020 • Christopher Thomas, Adriana Kovashka
The abundance of multimodal data (e.g., social media posts) has inspired interest in cross-modal retrieval methods.
no code implementations • 15 Sep 2020 • Christopher Thomas, Thilo Womelsdorf
Oscillations in the local field potential (LFP) of the brain are key signatures of neural information processing.
no code implementations • 3 Dec 2020 • Christopher Thomas, Yale Song, Adriana Kovashka
We study the problem of animating images by transferring spatio-temporal visual effects (such as melting) from a collection of videos.
no code implementations • ACL 2021 • Yi Fung, Christopher Thomas, Revanth Gangi Reddy, Sandeep Polisetty, Heng Ji, Shih-Fu Chang, Kathleen McKeown, Mohit Bansal, Avi Sil
To defend against machine-generated fake news, an effective mechanism is urgently needed.
no code implementations • Findings (EMNLP) 2021 • Brian Chen, Xudong Lin, Christopher Thomas, Manling Li, Shoya Yoshida, Lovish Chum, Heng Ji, Shih-Fu Chang
We introduce the new task of Video MultiMedia Event Extraction (Video M2E2) and propose two novel components to build the first system towards this task.
1 code implementation • 29 Mar 2022 • Christopher Thomas, YiPeng Zhang, Shih-Fu Chang
In this paper, we propose an extension of this task, where the goal is to predict the logical relationship of fine-grained knowledge elements within a piece of text to an image.
no code implementations • 14 Jun 2022 • Hammad A. Ayyubi, Christopher Thomas, Lovish Chum, Rahul Lokesh, Long Chen, Yulei Niu, Xudong Lin, Xuande Feng, Jaywon Koo, Sounak Ray, Shih-Fu Chang
To support research on this task, we introduce the Multimodal Hierarchical Events (MultiHiEve) dataset.
no code implementations • IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2022 • Christopher Thomas, Adriana Kovashka
Existing cross-modal retrieval methods assume a straightforward relationship where images and text contain portrayals or mentions of the same objects.
Ranked #1 on Cross-Modal Retrieval on COCO 2014 (using extra training data)
1 code implementation • 22 Oct 2022 • Long Chen, Yulei Niu, Brian Chen, Xudong Lin, Guangxing Han, Christopher Thomas, Hammad Ayyubi, Heng Ji, Shih-Fu Chang
Specifically, given an article and a relevant video, WSAG aims to localize all "groundable" sentences to the video, and these sentences are possibly at different semantic scales.
no code implementations • 29 May 2023 • Mingyang Zhou, Yi R. Fung, Long Chen, Christopher Thomas, Heng Ji, Shih-Fu Chang
Building cross-modal intelligence that can understand charts and communicate the salient information hidden behind them is an appealing challenge in the vision and language (V+L) community.