1 code implementation • EMNLP (BlackboxNLP) 2021 • Radina Dobreva, Frank Keller
Pre-trained vision-and-language models have achieved impressive results on a variety of tasks, including ones that require complex reasoning beyond object recognition.
no code implementations • 26 Nov 2021 • Arushi Goel, Basura Fernando, Frank Keller, Hakan Bilen
Scene graph generation (SGG) aims to capture a wide variety of interactions between pairs of objects, which is essential for full scene understanding.
no code implementations • 16 Nov 2021 • Pinelopi Papalampidi, Frank Keller, Mirella Lapata
Movie trailers perform multiple functions: they introduce viewers to the story, convey the mood and artistic style of the film, and encourage audiences to see the movie.
1 code implementation • 14 Sep 2021 • David Wilmot, Frank Keller
Recent language models can generate interesting and grammatically correct text in story generation but often lack plot development and long-term coherence.
1 code implementation • EMNLP 2021 • David Wilmot, Frank Keller
Measuring event salience is essential in the understanding of stories.
no code implementations • 27 Jul 2021 • Shreyank N Gowda, Laura Sevilla-Lara, Kiyoon Kim, Frank Keller, Marcus Rohrbach
We benchmark several recent approaches on the proposed True Zero-Shot (TruZe) Split for UCF101 and HMDB51, with zero-shot and generalized zero-shot evaluation.
no code implementations • 18 Jan 2021 • Shreyank N Gowda, Laura Sevilla-Lara, Frank Keller, Marcus Rohrbach
The problem can be seen as learning a function which generalizes well to instances of unseen classes without losing discrimination between classes.
1 code implementation • 14 Dec 2020 • Pinelopi Papalampidi, Frank Keller, Mirella Lapata
We summarize full-length movies by creating shorter videos containing their most informative scenes.
no code implementations • Asian Chapter of the Association for Computational Linguistics 2020 • Bowen Li, Taeuk Kim, Reinald Kim Amplayo, Frank Keller
Here, we propose a novel fully unsupervised parsing approach that extracts constituency trees from PLM attention heads.
1 code implementation • ACL 2020 • David Wilmot, Frank Keller
Suspense is a crucial ingredient of narrative fiction, engaging readers and making stories compelling.
2 code implementations • ACL 2020 • Pinelopi Papalampidi, Frank Keller, Lea Frermann, Mirella Lapata
Most general-purpose extractive summarization models are trained on news articles, which are short and present all important information upfront.
no code implementations • IJCNLP 2019 • Pinelopi Papalampidi, Frank Keller, Mirella Lapata
According to screenwriting theory, turning points (e.g., change of plans, major setback, climax) are crucial narrative moments within a screenplay: they define the plot structure, determine its progression and segment the screenplay into thematic units (e.g., setup, complications, aftermath).
1 code implementation • ACL 2019 • Bowen Li, Lili Mou, Frank Keller
In our work, we propose an imitation learning approach to unsupervised parsing, where we transfer the syntactic knowledge induced by the PRPN to a Tree-LSTM model with discrete parsing actions.
1 code implementation • NAACL 2019 • Spandana Gella, Desmond Elliott, Frank Keller
We extend this line of work to the more challenging task of cross-lingual verb sense disambiguation, introducing the MultiSense dataset of 9,504 images annotated with English, German, and Spanish verbs.
no code implementations • 2 Feb 2019 • Michael Hahn, Frank Keller, Yonatan Bisk, Yonatan Belinkov
Also, transpositions are more difficult than misspellings, and a high error rate increases difficulty for all words, including correct ones.
no code implementations • 14 Nov 2018 • Bowen Li, Jianpeng Cheng, Yang Liu, Frank Keller
Transition-based models enable faster inference with $O(n)$ time complexity, but their performance still lags behind.
no code implementations • 31 Jul 2018 • Michael Hahn, Frank Keller
We propose a neural architecture that combines an attention module (deciding whether to skip words) and a task module (memorizing the input).
no code implementations • NAACL 2018 • Spandana Gella, Frank Keller
Recent research in language and vision has developed models for predicting and disambiguating verbs from images.
no code implementations • ICCV 2017 • Dim P. Papadopoulos, Jasper R. R. Uijlings, Frank Keller, Vittorio Ferrari
We crowd-source extreme point annotations for PASCAL VOC 2007 and 2012 and show that (1) annotation time is only 7s per box, 5x faster than the traditional way of drawing boxes [62]; (2) the quality of the boxes is as good as the original ground-truth drawn the traditional way; (3) detectors trained on our annotations are as accurate as those trained on the original ground-truth.
no code implementations • EMNLP 2017 • Spandana Gella, Rico Sennrich, Frank Keller, Mirella Lapata
In this paper we propose a model to learn multimodal multilingual representations for matching images and sentences in different languages, with the aim of advancing multilingual versions of image search and image understanding.
no code implementations • ACL 2017 • Spandana Gella, Frank Keller
A large amount of recent research has focused on tasks that combine language and vision, resulting in a proliferation of datasets and methods.
no code implementations • CVPR 2017 • Dim P. Papadopoulos, Jasper R. R. Uijlings, Frank Keller, Vittorio Ferrari
Training object class detectors typically requires a large set of images with objects annotated by bounding boxes.
no code implementations • COLING 2016 • Maria Barrett, Frank Keller, Anders Søgaard
Several recent studies have shown that eye movements during reading provide information about grammatical and syntactic processing, which can assist the induction of NLP models.
no code implementations • EMNLP 2016 • Michael Hahn, Frank Keller
When humans read text, they fixate some words and skip others.
1 code implementation • NAACL 2016 • Spandana Gella, Mirella Lapata, Frank Keller
We introduce a new task, visual sense disambiguation for verbs: given an image and a verb, assign the correct sense of the verb, i.e., the one that describes the action depicted in the image.
1 code implementation • CVPR 2016 • Dim P. Papadopoulos, Jasper R. R. Uijlings, Frank Keller, Vittorio Ferrari
Training object class detectors typically requires a large set of images in which objects are annotated by bounding-boxes.
no code implementations • 15 Jan 2016 • Raffaella Bernardi, Ruket Cakici, Desmond Elliott, Aykut Erdem, Erkut Erdem, Nazli Ikizler-Cinbis, Frank Keller, Adrian Muscat, Barbara Plank
Automatic description generation from natural images is a challenging problem that has recently received a large amount of interest from the computer vision and natural language processing communities.
no code implementations • TACL 2013 • Federico Sangati, Frank Keller
In this paper, we present the first incremental parser for Tree Substitution Grammar (TSG).