no code implementations • 23 Mar 2023 • Relja Arandjelović, Alex Andonian, Arthur Mensch, Olivier J. Hénaff, Jean-Baptiste Alayrac, Andrew Zisserman
The core problem in zero-shot open vocabulary detection is how to align visual and text features, so that the detector performs well on unseen classes.
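A minimal sketch of the alignment idea this entry describes: classify detector region features by cosine similarity against text embeddings of class names, so new classes only require new text embeddings. The dimensions, temperature, and random features are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn.functional as F

num_regions, num_classes, dim = 10, 5, 512
region_feats = torch.randn(num_regions, dim)   # detector region embeddings (placeholder)
text_feats = torch.randn(num_classes, dim)     # text embeddings of class names (placeholder)

# L2-normalize both modalities so alignment reduces to cosine similarity.
region_feats = F.normalize(region_feats, dim=-1)
text_feats = F.normalize(text_feats, dim=-1)

# Each region is scored against every class-name embedding; unseen classes
# only need new text embeddings, no detector retraining.
logits = region_feats @ text_feats.t() / 0.07  # temperature 0.07 is an assumption
probs = logits.softmax(dim=-1)                 # shape (10, 5)
```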
1 code implementation • 13 Oct 2022 • Kevin Meng, Arnab Sen Sharma, Alex Andonian, Yonatan Belinkov, David Bau
Recent work has shown exciting promise in updating large language models with new memories, so as to replace obsolete information or add specialized knowledge.
no code implementations • 1 Jun 2022 • Camilo Fosco, Emilie Josephs, Alex Andonian, Allen Lee, Xi Wang, Aude Oliva
Our approach learns to generate attention maps of video artifacts, trained in a semi-supervised manner on human annotations.
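A hedged sketch of one way a semi-supervised attention objective can work: the supervised loss is applied only to frames that carry human annotations. The head architecture, shapes, and loss choice below are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

head = nn.Conv2d(256, 1, kernel_size=1)        # maps frame features to a 1-channel attention map

feats = torch.randn(8, 256, 14, 14)            # batch of frame features (placeholder)
pred = torch.sigmoid(head(feats))              # predicted attention maps in [0, 1]

target = torch.rand(8, 1, 14, 14)              # human-annotated artifact maps (placeholder)
annotated = torch.tensor([1, 0, 1, 0, 0, 1, 0, 0], dtype=torch.bool)

# Only annotated frames contribute to the supervised term; unannotated
# frames would be handled by an unsupervised or consistency term (not shown).
if annotated.any():
    loss = F.binary_cross_entropy(pred[annotated], target[annotated])
```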
no code implementations • CVPR 2022 • Alex Andonian, Shixing Chen, Raffay Hamid
The vision-language learning objective of CLIP does not effectively account for the noisy many-to-many correspondences found in web-harvested image captioning datasets, which contributes to its compute and data inefficiency (a sketch of the standard objective follows the leaderboard line below).
Ranked #95 on Image Classification on ObjectNet (using extra training data)
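As context for the entry above, here is a sketch of the standard CLIP objective being criticized: symmetric InfoNCE with one-hot targets, which assumes each image matches exactly one caption. That hard one-to-one pairing is what breaks down under noisy many-to-many web data.

```python
import torch
import torch.nn.functional as F

batch, dim = 4, 512
img = F.normalize(torch.randn(batch, dim), dim=-1)  # image embeddings (placeholder)
txt = F.normalize(torch.randn(batch, dim), dim=-1)  # caption embeddings (placeholder)

logits = img @ txt.t() / 0.07                  # temperature 0.07 as in CLIP
targets = torch.arange(batch)                  # assumes only diagonal (i, i) pairs match

# Symmetric image-to-text and text-to-image cross-entropy.
loss = 0.5 * (F.cross_entropy(logits, targets) +
              F.cross_entropy(logits.t(), targets))
```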
1 code implementation • 10 Feb 2022 • Kevin Meng, David Bau, Alex Andonian, Yonatan Belinkov
To test our hypothesis that these computations correspond to factual association recall, we modify feed-forward weights to update specific factual associations using Rank-One Model Editing (ROME).
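A minimal sketch of what a rank-one weight edit looks like mechanically: a single outer-product update to a feed-forward projection. How the key and value vectors are derived from a target factual association is the substance of ROME and is not shown here; the dimensions and random vectors are placeholders.

```python
import torch

d_in, d_out = 64, 256                          # illustrative MLP projection sizes
W = torch.randn(d_out, d_in)                   # feed-forward projection weight

v = torch.randn(d_in)                          # key: subject representation (assumed given)
u = torch.randn(d_out)                         # value: shift that produces the new object

W_edited = W + torch.outer(u, v)               # rank-one modification of the weight
assert torch.linalg.matrix_rank(W_edited - W) == 1
```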
1 code implementation • 12 Nov 2021 • Alex Andonian, Taesung Park, Bryan Russell, Phillip Isola, Jun-Yan Zhu, Richard Zhang
Training supervised image synthesis models requires a critic to compare two images: the ground truth and the synthesized result.
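To ground the sentence above, here is a common baseline critic: compare the two images in the feature space of a frozen network (a perceptual loss). This is a generic sketch, not the paper's proposed critic; `weights=None` keeps it download-free, though in practice you would load pretrained weights.

```python
import torch
import torchvision.models as models

vgg = models.vgg16(weights=None).features[:16].eval()  # frozen feature extractor
for p in vgg.parameters():
    p.requires_grad_(False)

result = torch.rand(1, 3, 256, 256)            # synthesized image (placeholder)
truth = torch.rand(1, 3, 256, 256)             # ground-truth image (placeholder)

# The critic scores the result by feature-space distance to the ground truth.
loss = torch.nn.functional.l1_loss(vgg(result), vgg(truth))
```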
no code implementations • 19 Mar 2021 • Alex Andonian, Sabrina Osmany, Audrey Cui, YeonHwan Park, Ali Jahanian, Antonio Torralba, David Bau
We investigate the problem of zero-shot semantic image painting.
no code implementations • ICLR 2021 • Bowen Pan, Rameswar Panda, Camilo Fosco, Chung-Ching Lin, Alex Andonian, Yue Meng, Kate Saenko, Aude Oliva, Rogerio Feris
An inherent property of real-world videos is the high correlation of information across frames which can translate into redundancy in either temporal or spatial feature maps of the models, or both.
1 code implementation • ECCV 2020 • Alex Andonian, Camilo Fosco, Mathew Monfort, Allen Lee, Rogerio Feris, Carl Vondrick, Aude Oliva
This allows our model to perform cognitive tasks such as set abstraction (which general concept is common to a set of videos?).
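A hedged sketch of set abstraction as retrieval: embed each video, pool the set, and find the nearest concept embedding. The mean-pooling and cosine retrieval here are illustrative assumptions, not the paper's exact method.

```python
import torch
import torch.nn.functional as F

video_embs = F.normalize(torch.randn(4, 512), dim=-1)      # embeddings of a set of 4 videos
concept_embs = F.normalize(torch.randn(100, 512), dim=-1)  # embeddings of 100 candidate concepts

set_emb = F.normalize(video_embs.mean(dim=0), dim=-1)      # pool the set into one vector
shared_concept = (concept_embs @ set_emb).argmax()         # most similar general concept
```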
2 code implementations • 1 Nov 2019 • Mathew Monfort, Bowen Pan, Kandan Ramakrishnan, Alex Andonian, Barry A McNamara, Alex Lascelles, Quanfu Fan, Dan Gutfreund, Rogerio Feris, Aude Oliva
Videos capture events that typically contain multiple sequential and simultaneous actions, even in the span of only a few seconds.
1 code implementation • ICCV 2019 • Lore Goetschalckx, Alex Andonian, Aude Oliva, Phillip Isola
We introduce a framework that uses Generative Adversarial Networks (GANs) to study cognitive properties like memorability, aesthetics, and emotional valence.
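A minimal sketch of the GAN-based probe this entry describes: walk a latent code along a learned direction and observe how the target property (e.g. memorability) changes across the generated images. The tiny generator, the direction vector, and the step sizes are all placeholders.

```python
import torch

class TinyGenerator(torch.nn.Module):          # stand-in for a pretrained GAN generator
    def __init__(self, z_dim=128):
        super().__init__()
        self.net = torch.nn.Linear(z_dim, 3 * 64 * 64)
    def forward(self, z):
        return self.net(z).view(-1, 3, 64, 64)

G = TinyGenerator()
z = torch.randn(1, 128)                        # base latent code
direction = torch.randn(1, 128)                # learned "memorability" direction (assumed)

# Images along the walk should smoothly vary in the target property.
images = [G(z + alpha * direction) for alpha in (-2.0, 0.0, 2.0)]
```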
1 code implementation • 9 Jun 2019 • Bowen Pan, Jiankai Sun, Ho Yin Tiga Leung, Alex Andonian, Bolei Zhou
A further experiment on a LoCoBot robot shows that our model enables surround-view sensing from 2D image input alone.
1 code implementation • CVPR 2020 • Chengxu Zhuang, Tianwei She, Alex Andonian, Max Sobol Mark, Daniel Yamins
Because of the rich dynamical structure of videos and their ubiquity in everyday life, it is a natural idea that video data could serve as a powerful unsupervised learning signal for training visual representations in deep neural networks.
no code implementations • 14 May 2019 • Radoslaw Martin Cichy, Gemma Roig, Alex Andonian, Kshitij Dwivedi, Benjamin Lahner, Alex Lascelles, Yalda Mohsenzadeh, Kandan Ramakrishnan, Aude Oliva
Recently, researchers of natural intelligence have begun using those AI models to explore how the brain performs such tasks.
4 code implementations • 9 Jan 2018 • Mathew Monfort, Alex Andonian, Bolei Zhou, Kandan Ramakrishnan, Sarah Adel Bargal, Tom Yan, Lisa Brown, Quanfu Fan, Dan Gutfreund, Carl Vondrick, Aude Oliva
We present the Moments in Time Dataset, a large-scale human-annotated collection of one million short videos corresponding to dynamic events unfolding within three seconds.
3 code implementations • ECCV 2018 • Bolei Zhou, Alex Andonian, Aude Oliva, Antonio Torralba
Temporal relational reasoning, the ability to link meaningful transformations of objects or entities over time, is a fundamental property of intelligent species (a minimal sketch of a relation module follows the leaderboard line below).
Ranked #2 on Hand Gesture Recognition on Jester test
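A hedged sketch of a two-frame temporal relation module in the spirit of the entry above: an MLP scores ordered pairs of per-frame features, and the scores are summed over sampled pairs. The dimensions, pair sampling, and MLP width are illustrative assumptions.

```python
import itertools
import torch
import torch.nn as nn

num_frames, feat_dim, num_classes = 8, 256, 27
frames = torch.randn(num_frames, feat_dim)     # per-frame CNN features (placeholder)

g = nn.Sequential(                             # relation function over frame pairs
    nn.Linear(2 * feat_dim, 256), nn.ReLU(), nn.Linear(256, num_classes))

# Sum relation scores over all ordered pairs (i < j), preserving temporal order.
logits = sum(g(torch.cat([frames[i], frames[j]]))
             for i, j in itertools.combinations(range(num_frames), 2))
```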