Search Results for author: Andrew Owens

Found 41 papers, 19 papers with code

Images that Sound: Composing Images and Sounds on a Single Canvas

no code implementations 20 May 2024 Ziyang Chen, Daniel Geng, Andrew Owens

During the reverse process, we denoise noisy latents with both the audio and image diffusion models in parallel, resulting in a sample that is likely under both models.
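The parallel denoising described in the excerpt can be sketched roughly as follows. This is a toy illustration only: `audio_model_noise` and `image_model_noise` are hypothetical stand-ins for the paper's pretrained audio and image latent diffusion models, and the simple average-and-step update stands in for a real diffusion sampler.

```python
import numpy as np

rng = np.random.default_rng(0)

def audio_model_noise(z, t):
    # Hypothetical stand-in for the audio diffusion model's noise prediction.
    return 0.5 * z

def image_model_noise(z, t):
    # Hypothetical stand-in for the image diffusion model's noise prediction.
    return 0.4 * z

def parallel_denoise(z, steps=50, step_size=0.1):
    """Run the reverse process with both models in parallel: at every step,
    average the two noise estimates and take one denoising step, so the
    final sample is likely under both models."""
    for t in range(steps, 0, -1):
        eps = 0.5 * (audio_model_noise(z, t) + image_model_noise(z, t))
        z = z - step_size * eps  # move the shared latent toward both models
    return z

z0 = rng.standard_normal(16)
z_final = parallel_denoise(z0)
```

Because both estimates act on a single shared latent, each step pulls the sample toward regions that both models assign high likelihood.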

Efficient Vision-Language Pre-training by Cluster Masking

1 code implementation CVPR 2024 Zihao Wei, Zixuan Pan, Andrew Owens

We propose a simple strategy for masking image patches during visual-language contrastive learning that improves the quality of the learned representations and the training speed.

Contrastive Learning

Tactile-Augmented Radiance Fields

1 code implementation CVPR 2024 Yiming Dou, Fengyu Yang, Yi Liu, Antonio Loquercio, Andrew Owens

Our approach makes use of two insights: (i) common vision-based touch sensors are built on ordinary cameras and thus can be registered to images using methods from multi-view geometry, and (ii) visually and structurally similar regions of a scene share the same tactile features.

Factorized Diffusion: Perceptual Illusions by Noise Decomposition

no code implementations 17 Apr 2024 Daniel Geng, Inbum Park, Andrew Owens

And we explore a decomposition by a motion blur kernel, which produces images that change appearance under motion blurring.


Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark

no code implementations CVPR 2024 Ziyang Chen, Israel D. Gebru, Christian Richardt, Anurag Kumar, William Laney, Andrew Owens, Alexander Richard

The dataset includes high-quality and densely captured room impulse response data paired with multi-view images, and precise 6DoF pose tracking data for sound emitters and listeners in the rooms.

Few-Shot Learning, Pose Tracking +1

Motion Guidance: Diffusion-Based Image Editing with Differentiable Motion Estimators

no code implementations 31 Jan 2024 Daniel Geng, Andrew Owens

Diffusion models are capable of generating impressive images conditioned on text descriptions, and extensions of these models allow users to edit images at a relatively coarse scale.

Optical Flow Estimation

Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models

no code implementations CVPR 2024 Daniel Geng, Inbum Park, Andrew Owens

During the reverse diffusion process, we estimate the noise from different views of a noisy image, and then combine these noise estimates together and denoise the image.
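The noise-combination step the excerpt describes can be sketched as below. This is an illustrative toy, not the paper's implementation: `predict_noise` is a hypothetical stand-in for a text-conditioned diffusion model, and only two views (identity and a 180-degree rotation) are used.

```python
import numpy as np

rng = np.random.default_rng(0)

def predict_noise(x):
    # Hypothetical stand-in for a text-conditioned diffusion model's
    # noise estimate on a noisy image.
    return 0.3 * x

# Two views of the same image: identity and a 180-degree rotation.
views = [lambda x: x, lambda x: np.rot90(x, 2)]
inverses = [lambda x: x, lambda x: np.rot90(x, -2)]

def combined_noise(x):
    """Estimate the noise from each view of the noisy image, map every
    estimate back to the canonical orientation, and average them."""
    estimates = [inv(predict_noise(view(x))) for view, inv in zip(views, inverses)]
    return np.mean(estimates, axis=0)

x = rng.standard_normal((8, 8))
eps = combined_noise(x)
x_less_noisy = x - 0.1 * eps  # one step of the reverse diffusion process
```

Averaging the back-transformed estimates denoises the image toward content that reads correctly under every view at once.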

Generating Visual Scenes from Touch

no code implementations ICCV 2023 Fengyu Yang, Jiacheng Zhang, Andrew Owens

An emerging line of work has sought to generate plausible imagery from touch.

Conditional Generation of Audio from Video via Foley Analogies

1 code implementation CVPR 2023 Yuexi Du, Ziyang Chen, Justin Salamon, Bryan Russell, Andrew Owens

Second, we propose a model for generating a soundtrack for a silent input video, given a user-supplied example that specifies what the video should "sound like".

Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models

1 code implementation ICCV 2023 Lukas Höllein, Ang Cao, Andrew Owens, Justin Johnson, Matthias Nießner

We present Text2Room, a method for generating room-scale textured 3D meshes from a given text prompt as input.

Text to 3D

Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation

2 code implementations ICCV 2023 Ziyang Chen, Shengyi Qian, Andrew Owens

In this paper, we use these cues to solve a problem we call Sound Localization from Motion (SLfM): jointly estimating camera rotation and localizing sound sources.

Mix and Localize: Localizing Sound Sources in Mixtures

no code implementations CVPR 2022 Xixi Hu, Ziyang Chen, Andrew Owens

This task requires a model to both group a sound mixture into individual sources, and to associate them with a visual signal.

Touch and Go: Learning from Human-Collected Vision and Touch

no code implementations 22 Nov 2022 Fengyu Yang, Chenyang Ma, Jiacheng Zhang, Jing Zhu, Wenzhen Yuan, Andrew Owens

The ability to associate touch with sight is essential for tasks that require physically interacting with objects in the world.

Image Stylization

Learning Visual Styles from Audio-Visual Associations

no code implementations 10 May 2022 Tingle Li, Yichen Liu, Andrew Owens, Hang Zhao

Our model learns to manipulate the texture of a scene to match a sound, a problem we term audio-driven image stylization.

Image Stylization

Sound Localization by Self-Supervised Time Delay Estimation

1 code implementation 26 Apr 2022 Ziyang Chen, David F. Fouhey, Andrew Owens

We adapt the contrastive random walk of Jabri et al. to learn a cycle-consistent representation from unlabeled stereo sounds, resulting in a model that performs on par with supervised methods on "in the wild" internet recordings.

Contrastive Learning, Visual Tracking
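For context on the time-delay task this paper tackles, the classical baseline is cross-correlation: shift one stereo channel against the other and keep the lag that maximizes their correlation. The sketch below shows that baseline only; it is not the paper's self-supervised contrastive-random-walk method, and the synthetic stereo pair is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def cross_correlation_delay(reference, delayed, max_lag=50):
    """Return the integer lag (in samples) that best aligns `delayed`
    with `reference`, by maximizing their cross-correlation."""
    lags = range(-max_lag, max_lag + 1)
    scores = [np.dot(delayed[max_lag:-max_lag],
                     np.roll(reference, lag)[max_lag:-max_lag])
              for lag in lags]
    return list(lags)[int(np.argmax(scores))]

# Synthetic stereo pair: the second channel lags the first by 7 samples.
signal = rng.standard_normal(1000)
delayed = np.roll(signal, 7)
delay = cross_correlation_delay(signal, delayed)
```

On clean synthetic signals this recovers the true lag; the appeal of a learned representation is robustness on noisy "in the wild" recordings where raw correlation degrades.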

Learning Pixel Trajectories with Multiscale Contrastive Random Walks

no code implementations CVPR 2022 Zhangxing Bian, Allan Jabri, Alexei A. Efros, Andrew Owens

A range of video modeling tasks, from optical flow to multiple object tracking, share the same fundamental challenge: establishing space-time correspondence.

Multiple Object Tracking, Object +5

GANmouflage: 3D Object Nondetection with Texture Fields

no code implementations CVPR 2023 Rui Guo, Jasmine Collins, Oscar de Lima, Andrew Owens

Our model learns to camouflage a variety of object shapes from randomly sampled locations and viewpoints within the input scene, and is the first to address the problem of hiding complex object shapes.


Structure from Silence: Learning Scene Structure from Ambient Sound

1 code implementation 10 Nov 2021 Ziyang Chen, Xixi Hu, Andrew Owens

From whirling ceiling fans to ticking clocks, the sounds that we hear subtly vary as we move through a scene.

Comparing Correspondences: Video Prediction with Correspondence-wise Losses

1 code implementation CVPR 2022 Daniel Geng, Max Hamilton, Andrew Owens

Image prediction methods often struggle on tasks that require changing the positions of objects, such as video prediction, producing blurry images that average over the many positions that objects might occupy.

Optical Flow Estimation, Video Prediction
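The idea behind a correspondence-wise loss can be sketched as follows: compare each target pixel against the predicted pixel it corresponds to under a flow field, rather than the predicted pixel at the same location. This toy uses integer displacements and wrap-around indexing for simplicity; it is an assumption-laden illustration, not the paper's loss.

```python
import numpy as np

def correspondence_wise_l1(pred, target, flow):
    """L1 loss between corresponding pixels: target pixel (i, j) is compared
    against the predicted pixel it maps to under the flow field (with
    wrap-around indexing for this toy example)."""
    h, w = target.shape
    loss = 0.0
    for i in range(h):
        for j in range(w):
            di, dj = flow[i, j]  # integer displacement for simplicity
            loss += abs(pred[(i + di) % h, (j + dj) % w] - target[i, j])
    return loss / (h * w)

# A prediction that is the target shifted down by one row: the pixelwise L1
# loss is large, but with the matching flow the correspondence-wise loss
# vanishes, so correct-but-displaced content is not penalized.
target = np.arange(16.0).reshape(4, 4)
pred = np.roll(target, 1, axis=0)
flow = np.zeros((4, 4, 2), dtype=int)
flow[..., 0] = 1  # each target pixel corresponds to the pixel one row down
pixelwise = np.abs(pred - target).mean()
cw = correspondence_wise_l1(pred, target, flow)
```

This is why such losses discourage the blurry averages that pixelwise losses produce when object positions are uncertain.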

Strumming to the Beat: Audio-Conditioned Contrastive Video Textures

no code implementations 6 Apr 2021 Medhini Narasimhan, Shiry Ginosar, Andrew Owens, Alexei A. Efros, Trevor Darrell

We learn representations for video frames and frame-to-frame transition probabilities by fitting a video-specific model trained using contrastive learning.

Contrastive Learning, Self-Supervised Learning +1

Planar Surface Reconstruction from Sparse Views

1 code implementation ICCV 2021 Linyi Jin, Shengyi Qian, Andrew Owens, David F. Fouhey

The paper studies planar surface reconstruction of indoor scenes from two views with unknown camera poses.

Surface Reconstruction

Contrastive Video Textures

no code implementations1 Jan 2021 Medhini Narasimhan, Shiry Ginosar, Andrew Owens, Alexei A Efros, Trevor Darrell

By randomly traversing edges with high transition probabilities, we generate diverse temporally smooth videos with novel sequences and transitions.

Contrastive Learning, Video Generation
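The random traversal described in the excerpt amounts to a weighted walk over a frame-transition graph. The sketch below is a minimal illustration under stated assumptions: the row-stochastic `probs` matrix stands in for the paper's learned frame-to-frame transition probabilities.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical learned frame-to-frame transition probabilities
# (row-stochastic: each row sums to 1).
n_frames = 5
logits = rng.standard_normal((n_frames, n_frames))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

def sample_texture(probs, start=0, length=20):
    """Generate a novel frame sequence by randomly traversing edges,
    favoring transitions with high probability."""
    seq = [start]
    for _ in range(length - 1):
        seq.append(int(rng.choice(len(probs), p=probs[seq[-1]])))
    return seq

sequence = sample_texture(probs)
```

Favoring high-probability edges keeps consecutive frames perceptually smooth, while the randomness yields sequences and transitions absent from the source video.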

CNN-generated images are surprisingly easy to spot... for now

4 code implementations CVPR 2020 Sheng-Yu Wang, Oliver Wang, Richard Zhang, Andrew Owens, Alexei A. Efros

In this work we ask whether it is possible to create a "universal" detector for telling apart real images from those generated by a CNN, regardless of architecture or dataset used.

Data Augmentation, Image Generation +1

Detecting Photoshopped Faces by Scripting Photoshop

2 code implementations ICCV 2019 Sheng-Yu Wang, Oliver Wang, Andrew Owens, Richard Zhang, Alexei A. Efros

Most malicious photo manipulations are created using standard image editing tools, such as Adobe Photoshop.

Image Manipulation Detection

MoSculp: Interactive Visualization of Shape and Time

no code implementations 14 Sep 2018 Xiuming Zhang, Tali Dekel, Tianfan Xue, Andrew Owens, Qiurui He, Jiajun Wu, Stefanie Mueller, William T. Freeman

We present a system that allows users to visualize complex human motion via 3D motion sculptures: a representation that conveys the 3D structure swept by a human body as it moves through space.

More Than a Feeling: Learning to Grasp and Regrasp using Vision and Touch

no code implementations 28 May 2018 Roberto Calandra, Andrew Owens, Dinesh Jayaraman, Justin Lin, Wenzhen Yuan, Jitendra Malik, Edward H. Adelson, Sergey Levine

This model -- a deep, multimodal convolutional network -- predicts the outcome of a candidate grasp adjustment, and then executes a grasp by iteratively selecting the most promising actions.

Robotic Grasping
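The select-and-execute loop described in the excerpt can be sketched as below. This is an illustrative toy: `predict_success` is a hypothetical stand-in for the paper's deep multimodal outcome-prediction network, and the uniform candidate sampling is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

def predict_success(state, action):
    # Hypothetical stand-in for the grasp-outcome predictor: here, actions
    # that move the gripper toward the origin score higher.
    return -np.linalg.norm(state + action)

def regrasp(state, n_candidates=32, n_steps=5):
    """Iteratively sample candidate grasp adjustments, score each with the
    outcome predictor, and execute the most promising one."""
    for _ in range(n_steps):
        candidates = rng.uniform(-0.1, 0.1, size=(n_candidates, state.shape[0]))
        scores = [predict_success(state, a) for a in candidates]
        state = state + candidates[int(np.argmax(scores))]
    return state

start = np.array([0.5, -0.3])
final = regrasp(start)
```

Scoring many candidate adjustments per step and executing only the best turns a one-shot outcome predictor into a simple closed-loop regrasping policy.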

Fighting Fake News: Image Splice Detection via Learned Self-Consistency

3 code implementations ECCV 2018 Minyoung Huh, Andrew Liu, Andrew Owens, Alexei A. Efros

In this paper, we propose a learning algorithm for detecting visual image manipulations that is trained only using a large dataset of real photographs.

Image Forensics

Audio-Visual Scene Analysis with Self-Supervised Multisensory Features

1 code implementation ECCV 2018 Andrew Owens, Alexei A. Efros

The thud of a bouncing ball, the onset of speech as lips open -- when visual and audio events occur together, it suggests that there might be a common, underlying event that produced both signals.

Action Recognition Audio Source Separation +1

Learning Sight from Sound: Ambient Sound Provides Supervision for Visual Learning

no code implementations 20 Dec 2017 Andrew Owens, Jiajun Wu, Josh H. McDermott, William T. Freeman, Antonio Torralba

The sound of crashing waves, the roar of fast-moving cars -- sound conveys important information about the objects in our surroundings.

The Feeling of Success: Does Touch Sensing Help Predict Grasp Outcomes?

1 code implementation 16 Oct 2017 Roberto Calandra, Andrew Owens, Manu Upadhyaya, Wenzhen Yuan, Justin Lin, Edward H. Adelson, Sergey Levine

In this work, we investigate the question of whether touch sensing aids in predicting grasp outcomes within a multimodal sensing framework that combines vision and touch.

Industrial Robots, Robotic Grasping

Ambient Sound Provides Supervision for Visual Learning

1 code implementation 25 Aug 2016 Andrew Owens, Jiajun Wu, Josh H. McDermott, William T. Freeman, Antonio Torralba

We show that, through this process, the network learns a representation that conveys information about objects and scenes.

Object Recognition

Camouflaging an Object from Many Viewpoints

no code implementations CVPR 2014 Andrew Owens, Connelly Barnes, Alex Flint, Hanumant Singh, William Freeman

We address the problem of camouflaging a 3D object from the many viewpoints that one might see it from.

