no code implementations • CVPR 2022 • Chuang Gan, Yi Gu, Siyuan Zhou, Jeremy Schwartz, Seth Alter, James Traer, Dan Gutfreund, Joshua B. Tenenbaum, Josh Mcdermott, Antonio Torralba
The way an object looks and sounds provides complementary reflections of its physical properties.
no code implementations • 16 Dec 2021 • Vinayak Agarwal, Maddie Cusimano, James Traer, Josh Mcdermott
Sustained contact interactions like scraping and rolling produce a wide variety of sounds.
1 code implementation • 25 Mar 2021 • Chuang Gan, Siyuan Zhou, Jeremy Schwartz, Seth Alter, Abhishek Bhandwaldar, Dan Gutfreund, Daniel L. K. Yamins, James J DiCarlo, Josh Mcdermott, Antonio Torralba, Joshua B. Tenenbaum
To complete the task, an embodied agent must plan a sequence of actions to change the state of a large number of objects in the face of realistic physical constraints.
1 code implementation • NeurIPS 2019 • Cory Stephenson, Jenelle Feather, Suchismita Padhy, Oguz Elibol, Hanlin Tang, Josh Mcdermott, SueYeon Chung
Higher level concepts such as parts-of-speech and context dependence also emerge in the later layers of the network.
1 code implementation • NeurIPS 2019 • Jenelle Feather, Alex Durango, Ray Gonzalez, Josh Mcdermott
Although model metamers from early network layers were recognizable to humans, those from deeper layers were not.
no code implementations • 28 May 2019 • Suchismita Padhy, Jenelle Feather, Cory Stephenson, Oguz Elibol, Hanlin Tang, Josh Mcdermott, SueYeon Chung
The success of deep neural networks in visual tasks has motivated recent theoretical and empirical work to understand how these networks operate.
no code implementations • 18 Apr 2019 • Andrew Rouditchenko, Hang Zhao, Chuang Gan, Josh Mcdermott, Antonio Torralba
Segmenting objects in images and separating sound sources in audio are challenging tasks, in part because traditional approaches require large amounts of labeled data.
2 code implementations • ECCV 2018 • Hang Zhao, Chuang Gan, Andrew Rouditchenko, Carl Vondrick, Josh Mcdermott, Antonio Torralba
We introduce PixelPlayer, a system that, by leveraging large amounts of unlabeled videos, learns to locate image regions that produce sounds and to separate the input sounds into a set of components representing the sound from each pixel.
no code implementations • CVPR 2016 • Andrew Owens, Phillip Isola, Josh Mcdermott, Antonio Torralba, Edward H. Adelson, William T. Freeman
Objects make distinctive sounds when they are hit or scratched.