1 code implementation • 4 Nov 2023 • Pablo Millan Arias, Niousha Sadjadi, Monireh Safari, ZeMing Gong, Austin T. Wang, Scott C. Lowe, Joakim Bruslund Haurum, Iuliia Zarubiieva, Dirk Steinke, Lila Kari, Angel X. Chang, Graham W. Taylor
Understanding biodiversity is a global challenge, in which DNA barcodes - short snippets of DNA that cluster by species - play a pivotal role.
no code implementations • ICCV 2023 • Yiming Zhang, ZeMing Gong, Angel X. Chang
We introduce the task of localizing a flexible number of objects in real-world 3D scenes using natural language descriptions.
1 code implementation • NeurIPS 2023 • Zahra Gharaee, ZeMing Gong, Nicholas Pellegrino, Iuliia Zarubiieva, Joakim Bruslund Haurum, Scott C. Lowe, Jaclyn T. A. McKeown, Chris C. Y. Ho, Joschka McLeod, Yi-Yun C Wei, Jireh Agda, Sujeevan Ratnasingham, Dirk Steinke, Angel X. Chang, Graham W. Taylor, Paul Fieguth
In an effort to catalog insect biodiversity, we propose a new large dataset of hand-labelled insect images, the BIOSCAN-Insect Dataset.
no code implementations • 20 Jun 2023 • Mukul Khanna, Yongsen Mao, Hanxiao Jiang, Sanjay Haresh, Brennan Shacklett, Dhruv Batra, Alexander Clegg, Eric Undersander, Angel X. Chang, Manolis Savva
Surprisingly, we observe that agents trained on just 122 scenes from our dataset outperform agents trained on 10,000 scenes from the ProcTHOR-10K dataset in terms of zero-shot generalization in real-world scanned environments.
no code implementations • 29 May 2023 • Supriya Gadi Patil, Angel X. Chang, Manolis Savva
Our study, on a synthetic dataset of 3D scenes where object instances occur in different orientations, reveals that deep learning-based rotation-invariant methods are effective for relatively easy settings with easy-to-distinguish pairs.
no code implementations • 7 Apr 2023 • Sonia Raychaudhuri, Tommaso Campari, Unnat Jain, Manolis Savva, Angel X. Chang
We propose a simple but effective modular approach MOPA (Modular ObjectNav with PointGoal agents) to systematically investigate the inherent modularity of the object navigation task in Embodied AI.
no code implementations • ICCV 2023 • Enrico Cancelli, Tommaso Campari, Luciano Serafini, Angel X. Chang, Lamberto Ballan
In this paper, we propose an end-to-end architecture that exploits Proximity-Aware Tasks (referred to as Risk and Proximity Compass) to inject into a reinforcement learning navigation policy the ability to infer common-sense social behaviors.
no code implementations • ICCV 2023 • Dave Zhenyu Chen, Ronghang Hu, Xinlei Chen, Matthias Nießner, Angel X. Chang
Performing 3D dense captioning and visual grounding requires a common and shared understanding of the underlying multimodal relationships.
no code implementations • 13 Oct 2022 • Matt Deitke, Dhruv Batra, Yonatan Bisk, Tommaso Campari, Angel X. Chang, Devendra Singh Chaplot, Changan Chen, Claudia Pérez D'Arpino, Kiana Ehsani, Ali Farhadi, Li Fei-Fei, Anthony Francis, Chuang Gan, Kristen Grauman, David Hall, Winson Han, Unnat Jain, Aniruddha Kembhavi, Jacob Krantz, Stefan Lee, Chengshu Li, Sagnik Majumder, Oleksandr Maksymets, Roberto Martín-Martín, Roozbeh Mottaghi, Sonia Raychaudhuri, Mike Roberts, Silvio Savarese, Manolis Savva, Mohit Shridhar, Niko Sünderhauf, Andrew Szot, Ben Talbot, Joshua B. Tenenbaum, Jesse Thomason, Alexander Toshev, Joanne Truong, Luca Weihs, Jiajun Wu
We present a retrospective on the state of Embodied AI research.
no code implementations • 30 Sep 2022 • Han-Hung Lee, Angel X. Chang
We illustrate how different image-based augmentations prevent the adversarial generation problem, and how the generated results are impacted.
1 code implementation • 12 Sep 2022 • Sanjay Haresh, Xiaohao Sun, Hanxiao Jiang, Angel X. Chang, Manolis Savva
Human-object interactions with articulated objects are common in everyday life.
1 code implementation • 30 Mar 2022 • Hanxiao Jiang, Yongsen Mao, Manolis Savva, Angel X. Chang
The input is a single image of an object, and as output we detect what parts of the object can open, and the motion parameters describing the articulation of each openable part.
no code implementations • 19 Jan 2022 • Yue Ruan, Han-Hung Lee, Ke Zhang, Angel X. Chang
Recent work on contrastive losses for learning joint embeddings over multimodal data has been successful at downstream tasks such as retrieval and classification.
no code implementations • 10 Dec 2021 • Kai Wang, Xianghao Xu, Leon Lei, Selena Ling, Natalie Lindsay, Angel X. Chang, Manolis Savva, Daniel Ritchie
We then discuss different strategies for solving the problem, and design two representative pipelines: one uses available 2D floor plans to guide selection and deformation of 3D rooms; the other learns to retrieve a set of compatible 3D rooms and combine them into novel layouts.
no code implementations • 2 Dec 2021 • Dave Zhenyu Chen, Qirui Wu, Matthias Nießner, Angel X. Chang
Our D3Net unifies dense captioning and visual grounding in 3D in a self-critical manner.
no code implementations • ICCV 2021 • Shivansh Patel, Saim Wani, Unnat Jain, Alexander Schwing, Svetlana Lazebnik, Manolis Savva, Angel X. Chang
We show that the emergent communication can be grounded to the agent observations and the spatial structure of the 3D environment.
no code implementations • EMNLP 2021 • Sonia Raychaudhuri, Saim Wani, Shivansh Patel, Unnat Jain, Angel X. Chang
Prior work supervises the agent with actions based on the shortest path from the agent's location to the goal, but such goal-oriented supervision is often not in alignment with the instruction.
2 code implementations • 16 Sep 2021 • Santhosh K. Ramakrishnan, Aaron Gokaslan, Erik Wijmans, Oleksandr Maksymets, Alex Clegg, John Turner, Eric Undersander, Wojciech Galuba, Andrew Westbury, Angel X. Chang, Manolis Savva, Yili Zhao, Dhruv Batra
When compared to existing photorealistic 3D datasets such as Replica, MP3D, Gibson, and ScanNet, images rendered from HM3D have 20-85% higher visual fidelity w.r.t.
1 code implementation • CVPR 2021 • Jiaqi Tan, Weijie Lin, Angel X. Chang, Manolis Savva
Despite recent progress in depth sensing and 3D reconstruction, mirror surfaces are a significant source of errors.
1 code implementation • CVPR 2021 • Madhawa Vidanapathirana, Qirui Wu, Yasutaka Furukawa, Angel X. Chang, Manolis Savva
We address the task of converting a floorplan and a set of associated photos of a residence into a textured 3D mesh model, a task which we call Plan2Scene.
Ranked #1 on Plan2Scene on Rent3D++
no code implementations • NeurIPS 2020 • Saim Wani, Shivansh Patel, Unnat Jain, Angel X. Chang, Manolis Savva
We propose the multiON task, which requires navigation to an episode-specific sequence of objects in a realistic environment.
no code implementations • CVPR 2021 • Dave Zhenyu Chen, Ali Gholami, Matthias Nießner, Angel X. Chang
We introduce the task of dense captioning in 3D scans from commodity RGB-D sensors.
no code implementations • 3 Nov 2020 • Dhruv Batra, Angel X. Chang, Sonia Chernova, Andrew J. Davison, Jia Deng, Vladlen Koltun, Sergey Levine, Jitendra Malik, Igor Mordatch, Roozbeh Mottaghi, Manolis Savva, Hao Su
In the rearrangement task, the goal is to bring a given physical environment into a specified state.
1 code implementation • CVPR 2020 • Fanbo Xiang, Yuzhe Qin, Kaichun Mo, Yikuan Xia, Hao Zhu, Fangchen Liu, Minghua Liu, Hanxiao Jiang, Yifu Yuan, He Wang, Li Yi, Angel X. Chang, Leonidas J. Guibas, Hao Su
To achieve this task, a simulated environment with physically realistic simulation, sufficient articulated objects, and transferability to the real robot is indispensable.
3 code implementations • ECCV 2020 • Dave Zhenyu Chen, Angel X. Chang, Matthias Nießner
We introduce the task of 3D object localization in RGB-D scans using natural language descriptions.
no code implementations • CONLL 2019 • Justin Dieter, Tian Wang, Arun Tejasvi Chaganty, Gabor Angeli, Angel X. Chang
Reflective listening, demonstrating that you have heard your conversational partner, is key to effective communication.
5 code implementations • CVPR 2019 • Kaichun Mo, Shilin Zhu, Angel X. Chang, Li Yi, Subarna Tripathi, Leonidas J. Guibas, Hao Su
We present PartNet: a consistent, large-scale dataset of 3D objects annotated with fine-grained, instance-level, and hierarchical 3D part information.
Ranked #3 on 3D Instance Segmentation on PartNet
2 code implementations • CVPR 2019 • Armen Avetisyan, Manuel Dahnert, Angela Dai, Manolis Savva, Angel X. Chang, Matthias Nießner
For a 3D reconstruction of an indoor scene, our method takes as input a set of CAD models, and predicts a 9DoF pose that aligns each model to the underlying scan geometry.
Ranked #1 on 3D Reconstruction on Scan2CAD
no code implementations • CVPR 2018 • Shuran Song, Andy Zeng, Angel X. Chang, Manolis Savva, Silvio Savarese, Thomas Funkhouser
We present Im2Pano3D, a convolutional neural network that generates a dense prediction of 3D structure and a probability distribution of semantic labels for a full 360° panoramic view of an indoor scene when given only a partial observation (≤50%) in the form of an RGB-D image.
2 code implementations • 22 Mar 2018 • Kevin Chen, Christopher B. Choy, Manolis Savva, Angel X. Chang, Thomas Funkhouser, Silvio Savarese
To this end, we first learn joint embeddings of freeform text descriptions and colored 3D shapes.
no code implementations • 12 Dec 2017 • Shuran Song, Andy Zeng, Angel X. Chang, Manolis Savva, Silvio Savarese, Thomas Funkhouser
We present Im2Pano3D, a convolutional neural network that generates a dense prediction of 3D structure and a probability distribution of semantic labels for a full 360° panoramic view of an indoor scene when given only a partial observation (≤50%) in the form of an RGB-D image.
2 code implementations • 11 Dec 2017 • Manolis Savva, Angel X. Chang, Alexey Dosovitskiy, Thomas Funkhouser, Vladlen Koltun
We present MINOS, a simulator designed to support the development of multisensory models for goal-directed navigation in complex indoor environments.
no code implementations • 7 Apr 2017 • Kyle Genova, Manolis Savva, Angel X. Chang, Thomas Funkhouser
We provide a search algorithm that generates a sampling of likely candidate views according to the example distribution, and a set selection algorithm that chooses a subset of the candidates that jointly cover the example distribution.
no code implementations • 28 Feb 2017 • Angel X. Chang, Mihail Eric, Manolis Savva, Christopher D. Manning
We present SceneSeer: an interactive text to 3D scene generation system that allows a user to design 3D scenes using natural language.
no code implementations • 28 Feb 2017 • Manolis Savva, Angel X. Chang, Maneesh Agrawala
We present SceneSuggest: an interactive 3D scene design system providing context-driven suggestions for 3D model retrieval and placement.
Graphics Human-Computer Interaction
1 code implementation • CVPR 2017 • Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner
A key requirement for leveraging supervised deep learning methods is the availability of large, labeled datasets.
Ranked #11 on Semantic Segmentation on ScanNetV2
3 code implementations • CVPR 2017 • Shuran Song, Fisher Yu, Andy Zeng, Angel X. Chang, Manolis Savva, Thomas Funkhouser
This paper focuses on semantic scene completion, a task for producing a complete 3D voxel representation of volumetric occupancy and semantic labels for a scene from a single-view depth map observation.
Ranked #2 on 3D Semantic Scene Completion on KITTI-360
no code implementations • 15 Mar 2016 • Angel X. Chang, Valentin I. Spitkovsky, Christopher D. Manning, Eneko Agirre
Named Entity Disambiguation (NED) is the task of linking a named-entity mention to an instance in a knowledge base, typically Wikipedia.
14 code implementations • 9 Dec 2015 • Angel X. Chang, Thomas Funkhouser, Leonidas Guibas, Pat Hanrahan, Qi-Xing Huang, Zimo Li, Silvio Savarese, Manolis Savva, Shuran Song, Hao Su, Jianxiong Xiao, Li Yi, Fisher Yu
We present ShapeNet: a richly-annotated, large-scale repository of shapes represented by 3D CAD models of objects.
no code implementations • LREC 2012 • Angel X. Chang, Christopher Manning
We describe SUTIME, a temporal tagger for recognizing and normalizing temporal expressions in English text.
no code implementations • LREC 2012 • Valentin I. Spitkovsky, Angel X. Chang
We present a resource for automatically associating strings of text with English Wikipedia concepts.