no code implementations • 21 Mar 2025 • Hou In Derek Pun, Hou In Ivan Tam, Austin T. Wang, Xiaoliang Huo, Angel X. Chang, Manolis Savva
We present HSM, a hierarchical framework for indoor scene generation with dense object arrangements across spatial scales.
1 code implementation • 20 Mar 2025 • Han-Hung Lee, Qinghong Han, Angel X. Chang
In this paper, we explore the task of generating expansive outdoor scenes, ranging from castles to high-rises.
no code implementations • 18 Mar 2025 • Hou In Ivan Tam, Hou In Derek Pun, Austin T. Wang, Angel X. Chang, Manolis Savva
SceneEval includes metrics for both explicit user requirements, such as the presence of specific objects and their attributes described in the input text, and implicit expectations, like the absence of object collisions, providing a comprehensive assessment of scene quality.
no code implementations • 6 Mar 2025 • Adrian Chang, Kai Wang, Yuanbo Li, Manolis Savva, Angel X. Chang, Daniel Ritchie
Empirical observations show that current systems tend to produce incomplete next object location distributions.
no code implementations • 25 Feb 2025 • Monireh Safari, Pablo Millan Arias, Scott C. Lowe, Lila Kari, Angel X. Chang, Graham W. Taylor
Masked language modelling (MLM) as a pretraining objective has been widely adopted in genomic sequence modelling.
no code implementations • 10 Jan 2025 • Sonia Raychaudhuri, Angel X. Chang
Among many skills that the agents need to possess, building and maintaining a semantic map of the environment is most crucial in long-horizon tasks.
no code implementations • 2 Jan 2025 • Austin T. Wang, ZeMing Gong, Angel X. Chang
3D visual grounding (3DVG) involves localizing entities in a 3D scene referred to by natural language text.
no code implementations • 29 Nov 2024 • Qirui Wu, Denys Iliash, Daniel Ritchie, Manolis Savva, Angel X. Chang
Reconstructing structured 3D scenes from RGB images using CAD objects unlocks efficient and compact scene representations that maintain compositionality and interactability.
no code implementations • 12 Nov 2024 • Sonia Raychaudhuri, Duy Ta, Katrina Ashton, Angel X. Chang, Jiuguang Wang, Bernadette Bucher
Large scale scenes such as multifloor homes can be robustly and efficiently mapped with a 3D graph of landmarks estimated jointly with robot poses in a factor graph, a technique commonly used in commercial robots such as drones and robot vacuums.
no code implementations • 21 Oct 2024 • Jiayi Liu, Denys Iliash, Angel X. Chang, Manolis Savva, Ali Mahdavi-Amiri
To capture the ambiguity in part shape and motion posed by a single view of the object, we design a diffusion model that learns the plausible variations of objects in terms of geometry and kinematics.
no code implementations • 27 Sep 2024 • Denys Iliash, Hanxiao Jiang, Yiming Zhang, Manolis Savva, Angel X. Chang
Despite much progress in large 3D datasets, there are currently few interactive 3D object datasets, and their scale is limited due to the manual effort required in their construction.
no code implementations • 6 Aug 2024 • Xingguang Yan, Han-Hung Lee, Ziyu Wan, Angel X. Chang
We introduce a new approach for generating realistic 3D models with UV maps through a representation termed "Object Images."
2 code implementations • 18 Jun 2024 • Zahra Gharaee, Scott C. Lowe, ZeMing Gong, Pablo Millan Arias, Nicholas Pellegrino, Austin T. Wang, Joakim Bruslund Haurum, Iuliia Zarubiieva, Lila Kari, Dirk Steinke, Graham W. Taylor, Paul Fieguth, Angel X. Chang
We propose three benchmark experiments to demonstrate the impact of the multi-modal data types on the classification and clustering accuracy.
1 code implementation • 17 Jun 2024 • Han-Hung Lee, Yiming Zhang, Angel X. Chang
Our model also achieves better performance in more fine-grained text to shape retrieval, demonstrating better text-and-shape alignment than point cloud based models.
4 code implementations • 27 May 2024 • ZeMing Gong, Austin T. Wang, Xiaoliang Huo, Joakim Bruslund Haurum, Scott C. Lowe, Graham W. Taylor, Angel X. Chang
Measuring biodiversity is crucial for understanding ecosystem health.
1 code implementation • 16 May 2024 • Xianzheng Ma, Yash Bhalgat, Brandon Smart, Shuai Chen, Xinghui Li, Jian Ding, Jindong Gu, Dave Zhenyu Chen, Songyou Peng, Jia-Wang Bian, Philip H Torr, Marc Pollefeys, Matthias Nießner, Ian D Reid, Angel X. Chang, Iro Laina, Victor Adrian Prisacariu
Hence, with this paper, we aim to chart a course for future research that explores and expands the capabilities of 3D-LLMs in understanding and interacting with the complex 3D world.
no code implementations • 20 Mar 2024 • Han-Hung Lee, Manolis Savva, Angel X. Chang
Recent years have seen an explosion of work and interest in text-to-3D shape generation.
1 code implementation • 31 Dec 2023 • Qirui Wu, Daniel Ritchie, Manolis Savva, Angel X. Chang
Single-view 3D shape retrieval is a challenging task that is increasingly important with the growth of available 3D data.
4 code implementations • 4 Nov 2023 • Pablo Millan Arias, Niousha Sadjadi, Monireh Safari, ZeMing Gong, Austin T. Wang, Joakim Bruslund Haurum, Iuliia Zarubiieva, Dirk Steinke, Lila Kari, Angel X. Chang, Scott C. Lowe, Graham W. Taylor
We compared the performance of BarcodeBERT on taxonomic identification tasks against a spectrum of machine learning approaches including supervised training of classical neural architectures and fine-tuning of general DNA foundation models.
1 code implementation • ICCV 2023 • Yiming Zhang, ZeMing Gong, Angel X. Chang
We introduce the task of localizing a flexible number of objects in real-world 3D scenes using natural language descriptions.
2 code implementations • NeurIPS 2023 • Zahra Gharaee, ZeMing Gong, Nicholas Pellegrino, Iuliia Zarubiieva, Joakim Bruslund Haurum, Scott C. Lowe, Jaclyn T. A. McKeown, Chris C. Y. Ho, Joschka McLeod, Yi-Yun C Wei, Jireh Agda, Sujeevan Ratnasingham, Dirk Steinke, Angel X. Chang, Graham W. Taylor, Paul Fieguth
In an effort to catalog insect biodiversity, we propose a new large dataset of hand-labelled insect images, the BIOSCAN-Insect Dataset.
Ranked #1 on Classification on BIOSCAN_1M_Insect Dataset
no code implementations • CVPR 2024 • Mukul Khanna, Yongsen Mao, Hanxiao Jiang, Sanjay Haresh, Brennan Shacklett, Dhruv Batra, Alexander Clegg, Eric Undersander, Angel X. Chang, Manolis Savva
Surprisingly, we observe that agents trained on just 122 scenes from our dataset outperform agents trained on 10,000 scenes from the ProcTHOR-10K dataset in terms of zero-shot generalization in real-world scanned environments.
no code implementations • 29 May 2023 • Supriya Gadi Patil, Angel X. Chang, Manolis Savva
Our study, on a synthetic dataset of 3D scenes where object instances occur in different orientations, reveals that deep learning-based rotation invariant methods are effective for relatively easy settings with easy-to-distinguish pairs.
no code implementations • 7 Apr 2023 • Sonia Raychaudhuri, Tommaso Campari, Unnat Jain, Manolis Savva, Angel X. Chang
We propose a simple but effective modular approach MOPA (Modular ObjectNav with PointGoal agents) to systematically investigate the inherent modularity of the object navigation task in Embodied AI.
no code implementations • ICCV 2023 • Enrico Cancelli, Tommaso Campari, Luciano Serafini, Angel X. Chang, Lamberto Ballan
In this paper, we propose an end-to-end architecture that exploits Proximity-Aware Tasks (referred to as Risk and Proximity Compass) to inject into a reinforcement learning navigation policy the ability to infer common-sense social behaviors.
no code implementations • ICCV 2023 • Dave Zhenyu Chen, Ronghang Hu, Xinlei Chen, Matthias Nießner, Angel X. Chang
Performing 3D dense captioning and visual grounding requires a common and shared understanding of the underlying multimodal relationships.
no code implementations • 13 Oct 2022 • Matt Deitke, Dhruv Batra, Yonatan Bisk, Tommaso Campari, Angel X. Chang, Devendra Singh Chaplot, Changan Chen, Claudia Pérez D'Arpino, Kiana Ehsani, Ali Farhadi, Li Fei-Fei, Anthony Francis, Chuang Gan, Kristen Grauman, David Hall, Winson Han, Unnat Jain, Aniruddha Kembhavi, Jacob Krantz, Stefan Lee, Chengshu Li, Sagnik Majumder, Oleksandr Maksymets, Roberto Martín-Martín, Roozbeh Mottaghi, Sonia Raychaudhuri, Mike Roberts, Silvio Savarese, Manolis Savva, Mohit Shridhar, Niko Sünderhauf, Andrew Szot, Ben Talbot, Joshua B. Tenenbaum, Jesse Thomason, Alexander Toshev, Joanne Truong, Luca Weihs, Jiajun Wu
We present a retrospective on the state of Embodied AI research.
no code implementations • 30 Sep 2022 • Han-Hung Lee, Angel X. Chang
We illustrate how different image-based augmentations prevent the adversarial generation problem, and how the generated results are impacted.
1 code implementation • 12 Sep 2022 • Sanjay Haresh, Xiaohao Sun, Hanxiao Jiang, Angel X. Chang, Manolis Savva
Human-object interactions with articulated objects are common in everyday life.
1 code implementation • 30 Mar 2022 • Hanxiao Jiang, Yongsen Mao, Manolis Savva, Angel X. Chang
The input is a single image of an object, and as output we detect what parts of the object can open, and the motion parameters describing the articulation of each openable part.
no code implementations • 19 Jan 2022 • Yue Ruan, Han-Hung Lee, Yiming Zhang, Ke Zhang, Angel X. Chang
Text-to-shape retrieval is an increasingly relevant problem with the growth of 3D shape data.
no code implementations • 10 Dec 2021 • Kai Wang, Xianghao Xu, Leon Lei, Selena Ling, Natalie Lindsay, Angel X. Chang, Manolis Savva, Daniel Ritchie
We then discuss different strategies for solving the problem, and design two representative pipelines: one uses available 2D floor plans to guide selection and deformation of 3D rooms; the other learns to retrieve a set of compatible 3D rooms and combine them into novel layouts.
no code implementations • 2 Dec 2021 • Dave Zhenyu Chen, Qirui Wu, Matthias Nießner, Angel X. Chang
Our D3Net unifies dense captioning and visual grounding in 3D in a self-critical manner.
Ranked #8 on 3D dense captioning on Nr3D
no code implementations • ICCV 2021 • Shivansh Patel, Saim Wani, Unnat Jain, Alexander Schwing, Svetlana Lazebnik, Manolis Savva, Angel X. Chang
We show that the emergent communication can be grounded to the agent observations and the spatial structure of the 3D environment.
no code implementations • EMNLP 2021 • Sonia Raychaudhuri, Saim Wani, Shivansh Patel, Unnat Jain, Angel X. Chang
Prior work supervises the agent with actions based on the shortest path from the agent's location to the goal, but such goal-oriented supervision is often not in alignment with the instruction.
3 code implementations • 16 Sep 2021 • Santhosh K. Ramakrishnan, Aaron Gokaslan, Erik Wijmans, Oleksandr Maksymets, Alex Clegg, John Turner, Eric Undersander, Wojciech Galuba, Andrew Westbury, Angel X. Chang, Manolis Savva, Yili Zhao, Dhruv Batra
When compared to existing photorealistic 3D datasets such as Replica, MP3D, Gibson, and ScanNet, images rendered from HM3D have 20-85% higher visual fidelity w.r.t.
1 code implementation • CVPR 2021 • Jiaqi Tan, Weijie Lin, Angel X. Chang, Manolis Savva
Despite recent progress in depth sensing and 3D reconstruction, mirror surfaces are a significant source of errors.
1 code implementation • CVPR 2021 • Madhawa Vidanapathirana, Qirui Wu, Yasutaka Furukawa, Angel X. Chang, Manolis Savva
We address the task of converting a floorplan and a set of associated photos of a residence into a textured 3D mesh model, a task which we call Plan2Scene.
Ranked #1 on Plan2Scene on Rent3D++
no code implementations • NeurIPS 2020 • Saim Wani, Shivansh Patel, Unnat Jain, Angel X. Chang, Manolis Savva
We propose the multiON task, which requires navigation to an episode-specific sequence of objects in a realistic environment.
no code implementations • CVPR 2021 • Dave Zhenyu Chen, Ali Gholami, Matthias Nießner, Angel X. Chang
We introduce the task of dense captioning in 3D scans from commodity RGB-D sensors.
Ranked #9 on 3D dense captioning on ScanRefer Dataset
no code implementations • 3 Nov 2020 • Dhruv Batra, Angel X. Chang, Sonia Chernova, Andrew J. Davison, Jia Deng, Vladlen Koltun, Sergey Levine, Jitendra Malik, Igor Mordatch, Roozbeh Mottaghi, Manolis Savva, Hao Su
In the rearrangement task, the goal is to bring a given physical environment into a specified state.
1 code implementation • CVPR 2020 • Fanbo Xiang, Yuzhe Qin, Kaichun Mo, Yikuan Xia, Hao Zhu, Fangchen Liu, Minghua Liu, Hanxiao Jiang, Yifu Yuan, He Wang, Li Yi, Angel X. Chang, Leonidas J. Guibas, Hao Su
To achieve this task, a simulated environment with physically realistic simulation, sufficient articulated objects, and transferability to the real robot is indispensable.
3 code implementations • ECCV 2020 • Dave Zhenyu Chen, Angel X. Chang, Matthias Nießner
We introduce the task of 3D object localization in RGB-D scans using natural language descriptions.
no code implementations • CONLL 2019 • Justin Dieter, Tian Wang, Arun Tejasvi Chaganty, Gabor Angeli, Angel X. Chang
Reflective listening, demonstrating that you have heard your conversational partner, is key to effective communication.
5 code implementations • CVPR 2019 • Kaichun Mo, Shilin Zhu, Angel X. Chang, Li Yi, Subarna Tripathi, Leonidas J. Guibas, Hao Su
We present PartNet: a consistent, large-scale dataset of 3D objects annotated with fine-grained, instance-level, and hierarchical 3D part information.
Ranked #3 on 3D Instance Segmentation on PartNet
2 code implementations • CVPR 2019 • Armen Avetisyan, Manuel Dahnert, Angela Dai, Manolis Savva, Angel X. Chang, Matthias Nießner
For a 3D reconstruction of an indoor scene, our method takes as input a set of CAD models, and predicts a 9DoF pose that aligns each model to the underlying scan geometry.
Ranked #1 on 3D Reconstruction on Scan2CAD
no code implementations • CVPR 2018 • Shuran Song, Andy Zeng, Angel X. Chang, Manolis Savva, Silvio Savarese, Thomas Funkhouser
We present Im2Pano3D, a convolutional neural network that generates a dense prediction of 3D structure and a probability distribution of semantic labels for a full 360° panoramic view of an indoor scene when given only a partial observation (≤ 50%) in the form of an RGB-D image.
2 code implementations • 22 Mar 2018 • Kevin Chen, Christopher B. Choy, Manolis Savva, Angel X. Chang, Thomas Funkhouser, Silvio Savarese
To this end, we first learn joint embeddings of freeform text descriptions and colored 3D shapes.
no code implementations • 12 Dec 2017 • Shuran Song, Andy Zeng, Angel X. Chang, Manolis Savva, Silvio Savarese, Thomas Funkhouser
We present Im2Pano3D, a convolutional neural network that generates a dense prediction of 3D structure and a probability distribution of semantic labels for a full 360° panoramic view of an indoor scene when given only a partial observation (≤ 50%) in the form of an RGB-D image.
2 code implementations • 11 Dec 2017 • Manolis Savva, Angel X. Chang, Alexey Dosovitskiy, Thomas Funkhouser, Vladlen Koltun
We present MINOS, a simulator designed to support the development of multisensory models for goal-directed navigation in complex indoor environments.
no code implementations • 7 Apr 2017 • Kyle Genova, Manolis Savva, Angel X. Chang, Thomas Funkhouser
We provide a search algorithm that generates a sampling of likely candidate views according to the example distribution, and a set selection algorithm that chooses a subset of the candidates that jointly cover the example distribution.
no code implementations • 28 Feb 2017 • Angel X. Chang, Mihail Eric, Manolis Savva, Christopher D. Manning
We present SceneSeer: an interactive text to 3D scene generation system that allows a user to design 3D scenes using natural language.
no code implementations • 28 Feb 2017 • Manolis Savva, Angel X. Chang, Maneesh Agrawala
We present SceneSuggest: an interactive 3D scene design system providing context-driven suggestions for 3D model retrieval and placement.
Graphics Human-Computer Interaction
1 code implementation • CVPR 2017 • Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner
A key requirement for leveraging supervised deep learning methods is the availability of large, labeled datasets.
Ranked #11 on Semantic Segmentation on ScanNetV2
3 code implementations • CVPR 2017 • Shuran Song, Fisher Yu, Andy Zeng, Angel X. Chang, Manolis Savva, Thomas Funkhouser
This paper focuses on semantic scene completion, a task for producing a complete 3D voxel representation of volumetric occupancy and semantic labels for a scene from a single-view depth map observation.
Ranked #2 on 3D Semantic Scene Completion on KITTI-360
no code implementations • 15 Mar 2016 • Angel X. Chang, Valentin I. Spitkovsky, Christopher D. Manning, Eneko Agirre
Named Entity Disambiguation (NED) is the task of linking a named-entity mention to an instance in a knowledge-base, typically Wikipedia.
16 code implementations • 9 Dec 2015 • Angel X. Chang, Thomas Funkhouser, Leonidas Guibas, Pat Hanrahan, Qi-Xing Huang, Zimo Li, Silvio Savarese, Manolis Savva, Shuran Song, Hao Su, Jianxiong Xiao, Li Yi, Fisher Yu
We present ShapeNet: a richly-annotated, large-scale repository of shapes represented by 3D CAD models of objects.
no code implementations • LREC 2012 • Angel X. Chang, Christopher Manning
We describe SUTIME, a temporal tagger for recognizing and normalizing temporal expressions in English text.
no code implementations • LREC 2012 • Valentin I. Spitkovsky, Angel X. Chang
We present a resource for automatically associating strings of text with English Wikipedia concepts.