Search Results for author: Angel X. Chang

Found 43 papers, 16 papers with code

Text-to-3D Shape Generation

no code implementations • 20 Mar 2024 • Han-Hung Lee, Manolis Savva, Angel X. Chang

Recent years have seen an explosion of work and interest in text-to-3D shape generation.

3D Shape Generation Representation Learning +1

Generalizing Single-View 3D Shape Retrieval to Occlusions and Unseen Objects

no code implementations • 31 Dec 2023 • Qirui Wu, Daniel Ritchie, Manolis Savva, Angel X. Chang

Single-view 3D shape retrieval is a challenging task that is increasingly important with the growth of available 3D data.

3D Shape Retrieval Retrieval

BarcodeBERT: Transformers for Biodiversity Analysis

1 code implementation • 4 Nov 2023 • Pablo Millan Arias, Niousha Sadjadi, Monireh Safari, ZeMing Gong, Austin T. Wang, Scott C. Lowe, Joakim Bruslund Haurum, Iuliia Zarubiieva, Dirk Steinke, Lila Kari, Angel X. Chang, Graham W. Taylor

Understanding biodiversity is a global challenge, in which DNA barcodes - short snippets of DNA that cluster by species - play a pivotal role.

Model Selection

Multi3DRefer: Grounding Text Description to Multiple 3D Objects

no code implementations • ICCV 2023 • Yiming Zhang, ZeMing Gong, Angel X. Chang

We introduce the task of localizing a flexible number of objects in real-world 3D scenes using natural language descriptions.

Contrastive Learning Object +3

A Step Towards Worldwide Biodiversity Assessment: The BIOSCAN-1M Insect Dataset

1 code implementation • NeurIPS 2023 • Zahra Gharaee, ZeMing Gong, Nicholas Pellegrino, Iuliia Zarubiieva, Joakim Bruslund Haurum, Scott C. Lowe, Jaclyn T. A. McKeown, Chris C. Y. Ho, Joschka McLeod, Yi-Yun C Wei, Jireh Agda, Sujeevan Ratnasingham, Dirk Steinke, Angel X. Chang, Graham W. Taylor, Paul Fieguth

In an effort to catalog insect biodiversity, we propose a new large dataset of hand-labelled insect images, the BIOSCAN-Insect Dataset.

Classification

Habitat Synthetic Scenes Dataset (HSSD-200): An Analysis of 3D Scene Scale and Realism Tradeoffs for ObjectGoal Navigation

no code implementations • 20 Jun 2023 • Mukul Khanna, Yongsen Mao, Hanxiao Jiang, Sanjay Haresh, Brennan Shacklett, Dhruv Batra, Alexander Clegg, Eric Undersander, Angel X. Chang, Manolis Savva

Surprisingly, we observe that agents trained on just 122 scenes from our dataset outperform agents trained on 10,000 scenes from the ProcTHOR-10K dataset in terms of zero-shot generalization in real-world scanned environments.

Navigate Zero-shot Generalization

Evaluating 3D Shape Analysis Methods for Robustness to Rotation Invariance

no code implementations • 29 May 2023 • Supriya Gadi Patil, Angel X. Chang, Manolis Savva

Our study, on a synthetic dataset of 3D scenes where object instances occur in different orientations, reveals that deep learning-based rotation-invariant methods are effective for relatively easy settings with easy-to-distinguish pairs.

MOPA: Modular Object Navigation with PointGoal Agents

no code implementations • 7 Apr 2023 • Sonia Raychaudhuri, Tommaso Campari, Unnat Jain, Manolis Savva, Angel X. Chang

We propose a simple but effective modular approach MOPA (Modular ObjectNav with PointGoal agents) to systematically investigate the inherent modularity of the object navigation task in Embodied AI.

Navigate Object +3

Exploiting Proximity-Aware Tasks for Embodied Social Navigation

no code implementations • ICCV 2023 • Enrico Cancelli, Tommaso Campari, Luciano Serafini, Angel X. Chang, Lamberto Ballan

In this paper, we propose an end-to-end architecture that exploits Proximity-Aware Tasks (referred to as Risk and Proximity Compass) to inject into a reinforcement learning navigation policy the ability to infer common-sense social behaviors.

Common Sense Reasoning Navigate +1

UniT3D: A Unified Transformer for 3D Dense Captioning and Visual Grounding

no code implementations • ICCV 2023 • Dave Zhenyu Chen, Ronghang Hu, Xinlei Chen, Matthias Nießner, Angel X. Chang

Performing 3D dense captioning and visual grounding requires a common and shared understanding of the underlying multimodal relationships.

3D dense captioning Dense Captioning +1

Retrospectives on the Embodied AI Workshop

no code implementations • 13 Oct 2022 • Matt Deitke, Dhruv Batra, Yonatan Bisk, Tommaso Campari, Angel X. Chang, Devendra Singh Chaplot, Changan Chen, Claudia Pérez D'Arpino, Kiana Ehsani, Ali Farhadi, Li Fei-Fei, Anthony Francis, Chuang Gan, Kristen Grauman, David Hall, Winson Han, Unnat Jain, Aniruddha Kembhavi, Jacob Krantz, Stefan Lee, Chengshu Li, Sagnik Majumder, Oleksandr Maksymets, Roberto Martín-Martín, Roozbeh Mottaghi, Sonia Raychaudhuri, Mike Roberts, Silvio Savarese, Manolis Savva, Mohit Shridhar, Niko Sünderhauf, Andrew Szot, Ben Talbot, Joshua B. Tenenbaum, Jesse Thomason, Alexander Toshev, Joanne Truong, Luca Weihs, Jiajun Wu

We present a retrospective on the state of Embodied AI research.

Visual Navigation

Understanding Pure CLIP Guidance for Voxel Grid NeRF Models

no code implementations • 30 Sep 2022 • Han-Hung Lee, Angel X. Chang

We illustrate how different image-based augmentations prevent the adversarial generation problem, and how the generated results are impacted.

Text to 3D

Articulated 3D Human-Object Interactions from RGB Videos: An Empirical Analysis of Approaches and Challenges

1 code implementation • 12 Sep 2022 • Sanjay Haresh, Xiaohao Sun, Hanxiao Jiang, Angel X. Chang, Manolis Savva

Human-object interactions with articulated objects are common in everyday life.

3D Reconstruction Human-Object Interaction Detection +2

OPD: Single-view 3D Openable Part Detection

1 code implementation • 30 Mar 2022 • Hanxiao Jiang, Yongsen Mao, Manolis Savva, Angel X. Chang

The input is a single image of an object, and as output we detect what parts of the object can open, and the motion parameters describing the articulation of each openable part.

Object OPD: Single-view 3D Openable Part Detection

TriCoLo: Trimodal Contrastive Loss for Text to Shape Retrieval

no code implementations • 19 Jan 2022 • Yue Ruan, Han-Hung Lee, Yiming Zhang, Ke Zhang, Angel X. Chang

Text-to-shape retrieval is an increasingly relevant problem with the growth of 3D shape data.

Contrastive Learning Multi-Task Learning +2

Roominoes: Generating Novel 3D Floor Plans From Existing 3D Rooms

no code implementations • 10 Dec 2021 • Kai Wang, Xianghao Xu, Leon Lei, Selena Ling, Natalie Lindsay, Angel X. Chang, Manolis Savva, Daniel Ritchie

We then discuss different strategies for solving the problem, and design two representative pipelines: one uses available 2D floor plans to guide selection and deformation of 3D rooms; the other learns to retrieve a set of compatible 3D rooms and combine them into novel layouts.

3D Reconstruction Autonomous Navigation +2

D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding

no code implementations • 2 Dec 2021 • Dave Zhenyu Chen, Qirui Wu, Matthias Nießner, Angel X. Chang

Our D3Net unifies dense captioning and visual grounding in 3D in a self-critical manner.

3D dense captioning Caption Generation +2

Interpretation of Emergent Communication in Heterogeneous Collaborative Embodied Agents

no code implementations • ICCV 2021 • Shivansh Patel, Saim Wani, Unnat Jain, Alexander Schwing, Svetlana Lazebnik, Manolis Savva, Angel X. Chang

We show that the emergent communication can be grounded to the agent observations and the spatial structure of the 3D environment.

Language-Aligned Waypoint (LAW) Supervision for Vision-and-Language Navigation in Continuous Environments

no code implementations • EMNLP 2021 • Sonia Raychaudhuri, Saim Wani, Shivansh Patel, Unnat Jain, Angel X. Chang

Prior work supervises the agent with actions based on the shortest path from the agent's location to the goal, but such goal-oriented supervision is often not in alignment with the instruction.

Vision and Language Navigation

Habitat-Matterport 3D Dataset (HM3D): 1000 Large-scale 3D Environments for Embodied AI

2 code implementations • 16 Sep 2021 • Santhosh K. Ramakrishnan, Aaron Gokaslan, Erik Wijmans, Oleksandr Maksymets, Alex Clegg, John Turner, Eric Undersander, Wojciech Galuba, Andrew Westbury, Angel X. Chang, Manolis Savva, Yili Zhao, Dhruv Batra

When compared to existing photorealistic 3D datasets such as Replica, MP3D, Gibson, and ScanNet, images rendered from HM3D have 20-85% higher visual fidelity w.r.t.

PointGoal Navigation Surface Reconstruction

293

Mirror3D: Depth Refinement for Mirror Surfaces

1 code implementation • CVPR 2021 • Jiaqi Tan, Weijie Lin, Angel X. Chang, Manolis Savva

Despite recent progress in depth sensing and 3D reconstruction, mirror surfaces are a significant source of errors.

3D Reconstruction Depth Estimation

Plan2Scene: Converting Floorplans to 3D Scenes

1 code implementation • CVPR 2021 • Madhawa Vidanapathirana, Qirui Wu, Yasutaka Furukawa, Angel X. Chang, Manolis Savva

We address the task of converting a floorplan and a set of associated photos of a residence into a textured 3D mesh model, a task which we call Plan2Scene.

Ranked #1 on Plan2Scene on Rent3D++

Plan2Scene

424

MultiON: Benchmarking Semantic Map Memory using Multi-Object Navigation

no code implementations • NeurIPS 2020 • Saim Wani, Shivansh Patel, Unnat Jain, Angel X. Chang, Manolis Savva

We propose the multiON task, which requires navigation to an episode-specific sequence of objects in a realistic environment.

Benchmarking Object

Scan2Cap: Context-aware Dense Captioning in RGB-D Scans

no code implementations • CVPR 2021 • Dave Zhenyu Chen, Ali Gholami, Matthias Nießner, Angel X. Chang

We introduce the task of dense captioning in 3D scans from commodity RGB-D sensors.

3D Object Detection Dense Captioning +3

Rearrangement: A Challenge for Embodied AI

no code implementations • 3 Nov 2020 • Dhruv Batra, Angel X. Chang, Sonia Chernova, Andrew J. Davison, Jia Deng, Vladlen Koltun, Sergey Levine, Jitendra Malik, Igor Mordatch, Roozbeh Mottaghi, Manolis Savva, Hao Su

In the rearrangement task, the goal is to bring a given physical environment into a specified state.

Benchmarking

SAPIEN: A SimulAted Part-based Interactive ENvironment

1 code implementation • CVPR 2020 • Fanbo Xiang, Yuzhe Qin, Kaichun Mo, Yikuan Xia, Hao Zhu, Fangchen Liu, Minghua Liu, Hanxiao Jiang, Yifu Yuan, He Wang, Li Yi, Angel X. Chang, Leonidas J. Guibas, Hao Su

To achieve this task, a simulated environment with physically realistic simulation, sufficient articulated objects, and transferability to the real robot is indispensable.

Attribute

319

ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language

3 code implementations • ECCV 2020 • Dave Zhenyu Chen, Angel X. Chang, Matthias Nießner

We introduce the task of 3D object localization in RGB-D scans using natural language descriptions.

Object Object Localization +3

211

Mimic and Rephrase: Reflective Listening in Open-Ended Dialogue

no code implementations • CONLL 2019 • Justin Dieter, Tian Wang, Arun Tejasvi Chaganty, Gabor Angeli, Angel X. Chang

Reflective listening - demonstrating that you have heard your conversational partner - is key to effective communication.

PartNet: A Large-scale Benchmark for Fine-grained and Hierarchical Part-level 3D Object Understanding

5 code implementations • CVPR 2019 • Kaichun Mo, Shilin Zhu, Angel X. Chang, Li Yi, Subarna Tripathi, Leonidas J. Guibas, Hao Su

We present PartNet: a consistent, large-scale dataset of 3D objects annotated with fine-grained, instance-level, and hierarchical 3D part information.

Ranked #3 on 3D Instance Segmentation on PartNet

3D Instance Segmentation 3D Semantic Segmentation +2

1,351

Scan2CAD: Learning CAD Model Alignment in RGB-D Scans

2 code implementations • CVPR 2019 • Armen Avetisyan, Manuel Dahnert, Angela Dai, Manolis Savva, Angel X. Chang, Matthias Nießner

For a 3D reconstruction of an indoor scene, our method takes as input a set of CAD models, and predicts a 9DoF pose that aligns each model to the underlying scan geometry.

Ranked #1 on 3D Reconstruction on Scan2CAD

3D Reconstruction

413

Im2Pano3D: Extrapolating 360° Structure and Semantics Beyond the Field of View

no code implementations • CVPR 2018 • Shuran Song, Andy Zeng, Angel X. Chang, Manolis Savva, Silvio Savarese, Thomas Funkhouser

We present Im2Pano3D, a convolutional neural network that generates a dense prediction of 3D structure and a probability distribution of semantic labels for a full 360 panoramic view of an indoor scene when given only a partial observation (<=50%) in the form of an RGB-D image.

Text2Shape: Generating Shapes from Natural Language by Learning Joint Embeddings

2 code implementations • 22 Mar 2018 • Kevin Chen, Christopher B. Choy, Manolis Savva, Angel X. Chang, Thomas Funkhouser, Silvio Savarese

To this end, we first learn joint embeddings of freeform text descriptions and colored 3D shapes.

Metric Learning Retrieval

Im2Pano3D: Extrapolating 360 Structure and Semantics Beyond the Field of View

no code implementations • 12 Dec 2017 • Shuran Song, Andy Zeng, Angel X. Chang, Manolis Savva, Silvio Savarese, Thomas Funkhouser

MINOS: Multimodal Indoor Simulator for Navigation in Complex Environments

2 code implementations • 11 Dec 2017 • Manolis Savva, Angel X. Chang, Alexey Dosovitskiy, Thomas Funkhouser, Vladlen Koltun

We present MINOS, a simulator designed to support the development of multisensory models for goal-directed navigation in complex indoor environments.

Navigate reinforcement-learning +1

200

Learning Where to Look: Data-Driven Viewpoint Set Selection for 3D Scenes

no code implementations • 7 Apr 2017 • Kyle Genova, Manolis Savva, Angel X. Chang, Thomas Funkhouser

We provide a search algorithm that generates a sampling of likely candidate views according to the example distribution, and a set selection algorithm that chooses a subset of the candidates that jointly cover the example distribution.

Segmentation Semantic Segmentation

SceneSeer: 3D Scene Design with Natural Language

no code implementations • 28 Feb 2017 • Angel X. Chang, Mihail Eric, Manolis Savva, Christopher D. Manning

We present SceneSeer: an interactive text to 3D scene generation system that allows a user to design 3D scenes using natural language.

Scene Generation Text to 3D

SceneSuggest: Context-driven 3D Scene Design

no code implementations • 28 Feb 2017 • Manolis Savva, Angel X. Chang, Maneesh Agrawala

We present SceneSuggest: an interactive 3D scene design system providing context-driven suggestions for 3D model retrieval and placement.

Graphics Human-Computer Interaction

ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes

1 code implementation • CVPR 2017 • Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner

A key requirement for leveraging supervised deep learning methods is the availability of large, labeled datasets.

Ranked #11 on Semantic Segmentation on ScanNetV2

3D Object Classification General Classification +4

Semantic Scene Completion from a Single Depth Image

3 code implementations • CVPR 2017 • Shuran Song, Fisher Yu, Andy Zeng, Angel X. Chang, Manolis Savva, Thomas Funkhouser

This paper focuses on semantic scene completion, a task for producing a complete 3D voxel representation of volumetric occupancy and semantic labels for a scene from a single-view depth map observation.

Ranked #2 on 3D Semantic Scene Completion on KITTI-360