1 code implementation • 4 Jun 2024 • Mohamed El Amine Boudjoghra, Angela Dai, Jean Lahoud, Hisham Cholakkal, Rao Muhammad Anwer, Salman Khan, Fahad Shahbaz Khan
To this end, we propose a fast yet accurate open-vocabulary 3D instance segmentation approach, named Open-YOLO 3D, that effectively leverages only 2D object detection from multi-view RGB images for open-vocabulary 3D instance segmentation.
1 code implementation • 20 May 2024 • Yiheng Xiong, Angela Dai
Experiments demonstrate that our model outperforms the state of the art in both scenarios.
1 code implementation • 12 Mar 2024 • Mohamed Elrefaie, Angela Dai, Faez Ahmed
This study introduces DrivAerNet, a large-scale high-fidelity CFD dataset of 3D industry-standard car shapes, and RegDGCNN, a dynamic graph convolutional neural network model, both aimed at aerodynamic car design through machine learning.
1 code implementation • 13 Dec 2023 • Shivangi Aneja, Justus Thies, Angela Dai, Matthias Nießner
We propose a new latent diffusion model for this task, operating in the expression space of neural parametric head models, to synthesize audio-driven realistic head sequences.
1 code implementation • 4 Dec 2023 • Anh-Quan Cao, Angela Dai, Raoul de Charette
We propose the task of Panoptic Scene Completion (PSC), which extends the recently popular Semantic Scene Completion (SSC) task with instance-level information to produce a richer understanding of the 3D scene.
no code implementations • 2 Dec 2023 • Jiapeng Tang, Angela Dai, Yinyu Nie, Lev Markhasin, Justus Thies, Matthias Niessner
We introduce Diffusion Parametric Head Models (DPHMs), a generative model that enables robust volumetric head reconstruction and tracking from monocular depth sequences.
no code implementations • 30 Nov 2023 • Daoyi Gao, Dávid Rozenberszki, Stefan Leutenegger, Angela Dai
We formulate this as a conditional generative task, leveraging diffusion to learn implicit probabilistic models capturing the shape, pose, and scale of CAD objects in an image.
no code implementations • 29 Nov 2023 • Lei Li, Angela Dai
Given a natural language description and a coarse point location of the desired interaction in a 3D scene, we first leverage VLMs to imagine plausible 2D human interactions inpainted into multiple rendered views of the scene.
2 code implementations • 27 Nov 2023 • Yawar Siddiqui, Antonio Alliegro, Alexey Artemov, Tatiana Tommasi, Daniele Sirigatti, Vladislav Rosov, Angela Dai, Matthias Nießner
We introduce MeshGPT, a new approach for generating triangle meshes that reflects the compactness typical of artist-created meshes, in contrast to dense triangle meshes extracted by iso-surfacing methods from neural fields.
no code implementations • 27 Nov 2023 • Christian Diller, Angela Dai
Our method first learns to model human motion, object motion, and contact in a joint diffusion process, inter-correlated through cross-attention.
Human-Object Interaction Detection • Human-Object Interaction Generation • +1
no code implementations • ICCV 2023 • Chandan Yeshwanth, Yueh-Cheng Liu, Matthias Nießner, Angela Dai
Each scene is captured with a high-end laser scanner at sub-millimeter resolution, along with registered 33-megapixel images from a DSLR camera, and RGB-D streams from an iPhone.
no code implementations • ICCV 2023 • Alexey Bokhovkin, Shubham Tulsiani, Angela Dai
The learned texture manifold enables effective navigation to generate an object texture for a given 3D object geometry that matches an input RGB image, and remains robust even in challenging real-world scenarios where the mesh geometry is only an inexact match to the underlying geometry in the RGB image.
1 code implementation • ICCV 2023 • Ziya Erkoç, Fangchang Ma, Qi Shan, Matthias Nießner, Angela Dai
HyperDiffusion operates directly on MLP weights and generates new neural implicit fields encoded by synthesized MLP parameters.
no code implementations • 25 Mar 2023 • David Rozenberszki, Or Litany, Angela Dai
We propose UnScene3D, the first fully unsupervised 3D learning approach for class-agnostic 3D instance segmentation of indoor scans.
1 code implementation • 24 Mar 2023 • Jiapeng Tang, Yinyu Nie, Lev Markhasin, Angela Dai, Justus Thies, Matthias Nießner
We introduce a diffusion network to synthesize a collection of 3D indoor objects by denoising a set of unordered object attributes.
no code implementations • CVPR 2023 • Ji Hou, Xiaoliang Dai, Zijian He, Angela Dai, Matthias Nießner
Current popular backbones in computer vision, such as Vision Transformers (ViT) and ResNets, are trained to perceive the world from 2D images.
1 code implementation • CVPR 2023 • Yawar Siddiqui, Lorenzo Porzi, Samuel Rota Buló, Norman Müller, Matthias Nießner, Angela Dai, Peter Kontschieder
We propose Panoptic Lifting, a novel approach for learning panoptic 3D volumetric representations from images of in-the-wild scenes.
no code implementations • CVPR 2023 • Can Gümeli, Angela Dai, Matthias Nießner
We present ObjectMatch, a semantic and object-centric camera pose estimator for RGB-D SLAM pipelines.
1 code implementation • 2 Dec 2022 • Shivangi Aneja, Justus Thies, Angela Dai, Matthias Nießner
Controllable editing and manipulation are given by language prompts to adapt texture and expression of the 3D morphable model.
no code implementations • 25 Nov 2022 • Angela Dai, Matthias Nießner
Implicit neural fields generating signed distance field representations (SDFs) of 3D shapes have shown remarkable progress in 3D shape reconstruction and generation.
no code implementations • CVPR 2023 • Yinyu Nie, Angela Dai, Xiaoguang Han, Matthias Nießner
Holistic 3D scene understanding entails estimation of both layout configuration and object geometry in a 3D environment.
no code implementations • 25 Nov 2022 • Christian Diller, Thomas Funkhouser, Angela Dai
Thus, we design our method to only require 2D RGB data at inference time while being able to generate 3D human motion sequences.
1 code implementation • 10 Jun 2022 • Yuchen Rao, Yinyu Nie, Angela Dai
While 3D shape representations enable powerful reasoning in many visual and perception applications, learning 3D shape priors tends to be constrained to the specific categories trained on, leading to an inefficient learning process, particularly for general applications with unseen categories.
1 code implementation • 16 Apr 2022 • David Rozenberszki, Or Litany, Angela Dai
Recent advances in 3D semantic segmentation with deep neural networks have shown remarkable success, with rapid performance increase on available datasets.
Ranked #7 on 3D Semantic Segmentation on ScanNet200
no code implementations • 5 Apr 2022 • Yawar Siddiqui, Justus Thies, Fangchang Ma, Qi Shan, Matthias Nießner, Angela Dai
Texture cues on 3D objects are key to compelling visual representations, with the possibility to create high visual fidelity with inherent spatial consistency across different views.
no code implementations • 24 Mar 2022 • Tim Beyer, Angela Dai
CAD model retrieval to real-world scene observations has shown strong promise as a basis for 3D perception of objects and a clean, lightweight mesh-based scene representation; however, current approaches to retrieve CAD models to a query scan rely on expensive manual annotations of 1:1 associations of CAD-scan objects, which typically contain strong lower-level geometric differences.
no code implementations • CVPR 2023 • Alexey Bokhovkin, Angela Dai
3D object recognition has seen significant advances in recent years, showing impressive performance on real-world 3D scan benchmarks, but lacking in object part reasoning, which is fundamental to higher-level scene understanding such as inter-object similarities or object functionality.
no code implementations • CVPR 2022 • Pablo Palafox, Nikolaos Sarafianos, Tony Tung, Angela Dai
We observe that deformable object motion is often semantically structured, and thus propose to learn Structured-implicit PArametric Models (SPAMs) as a deformable object representation that structurally decomposes non-rigid object motion into part-based disentangled representations of shape and pose, with each being represented by deep implicit functions.
no code implementations • 6 Dec 2021 • Yujin Chen, Matthias Nießner, Angela Dai
We present a new approach to instill 4D dynamic object priors into learned 3D representations by unsupervised pre-training.
Ranked #21 on 3D Instance Segmentation on ScanNet(v2)
1 code implementation • CVPR 2022 • Can Gümeli, Angela Dai, Matthias Nießner
We present ROCA, a novel end-to-end approach that retrieves and aligns 3D CAD models from a shape database to a single input image.
3D Dense Shape Correspondence • 3D Object Detection From Monocular Images • +2
no code implementations • 1 Dec 2021 • Yinyu Nie, Angela Dai, Xiaoguang Han, Matthias Nießner
To this end, we propose P2R-Net to learn a probabilistic 3D model of the objects in a scene characterized by their class categories and oriented 3D bounding boxes, based on an input observed human trajectory in the environment.
1 code implementation • NeurIPS 2021 • Manuel Dahnert, Ji Hou, Matthias Nießner, Angela Dai
Inspired by 2D panoptic segmentation, we propose to unify the tasks of geometric reconstruction, 3D semantic segmentation, and 3D instance segmentation into the task of panoptic 3D scene reconstruction: from a single RGB image, predicting the complete geometric reconstruction of the scene in the camera frustum of the image, along with semantic and instance segmentations.
no code implementations • ICCV 2021 • Weicheng Kuo, Anelia Angelova, Tsung-Yi Lin, Angela Dai
3D perception of object shapes from RGB image input is fundamental towards semantic scene understanding, grounding image-based perception in our spatially 3-dimensional real-world environments.
1 code implementation • NeurIPS 2021 • Aljaž Božič, Pablo Palafox, Justus Thies, Angela Dai, Matthias Nießner
We introduce TransformerFusion, a transformer-based 3D scene reconstruction approach.
1 code implementation • ICCV 2021 • Ji Hou, Saining Xie, Benjamin Graham, Angela Dai, Matthias Nießner
Inspired by these advances in geometric understanding, we aim to imbue image-based perception with representations learned under geometric constraints.
1 code implementation • ICCV 2021 • Pablo Palafox, Aljaž Božič, Justus Thies, Matthias Nießner, Angela Dai
Crucially, once learned, our neural parametric models of shape and pose enable optimization over the learned spaces to fit to new observations, similar to the fitting of a traditional parametric model, e.g., SMPL.
1 code implementation • ICCV 2021 • Yawar Siddiqui, Justus Thies, Fangchang Ma, Qi Shan, Matthias Nießner, Angela Dai
3D reconstruction of large scenes is a challenging problem due to the high-complexity nature of the solution space, in particular for generative neural networks.
no code implementations • CVPR 2021 • Norman Müller, Yu-Shiang Wong, Niloy J. Mitra, Angela Dai, Matthias Nießner
From a sequence of RGB-D frames, we detect objects in each frame and learn to predict their complete object geometry as well as a dense correspondence mapping into a canonical space.
1 code implementation • CVPR 2021 • Alexey Bokhovkin, Vladislav Ishimtsev, Emil Bogomolov, Denis Zorin, Alexey Artemov, Evgeny Burnaev, Angela Dai
Recent advances in 3D semantic scene understanding have shown impressive progress in 3D instance segmentation, enabling object-level reasoning about 3D scenes; however, a finer-grained understanding is required to enable interactions with objects and their functional understanding.
1 code implementation • CVPR 2021 • Aljaž Božič, Pablo Palafox, Michael Zollhöfer, Justus Thies, Angela Dai, Matthias Nießner
We introduce Neural Deformation Graphs for globally-consistent deformation tracking and 3D reconstruction of non-rigid objects.
no code implementations • CVPR 2022 • Christian Diller, Thomas Funkhouser, Angela Dai
To predict characteristic poses, we propose a probabilistic approach that models the possible multi-modality in the distribution of likely characteristic poses.
no code implementations • ECCV 2020 • Wei-cheng Kuo, Anelia Angelova, Tsung-Yi Lin, Angela Dai
We propose to leverage existing large-scale datasets of 3D models to understand the underlying 3D structure of objects seen in an image by constructing a CAD-based representation of the objects and their poses.
1 code implementation • CVPR 2021 • Angela Dai, Yawar Siddiqui, Justus Thies, Julien Valentin, Matthias Nießner
We present SPSG, a novel approach to generate high-quality, colored 3D models of scenes from RGB-D scan observations by learning to infer unobserved scene geometry and color in a self-supervised fashion.
1 code implementation • NeurIPS 2020 • Aljaž Božič, Pablo Palafox, Michael Zollhöfer, Angela Dai, Justus Thies, Matthias Nießner
We introduce a novel, end-to-end learnable, differentiable non-rigid tracker that enables state-of-the-art non-rigid reconstruction by a learned robust optimization.
no code implementations • ECCV 2020 • Armen Avetisyan, Tatiana Khanova, Christopher Choy, Denver Dash, Angela Dai, Matthias Nießner
We present a novel approach to reconstructing lightweight, CAD-based representations of scanned 3D environments from commodity RGB-D sensors.
1 code implementation • CVPR 2020 • Jingwei Huang, Justus Thies, Angela Dai, Abhijit Kundu, Chiyu Max Jiang, Leonidas Guibas, Matthias Nießner, Thomas Funkhouser
In this work, we present a novel approach for color texture generation using a conditional adversarial loss obtained from weakly-supervised views.
2 code implementations • CVPR 2020 • Angela Dai, Christian Diller, Matthias Nießner
We present a novel approach that converts partial and noisy RGB-D scans into high-quality 3D scene reconstructions by inferring unobserved scene geometry.
1 code implementation • ICCV 2019 • Manuel Dahnert, Angela Dai, Leonidas Guibas, Matthias Nießner
We propose a novel approach to learn a joint embedding space between scan and CAD geometry, where semantically similar objects from both domains lie close together.
1 code implementation • ICCV 2019 • Armen Avetisyan, Angela Dai, Matthias Nießner
We present a novel, end-to-end approach to align CAD models to a 3D scan of a scene, enabling transformation of a noisy, incomplete 3D scan to a compact CAD reconstruction with clean, complete object geometry.
no code implementations • CVPR 2020 • Ji Hou, Angela Dai, Matthias Nießner
Thus, we introduce the task of semantic instance completion: from an incomplete RGB-D scan of a scene, we aim to detect the individual object instances and infer their complete object geometry.
1 code implementation • CVPR 2019 • Ji Hou, Angela Dai, Matthias Nießner
We introduce 3D-SIS, a novel neural network architecture for 3D semantic instance segmentation in commodity RGB-D scans.
Ranked #3 on 3D Semantic Instance Segmentation on ScanNetV2
2 code implementations • CVPR 2019 • Armen Avetisyan, Manuel Dahnert, Angela Dai, Manolis Savva, Angel X. Chang, Matthias Nießner
For a 3D reconstruction of an indoor scene, our method takes as input a set of CAD models, and predicts a 9DoF pose that aligns each model to the underlying scan geometry.
Ranked #1 on 3D Reconstruction on Scan2CAD
1 code implementation • CVPR 2019 • Angela Dai, Matthias Nießner
We introduce Scan2Mesh, a novel data-driven generative approach which transforms an unstructured and potentially incomplete range scan into a structured 3D mesh representation.
1 code implementation • ECCV 2018 • Angela Dai, Matthias Nießner
We present 3DMV, a novel method for 3D semantic scene segmentation of RGB-D scans in indoor environments using a joint 3D-multi-view prediction network.
Ranked #1 on Scene Segmentation on ScanNet
no code implementations • CVPR 2018 • Angela Dai, Daniel Ritchie, Martin Bokeloh, Scott Reed, Jürgen Sturm, Matthias Nießner
We introduce ScanComplete, a novel data-driven approach for taking an incomplete 3D scan of a scene as input and predicting a complete 3D model along with per-voxel semantic labels.
1 code implementation • 18 Sep 2017 • Angel Chang, Angela Dai, Thomas Funkhouser, Maciej Halber, Matthias Nießner, Manolis Savva, Shuran Song, Andy Zeng, Yinda Zhang
Access to large, diverse RGB-D datasets is critical for training RGB-D scene understanding algorithms.
1 code implementation • CVPR 2017 • Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner
A key requirement for leveraging supervised deep learning methods is the availability of large, labeled datasets.
Ranked #11 on Semantic Segmentation on ScanNetV2
2 code implementations • CVPR 2017 • Angela Dai, Charles Ruizhongtai Qi, Matthias Nießner
Although our 3D-EPN outperforms state-of-the-art completion methods, the main contribution of our work lies in the combination of a data-driven shape predictor and analytic 3D shape synthesis.
2 code implementations • CVPR 2016 • Charles R. Qi, Hao Su, Matthias Niessner, Angela Dai, Mengyuan Yan, Leonidas J. Guibas
Empirical results from these two types of CNNs exhibit a large gap, indicating that existing volumetric CNN architectures and approaches are unable to fully exploit the power of 3D representations.
Ranked #3 on 3D Object Recognition on ModelNet40
1 code implementation • 5 Apr 2016 • Angela Dai, Matthias Nießner, Michael Zollhöfer, Shahram Izadi, Christian Theobalt
Our approach estimates globally optimized (i.e., bundle adjusted) poses in real-time, supports robust tracking with recovery from gross tracking failures (i.e., relocalization), and re-estimates the 3D model in real-time to ensure global consistency; all within a single framework.
no code implementations • 18 Mar 2016 • Julien Valentin, Angela Dai, Matthias Nießner, Pushmeet Kohli, Philip Torr, Shahram Izadi, Cem Keskin
We demonstrate the efficacy of our approach on the challenging problem of RGB Camera Relocalization.