Search Results for author: Xiaoxiao Long

Found 23 papers, 11 papers with code

TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes

1 code implementation28 Mar 2024 Bu Jin, Yupeng Zheng, Pengfei Li, Weize Li, Yuhang Zheng, Sujie Hu, Xinyu Liu, Jinwei Zhu, Zhijie Yan, Haiyang Sun, Kun Zhan, Peng Jia, Xiaoxiao Long, Yilun Chen, Hao Zhao

However, the exploration of 3D dense captioning in outdoor scenes is hindered by two major challenges: 1) the \textbf{domain gap} between indoor and outdoor scenes, such as dynamics and sparse visual inputs, makes it difficult to directly adapt existing indoor methods; 2) the \textbf{lack of data} with comprehensive box-caption pair annotations specifically tailored for outdoor scenes.

3D dense captioning Dense Captioning

LaserHuman: Language-guided Scene-aware Human Motion Generation in Free Environment

1 code implementation20 Mar 2024 Peishan Cong, Ziyi Wang, Zhiyang Dou, Yiming Ren, Wei Yin, Kai Cheng, Yujing Sun, Xiaoxiao Long, Xinge Zhu, Yuexin Ma

Language-guided scene-aware human motion generation has great significance for entertainment and robotics.

GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image

no code implementations18 Mar 2024 Xiao Fu, Wei Yin, Mu Hu, Kaixuan Wang, Yuexin Ma, Ping Tan, Shaojie Shen, Dahua Lin, Xiaoxiao Long

We introduce GeoWizard, a new generative foundation model designed for estimating geometric attributes, e. g., depth and normals, from single images.

3D Reconstruction

GaussianGrasper: 3D Language Gaussian Splatting for Open-vocabulary Robotic Grasping

1 code implementation14 Mar 2024 Yuhang Zheng, Xiangyu Chen, Yupeng Zheng, Songen Gu, Runyi Yang, Bu Jin, Pengfei Li, Chengliang Zhong, Zengmao Wang, Lina Liu, Chao Yang, Dawei Wang, Zhen Chen, Xiaoxiao Long, Meiqing Wang

In particular, we propose an Efficient Feature Distillation (EFD) module that employs contrastive learning to efficiently and accurately distill language embeddings derived from foundational models.

Contrastive Learning Robotic Grasping

MonoOcc: Digging into Monocular Semantic Occupancy Prediction

1 code implementation13 Mar 2024 Yupeng Zheng, Xiang Li, Pengfei Li, Yuhang Zheng, Bu Jin, Chengliang Zhong, Xiaoxiao Long, Hao Zhao, Qichao Zhang

However, existing methods rely on a complex cascaded framework with relatively limited information to restore 3D scenes, including a dependency on supervision solely on the whole network's output, single-frame input, and the utilization of a small backbone.

Autonomous Vehicles

GaussianPro: 3D Gaussian Splatting with Progressive Propagation

no code implementations22 Feb 2024 Kai Cheng, Xiaoxiao Long, Kaizhi Yang, Yao Yao, Wei Yin, Yuexin Ma, Wenping Wang, Xuejin Chen

The advent of 3D Gaussian Splatting (3DGS) has recently brought about a revolution in the field of neural rendering, facilitating high-quality renderings at real-time speed.

Neural Rendering Patch Matching

Adaptive Surface Normal Constraint for Geometric Estimation from Monocular Images

no code implementations8 Feb 2024 Xiaoxiao Long, Yuhang Zheng, Yupeng Zheng, Beiwen Tian, Cheng Lin, Lingjie Liu, Hao Zhao, Guyue Zhou, Wenping Wang

We introduce a novel approach to learn geometries such as depth and surface normal from images while incorporating geometric context.

Depth Estimation

GaussianShader: 3D Gaussian Splatting with Shading Functions for Reflective Surfaces

no code implementations29 Nov 2023 Yingwenqi Jiang, Jiadong Tu, YuAn Liu, Xifeng Gao, Xiaoxiao Long, Wenping Wang, Yuexin Ma

In this paper, we present GaussianShader, a novel method that applies a simplified shading function on 3D Gaussians to enhance the neural rendering in scenes with reflective surfaces while preserving the training and rendering efficiency.

Neural Rendering

UC-NeRF: Neural Radiance Field for Under-Calibrated Multi-view Cameras in Autonomous Driving

no code implementations28 Nov 2023 Kai Cheng, Xiaoxiao Long, Wei Yin, Jin Wang, Zhiqiang Wu, Yuexin Ma, Kaixuan Wang, Xiaozhi Chen, Xuejin Chen

Multi-camera setups find widespread use across various applications, such as autonomous driving, as they greatly expand sensing capabilities.

Autonomous Driving Depth Estimation +1

Wonder3D: Single Image to 3D using Cross-Domain Diffusion

no code implementations23 Oct 2023 Xiaoxiao Long, Yuan-Chen Guo, Cheng Lin, YuAn Liu, Zhiyang Dou, Lingjie Liu, Yuexin Ma, Song-Hai Zhang, Marc Habermann, Christian Theobalt, Wenping Wang

In this work, we introduce Wonder3D, a novel method for efficiently generating high-fidelity textured meshes from single-view images. Recent methods based on Score Distillation Sampling (SDS) have shown the potential to recover 3D geometry from 2D diffusion priors, but they typically suffer from time-consuming per-shape optimization and inconsistent geometry.

Image to 3D

SyncDreamer: Generating Multiview-consistent Images from a Single-view Image

2 code implementations7 Sep 2023 YuAn Liu, Cheng Lin, Zijiao Zeng, Xiaoxiao Long, Lingjie Liu, Taku Komura, Wenping Wang

In this paper, we present a novel diffusion model called that generates multiview-consistent images from a single-view image.

3D Generation Image to 3D +2

NeRO: Neural Geometry and BRDF Reconstruction of Reflective Objects from Multiview Images

1 code implementation27 May 2023 YuAn Liu, Peng Wang, Cheng Lin, Xiaoxiao Long, Jiepeng Wang, Lingjie Liu, Taku Komura, Wenping Wang

We present a neural rendering-based method called NeRO for reconstructing the geometry and the BRDF of reflective objects from multiview images captured in an unknown environment.

Neural Rendering Object

NeTO:Neural Reconstruction of Transparent Objects with Self-Occlusion Aware Refraction-Tracing

no code implementations ICCV 2023 Zongcheng Li, Xiaoxiao Long, Yusen Wang, Tuo Cao, Wenping Wang, Fei Luo, Chunxia Xiao

In this paper, we propose to leverage implicit Signed Distance Function (SDF) as surface representation, and optimize the SDF field via volume rendering with a self-occlusion aware refractive ray tracing.

Transparent objects

Learning Long-Range Information with Dual-Scale Transformers for Indoor Scene Completion

no code implementations ICCV 2023 Ziqi Wang, Fei Luo, Xiaoxiao Long, Wenxiao Zhang, Chunxia Xiao

Due to the limited resolution of 3D sensors and the inevitable mutual occlusion between objects, 3D scans of real scenes are commonly incomplete.

NeuralUDF: Learning Unsigned Distance Fields for Multi-view Reconstruction of Surfaces with Arbitrary Topologies

no code implementations CVPR 2023 Xiaoxiao Long, Cheng Lin, Lingjie Liu, YuAn Liu, Peng Wang, Christian Theobalt, Taku Komura, Wenping Wang

In this paper, we propose to represent surfaces as the Unsigned Distance Function (UDF) and develop a new volume rendering scheme to learn the neural UDF representation.

Neural Rendering

NeuRIS: Neural Reconstruction of Indoor Scenes Using Normal Priors

no code implementations27 Jun 2022 Jiepeng Wang, Peng Wang, Xiaoxiao Long, Christian Theobalt, Taku Komura, Lingjie Liu, Wenping Wang

The key idea of NeuRIS is to integrate estimated normal of indoor scenes as a prior in a neural rendering framework for reconstructing large texture-less shapes and, importantly, to do this in an adaptive manner to also enable the reconstruction of irregular shapes with fine details.

3D Reconstruction Neural Rendering

SparseNeuS: Fast Generalizable Neural Surface Reconstruction from Sparse Views

1 code implementation12 Jun 2022 Xiaoxiao Long, Cheng Lin, Peng Wang, Taku Komura, Wenping Wang

We introduce SparseNeuS, a novel neural rendering based method for the task of surface reconstruction from multi-view images.

Neural Rendering Surface Reconstruction

Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks

1 code implementation CVPR 2021 Xiaoxiao Long, Lingjie Liu, Wei Li, Christian Theobalt, Wenping Wang

We present a novel method for multi-view depth estimation from a single video, which is a critical task in various applications, such as perception, reconstruction and robot navigation.

Depth Estimation Robot Navigation

Occlusion-Aware Depth Estimation with Adaptive Normal Constraints

1 code implementation ECCV 2020 Xiaoxiao Long, Lingjie Liu, Christian Theobalt, Wenping Wang

We present a new learning-based method for multi-frame depth estimation from a color video, which is a fundamental problem in scene understanding, robot navigation or handheld 3D reconstruction.

3D Reconstruction Depth Estimation +2

Cannot find the paper you are looking for? You can Submit a new open access paper.