Search Results for author: Wei Yin

Found 44 papers, 21 papers with code

Depth Any Video with Scalable Synthetic Data

1 code implementation · 14 Oct 2024 · Honghui Yang, Di Huang, Wei Yin, Chunhua Shen, Haifeng Liu, Xiaofei He, Binbin Lin, Wanli Ouyang, Tong He

Video depth estimation has long been hindered by the scarcity of consistent and scalable ground truth data, leading to inconsistent and unreliable results.

Depth Estimation

DOME: Taming Diffusion Model into High-Fidelity Controllable Occupancy World Model

no code implementations · 14 Oct 2024 · Songen Gu, Wei Yin, Bu Jin, Xiaoyang Guo, Junming Wang, Haodong Li, Qian Zhang, Xiaoxiao Long

The ability of this world model to capture the evolution of the environment is crucial for planning in autonomous driving.

Autonomous Driving

HE-Drive: Human-Like End-to-End Driving with Vision Language Models

no code implementations · 7 Oct 2024 · Junming Wang, Xingyu Zhang, Zebin Xing, Songen Gu, Xiaoyang Guo, Yang Hu, Ziying Song, Qian Zhang, Xiaoxiao Long, Wei Yin

In this paper, we propose HE-Drive: the first human-like-centric end-to-end autonomous driving system to generate trajectories that are both temporally consistent and comfortable.

Autonomous Driving Denoising +1

Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction

no code implementations · 26 Sep 2024 · Jing He, Haodong Li, Wei Yin, Yixun Liang, Leheng Li, Kaiqiang Zhou, Hongbo Zhang, Bingbing Liu, Ying-Cong Chen

In this paper, we provide a systematic analysis of the diffusion formulation for dense prediction, focusing on both quality and efficiency.

3D Reconstruction Denoising +3

DC-Gaussian: Improving 3D Gaussian Splatting for Reflective Dash Cam Videos

1 code implementation · 27 May 2024 · Linhan Wang, Kai Cheng, Shuo Lei, Shengkun Wang, Wei Yin, Chenyang Lei, Xiaoxiao Long, Chang-Tien Lu

Dash cam videos often suffer from severe obstructions such as reflections and occlusions on the windshields, which significantly impede the application of neural rendering techniques.

Autonomous Vehicles Neural Rendering +1

GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image

1 code implementation · 18 Mar 2024 · Xiao Fu, Wei Yin, Mu Hu, Kaixuan Wang, Yuexin Ma, Ping Tan, Shaojie Shen, Dahua Lin, Xiaoxiao Long

We introduce GeoWizard, a new generative foundation model designed for estimating geometric attributes, e.g., depth and normals, from single images.

3D geometry 3D Reconstruction +1

GaussianPro: 3D Gaussian Splatting with Progressive Propagation

no code implementations · 22 Feb 2024 · Kai Cheng, Xiaoxiao Long, Kaizhi Yang, Yao Yao, Wei Yin, Yuexin Ma, Wenping Wang, Xuejin Chen

The advent of 3D Gaussian Splatting (3DGS) has recently brought about a revolution in the field of neural rendering, facilitating high-quality renderings at real-time speed.

Neural Rendering Patch Matching

SDGE: Stereo Guided Depth Estimation for 360° Camera Sets

no code implementations · 19 Feb 2024 · Jialei Xu, Wei Yin, Dong Gong, Junjun Jiang, Xianming Liu

We suggest building virtual pinhole cameras to resolve the distortion problem of fisheye cameras and unify the processing for the two types of 360° cameras.

3D Object Detection Autonomous Driving +2

GIM: Learning Generalizable Image Matcher From Internet Videos

1 code implementation · 16 Feb 2024 · Xuelun Shen, Zhipeng Cai, Wei Yin, Matthias Müller, Zijun Li, Kaixuan Wang, Xiaozhi Chen, Cheng Wang

Given an architecture, GIM first trains it on standard domain-specific datasets and then combines it with complementary matching methods to create dense labels on nearby frames of novel videos.

3D Reconstruction Camera Pose Estimation +2

PI3D: Efficient Text-to-3D Generation with Pseudo-Image Diffusion

no code implementations · CVPR 2024 · Ying-Tian Liu, Yuan-Chen Guo, Guan Luo, Heyi Sun, Wei Yin, Song-Hai Zhang

However, the generation quality and generalization ability of 3D diffusion models are hindered by the scarcity of high-quality and large-scale 3D datasets.

3D Generation Text to 3D

UC-NeRF: Neural Radiance Field for Under-Calibrated Multi-view Cameras in Autonomous Driving

no code implementations · 28 Nov 2023 · Kai Cheng, Xiaoxiao Long, Wei Yin, Jin Wang, Zhiqiang Wu, Yuexin Ma, Kaixuan Wang, Xiaozhi Chen, Xuejin Chen

Multi-camera setups find widespread use across various applications, such as autonomous driving, as they greatly expand sensing capabilities.

Autonomous Driving Depth Estimation +1

HumanRecon: Neural Reconstruction of Dynamic Human Using Geometric Cues and Physical Priors

1 code implementation · 26 Nov 2023 · Junhui Yin, Wei Yin, Hao Chen, Xuqian Ren, Zhanyu Ma, Jun Guo, Yifan Liu

These priors ensure that the color rendered along rays is robust to view direction and reduce the inherent ambiguities of the density estimated along rays.

Novel View Synthesis

Robust Geometry-Preserving Depth Estimation Using Differentiable Rendering

no code implementations · ICCV 2023 · Chi Zhang, Wei Yin, Gang Yu, Zhibin Wang, Tao Chen, Bin Fu, Joey Tianyi Zhou, Chunhua Shen

In this paper, we propose a learning framework that trains models to predict geometry-preserving depth without requiring extra data or annotations.

Monocular Depth Estimation

Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image

1 code implementation · ICCV 2023 · Wei Yin, Chi Zhang, Hao Chen, Zhipeng Cai, Gang Yu, Kaixuan Wang, Xiaozhi Chen, Chunhua Shen

State-of-the-art (SOTA) monocular metric depth estimation methods can only handle a single camera model and are unable to perform mixed-data training due to the metric ambiguity.

Ranked #25 on Monocular Depth Estimation on NYU-Depth V2 (using extra training data)

Image Reconstruction Monocular Depth Estimation +1

Learning to Fuse Monocular and Multi-view Cues for Multi-frame Depth Estimation in Dynamic Scenes

1 code implementation · CVPR 2023 · Rui Li, Dong Gong, Wei Yin, Hao Chen, Yu Zhu, Kaixuan Wang, Xiaozhi Chen, Jinqiu Sun, Yanning Zhang

To let the geometric perception learned from multi-view cues in static areas propagate to the monocular representation in dynamic areas and let monocular cues enhance the representation of multi-view cost volume, we propose a cross-cue fusion (CCF) module, which includes the cross-cue attention (CCA) to encode the spatially non-local relative intra-relations from each source to enhance the representation of the other.

Autonomous Driving Depth Estimation

Distributed Graph Embedding with Information-Oriented Random Walks

1 code implementation · 28 Mar 2023 · Peng Fang, Arijit Khan, Siqiang Luo, Fang Wang, Dan Feng, Zhenli Li, Wei Yin, Yuchao Cao

Graph embedding maps graph nodes to low-dimensional vectors, and is widely adopted in machine learning tasks.

Graph Embedding graph partitioning +1

Towards Accurate Reconstruction of 3D Scene Shape from A Single Monocular Image

1 code implementation · 28 Aug 2022 · Wei Yin, Jianming Zhang, Oliver Wang, Simon Niklaus, Simon Chen, Yifan Liu, Chunhua Shen

To do so, we propose a two-stage framework that first predicts depth up to an unknown scale and shift from a single monocular image, and then exploits 3D point cloud data to predict the depth shift and the camera's focal length that allow us to recover 3D scene shapes.

Depth Estimation Depth Prediction

Towards Domain-agnostic Depth Completion

1 code implementation · 29 Jul 2022 · Guangkai Xu, Wei Yin, Jianming Zhang, Oliver Wang, Simon Niklaus, Simon Chen, Jia-Wang Bian

Our method leverages a data-driven prior in the form of a single image depth prediction network trained on large-scale datasets, the output of which is used as an input to our model.

Depth Completion Depth Estimation +2

Controllable Shadow Generation Using Pixel Height Maps

no code implementations · 12 Jul 2022 · Yichen Sheng, Yifan Liu, Jianming Zhang, Wei Yin, A. Cengiz Oztireli, He Zhang, Zhe Lin, Eli Shechtman, Bedrich Benes

It can be used to calculate hard shadows in a 2D image based on the projective geometry, providing precise control of the shadows' direction and shape.

Exploiting Correspondences with All-pairs Correlations for Multi-view Depth Estimation

no code implementations · 5 May 2022 · Kai Cheng, Hao Chen, Wei Yin, Guangkai Xu, Xuejin Chen

However, multi-view depth estimation is fundamentally a correspondence-based optimization problem, whereas previous learning-based methods mainly rely on predefined depth hypotheses to build correspondence as the cost volume and implicitly regularize it to fit depth prediction, deviating from the essence of iterative optimization based on stereo correspondence.

Depth Estimation Depth Prediction +1

PointInst3D: Segmenting 3D Instances by Points

no code implementations · 25 Apr 2022 · Tong He, Wei Yin, Chunhua Shen, Anton Van Den Hengel

The current state-of-the-art methods in 3D instance segmentation typically involve a clustering step, despite the tendency towards heuristics, greedy algorithms, and a lack of robustness to the changes in data statistics.

3D Instance Segmentation Clustering +2

Improving Monocular Visual Odometry Using Learned Depth

no code implementations · 4 Apr 2022 · Libo Sun, Wei Yin, Enze Xie, Zhengrong Li, Changming Sun, Chunhua Shen

The core of our framework is a monocular depth estimation module with a strong generalization capability for diverse scenes.

Monocular Depth Estimation Monocular Visual Odometry

Scaling up Multi-domain Semantic Segmentation with Sentence Embeddings

no code implementations · 4 Feb 2022 · Wei Yin, Yifan Liu, Chunhua Shen, Baichuan Sun, Anton Van Den Hengel

The resulting merged semantic segmentation dataset of over 2 million images enables training a model that achieves performance equal to that of state-of-the-art supervised methods on 7 benchmark datasets, despite not using any images therefrom.

Instance Segmentation Monocular Depth Estimation +4

Towards 3D Scene Reconstruction from Locally Scale-Aligned Monocular Video Depth

no code implementations · 3 Feb 2022 · Guangkai Xu, Wei Yin, Hao Chen, Chunhua Shen, Kai Cheng, Feng Wu, Feng Zhao

However, in some video-based scenarios such as video depth estimation and 3D scene reconstruction from a video, the unknown scale and shift residing in per-frame predictions may cause depth inconsistency.

3D Scene Reconstruction Depth Completion +1

Pseudo-LiDAR Based Road Detection

no code implementations · 28 Jul 2021 · Libo Sun, Haokui Zhang, Wei Yin

Specifically, we exploit pseudo-LiDAR using depth estimation, and propose a feature fusion network where RGB and learned depth information are fused for improved road detection.

Depth Estimation Self-Driving Cars

Generic Perceptual Loss for Modeling Structured Output Dependencies

no code implementations · CVPR 2021 · Yifan Liu, Hao Chen, Yu Chen, Wei Yin, Chunhua Shen

We hope that this simple, extended perceptual loss may serve as a generic structured-output loss that is applicable to most structured output learning tasks.

Depth Estimation Image Generation +4

Learning to Recover 3D Scene Shape from a Single Image

1 code implementation · CVPR 2021 · Wei Yin, Jianming Zhang, Oliver Wang, Simon Niklaus, Long Mai, Simon Chen, Chunhua Shen

Despite significant progress in monocular depth estimation in the wild, recent state-of-the-art methods cannot be used to recover accurate 3D scene shape due to an unknown depth shift induced by shift-invariant reconstruction losses used in mixed-data depth prediction training, and possible unknown camera focal length.

Ranked #1 on Indoor Monocular Depth Estimation on DIODE (using extra training data)

3D Scene Reconstruction Depth Prediction +3

Hierarchical Text Interaction for Rating Prediction

no code implementations · 15 Oct 2020 · Jiahui Wen, Jingwei Ma, Hongkui Tu, Wei Yin, Jian Fang

At the review level, we mutually propagate textual features between the user and item, and capture the informative reviews.

Collaborative Filtering Recommendation Systems

DiverseDepth: Affine-invariant Depth Prediction Using Diverse Data

2 code implementations · 3 Feb 2020 · Wei Yin, Xinlong Wang, Chunhua Shen, Yifan Liu, Zhi Tian, Songcen Xu, Changming Sun, Dou Renyin

Compared with previous learning objectives, i.e., learning metric depth or relative depth, we propose to learn the affine-invariant depth using our diverse dataset to ensure both generalization and high-quality geometric shapes of scenes.

Depth Estimation Depth Prediction

Task-Aware Monocular Depth Estimation for 3D Object Detection

1 code implementation · 17 Sep 2019 · Xinlong Wang, Wei Yin, Tao Kong, Yuning Jiang, Lei Li, Chunhua Shen

In this paper, we first analyse the data distributions and interaction of foreground and background, then propose the foreground-background separated monocular depth estimation (ForeSeE) method, to estimate the foreground depth and background depth using separate optimization objectives and depth decoders.

3D Object Detection 3D Object Recognition +4

Auxiliary Learning for Deep Multi-task Learning

no code implementations · 5 Sep 2019 · Yifan Liu, Bohan Zhuang, Chunhua Shen, Hao Chen, Wei Yin

Most current methods can be categorized as either (i) hard parameter sharing, where a subset of the parameters is shared among tasks while other parameters are task-specific, or (ii) soft parameter sharing, where all parameters are task-specific but are jointly regularized.

Auxiliary Learning Depth Estimation +3
