Search Results for author: Sai Bi

Found 42 papers, 8 papers with code

Gaussian Mixture Flow Matching Models

1 code implementation 7 Apr 2025 Hansheng Chen, Kai Zhang, Hao Tan, Zexiang Xu, Fujun Luan, Leonidas Guibas, Gordon Wetzstein, Sai Bi

Diffusion models approximate the denoising distribution as a Gaussian and predict its mean, whereas flow matching models reparameterize the Gaussian mean as flow velocity.
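As general background on the relationship mentioned here (using one common flow-matching convention with clean data at t = 0 and pure noise at t = 1, not necessarily the paper's exact parameterization), the two predictions are linear reparameterizations of each other:

\[
x_t = (1 - t)\,x_0 + t\,\epsilon, \qquad u_t = \epsilon - x_0, \qquad \mathbb{E}[x_0 \mid x_t] = x_t - t\,\mathbb{E}[u_t \mid x_t],
\]

so a network that predicts the Gaussian mean of the denoising distribution and one that predicts the flow velocity carry the same information, related by this linear map.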

Denoising Image Generation

EP-CFG: Energy-Preserving Classifier-Free Guidance

no code implementations 13 Dec 2024 Kai Zhang, Fujun Luan, Sai Bi, Jianming Zhang

Classifier-free guidance (CFG) is widely used in diffusion models but often introduces over-contrast and over-saturation artifacts at higher guidance strengths.
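For orientation, here is a minimal sketch of standard CFG followed by one illustrative energy-preserving rescale; the function name and the exact normalization are assumptions for illustration, not necessarily the rescaling EP-CFG itself uses.

```python
import torch

def cfg_energy_preserving(eps_cond: torch.Tensor, eps_uncond: torch.Tensor, w: float) -> torch.Tensor:
    """Standard classifier-free guidance followed by an illustrative rescale that
    restores the conditional prediction's per-sample energy (L2 norm).
    The exact normalization used by EP-CFG may differ from this sketch.
    """
    guided = eps_uncond + w * (eps_cond - eps_uncond)         # standard CFG combination
    shape = (-1,) + (1,) * (guided.ndim - 1)                  # per-sample broadcast shape
    cond_norm = eps_cond.flatten(1).norm(dim=1).view(shape)   # energy of conditional output
    guided_norm = guided.flatten(1).norm(dim=1).view(shape)   # energy after guidance
    return guided * (cond_norm / (guided_norm + 1e-8))        # damp over-contrast at large w
```

Matching the guided prediction's norm back to the conditional prediction's norm is one common remedy for the over-contrast and over-saturation artifacts described above.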

Denoising Prediction

Turbo3D: Ultra-fast Text-to-3D Generation

no code implementations 5 Dec 2024 Hanzhe Hu, Tianwei Yin, Fujun Luan, Yiwei Hu, Hao Tan, Zexiang Xu, Sai Bi, Shubham Tulsiani, Kai Zhang

By shifting the Gaussian reconstructor's inputs from pixel space to latent space, we eliminate the extra image decoding time and halve the transformer sequence length for maximum efficiency.

3D Generation Text to 3D

Buffer Anytime: Zero-Shot Video Depth and Normal from Image Priors

no code implementations 26 Nov 2024 Zhengfei Kuang, Tianyuan Zhang, Kai Zhang, Hao Tan, Sai Bi, Yiwei Hu, Zexiang Xu, Milos Hasan, Gordon Wetzstein, Fujun Luan

We present Buffer Anytime, a framework for estimating depth and normal maps (which we call geometric buffers) from video that eliminates the need for paired video-depth and video-normal training data.

Optical Flow Estimation

Generating 3D-Consistent Videos from Unposed Internet Photos

no code implementations 20 Nov 2024 Gene Chou, Kai Zhang, Sai Bi, Hao Tan, Zexiang Xu, Fujun Luan, Bharath Hariharan, Noah Snavely

We design a self-supervised method that takes advantage of the consistency of videos and variability of multiview internet photos to train a scalable, 3D-aware video model without any 3D annotations such as camera parameters.

LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias

no code implementations 22 Oct 2024 Haian Jin, Hanwen Jiang, Hao Tan, Kai Zhang, Sai Bi, Tianyuan Zhang, Fujun Luan, Noah Snavely, Zexiang Xu

We propose the Large View Synthesis Model (LVSM), a novel transformer-based approach for scalable and generalizable novel view synthesis from sparse-view inputs.

3DGS Decoder +5

Long-LRM: Long-sequence Large Reconstruction Model for Wide-coverage Gaussian Splats

no code implementations 16 Oct 2024 Chen Ziwen, Hao Tan, Kai Zhang, Sai Bi, Fujun Luan, Yicong Hong, Li Fuxin, Zexiang Xu

Unlike previous feed-forward models that are limited to processing 1-4 input images and can only reconstruct a small portion of a large scene, Long-LRM reconstructs the entire scene in a single feed-forward step.

RelitLRM: Generative Relightable Radiance for Large Reconstruction Models

no code implementations 8 Oct 2024 Tianyuan Zhang, Zhengfei Kuang, Haian Jin, Zexiang Xu, Sai Bi, Hao Tan, He Zhang, Yiwei Hu, Milos Hasan, William T. Freeman, Kai Zhang, Fujun Luan

We propose RelitLRM, a Large Reconstruction Model (LRM) for generating high-quality Gaussian splatting representations of 3D objects under novel illuminations from sparse (4-8) posed images captured under unknown static lighting.

Inverse Rendering

PBIR-NIE: Glossy Object Capture under Non-Distant Lighting

no code implementations 13 Aug 2024 Guangyan Cai, Fujun Luan, Miloš Hašan, Kai Zhang, Sai Bi, Zexiang Xu, Iliyan Georgiev, Shuang Zhao

Glossy objects present a significant challenge for 3D reconstruction from multi-view input images under natural lighting.

3D Reconstruction Inverse Rendering +1

LRM-Zero: Training Large Reconstruction Models with Synthesized Data

1 code implementation 13 Jun 2024 Desai Xie, Sai Bi, Zhixin Shu, Kai Zhang, Zexiang Xu, Yi Zhou, Sören Pirk, Arie Kaufman, Xin Sun, Hao Tan

We demonstrate that our LRM-Zero, trained with our fully synthesized Zeroverse, can achieve high visual quality in the reconstruction of real-world objects, competitive with models trained on Objaverse.

3D Reconstruction

Neural Gaffer: Relighting Any Object via Diffusion

no code implementations 11 Jun 2024 Haian Jin, Yuan Li, Fujun Luan, Yuanbo Xiangli, Sai Bi, Kai Zhang, Zexiang Xu, Jin Sun, Noah Snavely

Single-image relighting is a challenging task that involves reasoning about the complex interplay between geometry, materials, and lighting.

Image Relighting Object

Neural Directional Encoding for Efficient and Accurate View-Dependent Appearance Modeling

no code implementations CVPR 2024 Liwen Wu, Sai Bi, Zexiang Xu, Fujun Luan, Kai Zhang, Iliyan Georgiev, Kalyan Sunkavalli, Ravi Ramamoorthi

In contrast to previous methods that use encoding functions with only angular input, we additionally cone-trace spatial features to obtain a spatially varying directional encoding, which addresses the challenging interreflection effects.

NeRF Novel View Synthesis

GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting

no code implementations 30 Apr 2024 Kai Zhang, Sai Bi, Hao Tan, Yuanbo Xiangli, Nanxuan Zhao, Kalyan Sunkavalli, Zexiang Xu

We propose GS-LRM, a scalable large reconstruction model that can predict high-quality 3D Gaussian primitives from 2-4 posed sparse images in 0.23 seconds on a single A100 GPU.
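As background on what such a feed-forward Gaussian predictor regresses, here is a minimal sketch of the standard 3D Gaussian splatting primitive parameters; GS-LRM's exact output head and activations may differ.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class GaussianPrimitive:
    """Standard 3D Gaussian splatting parameters, shown as general background
    on what a feed-forward Gaussian predictor outputs per primitive."""
    mean: np.ndarray       # (3,) center position in world space
    scale: np.ndarray      # (3,) per-axis scale (often stored as log-scale)
    rotation: np.ndarray   # (4,) unit quaternion defining the orientation
    opacity: float         # scalar opacity in [0, 1] (often stored pre-sigmoid)
    sh_coeffs: np.ndarray  # (K, 3) spherical-harmonic color coefficients

    def covariance(self) -> np.ndarray:
        """Covariance Sigma = R diag(s)^2 R^T used at splatting time."""
        w, x, y, z = self.rotation / np.linalg.norm(self.rotation)
        R = np.array([
            [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
            [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
            [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)],
        ])
        S = np.diag(self.scale)
        return R @ S @ S @ R.T
```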

3D Generation

DATENeRF: Depth-Aware Text-based Editing of NeRFs

no code implementations 6 Apr 2024 Sara Rojas, Julien Philip, Kai Zhang, Sai Bi, Fujun Luan, Bernard Ghanem, Kalyan Sunkavalli

However, extending these techniques to edit scenes in Neural Radiance Fields (NeRF) is complex, as editing individual 2D frames can result in inconsistencies across multiple views.

NeRF

Carve3D: Improving Multi-view Reconstruction Consistency for Diffusion Models with RL Finetuning

no code implementations CVPR 2024 Desai Xie, Jiahao Li, Hao Tan, Xin Sun, Zhixin Shu, Yi Zhou, Sai Bi, Sören Pirk, Arie E. Kaufman

To this end, we introduce Carve3D, an improved RLFT algorithm coupled with a novel Multi-view Reconstruction Consistency (MRC) metric, to enhance the consistency of multi-view diffusion models.

Language Modelling Large Language Model +2

PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction

no code implementations 20 Nov 2023 Peng Wang, Hao Tan, Sai Bi, Yinghao Xu, Fujun Luan, Kalyan Sunkavalli, Wenping Wang, Zexiang Xu, Kai Zhang

We propose a Pose-Free Large Reconstruction Model (PF-LRM) for reconstructing a 3D object from a few unposed images even with little visual overlap, while simultaneously estimating the relative camera poses in ~1.3 seconds on a single A100 GPU.

3D Reconstruction Image to 3D +1

DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model

no code implementations 15 Nov 2023 Yinghao Xu, Hao Tan, Fujun Luan, Sai Bi, Peng Wang, Jiahao Li, Zifan Shi, Kalyan Sunkavalli, Gordon Wetzstein, Zexiang Xu, Kai Zhang

We propose DMV3D, a novel 3D generation approach that uses a transformer-based 3D large reconstruction model to denoise multi-view diffusion.

3D Generation Denoising +3

LRM: Large Reconstruction Model for Single Image to 3D

1 code implementation 8 Nov 2023 Yicong Hong, Kai Zhang, Jiuxiang Gu, Sai Bi, Yang Zhou, Difan Liu, Feng Liu, Kalyan Sunkavalli, Trung Bui, Hao Tan

We propose the first Large Reconstruction Model (LRM) that predicts the 3D model of an object from a single input image within just 5 seconds.

Image to 3D NeRF

Controllable Dynamic Appearance for Neural 3D Portraits

no code implementations 20 Sep 2023 ShahRukh Athar, Zhixin Shu, Zexiang Xu, Fujun Luan, Sai Bi, Kalyan Sunkavalli, Dimitris Samaras

Surface-normal prediction is guided by 3DMM normals, which act as a coarse prior for the normals of the human head, where direct prediction is hard due to the rigid and non-rigid deformations induced by head-pose and facial-expression changes.

Neural Free-Viewpoint Relighting for Glossy Indirect Illumination

no code implementations 12 Jul 2023 Nithin Raghavan, Yan Xiao, Kai-En Lin, Tiancheng Sun, Sai Bi, Zexiang Xu, Tzu-Mao Li, Ravi Ramamoorthi

In this paper, we demonstrate a hybrid neural-wavelet PRT solution to high-frequency indirect illumination, including glossy reflection, for relighting with changing view.
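For context, the precomputed radiance transfer (PRT) formulation that such methods build on expresses relit radiance as an inner product between lighting coefficients and transport coefficients in some basis (wavelets here); this is standard background, not the paper's exact model:

\[
L_o(x, \omega_o) \approx \sum_k T_k(x, \omega_o)\, \ell_k,
\]

where \(\ell_k\) are the environment-lighting coefficients in the chosen basis and \(T_k\) are precomputed or learned transport coefficients; for glossy transport the coefficients depend on the viewing direction \(\omega_o\), which is what makes relighting under a changing view difficult.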

Tensor Decomposition

PaletteNeRF: Palette-based Appearance Editing of Neural Radiance Fields

no code implementations CVPR 2023 Zhengfei Kuang, Fujun Luan, Sai Bi, Zhixin Shu, Gordon Wetzstein, Kalyan Sunkavalli

Recent advances in neural radiance fields have enabled the high-fidelity 3D reconstruction of complex scenes for novel view synthesis.

3D Reconstruction NeRF +1

ARF: Artistic Radiance Fields

1 code implementation 13 Jun 2022 Kai Zhang, Nick Kolkin, Sai Bi, Fujun Luan, Zexiang Xu, Eli Shechtman, Noah Snavely

We present a method for transferring the artistic features of an arbitrary style image to a 3D scene.

Differentiable Rendering of Neural SDFs through Reparameterization

no code implementations 10 Jun 2022 Sai Praveen Bangaru, Michaël Gharbi, Tzu-Mao Li, Fujun Luan, Kalyan Sunkavalli, Miloš Hašan, Sai Bi, Zexiang Xu, Gilbert Bernstein, Frédo Durand

Our method leverages the distance to surface encoded in an SDF and uses quadrature on sphere tracer points to compute this warping function.

Inverse Rendering

Physically-Based Editing of Indoor Scene Lighting from a Single Image

no code implementations 19 May 2022 Zhengqin Li, Jia Shi, Sai Bi, Rui Zhu, Kalyan Sunkavalli, Miloš Hašan, Zexiang Xu, Ravi Ramamoorthi, Manmohan Chandraker

We tackle this problem using two novel components: 1) a holistic scene reconstruction method that estimates scene reflectance and parametric 3D lighting, and 2) a neural rendering framework that re-renders the scene from our predictions.

Inverse Rendering Lighting Estimation +1

NeRFusion: Fusing Radiance Fields for Large-Scale Scene Reconstruction

1 code implementation CVPR 2022 Xiaoshuai Zhang, Sai Bi, Kalyan Sunkavalli, Hao Su, Zexiang Xu

We demonstrate that NeRFusion achieves state-of-the-art quality on both large-scale indoor and small-scale object scenes, with substantially faster reconstruction than NeRF and other recent methods.

3D Reconstruction NeRF

Point-NeRF: Point-based Neural Radiance Fields

1 code implementation CVPR 2022 Qiangeng Xu, Zexiang Xu, Julien Philip, Sai Bi, Zhixin Shu, Kalyan Sunkavalli, Ulrich Neumann

Point-NeRF combines the advantages of these two approaches by using neural 3D point clouds, with associated neural features, to model a radiance field.
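A simplified sketch of the point-based radiance-field idea, assuming plain inverse-distance feature aggregation (Point-NeRF's actual per-point MLPs and confidence weights are omitted):

```python
import numpy as np

def aggregate_point_features(query_xyz, point_xyz, point_feats, k=8, eps=1e-8):
    """Gather features from the k nearest neural points around a shading location.

    A simplified sketch only: inverse-distance weighting of per-point features.
    """
    d = np.linalg.norm(point_xyz - query_xyz, axis=1)        # (N,) distances to all points
    nn = np.argsort(d)[:k]                                   # indices of the k nearest points
    w = 1.0 / (d[nn] + eps)                                  # inverse-distance weights
    w = w / w.sum()
    return (w[:, None] * point_feats[nn]).sum(axis=0)        # (C,) aggregated neural feature
```

The aggregated feature would then be decoded by small MLPs into density and view-dependent color at the query location before standard volume rendering.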

3D Reconstruction NeRF +1

Learning Neural Transmittance for Efficient Rendering of Reflectance Fields

no code implementations 25 Oct 2021 Mohammad Shafiei, Sai Bi, Zhengqin Li, Aidas Liaudanskas, Rodrigo Ortiz-Cayon, Ravi Ramamoorthi

However, it remains challenging and time-consuming to render such representations under complex lighting such as environment maps, which requires marching an individual ray toward each light to calculate the transmittance at every sampled point.
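The cost being avoided is the standard volumetric transmittance along a shadow ray, evaluated by marching toward each light at every shading sample:

\[
T(\mathbf{x}, \omega_\ell) = \exp\!\left(-\int_0^{d} \sigma\!\left(\mathbf{x} + s\,\omega_\ell\right) ds\right),
\]

so with \(N\) shading samples per camera ray and \(M\) light directions from an environment map, naive evaluation takes on the order of \(N \times M\) secondary marches per pixel; predicting the transmittance with a network replaces those inner marches with single queries.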

NeLF: Neural Light-transport Field for Portrait View Synthesis and Relighting

no code implementations26 Jul 2021 Tiancheng Sun, Kai-En Lin, Sai Bi, Zexiang Xu, Ravi Ramamoorthi

Our system is trained on a large number of synthetic models, and can generalize to different synthetic and real portraits under various lighting conditions.

OpenRooms: An Open Framework for Photorealistic Indoor Scene Datasets

no code implementations CVPR 2021 Zhengqin Li, Ting-Wei Yu, Shen Sang, Sarah Wang, Meng Song, YuHan Liu, Yu-Ying Yeh, Rui Zhu, Nitesh Gundavarapu, Jia Shi, Sai Bi, Hong-Xing Yu, Zexiang Xu, Kalyan Sunkavalli, Milos Hasan, Ravi Ramamoorthi, Manmohan Chandraker

Finally, we demonstrate that our framework may also be integrated with physics engines, to create virtual robotics environments with unique ground truth such as friction coefficients and correspondence to real scenes.

Friction Inverse Rendering +1

Neural Reflectance Fields for Appearance Acquisition

no code implementations 9 Aug 2020 Sai Bi, Zexiang Xu, Pratul Srinivasan, Ben Mildenhall, Kalyan Sunkavalli, Miloš Hašan, Yannick Hold-Geoffroy, David Kriegman, Ravi Ramamoorthi

We combine this representation with a physically-based differentiable ray marching framework that can render images from a neural reflectance field under any viewpoint and light.
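A generic sketch of this kind of physically-based ray marching under a single light (assumed variable names and interfaces, not the paper's exact renderer); `light_trans` holds the per-sample transmittance toward the light, itself obtained by a secondary march or a learned proxy:

```python
import numpy as np

def march_reflectance_ray(sigmas, brdf_vals, light_radiance, light_trans, deltas):
    """Accumulate radiance along one camera ray through a reflectance field.

    sigmas: (N,) densities, brdf_vals: (N, 3) per-sample BRDF values,
    light_radiance: (3,) light intensity, light_trans: (N,) transmittance toward
    the light, deltas: (N,) segment lengths. A generic sketch only.
    """
    alphas = 1.0 - np.exp(-sigmas * deltas)                           # per-sample opacity
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))    # transmittance to camera
    weights = trans * alphas                                          # volume-rendering weights
    shaded = brdf_vals * light_radiance * light_trans[:, None]        # per-sample shading
    return (weights[:, None] * shaded).sum(axis=0)                    # (3,) RGB radiance
```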

OpenRooms: An End-to-End Open Framework for Photorealistic Indoor Scene Datasets

no code implementations 25 Jul 2020 Zhengqin Li, Ting-Wei Yu, Shen Sang, Sarah Wang, Meng Song, YuHan Liu, Yu-Ying Yeh, Rui Zhu, Nitesh Gundavarapu, Jia Shi, Sai Bi, Zexiang Xu, Hong-Xing Yu, Kalyan Sunkavalli, Miloš Hašan, Ravi Ramamoorthi, Manmohan Chandraker

Finally, we demonstrate that our framework may also be integrated with physics engines, to create virtual robotics environments with unique ground truth such as friction coefficients and correspondence to real scenes.

Friction Inverse Rendering +2

Deep 3D Capture: Geometry and Reflectance from Sparse Multi-View Images

no code implementations CVPR 2020 Sai Bi, Zexiang Xu, Kalyan Sunkavalli, David Kriegman, Ravi Ramamoorthi

We introduce a novel learning-based method to reconstruct the high-quality geometry and complex, spatially-varying BRDF of an arbitrary object from a sparse set of only six images captured by wide-baseline cameras under collocated point lighting.

Deep Hybrid Real and Synthetic Training for Intrinsic Decomposition

no code implementations 30 Jul 2018 Sai Bi, Nima Khademi Kalantari, Ravi Ramamoorthi

Experimental results show that our approach produces better results than the state-of-the-art DL and non-DL methods on various synthetic and real datasets both visually and numerically.

Intrinsic Image Decomposition
