Search Results for author: Weicai Ye

Found 29 papers, 15 papers with code

LLaVA-SLT: Visual Language Tuning for Sign Language Translation

no code implementations 21 Dec 2024 Han Liang, Chengyu Huang, Yuecheng Xu, Cheng Tang, Weicai Ye, Juze Zhang, Xin Chen, Jingyi Yu, Lan Xu

We propose a hierarchical visual encoder that learns a robust word-level intermediate representation compatible with LLM token embeddings.

Sign Language Translation Translation

Splatter-360: Generalizable 360$^{\circ}$ Gaussian Splatting for Wide-baseline Panoramic Images

1 code implementation 9 Dec 2024 Zheng Chen, Chenming Wu, Zhelun Shen, Chen Zhao, Weicai Ye, Haocheng Feng, Errui Ding, Song-Hai Zhang

Wide-baseline panoramic images are frequently used in applications like VR and simulations to minimize capturing labor costs and storage needs.

3DGS NeRF

Unleashing the Potential of Multi-modal Foundation Models and Video Diffusion for 4D Dynamic Physical Scene Simulation

no code implementations 21 Nov 2024 Zhuoman Liu, Weicai Ye, Yan Luximon, Pengfei Wan, Di Zhang

Realistic simulation of dynamic scenes requires accurately capturing diverse material properties and modeling complex object interactions grounded in physical principles.

Optical Flow Estimation

DATAP-SfM: Dynamic-Aware Tracking Any Point for Robust Structure from Motion in the Wild

no code implementations 20 Nov 2024 Weicai Ye, Xinyu Chen, Ruohao Zhan, Di Huang, Xiaoshui Huang, Haoyi Zhu, Hujun Bao, Wanli Ouyang, Tong He, Guofeng Zhang

To tackle these challenges, we propose a dynamic-aware tracking any point (DATAP) method that leverages consistent video depth and point tracking.

Camera Pose Estimation Depth Estimation +4

DGTR: Distributed Gaussian Turbo-Reconstruction for Sparse-View Vast Scenes

no code implementations 19 Nov 2024 Hao Li, Yuanyuan Gao, Haosong Peng, Chenming Wu, Weicai Ye, Yufeng Zhan, Chen Zhao, Dingwen Zhang, Jingdong Wang, Junwei Han

This paper presents DGTR, a novel distributed framework for efficient Gaussian reconstruction for sparse-view vast scenes.

Novel View Synthesis

DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion

1 code implementation 31 Oct 2024 Weicai Ye, Chenhao Ji, Zheng Chen, Junyao Gao, Xiaoshui Huang, Song-Hai Zhang, Wanli Ouyang, Tong He, Cairong Zhao, Guofeng Zhang

Then, we propose a novel text-driven panoramic generation framework, termed DiffPano, to achieve scalable, consistent, and diverse panoramic scene generation.

Scene Generation

Where Am I and What Will I See: An Auto-Regressive Model for Spatial Localization and View Prediction

no code implementations 24 Oct 2024 Junyi Chen, Di Huang, Weicai Ye, Wanli Ouyang, Tong He

Our model simultaneously estimates the camera pose from a single image and predicts the view from a new camera pose, effectively bridging the gap between spatial awareness and visual prediction.

Novel View Synthesis Pose Estimation +2

HiSplat: Hierarchical 3D Gaussian Splatting for Generalizable Sparse-View Reconstruction

no code implementations 8 Oct 2024 Shengji Tang, Weicai Ye, Peng Ye, Weihao Lin, Yang Zhou, Tao Chen, Wanli Ouyang

Recently, advances in generalizable 3D Gaussian Splatting have enabled high-quality novel view synthesis for unseen scenes from sparse input views by feed-forward predicting per-pixel Gaussian parameters without extra optimization.

Novel View Synthesis

DynaSurfGS: Dynamic Surface Reconstruction with Planar-based Gaussian Splatting

1 code implementation 26 Aug 2024 Weiwei Cai, Weicai Ye, Peng Ye, Tong He, Tao Chen

Extensive experiments demonstrate that DynaSurfGS surpasses state-of-the-art methods in both high-fidelity surface reconstruction and photorealistic rendering.

Surface Reconstruction

ND-SDF: Learning Normal Deflection Fields for High-Fidelity Indoor Reconstruction

1 code implementation 22 Aug 2024 Ziyu Tang, Weicai Ye, Yifan Wang, Di Huang, Hujun Bao, Tong He, Guofeng Zhang

Neural implicit reconstruction via volume rendering has demonstrated its effectiveness in recovering dense 3D surfaces.

NeuRodin: A Two-stage Framework for High-Fidelity Neural Surface Reconstruction

no code implementations 19 Aug 2024 Yifan Wang, Di Huang, Weicai Ye, Guofeng Zhang, Wanli Ouyang, Tong He

Signed Distance Function (SDF)-based volume rendering has demonstrated significant capabilities in surface reconstruction.

Surface Reconstruction

VIPeR: Visual Incremental Place Recognition with Adaptive Mining and Continual Learning

no code implementations 31 Jul 2024 Yuhang Ming, Minyang Xu, Xingrui Yang, Weicai Ye, Weihan Wang, Yong Peng, Weichen Dai, Wanzeng Kong

Then, to prevent catastrophic forgetting in lifelong learning, we draw inspiration from human memory systems and design a novel memory bank for our VIPeR.

Continual Learning Knowledge Distillation +1

MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers

1 code implementation 14 Jun 2024 YiWen Chen, Tong He, Di Huang, Weicai Ye, Sijin Chen, Jiaxiang Tang, Xin Chen, Zhongang Cai, Lei Yang, Gang Yu, Guosheng Lin, Chi Zhang

Recently, 3D assets created via reconstruction and generation have matched the quality of manually crafted assets, highlighting their potential for replacement.

Decoder

PGSR: Planar-based Gaussian Splatting for Efficient and High-Fidelity Surface Reconstruction

no code implementations 10 Jun 2024 Danpeng Chen, Hai Li, Weicai Ye, Yifan Wang, Weijian Xie, Shangjin Zhai, Nan Wang, Haomin Liu, Hujun Bao, Guofeng Zhang

Experiments on indoor and outdoor scenes show that our method achieves fast training and rendering while maintaining high-fidelity rendering and geometric reconstruction, outperforming 3DGS-based and NeRF-based methods.

3DGS Image Reconstruction +2

Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers

2 code implementations 9 May 2024 Peng Gao, Le Zhuo, Dongyang Liu, Ruoyi Du, Xu Luo, Longtian Qiu, Yuhang Zhang, Chen Lin, Rongjie Huang, Shijie Geng, Renrui Zhang, Junlin Xi, Wenqi Shao, Zhengkai Jiang, Tianshuo Yang, Weicai Ye, He Tong, Jingwen He, Yu Qiao, Hongsheng Li

Sora unveils the potential of scaling Diffusion Transformer for generating photorealistic images and videos at arbitrary resolutions, aspect ratios, and durations, yet it still lacks sufficient implementation details.

Point Mamba: A Novel Point Cloud Backbone Based on State Space Model with Octree-Based Ordering Strategy

1 code implementation 11 Mar 2024 Jiuming Liu, Ruiji Yu, Yian Wang, Yu Zheng, Tianchen Deng, Weicai Ye, Hesheng Wang

In this paper, we propose a novel SSM-based point cloud processing backbone, named Point Mamba, with a causality-aware ordering mechanism.

Mamba Semantic Segmentation

NeRF-Det++: Incorporating Semantic Cues and Perspective-aware Depth Supervision for Indoor Multi-View 3D Detection

1 code implementation 22 Feb 2024 Chenxi Huang, Yuenan Hou, Weicai Ye, Di Huang, Xiaoshui Huang, Binbin Lin, Deng Cai, Wanli Ouyang

We project the freely available 3D segmentation annotations onto the 2D plane and leverage the corresponding 2D semantic maps as the supervision signal, significantly enhancing the semantic awareness of multi-view detectors.

Depth Estimation Depth Prediction +2

Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning

2 code implementations 4 Feb 2024 Haoyi Zhu, Yating Wang, Di Huang, Weicai Ye, Wanli Ouyang, Tong He

These outcomes suggest that the 3D point cloud is a valuable observation modality for intricate robotic tasks.

Zero-shot Generalization

DifFlow3D: Toward Robust Uncertainty-Aware Scene Flow Estimation with Diffusion Model

1 code implementation 29 Nov 2023 Jiuming Liu, Guangming Wang, Weicai Ye, Chaokang Jiang, Jinru Han, Zhe Liu, Guofeng Zhang, Dalong Du, Hesheng Wang

Furthermore, we also develop an uncertainty estimation module within diffusion to evaluate the reliability of estimated scene flow.

Diversity Scene Flow Estimation

Improving Feature-based Visual Localization by Geometry-Aided Matching

1 code implementation 16 Nov 2022 Hailin Yu, Youji Feng, Weicai Ye, Mingxuan Jiang, Hujun Bao, Guofeng Zhang

We apply GAM to a new hierarchical visual localization pipeline and show that GAM can effectively improve the robustness and accuracy of localization.

3D Feature Matching Pose Estimation +1

IntrinsicNeRF: Learning Intrinsic Neural Radiance Fields for Editable Novel View Synthesis

1 code implementation ICCV 2023 Weicai Ye, Shuo Chen, Chong Bao, Hujun Bao, Marc Pollefeys, Zhaopeng Cui, Guofeng Zhang

Existing inverse rendering combined with neural rendering methods can only perform editable novel view synthesis on object-specific scenes, while we present intrinsic neural radiance fields, dubbed IntrinsicNeRF, which introduce intrinsic decomposition into the NeRF-based neural rendering method and can extend its application to room-scale scenes.

Clustering Inverse Rendering +3

iDF-SLAM: End-to-End RGB-D SLAM with Neural Implicit Mapping and Deep Feature Tracking

no code implementations 16 Sep 2022 Yuhang Ming, Weicai Ye, Andrew Calway

The neural implicit mapper is trained on-the-fly; although the neural tracker is pretrained on the ScanNet dataset, it is also finetuned along with the training of the neural implicit mapper.

NeRF

PVO: Panoptic Visual Odometry

1 code implementation CVPR 2023 Weicai Ye, Xinyue Lan, Shuo Chen, Yuhang Ming, Xingyuan Yu, Hujun Bao, Zhaopeng Cui, Guofeng Zhang

We present PVO, a novel panoptic visual odometry framework to achieve more comprehensive modeling of the scene motion, geometry, and panoptic segmentation information.

Camera Pose Estimation Optical Flow Estimation +4

Hybrid Tracker with Pixel and Instance for Video Panoptic Segmentation

no code implementations 2 Mar 2022 Weicai Ye, Xinyue Lan, Ge Su, Hujun Bao, Zhaopeng Cui, Guofeng Zhang

HybridTracker performs pixel tracker and instance tracker in parallel to obtain the association matrices, which are fused into a matching matrix.

Optical Flow Estimation Segmentation +1
