Search Results for author: Jiemin Fang

Found 28 papers, 19 papers with code

GaussianObject: Just Taking Four Images to Get A High-Quality 3D Object with Gaussian Splatting

1 code implementation • 15 Feb 2024 • Chen Yang, Sikuang Li, Jiemin Fang, Ruofan Liang, Lingxi Xie, Xiaopeng Zhang, Wei Shen, Qi Tian

Then we construct a Gaussian repair model based on diffusion models to supplement the omitted object information, where Gaussians are further refined.

Neural Rendering, Object

610

Fast High Dynamic Range Radiance Fields for Dynamic Scenes

no code implementations • 11 Jan 2024 • Guanjun Wu, Taoran Yi, Jiemin Fang, Wenyu Liu, Xinggang Wang

To extend HDR NeRF methods to wider applications, we propose a dynamic HDR NeRF framework, named HDR-HexPlane, which can learn 3D scenes from dynamic 2D images captured with various exposures.

Cascade-Zero123: One Image to Highly Consistent 3D with Self-Prompted Nearby Views

no code implementations • 7 Dec 2023 • Yabo Chen, Jiemin Fang, YuYang Huang, Taoran Yi, Xiaopeng Zhang, Lingxi Xie, Xinggang Wang, Wenrui Dai, Hongkai Xiong, Qi Tian

We propose a cascade generation framework constructed with two Zero-1-to-3 models, named Cascade-Zero123, to tackle this issue, which progressively extracts 3D information from the source image.

Transparent objects

Segment Any 3D Gaussians

no code implementations • 1 Dec 2023 • Jiazhong Cen, Jiemin Fang, Chen Yang, Lingxi Xie, Xiaopeng Zhang, Wei Shen, Qi Tian

Interactive 3D segmentation in radiance fields is an appealing task because of its importance in 3D scene understanding and manipulation.

Interactive Segmentation, Scene Understanding +1

GaussianEditor: Editing 3D Gaussians Delicately with Text Instructions

no code implementations • 27 Nov 2023 • Jiemin Fang, Junjie Wang, Xiaopeng Zhang, Lingxi Xie, Qi Tian

Specifically, we first extract the region of interest (RoI) corresponding to the text instruction, aligning it to 3D Gaussians.

3D scene Editing

4D Gaussian Splatting for Real-Time Dynamic Scene Rendering

1 code implementation • 12 Oct 2023 • Guanjun Wu, Taoran Yi, Jiemin Fang, Lingxi Xie, Xiaopeng Zhang, Wei Wei, Wenyu Liu, Qi Tian, Xinggang Wang

Representing and rendering dynamic scenes has been an important but challenging task.

1,644

GaussianDreamer: Fast Generation from Text to 3D Gaussians by Bridging 2D and 3D Diffusion Models

1 code implementation • 12 Oct 2023 • Taoran Yi, Jiemin Fang, Junjie Wang, Guanjun Wu, Lingxi Xie, Xiaopeng Zhang, Wenyu Liu, Qi Tian, Xinggang Wang

In recent times, the generation of 3D assets from text prompts has shown impressive results.

Text to 3D

515

TiAVox: Time-aware Attenuation Voxels for Sparse-view 4D DSA Reconstruction

no code implementations • 5 Sep 2023 • Zhenghong Zhou, Huangxuan Zhao, Jiemin Fang, Dongqiao Xiang, Lei Chen, Lingxia Wu, Feihong Wu, Wenyu Liu, Chuansheng Zheng, Xinggang Wang

Additionally, 2D and 3D DSA imaging results can be generated from the reconstructed 4D DSA images.

3D Reconstruction, Novel View Synthesis

Segment Anything in 3D with Radiance Fields

1 code implementation • NeurIPS 2023 • Jiazhong Cen, Jiemin Fang, Zanwei Zhou, Chen Yang, Lingxi Xie, Xiaopeng Zhang, Wei Shen, Qi Tian

The Segment Anything Model (SAM) emerges as a powerful vision foundation model to generate high-quality 2D segmentation results.

Inverse Rendering, Segmentation

784

TinyDet: Accurate Small Object Detection in Lightweight Generic Detectors

no code implementations • 7 Apr 2023 • Shaoyu Chen, Tianheng Cheng, Jiemin Fang, Qian Zhang, Yuan Li, Wenyu Liu, Xinggang Wang

Small object detection requires the detection head to scan a large number of positions on image feature maps, which is extremely hard for computation- and energy-efficient lightweight generic detectors.

object-detection, Small Object Detection

WeakTr: Exploring Plain Vision Transformer for Weakly-supervised Semantic Segmentation

1 code implementation • 3 Apr 2023 • Lianghui Zhu, Yingyue Li, Jiemin Fang, Yan Liu, Hao Xin, Wenyu Liu, Xinggang Wang

Thus, a novel weight-based method is proposed to estimate the importance of attention heads end-to-end, while the self-attention maps are adaptively fused for high-quality CAM results that tend to have more complete objects.

Ranked #2 on Weakly-Supervised Semantic Segmentation on PASCAL VOC 2012 train

Weakly-supervised Learning, Weakly supervised Semantic Segmentation +1

115

Generalizable Neural Voxels for Fast Human Radiance Fields

no code implementations • 27 Mar 2023 • Taoran Yi, Jiemin Fang, Xinggang Wang, Wenyu Liu

Rendering moving human bodies at free viewpoints only from a monocular video is quite a challenging problem.

Novel View Synthesis

Fast Dynamic Radiance Fields with Time-Aware Neural Voxels

1 code implementation • 30 May 2022 • Jiemin Fang, Taoran Yi, Xinggang Wang, Lingxi Xie, Xiaopeng Zhang, Wenyu Liu, Matthias Nießner, Qi Tian

A multi-distance interpolation method is proposed and applied on voxel features to model both small and large motions.

307

Temporally Efficient Vision Transformer for Video Instance Segmentation

3 code implementations • CVPR 2022 • Shusheng Yang, Xinggang Wang, Yu Li, Yuxin Fang, Jiemin Fang, Wenyu Liu, Xun Zhao, Ying Shan

To effectively and efficiently model the crucial temporal information within a video clip, we propose a Temporally Efficient Vision Transformer (TeViT) for video instance segmentation (VIS).

Ranked #35 on Video Instance Segmentation on OVIS validation

Instance Segmentation, Semantic Segmentation +1

400

Beyond Masking: Demystifying Token-Based Pre-Training for Vision Transformers

1 code implementation • 27 Mar 2022 • Yunjie Tian, Lingxi Xie, Jiemin Fang, Mengnan Shi, Junran Peng, Xiaopeng Zhang, Jianbin Jiao, Qi Tian, Qixiang Ye

The past year has witnessed a rapid development of masked image modeling (MIM).

Exploring Complicated Search Spaces with Interleaving-Free Sampling

no code implementations • 5 Dec 2021 • Yunjie Tian, Lingxi Xie, Jiemin Fang, Jianbin Jiao, Qixiang Ye, Qi Tian

In this paper, we build the search algorithm upon a complicated search space with long-distance connections, and show that existing weight-sharing search algorithms mostly fail due to the existence of interleaved connections.

Neural Architecture Search

NeuSample: Neural Sample Field for Efficient View Synthesis

1 code implementation • 30 Nov 2021 • Jiemin Fang, Lingxi Xie, Xinggang Wang, Xiaopeng Zhang, Wenyu Liu, Qi Tian

Neural radiance fields (NeRF) have shown great potential in representing 3D scenes and synthesizing novel views, but the computational overhead of NeRF at the inference stage is still heavy.

Semantic-Aware Generation for Self-Supervised Visual Representation Learning

1 code implementation • 25 Nov 2021 • Yunjie Tian, Lingxi Xie, Xiaopeng Zhang, Jiemin Fang, Haohang Xu, Wei Huang, Jianbin Jiao, Qi Tian, Qixiang Ye

In this paper, we propose a self-supervised visual representation learning approach which involves both generative and discriminative proxies, where we focus on the former part by requiring the target network to recover the original image based on the mid-level features.

Ranked #63 on Semantic Segmentation on Cityscapes test

Representation Learning, Semantic Segmentation

Hierarchical Aggregation for 3D Instance Segmentation

1 code implementation • ICCV 2021 • Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang

Instance segmentation on point clouds is a fundamental task in 3D scene perception.

Ranked #4 on 3D Instance Segmentation on S3DIS (mCov metric, using extra training data)

3D Instance Segmentation, Clustering +2

163

Bag of Instances Aggregation Boosts Self-supervised Distillation

1 code implementation • ICLR 2022 • Haohang Xu, Jiemin Fang, Xiaopeng Zhang, Lingxi Xie, Xinggang Wang, Wenrui Dai, Hongkai Xiong, Qi Tian

Here, a bag of instances denotes a set of similar samples constructed by the teacher and grouped within a bag, and the goal of distillation is to aggregate compact representations over the student with respect to instances in a bag.

Contrastive Learning, Self-Supervised Learning

You Only Look at One Sequence: Rethinking Transformer in Vision through Object Detection

2 code implementations • NeurIPS 2021 • Yuxin Fang, Bencheng Liao, Xinggang Wang, Jiemin Fang, Jiyang Qi, Rui Wu, Jianwei Niu, Wenyu Liu

Can Transformer perform 2D object- and region-level recognition from a pure sequence-to-sequence perspective with minimal knowledge about the 2D spatial structure?

Ranked #30 on Object Detection on COCO-O

Object, object-detection +1

124,527

MSG-Transformer: Exchanging Local Spatial Information by Manipulating Messenger Tokens

3 code implementations • CVPR 2022 • Jiemin Fang, Lingxi Xie, Xinggang Wang, Xiaopeng Zhang, Wenyu Liu, Qi Tian

Transformers have offered a new methodology of designing neural networks for visual recognition.

Image Classification, object-detection +1

ResizeMix: Mixing Data with Preserved Object Information and True Labels

1 code implementation • 21 Dec 2020 • Jie Qin, Jiemin Fang, Qian Zhang, Wenyu Liu, Xingang Wang, Xinggang Wang

In particular, CutMix uses a simple but effective method to improve the classifiers by randomly cropping a patch from one image and pasting it on another image.

Data Augmentation, Image Classification +3

567
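
The ResizeMix snippet above describes CutMix as cropping a random patch from one image and pasting it onto another. A minimal pure-Python sketch of that operation follows, under assumptions that are not in the listing: images are nested lists, the patch size is drawn uniformly (the original CutMix draws the area ratio from a Beta distribution), and labels are mixed by the area ratio. The function name `cutmix` and its parameters are hypothetical, and the sketch illustrates CutMix only, not ResizeMix itself.

```python
import random


def cutmix(img_a, img_b, label_a, label_b, rng=None):
    """Paste a random rectangular patch of img_b onto a copy of img_a.

    Images are H x W nested lists of equal shape; labels are equal-length
    lists of class weights (e.g. one-hot). Returns (mixed_image, mixed_label).
    Hypothetical helper for illustration only.
    """
    rng = rng or random.Random()
    h, w = len(img_a), len(img_a[0])

    # Assumption: patch height/width are sampled uniformly, at least 1 pixel.
    ph = rng.randint(1, h)
    pw = rng.randint(1, w)
    top = rng.randint(0, h - ph)
    left = rng.randint(0, w - pw)

    # Copy img_a so the input is not modified, then overwrite the patch
    # region with the pixels of img_b at the same location.
    mixed = [row[:] for row in img_a]
    for y in range(top, top + ph):
        mixed[y][left:left + pw] = img_b[y][left:left + pw]

    # Mix the labels in proportion to the area each image contributes.
    lam = 1.0 - (ph * pw) / (h * w)  # fraction of img_a that remains
    mixed_label = [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
    return mixed, mixed_label


# Usage: mix an all-zeros image with an all-ones image.
a = [[0] * 4 for _ in range(4)]
b = [[1] * 4 for _ in range(4)]
mixed, label = cutmix(a, b, [1.0, 0.0], [0.0, 1.0], random.Random(0))
```

With these toy inputs, the fraction of ones in `mixed` equals the weight assigned to the second class in `label`, which is the property that ties the mixed label to the pasted area.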

EfficientPose: Efficient Human Pose Estimation with Neural Architecture Search

1 code implementation • 13 Dec 2020 • Wenqiang Zhang, Jiemin Fang, Xinggang Wang, Wenyu Liu

Human pose estimation from image and video is a vital task in many multimedia applications.

Image Classification, Neural Architecture Search +1

FNA++: Fast Network Adaptation via Parameter Remapping and Architecture Search

2 code implementations • 21 Jun 2020 • Jiemin Fang, Yuzhu Sun, Qian Zhang, Kangjian Peng, Yuan Li, Wenyu Liu, Xinggang Wang

In this paper, we propose a Fast Network Adaptation (FNA++) method, which can adapt both the architecture and parameters of a seed network (e.g., an ImageNet pre-trained network) to become a network with different depths, widths, or kernel sizes via a parameter remapping technique, making it possible to use NAS for segmentation and detection tasks a lot more efficiently.

Image Classification, Neural Architecture Search +5

164

Fast Neural Network Adaptation via Parameter Remapping and Architecture Search

no code implementations • ICLR 2020 • Jiemin Fang, Yuzhu Sun, Kangjian Peng, Qian Zhang, Yuan Li, Wenyu Liu, Xinggang Wang

In our experiments, we conduct FNA on MobileNetV2 to obtain new networks for both segmentation and detection that clearly outperform existing networks designed both manually and by NAS.

Image Classification, Neural Architecture Search +4