Search Results for author: Yikai Wang

Found 48 papers, 30 papers with code

FlexiDreamer: Single Image-to-3D Generation with FlexiCubes

1 code implementation • 1 Apr 2024 • Ruowen Zhao, Zhengyi Wang, Yikai Wang, Zihan Zhou, Jun Zhu

However, due to the challenge of directly deforming the mesh representation to approach the target topology, most methodologies learn an implicit representation (such as NeRF) during the sparse-view reconstruction and acquire the target mesh by a post-processing extraction.

3D Generation Image to 3D


Equivariant Local Reference Frames for Unsupervised Non-rigid Point Cloud Shape Correspondence

no code implementations • 1 Apr 2024 • Ling Wang, Runfa Chen, Yikai Wang, Fuchun Sun, Xinzhou Wang, Sun Kai, Guangyuan Fu, Jianwei Zhang, Wenbing Huang

Based on the assumption of local rigidity, one solution for reducing complexity is to decompose the overall shape into independent local regions using Local Reference Frames (LRFs) that are invariant to SE(3) transformations.


NeuroPictor: Refining fMRI-to-Image Reconstruction via Multi-individual Pretraining and Multi-level Modulation

no code implementations • 27 Mar 2024 • Jingyang Huo, Yikai Wang, Xuelin Qian, Yun Wang, Chong Li, Jianfeng Feng, Yanwei Fu

Recent fMRI-to-image approaches mainly focused on associating fMRI signals with specific conditions of pre-trained diffusion models.

Image Reconstruction


DreamReward: Text-to-3D Generation with Human Preference

no code implementations • 21 Mar 2024 • Junliang Ye, Fangfu Liu, Qixiu Li, Zhengyi Wang, Yikai Wang, Xinzhou Wang, Yueqi Duan, Jun Zhu

Building upon the 3D reward model, we finally perform theoretical analysis and present the Reward3D Feedback Learning (DreamFL), a direct tuning algorithm to optimize the multi-view diffusion models with a redefined scorer.

3D Generation Text to 3D +1


Isotropic3D: Image-to-3D Generation Based on a Single CLIP Embedding

1 code implementation • 15 Mar 2024 • Pengkun Liu, Yikai Wang, Fuchun Sun, Jiafang Li, Hang Xiao, Hongxiang Xue, Xinzhou Wang

As a result, with a single image CLIP embedding, Isotropic3D is capable of generating multi-view mutually consistent images and also a 3D model with more symmetrical and neat content, well-proportioned geometry, rich colored texture, and less distortion compared with existing image-to-3D methods while still preserving the similarity to the reference image to a large extent.

3D Generation Image to 3D +1


V3D: Video Diffusion Models are Effective 3D Generators

2 code implementations • 11 Mar 2024 • Zilong Chen, Yikai Wang, Feng Wang, Zhengyi Wang, Huaping Liu

To fully unleash the potential of video diffusion to perceive the 3D world, we further introduce geometrical consistency prior and extend the video diffusion model to a multi-view consistent 3D generator.

3D Generation Novel View Synthesis

404


CRM: Single Image to 3D Textured Mesh with Convolutional Reconstruction Model

no code implementations • 8 Mar 2024 • Zhengyi Wang, Yikai Wang, Yifei Chen, Chendong Xiang, Shuo Chen, Dajiang Yu, Chongxuan Li, Hang Su, Jun Zhu

In this work, we present the Convolutional Reconstruction Model (CRM), a high-fidelity feed-forward single image-to-3D generative model.

Image to 3D


Repositioning the Subject within Image

1 code implementation • 30 Jan 2024 • Yikai Wang, Chenjie Cao, Ke Fan, Qiaole Dong, YiFan Li, Xiangyang Xue, Yanwei Fu

Our research reveals that the fundamental sub-tasks of subject repositioning, which include filling the void left by the repositioned subject, reconstructing obscured portions of the subject and blending the subject to be consistent with surrounding areas, can be effectively reformulated as a unified, prompt-guided inpainting task.

Image Generation Image Manipulation


Towards Context-Stable and Visual-Consistent Image Inpainting

1 code implementation • 8 Dec 2023 • Yikai Wang, Chenjie Cao, Ke Fan, Xiangyang Xue, Yanwei Fu

Recent progress in inpainting increasingly relies on generative models, leveraging their strong generation capabilities for addressing large irregular masks.

Decoder Image Inpainting


AnimatableDreamer: Text-Guided Non-rigid 3D Model Generation and Reconstruction with Canonical Score Distillation

no code implementations • 6 Dec 2023 • Xinzhou Wang, Yikai Wang, Junliang Ye, Zhengyi Wang, Fuchun Sun, Pengkun Liu, Ling Wang, Kai Sun, Xintong Wang, Bin He

Extensive experiments demonstrate the capability of our method in generating high-flexibility text-guided 3D models from the monocular video, while also showing improved reconstruction performance over existing non-rigid reconstruction methods.

3D Generation Denoising +1


GaussianEditor: Swift and Controllable 3D Editing with Gaussian Splatting

1 code implementation • 24 Nov 2023 • YiWen Chen, Zilong Chen, Chi Zhang, Feng Wang, Xiaofeng Yang, Yikai Wang, Zhongang Cai, Lei Yang, Huaping Liu, Guosheng Lin

3D editing plays a crucial role in many areas such as gaming and virtual reality.

901


InstructPix2NeRF: Instructed 3D Portrait Editing from a Single Image

1 code implementation • 6 Nov 2023 • Jianhui Li, Shilong Liu, Zidong Liu, Yikai Wang, Kaiwen Zheng, Jinghui Xu, Jianmin Li, Jun Zhu

With the success of Neural Radiance Field (NeRF) in 3D-aware portrait editing, a variety of works have achieved promising results regarding both quality and 3D consistency.


Text-to-3D using Gaussian Splatting

1 code implementation • 28 Sep 2023 • Zilong Chen, Feng Wang, Yikai Wang, Huaping Liu

Specifically, our method adopts a progressive optimization strategy, which includes a geometry optimization stage and an appearance refinement stage.

3D Generation Text to 3D

705


Coarse-to-Fine Amodal Segmentation with Shape Prior

1 code implementation • ICCV 2023 • Jianxiong Gao, Xuelin Qian, Yikai Wang, Tianjun Xiao, Tong He, Zheng Zhang, Yanwei Fu

To address this issue, we propose a convolution refine module to inject fine-grained information and provide a more precise amodal object segmentation based on visual features and coarse-predicted segmentation.

Object Segmentation +1


Root Pose Decomposition Towards Generic Non-rigid 3D Reconstruction with Monocular Videos

no code implementations • ICCV 2023 • Yikai Wang, Yinpeng Dong, Fuchun Sun, Xiao Yang

The key idea of our method, Root Pose Decomposition (RPD), is to maintain a per-frame root pose transformation, meanwhile building a dense field with local transformations to rectify the root pose.

3D Reconstruction Object


JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models

2 code implementations • 9 Aug 2023 • Peike Li, BoYu Chen, Yao Yao, Yikai Wang, Allen Wang, Alex Wang

Despite the task's significance, prevailing generative models exhibit limitations in music quality, computational efficiency, and generalization.

Ranked #1 on Text-to-Music Generation on MusicCaps

Computational Efficiency In-Context Learning +2


Human-imperceptible, Machine-recognizable Images

1 code implementation • 6 Jun 2023 • Fusheng Hao, Fengxiang He, Yikai Wang, Fuxiang Wu, Jing Zhang, Jun Cheng, DaCheng Tao

Massive human-related data is collected to train neural networks for computer vision tasks.

Image Classification object-detection +2


ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation

2 code implementations • NeurIPS 2023 • Zhengyi Wang, Cheng Lu, Yikai Wang, Fan Bao, Chongxuan Li, Hang Su, Jun Zhu

In comparison, VSD works well with various CFG weights as ancestral sampling from diffusion models and simultaneously improves the diversity and sample quality with a common CFG weight (i.e., $7.5$).

3D Generation Text to 3D

5,735

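For readers unfamiliar with the "CFG weight" mentioned in the ProlificDreamer excerpt, the following is a minimal sketch of the standard classifier-free guidance combination of two noise predictions. It is not code from the paper and does not implement VSD. The function name `cfg_combine` and the toy vectors are illustrative assumptions, and some papers parameterize the weight slightly differently.

```python
def cfg_combine(eps_uncond, eps_cond, weight):
    """Standard classifier-free guidance (illustrative, not the paper's code).

    Returns eps_uncond + weight * (eps_cond - eps_uncond), element by element.
    weight = 0 gives the unconditional prediction, weight = 1 the conditional one,
    and larger weights extrapolate further toward the condition.
    """
    return [u + weight * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Toy noise predictions (hypothetical values, chosen only for illustration).
eps_uncond = [0.0, 1.0, -2.0]
eps_cond = [1.0, 1.0, 0.0]

# 7.5 is the common CFG weight quoted in the excerpt above.
guided = cfg_combine(eps_uncond, eps_cond, 7.5)
print(guided)
```

With a weight well above 1, the guided prediction moves past the conditional one, which is why large weights tend to trade diversity for prompt fidelity.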

LeftRefill: Filling Right Canvas based on Left Reference through Generalized Text-to-Image Diffusion Model

3 code implementations • 19 May 2023 • Chenjie Cao, Yunuo Cai, Qiaole Dong, Yikai Wang, Yanwei Fu

As an exemplar, we leverage LeftRefill to address two different challenges: reference-guided inpainting and novel view synthesis, based on the pre-trained StableDiffusion.

Image Inpainting Image Manipulation +2


REMAST: Real-time Emotion-based Music Arrangement with Soft Transition

1 code implementation • 14 May 2023 • ZiHao Wang, Le Ma, Chen Zhang, Bo Han, Yunfei Xu, Yikai Wang, Xinyi Chen, HaoRong Hong, Wenbo Liu, Xinda Wu, Kejun Zhang

Music as an emotional intervention medium has important applications in scenarios such as music therapy, games, and movies.


Learning Robust, Agile, Natural Legged Locomotion Skills in the Wild

no code implementations • 21 Apr 2023 • Yikai Wang, Zheyuan Jiang, Jianyu Chen

In this paper, we propose a new framework for learning robust, agile and natural legged locomotion skills over challenging terrain.

reinforcement-learning


Towards Effective Adversarial Textured 3D Meshes on Physical Face Recognition

1 code implementation • CVPR 2023 • Xiao Yang, Chang Liu, Longlong Xu, Yikai Wang, Yinpeng Dong, Ning Chen, Hang Su, Jun Zhu

The goal of this work is to develop a more reliable technique that can carry out an end-to-end evaluation of adversarial robustness for commercial systems.

Adversarial Robustness Face Recognition


Joint fMRI Decoding and Encoding with Latent Embedding Alignment

no code implementations • 26 Mar 2023 • Xuelin Qian, Yikai Wang, Yanwei Fu, Xinwei Sun, Xiangyang Xue, Jianfeng Feng

Our Latent Embedding Alignment (LEA) model concurrently recovers visual stimuli from fMRI signals and predicts brain activity from images within a unified framework.

Image Generation


Compacting Binary Neural Networks by Sparse Kernel Selection

no code implementations • CVPR 2023 • Yikai Wang, Wenbing Huang, Yinpeng Dong, Fuchun Sun, Anbang Yao

Binary Neural Network (BNN) represents convolution weights with 1-bit values, which enhances the efficiency of storage and computation.

Binarization

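To make the storage claim in the excerpt above concrete, here is a minimal sketch of 1-bit weight representation: weights are binarized by sign and packed eight to a byte. It illustrates generic binarization only, not the sparse kernel selection method of the paper. All function names and the sample weights are illustrative assumptions.

```python
import struct


def binarize(weights):
    """Map each real-valued weight to +1 or -1 by its sign (0 maps to +1)."""
    return [1 if w >= 0 else -1 for w in weights]


def pack_bits(signs):
    """Pack +1/-1 values into bytes, one bit per weight (+1 -> 1, -1 -> 0)."""
    packed = bytearray((len(signs) + 7) // 8)
    for i, s in enumerate(signs):
        if s == 1:
            packed[i // 8] |= 1 << (7 - i % 8)
    return bytes(packed)


def unpack_bits(packed, n):
    """Recover the first n +1/-1 values from the packed bytes."""
    return [1 if (packed[i // 8] >> (7 - i % 8)) & 1 else -1 for i in range(n)]


# Hypothetical weights, chosen only for illustration.
weights = [0.7, -1.2, 0.05, -0.3, 2.1, -0.01, 0.0, -4.0, 0.9]
signs = binarize(weights)
packed = pack_bits(signs)

# Storage as 32-bit floats (4 bytes per weight) versus 1 bit per weight.
float_bytes = len(struct.pack(f"{len(weights)}f", *weights))
print(float_bytes, len(packed))
```

For large layers the ratio approaches 32x, since each 32-bit float is replaced by a single bit. Padding in the last byte matters only for tiny examples like this one.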

Benchmarking Robustness of 3D Object Detection to Common Corruptions in Autonomous Driving

no code implementations • 20 Mar 2023 • Yinpeng Dong, Caixin Kang, Jinlai Zhang, Zijian Zhu, Yikai Wang, Xiao Yang, Hang Su, Xingxing Wei, Jun Zhu

3D object detection is an important task in autonomous driving to perceive the surroundings.

Autonomous Driving Benchmarking +3


Entity-Level Text-Guided Image Manipulation

1 code implementation • 22 Feb 2023 • Yikai Wang, Jianan Wang, Guansong Lu, Hang Xu, Zhenguo Li, Wei Zhang, Yanwei Fu

In the image manipulation phase, SeMani adopts a generative model to synthesize new images conditioned on the entity-irrelevant regions and target text descriptions.

Denoising Image Manipulation


Knockoffs-SPR: Clean Sample Selection in Learning with Noisy Labels

1 code implementation • 2 Jan 2023 • Yikai Wang, Yanwei Fu, Xinwei Sun

While Knockoffs-SPR can be regarded as a sample selection module for a standard supervised training pipeline, we further combine it with a semi-supervised algorithm to exploit the support of noisy data as unlabeled data.

Ranked #1 on Learning with noisy labels on Clothing1M

Learning with noisy labels regression


Benchmarking Robustness of 3D Object Detection to Common Corruptions

1 code implementation • CVPR 2023 • Yinpeng Dong, Caixin Kang, Jinlai Zhang, Zijian Zhu, Yikai Wang, Xiao Yang, Hang Su, Xingxing Wei, Jun Zhu

3D object detection is an important task in autonomous driving to perceive the surroundings.

3D Object Detection Autonomous Driving +2

103


Bridged Transformer for Vision and Point Cloud 3D Object Detection

2 code implementations • CVPR 2022 • Yikai Wang, TengQi Ye, Lele Cao, Wenbing Huang, Fuchun Sun, Fengxiang He, DaCheng Tao

Recently, there has been a trend of leveraging multiple sources of input data, such as complementing the 3D point cloud with 2D images that often have richer color and less noise.

3D Object Detection Object +1


SongDriver: Real-time Music Accompaniment Generation without Logical Latency nor Exposure Bias

no code implementations • 13 Sep 2022 • ZiHao Wang, Qihao Liang, Kejun Zhang, Yuxing Wang, Chen Zhang, Pengfei Yu, Yongsheng Feng, Wenbo Liu, Yikai Wang, Yuntai Bao, Yiheng Yang

In this paper, we propose SongDriver, a real-time music accompaniment generation system without logical latency nor exposure bias.


A Simple Test-Time Method for Out-of-Distribution Detection

no code implementations • 17 Jul 2022 • Ke Fan, Yikai Wang, Qian Yu, Da Li, Yanwei Fu

In contrast, this paper proposes a simple Test-time Linear Training (ETLT) method for OOD detection.

Out-of-Distribution Detection Out of Distribution (OOD) Detection


Multimodal Token Fusion for Vision Transformers

11 code implementations • journal 2022 • Yikai Wang, Xinghao Chen, Lele Cao, Wenbing Huang, Fuchun Sun, Yunhe Wang

Many adaptations of transformers have emerged to address the single-modal vision tasks, where self-attention modules are stacked to handle input sources like images.

Ranked #1 on Semantic Segmentation on SUN-RGBD (using extra training data)

3D Object Detection Image-to-Image Translation +2

835


Scalable Penalized Regression for Noise Detection in Learning with Noisy Labels

1 code implementation • CVPR 2022 • Yikai Wang, Xinwei Sun, Yanwei Fu

Noisy training set usually leads to the degradation of generalization and robustness of neural networks.

Ranked #4 on Learning with noisy labels on Clothing1M

Learning with noisy labels regression


Sound Adversarial Audio-Visual Navigation

1 code implementation • ICLR 2022 • Yinfeng Yu, Wenbing Huang, Fuchun Sun, Changan Chen, Yikai Wang, Xiaohong Liu

In this work, we design an acoustically complex environment in which, besides the target sound, there exists a sound attacker playing a zero-sum game with the agent.

Navigate Visual Navigation


Channel Exchanging Networks for Multimodal and Multitask Dense Image Prediction

1 code implementation • 4 Dec 2021 • Yikai Wang, Fuchun Sun, Wenbing Huang, Fengxiang He, DaCheng Tao

For the application of dense image prediction, the validity of CEN is tested by four different scenarios: multimodal fusion, cycle multimodal fusion, multitask learning, and multimodal multitask learning.

Ranked #7 on Semantic Segmentation on LLRGBD-synthetic

Semantic Segmentation

278


Sub-bit Neural Networks: Learning to Compress and Accelerate Binary Neural Networks

1 code implementation • ICCV 2021 • Yikai Wang, Yi Yang, Fuchun Sun, Anbang Yao

In the low-bit quantization field, training Binary Neural Networks (BNNs) is the extreme solution to ease the deployment of deep models on resource-constrained devices, having the lowest storage cost and significantly cheaper bit-wise operations compared to 32-bit floating-point counterparts.

Quantization


Relative Instance Credibility Inference for Learning with Noisy Labels

no code implementations • 29 Sep 2021 • Yikai Wang, Xinwei Sun, Yanwei Fu

Specifically, we re-purpose a sparse linear model with incidental parameters as a unified Relative Instance Credibility Inference (RICI) framework, which will detect and remove outliers in the forward pass of each mini-batch and use the remaining instances to train the network.

Learning with noisy labels


Elastic Tactile Simulation Towards Tactile-Visual Perception

2 code implementations • 11 Aug 2021 • Yikai Wang, Wenbing Huang, Bin Fang, Fuchun Sun, Chang Li

By contrast, EIP models the tactile sensor as a group of coordinated particles, and the elastic property is applied to regulate the deformation of particles during contact.


Learning Deep Multimodal Feature Representation with Asymmetric Multi-layer Fusion

1 code implementation • 11 Aug 2021 • Yikai Wang, Fuchun Sun, Ming Lu, Anbang Yao

We propose a compact and effective framework to fuse multimodal features at multiple layers in a single network.

Ranked #42 on Semantic Segmentation on NYU Depth v2

Representation Learning Semantic Segmentation +1


Explicit Connection Distillation

no code implementations • 1 Jan 2021 • Lujun Li, Yikai Wang, Anbang Yao, Yi Qian, Xiao Zhou, Ke He

In this paper, we present Explicit Connection Distillation (ECD), a new KD framework, which addresses the knowledge distillation problem in a novel perspective of bridging dense intermediate feature connections between a student network and its corresponding teacher generated automatically in the training, achieving knowledge transfer goal via direct cross-network layer-to-layer gradients propagation, without need to define complex distillation losses and assume a pre-trained teacher model to be available.

Image Classification Knowledge Distillation +1


Blind signal decomposition of various word embeddings based on join and individual variance explained

no code implementations • 30 Nov 2020 • Yikai Wang, Weijian Li

We found that by mapping different word embeddings into the joint component, sentiment performance can be greatly improved for the original word embeddings with lower performance.

Dimensionality Reduction Sentiment Analysis +1


Elastic Interaction of Particles for Robotic Tactile Simulation

no code implementations • 23 Nov 2020 • Yikai Wang, Wenbing Huang, Bin Fang, Fuchun Sun

At its core, EIP models the tactile sensor as a group of coordinated particles, and the elastic theory is applied to regulate the deformation of particles during the contact process.


Deep Multimodal Fusion by Channel Exchanging

1 code implementation • NeurIPS 2020 • Yikai Wang, Wenbing Huang, Fuchun Sun, Tingyang Xu, Yu Rong, Junzhou Huang

Deep multimodal fusion by using multiple sources of data for classification or regression has exhibited a clear advantage over the unimodal counterpart on various applications.

Image-to-Image Translation Semantic Segmentation +1

278


LOCUS: A Novel Decomposition Method for Brain Network Connectivity Matrices using Low-rank Structure with Uniform Sparsity

no code implementations • 19 Aug 2020 • Yikai Wang, Ying Guo

In this paper, we propose a novel blind source separation method with low-rank structure and uniform sparsity (LOCUS) as a fully data-driven decomposition method for network measures.

blind source separation


Resolution Switchable Networks for Runtime Efficient Image Recognition

1 code implementation • ECCV 2020 • Yikai Wang, Fuchun Sun, Duo Li, Anbang Yao

We propose a general method to train a single convolutional neural network which is capable of switching image resolutions at inference.

Knowledge Distillation Quantization


How to trust unlabeled data? Instance Credibility Inference for Few-Shot Learning

2 code implementations • 15 Jul 2020 • Yikai Wang, Li Zhang, Yuan Yao, Yanwei Fu

We rank the credibility of pseudo-labeled instances along the regularization path of their corresponding incidental parameters, and the most trustworthy pseudo-labeled examples are preserved as the augmented labeled instances.

Data Augmentation Few-Shot Learning


Instance Credibility Inference for Few-Shot Learning

1 code implementation • CVPR 2020 • Yikai Wang, Chengming Xu, Chen Liu, Li Zhang, Yanwei Fu

To measure the credibility of each pseudo-labeled instance, we then propose to solve another linear regression hypothesis by increasing the sparsity of the incidental parameters and rank the pseudo-labeled instances with their sparsity degree.

Ranked #2 on Few-Shot Image Classification on Dirichlet Tiered-Imagenet (5-way, 1-shot)

Data Augmentation Few-Shot Image Classification +2


Regularized Adversarial Sampling and Deep Time-aware Attention for Click-Through Rate Prediction

no code implementations • 3 Nov 2019 • Yikai Wang, Liang Zhang, Quanyu Dai, Fuchun Sun, Bo Zhang, Yang He, Weipeng Yan, Yongjun Bao

In deep CTR models, exploiting users' historical data is essential for learning users' behaviors and interests.

Click-Through Rate Prediction

