Search Results for author: Yikai Wang

Found 55 papers, 32 papers with code

Small Scale Data-Free Knowledge Distillation

1 code implementation CVPR 2024 He Liu, Yikai Wang, Huaping Liu, Fuchun Sun, Anbang Yao

In this line of research, existing methods typically follow an inversion-and-distillation paradigm in which a generative adversarial network, trained on the fly under the guidance of the pre-trained teacher network, is used to synthesize a large-scale sample set for knowledge distillation.

Data-free Knowledge Distillation Generative Adversarial Network +2

Freeplane: Unlocking Free Lunch in Triplane-Based Sparse-View Reconstruction Models

no code implementations 2 Jun 2024 Wenqiang Sun, Zhengyi Wang, Shuo Chen, Yikai Wang, Zilong Chen, Jun Zhu, Jun Zhang

We first analyze the role of triplanes in feed-forward methods and find that the inconsistent multi-view images introduce high-frequency artifacts on triplanes, leading to low-quality 3D meshes.

3D StreetUnveiler with Semantic-Aware 2DGS

no code implementations 28 May 2024 Jingwei Xu, Yikai Wang, Yiqun Zhao, Yanwei Fu, Shenghua Gao

The mesh representation of the empty street can be extracted for further applications.

3D Inpainting Autonomous Driving

PivotMesh: Generic 3D Mesh Generation via Pivot Vertices Guidance

no code implementations 27 May 2024 Haohan Weng, Yikai Wang, Tong Zhang, C. L. Philip Chen, Jun Zhu

Generating compact and sharply detailed 3D meshes poses a significant challenge for current 3D generative models.

Towards Global Optimal Visual In-Context Learning Prompt Selection

no code implementations 24 May 2024 Chengming Xu, Chen Liu, Yikai Wang, Yanwei Fu

Visual In-Context Learning (VICL) is a prevailing way to transfer visual foundation models to new tasks by leveraging contextual information contained in in-context examples to enhance learning and prediction of the query sample.

Colorization Foreground Segmentation +4

FlexiDreamer: Single Image-to-3D Generation with FlexiCubes

1 code implementation 1 Apr 2024 Ruowen Zhao, Zhengyi Wang, Yikai Wang, Zihan Zhou, Jun Zhu

However, since directly reconstructing triangle meshes from multi-view images is challenging, most methodologies opt for an implicit representation (such as NeRF) during the sparse-view reconstruction and acquire the target mesh by a post-processing extraction.

3D Generation Image to 3D

Equivariant Local Reference Frames for Unsupervised Non-rigid Point Cloud Shape Correspondence

no code implementations 1 Apr 2024 Ling Wang, Runfa Chen, Yikai Wang, Fuchun Sun, Xinzhou Wang, Sun Kai, Guangyuan Fu, Jianwei Zhang, Wenbing Huang

Based on the assumption of local rigidity, one solution for reducing complexity is to decompose the overall shape into independent local regions using Local Reference Frames (LRFs) that are invariant to SE(3) transformations.

NeuroPictor: Refining fMRI-to-Image Reconstruction via Multi-individual Pretraining and Multi-level Modulation

no code implementations 27 Mar 2024 Jingyang Huo, Yikai Wang, Xuelin Qian, Yun Wang, Chong Li, Jianfeng Feng, Yanwei Fu

Recent fMRI-to-image approaches mainly focused on associating fMRI signals with specific conditions of pre-trained diffusion models.

Image Reconstruction

DreamReward: Text-to-3D Generation with Human Preference

no code implementations 21 Mar 2024 Junliang Ye, Fangfu Liu, Qixiu Li, Zhengyi Wang, Yikai Wang, Xinzhou Wang, Yueqi Duan, Jun Zhu

Building upon the 3D reward model, we finally perform theoretical analysis and present the Reward3D Feedback Learning (DreamFL), a direct tuning algorithm to optimize the multi-view diffusion models with a redefined scorer.

3D Generation Text to 3D +1

Isotropic3D: Image-to-3D Generation Based on a Single CLIP Embedding

1 code implementation 15 Mar 2024 Pengkun Liu, Yikai Wang, Fuchun Sun, Jiafang Li, Hang Xiao, Hongxiang Xue, Xinzhou Wang

As a result, with a single image CLIP embedding, Isotropic3D is capable of generating multi-view mutually consistent images and also a 3D model with more symmetrical and neat content, well-proportioned geometry, rich colored texture, and less distortion compared with existing image-to-3D methods while still preserving the similarity to the reference image to a large extent.

3D Generation Image to 3D +1

V3D: Video Diffusion Models are Effective 3D Generators

1 code implementation 11 Mar 2024 Zilong Chen, Yikai Wang, Feng Wang, Zhengyi Wang, Huaping Liu

To fully unleash the potential of video diffusion to perceive the 3D world, we further introduce a geometrical consistency prior and extend the video diffusion model to a multi-view consistent 3D generator.

3D Generation Novel View Synthesis

CRM: Single Image to 3D Textured Mesh with Convolutional Reconstruction Model

no code implementations 8 Mar 2024 Zhengyi Wang, Yikai Wang, Yifei Chen, Chendong Xiang, Shuo Chen, Dajiang Yu, Chongxuan Li, Hang Su, Jun Zhu

In this work, we present the Convolutional Reconstruction Model (CRM), a high-fidelity feed-forward single image-to-3D generative model.

Image to 3D

Repositioning the Subject within Image

1 code implementation 30 Jan 2024 Yikai Wang, Chenjie Cao, Ke Fan, Qiaole Dong, YiFan Li, Xiangyang Xue, Yanwei Fu

Our research reveals that the fundamental sub-tasks of subject repositioning, which include filling the void left by the repositioned subject, reconstructing obscured portions of the subject and blending the subject to be consistent with surrounding areas, can be effectively reformulated as a unified, prompt-guided inpainting task.

Image Generation Image Manipulation

Towards Context-Stable and Visual-Consistent Image Inpainting

1 code implementation 8 Dec 2023 Yikai Wang, Chenjie Cao, Ke Fan, Xiangyang Xue, Yanwei Fu

Recent progress in inpainting increasingly relies on generative models, leveraging their strong generation capabilities for addressing large irregular masks.

Decoder Image Inpainting

AnimatableDreamer: Text-Guided Non-rigid 3D Model Generation and Reconstruction with Canonical Score Distillation

no code implementations 6 Dec 2023 Xinzhou Wang, Yikai Wang, Junliang Ye, Zhengyi Wang, Fuchun Sun, Pengkun Liu, Ling Wang, Kai Sun, Xintong Wang, Bin He

Extensive experiments demonstrate the capability of our method in generating high-flexibility text-guided 3D models from the monocular video, while also showing improved reconstruction performance over existing non-rigid reconstruction methods.

3D Generation Denoising +1

InstructPix2NeRF: Instructed 3D Portrait Editing from a Single Image

1 code implementation 6 Nov 2023 Jianhui Li, Shilong Liu, Zidong Liu, Yikai Wang, Kaiwen Zheng, Jinghui Xu, Jianmin Li, Jun Zhu

With the success of Neural Radiance Field (NeRF) in 3D-aware portrait editing, a variety of works have achieved promising results regarding both quality and 3D consistency.

Text-to-3D using Gaussian Splatting

1 code implementation CVPR 2024 Zilong Chen, Feng Wang, Yikai Wang, Huaping Liu

Specifically, our method adopts a progressive optimization strategy, which includes a geometry optimization stage and an appearance refinement stage.

3D Generation Text to 3D

Coarse-to-Fine Amodal Segmentation with Shape Prior

1 code implementation ICCV 2023 Jianxiong Gao, Xuelin Qian, Yikai Wang, Tianjun Xiao, Tong He, Zheng Zhang, Yanwei Fu

To address this issue, we propose a convolution refine module to inject fine-grained information and provide a more precise amodal object segmentation based on visual features and coarse-predicted segmentation.

Object Segmentation +1

Root Pose Decomposition Towards Generic Non-rigid 3D Reconstruction with Monocular Videos

no code implementations ICCV 2023 Yikai Wang, Yinpeng Dong, Fuchun Sun, Xiao Yang

The key idea of our method, Root Pose Decomposition (RPD), is to maintain a per-frame root pose transformation while building a dense field with local transformations to rectify the root pose.

3D Reconstruction Object

JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models

2 code implementations 9 Aug 2023 Peike Li, BoYu Chen, Yao Yao, Yikai Wang, Allen Wang, Alex Wang

Despite the task's significance, prevailing generative models exhibit limitations in music quality, computational efficiency, and generalization.

Computational Efficiency In-Context Learning +2

ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation

2 code implementations NeurIPS 2023 Zhengyi Wang, Cheng Lu, Yikai Wang, Fan Bao, Chongxuan Li, Hang Su, Jun Zhu

In comparison, VSD works well with various CFG weights as ancestral sampling from diffusion models and simultaneously improves the diversity and sample quality with a common CFG weight (i.e., $7.5$).

3D Generation Diversity +1

LeftRefill: Filling Right Canvas based on Left Reference through Generalized Text-to-Image Diffusion Model

3 code implementations CVPR 2024 Chenjie Cao, Yunuo Cai, Qiaole Dong, Yikai Wang, Yanwei Fu

As an exemplar, we leverage LeftRefill to address two different challenges: reference-guided inpainting and novel view synthesis, based on the pre-trained StableDiffusion.

Image Inpainting Image Manipulation +2

REMAST: Real-time Emotion-based Music Arrangement with Soft Transition

1 code implementation 14 May 2023 ZiHao Wang, Le Ma, Chen Zhang, Bo Han, Yunfei Xu, Yikai Wang, Xinyi Chen, HaoRong Hong, Wenbo Liu, Xinda Wu, Kejun Zhang

Music as an emotional intervention medium has important applications in scenarios such as music therapy, games, and movies.

Learning Robust, Agile, Natural Legged Locomotion Skills in the Wild

no code implementations 21 Apr 2023 Yikai Wang, Zheyuan Jiang, Jianyu Chen

In this paper, we propose a new framework for learning robust, agile and natural legged locomotion skills over challenging terrain.


Towards Effective Adversarial Textured 3D Meshes on Physical Face Recognition

1 code implementation CVPR 2023 Xiao Yang, Chang Liu, Longlong Xu, Yikai Wang, Yinpeng Dong, Ning Chen, Hang Su, Jun Zhu

The goal of this work is to develop a more reliable technique that can carry out an end-to-end evaluation of adversarial robustness for commercial systems.

Adversarial Robustness Face Recognition

Joint fMRI Decoding and Encoding with Latent Embedding Alignment

no code implementations 26 Mar 2023 Xuelin Qian, Yikai Wang, Yanwei Fu, Xinwei Sun, Xiangyang Xue, Jianfeng Feng

Our Latent Embedding Alignment (LEA) model concurrently recovers visual stimuli from fMRI signals and predicts brain activity from images within a unified framework.

Image Generation

Compacting Binary Neural Networks by Sparse Kernel Selection

no code implementations CVPR 2023 Yikai Wang, Wenbing Huang, Yinpeng Dong, Fuchun Sun, Anbang Yao

Binary Neural Network (BNN) represents convolution weights with 1-bit values, which enhances the efficiency of storage and computation.


Entity-Level Text-Guided Image Manipulation

1 code implementation 22 Feb 2023 Yikai Wang, Jianan Wang, Guansong Lu, Hang Xu, Zhenguo Li, Wei Zhang, Yanwei Fu

In the image manipulation phase, SeMani adopts a generative model to synthesize new images conditioned on the entity-irrelevant regions and target text descriptions.

Denoising Image Manipulation

Knockoffs-SPR: Clean Sample Selection in Learning with Noisy Labels

1 code implementation 2 Jan 2023 Yikai Wang, Yanwei Fu, Xinwei Sun

While Knockoffs-SPR can be regarded as a sample selection module for a standard supervised training pipeline, we further combine it with a semi-supervised algorithm to exploit the support of noisy data as unlabeled data.

Learning with noisy labels regression

Bridged Transformer for Vision and Point Cloud 3D Object Detection

2 code implementations CVPR 2022 Yikai Wang, TengQi Ye, Lele Cao, Wenbing Huang, Fuchun Sun, Fengxiang He, DaCheng Tao

Recently, there has been a trend of leveraging multiple sources of input data, such as complementing the 3D point cloud with 2D images that often have richer color and less noise.

3D Object Detection Object +1

SongDriver: Real-time Music Accompaniment Generation without Logical Latency nor Exposure Bias

no code implementations 13 Sep 2022 ZiHao Wang, Qihao Liang, Kejun Zhang, Yuxing Wang, Chen Zhang, Pengfei Yu, Yongsheng Feng, Wenbo Liu, Yikai Wang, Yuntai Bao, Yiheng Yang

In this paper, we propose SongDriver, a real-time music accompaniment generation system without logical latency nor exposure bias.

Multimodal Token Fusion for Vision Transformers

11 code implementations journal 2022 Yikai Wang, Xinghao Chen, Lele Cao, Wenbing Huang, Fuchun Sun, Yunhe Wang

Many adaptations of transformers have emerged to address the single-modal vision tasks, where self-attention modules are stacked to handle input sources like images.

Ranked #3 on Semantic Segmentation on SUN-RGBD (using extra training data)

3D Object Detection Image-to-Image Translation +2

Sound Adversarial Audio-Visual Navigation

1 code implementation ICLR 2022 Yinfeng Yu, Wenbing Huang, Fuchun Sun, Changan Chen, Yikai Wang, Xiaohong Liu

In this work, we design an acoustically complex environment in which, besides the target sound, there exists a sound attacker playing a zero-sum game with the agent.

Navigate Visual Navigation

Channel Exchanging Networks for Multimodal and Multitask Dense Image Prediction

1 code implementation 4 Dec 2021 Yikai Wang, Fuchun Sun, Wenbing Huang, Fengxiang He, DaCheng Tao

For the application of dense image prediction, the validity of CEN is tested by four different scenarios: multimodal fusion, cycle multimodal fusion, multitask learning, and multimodal multitask learning.

Semantic Segmentation

Sub-bit Neural Networks: Learning to Compress and Accelerate Binary Neural Networks

1 code implementation ICCV 2021 Yikai Wang, Yi Yang, Fuchun Sun, Anbang Yao

In the low-bit quantization field, training Binary Neural Networks (BNNs) is the extreme solution to ease the deployment of deep models on resource-constrained devices, having the lowest storage cost and significantly cheaper bit-wise operations compared to 32-bit floating-point counterparts.


Relative Instance Credibility Inference for Learning with Noisy Labels

no code implementations 29 Sep 2021 Yikai Wang, Xinwei Sun, Yanwei Fu

Specifically, we re-purpose a sparse linear model with incidental parameters as a unified Relative Instance Credibility Inference (RICI) framework, which will detect and remove outliers in the forward pass of each mini-batch and use the remaining instances to train the network.

Learning with noisy labels

Elastic Tactile Simulation Towards Tactile-Visual Perception

2 code implementations 11 Aug 2021 Yikai Wang, Wenbing Huang, Bin Fang, Fuchun Sun, Chang Li

By contrast, EIP models the tactile sensor as a group of coordinated particles, and the elastic property is applied to regulate the deformation of particles during contact.

Explicit Connection Distillation

no code implementations 1 Jan 2021 Lujun Li, Yikai Wang, Anbang Yao, Yi Qian, Xiao Zhou, Ke He

In this paper, we present Explicit Connection Distillation (ECD), a new KD framework that addresses the knowledge distillation problem from a novel perspective: bridging dense intermediate feature connections between a student network and its corresponding teacher, which is generated automatically during training. Knowledge transfer is achieved via direct cross-network layer-to-layer gradient propagation, without the need to define complex distillation losses or to assume that a pre-trained teacher model is available.

Image Classification Knowledge Distillation +1

Blind signal decomposition of various word embeddings based on join and individual variance explained

no code implementations 30 Nov 2020 Yikai Wang, Weijian Li

We found that by mapping different word embeddings into the joint component, sentiment performance can be greatly improved for the original word embeddings with lower performance.

Dimensionality Reduction Sentiment Analysis +1

Elastic Interaction of Particles for Robotic Tactile Simulation

no code implementations 23 Nov 2020 Yikai Wang, Wenbing Huang, Bin Fang, Fuchun Sun

At its core, EIP models the tactile sensor as a group of coordinated particles, and the elastic theory is applied to regulate the deformation of particles during the contact process.

Deep Multimodal Fusion by Channel Exchanging

1 code implementation NeurIPS 2020 Yikai Wang, Wenbing Huang, Fuchun Sun, Tingyang Xu, Yu Rong, Junzhou Huang

Deep multimodal fusion by using multiple sources of data for classification or regression has exhibited a clear advantage over the unimodal counterpart on various applications.

Image-to-Image Translation Semantic Segmentation +1

LOCUS: A Novel Decomposition Method for Brain Network Connectivity Matrices using Low-rank Structure with Uniform Sparsity

no code implementations 19 Aug 2020 Yikai Wang, Ying Guo

In this paper, we propose a novel blind source separation method with low-rank structure and uniform sparsity (LOCUS) as a fully data-driven decomposition method for network measures.

blind source separation

Resolution Switchable Networks for Runtime Efficient Image Recognition

1 code implementation ECCV 2020 Yikai Wang, Fuchun Sun, Duo Li, Anbang Yao

We propose a general method to train a single convolutional neural network which is capable of switching image resolutions at inference.

Knowledge Distillation Quantization

How to trust unlabeled data? Instance Credibility Inference for Few-Shot Learning

2 code implementations 15 Jul 2020 Yikai Wang, Li Zhang, Yuan Yao, Yanwei Fu

We rank the credibility of pseudo-labeled instances along the regularization path of their corresponding incidental parameters, and the most trustworthy pseudo-labeled examples are preserved as the augmented labeled instances.

Data Augmentation Few-Shot Learning

Instance Credibility Inference for Few-Shot Learning

1 code implementation CVPR 2020 Yikai Wang, Chengming Xu, Chen Liu, Li Zhang, Yanwei Fu

To measure the credibility of each pseudo-labeled instance, we then propose to solve another linear regression hypothesis by increasing the sparsity of the incidental parameters and rank the pseudo-labeled instances with their sparsity degree.

Data Augmentation Few-Shot Image Classification +2
