Search Results for author: Yikai Wang

Found 45 papers, 28 papers with code

Instance Credibility Inference for Few-Shot Learning

1 code implementation CVPR 2020 Yikai Wang, Chengming Xu, Chen Liu, Li Zhang, Yanwei Fu

To measure the credibility of each pseudo-labeled instance, we then propose to solve another linear regression hypothesis by increasing the sparsity of the incidental parameters and rank the pseudo-labeled instances with their sparsity degree.

Data Augmentation Few-Shot Image Classification +2

How to trust unlabeled data? Instance Credibility Inference for Few-Shot Learning

2 code implementations 15 Jul 2020 Yikai Wang, Li Zhang, Yuan YAO, Yanwei Fu

We rank the credibility of pseudo-labeled instances along the regularization path of their corresponding incidental parameters, and the most trustworthy pseudo-labeled examples are preserved as the augmented labeled instances.

Data Augmentation Few-Shot Learning

Resolution Switchable Networks for Runtime Efficient Image Recognition

1 code implementation ECCV 2020 Yikai Wang, Fuchun Sun, Duo Li, Anbang Yao

We propose a general method to train a single convolutional neural network which is capable of switching image resolutions at inference.

Knowledge Distillation Quantization

LOCUS: A Novel Decomposition Method for Brain Network Connectivity Matrices using Low-rank Structure with Uniform Sparsity

no code implementations 19 Aug 2020 Yikai Wang, Ying Guo

In this paper, we propose a novel blind source separation method with low-rank structure and uniform sparsity (LOCUS) as a fully data-driven decomposition method for network measures.

blind source separation

Deep Multimodal Fusion by Channel Exchanging

1 code implementation NeurIPS 2020 Yikai Wang, Wenbing Huang, Fuchun Sun, Tingyang Xu, Yu Rong, Junzhou Huang

Deep multimodal fusion by using multiple sources of data for classification or regression has exhibited a clear advantage over the unimodal counterpart on various applications.

Image-to-Image Translation Semantic Segmentation +1

Elastic Interaction of Particles for Robotic Tactile Simulation

no code implementations 23 Nov 2020 Yikai Wang, Wenbing Huang, Bin Fang, Fuchun Sun

At its core, EIP models the tactile sensor as a group of coordinated particles, and the elastic theory is applied to regulate the deformation of particles during the contact process.

Blind signal decomposition of various word embeddings based on join and individual variance explained

no code implementations 30 Nov 2020 Yikai Wang, Weijian Li

We found that by mapping different word embeddings into the joint component, sentiment performance can be greatly improved for the original word embeddings with lower performance.

Dimensionality Reduction Sentiment Analysis +1

Explicit Connection Distillation

no code implementations 1 Jan 2021 Lujun Li, Yikai Wang, Anbang Yao, Yi Qian, Xiao Zhou, Ke He

In this paper, we present Explicit Connection Distillation (ECD), a new KD framework, which addresses the knowledge distillation problem in a novel perspective of bridging dense intermediate feature connections between a student network and its corresponding teacher generated automatically in the training, achieving knowledge transfer goal via direct cross-network layer-to-layer gradients propagation, without need to define complex distillation losses and assume a pre-trained teacher model to be available.

Image Classification Knowledge Distillation +1

Elastic Tactile Simulation Towards Tactile-Visual Perception

2 code implementations 11 Aug 2021 Yikai Wang, Wenbing Huang, Bin Fang, Fuchun Sun, Chang Li

By contrast, EIP models the tactile sensor as a group of coordinated particles, and the elastic property is applied to regulate the deformation of particles during contact.

Relative Instance Credibility Inference for Learning with Noisy Labels

no code implementations 29 Sep 2021 Yikai Wang, Xinwei Sun, Yanwei Fu

Specifically, we re-purpose a sparse linear model with incidental parameters as a unified Relative Instance Credibility Inference (RICI) framework, which will detect and remove outliers in the forward pass of each mini-batch and use the remaining instances to train the network.

Learning with noisy labels

Sub-bit Neural Networks: Learning to Compress and Accelerate Binary Neural Networks

1 code implementation ICCV 2021 Yikai Wang, Yi Yang, Fuchun Sun, Anbang Yao

In the low-bit quantization field, training Binary Neural Networks (BNNs) is the extreme solution to ease the deployment of deep models on resource-constrained devices, having the lowest storage cost and significantly cheaper bit-wise operations compared to 32-bit floating-point counterparts.

Quantization

Channel Exchanging Networks for Multimodal and Multitask Dense Image Prediction

1 code implementation 4 Dec 2021 Yikai Wang, Fuchun Sun, Wenbing Huang, Fengxiang He, DaCheng Tao

For the application of dense image prediction, the validity of CEN is tested by four different scenarios: multimodal fusion, cycle multimodal fusion, multitask learning, and multimodal multitask learning.

Semantic Segmentation

Sound Adversarial Audio-Visual Navigation

1 code implementation ICLR 2022 Yinfeng Yu, Wenbing Huang, Fuchun Sun, Changan Chen, Yikai Wang, Xiaohong Liu

In this work, we design an acoustically complex environment in which, besides the target sound, there exists a sound attacker playing a zero-sum game with the agent.

Navigate Visual Navigation

Multimodal Token Fusion for Vision Transformers

11 code implementations journal 2022 Yikai Wang, Xinghao Chen, Lele Cao, Wenbing Huang, Fuchun Sun, Yunhe Wang

Many adaptations of transformers have emerged to address the single-modal vision tasks, where self-attention modules are stacked to handle input sources like images.

3D Object Detection Image-to-Image Translation +2

SongDriver: Real-time Music Accompaniment Generation without Logical Latency nor Exposure Bias

no code implementations 13 Sep 2022 ZiHao Wang, Qihao Liang, Kejun Zhang, Yuxing Wang, Chen Zhang, Pengfei Yu, Yongsheng Feng, Wenbo Liu, Yikai Wang, Yuntai Bao, Yiheng Yang

In this paper, we propose SongDriver, a real-time music accompaniment generation system without logical latency nor exposure bias.

Bridged Transformer for Vision and Point Cloud 3D Object Detection

2 code implementations CVPR 2022 Yikai Wang, TengQi Ye, Lele Cao, Wenbing Huang, Fuchun Sun, Fengxiang He, DaCheng Tao

Recently, there is a trend of leveraging multiple sources of input data, such as complementing the 3D point cloud with 2D images that often have richer color and fewer noises.

3D Object Detection Object +1

Knockoffs-SPR: Clean Sample Selection in Learning with Noisy Labels

1 code implementation 2 Jan 2023 Yikai Wang, Yanwei Fu, Xinwei Sun

While Knockoffs-SPR can be regarded as a sample selection module for a standard supervised training pipeline, we further combine it with a semi-supervised algorithm to exploit the support of noisy data as unlabeled data.

Learning with noisy labels regression

Entity-Level Text-Guided Image Manipulation

1 code implementation 22 Feb 2023 Yikai Wang, Jianan Wang, Guansong Lu, Hang Xu, Zhenguo Li, Wei Zhang, Yanwei Fu

In the image manipulation phase, SeMani adopts a generative model to synthesize new images conditioned on the entity-irrelevant regions and target text descriptions.

Denoising Image Manipulation

Compacting Binary Neural Networks by Sparse Kernel Selection

no code implementations CVPR 2023 Yikai Wang, Wenbing Huang, Yinpeng Dong, Fuchun Sun, Anbang Yao

Binary Neural Network (BNN) represents convolution weights with 1-bit values, which enhances the efficiency of storage and computation.

Binarization

Joint fMRI Decoding and Encoding with Latent Embedding Alignment

no code implementations 26 Mar 2023 Xuelin Qian, Yikai Wang, Yanwei Fu, Xinwei Sun, Xiangyang Xue, Jianfeng Feng

Our Latent Embedding Alignment (LEA) model concurrently recovers visual stimuli from fMRI signals and predicts brain activity from images within a unified framework.

Image Generation

Towards Effective Adversarial Textured 3D Meshes on Physical Face Recognition

1 code implementation CVPR 2023 Xiao Yang, Chang Liu, Longlong Xu, Yikai Wang, Yinpeng Dong, Ning Chen, Hang Su, Jun Zhu

The goal of this work is to develop a more reliable technique that can carry out an end-to-end evaluation of adversarial robustness for commercial systems.

Adversarial Robustness Face Recognition

Learning Robust, Agile, Natural Legged Locomotion Skills in the Wild

no code implementations 21 Apr 2023 Yikai Wang, Zheyuan Jiang, Jianyu Chen

In this paper, we propose a new framework for learning robust, agile and natural legged locomotion skills over challenging terrain.

reinforcement-learning

REMAST: Real-time Emotion-based Music Arrangement with Soft Transition

1 code implementation 14 May 2023 ZiHao Wang, Le Ma, Chen Zhang, Bo Han, Yunfei Xu, Yikai Wang, Xinyi Chen, HaoRong Hong, Wenbo Liu, Xinda Wu, Kejun Zhang

Music as an emotional intervention medium has important applications in scenarios such as music therapy, games, and movies.

LeftRefill: Filling Right Canvas based on Left Reference through Generalized Text-to-Image Diffusion Model

3 code implementations 19 May 2023 Chenjie Cao, Yunuo Cai, Qiaole Dong, Yikai Wang, Yanwei Fu

As an exemplar, we leverage LeftRefill to address two different challenges: reference-guided inpainting and novel view synthesis, based on the pre-trained StableDiffusion.

Image Inpainting Image Manipulation +2

ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation

2 code implementations NeurIPS 2023 Zhengyi Wang, Cheng Lu, Yikai Wang, Fan Bao, Chongxuan Li, Hang Su, Jun Zhu

In comparison, VSD works well with various CFG weights as ancestral sampling from diffusion models and simultaneously improves the diversity and sample quality with a common CFG weight (i.e., $7.5$).

Text to 3D

JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models

2 code implementations 9 Aug 2023 Peike Li, BoYu Chen, Yao Yao, Yikai Wang, Allen Wang, Alex Wang

Despite the task's significance, prevailing generative models exhibit limitations in music quality, computational efficiency, and generalization.

Computational Efficiency In-Context Learning +2

Root Pose Decomposition Towards Generic Non-rigid 3D Reconstruction with Monocular Videos

no code implementations ICCV 2023 Yikai Wang, Yinpeng Dong, Fuchun Sun, Xiao Yang

The key idea of our method, Root Pose Decomposition (RPD), is to maintain a per-frame root pose transformation, meanwhile building a dense field with local transformations to rectify the root pose.

3D Reconstruction Object

Coarse-to-Fine Amodal Segmentation with Shape Prior

1 code implementation ICCV 2023 Jianxiong Gao, Xuelin Qian, Yikai Wang, Tianjun Xiao, Tong He, Zheng Zhang, Yanwei Fu

To address this issue, we propose a convolution refine module to inject fine-grained information and provide a more precise amodal object segmentation based on visual features and coarse-predicted segmentation.

Object Segmentation +1

InstructPix2NeRF: Instructed 3D Portrait Editing from a Single Image

1 code implementation 6 Nov 2023 Jianhui Li, Shilong Liu, Zidong Liu, Yikai Wang, Kaiwen Zheng, Jinghui Xu, Jianmin Li, Jun Zhu

With the success of Neural Radiance Field (NeRF) in 3D-aware portrait editing, a variety of works have achieved promising results regarding both quality and 3D consistency.

AnimatableDreamer: Text-Guided Non-rigid 3D Model Generation and Reconstruction with Canonical Score Distillation

no code implementations 6 Dec 2023 Xinzhou Wang, Yikai Wang, Junliang Ye, Zhengyi Wang, Fuchun Sun, Pengkun Liu, Ling Wang, Kai Sun, Xintong Wang, Bin He

At its core, AnimatableDreamer is equipped with our novel optimization design dubbed Canonical Score Distillation (CSD), which simplifies the generation dimension from 4D to 3D by denoising over different frames in the time-varying camera spaces while conducting the distillation process in a unique canonical space shared per video.

Denoising Text to 3D

Towards Context-Stable and Visual-Consistent Image Inpainting

1 code implementation 8 Dec 2023 Yikai Wang, Chenjie Cao, Ke Fan, Xiangyang Xue, Yanwei Fu

Recent progress in inpainting increasingly relies on generative models, leveraging their strong generation capabilities for addressing large irregular masks.

Image Inpainting

Repositioning the Subject within Image

1 code implementation 30 Jan 2024 Yikai Wang, Chenjie Cao, Ke Fan, Qiaole Dong, YiFan Li, Xiangyang Xue, Yanwei Fu

Our research reveals that the fundamental sub-tasks of subject repositioning, which include filling the void left by the repositioned subject, reconstructing obscured portions of the subject and blending the subject to be consistent with surrounding areas, can be effectively reformulated as a unified, prompt-guided inpainting task.

Image Generation Image Manipulation

CRM: Single Image to 3D Textured Mesh with Convolutional Reconstruction Model

no code implementations 8 Mar 2024 Zhengyi Wang, Yikai Wang, Yifei Chen, Chendong Xiang, Shuo Chen, Dajiang Yu, Chongxuan Li, Hang Su, Jun Zhu

In this work, we present the Convolutional Reconstruction Model (CRM), a high-fidelity feed-forward single image-to-3D generative model.

Image to 3D

V3D: Video Diffusion Models are Effective 3D Generators

2 code implementations 11 Mar 2024 Zilong Chen, Yikai Wang, Feng Wang, Zhengyi Wang, Huaping Liu

To fully unleash the potential of video diffusion to perceive the 3D world, we further introduce geometrical consistency prior and extend the video diffusion model to a multi-view consistent 3D generator.

Novel View Synthesis

Isotropic3D: Image-to-3D Generation Based on a Single CLIP Embedding

1 code implementation 15 Mar 2024 Pengkun Liu, Yikai Wang, Fuchun Sun, Jiafang Li, Hang Xiao, Hongxiang Xue, Xinzhou Wang

As a result, with a single image CLIP embedding, Isotropic3D is capable of generating multi-view mutually consistent images and also a 3D model with more symmetrical and neat content, well-proportioned geometry, rich colored texture, and less distortion compared with existing image-to-3D methods while still preserving the similarity to the reference image to a large extent.

Image to 3D Text to 3D

DreamReward: Text-to-3D Generation with Human Preference

no code implementations 21 Mar 2024 Junliang Ye, Fangfu Liu, Qixiu Li, Zhengyi Wang, Yikai Wang, Xinzhou Wang, Yueqi Duan, Jun Zhu

Building upon the 3D reward model, we finally perform theoretical analysis and present the Reward3D Feedback Learning (DreamFL), a direct tuning algorithm to optimize the multi-view diffusion models with a redefined scorer.

Text to 3D text-to-3d-human

NeuroPictor: Refining fMRI-to-Image Reconstruction via Multi-individual Pretraining and Multi-level Modulation

no code implementations 27 Mar 2024 Jingyang Huo, Yikai Wang, Xuelin Qian, Yun Wang, Chong Li, Jianfeng Feng, Yanwei Fu

Recent fMRI-to-image approaches mainly focused on associating fMRI signals with specific conditions of pre-trained diffusion models.
