Search Results for author: Hang Zhou

Found 70 papers, 32 papers with code

Prior Constraints-based Reward Model Training for Aligning Large Language Models

1 code implementation • 1 Apr 2024 • Hang Zhou, Chenglong Wang, Yimin Hu, Tong Xiao, Chunliang Zhang, Jingbo Zhu

Reinforcement learning with human feedback for aligning large language models (LLMs) trains a reward model typically using ranking loss with comparison pairs. However, the training procedure suffers from an inherent problem: the uncontrolled scaling of reward scores during reinforcement learning due to the lack of constraints while training the reward model. This paper proposes a Prior Constraints-based Reward Model (namely PCRM) training method to mitigate this problem.

reinforcement-learning

Paper
Code

Attacking Transformers with Feature Diversity Adversarial Perturbation

no code implementations • 10 Mar 2024 • Chenxing Gao, Hang Zhou, Junqing Yu, Yuteng Ye, Jiale Cai, Junle Wang, Wei Yang

Understanding the mechanisms behind Vision Transformer (ViT), particularly its vulnerability to adversarial perturbations, is crucial for addressing challenges in its real-world applications.

Paper

AVI-Talking: Learning Audio-Visual Instructions for Expressive 3D Talking Face Generation

no code implementations • 25 Feb 2024 • Yasheng Sun, Wenqing Chu, Hang Zhou, Kaisiyuan Wang, Hideki Koike

In this paper, we propose AVI-Talking, an Audio-Visual Instruction system for expressive Talking face generation.

Hallucination Talking Face Generation

Paper

Fine-grained Appearance Transfer with Diffusion Models

1 code implementation • 27 Nov 2023 • Yuteng Ye, Guanwen Li, Hang Zhou, Cai Jiale, Junqing Yu, Yawei Luo, Zikai Song, Qilong Xing, Youjia Zhang, Wei Yang

A pivotal aspect of our approach is the strategic use of the predicted $x_0$ space by diffusion models within the latent space of diffusion processes.

Image-to-Image Translation

Paper
Code

DAE-Net: Deforming Auto-Encoder for fine-grained shape co-segmentation

1 code implementation • 22 Nov 2023 • Zhiqin Chen, Qimin Chen, Hang Zhou, Hao Zhang

To accommodate structural variations in the collection, our network composes each shape by a selected subset of template parts which are affine-transformed.

Paper
Code

DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation

1 code implementation • 28 Sep 2023 • Jiaxiang Tang, Jiawei Ren, Hang Zhou, Ziwei Liu, Gang Zeng

In contrast to the occupancy pruning used in Neural Radiance Fields, we demonstrate that the progressive densification of 3D Gaussians converges significantly faster for 3D generative tasks.

3D Generation

3,636

Paper
Code

Progressive Text-to-Image Diffusion with Soft Latent Direction

1 code implementation • 18 Sep 2023 • Yuteng Ye, Jiale Cai, Hang Zhou, Guanwen Li, Youjia Zhang, Zikai Song, Chenxing Gao, Junqing Yu, Wei Yang

In spite of the rapidly evolving landscape of text-to-image generation, the synthesis and manipulation of multiple entities while adhering to specific relational constraints pose enduring challenges.

Language Modelling Large Language Model +1

Paper
Code

ReliTalk: Relightable Talking Portrait Generation from a Single Video

1 code implementation • 5 Sep 2023 • Haonan Qiu, Zhaoxi Chen, Yuming Jiang, Hang Zhou, Xiangyu Fan, Lei Yang, Wayne Wu, Ziwei Liu

Our key insight is to decompose the portrait's reflectance from implicitly learned audio-driven facial normals and images.

Single-Image Portrait Relighting

Paper
Code

Learning Evaluation Models from Large Language Models for Sequence Generation

no code implementations • 8 Aug 2023 • Chenglong Wang, Hang Zhou, Kaiyan Chang, Tongran Liu, Chunliang Zhang, Quan Du, Tong Xiao, Jingbo Zhu

Large language models achieve state-of-the-art performance on sequence generation evaluation, but typically have a large number of parameters.

Machine Translation Style Transfer +1

Paper

ESRL: Efficient Sampling-based Reinforcement Learning for Sequence Generation

3 code implementations • 4 Aug 2023 • Chenglong Wang, Hang Zhou, Yimin Hu, Yifu Huo, Bei Li, Tongran Liu, Tong Xiao, Jingbo Zhu

Applying Reinforcement Learning (RL) to sequence generation models enables the direct optimization of long-term rewards (e.g., BLEU and human feedback), but typically requires large-scale sampling over a space of action sequences.

Abstractive Text Summarization Language Modelling +5

20,766

Paper
Code

ShaDDR: Interactive Example-Based Geometry and Texture Generation via 3D Shape Detailization and Differentiable Rendering

1 code implementation • 8 Jun 2023 • Qimin Chen, Zhiqin Chen, Hang Zhou, Hao Zhang

Furthermore, we showcase the ability of our method to learn geometric details and textures from shapes reconstructed from real-world photos.

Texture Synthesis

Paper
Code

Detecting Errors in a Numerical Response via any Regression Model

2 code implementations • 26 May 2023 • Hang Zhou, Jonas Mueller, Mayank Kumar, Jane-Ling Wang, Jing Lei

Noise plagues many numerical datasets, where the recorded values in the data may fail to match the true underlying values due to reasons including: erroneous sensors, data entry/processing mistakes, or imperfect human estimates.

regression

8,673

Paper
Code

Building an Invisible Shield for Your Portrait against Deepfakes

no code implementations • 22 May 2023 • Jiazhi Guan, Tianshu Hu, Hang Zhou, Zhizhi Guo, Lirui Deng, Chengbin Quan, Errui Ding, Youjian Zhao

Unlike authentic images, where the hidden messages can be extracted with precision, manipulating the facial attributes through deepfake techniques can disrupt the decoding process.

Face Swapping

Paper

StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-based Generator

no code implementations • CVPR 2023 • Jiazhi Guan, Zhanwang Zhang, Hang Zhou, Tianshu Hu, Kaisiyuan Wang, Dongliang He, Haocheng Feng, Jingtuo Liu, Errui Ding, Ziwei Liu, Jingdong Wang

Despite recent advances in syncing lip movements with any audio waves, current methods still struggle to balance generation quality and the model's generalization ability.

Paper

Delicate Textured Mesh Recovery from NeRF via Adaptive Surface Refinement

1 code implementation • ICCV 2023 • Jiaxiang Tang, Hang Zhou, Xiaokang Chen, Tianshu Hu, Errui Ding, Jingdong Wang, Gang Zeng

Neural Radiance Fields (NeRF) have constituted a remarkable breakthrough in image-based 3D reconstruction.

3D Reconstruction

853

Paper
Code

Make Your Brief Stroke Real and Stereoscopic: 3D-Aware Simplified Sketch to Portrait Generation

no code implementations • 14 Feb 2023 • Yasheng Sun, Qianyi Wu, Hang Zhou, Kaisiyuan Wang, Tianshu Hu, Chen-Chieh Liao, Shio Miyafuji, Ziwei Liu, Hideki Koike

Creating photo-realistic versions of people's sketched portraits is useful for various entertainment purposes.

Paper

Dual Memory Units with Uncertainty Regulation for Weakly Supervised Video Anomaly Detection

1 code implementation • 10 Feb 2023 • Hang Zhou, Junqing Yu, Wei Yang

To address this issue, we propose an Uncertainty Regulated Dual Memory Units (UR-DMU) model to learn both the representations of normal data and discriminative features of abnormal data.

Anomaly Detection Video Anomaly Detection

Paper
Code

Masked Lip-Sync Prediction by Audio-Visual Contextual Exploitation in Transformers

no code implementations • 9 Dec 2022 • Yasheng Sun, Hang Zhou, Kaisiyuan Wang, Qianyi Wu, Zhibin Hong, Jingtuo Liu, Errui Ding, Jingdong Wang, Ziwei Liu, Hideki Koike

This requires masking a large percentage of the original image and seamlessly inpainting it with the aid of audio and reference frames.

Paper

Audio-Driven Co-Speech Gesture Video Generation

no code implementations • 5 Dec 2022 • Xian Liu, Qianyi Wu, Hang Zhou, Yuanqi Du, Wayne Wu, Dahua Lin, Ziwei Liu

Our key insight is that the co-speech gestures can be decomposed into common motion patterns and subtle rhythmic dynamics.

Video Generation

Paper

Ada3Diff: Defending against 3D Adversarial Point Clouds via Adaptive Diffusion

no code implementations • 29 Nov 2022 • Kui Zhang, Hang Zhou, Jie Zhang, Qidong Huang, Weiming Zhang, Nenghai Yu

Deep 3D point cloud models are sensitive to adversarial attacks, which poses threats to safety-critical applications such as autonomous driving.

Autonomous Driving Denoising

Paper

Dynamic Feature Pruning and Consolidation for Occluded Person Re-Identification

1 code implementation • 27 Nov 2022 • Yuteng Ye, Hang Zhou, Jiale Cai, Chenxing Gao, Youjia Zhang, Junle Wang, Qiang Hu, Junqing Yu, Wei Yang

The framework mainly consists of a sparse encoder, a multi-view feature matching module, and a feature consolidation decoder.

Person Re-Identification

Paper
Code

Real-time Neural Radiance Talking Portrait Synthesis via Audio-spatial Decomposition

1 code implementation • 22 Nov 2022 • Jiaxiang Tang, Kaisiyuan Wang, Hang Zhou, Xiaokang Chen, Dongliang He, Tianshu Hu, Jingtuo Liu, Gang Zeng, Jingdong Wang

While dynamic Neural Radiance Fields (NeRF) have shown success in high-fidelity 3D modeling of talking portraits, the slow training and inference speed severely obstruct their potential usage.

Talking Face Generation

823

Paper
Code

Person Text-Image Matching via Text-Feature Interpretability Embedding and External Attack Node Implantation

1 code implementation • 16 Nov 2022 • Fan Li, Hang Zhou, Huafeng Li, Yafei Zhang, Zhengtao Yu

Specifically, we improve the interpretability of text features by providing them with consistent semantic information with image features to achieve the alignment of text and describe image region features. To address the challenges posed by the diversity of text and the corresponding person images, we treat the variation caused by diversity to features as caused by perturbation information and propose a novel adversarial attack and defense method to solve it.

Adversarial Attack Person Search +1

Paper
Code

Learning to Immunize Images for Tamper Localization and Self-Recovery

no code implementations • 28 Oct 2022 • Qichao Ying, Hang Zhou, Zhenxing Qian, Sheng Li, Xinpeng Zhang

Image immunization (Imuge) is a technology of protecting the images by introducing trivial perturbation, so that the protected images are immune to the viruses in that the tampered contents can be auto-recovered.

Paper

TimesNet: Temporal 2D-Variation Modeling for General Time Series Analysis

3 code implementations • 5 Oct 2022 • Haixu Wu, Tengge Hu, Yong Liu, Hang Zhou, Jianmin Wang, Mingsheng Long

TimesBlock can discover the multi-periodicity adaptively and extract the complex temporal variations from transformed 2D tensors by a parameter-efficient inception block.

Action Recognition Anomaly Detection +4

4,388

Paper
Code

StyleSwap: Style-Based Generator Empowers Robust Face Swapping

no code implementations • 27 Sep 2022 • Zhiliang Xu, Hang Zhou, Zhibin Hong, Ziwei Liu, Jiaming Liu, Zhizhi Guo, Junyu Han, Jingtuo Liu, Errui Ding, Jingdong Wang

Our core idea is to leverage a style-based generator to empower high-fidelity and robust face swapping, thus the generator's advantage can be adopted for optimizing identity similarity.

Face Swapping

Paper

PointCAT: Contrastive Adversarial Training for Robust Point Cloud Recognition

no code implementations • 16 Sep 2022 • Qidong Huang, Xiaoyi Dong, Dongdong Chen, Hang Zhou, Weiming Zhang, Kui Zhang, Gang Hua, Nenghai Yu

Notwithstanding the prominent performance achieved in various applications, point cloud recognition models have often suffered from natural corruptions and adversarial perturbations.

Paper

StyleFaceV: Face Video Generation via Decomposing and Recomposing Pretrained StyleGAN3

1 code implementation • 16 Aug 2022 • Haonan Qiu, Yuming Jiang, Hang Zhou, Wayne Wu, Ziwei Liu

Notably, StyleFaceV is capable of generating realistic $1024\times1024$ face videos even without high-resolution training videos.

Image Generation Video Generation

131

Paper
Code

Detecting Deepfake by Creating Spatio-Temporal Regularity Disruption

no code implementations • 21 Jul 2022 • Jiazhi Guan, Hang Zhou, Mingming Gong, Errui Ding, Jingdong Wang, Youjian Zhao

Specifically, by carefully examining the spatial and temporal properties, we propose to disrupt a real video through a Pseudo-fake Generator and create a wide range of pseudo-fake videos for training.

DeepFake Detection Face Swapping

Paper

TokenMix: Rethinking Image Mixing for Data Augmentation in Vision Transformers

1 code implementation • 18 Jul 2022 • Jihao Liu, Boxiao Liu, Hang Zhou, Hongsheng Li, Yu Liu

In this paper, we propose a novel data augmentation technique TokenMix to improve the performance of vision transformers.

Data Augmentation

Paper
Code

Delving into Sequential Patches for Deepfake Detection

no code implementations • 6 Jul 2022 • Jiazhi Guan, Hang Zhou, Zhibin Hong, Errui Ding, Jingdong Wang, Chengbin Quan, Youjian Zhao

Recent advances in face forgery techniques produce nearly visually untraceable deepfake videos, which could be leveraged with malicious intentions.

DeepFake Detection Face Swapping

Paper

Agriculture-Vision Challenge 2022 -- The Runner-Up Solution for Agricultural Pattern Recognition via Transformer-based Models

no code implementations • 23 Jun 2022 • Zhicheng Yang, Jui-Hsin Lai, Jun Zhou, Hang Zhou, Chen Du, Zhongcheng Lai

The Agriculture-Vision Challenge in CVPR is one of the most famous and competitive challenges for global researchers to break the boundary between computer vision and agriculture sectors, aiming at agricultural pattern recognition from aerial images.

Data Augmentation

Paper

MultiEarth 2022 -- The Champion Solution for Image-to-Image Translation Challenge via Generation Models

no code implementations • 17 Jun 2022 • Yuchuan Gou, Bo Peng, Hongchen Liu, Hang Zhou, Jui-Hsin Lai

The MultiEarth 2022 Image-to-Image Translation challenge provides a well-constrained test bed for generating the corresponding RGB Sentinel-2 imagery with the given Sentinel-1 VV & VH imagery.

Image-to-Image Translation Translation

Paper

MultiEarth 2022 -- The Champion Solution for the Matrix Completion Challenge via Multimodal Regression and Generation

no code implementations • 17 Jun 2022 • Bo Peng, Hongchen Liu, Hang Zhou, Yuchuan Gou, Jui-Hsin Lai

Earth observation satellites have been continuously monitoring the earth environment for years at different locations and spectral bands with different modalities.

Earth Observation Matrix Completion +2

Paper

Image Protection for Robust Cropping Localization and Recovery

no code implementations • 6 Jun 2022 • Qichao Ying, Hang Zhou, Xiaoxiao Hu, Zhenxing Qian, Sheng Li, Xinpeng Zhang

Existing image cropping detection schemes ignore that recovering the cropped-out contents can unveil the purpose of the behaved cropping attack.

Image Cropping

Paper

EAMM: One-Shot Emotional Talking Face via Audio-Based Emotion-Aware Motion Model

no code implementations • 30 May 2022 • Xinya Ji, Hang Zhou, Kaisiyuan Wang, Qianyi Wu, Wayne Wu, Feng Xu, Xun Cao

Although significant progress has been made to audio-driven talking face generation, existing methods either neglect facial emotion or cannot be applied to arbitrary subjects.

Talking Face Generation

Paper

Few-Shot Head Swapping in the Wild

no code implementations • CVPR 2022 • Changyong Shu, Hemao Wu, Hang Zhou, Jiaming Liu, Zhibin Hong, Changxing Ding, Junyu Han, Jingtuo Liu, Errui Ding, Jingdong Wang

Particularly, seamless blending is achieved with the help of a Semantic-Guided Color Reference Creation procedure and a Blending UNet.

Face Swapping

Paper

Joint-Modal Label Denoising for Weakly-Supervised Audio-Visual Video Parsing

2 code implementations • 25 Apr 2022 • Haoyue Cheng, Zhaoyang Liu, Hang Zhou, Chen Qian, Wayne Wu, LiMin Wang

This paper focuses on the weakly-supervised audio-visual video parsing task, which aims to recognize all events belonging to each modality and localize their temporal boundaries.

Denoising valid

Paper
Code

SeCo: Separating Unknown Musical Visual Sounds with Consistency Guidance

no code implementations • 25 Mar 2022 • Xinchi Zhou, Dongzhan Zhou, Wanli Ouyang, Hang Zhou, Ziwei Liu, Di Hu

Recent years have witnessed the success of deep learning on the visual sound separation task.

Paper

Learning Hierarchical Cross-Modal Association for Co-Speech Gesture Generation

1 code implementation • CVPR 2022 • Xian Liu, Qianyi Wu, Hang Zhou, Yinghao Xu, Rui Qian, Xinyi Lin, Xiaowei Zhou, Wayne Wu, Bo Dai, Bolei Zhou

To enhance the quality of synthesized gestures, we develop a contrastive learning strategy based on audio-text alignment for better audio representations.

Ranked #3 on Gesture Generation on TED Gesture Dataset

Contrastive Learning Gesture Generation

117

Paper
Code

Shape-invariant 3D Adversarial Point Clouds

1 code implementation • CVPR 2022 • Qidong Huang, Xiaoyi Dong, Dongdong Chen, Hang Zhou, Weiming Zhang, Nenghai Yu

In this paper, we propose a novel Point-Cloud Sensitivity Map to boost both the efficiency and imperceptibility of point perturbations.

Paper
Code

Visual Sound Localization in the Wild by Cross-Modal Interference Erasing

1 code implementation • 13 Feb 2022 • Xian Liu, Rui Qian, Hang Zhou, Di Hu, Weiyao Lin, Ziwei Liu, Bolei Zhou, Xiaowei Zhou

Specifically, we observe that the previous practice of learning only a single audio representation is insufficient due to the additive nature of audio signals.

Paper
Code

STC: Spatio-Temporal Contrastive Learning for Video Instance Segmentation

no code implementations • 8 Feb 2022 • Zhengkai Jiang, Zhangxuan Gu, Jinlong Peng, Hang Zhou, Liang Liu, Yabiao Wang, Ying Tai, Chengjie Wang, Liqing Zhang

In contrast, we present a simple and efficient single-stage VIS framework based on the instance segmentation method CondInst by adding an extra tracking head.

Ranked #36 on Video Instance Segmentation on YouTube-VIS validation

Contrastive Learning Instance Segmentation +3

Paper

Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation

no code implementations • 19 Jan 2022 • Xian Liu, Yinghao Xu, Qianyi Wu, Hang Zhou, Wayne Wu, Bolei Zhou

Moreover, to enable portrait rendering in one unified neural radiance field, a Torso Deformation module is designed to stabilize the large-scale non-rigid torso motions.

Paper

Expressive Talking Head Generation With Granular Audio-Visual Control

no code implementations • CVPR 2022 • Borong Liang, Yan Pan, Zhizhi Guo, Hang Zhou, Zhibin Hong, Xiaoguang Han, Junyu Han, Jingtuo Liu, Errui Ding, Jingdong Wang

Generating expressive talking heads is essential for creating virtual humans.

Talking Head Generation

Paper

SAC-GAN: Structure-Aware Image Composition

1 code implementation • 13 Dec 2021 • Hang Zhou, Rui Ma, Ling-Xiao Zhang, Lin Gao, Ali Mahdavi-Amiri, Hao Zhang

Specifically, our network takes the semantic layout features from the input scene image, features encoded from the edges and silhouette in the input object patch, as well as a latent code as inputs, and generates a 2D spatial affine transform defining the translation and scaling of the object patch.

Image Augmentation Object

Paper
Code

SUB-Depth: Self-distillation and Uncertainty Boosting Self-supervised Monocular Depth Estimation

1 code implementation • 18 Nov 2021 • Hang Zhou, Sarah Taylor, David Greenwood, Michal Mackiewicz

Depth models trained with SUB-Depth outperform the same models trained in a standard single-task SDE framework.

Image Reconstruction Monocular Depth Estimation

Paper
Code

Hyperspectral Mixed Noise Removal via Subspace Representation and Weighted Low-rank Tensor Regularization

no code implementations • 13 Nov 2021 • Hang Zhou, Yanchi Su, Zhanshan Li

Recently, the low-rank property of different components extracted from the image has been considered in many hyperspectral image denoising methods.

Hyperspectral Image Denoising Image Denoising

Paper

From Image to Imuge: Immunized Image Generation

1 code implementation • 27 Oct 2021 • Qichao Ying, Zhenxing Qian, Hang Zhou, Haisheng Xu, Xinpeng Zhang, Siyi Li

At the recipient's side, the verifying network localizes the malicious modifications, and the original content can be approximately recovered by the decoder, despite the presence of the attacks.

Image Cropping Image Generation

Paper
Code

Self-Supervised Monocular Depth Estimation with Internal Feature Fusion

1 code implementation • 18 Oct 2021 • Hang Zhou, David Greenwood, Sarah Taylor

Therefore, it is natural to exploit semantic segmentation networks for depth estimation.

Ranked #5 on Unsupervised Monocular Depth Estimation on KITTI-C

Monocular Depth Estimation Segmentation +2

109

Paper
Code

Hiding Images into Images with Real-world Robustness

no code implementations • 12 Oct 2021 • Qichao Ying, Hang Zhou, Xianhan Zeng, Haisheng Xu, Zhenxing Qian, Xinpeng Zhang

The existing image embedding networks are basically vulnerable to malicious attacks such as JPEG compression and noise adding, not applicable for real-world copyright protection tasks.

Paper

Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation

1 code implementation • CVPR 2021 • Hang Zhou, Yasheng Sun, Wayne Wu, Chen Change Loy, Xiaogang Wang, Ziwei Liu

While speech content information can be defined by learning the intrinsic synchronization between audio-visual modalities, we identify that a pose code will be complementarily learned in a modulated convolution-based reconstruction framework.

Talking Face Generation

903

Paper
Code

Audio-Driven Emotional Video Portraits

1 code implementation • CVPR 2021 • Xinya Ji, Hang Zhou, Kaisiyuan Wang, Wayne Wu, Chen Change Loy, Xun Cao, Feng Xu

In this work, we present Emotional Video Portraits (EVP), a system for synthesizing high-quality video portraits with vivid emotional dynamics driven by audios.

Disentanglement Face Generation

284

Paper
Code

Visually Informed Binaural Audio Generation without Binaural Audios

no code implementations • CVPR 2021 • Xudong Xu, Hang Zhou, Ziwei Liu, Bo Dai, Xiaogang Wang, Dahua Lin

Moreover, combined with binaural recordings, our method is able to further boost the performance of binaural audio generation under supervised settings.

Audio Generation

Paper

Reversible Watermarking in Deep Convolutional Neural Networks for Integrity Authentication

no code implementations • 9 Apr 2021 • Xiquan Guan, Huamin Feng, Weiming Zhang, Hang Zhou, Jie Zhang, Nenghai Yu

Specifically, we present the reversible watermarking problem of deep convolutional neural networks and utilize the pruning theory of model compression technology to construct a host sequence used for embedding watermarking information by histogram shift.

Model Compression

Paper

Adversarial Examples Detection beyond Image Space

1 code implementation • 23 Feb 2021 • Kejiang Chen, Yuefeng Chen, Hang Zhou, Chuan Qin, Xiaofeng Mao, Weiming Zhang, Nenghai Yu

To detect both few-perturbation attacks and large-perturbation attacks, we propose a method beyond image space by a two-stream architecture, in which the image stream focuses on the pixel artifacts and the gradient stream copes with the confidence artifacts.

Paper
Code

LG-GAN: Label Guided Adversarial Network for Flexible Targeted Attack of Point Cloud-based Deep Networks

no code implementations • 1 Nov 2020 • Hang Zhou, Dongdong Chen, Jing Liao, Weiming Zhang, Kejiang Chen, Xiaoyi Dong, Kunlin Liu, Gang Hua, Nenghai Yu

To overcome these shortcomings, this paper proposes a novel label guided adversarial network (LG-GAN) for real-time flexible targeted point cloud attack.

Paper

Discriminability Distillation in Group Representation Learning

no code implementations • ECCV 2020 • Manyuan Zhang, Guanglu Song, Hang Zhou, Yu Liu

We show the discriminability knowledge has good properties that can be distilled by a light-weight distillation network and can be generalized on the unseen target set.

Representation Learning

Paper

Sep-Stereo: Visually Guided Stereophonic Audio Generation by Associating Source Separation

no code implementations • ECCV 2020 • Hang Zhou, Xudong Xu, Dahua Lin, Xiaogang Wang, Ziwei Liu

Stereophonic audio is an indispensable ingredient to enhance human auditory experience.

Audio Generation

Paper

LG-GAN: Label Guided Adversarial Network for Flexible Targeted Attack of Point Cloud Based Deep Networks

no code implementations • CVPR 2020 • Hang Zhou, Dongdong Chen, Jing Liao, Kejiang Chen, Xiaoyi Dong, Kunlin Liu, Weiming Zhang, Gang Hua, Nenghai Yu

To overcome these shortcomings, this paper proposes a novel label guided adversarial network (LG-GAN) for real-time flexible targeted point cloud attack.

Paper

Rotate-and-Render: Unsupervised Photorealistic Face Rotation from Single-View Images

1 code implementation • CVPR 2020 • Hang Zhou, Jihao Liu, Ziwei Liu, Yu Liu, Xiaogang Wang

Though face rotation has achieved rapid progress in recent years, the lack of high-quality paired training data remains a great hurdle for existing methods.

3D Face Modelling Data Augmentation +1

482

Paper
Code

Powerful Speaker Embedding Training Framework by Adversarially Disentangled Identity Representation

no code implementations • 27 Nov 2019 • Jianwei Tai, Hang Zhou, Qingjia Huang, Xiaoqi Jia

The main challenge of speaker verification in the wild is the interference caused by irrelevant information in speech and the lack of speaker labels in speech datasets.

Speaker Verification

Paper

Visual Summarization of Scholarly Videos using Word Embeddings and Keyphrase Extraction

no code implementations • 25 Nov 2019 • Hang Zhou, Christian Otto, Ralph Ewerth

Effective learning with audiovisual content depends on many factors.

Keyphrase Extraction Optical Character Recognition +5

Paper

Self-supervised Adversarial Training

1 code implementation • 15 Nov 2019 • Kejiang Chen, Hang Zhou, Yuefeng Chen, Xiaofeng Mao, Yuhong Li, Yuan He, Hui Xue, Weiming Zhang, Nenghai Yu

Recent work has demonstrated that neural networks are vulnerable to adversarial examples.

Self-Supervised Learning

Paper
Code

A Graph-Based Framework to Bridge Movies and Synopses

no code implementations • ICCV 2019 • Yu Xiong, Qingqiu Huang, Lingfeng Guo, Hang Zhou, Bolei Zhou, Dahua Lin

On top of this dataset, we develop a framework to perform matching between movie segments and synopsis paragraphs.

Paper

Vision-Infused Deep Audio Inpainting

no code implementations • ICCV 2019 • Hang Zhou, Ziwei Liu, Xudong Xu, Ping Luo, Xiaogang Wang

Extensive experiments demonstrate that our framework is capable of inpainting realistic and varying audio segments with or without visual contexts.

Audio inpainting Image Inpainting

Paper

ET-GAN: Cross-Language Emotion Transfer Based on Cycle-Consistent Generative Adversarial Networks

no code implementations • 27 May 2019 • Xiaoqi Jia, Jianwei Tai, Hang Zhou, Yakai Li, Weijuan Zhang, Haichao Du, Qingjia Huang

Despite the remarkable progress made in synthesizing emotional speech from text, it is still challenging to provide emotion information to existing speech segments.

Domain Adaptation Generative Adversarial Network +2

Paper

DUP-Net: Denoiser and Upsampler Network for 3D Adversarial Point Clouds Defense

1 code implementation • ICCV 2019 • Hang Zhou, Kejiang Chen, Weiming Zhang, Han Fang, Wenbo Zhou, Nenghai Yu

We propose a Denoiser and UPsampler Network (DUP-Net) structure as defenses for 3D adversarial point cloud classification, where the two modules reconstruct surface smoothness by dropping or adding points.

Denoising Point Cloud Classification