Search Results for author: Yong Zhang

Found 170 papers, 81 papers with code

SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation

1 code implementation CVPR 2023 Wenxuan Zhang, Xiaodong Cun, Xuan Wang, Yong Zhang, Xi Shen, Yu Guo, Ying Shan, Fei Wang

We present SadTalker, which generates 3D motion coefficients (head pose, expression) of the 3DMM from audio and implicitly modulates a novel 3D-aware face render for talking head generation.

Image Animation Talking Head Generation

VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

1 code implementation 27 Nov 2022 Kun Cheng, Xiaodong Cun, Yong Zhang, Menghan Xia, Fei Yin, Mingrui Zhu, Xuan Wang, Jue Wang, Nannan Wang

Our system disentangles this objective into three sequential tasks: (1) face video generation with a canonical expression; (2) audio-driven lip-sync; and (3) face enhancement for improving photo-realism.

Video Editing Video Generation

VideoCrafter1: Open Diffusion Models for High-Quality Video Generation

3 code implementations 30 Oct 2023 Haoxin Chen, Menghan Xia, Yingqing He, Yong Zhang, Xiaodong Cun, Shaoshu Yang, Jinbo Xing, Yaofang Liu, Qifeng Chen, Xintao Wang, Chao Weng, Ying Shan

The I2V model is designed to produce videos that strictly adhere to the content of the provided reference image, preserving its content, structure, and style.

Text-to-Video Generation Video Generation

VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models

2 code implementations 17 Jan 2024 Haoxin Chen, Yong Zhang, Xiaodong Cun, Menghan Xia, Xintao Wang, Chao Weng, Ying Shan

Based on this stronger coupling, we shift the distribution to higher quality without motion degradation by finetuning spatial modules with high-quality images, resulting in a generic high-quality video model.

Text-to-Video Generation Video Generation

Tencent ML-Images: A Large-Scale Multi-Label Image Database for Visual Representation Learning

1 code implementation 7 Jan 2019 Baoyuan Wu, Weidong Chen, Yanbo Fan, Yong Zhang, Jinlong Hou, Jie Liu, Tong Zhang

In this work, we propose to train CNNs from images annotated with multiple tags, to enhance the quality of visual representation of the trained CNN model.

Image Classification object-detection +5

StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-trained StyleGAN

1 code implementation 8 Mar 2022 Fei Yin, Yong Zhang, Xiaodong Cun, Mingdeng Cao, Yanbo Fan, Xuan Wang, Qingyan Bai, Baoyuan Wu, Jue Wang, Yujiu Yang

Our framework elevates the resolution of the synthesized talking face to 1024*1024 for the first time, even though the training dataset has a lower resolution.

Facial Editing Talking Face Generation +1

OMG: Occlusion-friendly Personalized Multi-concept Generation in Diffusion Models

1 code implementation 16 Mar 2024 Zhe Kong, Yong Zhang, Tianyu Yang, Tao Wang, Kaihao Zhang, Bizhu Wu, GuanYing Chen, Wei Liu, Wenhan Luo

We also observe that the initiation denoising timestep for noise blending is the key to identity preservation and layout.

Denoising Text-to-Image Generation

Next3D: Generative Neural Texture Rasterization for 3D-Aware Head Avatars

2 code implementations CVPR 2023 Jingxiang Sun, Xuan Wang, Lizhen Wang, Xiaoyu Li, Yong Zhang, Hongwen Zhang, Yebin Liu

We propose a novel 3D GAN framework for unsupervised learning of generative, high-quality and 3D-consistent facial avatars from unstructured 2D images.

Face Model

High-Fidelity GAN Inversion for Image Attribute Editing

1 code implementation CVPR 2022 Tengfei Wang, Yong Zhang, Yanbo Fan, Jue Wang, Qifeng Chen

With a low bit-rate latent code, previous works have difficulties in preserving high-fidelity details in reconstructed and edited images.

Attribute Generative Adversarial Network +2

ScaleCrafter: Tuning-free Higher-Resolution Visual Generation with Diffusion Models

1 code implementation 11 Oct 2023 Yingqing He, Shaoshu Yang, Haoxin Chen, Xiaodong Cun, Menghan Xia, Yong Zhang, Xintao Wang, Ran He, Qifeng Chen, Ying Shan

Our work also suggests that a pre-trained diffusion model trained on low-resolution images can be directly used for high-resolution visual generation without further tuning, which may provide insights for future research on ultra-high-resolution image and video synthesis.

Image Generation

Latent Video Diffusion Models for High-Fidelity Long Video Generation

1 code implementation 23 Nov 2022 Yingqing He, Tianyu Yang, Yong Zhang, Ying Shan, Qifeng Chen

Diffusion models have shown remarkable results recently but require significant computational resources.

Denoising Image Generation +3

DPE: Disentanglement of Pose and Expression for General Video Portrait Editing

1 code implementation CVPR 2023 Youxin Pang, Yong Zhang, Weize Quan, Yanbo Fan, Xiaodong Cun, Ying Shan, Dong-Ming Yan

In this paper, we introduce a novel self-supervised disentanglement framework to decouple pose and expression without 3DMMs and paired data, which consists of a motion editing module, a pose generator, and an expression generator.

Disentanglement Talking Face Generation +1

AnimateZero: Video Diffusion Models are Zero-Shot Image Animators

1 code implementation 6 Dec 2023 Jiwen Yu, Xiaodong Cun, Chenyang Qi, Yong Zhang, Xintao Wang, Ying Shan, Jian Zhang

For appearance control, we borrow intermediate latents and their features from the text-to-image (T2I) generation for ensuring the generated first frame is equal to the given generated image.

Image Animation Video Generation

FreeNoise: Tuning-Free Longer Video Diffusion via Noise Rescheduling

3 code implementations 23 Oct 2023 Haonan Qiu, Menghan Xia, Yong Zhang, Yingqing He, Xintao Wang, Ying Shan, Ziwei Liu

With the availability of large-scale video datasets and the advances of diffusion models, text-driven video generation has achieved substantial progress.

Video Generation

Inserting Anybody in Diffusion Models via Celeb Basis

1 code implementation NeurIPS 2023 Ge Yuan, Xiaodong Cun, Yong Zhang, Maomao Li, Chenyang Qi, Xintao Wang, Ying Shan, Huicheng Zheng

Empowered by the proposed celeb basis, the new identity in our customized model showcases a better concept combination ability than previous personalization methods.

TaleCrafter: Interactive Story Visualization with Multiple Characters

1 code implementation 29 May 2023 Yuan Gong, Youxin Pang, Xiaodong Cun, Menghan Xia, Yingqing He, Haoxin Chen, Longyue Wang, Yong Zhang, Xintao Wang, Ying Shan, Yujiu Yang

Accurate Story visualization requires several necessary elements, such as identity consistency across frames, the alignment between plain text and visual content, and a reasonable layout of objects in images.

Story Visualization Text-to-Image Generation

Make-Your-Anchor: A Diffusion-based 2D Avatar Generation Framework

1 code implementation 25 Mar 2024 Ziyao Huang, Fan Tang, Yong Zhang, Xiaodong Cun, Juan Cao, Jintao Li, Tong-Yee Lee

We adopt a two-stage training strategy for the diffusion model, effectively binding movements with specific appearances.

Denoising

DeepfakeBench: A Comprehensive Benchmark of Deepfake Detection

1 code implementation NeurIPS 2023 Zhiyuan Yan, Yong Zhang, Xinhang Yuan, Siwei Lyu, Baoyuan Wu

To fill this gap, we present the first comprehensive benchmark for deepfake detection, called DeepfakeBench, which offers three key contributions: 1) a unified data management system to ensure consistent input across all detectors, 2) an integrated framework for state-of-the-art methods implementation, and 3) standardized evaluation metrics and protocols to promote transparency and reproducibility.

DeepFake Detection Face Swapping

ReliableSwap: Boosting General Face Swapping Via Reliable Supervision

1 code implementation 8 Jun 2023 Ge Yuan, Maomao Li, Yong Zhang, Huicheng Zheng

To avoid the potential artifacts and drive the distribution of the network output close to the natural one, we reversely take synthetic images as input while the real face as reliable supervision during the training stage of face swapping.

Face Reenactment Face Swapping

StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style Adapter

2 code implementations 1 Dec 2023 Gongye Liu, Menghan Xia, Yong Zhang, Haoxin Chen, Jinbo Xing, Xintao Wang, Yujiu Yang, Ying Shan

To address these challenges, we introduce StyleCrafter, a generic method that enhances pre-trained T2V models with a style control adapter, enabling video generation in any style by providing a reference image.

Disentanglement Text-to-Video Generation +1

Boosting the Transferability of Adversarial Attacks with Reverse Adversarial Perturbation

3 code implementations 12 Oct 2022 Zeyu Qin, Yanbo Fan, Yi Liu, Li Shen, Yong Zhang, Jue Wang, Baoyuan Wu

Furthermore, RAP can be naturally combined with many existing black-box attack techniques, to further boost the transferability.

Adversarial Attack

Self-supervised Learning of Adversarial Example: Towards Good Generalizations for Deepfake Detection

1 code implementation CVPR 2022 Liang Chen, Yong Zhang, Yibing Song, Lingqiao Liu, Jue Wang

Following this principle, we propose to enrich the "diversity" of forgeries by synthesizing augmented forgeries with a pool of forgery configurations and strengthen the "sensitivity" to the forgeries by enforcing the model to predict the forgery configurations.

DeepFake Detection Face Swapping +1

DiffStyler: Controllable Dual Diffusion for Text-Driven Image Stylization

1 code implementation 19 Nov 2022 Nisha Huang, Yuxin Zhang, Fan Tang, Chongyang Ma, Haibin Huang, Yong Zhang, WeiMing Dong, Changsheng Xu

Despite the impressive results of arbitrary image-guided style transfer methods, text-driven image stylization has recently been proposed for transferring a natural image into a stylized one according to textual descriptions of the target style provided by the user.

Denoising Image Stylization

LAS-AT: Adversarial Training with Learnable Attack Strategy

1 code implementation CVPR 2022 Xiaojun Jia, Yong Zhang, Baoyuan Wu, Ke Ma, Jue Wang, Xiaochun Cao

In this paper, we propose a novel framework for adversarial training by introducing the concept of "learnable attack strategy", dubbed LAS-AT, which learns to automatically produce attack strategies to improve the model robustness.

Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation

1 code implementation ICCV 2023 Zunnan Xu, Zhihong Chen, Yong Zhang, Yibing Song, Xiang Wan, Guanbin Li

Parameter Efficient Tuning (PET) has gained attention for reducing the number of parameters while maintaining performance and providing better hardware resource savings, but few studies investigate dense prediction tasks and interaction between modalities.

Image Segmentation Referring Expression Segmentation +2

An Effective and Robust Detector for Logo Detection

2 code implementations 1 Aug 2021 Xiaojun Jia, Huanqian Yan, Yonglin Wu, Xingxing Wei, Xiaochun Cao, Yong Zhang

Moreover, we have applied the proposed methods to competition ACM MM2021 Robust Logo Detection that is organized by Alibaba on the Tianchi platform and won top 2 in 36489 teams.

Data Augmentation

CoordFill: Efficient High-Resolution Image Inpainting via Parameterized Coordinate Querying

1 code implementation 15 Mar 2023 Weihuang Liu, Xiaodong Cun, Chi-Man Pun, Menghan Xia, Yong Zhang, Jue Wang

Thanks to the proposed structure, we only encode the high-resolution image in a relatively low resolution for larger reception field capturing.

Image Inpainting Vocal Bursts Intensity Prediction

EvalCrafter: Benchmarking and Evaluating Large Video Generation Models

1 code implementation 17 Oct 2023 Yaofang Liu, Xiaodong Cun, Xuebo Liu, Xintao Wang, Yong Zhang, Haoxin Chen, Yang Liu, Tieyong Zeng, Raymond Chan, Ying Shan

For video generation, various open-sourced models and public-available services have been developed to generate high-quality videos.

Benchmarking Language Modelling +4

Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation

1 code implementation 16 Feb 2024 Lanqing Guo, Yingqing He, Haoxin Chen, Menghan Xia, Xiaodong Cun, YuFei Wang, Siyu Huang, Yong Zhang, Xintao Wang, Qifeng Chen, Ying Shan, Bihan Wen

Diffusion models have proven to be highly effective in image and video generation; however, they still face composition challenges when generating images of varying sizes due to single-scale training data.

Video Generation

Exploring Structure-aware Transformer over Interaction Proposals for Human-Object Interaction Detection

1 code implementation CVPR 2022 Yong Zhang, Yingwei Pan, Ting Yao, Rui Huang, Tao Mei, Chang-Wen Chen

Such design decomposes the process of HOI set prediction into two subsequent phases, i.e., an interaction proposal generation is first performed, and then followed by transforming the non-parametric interaction proposals into HOI predictions via a structure-aware Transformer.

Human-Object Interaction Detection Object

VDTR: Video Deblurring with Transformer

1 code implementation 17 Apr 2022 Mingdeng Cao, Yanbo Fan, Yong Zhang, Jue Wang, Yujiu Yang

For multi-frame temporal modeling, we adapt Transformer to fuse multiple spatial features efficiently.

Deblurring Video Restoration

E4SRec: An Elegant Effective Efficient Extensible Solution of Large Language Models for Sequential Recommendation

1 code implementation 5 Dec 2023 Xinhang Li, Chong Chen, Xiangyu Zhao, Yong Zhang, Chunxiao Xing

Furthermore, practical ID-based recommendation strategies, reliant on a huge number of unique identities (IDs) to represent users and items, have gained prominence in real-world recommender systems due to their effectiveness and efficiency.

Sequential Recommendation Text Generation

Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning

1 code implementation 1 Feb 2024 Ming Li, Yong Zhang, Shwai He, Zhitao Li, Hongyu Zhao, Jianzong Wang, Ning Cheng, Tianyi Zhou

Data filtering for instruction tuning has proved important in improving both the efficiency and performance of the tuning process.

Language Modelling

IMF: Interactive Multimodal Fusion Model for Link Prediction

1 code implementation 20 Mar 2023 Xinhang Li, Xiangyu Zhao, Jiaxing Xu, Yong Zhang, Chunxiao Xing

To this end, we propose a two-stage multimodal fusion framework to preserve modality-specific knowledge as well as take advantage of the complementarity between different modalities.

Contrastive Learning Knowledge Graphs +1

Generalizable Black-Box Adversarial Attack with Meta Learning

1 code implementation 1 Jan 2023 Fei Yin, Yong Zhang, Baoyuan Wu, Yan Feng, Jingyi Zhang, Yanbo Fan, Yujiu Yang

In the scenario of black-box adversarial attack, the target model's parameters are unknown, and the attacker aims to find a successful adversarial perturbation based on query feedback under a query budget.

Adversarial Attack Meta-Learning

Sparse Adversarial Attack via Perturbation Factorization

1 code implementation ECCV 2020 Yanbo Fan, Baoyuan Wu, Tuanhui Li, Yong Zhang, Mingyang Li, Zhifeng Li, Yujiu Yang

Based on this factorization, we formulate the sparse attack problem as a mixed integer programming (MIP) to jointly optimize the binary selection factors and continuous perturbation magnitudes of all pixels, with a cardinality constraint on selection factors to explicitly control the degree of sparsity.

Adversarial Attack

Semi-Autoregressive Transformer for Image Captioning

1 code implementation 17 Jun 2021 Yuanen Zhou, Yong Zhang, Zhenzhen Hu, Meng Wang

To tackle this issue, non-autoregressive image captioning models have recently been proposed to significantly accelerate the speed of inference by generating all words in parallel.

Image Captioning

ChatTraffic: Text-to-Traffic Generation via Diffusion Model

1 code implementation 27 Nov 2023 Chengyang Zhang, Yong Zhang, Qitan Shao, Bo Li, Yisheng Lv, Xinglin Piao, BaoCai Yin

The key challenge of the TTG task is how to associate text with the spatial structure of the road network and traffic data for generating traffic situations.

Traffic Prediction

BjTT: A Large-scale Multimodal Dataset for Traffic Prediction

2 code implementations 8 Mar 2024 Chengyang Zhang, Yong Zhang, Qitan Shao, Jiangtao Feng, Bo Li, Yisheng Lv, Xinglin Piao, BaoCai Yin

The key challenge of the TTG task is how to associate text with the spatial structure of the road network and traffic data for generating traffic situations.

Traffic Prediction

DAE-GAN: Dynamic Aspect-aware GAN for Text-to-Image Synthesis

1 code implementation ICCV 2021 Shulan Ruan, Yong Zhang, Kun Zhang, Yanbo Fan, Fan Tang, Qi Liu, Enhong Chen

Text-to-image synthesis refers to generating an image from a given text description, the key goal of which lies in photo realism and semantic consistency.

Image Generation Sentence +2

Improved Test-Time Adaptation for Domain Generalization

1 code implementation CVPR 2023 Liang Chen, Yong Zhang, Yibing Song, Ying Shan, Lingqiao Liu

Generally, a TTT strategy hinges its performance on two main factors: selecting an appropriate auxiliary TTT task for updating and identifying reliable parameters to update during the test phase.

Domain Generalization Test-time Adaptation

NL4Opt Competition: Formulating Optimization Problems Based on Their Natural Language Descriptions

1 code implementation 14 Mar 2023 Rindranirina Ramamonjison, Timothy T. Yu, Raymond Li, Haley Li, Giuseppe Carenini, Bissan Ghaddar, Shiqi He, Mahdi Mostajabdaveh, Amin Banitalebi-Dehkordi, Zirui Zhou, Yong Zhang

The Natural Language for Optimization (NL4Opt) Competition was created to investigate methods of extracting the meaning and formulation of an optimization problem based on its text description.

Language Modelling Large Language Model

Domain Generalization via Rationale Invariance

1 code implementation ICCV 2023 Liang Chen, Yong Zhang, Yibing Song, Anton Van Den Hengel, Lingqiao Liu

Specifically, we propose treating the element-wise contributions to the final results as the rationale for making a decision and representing the rationale for each sample as a matrix.

Decision Making Domain Generalization

Targeted Attack against Deep Neural Networks via Flipping Limited Weight Bits

2 code implementations ICLR 2021 Jiawang Bai, Baoyuan Wu, Yong Zhang, Yiming Li, Zhifeng Li, Shu-Tao Xia

By utilizing the latest technique in integer programming, we equivalently reformulate this BIP problem as a continuous optimization problem, which can be effectively and efficiently solved using the alternating direction method of multipliers (ADMM) method.

Backdoor Attack

Prior-Guided Adversarial Initialization for Fast Adversarial Training

1 code implementation 18 Jul 2022 Xiaojun Jia, Yong Zhang, Xingxing Wei, Baoyuan Wu, Ke Ma, Jue Wang, Xiaochun Cao

Based on the observation, we propose a prior-guided FGSM initialization method to avoid overfitting after investigating several initialization strategies, improving the quality of the AEs during the whole training process.

Adversarial Attack Adversarial Attack on Video Classification

CaEGCN: Cross-Attention Fusion based Enhanced Graph Convolutional Network for Clustering

1 code implementation 18 Jan 2021 Guangyu Huo, Yong Zhang, Junbin Gao, Boyue Wang, Yongli Hu, BaoCai Yin

In this paper, we propose a cross-attention based deep clustering framework, named Cross-Attention Fusion based Enhanced Graph Convolutional Network (CaEGCN), which contains four main modules: the cross-attention fusion module which innovatively concatenates the Content Auto-encoder module (CAE) relating to the individual data and Graph Convolutional Auto-encoder module (GAE) relating to the relationship between the data in a layer-by-layer manner, and the self-supervised model that highlights the discriminative information for clustering tasks.

Clustering Deep Clustering

A deep learning-based remaining useful life prediction approach for bearings

1 code implementation 8 Dec 2018 Cheng Cheng, Guijun Ma, Yong Zhang, Mingyang Sun, Fei Teng, Han Ding, Ye Yuan

In industrial applications, nearly half the failures of motors are caused by the degradation of rolling element bearings (REBs).

Exact Adversarial Attack to Image Captioning via Structured Output Learning with Latent Variables

1 code implementation CVPR 2019 Yan Xu, Baoyuan Wu, Fumin Shen, Yanbo Fan, Yong Zhang, Heng Tao Shen, Wei Liu

Due to the sequential dependencies among words in a caption, we formulate the generation of adversarial noises for targeted partial captions as a structured output learning problem with latent variables.

Adversarial Attack Image Captioning

Online Portfolio Management via Deep Reinforcement Learning with High-Frequency Data

1 code implementation Information Processing & Management 2023 Jiahao Li, Yong Zhang, Xingyu Yang, LiangWei Chen

In addition, while the vast majority of SOTA strategies maintain a poor turnover rate of approximately greater than 50% on average, our framework enjoys a relatively low turnover rate on all datasets. Efficiency analysis illustrates that our framework no longer has the quadratic dependency limitation.

Management Policy Gradient Methods +3

Auto-Split: A General Framework of Collaborative Edge-Cloud AI

1 code implementation 30 Aug 2021 Amin Banitalebi-Dehkordi, Naveen Vedula, Jian Pei, Fei Xia, Lanjun Wang, Yong Zhang

At the same time, large amounts of input data are collected at the edge of cloud.

Towards Real-World Video Deblurring by Exploring Blur Formation Process

1 code implementation 28 Aug 2022 Mingdeng Cao, Zhihang Zhong, Yanbo Fan, Jiahao Wang, Yong Zhang, Jue Wang, Yujiu Yang, Yinqiang Zheng

We believe the novel realistic synthesis pipeline and the corresponding RAW video dataset can help the community to easily construct customized blur datasets to improve real-world video deblurring performance largely, instead of laboriously collecting real data pairs.

Deblurring

Unsupervised Sentence Textual Similarity with Compositional Phrase Semantics

1 code implementation COLING 2022 ZiHao Wang, Jiaheng Dou, Yong Zhang

Measuring Sentence Textual Similarity (STS) is a classic task that can be applied to many downstream NLP applications such as text generation and retrieval.

Retrieval Sentence +3

Generating Self-Contained and Summary-Centric Question Answer Pairs via Differentiable Reward Imitation Learning

1 code implementation EMNLP 2021 Li Zhou, Kevin Small, Yong Zhang, Sandeep Atluri

Motivated by suggested question generation in conversational news recommendation systems, we propose a model for generating question-answer pairs (QA pairs) with self-contained, summary-centric questions and length-constrained, article-summarizing answers.

Imitation Learning News Recommendation +4

GasHisSDB: A New Gastric Histopathology Image Dataset for Computer Aided Diagnosis of Gastric Cancer

1 code implementation 4 Jun 2021 Weiming Hu, Chen Li, Xiaoyan Li, Md Mamunur Rahaman, Jiquan Ma, Yong Zhang, HaoYuan Chen, Wanli Liu, Changhao Sun, YuDong Yao, Hongzan Sun, Marcin Grzegorzek

In order to prove that the methods of different periods in the field of image classification have discrepancies on GasHisSDB, we select a variety of classifiers for evaluation.

BIG-bench Machine Learning Image Classification +1

Few-shot Object Localization

1 code implementation 19 Mar 2024 Yunhan Ren, Bo Li, Chengyang Zhang, Yong Zhang, BaoCai Yin

This task achieves generalized object localization by leveraging a small number of labeled support samples to query the positional information of objects within corresponding images.

Model Optimization Object +2

Estimating Visual Information From Audio Through Manifold Learning

1 code implementation 3 Aug 2022 Fabrizio Pedersoli, Dryden Wiebe, Amin Banitalebi, Yong Zhang, George Tzanetakis, Kwang Moo Yi

Therefore, audio-based methods can be useful even for applications in which only visual information is of interest. Our framework is based on Manifold Learning and consists of two steps.

Semantic Segmentation

AI in the Gray: Exploring Moderation Policies in Dialogic Large Language Models vs. Human Answers in Controversial Topics

1 code implementation 28 Aug 2023 Vahid Ghafouri, Vibhor Agarwal, Yong Zhang, Nishanth Sastry, Jose Such, Guillermo Suarez-Tangil

The introduction of ChatGPT and the subsequent improvement of Large Language Models (LLMs) have prompted more and more individuals to turn to the use of ChatBots, both for information and assistance with decision-making.

Decision Making

A Causal Inspired Early-Branching Structure for Domain Generalization

1 code implementation 13 Mar 2024 Liang Chen, Yong Zhang, Yibing Song, Zhen Zhang, Lingqiao Liu

By d-separation, we observe that the causal feature can be further characterized by being independent of the domain conditioned on the object, and we propose the following two strategies as complements for the basic framework.

Domain Generalization

EBJR: Energy-Based Joint Reasoning for Adaptive Inference

1 code implementation 20 Oct 2021 Mohammad Akbari, Amin Banitalebi-Dehkordi, Yong Zhang

To this end, we propose an Energy-Based Joint Reasoning (EBJR) framework that adaptively distributes the samples between shallow and deep models to achieve an accuracy close to the deep model, but latency close to the shallow one.

Model Composition: Can Multiple Neural Networks Be Combined into a Single Network Using Only Unlabeled Data?

1 code implementation 20 Oct 2021 Amin Banitalebi-Dehkordi, Xinyu Kang, Yong Zhang

As an attempt to mitigate this dilemma, this paper investigates the idea of combining multiple trained neural networks using unlabeled data.

object-detection Object Detection

Repaint: Improving the Generalization of Down-Stream Visual Tasks by Generating Multiple Instances of Training Examples

1 code implementation 20 Oct 2021 Amin Banitalebi-Dehkordi, Yong Zhang

Through an extensive set of experiments, we demonstrate the usefulness of the repainted examples in training, for the tasks of image classification (ImageNet) and object detection (COCO), over several state-of-the-art network architectures at different capacities, and across different data availability regimes.

Colorization Image Classification +2

Targeted Advertising Based on Browsing History

no code implementations 13 Nov 2017 Yong Zhang, Hongming Zhou, Nganmeng Tan, Saeed Bagheri, Meng Joo Er

Audience interest, demography, purchase behavior and other possible classifications are extremely important factors to be carefully studied in a targeting campaign.

Schatten-$p$ Quasi-Norm Regularized Matrix Optimization via Iterative Reweighted Singular Value Minimization

no code implementations 5 Jan 2014 Zhaosong Lu, Yong Zhang

In particular, we first introduce a class of first-order stationary points for them, and show that the first-order stationary points introduced in [11] for an SPQN regularized $vector$ minimization problem are equivalent to those of an SPQN regularized $matrix$ minimization reformulation.

Siamese Networks for Semantic Pattern Similarity

no code implementations 17 Dec 2018 Yassine Benajiba, Jin Sun, Yong Zhang, Longquan Jiang, Zhiliang Weng, Or Biran

Semantic Pattern Similarity is an interesting, though not often encountered NLP task where two sentences are compared not by their specific meaning, but by their more abstract semantic pattern (e.g., preposition or frame).

Question Answering

Unifying Topic, Sentiment & Preference in an HDP-Based Rating Regression Model for Online Reviews

1 code implementation 19 Dec 2018 Zheng Chen, Yong Zhang, Yue Shang, Xiaohua Hu

TSPRA combines topics (i.e., product aspects), word sentiment and user preference as regression factors, and is able to perform topic clustering, review rating prediction, sentiment analysis and what we invent as "critical aspect" analysis altogether in one framework.

Clustering Collaborative Filtering +3

The Sentimental Value of Chinese Sub-Character Components

no code implementations WS 2017 Yassine Benajiba, Or Biran, Zhiliang Weng, Yong Zhang, Jin Sun

Sub-character components of Chinese characters carry important semantic information, and recent studies have shown that utilizing this information can improve performance on core semantic tasks.

Sentiment Analysis Word Embeddings

MainiwayAI at IJCNLP-2017 Task 2: Ensembles of Deep Architectures for Valence-Arousal Prediction

no code implementations IJCNLP 2017 Yassine Benajiba, Jin Sun, Yong Zhang, Zhiliang Weng, Or Biran

This paper introduces Mainiway AI Labs' submitted system for the IJCNLP 2017 shared task on Dimensional Sentiment Analysis of Chinese Phrases (DSAP), and related experiments.

Sentiment Analysis Task 2 +1

Penalty Decomposition Methods for Rank Minimization

no code implementations NeurIPS 2011 Yong Zhang, Zhaosong Lu

In this paper we consider general rank minimization problems with rank appearing in either objective function or constraint.

Matrix Completion

Weakly-Supervised Deep Convolutional Neural Network Learning for Facial Action Unit Intensity Estimation

no code implementations CVPR 2018 Yong Zhang, Wei-Ming Dong, Bao-Gang Hu, Qiang Ji

Facial action unit (AU) intensity estimation plays an important role in affective computing and human-computer interaction.

Classifier Learning With Prior Probabilities for Facial Action Unit Recognition

no code implementations CVPR 2018 Yong Zhang, Wei-Ming Dong, Bao-Gang Hu, Qiang Ji

To alleviate this issue, we propose a knowledge-driven method for jointly learning multiple AU classifiers without any AU annotation by leveraging prior probabilities on AUs, including expression-independent and expression-dependent AU probabilities.

Anatomy Facial Action Unit Detection

Bilateral Ordinal Relevance Multi-Instance Regression for Facial Action Unit Intensity Estimation

no code implementations CVPR 2018 Yong Zhang, Rui Zhao, Wei-Ming Dong, Bao-Gang Hu, Qiang Ji

The majority of methods directly apply supervised learning techniques to AU intensity estimation while few methods exploit unlabeled samples to improve the performance.

regression

Wasserstein-Fisher-Rao Document Distance

no code implementations 23 Apr 2019 Zihao Wang, Datong Zhou, Yong Zhang, Hao Wu, Chenglong Bao

As a fundamental problem of natural language processing, it is important to measure the distance between different documents.

Semantic Similarity Semantic Textual Similarity

Improving Distributed Similarity Join in Metric Space with Error-bounded Sampling

no code implementations 15 May 2019 Jiacheng Wu, Yong Zhang, Jin Wang, Chunbin Lin, Yingjia Fu, Chunxiao Xing

To address the limitation, we propose SP-Join, an end-to-end framework to support distributed similarity join in metric space based on the MapReduce paradigm, which (i) employs an estimation-based stratified sampling method to produce pivots with quality guarantees for any sample size, and (ii) devises an effective cost model as the guideline to split the whole datasets into partition in map and reduce phases according to the sampled pivots.

Databases

Joint Representation and Estimator Learning for Facial Action Unit Intensity Estimation

no code implementations CVPR 2019 Yong Zhang, Baoyuan Wu, Weiming Dong, Zhifeng Li, Wei Liu, Bao-Gang Hu, Qiang Ji

Accurate AU intensity estimation depends on three major elements: image representation, intensity estimator, and supervisory information.

A Novel Deep Neural Network Based Approach for Sparse Code Multiple Access

no code implementations 4 Jun 2019 Jinzhi Lin, Shengzhong Feng, Zhile Yang, Yun Zhang, Yong Zhang

Furthermore, by manipulating the mapping vectors, an autoencoder is able to generalize SCMA, thus a dense code multiple access (DCMA) scheme is proposed.

LAC-Nav: Collision-Free Mutiagent Navigation Based on The Local Action Cells

no code implementations 12 Nov 2019 Li Ning, Yong Zhang

Based on the realtime updated local action cells, we propose the LAC-Nav approach to navigate the agent with the properly selected velocity; and furthermore, we coupled the local action cell with an adaptive learning framework, in which the effect of selections are evaluated and used as the references for making decisions in the following updates.

Collision Avoidance Navigate

Ensemble emotion recognizing with multiple modal physiological signals

no code implementations 1 Jan 2020 Jing Zhang, Yong Zhang, Suhua Zhan, Cheng Cheng

Multiple physiological signals fusing models, building the uniform classification model by means of consistent and complementary information from different emotions to improve recognition performance.

Classification EEG +3

Structural-Aware Sentence Similarity with Recursive Optimal Transport

no code implementations 28 Jan 2020 Zihao Wang, Yong Zhang, Hao Wu

Moreover, we further develop Recursive Optimal Similarity (ROTS) for sentences with the valuable semantic insights from the connections between cosine similarity of weighted average of word vectors and optimal transport.

Sentence Sentence Similarity +1

Controllable Descendant Face Synthesis

no code implementations 26 Feb 2020 Yong Zhang, Le Li, Zhilei Liu, Baoyuan Wu, Yanbo Fan, Zhifeng Li

Most of the existing methods train models for one-versus-one kin relation, which only consider one parent face and one child face by directly using an auto-encoder without any explicit control over the resemblance of the synthesized face to the parent face.

Attribute Face Generation +1

Effective and Robust Detection of Adversarial Examples via Benford-Fourier Coefficients

no code implementations 12 May 2020 Chengcheng Ma, Baoyuan Wu, Shibiao Xu, Yanbo Fan, Yong Zhang, Xiaopeng Zhang, Zhifeng Li

In this work, we study the detection of adversarial examples, based on the assumption that the output and internal responses of one DNN model for both adversarial and benign examples follow the generalized Gaussian distribution (GGD), but with different parameters (i.e., shape factor, mean, and variance).

Image Classification

Too Much Information Kills Information: A Clustering Perspective

no code implementations 16 Sep 2020 Yicheng Xu, Vincent Chau, Chenchen Wu, Yong Zhang, Vassilis Zissimopoulos, Yifei Zou

Clustering is one of the most fundamental tools in the artificial intelligence area, particularly in the pattern recognition and learning theory.

Clustering Learning Theory

Adaptive Tree Wasserstein Minimization for Hierarchical Generative Modeling

no code implementations 1 Jan 2021 ZiHao Wang, Xu Zhao, Tam Le, Hao Wu, Yong Zhang, Makoto Yamada

In this work, we consider OT over tree metrics, which is more general than the sliced Wasserstein and includes the sliced Wasserstein as a special case, and we propose a fast minimization algorithm in $O(n)$ for the optimal Wasserstein-1 transport plan between two distributions in the tree structure.

Unsupervised Domain Adaptation

Semi-Supervised Bilingual Lexicon Induction with Two-way Interaction

1 code implementation EMNLP 2020 Xu Zhao, ZiHao Wang, Hao Wu, Yong Zhang

In this paper, we propose a new semi-supervised BLI framework to encourage the interaction between the supervised signal and unsupervised alignment.

Bilingual Lexicon Induction Vocal Bursts Valence Prediction

A Relaxed Matching Procedure for Unsupervised BLI

no code implementations ACL 2020 Xu Zhao, ZiHao Wang, Hao Wu, Yong Zhang

Recently unsupervised Bilingual Lexicon Induction (BLI) without any parallel corpus has attracted much research interest.

Bilingual Lexicon Induction Translation

Dual ResGCN for Balanced Scene Graph Generation

no code implementations 9 Nov 2020 Jingyi Zhang, Yong Zhang, Baoyuan Wu, Yanbo Fan, Fumin Shen, Heng Tao Shen

We propose to incorporate the prior about the co-occurrence of relation pairs into the graph to further help alleviate the class imbalance issue.

Graph Generation Relation +1

Automated Prostate Cancer Diagnosis Based on Gleason Grading Using Convolutional Neural Network

no code implementations 29 Nov 2020 Haotian Xie, Yong Zhang, Jun Wang, Jingjing Zhang, Yifan Ma, Zhaogang Yang

The Gleason grading system using histological images is the most powerful diagnostic and prognostic predictor of prostate cancer.

Data Augmentation Image Reconstruction

1.23-Tb/s per Wavelength Single-Waveguide On-Chip Optical Interconnect Enabled by Mode-division Multiplexing

no code implementations 17 Oct 2020 Hanzi Huang, Yetian Huang, Yu He, Haoshuo Chen, Yong Zhang, Qianwu Zhang, Nicolas K. Fontaine, Roland Ryf, Yingxiong Song, Yikai Su

We experimentally demonstrate a record net capacity per wavelength of 1.23 Tb/s over a single silicon-on-insulator (SOI) multimode waveguide for optical interconnects employing on-chip mode-division multiplexing and 11$\times$11 multiple-in-multiple-out (MIMO) digital signal processing.

QoS-aware Link Scheduling Strategy for Data Transmission in SDVN

no code implementations 1 Feb 2021 Yong Zhang, Mao Ye, Lin Guan

The original contributions of this paper are summarized as follows: (1) Model the packets collision probability of broadcast or NACK transmission in VANET with the combination theory and investigate the potential influence of miss my packets (MMP) problem.

Networking and Internet Architecture

Generalizing Face Forgery Detection with High-frequency Features

no code implementations CVPR 2021 Yuchen Luo, Yong Zhang, Junchi Yan, Wei Liu

The second is the residual-guided spatial attention module that guides the low-level RGB feature extractor to concentrate more on forgery traces from a new perspective.

Vocal Bursts Intensity Prediction

Robust Counterfactual Explanations on Graph Neural Networks

no code implementations NeurIPS 2021 Mohit Bajaj, Lingyang Chu, Zi Yu Xue, Jian Pei, Lanjun Wang, Peter Cho-Ho Lam, Yong Zhang

Massive deployment of Graph Neural Networks (GNNs) in high-stake applications generates a strong demand for explanations that are robust to noise and align well with human intuition.

counterfactual

Finding Representative Interpretations on Convolutional Neural Networks

no code implementations ICCV 2021 Peter Cho-Ho Lam, Lingyang Chu, Maxim Torgonskiy, Jian Pei, Yong Zhang, Lanjun Wang

Interpreting the decision logic behind effective deep convolutional neural networks (CNN) on images complements the success of deep learning models.

Data Pricing in Machine Learning Pipelines

no code implementations 18 Aug 2021 Zicun Cong, Xuan Luo, Pei Jian, Feida Zhu, Yong Zhang

We also investigate pricing in the step of collaborative training of machine learning models, and overview pricing machine learning models for end users in the step of machine learning deployment.

BIG-bench Machine Learning

An Optimal Resource Allocator of Elastic Training for Deep Learning Jobs on Cloud

no code implementations 8 Sep 2021 Liang Hu, Jiangcheng Zhu, Zirui Zhou, Ruiqing Cheng, Xiaolong Bai, Yong Zhang

Cloud training platforms, such as Amazon Web Services and Huawei Cloud provide users with computational resources to train their deep learning jobs.

Decision Making

FedFair: Training Fair Models In Cross-Silo Federated Learning

no code implementations 13 Sep 2021 Lingyang Chu, Lanjun Wang, Yanjie Dong, Jian Pei, Zirui Zhou, Yong Zhang

In this paper, we first propose a federated estimation method to accurately estimate the fairness of a model without infringing the data privacy of any party.

Fairness Federated Learning

Achieving Model Fairness in Vertical Federated Learning

1 code implementation 17 Sep 2021 Changxin Liu, Zhenan Fan, Zirui Zhou, Yang Shi, Jian Pei, Lingyang Chu, Yong Zhang

To solve it in a federated and privacy-preserving manner, we consider the equivalent dual form of the problem and develop an asynchronous gradient coordinate-descent ascent algorithm, where some active data parties perform multiple parallelized local updates per communication round to effectively reduce the number of communication rounds.

BIG-bench Machine Learning Fairness +2

Robust Physical-World Attacks on Face Recognition

no code implementations 20 Sep 2021 Xin Zheng, Yanbo Fan, Baoyuan Wu, Yong Zhang, Jue Wang, Shirui Pan

Face recognition has been greatly facilitated by the development of deep neural networks (DNNs) and has been widely applied to many safety-critical applications.

Adversarial Attack Adversarial Robustness +1

Meta-Attack: Class-Agnostic and Model-Agnostic Physical Adversarial Attack

no code implementations ICCV 2021 Weiwei Feng, Baoyuan Wu, Tianzhu Zhang, Yong Zhang, Yongdong Zhang

To tackle these issues, we propose a class-agnostic and model-agnostic physical adversarial attack model (Meta-Attack), which is able to not only generate robust physical adversarial examples by simulating color and shape distortions, but also generalize to attacking novel images and novel DNN models by accessing a few digital and physical images.

Adversarial Attack Few-Shot Learning

Boosting Fast Adversarial Training with Learnable Adversarial Initialization

no code implementations 11 Oct 2021 Xiaojun Jia, Yong Zhang, Baoyuan Wu, Jue Wang, Xiaochun Cao

Adversarial training (AT) has been demonstrated to be effective in improving model robustness by leveraging adversarial examples for training.

Adaptive Multi-receptive Field Spatial-Temporal Graph Convolutional Network for Traffic Forecasting

no code implementations 1 Nov 2021 Xing Wang, Juan Zhao, Lin Zhu, Xu Zhou, Zhao Li, Junlan Feng, Chao Deng, Yong Zhang

AMF-STGCN extends GCN by (1) jointly modeling the complex spatial-temporal dependencies in mobile networks, (2) applying attention mechanisms to capture various Receptive Fields of heterogeneous base stations, and (3) introducing an extra decoder based on a fully connected deep network to conquer the error propagation challenge with multi-step forecasting.

Metro Passenger Flow Prediction via Dynamic Hypergraph Convolution Networks

no code implementations IEEE Transactions on Intelligent Transportation Systems 2021 Jingcheng Wang, Yong Zhang, Yun Wei, Yongli Hu, Xinglin Piao, BaoCai Yin

Metro passenger flow prediction is a strategically necessary demand in an intelligent transportation system to alleviate traffic pressure, coordinate operation schedules, and plan future constructions.

A recurrent neural network approach for remaining useful life prediction utilizing a novel trend features construction method

no code implementations 10 Dec 2021 Sen Zhao, Yong Zhang, Shang Wang, Beitong Zhou, Cheng Cheng

Data-driven methods for remaining useful life (RUL) prediction normally learn features from a fixed window size of a priori of degradation, which may lead to less accurate prediction results on different datasets because of the variance of local features.

Mining Minority-class Examples With Uncertainty Estimates

no code implementations 15 Dec 2021 Gursimran Singh, Lingyang Chu, Lanjun Wang, Jian Pei, Qi Tian, Yong Zhang

In the real world, the frequency of occurrence of objects is naturally skewed forming long-tail class distributions, which results in poor performance on the statistically rare classes.

ML4CO: Is GCNN All You Need? Graph Convolutional Neural Networks Produce Strong Baselines For Combinatorial Optimization Problems, If Tuned and Trained Properly, on Appropriate Data

no code implementations 22 Dec 2021 Amin Banitalebi-Dehkordi, Yong Zhang

The 2021 NeurIPS Machine Learning for Combinatorial Optimization (ML4CO) competition was designed with the goal of improving state-of-the-art combinatorial optimization solvers by replacing key heuristic components with machine learning models.

BIG-bench Machine Learning Combinatorial Optimization

Fair and efficient contribution valuation for vertical federated learning

no code implementations 7 Jan 2022 Zhenan Fan, Huang Fang, Zirui Zhou, Jian Pei, Michael P. Friedlander, Yong Zhang

We show that VerFedSV not only satisfies many desirable properties for fairness but is also efficient to compute, and can be adapted to both synchronous and asynchronous vertical federated learning algorithms.

Fairness Vertical Federated Learning

Jointly Learning Knowledge Embedding and Neighborhood Consensus with Relational Knowledge Distillation for Entity Alignment

no code implementations 25 Jan 2022 Xinhang Li, Yong Zhang, Chunxiao Xing

We adopt GCN-based models to learn the representation of entities by considering the graph structure and incorporating the relation semantic information into GCN via knowledge distillation.

Benchmarking Entity Alignment +4

Self-Attention for Incomplete Utterance Rewriting

no code implementations 24 Feb 2022 Yong Zhang, Zhitao Li, Jianzong Wang, Ning Cheng, Jing Xiao

In this paper, we propose a novel method by directly extracting the coreference and omission relationship from the self-attention weight matrix of the transformer instead of word embeddings and edit the original text accordingly to generate the complete utterance.

Word Embeddings

How and what to learn: The modes of machine learning

no code implementations 28 Feb 2022 Sihan Feng, Yong Zhang, Fuming Wang, Hong Zhao

We consider weights in pathways that link neurons longitudinally from input neurons to output neurons, or simply weight pathways, as the basic units for understanding a neural network, and decompose a neural network into a series of subnetworks of such weight pathways.

BIG-bench Machine Learning

E-LANG: Energy-Based Joint Inferencing of Super and Swift Language Models

no code implementations ACL 2022 Mohammad Akbari, Amin Banitalebi-Dehkordi, Yong Zhang

As such, it can be applied to black-box pre-trained models without a need for architectural manipulations, reassembling of modules, or re-training.

Decision Making Model Compression

Membership Privacy Protection for Image Translation Models via Adversarial Knowledge Distillation

no code implementations 10 Mar 2022 Saeed Ranjbar Alvar, Lanjun Wang, Jian Pei, Yong Zhang

Image-to-image translation models are shown to be vulnerable to the Membership Inference Attack (MIA), in which the adversary's goal is to identify whether a sample is used to train the model or not.

Image-to-Image Translation Inference Attack +3

Asynchronous, Option-Based Multi-Agent Policy Gradient: A Conditional Reasoning Approach

no code implementations 29 Mar 2022 Xubo Lyu, Amin Banitalebi-Dehkordi, Mo Chen, Yong Zhang

In complex problems with large state and action spaces, it is advantageous to extend MAPG methods to use higher-level actions, also known as options, to improve the policy search efficiency.

Hierarchical Reinforcement Learning Multi-agent Reinforcement Learning +3

Cross-subject Action Unit Detection with Meta Learning and Transformer-based Relation Modeling

no code implementations 18 May 2022 Jiyuan Cao, Zhilei Liu, Yong Zhang

Ablation study and visualization show that our MARL can eliminate identity-caused differences, thus obtaining a robust and generalized AU discriminative embedding representation.

Action Unit Detection Emotion Recognition +3

Fast Adversarial Training with Adaptive Step Size

no code implementations 6 Jun 2022 Zhichao Huang, Yanbo Fan, Chen Liu, Weizhong Zhang, Yong Zhang, Mathieu Salzmann, Sabine Süsstrunk, Jue Wang

While adversarial training and its variants have shown to be the most effective algorithms to defend against adversarial attacks, their extremely slow training process makes it hard to scale to large datasets like ImageNet.

Spatial Cross-Attention Improves Self-Supervised Visual Representation Learning

no code implementations 7 Jun 2022 Mehdi Seyfi, Amin Banitalebi-Dehkordi, Yong Zhang

Unsupervised representation learning methods like SwAV are proved to be effective in learning visual semantics of a target dataset.

object-detection Object Detection +1

Extending Momentum Contrast with Cross Similarity Consistency Regularization

no code implementations 7 Jun 2022 Mehdi Seyfi, Amin Banitalebi-Dehkordi, Yong Zhang

Contrastive self-supervised representation learning methods maximize the similarity between the positive pairs, and at the same time tend to minimize the similarity between the negative pairs.

Representation Learning Self-Supervised Learning

Deep Reinforcement Learning for Exact Combinatorial Optimization: Learning to Branch

no code implementations 14 Jun 2022 Tianyu Zhang, Amin Banitalebi-Dehkordi, Yong Zhang

We propose a new approach for solving the data labeling and inference latency issues in combinatorial optimization based on the use of the reinforcement learning (RL) paradigm.

BIG-bench Machine Learning Combinatorial Optimization +4

AuxMix: Semi-Supervised Learning with Unconstrained Unlabeled Data

no code implementations 14 Jun 2022 Amin Banitalebi-Dehkordi, Pratik Gujjar, Yong Zhang

Critically, most recent work assume that such unlabeled data is drawn from the same distribution as the labeled data.

4k Self-Supervised Learning

Multi-Camera View Based Proactive BS Selection and Beam Switching for V2X

no code implementations 12 Jul 2022 Bo Lin, Feifei Gao, Yong Zhang, Chengkang Pan, Guangyi Liu

In this paper, we proposed a multi-camera view based proactive BS selection and beam switching that can predict the optimal BS of the user in the future frame and switch the corresponding beam pair.

Multi-Task Learning

Knowledge-Injected Federated Learning

1 code implementation 16 Aug 2022 Zhenan Fan, Zirui Zhou, Jian Pei, Michael P. Friedlander, Jiajie Hu, Chengliang Li, Yong Zhang

Federated learning is an emerging technique for training models from decentralized data sets.

Federated Learning

SemAug: Semantically Meaningful Image Augmentations for Object Detection Through Language Grounding

no code implementations 15 Aug 2022 Morgan Heisler, Amin Banitalebi-Dehkordi, Yong Zhang

Our method of semantically meaningful image augmentation for object detection via language grounding, SemAug, starts by calculating semantically appropriate new objects that can be placed into relevant locations in the image (the what and where problems).

Image Augmentation Object +2

A multi view multi stage and multi window framework for pulmonary artery segmentation from CT scans

no code implementations 8 Sep 2022 Zeyu Liu, Yi Wang, Jing Wen, Yong Zhang, Hao Yin, Chao Guo, Zhongyu Wang

In addition, in order to improve the segmentation performance, we adopt multi-view and multi-window level method, at the same time we employ a fine-tune strategy to mitigate the impact of inconsistent labeling.

Segmentation

Fine-Grained Face Swapping via Regional GAN Inversion

no code implementations CVPR 2023 Zhian Liu, Maomao Li, Yong Zhang, Cairong Wang, Qi Zhang, Jue Wang, Yongwei Nie

We rethink face swapping from the perspective of fine-grained face editing, i.e., "editing for swapping" (E4S), and propose a framework that is based on the explicit disentanglement of the shape and texture of facial components.

Disentanglement Face Swapping

3D GAN Inversion with Facial Symmetry Prior

no code implementations CVPR 2023 Fei Yin, Yong Zhang, Xuan Wang, Tengfei Wang, Xiaoyu Li, Yuan Gong, Yanbo Fan, Xiaodong Cun, Ying Shan, Cengiz Oztireli, Yujiu Yang

It is natural to associate 3D GANs with GAN inversion methods to project a real image into the generator's latent space, allowing free-view consistent synthesis and editing, referred as 3D GAN inversion.

Image Reconstruction Neural Rendering

Improving Fast Adversarial Training with Prior-Guided Knowledge

no code implementations 1 Apr 2023 Xiaojun Jia, Yong Zhang, Xingxing Wei, Baoyuan Wu, Ke Ma, Jue Wang, Xiaochun Cao

This initialization is generated by using high-quality adversarial perturbations from the historical training process.

Multi-User Matching and Resource Allocation in Vision Aided Communications

no code implementations 18 Apr 2023 Weihua Xu, Feifei Gao, Yong Zhang, Chengkang Pan, Guangyi Liu

Visual perception is an effective way to obtain the spatial characteristics of wireless channels and to reduce the overhead for communications system.

Make-Your-Video: Customized Video Generation Using Textual and Structural Guidance

no code implementations 1 Jun 2023 Jinbo Xing, Menghan Xia, Yuxin Liu, Yuechen Zhang, Yong Zhang, Yingqing He, Hanyuan Liu, Haoxin Chen, Xiaodong Cun, Xintao Wang, Ying Shan, Tien-Tsin Wong

Our method, dubbed Make-Your-Video, involves joint-conditional video generation using a Latent Diffusion Model that is pre-trained for still image synthesis and then promoted for video generation with the introduction of temporal modules.

Image Generation Video Generation

Robust Backdoor Attack with Visible, Semantic, Sample-Specific, and Compatible Triggers

no code implementations 1 Jun 2023 Ruotong Wang, Hongrui Chen, Zihao Zhu, Li Liu, Yong Zhang, Yanbo Fan, Baoyuan Wu

We hope that the proposed VSSC trigger and implementation approach could inspire future studies on designing more practical triggers in backdoor attacks.

Backdoor Attack backdoor defense +1

Data-Driven Bilateral Generalized Two-Dimensional Quaternion Principal Component Analysis with Application to Color Face Recognition

no code implementations 12 Jun 2023 Mei-Xiang Zhao, Zhi-Gang Jia, Dun-Wei Gong, Yong Zhang

A new data-driven bilateral generalized two-dimensional quaternion principal component analysis (BiG2DQPCA) is presented to extract the features of matrix samples from both row and column directions.

Face Recognition Image Reconstruction

Intelligence of Astronomical Optical Telescope: Present Status and Future Perspectives

no code implementations 29 Jun 2023 Kang Huang, Tianzhu Hu, Jingyi Cai, Xiushan Pang, Yonghui Hou, Yong Zhang, Huaiqing Wang, Xiangqun Cui

Artificial intelligence technology has been widely used in astronomy, and new artificial intelligence technologies and application scenarios are constantly emerging.

Astronomy

OpenSiteRec: An Open Dataset for Site Recommendation

no code implementations 3 Jul 2023 Xinhang Li, Xiangyu Zhao, Yejing Wang, Yu Liu, Yong Li, Cheng Long, Yong Zhang, Chunxiao Xing

As a representative information retrieval task, site recommendation, which aims at predicting the optimal sites for a brand or an institution to open new branches in an automatic data-driven way, is beneficial and crucial for brand development in modern business.

Benchmarking Information Retrieval +1

On the Cultural Gap in Text-to-Image Generation

no code implementations 6 Jul 2023 Bingshuai Liu, Longyue Wang, Chenyang Lyu, Yong Zhang, Jinsong Su, Shuming Shi, Zhaopeng Tu

Accordingly, we propose a novel multi-modal metric that considers object-text alignment to filter the fine-tuning data in the target culture, which is used to fine-tune a T2I model to improve cross-cultural generation.

Text-to-Image Generation

NOFA: NeRF-based One-shot Facial Avatar Reconstruction

no code implementations 7 Jul 2023 Wangbo Yu, Yanbo Fan, Yong Zhang, Xuan Wang, Fei Yin, Yunpeng Bai, Yan-Pei Cao, Ying Shan, Yang Wu, Zhongqian Sun, Baoyuan Wu

In this work, we propose a one-shot 3D facial avatar reconstruction framework that only requires a single source image to reconstruct a high-fidelity 3D facial avatar.

Boosting Chinese ASR Error Correction with Dynamic Error Scaling Mechanism

no code implementations 7 Aug 2023 Jiaxin Fan, Yong Zhang, Hanzhang Li, Jianzong Wang, Zhitao Li, Sheng Ouyang, Ning Cheng, Jing Xiao

Chinese Automatic Speech Recognition (ASR) error correction presents significant challenges due to the Chinese language's unique features, including a large character set and borderless, morpheme-based structure.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Prompt Guided Copy Mechanism for Conversational Question Answering

no code implementations 7 Aug 2023 Yong Zhang, Zhitao Li, Jianzong Wang, Yiming Gao, Ning Cheng, Fengying Yu, Jing Xiao

Conversational Question Answering (CQA) is a challenging task that aims to generate natural answers for conversational flow questions.

Conversational Question Answering

ToonTalker: Cross-Domain Face Reenactment

no code implementations ICCV 2023 Yuan Gong, Yong Zhang, Xiaodong Cun, Fei Yin, Yanbo Fan, Xuan Wang, Baoyuan Wu, Yujiu Yang

Moreover, since no paired data is provided, we propose a novel cross-domain training scheme using data from two domains with the designed analogy constraint.

Face Reenactment Talking Face Generation

SAI: Solving AI Tasks with Systematic Artificial Intelligence in Communication Network

no code implementations 13 Oct 2023 Lei Yao, Yong Zhang, Zilong Yan, Jialu Tian

In the rapid development of artificial intelligence, solving complex AI tasks is a crucial technology in intelligent mobile networks.

ArchBERT: Bi-Modal Understanding of Neural Architectures and Natural Languages

no code implementations 26 Oct 2023 Mohammad Akbari, Saeed Ranjbar Alvar, Behnam Kamranian, Amin Banitalebi-Dehkordi, Yong Zhang

Despite the success of these multi-modal language models with different modalities, there is no existing solution for neural network architectures and natural languages.

AutoML Question Answering

Artificial Intelligence for Operations Research: Revolutionizing the Operations Research Process

no code implementations 6 Jan 2024 Zhenan Fan, Bissan Ghaddar, Xinglu Wang, Linzi Xing, Yong Zhang, Zirui Zhou

The rapid advancement of artificial intelligence (AI) techniques has opened up new opportunities to revolutionize various fields, including operations research (OR).

Decision Making Model Optimization

Machine Learning Insides OptVerse AI Solver: Design Principles and Applications

no code implementations 11 Jan 2024 Xijun Li, Fangzhou Zhu, Hui-Ling Zhen, Weilin Luo, Meng Lu, Yimin Huang, Zhenan Fan, Zirui Zhou, Yufei Kuang, Zhihai Wang, Zijie Geng, Yang Li, Haoyang Liu, Zhiwu An, Muming Yang, Jianshu Li, Jie Wang, Junchi Yan, Defeng Sun, Tao Zhong, Yong Zhang, Jia Zeng, Mingxuan Yuan, Jianye Hao, Jun Yao, Kun Mao

To this end, we present a comprehensive study on the integration of machine learning (ML) techniques into Huawei Cloud's OptVerse AI Solver, which aims to mitigate the scarcity of real-world mathematical programming instances, and to surpass the capabilities of traditional optimization techniques.

Decision Making Management

Leveraging Biases in Large Language Models: "bias-kNN" for Effective Few-Shot Learning

no code implementations 18 Jan 2024 Yong Zhang, Hanzhang Li, Zhitao Li, Ning Cheng, Ming Li, Jing Xiao, Jianzong Wang

Large Language Models (LLMs) have shown significant promise in various applications, including zero-shot and few-shot learning.

Few-Shot Learning In-Context Learning +2

Shortcuts Arising from Contrast: Effective and Covert Clean-Label Attacks in Prompt-Based Learning

no code implementations 30 Mar 2024 Xiaopeng Xie, Ming Yan, Xiwen Zhou, Chenlong Zhao, Suli Wang, Yong Zhang, Joey Tianyi Zhou

In addressing this issue, we are inspired by the notion that a backdoor acts as a shortcut and posit that this shortcut stems from the contrast between the trigger and the data utilized for poisoning.

Data Augmentation Few-Shot Text Classification +1

Fast Gradient Computation for Gromov-Wasserstein Distance

no code implementations 13 Apr 2024 Wei Zhang, ZiHao Wang, Jie Fan, Hao Wu, Yong Zhang

In this way, the original computational bottleneck is broken and the new entropic solution can be obtained with total quadratic time, which is almost optimal complexity.
