Search Results for author: Siyuan Li

Found 77 papers, 50 papers with code

An Optimal Online Method of Selecting Source Policies for Reinforcement Learning

no code implementations 24 Sep 2017 Siyuan Li, Chongjie Zhang

In this paper, we develop an optimal online method to select source policies for reinforcement learning.

Q-Learning reinforcement-learning +3

Deep Eyes: Binocular Depth-from-Focus on Focal Stack Pairs

no code implementations 29 Nov 2017 Xinqing Guo, Zhang Chen, Siyuan Li, Yang Yang, Jingyi Yu

We then construct three individual networks: a Focus-Net to extract depth from a single focal stack, an EDoF-Net to obtain the extended depth of field (EDoF) image from the focal stack, and a Stereo-Net to conduct stereo matching.

Stereo Matching Stereo Matching Hand

Fast Single Image Rain Removal via a Deep Decomposition-Composition Network

no code implementations 8 Apr 2018 Siyuan LI, Wenqi Ren, Jiawan Zhang, Jinke Yu, Xiaojie Guo

Rain effects in images are typically annoying for many multimedia and computer vision tasks.

Rain Removal

Context-Aware Policy Reuse

no code implementations 11 Jun 2018 Siyuan Li, Fangda Gu, Guangxiang Zhu, Chongjie Zhang

Transfer learning can greatly speed up reinforcement learning for a new task by leveraging policies of relevant tasks.

Transfer Learning

PFLD: A Practical Facial Landmark Detector

18 code implementations 28 Feb 2019 Xiaojie Guo, Siyuan Li, Jinke Yu, Jiawan Zhang, Jiayi Ma, Lin Ma, Wei Liu, Haibin Ling

Being accurate, efficient, and compact is essential to a facial landmark detector for practical use.

Face Alignment Facial Landmark Detection

Single Image Deraining: A Comprehensive Benchmark Analysis

1 code implementation CVPR 2019 Siyuan Li, Iago Breno Araujo, Wenqi Ren, Zhangyang Wang, Eric K. Tokuda, Roberto Hirata Junior, Roberto Cesar-Junior, Jiawan Zhang, Xiaojie Guo, Xiaochun Cao

We present a comprehensive study and evaluation of existing single image deraining algorithms, using a new large-scale benchmark consisting of both synthetic and real-world rainy images. This dataset highlights diverse data sources and image contents, and is divided into three subsets (rain streak, rain drop, rain and mist), each serving different training or evaluation purposes.

Single Image Deraining

Hierarchical Reinforcement Learning with Advantage-Based Auxiliary Rewards

1 code implementation NeurIPS 2019 Siyuan Li, Rui Wang, Minxue Tang, Chongjie Zhang

In addition, we also theoretically prove that optimizing low-level skills with this auxiliary reward will increase the task return for the joint policy.

Hierarchical Reinforcement Learning reinforcement-learning +1

Deformation-aware Unpaired Image Translation for Pose Estimation on Laboratory Animals

no code implementations CVPR 2020 Siyuan Li, Semih Günel, Mirela Ostrek, Pavan Ramdya, Pascal Fua, Helge Rhodin

We compare our approach with existing domain transfer methods and demonstrate improved pose estimation accuracy on Drosophila melanogaster (fruit fly), Caenorhabditis elegans (worm) and Danio rerio (zebrafish), without requiring any manual annotation on the target domain and despite using simplistic off-the-shelf animal characters for simulation, or simple geometric shapes as models.

Pose Estimation Translation

Generalized Clustering and Multi-Manifold Learning with Geometric Structure Preservation

1 code implementation 21 Sep 2020 Lirong Wu, Zicheng Liu, Zelin Zang, Jun Xia, Siyuan Li, Stan Z. Li

Though manifold-based clustering has become a popular research topic, we observe that one important factor has been omitted by these works, namely that the defined clustering loss may corrupt the local and global structure of the latent space.

Clustering Deep Clustering +1

Deep Clustering and Representation Learning that Preserves Geometric Structures

no code implementations 28 Sep 2020 Lirong Wu, Zicheng Liu, Zelin Zang, Jun Xia, Siyuan Li, Stan Z. Li

To overcome the problem that clustering-oriented losses may deteriorate the geometric structure of embeddings in the latent space, an isometric loss is proposed for preserving intra-manifold structure locally and a ranking loss for inter-manifold structure globally.

Clustering Deep Clustering +1

Invertible Manifold Learning for Dimension Reduction

1 code implementation 7 Oct 2020 Siyuan Li, Haitao Lin, Zelin Zang, Lirong Wu, Jun Xia, Stan Z. Li

Dimension reduction (DR) aims to learn low-dimensional representations of high-dimensional data with the preservation of essential information.

Dimensionality Reduction

Learning Subgoal Representations with Slow Dynamics

no code implementations ICLR 2021 Siyuan Li, Lulu Zheng, Jianhao Wang, Chongjie Zhang

In goal-conditioned Hierarchical Reinforcement Learning (HRL), a high-level policy periodically sets subgoals for a low-level policy, and the low-level policy is trained to reach those subgoals.

Continuous Control Hierarchical Reinforcement Learning +1

Towards Robust Graph Neural Networks against Label Noise

no code implementations 1 Jan 2021 Jun Xia, Haitao Lin, Yongjie Xu, Lirong Wu, Zhangyang Gao, Siyuan Li, Stan Z. Li

A pseudo label is computed from the neighboring labels for each node in the training set using LP; meta learning is utilized to learn a proper aggregation of the original and pseudo label as the final label.

Attribute Learning with noisy labels +3

AutoMix: Unveiling the Power of Mixup for Stronger Classifiers

2 code implementations 24 Mar 2021 Zicheng Liu, Siyuan Li, Di wu, Zihan Liu, ZhiYuan Chen, Lirong Wu, Stan Z. Li

Specifically, AutoMix reformulates the mixup classification into two sub-tasks (i.e., mixed sample generation and mixup classification) with corresponding sub-networks and solves them in a bi-level optimization framework.

Classification Data Augmentation +3

Unsupervised Deep Manifold Attributed Graph Embedding

1 code implementation 27 Apr 2021 Zelin Zang, Siyuan Li, Di wu, Jianzhu Guo, Yongjie Xu, Stan Z. Li

Unsupervised attributed graph representation learning is challenging since both structural and feature information are required to be represented in the latent space.

Clustering Graph Embedding +3

Active Hierarchical Exploration with Stable Subgoal Representation Learning

1 code implementation ICLR 2022 Siyuan Li, Jin Zhang, Jianhao Wang, Yang Yu, Chongjie Zhang

Although GCHRL possesses superior exploration ability by decomposing tasks via subgoals, existing GCHRL methods struggle in temporally extended tasks with sparse external rewards, since the high-level policy learning relies on external rewards.

Continuous Control Hierarchical Reinforcement Learning +1

Improving Discriminative Visual Representation Learning via Automatic Mixup

no code implementations 29 Sep 2021 Siyuan Li, Zicheng Liu, Di wu, Stan Z. Li

In this paper, we decompose mixup into two sub-tasks of mixup generation and classification and formulate it for discriminative representations as class- and instance-level mixup.

Data Augmentation Representation Learning

Offline Reinforcement Learning with Reverse Model-based Imagination

1 code implementation NeurIPS 2021 Jianhao Wang, Wenzhe Li, Haozhe Jiang, Guangxiang Zhu, Siyuan Li, Chongjie Zhang

These reverse imaginations provide informed data augmentation for model-free policy learning and enable conservative generalization beyond the offline dataset.

Data Augmentation Offline RL +2

GenURL: A General Framework for Unsupervised Representation Learning

1 code implementation 27 Oct 2021 Siyuan Li, Zicheng Liu, Zelin Zang, Di wu, ZhiYuan Chen, Stan Z. Li

Unsupervised representation learning (URL) that learns compact embeddings of high-dimensional data without supervision has achieved remarkable progress recently.

Contrastive Learning Dimensionality Reduction +4

A Shared Representation for Photorealistic Driving Simulators

1 code implementation 9 Dec 2021 Saeed Saadatnejad, Siyuan Li, Taylor Mordan, Alexandre Alahi

We build on successful cGAN models to propose a new semantically-aware discriminator that better guides the generator.

Autonomous Vehicles Image Generation +1

Style Transformer for Image Inversion and Editing

1 code implementation CVPR 2022 Xueqi Hu, Qiusheng Huang, Zhengyi Shi, Siyuan Li, Changxin Gao, Li Sun, Qingli Li

Existing GAN inversion methods fail to provide latent codes for reliable reconstruction and flexible editing simultaneously.

Attribute Image-to-Image Translation

Harnessing Hard Mixed Samples with Decoupled Regularizer

1 code implementation NeurIPS 2023 Zicheng Liu, Siyuan Li, Ge Wang, Cheng Tan, Lirong Wu, Stan Z. Li

However, we found that the extra optimizing step may be redundant because label-mismatched mixed samples are informative hard mixed samples for deep models to localize discriminative features.

Data Augmentation

UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection

1 code implementation CVPR 2022 Ye Liu, Siyuan Li, Yang Wu, Chang Wen Chen, Ying Shan, XiaoHu Qie

Finding relevant moments and highlights in videos according to natural language queries is a natural and highly valuable common need in the current video content explosion era.

Highlight Detection Moment Retrieval +3

neuro2vec: Masked Fourier Spectrum Prediction for Neurophysiological Representation Learning

1 code implementation 20 Apr 2022 Di wu, Siyuan Li, Jie Yang, Mohamad Sawan

Extensive data labeling on neurophysiological signals is often prohibitively expensive or impractical, as it may require particular infrastructure or domain expertise.

EEG Electroencephalogram (EEG) +3

A heuristic method for data allocation and task scheduling on heterogeneous multiprocessor systems under memory constraints

no code implementations 9 May 2022 Junwen Ding, Liangcai Song, Siyuan Li, Chen Wu, Ronghua He, Zhouxing Su, Zhipeng Lü

Computing workflows in heterogeneous multiprocessor systems are frequently modeled as directed acyclic graphs of tasks and data blocks, which represent computational modules and their dependencies in the form of data produced by a task and used by others.

Job Shop Scheduling Scheduling

Discovering and Explaining the Representation Bottleneck of Graph Neural Networks from Multi-order Interactions

1 code implementation 15 May 2022 Fang Wu, Siyuan Li, Lirong Wu, Dragomir Radev, Stan Z. Li

Graph neural networks (GNNs) mainly rely on the message-passing paradigm to propagate node features and build interactions, and different graph learning tasks require different ranges of node interactions.

graph construction Graph Learning +2

Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN

3 code implementations 27 May 2022 Siyuan Li, Di wu, Fang Wu, Zelin Zang, Stan. Z. Li

We then propose an Architecture-Agnostic Masked Image Modeling framework (A$^2$MIM), which is compatible with both Transformers and CNNs in a unified way.

Instance Segmentation Object Detection +3

Hyperspherical Consistency Regularization

1 code implementation CVPR 2022 Cheng Tan, Zhangyang Gao, Lirong Wu, Siyuan Li, Stan Z. Li

Though it benefits from taking advantage of both feature-dependent information from self-supervised learning and label-dependent information from supervised learning, this scheme still suffers from classifier bias.

Contrastive Learning Self-Supervised Learning +1

DLME: Deep Local-flatness Manifold Embedding

2 code implementations 7 Jul 2022 Zelin Zang, Siyuan Li, Di wu, Ge Wang, Lei Shang, Baigui Sun, Hao Li, Stan Z. Li

To overcome the underconstrained embedding problem, we design a loss and theoretically demonstrate that it leads to a more suitable embedding based on the local flatness.

Contrastive Learning Data Augmentation +1

Tracking Every Thing in the Wild

1 code implementation 26 Jul 2022 Siyuan Li, Martin Danelljan, Henghui Ding, Thomas E. Huang, Fisher Yu

Our experiments show that TETA evaluates trackers more comprehensively, and TETer achieves significant improvements on the challenging large-scale datasets BDD100K and TAO compared to the state-of-the-art.

Benchmarking Classification +2

Are Gradients on Graph Structure Reliable in Gray-box Attacks?

1 code implementation 7 Aug 2022 Zihan Liu, Yun Luo, Lirong Wu, Siyuan Li, Zicheng Liu, Stan Z. Li

These errors arise from rough gradient usage due to the discreteness of the graph structure and from the unreliability in the meta-gradient on the graph structure.

Computational Efficiency

MetaTrader: An Reinforcement Learning Approach Integrating Diverse Policies for Portfolio Optimization

no code implementations 1 Sep 2022 Hui Niu, Siyuan Li, Jian Li

We evaluate the proposed approach on three real-world index datasets and compare it to state-of-the-art baselines.

Imitation Learning Management +3

OpenMixup: A Comprehensive Mixup Benchmark for Visual Classification

1 code implementation 11 Sep 2022 Siyuan Li, Zedong Wang, Zicheng Liu, Di wu, Cheng Tan, Weiyang Jin, Stan Z. Li

Data mixing, or mixup, is a data-dependent augmentation technique that has greatly enhanced the generalizability of modern deep neural networks.

Benchmarking Classification +3

CUP: Critic-Guided Policy Reuse

1 code implementation 15 Oct 2022 Jin Zhang, Siyuan Li, Chongjie Zhang

The ability to reuse previous policies is an important aspect of human intelligence.

Classifying Ambiguous Identities in Hidden-Role Stochastic Games with Multi-Agent Reinforcement Learning

1 code implementation 24 Oct 2022 Shijie Han, Siyuan Li, Bo An, Wei Zhao, Peng Liu

In this work, we develop a novel identity detection reinforcement learning (IDRL) framework that allows an agent to dynamically infer the identities of nearby agents and select an appropriate policy to accomplish the task.

Multi-agent Reinforcement Learning reinforcement-learning +2

Leveraging Graph-based Cross-modal Information Fusion for Neural Sign Language Translation

no code implementations 1 Nov 2022 Jiangbin Zheng, Siyuan Li, Cheng Tan, Chong Wu, Yidong Chen, Stan Z. Li

Therefore, we propose to introduce additional word-level semantic knowledge of sign language linguistics to assist in improving current end-to-end neural SLT models.

Sign Language Translation Translation

MogaNet: Multi-order Gated Aggregation Network

6 code implementations 7 Nov 2022 Siyuan Li, Zedong Wang, Zicheng Liu, Cheng Tan, Haitao Lin, Di wu, ZhiYuan Chen, Jiangbin Zheng, Stan Z. Li

Notably, MogaNet hits 80.0% and 87.8% accuracy with 5.2M and 181M parameters on ImageNet-1K, outperforming ParC-Net and ConvNeXt-L, while saving 59% FLOPs and 17M parameters, respectively.

3D Human Pose Estimation Image Classification +6

Divide and Contrast: Source-free Domain Adaptation via Adaptive Contrastive Learning

2 code implementations 12 Nov 2022 Ziyi Zhang, Weikai Chen, Hui Cheng, Zhen Li, Siyuan Li, Liang Lin, Guanbin Li

We investigate a practical domain adaptation task, called source-free domain adaptation (SFUDA), where the source-pretrained model is adapted to the target domain without access to the source data.

Contrastive Learning Source-Free Domain Adaptation

SimVP: Towards Simple yet Powerful Spatiotemporal Predictive Learning

2 code implementations 22 Nov 2022 Cheng Tan, Zhangyang Gao, Siyuan Li, Stan Z. Li

Without introducing any extra tricks and strategies, SimVP can achieve superior performance on various benchmark datasets.

Video Prediction

CLIP-ReID: Exploiting Vision-Language Model for Image Re-Identification without Concrete Text Labels

1 code implementation 25 Nov 2022 Siyuan Li, Li Sun, Qingli Li

The key idea is to fully exploit the cross-modal description ability in CLIP through a set of learnable text tokens for each ID and give them to the text encoder to form ambiguous descriptions.

Image Classification Language Modelling +2

Rethinking Explaining Graph Neural Networks via Non-parametric Subgraph Matching

1 code implementation 7 Jan 2023 Fang Wu, Siyuan Li, Xurui Jin, Yinghui Jiang, Dragomir Radev, Zhangming Niu, Stan Z. Li

It takes advantage of MatchExplainer to fix the most informative portion of the graph and applies graph augmentations only to the remaining, less informative part.

Graph Sampling

RDesign: Hierarchical Data-efficient Representation Learning for Tertiary Structure-based RNA Design

1 code implementation 25 Jan 2023 Cheng Tan, Yijie Zhang, Zhangyang Gao, Bozhen Hu, Siyuan Li, Zicheng Liu, Stan Z. Li

We crafted a large, well-curated benchmark dataset and designed a comprehensive structural modeling approach to represent the complex RNA tertiary structure.

Contrastive Learning Protein Design +2

CVT-SLR: Contrastive Visual-Textual Transformation for Sign Language Recognition with Variational Alignment

1 code implementation CVPR 2023 Jiangbin Zheng, Yile Wang, Cheng Tan, Siyuan Li, Ge Wang, Jun Xia, Yidong Chen, Stan Z. Li

In this work, we propose a novel contrastive visual-textual transformation for SLR, CVT-SLR, to fully explore the pretrained knowledge of both the visual and language modalities.

Sign Language Recognition

Lightweight Contrastive Protein Structure-Sequence Transformation

no code implementations 19 Mar 2023 Jiangbin Zheng, Ge Wang, Yufei Huang, Bozhen Hu, Siyuan Li, Cheng Tan, Xinwen Fan, Stan Z. Li

In this work, we introduce a novel unsupervised protein structure representation pretraining with a robust protein language model.

Masked Language Modeling Protein Design +1

InstructBio: A Large-scale Semi-supervised Learning Paradigm for Biochemical Problems

1 code implementation 8 Apr 2023 Fang Wu, Huiling Qin, Siyuan Li, Stan Z. Li, Xianyuan Zhan, Jinbo Xu

In the field of artificial intelligence for science, it is consistently an essential challenge to face a limited amount of labeled data for real-world problems.

molecular representation Representation Learning

Behavior Contrastive Learning for Unsupervised Skill Discovery

1 code implementation 8 May 2023 Rushuai Yang, Chenjia Bai, Hongyi Guo, Siyuan Li, Bin Zhao, Zhen Wang, Peng Liu, Xuelong Li

Under mild assumptions, our objective maximizes the MI between different behaviors based on the same skill, which serves as an upper bound of the previous MI objective.

Continuous Control Contrastive Learning

OpenSTL: A Comprehensive Benchmark of Spatio-Temporal Predictive Learning

2 code implementations NeurIPS 2023 Cheng Tan, Siyuan Li, Zhangyang Gao, Wenfei Guan, Zedong Wang, Zicheng Liu, Lirong Wu, Stan Z. Li

Spatio-temporal predictive learning is a learning paradigm that enables models to learn spatial and temporal patterns by predicting future frames from given past frames in an unsupervised manner.

Weather Forecasting

Learning to Solve Tasks with Exploring Prior Behaviours

1 code implementation 6 Jul 2023 Ruiqi Zhu, Siyuan Li, Tianhong Dai, Chongjie Zhang, Oya Celiktutan

Our method can endow agents with the ability to explore and acquire the required prior behaviours and then connect to the task-specific behaviours in the demonstration to solve sparse-reward tasks without requiring additional demonstration of the prior behaviours.

Enhancing Human-like Multi-Modal Reasoning: A New Challenging Dataset and Comprehensive Framework

1 code implementation 24 Jul 2023 Jingxuan Wei, Cheng Tan, Zhangyang Gao, Linzhuang Sun, Siyuan Li, Bihui Yu, Ruifeng Guo, Stan Z. Li

Multimodal reasoning is a critical component in the pursuit of artificial intelligence systems that exhibit human-like intelligence, especially when tackling complex tasks.

Contrastive Learning Multimodal Reasoning +2

Striking The Right Balance: Three-Dimensional Ocean Sound Speed Field Reconstruction Using Tensor Neural Networks

1 code implementation 9 Aug 2023 Siyuan Li, Lei Cheng, Ting Zhang, Hangfang Zhao, Jianlong Li

Accurately reconstructing a three-dimensional ocean sound speed field (3D SSF) is essential for various ocean acoustic applications, but the sparsity and uncertainty of sound speed samples across a vast ocean region make it a challenging task.

IOB: Integrating Optimization Transfer and Behavior Transfer for Multi-Policy Reuse

no code implementations 14 Aug 2023 Siyuan Li, Hao Li, Jin Zhang, Zhen Wang, Peng Liu, Chongjie Zhang

Humans have the ability to reuse previously learned policies to solve new tasks quickly, and reinforcement learning (RL) agents can do the same by transferring knowledge from source policies to a related target task.

Continual Learning Reinforcement Learning (RL)

Rethinking Memory and Communication Cost for Efficient Large Language Model Training

no code implementations 9 Oct 2023 Chan Wu, Hanxiao Zhang, Lin Ju, Jinjing Huang, Youshao Xiao, ZhaoXin Huan, Siyuan Li, Fanzhuang Meng, Lei Liang, Xiaolu Zhang, Jun Zhou

In this paper, we rethink the impact of memory consumption and communication costs on the training speed of large language models, and propose a memory-communication balanced strategy set, the Partial Redundancy Optimizer (PaRO).

Language Modelling Large Language Model

Revisiting the Temporal Modeling in Spatio-Temporal Predictive Learning under A Unified View

no code implementations 9 Oct 2023 Cheng Tan, Jue Wang, Zhangyang Gao, Siyuan Li, Lirong Wu, Jun Xia, Stan Z. Li

In this paper, we re-examine the two dominant temporal modeling approaches within the realm of spatio-temporal predictive learning, offering a unified perspective.

Self-Supervised Learning

Protein 3D Graph Structure Learning for Robust Structure-based Protein Property Prediction

no code implementations 14 Oct 2023 Yufei Huang, Siyuan Li, Jin Su, Lirong Wu, Odin Zhang, Haitao Lin, Jingqi Qi, Zihan Liu, Zhangyang Gao, Yuyang Liu, Jiangbin Zheng, Stan. ZQ. Li

To study this problem, we identify a Protein 3D Graph Structure Learning Problem for Robust Protein Property Prediction (PGSL-RP3), collect benchmark datasets, and present a protein Structure embedding Alignment Optimization framework (SAO) to mitigate the problem of structure embedding bias between the predicted and experimental protein structures.

Graph structure learning Property Prediction +2

Boosting the Power of Small Multimodal Reasoning Models to Match Larger Models with Self-Consistency Training

1 code implementation 23 Nov 2023 Cheng Tan, Jingxuan Wei, Zhangyang Gao, Linzhuang Sun, Siyuan Li, Xihong Yang, Stan Z. Li

Remarkably, we show that even smaller base models, when equipped with our proposed approach, can achieve results comparable to those of larger models, illustrating the potential of our approach in harnessing the power of rationales for improved multimodal reasoning.

Multimodal Reasoning

Masked Modeling for Self-supervised Representation Learning on Vision and Beyond

1 code implementation 31 Dec 2023 Siyuan Li, Luyuan Zhang, Zedong Wang, Di wu, Lirong Wu, Zicheng Liu, Jun Xia, Cheng Tan, Yang Liu, Baigui Sun, Stan Z. Li

As the deep learning revolution marches on, self-supervised learning has garnered increasing attention in recent years thanks to its remarkable representation learning ability and the low dependence on labeled data.

Representation Learning Self-Supervised Learning

Auxiliary Reward Generation with Transition Distance Representation Learning

no code implementations 12 Feb 2024 Siyuan Li, Shijie Han, Yingnan Zhao, By Liang, Peng Liu

To achieve automatic auxiliary reward generation, we propose a novel representation learning approach that can measure the "transition distance" between states.

Decision Making Reinforcement Learning (RL) +1

Switch EMA: A Free Lunch for Better Flatness and Sharpness

2 code implementations 14 Feb 2024 Siyuan Li, Zicheng Liu, Juanxi Tian, Ge Wang, Zedong Wang, Weiyang Jin, Di wu, Cheng Tan, Tao Lin, Yang Liu, Baigui Sun, Stan Z. Li

Exponential Moving Average (EMA) is a widely used weight averaging (WA) regularization to learn flat optima for better generalizations without extra cost in deep neural network (DNN) optimization.

Attribute Image Classification +7

Re-Dock: Towards Flexible and Realistic Molecular Docking with Diffusion Bridge

no code implementations 18 Feb 2024 Yufei Huang, Odin Zhang, Lirong Wu, Cheng Tan, Haitao Lin, Zhangyang Gao, Siyuan Li, Stan. Z. Li

Accurate prediction of protein-ligand binding structures, a task known as molecular docking, is crucial for drug design but remains challenging.

Molecular Docking

MAPE-PPI: Towards Effective and Efficient Protein-Protein Interaction Prediction via Microenvironment-Aware Protein Embedding

1 code implementation 22 Feb 2024 Lirong Wu, Yijun Tian, Yufei Huang, Siyuan Li, Haitao Lin, Nitesh V Chawla, Stan Z. Li

In addition, microenvironments defined in previous work are largely based on experimentally assayed physicochemical properties, for which the "vocabulary" is usually extremely small.

Computational Efficiency

FaceChain-ImagineID: Freely Crafting High-Fidelity Diverse Talking Faces from Disentangled Audio

1 code implementation 4 Mar 2024 Chao Xu, Yang Liu, Jiazheng Xing, Weida Wang, Mingze Sun, Jun Dan, Tianxin Huang, Siyuan Li, Zhi-Qi Cheng, Ying Tai, Baigui Sun

In this paper, we abstract the process of people hearing speech, extracting meaningful cues, and creating various dynamically audio-consistent talking faces, termed Listening and Imagining, into the task of high-fidelity diverse talking faces generation from a single audio.

Disentanglement
