no code implementations • 6 Dec 2024 • Jixun Yao, Yuguang Yang, Yu Pan, Ziqian Ning, Jianhao Ye, Hongbin Zhou, Lei Xie
Zero-shot voice conversion (VC) aims to transfer the timbre from the source speaker to an arbitrary unseen speaker while preserving the original linguistic content.
no code implementations • 28 Nov 2024 • Chenyu Tang, Ruizhi Zhang, Shuo Gao, Zihe Zhao, Zibo Zhang, Jiaqi Wang, Cong Li, Junliang Chen, Yanning Dai, Shengbo Wang, Ruoyu Juan, Qiaoying Li, Ruimou Xie, Xuhang Chen, Xinkai Zhou, Yunjia Xia, Jianan Chen, Fanghao Lu, Xin Li, Ningli Wang, Peter Smielewski, Yu Pan, Hubin Zhao, Luigi G. Occhipinti
At-home rehabilitation for post-stroke patients presents significant challenges, as continuous, personalized care is often limited outside clinical settings.
no code implementations • 27 Nov 2024 • Chenyu Tang, Shuo Gao, Cong Li, Wentian Yi, Yuxuan Jin, Xiaoxue Zhai, Sixuan Lei, Hongbei Meng, Zibo Zhang, Muzi Xu, Shengbo Wang, Xuhang Chen, Chenxi Wang, Hongyun Yang, Ningli Wang, Wenyu Wang, Jin Cao, Xiaodong Feng, Peter Smielewski, Yu Pan, Wenhui Song, Martin Birchall, Luigi G. Occhipinti
Wearable silent speech systems hold significant potential for restoring communication in patients with speech impairments.
no code implementations • 25 Nov 2024 • Dun Zeng, Zheshun Wu, Shiyu Liu, Yu Pan, Xiaoying Tang, Zenglin Xu
Through this framework, we show how the generalization of FL algorithms is affected by the interplay of algorithmic stability and optimization.
no code implementations • 19 Nov 2024 • Shang Liu, Yu Pan, Guanting Chen, Xiaocheng Li
In this paper, we propose a framework for learning RMs under ordinal feedback which generalizes the case of binary preference feedback to any arbitrary granularity.
no code implementations • 4 Nov 2024 • Yu Pan, Hongfeng Yu, Tianjiao Zhao, Jianxin Sun
A benchmark named InConDB is presented and extensive experiments are conducted to show the performance of different language models in enabling in-context database by varying the database encoding method, prompting method, operation type and input data distribution, revealing both the proficiency and limitations.
no code implementations • 4 Nov 2024 • Yu Pan, Yuguang Yang, Jixun Yao, Jianhao Ye, Hongbin Zhou, Lei Ma, Jianjun Zhao
Zero-shot voice conversion (VC) aims to transform the timbre of a source speaker into any previously unseen target speaker, while preserving the original linguistic content.
no code implementations • 31 Oct 2024 • Yu Pan, Jianxin Sun, Hongfeng Yu, Joe Luck, Geng Bai, Nipuna Chamara, Yufeng Ge, Tala Awada
Current agricultural data management and analysis paradigms are to a large extent traditional: data collection, curation, integration, loading, storage, sharing, and analysis still involve too much human effort and know-how.
no code implementations • 8 Oct 2024 • Gongxin Yao, Xinyang Li, Luowei Fu, Yu Pan
To this end, one of the key challenges is cross-modal place recognition, which involves retrieving 3D scenes (point clouds) from a LiDAR map according to online RGB images.
no code implementations • 2 Oct 2024 • Yuguang Yang, Yu Pan, Jixun Yao, Xiang Zhang, Jianhao Ye, Hongbin Zhou, Lei Xie, Lei Ma, Jianjun Zhao
Expressive zero-shot voice conversion (VC) is a critical and challenging task that aims to transform the source timbre into an arbitrary unseen speaker while preserving the original content and expressive qualities.
no code implementations • 18 Sep 2024 • Sijing Chen, Yuan Feng, Laipeng He, Tianwei He, Wendi He, Yanni Hu, Bin Lin, Yiting Lin, Yu Pan, Pengfei Tan, Chengwei Tian, Chen Wang, Zhicheng Wang, Ruoye Xie, Jixun Yao, Quanlei Yan, Yuguang Yang, Jianhao Ye, JingJing Yin, Yanzhen Yu, Huimin Zhang, Xiang Zhang, Guangcheng Zhao, Hongbin Zhou, Pengpeng Zou
In this report, we introduce Takin AudioLLM, a series of techniques and models, mainly including Takin TTS, Takin VC, and Takin Morphing, specifically designed for audiobook production.
no code implementations • 5 Aug 2024 • Gongxin Yao, Yixin Xuan, Xinyang Li, Yu Pan
Image-to-point cloud registration aims to determine the relative camera pose of an RGB image with respect to a point cloud.
no code implementations • 5 Aug 2024 • Gongxin Yao, Xinyang Li, Yixin Xuan, Yu Pan
Image-to-point cloud registration seeks to estimate the relative camera pose between an RGB image and a point cloud, which remains an open problem due to the gap between the two data modalities.
no code implementations • 31 Jul 2024 • An Wu, Yu Pan, Fuqi Zhou, Jinghui Yan, Chuanlu Liu
Persistent homology is an effective method for extracting topological information, represented as persistence diagrams, from spatially structured data.
no code implementations • 16 Jul 2024 • Jixun Yao, Qing Wang, Pengcheng Guo, Ziqian Ning, Yuguang Yang, Yu Pan, Lei Xie
Meanwhile, we propose a straightforward anonymization strategy that employs an empty embedding with zero values to simulate the speaker identity concealment process, eliminating the need for conversion to a pseudo-speaker identity and thereby reducing the complexity of the speaker anonymization process.
no code implementations • 27 Jun 2024 • Yixin Xuan, Xinyang Li, Gongxin Yao, Shiwei Zhou, Donghui Sun, Xiaoxin Chen, Yu Pan
High-fidelity reconstruction of 3D human avatars has wide applications in virtual reality.
no code implementations • 23 May 2024 • Hanzhao Wang, Yu Pan, Fupeng Sun, Shang Liu, Kalyan Talluri, Guanting Chen, Xiaocheng Li
In this paper, we consider the supervised pre-trained transformer for a class of sequential decision-making problems.
no code implementations • 20 May 2024 • Xinyang Li, Jiaxin Wang, Yixin Xuan, Gongxin Yao, Yu Pan
We propose GGAvatar, a novel 3D avatar representation designed to robustly model dynamic head avatars with complex identities and deformations.
no code implementations • 3 May 2024 • Yu Pan, Yuguang Yang, Heng Lu, Lei Ma, Jianjun Zhao
The continuous evolution of pre-trained speech models has greatly advanced Speech Emotion Recognition (SER).
no code implementations • 26 Apr 2024 • Xindi Zheng, Yuwei Wu, Yu Pan, WanYu Lin, Lei Ma, Jianjun Zhao
The crux of our work is that it admits both global and local representations of the input graph signal, which can capture the long-range dependencies.
1 code implementation • 24 Apr 2024 • Linyu Liu, Yu Pan, Xiaocheng Li, Guanting Chen
In this paper, we study the problem of uncertainty estimation and calibration for LLMs.
no code implementations • 3 Apr 2024 • Yu Pan, Xiang Zhang, Yuguang Yang, Jixun Yao, Yanni Hu, Jianhao Ye, Hongbin Zhou, Lei Ma, Jianjun Zhao
In this paper, we propose PSCodec, a series of neural speech codecs based on prompt encoders, comprising PSCodec-Base, PSCodec-DRL-ICT, and PSCodec-CasAN, which are capable of delivering high-performance speech reconstruction with low bandwidths.
no code implementations • 1 Feb 2024 • Maolin Wang, Yu Pan, Zenglin Xu, Ruocheng Guo, Xiangyu Zhao, Wanyu Wang, Yiqi Wang, Zitao Liu, Langming Liu
Our contributions encompass the introduction of a pioneering CDF-based TPP model, the development of a methodology for incorporating past event information into future event prediction, and empirical validation of CuFun's effectiveness through extensive experimentation on synthetic and real-world datasets.
1 code implementation • 17 Jan 2024 • Yu Pan, Ye Yuan, Yichun Yin, Jiaxin Shi, Zenglin Xu, Ming Zhang, Lifeng Shang, Xin Jiang, Qun Liu
The rapid progress of Transformers in artificial intelligence has come at the cost of increased resource consumption and greenhouse gas emissions due to growing model sizes.
no code implementations • 7 Nov 2023 • Yu Pan, Jianxin Sun, Hongfeng Yu, Geng Bai, Yufeng Ge, Joe Luck, Tala Awada
At the same time, the sheer amount of data poses a great challenge to data storage and analysis, and the de facto data management and analysis practices adopted by scientists have become increasingly inefficient.
no code implementations • 9 Oct 2023 • Xin Liu, Wei Li, Dazhi Zhan, Yu Pan, Xin Ma, Yu Ding, Zhisong Pan
Federated learning (FL) is a widely employed distributed paradigm for collaboratively training machine learning models from multiple clients without sharing local data.
no code implementations • 4 Oct 2023 • Dun Zeng, Zenglin Xu, Yu Pan, Xu Luo, Qifan Wang, Xiaoying Tang
Central to this process is the technique of unbiased client sampling, which ensures a representative selection of clients.
2 code implementations • 4 Oct 2023 • Dun Zeng, Zenglin Xu, Shiyu Liu, Yu Pan, Qifan Wang, Xiaoying Tang
Federated averaging (FedAvg) is the most fundamental algorithm in federated learning (FL).
no code implementations • 17 Sep 2023 • Jixun Yao, Yuguang Yang, Yi Lei, Ziqian Ning, Yanni Hu, Yu Pan, JingJing Yin, Hongbin Zhou, Heng Lu, Lei Xie
In this study, we propose PromptVC, a novel style voice conversion approach that employs a latent diffusion model to generate a style vector driven by natural language prompts.
no code implementations • 8 Aug 2023 • Yu Pan, Yuguang Yang, Yuheng Huang, Jixun Yao, JingJing Yin, Yanni Hu, Heng Lu, Lei Ma, Jianjun Zhao
Despite notable progress, speech emotion recognition (SER) remains challenging due to the intricate and ambiguous nature of speech emotion, particularly in the wild.
no code implementations • 14 Jul 2023 • Gongxin Yao, Yixin Xuan, YiWei Chen, Yu Pan
Image-to-point cloud registration aims to determine the relative camera pose between an RGB image and a reference point cloud, serving as a general solution for locating 3D objects from 2D observations.
no code implementations • 13 Jun 2023 • Yu Pan, Yanni Hu, Yuguang Yang, Wen Fei, Jixun Yao, Heng Lu, Lei Ma, Jianjun Zhao
Contrastive cross-modality pretraining has recently exhibited impressive success in diverse fields, whereas there is limited research on its merits in speech emotion recognition (SER).
no code implementations • 5 Jun 2023 • Maolin Wang, Yaoming Zhen, Yu Pan, Yao Zhao, Chenyi Zhuang, Zenglin Xu, Ruocheng Guo, Xiangyu Zhao
THNN is a faithful hypergraph modeling framework through high-order outer product feature message passing and is a natural tensor extension of the adjacency-matrix-based graph neural networks.
1 code implementation • 15 Mar 2023 • Yuguang Yang, Yu Pan, JingJing Yin, Jiangyu Han, Lei Ma, Heng Lu
SqueezeFormer has recently shown impressive performance in automatic speech recognition (ASR).
no code implementations • 3 Mar 2023 • YiWei Chen, Chen Jiang, Yu Pan
Single-Photon Image Super-Resolution (SPISR) aims to recover a high-resolution volumetric photon counting cube from a noisy low-resolution one by computational imaging algorithms.
1 code implementation • 22 Jan 2023 • Maolin Wang, Yu Pan, Zenglin Xu, Xiangli Yang, Guangxi Li, Andrzej Cichocki
Interestingly, although these two types of networks originate from different observations, they are inherently linked through the common multilinearity structure underlying both TNs and NNs, thereby motivating a significant number of intellectual developments regarding combinations of TNs and NNs.
1 code implementation • 5 Dec 2022 • Yuguang Yang, Yu Pan, JingJing Yin, Heng Lu
This paper proposes a Learnable Multiplicative absolute position Embedding based Conformer (LMEC).
1 code implementation • 28 May 2022 • Yu Pan, Zeyong Su, Ao Liu, Jingquan Wang, Nannan Li, Zenglin Xu
To address this problem, we propose a universal weight initialization paradigm, which generalizes Xavier and Kaiming methods and can be widely applicable to arbitrary TCNNs.
no code implementations • 23 May 2022 • Lei Zhang, Yu Pan, Yi Liu, Qibin Zheng, Zhisong Pan
To improve the defender's defense ability, a game model based on reward randomization reinforcement learning is proposed.
no code implementations • 16 May 2022 • Lei Zhang, Yu Pan, Yi Liu, Qibin Zheng, Zhisong Pan
Following that, we propose a user permission reasoning method based on reinforcement learning.
no code implementations • 17 Feb 2022 • Jingquan Wang, Jing Xu, Yu Pan, Zenglin Xu
Few-shot learning aims to classify unseen classes with only a limited number of labeled data.
2 code implementations • 7 Jan 2022 • YiWei Chen, Gongxin Yao, Yong Liu, Hongye Su, Xiaomin Hu, Yu Pan
Photon-efficient imaging with single-photon light detection and ranging (LiDAR) captures the three-dimensional (3D) structure of a scene using only a few detected signal photons per pixel.
2 code implementations • 5 Jan 2022 • Gongxin Yao, YiWei Chen, Yong Liu, Xiaomin Hu, Yu Pan
Single-photon light detection and ranging (LiDAR) has been widely applied to 3D imaging in challenging scenarios.
no code implementations • 16 Nov 2021 • Yu Pan, Kwo-Sen Kuo, Michael L. Rilee, Hongfeng Yu
Deep Neural Networks (DNNs) have performed admirably in classification tasks.
no code implementations • 18 Oct 2021 • Langzhang Liang, Cuiyun Gao, Shiyi Chen, Shishi Duan, Yu Pan, Junjin Zheng, Lei Wang, Zenglin Xu
Graph Convolutional Networks (GCNs) are powerful for processing graph-structured data and have achieved state-of-the-art performance in several tasks such as node classification, link prediction, and graph classification.
no code implementations • 18 Sep 2021 • Yuguang Yang, Yu Pan, Xin Dong, Minqiang Xu
Second, we design a novel model inference scheme based on RepVGG which can efficiently improve the QbE search quality.
1 code implementation • 19 Aug 2021 • YiWei Chen, Yu Pan, Daoyi Dong
We prove that such a rule is much more relaxed than that of TT, which means ResTT can easily address the vanishing and exploding gradient problem present in existing TT models.
no code implementations • 10 May 2021 • Xinglin Pan, Jing Xu, Yu Pan, Liangjian Wen, WenXiang Lin, Kun Bai, Zenglin Xu
Convolutional Neural Networks (CNNs) have achieved tremendous success in a number of learning tasks including image classification.
1 code implementation • 11 Apr 2021 • Yu Pan, Maolin Wang, Zenglin Xu
Tensor Decomposition Networks (TDNs) prevail for their inherent compact architectures.
1 code implementation • 8 Mar 2021 • YiWei Chen, Yu Pan, Guofeng Zhang, Shuming Cheng
Quantum properties, such as entanglement and coherence, are indispensable resources in various quantum information processing tasks.
no code implementations • 25 Jan 2021 • Shibo Zhou, Yu Pan
Because time series data typically contain considerable noise, which degrades network training, the original data are usually filtered before the network is trained.
14 code implementations • 3 Jan 2021 • Jing Xu, Yu Pan, Xinglin Pan, Steven Hoi, Zhang Yi, Zenglin Xu
The ResNet and its variants have achieved remarkable successes in various computer vision tasks.
Ranked #3 on Medical Image Classification on NCT-CRC-HE-100K
no code implementations • 1 Jan 2021 • Xinglin Pan, Jing Xu, Yu Pan, WenXiang Lin, Liangjian Wen, Zenglin Xu
Convolutional Neural Networks (CNNs) have achieved tremendous success in a number of learning tasks, e.g., image classification.
no code implementations • 22 Sep 2020 • Nannan Li, Yu Pan, Yaran Chen, Zixiang Ding, Dongbin Zhao, Zenglin Xu
Interestingly, we discover that some of the rank elements are sensitive and usually aggregate in a narrow region, namely an interest region.
no code implementations • 23 Aug 2020 • Yi-Wei Chen, Yu Pan, Daoyi Dong
Quantum Language Models (QLMs), in which words are modelled as a quantum superposition of sememes, have demonstrated a high level of model transparency and good post-hoc interpretability.
2 code implementations • 8 Jul 2020 • Junhua Zou, Yexin Duan, Boyu Li, Wu Zhang, Yu Pan, Zhisong Pan
Fast gradient sign attack series are popular methods that are used to generate adversarial examples.
16 code implementations • CVPR 2020 • Holger Caesar, Varun Bankiti, Alex H. Lang, Sourabh Vora, Venice Erin Liong, Qiang Xu, Anush Krishnan, Yu Pan, Giancarlo Baldan, Oscar Beijbom
Most autonomous vehicles, however, carry a combination of cameras and range sensors such as lidar and radar.
Ranked #316 on 3D Object Detection on nuScenes (using extra training data)
1 code implementation • NIPS Workshop CDNNRIA 2018 • Yu Pan, Jing Xu, Maolin Wang, Jinmian Ye, Fei Wang, Kun Bai, Zenglin Xu
Recurrent Neural Networks (RNNs) and their variants, such as Long Short-Term Memory (LSTM) networks and Gated Recurrent Unit (GRU) networks, have achieved promising performance in sequential data modeling.