no code implementations • ICML 2020 • Yonggang Zhang, Ya Li, Tongliang Liu, Xinmei Tian
To obtain sufficient knowledge for crafting adversarial examples, previous methods query the target model with inputs that are perturbed with different searching directions.
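As a rough illustration of what "querying the target model with inputs perturbed along different search directions" can look like, the sketch below uses a generic random-direction finite-difference gradient estimator. It is not the method of this paper or of any specific prior work; `estimate_gradient`, `toy_model_loss`, and all parameter values are hypothetical names chosen for the example, and the "model" is a toy linear function.

```python
import random

def estimate_gradient(query, x, num_directions=500, sigma=1e-3, rng=None):
    """Estimate the gradient of a black-box `query` at `x` using only queries."""
    rng = rng or random.Random(0)
    dim = len(x)
    grad = [0.0] * dim
    for _ in range(num_directions):
        # Draw a random Gaussian search direction.
        u = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Query the black box at two points perturbed along +u and -u.
        plus = query([xi + sigma * ui for xi, ui in zip(x, u)])
        minus = query([xi - sigma * ui for xi, ui in zip(x, u)])
        # Central finite difference gives the slope along u.
        slope = (plus - minus) / (2.0 * sigma)
        for i in range(dim):
            grad[i] += slope * u[i] / num_directions
    return grad

def toy_model_loss(x):
    # Hypothetical stand-in for a target model's loss: a fixed linear score.
    weights = [2.0, -1.0, 0.5]
    return sum(w * xi for w, xi in zip(weights, x))

grad = estimate_gradient(toy_model_loss, [0.1, 0.2, 0.3])
print(grad)  # a noisy estimate of the true gradient [2.0, -1.0, 0.5]
```

Each direction costs two queries, so the query budget grows linearly with `num_directions`; that cost is the kind of overhead query-based attacks try to reduce.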
1 code implementation • EMNLP 2021 • Dan Liu, Mengge Du, Xiaoxi Li, Ya Li, Enhong Chen
This paper proposes a novel architecture, Cross Attention Augmented Transducer (CAAT), for simultaneous translation.
Automatic Speech Recognition (ASR)
no code implementations • 5 Mar 2025 • Keqi Chen, Zekai Sun, Yuhua Wen, Huijun Lian, Yingming Gao, Ya Li
Psy-Insight is not only suitable for tasks such as label recognition but also meets the need for training LLMs to act as empathetic counselors through logical reasoning.
no code implementations • 5 Mar 2025 • Keqi Chen, Zekai Sun, Huijun Lian, Yingming Gao, Ya Li
Moreover, we have developed Psy-Copilot, which is a conversational AI assistant designed to assist human psychological therapists in their consultations.
1 code implementation • 18 Aug 2024 • Qifei Li, Yingming Gao, Yuhua Wen, Cong Wang, Ya Li
To address the limitation in multimodal emotion recognition (MER) performance arising from inter-modal information fusion, we propose a novel MER framework based on multitask learning where fusion occurs after alignment, called Foal-Net.
no code implementations • 9 Jun 2024 • Bingsong Bai, Fengping Wang, Yingming Gao, Ya Li
Diffusion-based singing voice conversion (SVC) models have shown better synthesis quality compared to traditional methods.
no code implementations • 8 Jun 2024 • Songjie Yang, Yizhou Peng, Wanting Lyu, Ya Li, Hongjun He, Zhongpei Zhang, Chau Yuen
Building on this, we propose four estimation frameworks using the sparse recovery theory: polar-domain estimation (PD-E), multi-angular-domain estimation (MAD-E), two-stage polar-angular-domain estimation (TS-PAD-E), and two-dimensional polar-angular-domain estimation (2D-PAD-E).
no code implementations • 6 Jun 2024 • Jinlong Xue, Yayue Deng, Yicheng Han, Yingming Gao, Ya Li
Recent advances in large language models (LLMs) and the development of audio codecs have greatly propelled zero-shot TTS.
no code implementations • 18 Mar 2024 • Yue Ding, Hongqiao Shi, Shuang Song, Yonghui Wang, Ya Li
The integration of local elements into shape contours is critical for target detection and identification in cluttered scenes.
1 code implementation • 2 Jan 2024 • Jinlong Xue, Yayue Deng, Yingming Gao, Ya Li
Drawing inspiration from state-of-the-art Text-to-Image (T2I) diffusion models, we introduce Auffusion, a text-to-audio (TTA) system that adapts T2I model frameworks to the TTA task by effectively leveraging their inherent generative strengths and precise cross-modal alignment.
Ranked #6 on Audio Generation on AudioCaps (FAD metric)
no code implementations • 2 Jan 2024 • Wanting Lyu, Songjie Yang, Yue Xiu, Ya Li, Hongjun He, Chau Yuen, Zhongpei Zhang
In this paper, a reconfigurable intelligent surface (RIS) is employed in a millimeter wave (mmWave) integrated sensing and communications (ISAC) system.
1 code implementation • 27 Dec 2023 • Qifei Li, Yingming Gao, Cong Wang, Yayue Deng, Jinlong Xue, Yichen Han, Ya Li
To address this problem, we propose a frame-level emotional state alignment method for SER.
no code implementations • 26 Dec 2023 • Zi-Feng Mai, Chang-Dong Wang, Zhongjie Zeng, Ya Li, Jiaquan Chen, Philip S. Yu
To settle the above challenges, we propose a novel method HEKP4NBR, which transforms the knowledge graph (KG) into prompts, namely Knowledge Tree Prompt (KTP), to help PLM encode the OOV item IDs in the user's basket sequence.
no code implementations • 16 Dec 2023 • Yayue Deng, Jinlong Xue, Yukang Jia, Qifei Li, Yichen Han, Fengping Wang, Yingming Gao, Dengfeng Ke, Ya Li
In this paper, we introduce a contrastive learning-based CSS framework, CONCSS.
no code implementations • 19 Oct 2023 • Yizhou Peng, Songjie Yang, Wanting Lyu, Ya Li, Hongjun He, Zhongpei Zhang, Chadi Assi
In this letter, a weighted minimum mean square error (WMMSE) empowered integrated sensing and communication (ISAC) system is investigated.
no code implementations • 5 Jun 2023 • Dengfeng Ke, Yayue Deng, Yukang Jia, Jinlong Xue, Qi Luo, Ya Li, Jianqing Sun, Jiaen Liang, Binghuai Lin
Regressive Text-to-Speech (TTS) systems use an attention mechanism to generate the alignment between the text and the acoustic feature sequence.
no code implementations • 3 May 2023 • Jinlong Xue, Yayue Deng, Fengping Wang, Ya Li, Yingming Gao, JianHua Tao, Jianqing Sun, Jiaen Liang
However, it is still a challenge to comprehensively model the conversation, and a majority of conversational TTS systems only focus on extracting global information and omit local prosody features, which contain important fine-grained information like keywords and emphasis.
no code implementations • 9 Mar 2023 • Caiyuan Chu, Ya Li, Yifan Liu, Jia-Chen Gu, Quan Liu, Yongxin Ge, Guoping Hu
The key to automatic intention induction is that, for any given set of new data, the sentence representation obtained by the model can be well distinguished from different labels.
no code implementations • 15 Feb 2023 • Liting Lyu, Zhifeng Wang, Haihong Yun, Zexue Yang, Ya Li
Then, the spatial features are connected with the original students' exercise features as joint learning features.
no code implementations • 7 Oct 2022 • Yichen Han, Ya Li, Yingming Gao, Jinlong Xue, Songpo Wang, Lei Yang
Then we used keypoint decomposition to extract video synthesis controlling parameters from the backend output and the source image.
1 code implementation • 29 Sep 2022 • Chenghao Sun, Yonggang Zhang, Wan Chaoqun, Qizhou Wang, Ya Li, Tongliang Liu, Bo Han, Xinmei Tian
As it is hard to mitigate the approximation error with few available samples, we propose Error TransFormer (ETF) for lightweight attacks.
1 code implementation • 20 Mar 2022 • Jinlong Xue, Yayue Deng, Yichen Han, Ya Li, Jianqing Sun, Jiaen Liang
In recent years, neural network based methods for multi-speaker text-to-speech synthesis (TTS) have made significant progress.
1 code implementation • CVPR 2020 • Hongjun Wang, Guangrun Wang, Ya Li, Dongyu Zhang, Liang Lin
Examining the robustness of ReID systems is important because insecure ReID systems may cause severe losses, e.g., criminals may use adversarial perturbations to cheat CCTV systems.
no code implementations • 23 Oct 2019 • Zheng Lian, Ya Li, Jian-Hua Tao, Jian Huang
It outperforms the baseline system, which is optimized without the contrastive loss function, by 1.14% and 2.55% in weighted accuracy and unweighted accuracy, respectively.
no code implementations • 23 Oct 2019 • Zheng Lian, Ya Li, Jian-Hua Tao, Jian Huang, Ming-Yue Niu
To sum up, the contributions of this paper lie in two areas: 1) We visualize concerned areas of human faces in emotion recognition; 2) We analyze the contribution of different face areas to different emotions in real-world conditions through experimental analysis.
no code implementations • 3 Apr 2019 • Ya Li, Xinmei Tian, Tongliang Liu, DaCheng Tao
The objective of our proposed method is to transform the features from different tasks into a common feature space in which the tasks are closely related and the shared parameters can be better optimized.
no code implementations • 31 Jan 2019 • Ya Li, Xinyu Liu, Dan Liu, Xueqiang Zhang, Junhua Liu
Recent years have witnessed dramatic progress in neural machine translation (NMT); however, methods for manually guiding the translation procedure remain underexplored.
no code implementations • 11 Nov 2018 • Zheng Lian, Ya Li, Jian-Hua Tao, Jian Huang
I have submitted a new version to arXiv:1910.13806.
1 code implementation • 13 Sep 2018 • Zheng Lian, Ya Li, Jian-Hua Tao, Jian Huang
We tested our method in the EmotiW 2018 challenge and obtained promising results.
no code implementations • ECCV 2018 • Ya Li, Xinmei Tian, Mingming Gong, Yajing Liu, Tongliang Liu, Kun Zhang, DaCheng Tao
Under the assumption that the conditional distribution $P(Y|X)$ remains unchanged across domains, earlier approaches to domain generalization learned the invariant representation $T(X)$ by minimizing the discrepancy of the marginal distribution $P(T(X))$.
Ranked #76 on Domain Generalization on PACS
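The snippet above does not say which discrepancy measure the earlier approaches minimized. As one illustrative choice, the sketch below computes a biased estimate of the squared maximum mean discrepancy (MMD) with an RBF kernel between feature samples from two domains. The function names, the kernel bandwidth `gamma`, and the synthetic Gaussian "domains" are all assumptions made for the example.

```python
import math
import random

def rbf(x, y, gamma=1.0):
    # Gaussian (RBF) kernel between two feature vectors.
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def mean_kernel(xs, ys, gamma=1.0):
    # Average kernel value over all pairs (x, y).
    return sum(rbf(x, y, gamma) for x in xs for y in ys) / (len(xs) * len(ys))

def mmd2(xs, ys, gamma=1.0):
    # Biased estimate of the squared MMD between two samples.
    return (mean_kernel(xs, xs, gamma)
            + mean_kernel(ys, ys, gamma)
            - 2.0 * mean_kernel(xs, ys, gamma))

def sample(rng, mean, n=100, dim=2):
    # Synthetic stand-in for features T(X) drawn from one domain.
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]

rng = random.Random(0)
domain_a = sample(rng, 0.0)
domain_b = sample(rng, 0.0)  # same marginal as domain_a
domain_c = sample(rng, 3.0)  # shifted marginal

same = mmd2(domain_a, domain_b)
shifted = mmd2(domain_a, domain_c)
print(f"same marginal: {same:.4f}, shifted marginal: {shifted:.4f}")
```

A representation learner in this style would add such a term to its training loss so that features from different domains become harder to tell apart; note that matching marginals alone says nothing about $P(Y|X)$, which is why the assumption in the snippet matters.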
1 code implementation • 23 Jul 2018 • Ya Li, Mingming Gong, Xinmei Tian, Tongliang Liu, DaCheng Tao
With the conditional invariant representation, the invariance of the joint distribution $\mathbb{P}(h(X), Y)$ can be guaranteed if the class prior $\mathbb{P}(Y)$ does not change across training and test domains.
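The guarantee stated above follows from factorizing the joint distribution. As a short worked step, write $\mathbb{P}^{s}$ and $\mathbb{P}^{t}$ for a training and a test domain (this superscript notation is an assumption for illustration, not taken from the paper). If the class-conditional distribution of the representation is invariant, $\mathbb{P}^{s}(h(X) \mid Y) = \mathbb{P}^{t}(h(X) \mid Y)$, and the class prior is unchanged, $\mathbb{P}^{s}(Y) = \mathbb{P}^{t}(Y)$, then

$$
\mathbb{P}^{s}(h(X), Y) = \mathbb{P}^{s}(h(X) \mid Y)\,\mathbb{P}^{s}(Y) = \mathbb{P}^{t}(h(X) \mid Y)\,\mathbb{P}^{t}(Y) = \mathbb{P}^{t}(h(X), Y).
$$

If the class prior does shift, the middle equality fails even when the conditional is invariant, which is why the prior condition appears in the statement.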
4 code implementations • 13 Jan 2017 • Keze Wang, Dongyu Zhang, Ya Li, Ruimao Zhang, Liang Lin
In this paper, we propose a novel active learning framework, which is capable of building a competitive classifier with optimal feature representation via a limited amount of labeled training instances in an incremental learning manner.
no code implementations • 15 Apr 2016 • Guangrun Wang, Liang Lin, Shengyong Ding, Ya Li, Qing Wang
The past decade has witnessed the rapid development of feature representation learning and distance metric learning, whereas the two steps are often discussed separately.
Ranked #7 on Person Re-Identification on SYSU-30k (using extra training data)
no code implementations • 28 Mar 2016 • Linlin Chao, Jian-Hua Tao, Minghao Yang, Ya Li, Zhengqi Wen
The other one is locating and re-weighting the perception attentions in the whole audio-visual stream for better recognition.
no code implementations • 8 Aug 2015 • Zhanglin Peng, Ya Li, Zhaoquan Cai, Liang Lin
In each layer, we construct a dictionary of filters by combining the filters from the lower layer, and iteratively optimize the image representation with a joint discriminative-generative formulation, i.e., minimization of the empirical classification error plus regularization of analysis image generation over the training images.
no code implementations • 20 Mar 2014 • Yunpeng Li, Ya Li, Jie Liu, Yong Deng
The results of defuzzification at the first step do not coincide with the results of defuzzification at the final step. It seems that the alternative is to defuzzify at the final step in fuzzy DEMATEL.