1 code implementation • 23 Mar 2025 • Hongyu Yan, Zijun Li, Kunming Luo, Li Lu, Ping Tan
Second, SGFormer leverages the geometric features of partial-missing pairs as the explicit symmetric guidance that can constrain the refinement process for initial point clouds.
no code implementations • 31 Jan 2025 • Zhanpeng Luo, Linna Wang, Guangwu Qian, Li Lu
Motivated by the teacher-student learning strategy of knowledge distillation, we design a knowledge transfer approach for 3D shape completion.
no code implementations • 15 Jan 2025 • Qianniu Chen, Xiaoyang Hao, Bowen Li, Yue Liu, Li Lu
Furthermore, we present a two-stage self-distillation framework that constructs parallel data pairs for effectively disentangling linguistic content and speakers from the perspective of training data.
1 code implementation • 3 Aug 2024 • Peng Cheng, Yuwei Wang, Peng Huang, Zhongjie Ba, Xiaodong Lin, Feng Lin, Li Lu, Kui Ren
Based on the ALIF pipeline, we present the ALIF-OTL and ALIF-OTA schemes for launching attacks in both the digital domain and the physical playback environment on four commercial ASRs and voice assistants.
Automatic Speech Recognition (ASR)
1 code implementation • 4 Mar 2024 • Zhongjie Ba, Qingyu Liu, Zhenguang Liu, Shuang Wu, Feng Lin, Li Lu, Kui Ren
In this paper, we try to tackle these challenges through three designs: (1) We present a novel framework to capture broader forgery clues by extracting multiple non-overlapping local representations and fusing them into a global semantic-rich feature.
1 code implementation • 20 Oct 2023 • Xinyu Zhang, Qingyu Liu, Zhongjie Ba, Yuan Hong, Tianhang Zheng, Feng Lin, Li Lu, Kui Ren
In this paper, we first conduct a comprehensive study on prior FL attacks and detection methods.
no code implementations • 10 Nov 2022 • Meng Chen, Li Lu, Jiadi Yu, Yingying Chen, Zhongjie Ba, Feng Lin, Kui Ren
In this paper, we propose a voice de-identification system, which uses adversarial examples to balance the privacy and utility of voice services.
1 code implementation • 8 Oct 2022 • Xuejun Yan, Hongyu Yan, Jingjing Wang, Hang Du, Zhihong Wu, Di Xie, ShiLiang Pu, Li Lu
The rapid development of point cloud learning has driven point cloud completion into a new era.
no code implementations • 21 Mar 2022 • Zewang Zhang, Yibin Zheng, Xinhui Li, Li Lu
To improve the accuracy and naturalness of the synthesized singing voice, we design several specialized modules and techniques: 1) A deep bi-directional LSTM-based duration model with a multi-scale rhythm loss and a post-processing step; 2) A Transformer-like acoustic model with a progressive pitch-weighted decoder loss; 3) A 24 kHz pitch-aware LPCNet neural vocoder to produce high-quality singing waveforms; 4) A novel data augmentation method with multi-singer pre-training for stronger robustness and naturalness.
no code implementations • 28 Sep 2021 • Shilun Lin, Wenchao Su, Li Meng, Fenglong Xie, Xinhui Li, Li Lu
Thirdly, a duration predictor, instead of an attention model, connects the above hybrid encoder and decoder.
1 code implementation • ICCV 2021 • Haoxi Ran, Wei Zhuo, Jun Liu, Li Lu
We further verify the expandability of RPNet, in terms of both depth and width, on the tasks of classification and segmentation.
Ranked #21 on Semantic Segmentation on S3DIS
2 code implementations • 4 Jul 2021 • Mingbo Hong, Shuiwang Li, Yuchao Yang, Feiyu Zhu, Qijun Zhao, Li Lu
With the increasing demand for search and rescue, there is a strong need to detect objects of interest in large-scale images captured by Unmanned Aerial Vehicles (UAVs), which is challenging due to the extremely small scales of the objects.
no code implementations • 1 May 2021 • Shuiwang Li, Qijun Zhao, Ziliang Feng, Li Lu
On the surface, correlation filter and convolution filter are usually used for different purposes.
no code implementations • 23 Apr 2021 • Cheng Luo, Dayiheng Liu, Chanjuan Li, Li Lu, Jiancheng Lv
The system includes modules such as dialogue topic prediction, knowledge matching and dialogue generation.
no code implementations • 30 Jan 2021 • Shilun Lin, Fenglong Xie, Li Meng, Xinhui Li, Li Lu
In this work, a robust and efficient text-to-speech (TTS) synthesis system named Triple M is proposed for large-scale online application.
1 code implementation • 29 Nov 2020 • Haoxi Ran, Li Lu
Specifically, SepNet achieves state-of-the-art results on the tasks of classification and segmentation on most of the datasets.
no code implementations • 24 Nov 2020 • Haoxi Ran, Guangfu Wang, Li Lu
Human imitation has become topical recently, driven by GAN's ability to disentangle human pose and body content.
no code implementations • 8 Oct 2020 • Cheng Shen, Jianghua Ying, Le Liu, Jianpeng Liu, Na Li, Shuopei Wang, Jian Tang, Yanchong Zhao, Yanbang Chu, Kenji Watanabe, Takashi Taniguchi, Rong Yang, Dongxia Shi, Fanming Qu, Li Lu, Wei Yang, Guangyu Zhang
For θ = 1.25°, we observe the emergence of topological insulating states on the hole side with a sequence of Chern numbers |C| = 4 - |v|, where v is the number of electrons (holes) per moiré unit cell.
Mesoscale and Nanoscale Physics, Materials Science