1 code implementation • Findings (EMNLP) 2021 • Lei He, Suncong Zheng, Tao Yang, Feng Zhang
In this work, we propose to incorporate KG (including both entities and relations) into the language learning process to obtain KG-enhanced pretrained Language Model, namely KLMo.
no code implementations • 9 May 2022 • Xu Tan, Jiawei Chen, Haohe Liu, Jian Cong, Chen Zhang, Yanqing Liu, Xi Wang, Yichong Leng, YuanHao Yi, Lei He, Frank Soong, Tao Qin, Sheng Zhao, Tie-Yan Liu
In this paper, we answer these questions by first defining the human-level quality based on the statistical significance of subjective measure and introducing appropriate guidelines to judge it, and then developing a TTS system called NaturalSpeech that achieves human-level quality on a benchmark dataset.
Ranked #1 on Text-To-Speech Synthesis on LJSpeech
no code implementations • 1 Apr 2022 • Yihan Wu, Xu Tan, Bohan Li, Lei He, Sheng Zhao, Ruihua Song, Tao Qin, Tie-Yan Liu
We model the speaker characteristics systematically to improve the generalization on new speakers.
no code implementations • 8 Feb 2022 • Zehua Chen, Xu Tan, Ke Wang, Shifeng Pan, Danilo Mandic, Lei He, Sheng Zhao
In this paper, we propose InferGrad, a diffusion model for vocoder that incorporates inference process into training, to reduce the inference iterations while maintaining high generation quality.
no code implementations • 20 Jan 2022 • J. Yang, Lei He
To further improve the speaker similarity, joint training with a speaker classifier is proposed.
1 code implementation • NeurIPS 2021 • Yuan Liang, Weikun Han, Liang Qiu, Chen Wu, Yiting Shao, Kun Wang, Lei He
In this work, we pioneer the study of deep learning for dental forensic identification based on panoramic radiographs.
no code implementations • 4 Nov 2021 • Zhuofu Tao, Chen Wu, Yuan Liang, Lei He
In this work, we propose LW-GCN, a lightweight FPGA-based accelerator with a software-hardware co-designed process to tackle irregularity in computation and memory access in GCN inference.
1 code implementation • 28 Oct 2021 • Moyun Liu, Youping Chen, Lei He, Yang Zhang, Jingming Xie
To further demonstrate the ability of our method, we test it on the public dataset MS COCO, and the results show that our LF-YOLO has outstanding and versatile detection performance.
1 code implementation • 25 Oct 2021 • Yanqing Liu, Zhihang Xu, Gang Wang, Kuan Chen, Bohan Li, Xu Tan, Jinzhu Li, Lei He, Sheng Zhao
The goal of this challenge is to synthesize natural and high-quality speech from text, and we approach this goal from two perspectives: the first is to directly model and generate waveforms at a 48 kHz sampling rate, which brings higher perceptual quality than previous systems with a 16 kHz or 24 kHz sampling rate; the second is to model the variation information in speech through a systematic design, which improves the prosody and naturalness.
no code implementations • 19 Oct 2021 • Mutian He, Jingzhou Yang, Lei He, Frank K. Soong
End-to-end TTS suffers from high data requirements as it is difficult for both costly speech corpora to cover all necessary knowledge and neural models to learn the knowledge, hence additional knowledge needs to be injected manually.
no code implementations • 30 Aug 2021 • Yuan Liang, Weinan Song, Jiawei Yang, Liang Qiu, Kun Wang, Lei He
Different from single object reconstruction from photos, this task has the unique challenge of constructing multiple objects at high resolutions.
no code implementations • 27 Jul 2021 • Shifeng Pan, Lei He
Secondly, in these models the content/text, prosody, and speaker timbre are usually highly entangled; it is therefore unrealistic to expect a satisfactory result when freely combining these components, for example when transferring speaking style between speakers.
1 code implementation • 21 Jul 2021 • Jiawei Yang, Yao Zhang, Yuan Liang, Yang Zhang, Lei He, Zhiqiang He
Experiments on the kidney tumor segmentation task demonstrate that TumorCP surpasses the strong baseline by a remarkable margin of 7.12% on tumor Dice.
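For readers unfamiliar with the metric, the following is a minimal sketch of the standard Dice coefficient on binary masks. It is not the TumorCP evaluation code; the function name `dice_score`, the flattened 0/1 mask representation, and the convention of returning 1.0 for two empty masks are assumptions made for illustration.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice coefficient 2*|A ∩ B| / (|A| + |B|) for flattened binary masks.

    Hypothetical helper for illustration; nonzero entries count as foreground.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")

    # Voxels marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground size of the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)

    # Assumed convention: two empty masks agree perfectly.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    prediction = [1, 1, 0, 0]
    ground_truth = [1, 0, 0, 0]
    print(dice_score(prediction, ground_truth))
```

A score of 1.0 means the predicted and reference masks coincide, and 0.0 means they share no foreground voxels.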
no code implementations • 14 Jul 2021 • Lei He, Shengjie Jiang, Xiaoqing Liang, Ning Wang, Shiyu Song
Compared to traditional methods based on object detectors, the essential design in our work is a parallel feature difference calculation structure that infers map changes by comparing features extracted from the camera and rasterized images.
no code implementations • 8 Jun 2021 • Liping Chen, Yan Deng, Xi Wang, Frank K. Soong, Lei He
Experimental results obtained by the Transformer TTS show that the proposed BERT can extract fine-grained, segment-level prosody, which is complementary to utterance-level prosody to improve the final prosody of the TTS speech.
no code implementations • 27 Apr 2021 • Rui Zhao, Jian Xue, Jinyu Li, Wenning Wei, Lei He, Yifan Gong
The first challenge is solved with a splicing data method which concatenates the speech segments extracted from the source domain data.
no code implementations • 8 Apr 2021 • Fengpeng Yue, Yan Deng, Lei He, Tom Ko
Machine Speech Chain, which integrates both end-to-end (E2E) automatic speech recognition (ASR) and text-to-speech (TTS) into one circle for joint training, has been proven to be effective in data augmentation by leveraging large amounts of unpaired data.
2 code implementations • 5 Mar 2021 • Mutian He, Jingzhou Yang, Lei He, Frank K. Soong
To scale neural speech synthesis to various real-world languages, we present a multilingual end-to-end framework that maps byte inputs to spectrograms, thus allowing arbitrary input scripts.
no code implementations • 2 Feb 2021 • Yuan Liang, Weinan Song, Jiawei Yang, Liang Qiu, Kun Wang, Lei He
Second, we can largely boost the robustness of existing ConvNets, proved by: (i) testing on scans with synthetic pathologies, and (ii) training and evaluation on scans of different scanning setups across datasets.
no code implementations • 19 Jan 2021 • Lei He, Jiwen Lu, Guanghui Wang, Shiyu Song, Jie Zhou
In this paper, we first introduce the concept of semantic objectness to exploit the geometric relationship of these two tasks through an analysis of the imaging process, then propose a Semantic Object Segmentation and Depth Estimation Network (SOSD-Net) based on the objectness assumption.
Ranked #39 on Semantic Segmentation on NYU Depth v2
no code implementations • 23 Dec 2020 • Jiawei Yang, Yuan Liang, Yao Zhang, Weinan Song, Kun Wang, Lei He
The ability of deep learning to predict with uncertainty is recognized as key for its adoption in clinical routines.
no code implementations • 30 Jul 2020 • Jinyu Li, Rui Zhao, Zhong Meng, Yanqing Liu, Wenning Wei, Sarangarajan Parthasarathy, Vadim Mazalov, Zhenghao Wang, Lei He, Sheng Zhao, Yifan Gong
Because of its streaming nature, recurrent neural network transducer (RNN-T) is a very promising end-to-end (E2E) model that may replace the popular hybrid model for automatic speech recognition.
no code implementations • 13 Jun 2020 • Shengyun Peng, Yunxuan Yu, Kun Wang, Lei He
Specifically, a target object is defined by a bounding box center, tracking offset, and object size.
no code implementations • 18 Mar 2020 • Weinan Song, Yuan Liang, Jiawei Yang, Kun Wang, Lei He
In this paper, we propose a framework, named Oral-3D, to reconstruct the 3D oral cavity from a single PX image and prior information of the dental arch.
no code implementations • 19 Feb 2020 • Weinan Song, Yuan Liang, Jiawei Yang, Kun Wang, Lei He
The encoder-decoder network is widely used to learn deep feature representations from pixel-wise annotations in biomedical image analysis.
no code implementations • 7 Jan 2020 • Yinqiu Liu, Kai Qian, Jianli Chen, Kun Wang, Lei He
As an emerging technology, blockchain has achieved great success in numerous application scenarios, from intelligent healthcare to smart cities.
Cryptography and Security • Distributed, Parallel, and Cluster Computing • 68M14 • C.2.2
no code implementations • 10 Oct 2019 • Yuan Liang, Weinan Song, J. P. Dym, Kun Wang, Lei He
Label propagation is a popular technique for anatomical segmentation.
no code implementations • 4 Oct 2019 • Lei He, Arthur Guijt, Mathijs de Weerdt, Lining Xing, Neil Yorke-Smith
Sparrow integrates the exploration ability of BRKGA and the exploitation ability of ALNS.
no code implementations • 31 Aug 2019 • Lei He
Inspired by the great success of convolutional neural networks on structural data like videos and images, graph neural networks (GNNs) have emerged as a powerful approach to processing non-Euclidean data structures and have proved effective in various application domains such as social networks, e-commerce, and knowledge graphs.
Distributed, Parallel, and Cluster Computing
1 code implementation • 18 Jul 2019 • Yibin Zheng, Xi Wang, Lei He, Shifeng Pan, Frank K. Soong, Zhengqi Wen, Jian-Hua Tao
Experimental results show that our proposed methods, especially the second one (bidirectional decoder regularization), lead to a significant improvement in both robustness and overall naturalness, outperforming the baseline (the revised version of Tacotron2) with a MOS gap of 0.14 in a challenging test and achieving close to human quality (4.42 vs. 4.49 in MOS) on a general test.
1 code implementation • 3 Jun 2019 • Mutian He, Yan Deng, Lei He
In this paper, we propose a novel stepwise monotonic attention method in sequence-to-sequence acoustic modeling to improve the robustness on out-of-domain inputs.
no code implementations • 9 Apr 2019 • Haohan Guo, Frank K. Soong, Lei He, Lei Xie
The end-to-end TTS, which can predict speech directly from a given sequence of graphemes or phonemes, has shown improved performance over the conventional TTS.
no code implementations • 9 Apr 2019 • Haohan Guo, Frank K. Soong, Lei He, Lei Xie
However, the autoregressive module training is affected by the exposure bias, or the mismatch between the different distributions of real and predicted data.
no code implementations • 3 Jan 2019 • Huaiping Ming, Lei He, Haohan Guo, Frank K. Soong
In this paper, we propose a feature reinforcement method under the sequence-to-sequence neural text-to-speech (TTS) synthesis framework.
no code implementations • 13 Dec 2018 • Yan Deng, Lei He, Frank Soong
Neural TTS has shown it can generate high quality synthesized speech.
1 code implementation • 12 Dec 2018 • Liang Qiu, Yuanyi Ding, Lei He
In recent years, Recurrent Neural Network (RNN) based models have been applied to the Slot Filling problem of Spoken Language Understanding and have achieved state-of-the-art performance.
2 code implementations • 11 Dec 2018 • Ya-Jie Zhang, Shifeng Pan, Lei He, Zhen-Hua Ling
In this paper, we introduce the Variational Autoencoder (VAE) to an end-to-end speech synthesis model, to learn the latent representation of speaking styles in an unsupervised manner.
no code implementations • 27 Mar 2018 • Lei He, Guanghui Wang, Zhanyi Hu
In order to learn monocular depth by embedding the focal length, we propose a method to generate a synthetic varying-focal-length dataset from fixed-focal-length datasets, and a simple and effective method is implemented to fill the holes in the newly generated images.
no code implementations • COLING 2016 • Wei Li, Lei He, Hai Zhuge
This paper studies the abstractive multi-document summarization for event-oriented news texts through event information extraction and abstract representation.
no code implementations • COLING 2016 • Lei He, Wei Li, Hai Zhuge
This paper investigates differential topic models (dTM) for summarizing the differences among document groups.
no code implementations • 1 Nov 2015 • Peilu Wang, Yao Qian, Frank K. Soong, Lei He, Hai Zhao
Bidirectional Long Short-Term Memory Recurrent Neural Network (BLSTM-RNN) has been shown to be very effective for modeling and predicting sequential data, e.g., speech utterances or handwritten documents.
3 code implementations • 21 Oct 2015 • Peilu Wang, Yao Qian, Frank K. Soong, Lei He, Hai Zhao
Bidirectional Long Short-Term Memory Recurrent Neural Network (BLSTM-RNN) has been shown to be very effective for tagging sequential data, e.g., speech utterances or handwritten documents.
no code implementations • 18 Nov 2014 • Chen Chen, Junzhou Huang, Lei He, Hongsheng Li
The convergence rate of the proposed algorithm is almost the same as that of the traditional IRLS algorithms, that is, exponentially fast.
no code implementations • CVPR 2014 • Chen Chen, Junzhou Huang, Lei He, Hongsheng Li
In this paper, we propose a novel algorithm for structured sparsity reconstruction.