no code implementations • 16 Nov 2021 • Nianzu Zheng, Liqun Deng, Wenyong Huang, Yu Ting Yeung, Baohua Xu, Yuanyuan Guo, Yasheng Wang, Xiao Chen, Xin Jiang, Qun Liu
We use a conv-transformer structure to encode input speech in a streaming manner.
Multi-Task Learning • Phone-level pronunciation scoring