1 code implementation • 14 Mar 2023 • Yunjie Ji, Yan Gong, Yiping Peng, Chao Ni, Peiyan Sun, Dongyu Pan, Baochang Ma, Xiangang Li
The results on the test set show that ChatGPT's ranking preferences are consistent with those of humans to a certain extent.
1 code implementation • 26 Mar 2023 • Yunjie Ji, Yong Deng, Yan Gong, Yiping Peng, Qiang Niu, Lei Zhang, Baochang Ma, Xiangang Li
However, current research rarely studies the impact of different amounts of instruction data on model performance, especially in real-world use cases.
1 code implementation • 17 Apr 2023 • Xianghui Sun, Yunjie Ji, Baochang Ma, Xiangang Li
In this study, we undertook experimental comparisons between full-parameter fine-tuning and LoRA-based tuning methods, utilizing LLaMA as the base model.
2 code implementations • 16 Apr 2023 • Yunjie Ji, Yan Gong, Yong Deng, Yiping Peng, Qiang Niu, Baochang Ma, Xiangang Li
Recently, significant public efforts have been directed towards developing low-cost models with capabilities akin to ChatGPT, thereby fostering the growth of open-source conversational models.
1 code implementation • 20 Sep 2023 • Guan Wang, Sijie Cheng, Xianyuan Zhan, Xiangang Li, Sen Song, Yang Liu
Specifically, we consider the general SFT training data, consisting of a small amount of expert data mixed with a large proportion of sub-optimal data, without any preference labels.
Ranked #20 on Code Generation on HumanEval
2 code implementations • 2 Aug 2019 • Kun Han, Junwen Chen, HUI ZHANG, Haiyang Xu, Yiping Peng, Yun Wang, Ning Ding, Hui Deng, Yonghu Gao, Tingwei Guo, Yi Zhang, Yahao He, Baochang Ma, Yu-Long Zhou, Kangli Zhang, Chao Liu, Ying Lyu, Chenxi Wang, Cheng Gong, Yunbo Wang, Wei Zou, Hui Song, Xiangang Li
In this paper we present DELTA, a deep learning based language technology platform.
Ranked #3 on Text Classification on Yahoo! Answers
1 code implementation • 22 Oct 2019 • Dongwei Jiang, Xiaoning Lei, Wubo Li, Ne Luo, Yuxuan Hu, Wei Zou, Xiangang Li
Speech recognition technologies are gaining enormous popularity in various industrial applications.
1 code implementation • 20 May 2020 • Dongwei Jiang, Wubo Li, Ruixiong Zhang, Miao Cao, Ne Luo, Yang Han, Wei Zou, Xiangang Li
In this paper, we conduct a further study on MPC and focus on three important aspects: the effect of pre-training data speaking style, its extension to streaming models, and how to better transfer learned knowledge from the pre-training stage to downstream tasks.
1 code implementation • 27 Oct 2020 • Dongwei Jiang, Wubo Li, Miao Cao, Wei Zou, Xiangang Li
Self-supervised visual pretraining has shown significant progress recently.
15 code implementations • 5 May 2017 • Chao Li, Xiaokong Ma, Bing Jiang, Xiangang Li, Xuewei Zhang, Xiao Liu, Ying Cao, Ajay Kannan, Zhenyao Zhu
We present Deep Speaker, a neural speaker embedding system that maps utterances to a hypersphere where speaker similarity is measured by cosine similarity.
2 code implementations • 13 Jun 2021 • Guoguo Chen, Shuzhou Chai, Guanbo Wang, Jiayu Du, Wei-Qiang Zhang, Chao Weng, Dan Su, Daniel Povey, Jan Trmal, Junbo Zhang, Mingjie Jin, Sanjeev Khudanpur, Shinji Watanabe, Shuaijiang Zhao, Wei Zou, Xiangang Li, Xuchen Yao, Yongqing Wang, Yujun Wang, Zhao You, Zhiyong Yan
This paper introduces GigaSpeech, an evolving, multi-domain English speech recognition corpus with 10,000 hours of high quality labeled audio suitable for supervised training, and 40,000 hours of total audio suitable for semi-supervised and unsupervised training.
Ranked #1 on Speech Recognition on GigaSpeech
1 code implementation • 6 Sep 2019 • Haiyang Xu, HUI ZHANG, Kun Han, Yun Wang, Yiping Peng, Xiangang Li
Further, although emotion recognition benefits from audio-textual multimodal information, it is not trivial to build a system that learns from multiple modalities.
Multimodal Emotion Recognition Speech Emotion Recognition +2
1 code implementation • CVPR 2023 • Linzhi Huang, Yulong Li, Hongbo Tian, Yue Yang, Xiangang Li, Weihong Deng, Jieping Ye
The previous method ignored two problems: (i) When conducting interactive training between a large model and a lightweight model, the pseudo labels of the lightweight model are used to guide the large model.
no code implementations • 10 May 2018 • Wei Zou, Dongwei Jiang, Shuaijiang Zhao, Xiangang Li
We find that all types of modeling units achieve similar character error rates (CER) with the CTC model, and that the Chinese character attention model performs better than the syllable attention model.
no code implementations • ICML 2017 • Hairong Liu, Zhenyao Zhu, Xiangang Li, Sanjeev Satheesh
These methods suffer from two major drawbacks: 1) the set of basic units is fixed, such as the set of words, characters or phonemes in speech recognition, and 2) the decomposition of target sequences is fixed.
no code implementations • 11 Oct 2016 • Xiangang Li, Xihong Wu
Long short-term memory (LSTM) recurrent neural networks (RNNs) have been shown to give state-of-the-art performance on many speech recognition tasks, as they are able to provide the learned dynamically changing contextual window of all sequence history.
no code implementations • 16 Oct 2014 • Xiangang Li, Xihong Wu
Long short-term memory (LSTM) based acoustic modeling methods have recently been shown to give state-of-the-art performance on some speech recognition tasks.
no code implementations • 31 Oct 2018 • Ne Luo, Dongwei Jiang, Shuaijiang Zhao, Caixia Gong, Wei Zou, Xiangang Li
Code-switching speech recognition has attracted increasing interest recently, but the need for expert linguistic knowledge has always been a big issue.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +2
no code implementations • COLING 2018 • Fengyu Guo, Ruifang He, Di Jin, Jianwu Dang, Longbiao Wang, Xiangang Li
In this paper, we propose a novel neural Tensor network framework with Interactive Attention and Sparse Learning (TIASL) for implicit discourse relation recognition.
no code implementations • COLING 2018 • Ruifang He, Xuefei Zhang, Di Jin, Longbiao Wang, Jianwu Dang, Xiangang Li
They ignore that one discusses diverse topics when dynamically interacting with different people.
no code implementations • 22 Oct 2019 • Ruixiong Zhang, Wei Zou, Xiangang Li
To utilize the acoustic event information to improve the performance of ASC tasks, we present the cross-task pre-training mechanism which utilizes acoustic event information from the pre-trained AED model for ASC tasks.
no code implementations • 23 Oct 2019 • Wubo Li, Wei Zou, Xiangang Li
Multimodal approaches provide better performance than unimodal ones in most tasks.
no code implementations • 18 Mar 2020 • Haiyang Xu, Yun Wang, Kun Han, Baochang Ma, Junwen Chen, Xiangang Li
Abstractive text summarization is a challenging task, and one needs to design a mechanism to effectively extract salient information from the source text and then generate a summary.
no code implementations • 25 Mar 2020 • Haiyang Xu, Junwen Chen, Kun Han, Xiangang Li
Multi-class text classification is one of the key problems in machine learning and natural language processing.
no code implementations • 25 Mar 2020 • Haiyang Xu, Yahao He, Kun Han, Junwen Chen, Xiangang Li
Our approach has the following contributions: first, we incorporate syntactic information such as constituency parsing trees into the encoding sequence to learn both the semantic and syntactic information from the document, resulting in a more accurate summary; second, we propose a dynamic gate network to select the salient information based on the context of the decoder state, which is essential to document summarization.
no code implementations • 29 Jul 2020 • Zhuohuang Zhang, Chengyun Deng, Yi Shen, Donald S. Williamson, Yongtao Sha, Yi Zhang, Hui Song, Xiangang Li
Recent work has shown that it is feasible to use generative adversarial networks (GANs) for speech enhancement; however, these approaches have not been compared to state-of-the-art (SOTA) non-GAN-based approaches.
Audio and Speech Processing Sound
no code implementations • 16 Oct 2020 • Tanfang Chen, Weiwei Wang, Wenyang Wei, Xing Shi, Xiangang Li, Jieping Ye, Kevin Knight
This paper describes DiDi AI Labs' submission to the WMT2020 news translation shared task.
no code implementations • 21 Oct 2020 • Wubo Li, Dongwei Jiang, Wei Zou, Xiangang Li
Audio Visual Scene-aware Dialog (AVSD) is a task to generate responses when discussing a given video.
no code implementations • 19 Oct 2020 • Tingwei Guo, Cheng Wen, Dongwei Jiang, Ne Luo, Ruixiong Zhang, Shuaijiang Zhao, Wubo Li, Cheng Gong, Wei Zou, Kun Han, Xiangang Li
This paper introduces a new open-sourced Mandarin speech corpus, called DiDiSpeech.
Audio and Speech Processing
no code implementations • 26 Apr 2021 • Jianwei Sun, Zhiyuan Tang, Hengxin Yin, Wei Wang, Xi Zhao, Shuaijiang Zhao, Xiaoning Lei, Wei Zou, Xiangang Li
Augmentation-related issues, such as the comparison of different strategies and ratios for data combination, are also investigated.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +2
no code implementations • WMT (EMNLP) 2020 • Tanfang Chen, Weiwei Wang, Wenyang Wei, Xing Shi, Xiangang Li, Jieping Ye, Kevin Knight
This paper describes the DiDi AI Labs’ submission to the WMT2020 news translation shared task.
no code implementations • 19 Apr 2022 • Rui Yan, Cheng Wen, Shuran Zhou, Tingwei Guo, Wei Zou, Xiangang Li
This paper describes our best system and methodology for ADD 2022: The First Audio Deep Synthesis Detection Challenge.
no code implementations • 19 Apr 2022 • Cheng Wen, Tingwei Guo, Xingjun Tan, Rui Yan, Shuran Zhou, Chuandong Xie, Wei Zou, Xiangang Li
In this paper, we describe our speech generation system for the first Audio Deep Synthesis Detection Challenge (ADD 2022).
no code implementations • SemEval (NAACL) 2022 • Yong Deng, Chenxiao Dou, Liangyu Chen, Deqiang Miao, Xianghui Sun, Baochang Ma, Xiangang Li
The PCL detection task aims to identify and categorize language that is patronizing or condescending towards vulnerable communities in the general media. Compared to other NLP tasks of paragraph classification, the negative language presented in the PCL detection task is usually more implicit and subtle, and therefore harder to recognize, which makes the performance of common text-classification approaches disappointing.
Ranked #1 on Multi-label Condescension Detection on DPM
Binary Condescension Detection Multi-Label Classification +1
no code implementations • Findings (NAACL) 2022 • Yunjie Ji, Liangyu Chen, Chenxiao Dou, Baochang Ma, Xiangang Li
Machine Reading Comprehension with Unanswerable Questions is a difficult NLP task, challenged by questions that cannot be answered from the passages.
no code implementations • 4 Nov 2020 • Chengyun Deng, Shiqian Ma, Yi Zhang, Yongtao Sha, HUI ZHANG, Hui Song, Xiangang Li
dataset confirm the superior performance of the proposed method over the network without IRA in terms of SI-SDR and PESQ improvement.