1 code implementation • Findings (ACL) 2022 • Xingzhang Ren, Baosong Yang, Dayiheng Liu, Haibo Zhang, Xiaoyu Lv, Liang Yao, Jun Xie
Recognizing the language of ambiguous texts has become a main challenge in language identification (LID).
1 code implementation • Findings (NAACL) 2022 • Huan Lin, Baosong Yang, Liang Yao, Dayiheng Liu, Haibo Zhang, Jun Xie, Min Zhang, Jinsong Su
Diverse NMT aims at generating multiple diverse yet faithful translations given a source sentence.
no code implementations • Findings (ACL) 2022 • Kexin Yang, Dayiheng Liu, Wenqiang Lei, Baosong Yang, Haibo Zhang, Xue Zhao, Wenqing Yao, Boxing Chen
Under GCPG, we reconstruct commonly adopted lexical condition (i.e., Keywords) and syntactical conditions (i.e., Part-Of-Speech sequence, Constituent Tree, Masked Template and Sentential Exemplar) and study the combination of the two types.
no code implementations • WMT (EMNLP) 2021 • Yu Wan, Dayiheng Liu, Baosong Yang, Tianchi Bi, Haibo Zhang, Boxing Chen, Weihua Luo, Derek F. Wong, Lidia S. Chao
After investigating recent advances in trainable metrics, we identify several aspects of vital importance for obtaining a well-performing metric model: 1) jointly leveraging the advantages of the source-included model and the reference-only model, 2) continuously pre-training the model with massive synthetic data pairs, and 3) fine-tuning the model with a data denoising strategy.
no code implementations • CL (ACL) 2022 • Yu Wan, Baosong Yang, Derek Fai Wong, Lidia Sam Chao, Liang Yao, Haibo Zhang, Boxing Chen
After empirically investigating the rationale behind this, we summarize two challenges in NMT for STs associated with translation error types above, respectively: (1) the imbalanced length distribution in training set intensifies model inference calibration over STs, leading to more over-translation cases on STs; and (2) the lack of contextual information forces NMT to have higher data uncertainty on short sentences, and thus NMT model is troubled by considerable mistranslation errors.
no code implementations • 13 Mar 2024 • Haibo Zhang, Zhihua Yao, Kouichi Sakurai
When facing the PGD attack and the MI-FGSM attack, the versatile defense model even outperforms the attack-specific models trained on these two attacks.
no code implementations • 22 Jul 2022 • Fei Dai, Yawen Chen, Zhiyi Huang, Haibo Zhang, Fangfang Zhang
Our results also show that WRHT can reduce the communication time of all-reduce operation by 92.42% and 91.31% compared to two existing all-reduce algorithms running in the electrical interconnect system.
2 code implementations • ACL 2022 • Yu Wan, Dayiheng Liu, Baosong Yang, Haibo Zhang, Boxing Chen, Derek F. Wong, Lidia S. Chao
Translation quality evaluation plays a crucial role in machine translation.
no code implementations • 28 Apr 2022 • Yu Wan, Dayiheng Liu, Baosong Yang, Tianchi Bi, Haibo Zhang, Boxing Chen, Weihua Luo, Derek F. Wong, Lidia S. Chao
After investigating recent advances in trainable metrics, we identify several aspects of vital importance for obtaining a well-performing metric model: 1) jointly leveraging the advantages of the source-included model and the reference-only model, 2) continuously pre-training the model with massive synthetic data pairs, and 3) fine-tuning the model with a data denoising strategy.
1 code implementation • Findings (ACL) 2022 • Yu Wan, Baosong Yang, Dayiheng Liu, Rong Xiao, Derek F. Wong, Haibo Zhang, Boxing Chen, Lidia S. Chao
The attention mechanism has become the dominant module in natural language processing models.
no code implementations • 29 Dec 2021 • Tong Zhang, Wei Ye, Baosong Yang, Long Zhang, Xingzhang Ren, Dayiheng Liu, Jinan Sun, Shikun Zhang, Haibo Zhang, Wen Zhao
Inspired by the observation that low-frequency words form a more compact embedding space, we tackle this challenge from a representation learning perspective.
1 code implementation • 15 Dec 2021 • Xin Liu, Dayiheng Liu, Baosong Yang, Haibo Zhang, Junwei Ding, Wenqing Yao, Weihua Luo, Haiying Zhang, Jinsong Su
Generative commonsense reasoning requires machines to generate sentences describing an everyday scenario given several concepts, which has attracted much attention recently.
no code implementations • 3 Nov 2021 • Linlong Xu, Baosong Yang, Xiaoyu Lv, Tianchi Bi, Dayiheng Liu, Haibo Zhang
Interactive and non-interactive models are the two de-facto standard frameworks in vector-based cross-lingual information retrieval (V-CLIR), which embed queries and documents in synchronous and asynchronous fashions, respectively.
no code implementations • 30 Sep 2021 • Fei Dai, Yawen Chen, Haibo Zhang, Zhiyi Huang
Compared with ENoC, simulation results show that under batch sizes of 64 and 128, ONoC reduces training time by 21.02% and 12.95% on average and saves 47.85% and 39.27% of energy, respectively.
no code implementations • ACL 2021 • Xin Liu, Baosong Yang, Dayiheng Liu, Haibo Zhang, Weihua Luo, Min Zhang, Haiying Zhang, Jinsong Su
A well-known limitation of the pretrain-finetune paradigm lies in its inflexibility caused by the one-size-fits-all vocabulary.
1 code implementation • ACL 2021 • Huan Lin, Liang Yao, Baosong Yang, Dayiheng Liu, Haibo Zhang, Weihua Luo, Degen Huang, Jinsong Su
Furthermore, we contribute the first Chinese-English parallel corpus annotated with user behavior called UDT-Corpus.
no code implementations • NAACL 2021 • Long Zhang, Tong Zhang, Haibo Zhang, Baosong Yang, Wei Ye, Shikun Zhang
Document-level neural machine translation (NMT) has proven to be of profound value for its effectiveness in capturing contextual information.
no code implementations • COLING 2020 • Liang Yao, Baosong Yang, Haibo Zhang, Boxing Chen, Weihua Luo
Query translation (QT) serves as a critical factor in successful cross-lingual information retrieval (CLIR).
no code implementations • 26 Oct 2020 • Tianchi Bi, Liang Yao, Baosong Yang, Haibo Zhang, Weihua Luo, Boxing Chen
Query translation (QT) is a key component in cross-lingual information retrieval (CLIR) systems.
no code implementations • 26 Oct 2020 • Liang Yao, Baosong Yang, Haibo Zhang, Weihua Luo, Boxing Chen
As a crucial component of cross-language information retrieval (CLIR), query translation faces three main challenges: 1) the adequacy of translation; 2) the lack of in-domain parallel training data; and 3) the requirement of low latency.
1 code implementation • EMNLP 2020 • Yu Wan, Baosong Yang, Derek F. Wong, Yikai Zhou, Lidia S. Chao, Haibo Zhang, Boxing Chen
Recent studies have proven that the training of neural machine translation (NMT) can be facilitated by mimicking the learning process of humans.
no code implementations • WS 2018 • Yongchao Deng, Shanbo Cheng, Jun Lu, Kai Song, Jingang Wang, Shenglan Wu, Liang Yao, Guchun Zhang, Haibo Zhang, Pei Zhang, Changfeng Zhu, Boxing Chen
We participated in 5 translation directions including English ↔ Russian, English ↔ Turkish in both directions and English → Chinese.