no code implementations • NAACL (AutoSimTrans) 2022 • Xingshan Zeng, Pengfei Li, Liangyou Li, Qun Liu
This paper describes the system submitted to AutoSimTrans 2022 from Huawei Noah’s Ark Lab, which won first place in the audio input track of the Chinese-English translation task.
no code implementations • ACL (IWSLT) 2021 • Xingshan Zeng, Liangyou Li, Qun Liu
We use a unified transformer architecture for our MultiST model, so that the data from different modalities (i.e., speech and text) and different tasks (i.e., Speech Recognition, Machine Translation, and Speech Translation) can be exploited to enhance the model’s ability.
1 code implementation • Findings (ACL) 2022 • Qiwei Bi, Jian Li, Lifeng Shang, Xin Jiang, Qun Liu, Hanfang Yang
With the adoption of large pre-trained models like BERT in news recommendation, the above way to incorporate multi-field information may encounter challenges: the shallow feature encoding to compress the category and entity information is not compatible with the deep BERT encoding.
no code implementations • ACL 2022 • Ningning Wang, Guobing Gan, Peng Zhang, Shuai Zhang, Junqiu Wei, Qun Liu, Xin Jiang
Other sparse methods use clustering patterns to select words, but the clustering process is separate from the training process of the target task, which causes a decrease in effectiveness.
no code implementations • NAACL (AutoSimTrans) 2022 • Ruiqing Zhang, Chuanqiang Zhang, Zhongjun He, Hua Wu, Haifeng Wang, Liang Huang, Qun Liu, Julia Ive, Wolfgang Macherey
This paper reports the results of the shared task we hosted on the Third Workshop of Automatic Simultaneous Translation (AutoSimTrans).
no code implementations • WMT (EMNLP) 2020 • Wei Peng, Jianfeng Liu, Minghan Wang, Liangyou Li, Xupeng Meng, Hao Yang, Qun Liu
This paper describes Huawei’s submissions to the WMT20 biomedical translation shared task.
1 code implementation • EMNLP 2021 • Weixuan Wang, Wei Peng, Meng Zhang, Qun Liu
Neural Machine Translation (NMT) has shown a strong ability to utilize local context to disambiguate the meaning of words.
no code implementations • EMNLP 2021 • Yuanhang Zheng, Zhixing Tan, Meng Zhang, Mieradilijiang Maimaiti, Huanbo Luan, Maosong Sun, Qun Liu, Yang Liu
Quality estimation (QE) of machine translation (MT) aims to evaluate the quality of machine-translated sentences without references and is important in practical applications of MT.
no code implementations • EMNLP 2021 • Huibin Ge, Chenxi Sun, Deyi Xiong, Qun Liu
Experimental results show that the Chinese pretrained language model PanGu-$\alpha$ is 45 points behind humans in terms of top-1 word prediction accuracy, indicating that Chinese WPLC is a challenging dataset.
no code implementations • WMT (EMNLP) 2021 • Meng Zhang, Minghao Wu, Pengfei Li, Liangyou Li, Qun Liu
This paper describes the NoahNMT system submitted to the WMT 2021 shared task of Very Low Resource Supervised Machine Translation.
1 code implementation • Findings (ACL) 2022 • Jian Li, Jieming Zhu, Qiwei Bi, Guohao Cai, Lifeng Shang, Zhenhua Dong, Xin Jiang, Qun Liu
Accurately matching user’s interests and candidate news is the key to news recommendation.
no code implementations • Findings (ACL) 2022 • Xianghong Fang, Jian Li, Lifeng Shang, Xin Jiang, Qun Liu, Dit-yan Yeung
While variational autoencoders (VAEs) have been widely applied in text generation tasks, they are troubled by two challenges: insufficient representation capacity and poor controllability.
no code implementations • WMT (EMNLP) 2021 • Weixuan Wang, Wei Peng, Xupeng Meng, Qun Liu
This paper describes Huawei Artificial Intelligence Application Research Center’s neural machine translation systems and submissions to the WMT21 biomedical translation shared task.
no code implementations • 2 Apr 2025 • Xingshan Zeng, Weiwen Liu, Xu Huang, Zezhong Wang, Lingzhi Wang, Liangyou Li, Yasheng Wang, Lifeng Shang, Xin Jiang, Ruiming Tang, Qun Liu
Tool learning, which allows Large Language Models (LLMs) to leverage external tools for solving complex user tasks, has emerged as a promising avenue for extending model capabilities.
no code implementations • 12 Mar 2025 • Boyang Xue, Qi Zhu, Hongru Wang, Rui Wang, Sheng Wang, Hongling Xu, Fei Mi, Yasheng Wang, Lifeng Shang, Qun Liu, Kam-Fai Wong
Existing self-training methods for Large Language Models (LLMs) consistently under-sample challenging queries, leading to inadequate learning on difficult problems that limits the models' abilities.
1 code implementation • 8 Mar 2025 • Weidong Zhan, Yue Wang, Nan Hu, Liming Xiao, Jingyuan Ma, Yuhang Qin, Zheng Li, Yixin Yang, Sirui Deng, Jinkun Ding, Wenhan Ma, Rui Li, Weilin Luo, Qun Liu, Zhifang Sui
This approach, along with our benchmark, provides a valuable tool for assessing and enhancing LLMs' commonsense reasoning capabilities and can be applied to a wide range of knowledge domains.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Jie He, Tao Wang, Deyi Xiong, Qun Liu
Our experiments and analyses demonstrate that neural machine translation performs poorly on commonsense reasoning of the three ambiguity types in terms of both reasoning accuracy (60.1%) and reasoning consistency (31%).
no code implementations • 26 Feb 2025 • Kaishuai Xu, Tiezheng Yu, Wenjun Hou, Yi Cheng, Liangyou Li, Xin Jiang, Lifeng Shang, Qun Liu, Wenjie Li
Large Language Models (LLMs) are increasingly used for automated evaluation in various scenarios.
no code implementations • 24 Feb 2025 • Xiaoran Liu, Ruixiao Li, Mianqiu Huang, Zhigeng Liu, Yuerong Song, Qipeng Guo, Siyang He, Qiqi Wang, Linlin Li, Qun Liu, Yaqian Zhou, Xuanjing Huang, Xipeng Qiu
Moreover, the research on long-context LLMs has expanded from length extrapolation to a comprehensive focus on architecture, infrastructure, training, and evaluation technologies.
no code implementations • 17 Feb 2025 • Xin Xu, Yan Xu, Tianhao Chen, Yuchen Yan, Chengwu Liu, Zaoyu Chen, YuFei Wang, Yichun Yin, Yasheng Wang, Lifeng Shang, Qun Liu
Existing approaches to mathematical reasoning with large language models (LLMs) rely on Chain-of-Thought (CoT) for generalizability or Tool-Integrated Reasoning (TIR) for precise computation.
no code implementations • 23 Dec 2024 • Ge Zhang, Mohammad Ali Alomrani, Hongjian Gu, Jiaming Zhou, Yaochen Hu, Bin Wang, Qun Liu, Mark Coates, Yingxue Zhang, Jianye Hao
Large language models (LLMs) possess vast semantic knowledge but often struggle with complex reasoning tasks, particularly in relational reasoning problems such as kinship or spatial reasoning.
1 code implementation • 23 Dec 2024 • Yueqian Wang, Xiaojun Meng, Yuxuan Wang, Jianxin Liang, Qun Liu, Dongyan Zhao
Based on this Friends-MMC dataset, we further study two fundamental MMC tasks: conversation speaker identification and conversation response prediction, both of which have a multi-party nature, with the video or image as visual context.
no code implementations • 17 Dec 2024 • Jiebin Zhang, Dawei Zhu, YiFan Song, Wenhao Wu, Chuqiao Kuang, Xiaoguang Li, Lifeng Shang, Qun Liu, Sujian Li
As large language models (LLMs) process increasing context windows, the memory usage of KV cache has become a critical bottleneck during inference.
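For scale, a back-of-the-envelope KV-cache estimate makes the bottleneck concrete (a minimal sketch; the model dimensions below are illustrative assumptions, not values from the paper):

```python
# Rough KV-cache size estimate for a decoder-only transformer.
# All dimensions are illustrative assumptions.
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch, dtype_bytes=2):
    # Keys and values each store one vector per layer, head, and position;
    # the leading 2 accounts for K and V.
    return 2 * layers * kv_heads * head_dim * seq_len * batch * dtype_bytes

# e.g., a hypothetical 7B-class model (32 layers, 32 KV heads, head_dim 128)
# at a 32k-token context:
gb = kv_cache_bytes(32, 32, 128, 32_768, batch=1) / 1024**3
print(f"~{gb:.1f} GiB per sequence")  # ~16.0 GiB in fp16
```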
no code implementations • 11 Nov 2024 • Mianqiu Huang, Xiaoran Liu, Shaojun Zhou, Mozhi Zhang, Chenkun Tan, Pengyu Wang, Qipeng Guo, Zhe Xu, Linyang Li, Zhikai Lei, Linlin Li, Qun Liu, Yaqian Zhou, Xipeng Qiu, Xuanjing Huang
With the development of large language models (LLMs), the sequence length of these models continues to increase, drawing significant attention to long-context language models.
no code implementations • 24 Oct 2024 • Zezhong Wang, Xingshan Zeng, Weiwen Liu, Liangyou Li, Yasheng Wang, Lifeng Shang, Xin Jiang, Qun Liu, Kam-Fai Wong
Data quality assessments demonstrate improvements in the naturalness and coherence of our synthesized dialogues.
no code implementations • 17 Oct 2024 • Fan Bu, Yuhao Zhang, Xidong Wang, Benyou Wang, Qun Liu, Haizhou Li
The success of large language models (LLMs) has prompted efforts to integrate speech and audio data, aiming to create general foundation models capable of processing both textual and non-textual inputs.
no code implementations • 9 Oct 2024 • Kaishuai Xu, Tiezheng Yu, Wenjun Hou, Yi Cheng, Chak Tou Leong, Liangyou Li, Xin Jiang, Lifeng Shang, Qun Liu, Wenjie Li
In this work, we propose a novel preference learning framework called eRror-Injected Self-Editing (RISE), which injects predefined subtle errors into partial tokens of correct solutions to construct hard pairs for error mitigation.
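A minimal sketch of the error-injection idea behind such hard preference pairs; the toy digit-flip below is a hypothetical stand-in for the paper's predefined subtle errors:

```python
def make_hard_pair(correct_solution: str, inject_at: float = 0.5):
    """Build a (chosen, rejected) preference pair by corrupting the tail
    of a correct solution. The digit flip is a toy stand-in for the
    paper's predefined subtle errors."""
    tokens = correct_solution.split()
    cut = int(len(tokens) * inject_at)
    corrupted = tokens[:cut] + [
        t.replace("3", "8") if any(c.isdigit() for c in t) else t
        for t in tokens[cut:]
    ]
    return correct_solution, " ".join(corrupted)

chosen, rejected = make_hard_pair("so 12 + 23 = 35 and the answer is 35")
# The resulting pair can then be fed to a DPO-style preference loss.
```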
no code implementations • 27 Sep 2024 • Yu Fu, Jie He, Yifan Yang, Qun Liu, Deyi Xiong
In this framework, we present a reinforcement-based approach to dynamically estimating source task weights that measure the contribution of the corresponding tasks to the target task in meta-transfer learning.
no code implementations • 26 Sep 2024 • Kai Chen, Yunhao Gou, Runhui Huang, Zhili Liu, Daxin Tan, Jing Xu, Chunwei Wang, Yi Zhu, Yihan Zeng, Kuo Yang, Dingdong Wang, Kun Xiang, Haoyuan Li, Haoli Bai, Jianhua Han, Xiaohui Li, Weike Jin, Nian Xie, Yu Zhang, James T. Kwok, Hengshuang Zhao, Xiaodan Liang, Dit-yan Yeung, Xiao Chen, Zhenguo Li, Wei zhang, Qun Liu, Jun Yao, Lanqing Hong, Lu Hou, Hang Xu
GPT-4o, an omni-modal model that enables vocal conversations with diverse emotions and tones, marks a milestone for omni-modal foundation models.
no code implementations • 24 Sep 2024 • Xingping Xian, Jianlu Liu, Chao Wang, Tao Wu, Shaojie Qiao, Xiaochuan Tang, Qun Liu
First, GISExplainer defines a causal attribution mechanism that considers the game-theoretic interaction of multi-granularity coalitions in the candidate explanatory subgraph to quantify the causal effect of an edge on the prediction.
no code implementations • 4 Sep 2024 • Zhe Xu, Jiasheng Ye, Xiangyang Liu, Tianxiang Sun, Xiaoran Liu, Qipeng Guo, Linlin Li, Qun Liu, Xuanjing Huang, Xipeng Qiu
DetectiveQA focuses on evaluating the long-context reasoning ability of LLMs, which not only requires a full understanding of the context but also requires extracting important evidence from it and reasoning over that evidence to answer the given questions.
no code implementations • 2 Sep 2024 • Weiwen Liu, Xu Huang, Xingshan Zeng, Xinlong Hao, Shuai Yu, Dexun Li, Shuai Wang, Weinan Gan, Zhengying Liu, Yuanqing Yu, Zezhong Wang, Yuxian Wang, Wu Ning, Yutai Hou, Bin Wang, Chuhan Wu, Xinzhi Wang, Yong liu, Yasheng Wang, Duyu Tang, Dandan Tu, Lifeng Shang, Xin Jiang, Ruiming Tang, Defu Lian, Qun Liu, Enhong Chen
Function calling significantly extends the application boundary of large language models, where high-quality and diverse training data is critical for unlocking this capability.
no code implementations • 21 Jul 2024 • Jianxin Liang, Xiaojun Meng, Yueqian Wang, Chang Liu, Qun Liu, Dongyan Zhao
Video Question Answering (VideoQA) has emerged as a challenging frontier in the field of multimedia processing, requiring intricate interactions between visual and textual modalities.
no code implementations • 21 Jul 2024 • Xiaoran Liu, Ruixiao Li, Qipeng Guo, Zhigeng Liu, Yuerong Song, Kai Lv, Hang Yan, Linlin Li, Qun Liu, Xipeng Qiu
The long-context capability of the Large Language Models (LLM) has made significant breakthroughs, but the maximum supported context length remains a critical bottleneck limiting their practical applications.
no code implementations • 23 Jun 2024 • Zezhong Wang, Xingshan Zeng, Weiwen Liu, YuFei Wang, Liangyou Li, Yasheng Wang, Lifeng Shang, Xin Jiang, Qun Liu, Kam-Fai Wong
To address these questions, we propose a method, namely Chain-of-Probe (CoP), to probe changes in the mind during the model's reasoning.
no code implementations • 14 Jun 2024 • Mohammad Dehghan, Mohammad Ali Alomrani, Sunyam Bagga, David Alfonso-Hermelo, Khalil Bibi, Abbas Ghaddar, Yingxue Zhang, Xiaoguang Li, Jianye Hao, Qun Liu, Jimmy Lin, Boxing Chen, Prasanna Parthasarathi, Mahdi Biparva, Mehdi Rezagholizadeh
To mitigate these issues, we propose our enhanced web and efficient knowledge graph (KG) retrieval solution (EWEK-QA) to enrich the content of the extracted knowledge fed to the system.
1 code implementation • 7 Jun 2024 • Fengran Mo, Abbas Ghaddar, Kelong Mao, Mehdi Rezagholizadeh, Boxing Chen, Qun Liu, Jian-Yun Nie
In this paper, we study how open-source large language models (LLMs) can be effectively deployed for improving query rewriting in conversational search, especially for ambiguous queries.
no code implementations • 29 May 2024 • Hao Zhang, Yuyang Zhang, Xiaoguang Li, Wenxuan Shi, Haonan Xu, Huanshuo Liu, Yasheng Wang, Lifeng Shang, Qun Liu, Yong liu, Ruiming Tang
Integrating external knowledge into large language models (LLMs) presents a promising solution to overcome the limitations imposed by their antiquated and static parametric memory.
1 code implementation • 16 May 2024 • Linhao Yu, Qun Liu, Deyi Xiong
The rapid evolution of large language models (LLMs) has ushered in the need for comprehensive assessments of their performance across various dimensions.
no code implementations • 1 May 2024 • Zhili Liu, Yunhao Gou, Kai Chen, Lanqing Hong, Jiahui Gao, Fei Mi, Yu Zhang, Zhenguo Li, Xin Jiang, Qun Liu, James T. Kwok
As the capabilities of large language models (LLMs) have expanded dramatically, aligning these models with human values presents a significant challenge.
1 code implementation • 25 Mar 2024 • Zhiming Mao, Haoli Bai, Lu Hou, Jiansheng Wei, Xin Jiang, Qun Liu, Kam-Fai Wong
Prior studies show that pre-training techniques can boost the performance of visual document understanding (VDU), which typically requires models to perceive and reason over both document texts and layouts (e.g., locations of texts and table cells).
1 code implementation • 15 Mar 2024 • Yueqian Wang, Xiaojun Meng, Jianxin Liang, Yuxuan Wang, Qun Liu, Dongyan Zhao
Video-text Large Language Models (video-text LLMs) have shown remarkable performance in answering questions and holding conversations on simple videos.
Ranked #13 on Video Question Answering on MVBench.
1 code implementation • 28 Feb 2024 • Jiebin Zhang, Eugene J. Yu, Qinyu Chen, Chenhao Xiong, Dawei Zhu, Han Qian, Mingbo Song, Weimin Xiong, Xiaoguang Li, Qun Liu, Sujian Li
Generating comprehensive and accurate Wikipedia articles for newly emerging events presents significant challenges in real-world scenarios.
no code implementations • 25 Feb 2024 • Xuming Hu, Xiaochuan Li, Junzhe Chen, Yinghui Li, Yangning Li, Xiaoguang Li, Yasheng Wang, Qun Liu, Lijie Wen, Philip S. Yu, Zhijiang Guo
To this end, we propose evaluating the robustness of generative search engines in the realistic and high-risk setting, where adversaries have only black-box system access and seek to deceive the model into returning incorrect responses.
no code implementations • 24 Feb 2024 • Shu-Ting Pi, Michael Yang, Qun Liu
To tackle this, a machine learning model that accurately predicts the complexity of customer issues is highly desirable.
no code implementations • 24 Feb 2024 • Shu-Ting Pi, Michael Yang, Yuying Zhu, Qun Liu
Customer service is often the most time-consuming aspect for e-commerce websites, with each contact typically taking 10-15 minutes.
no code implementations • 24 Feb 2024 • Shu-Ting Pi, Cheng-Ping Hsieh, Qun Liu, Yuying Zhu
Our novel approach involves using machine learning techniques to tag customer questions in transcripts and create a repository of questions and corresponding labels.
1 code implementation • 19 Feb 2024 • Yuxin Jiang, YuFei Wang, Chuhan Wu, Wanjun Zhong, Xingshan Zeng, Jiahui Gao, Liangyou Li, Xin Jiang, Lifeng Shang, Ruiming Tang, Qun Liu, Wei Wang
Knowledge editing techniques, aiming to efficiently modify a minor proportion of knowledge in large language models (LLMs) without negatively impacting performance across other inputs, have garnered widespread attention.
no code implementations • 9 Feb 2024 • Yvette Graham, Mohammed Rameez Qureshi, Haider Khalid, Gerasimos Lampouras, Ignacio Iacobacci, Qun Liu
SCI-CHAT follows previous workshops on open-domain dialogue, but in contrast the focus of the shared task is the simulation of intelligent conversation as judged in a live human evaluation.
1 code implementation • 30 Jan 2024 • Shijue Huang, Wanjun Zhong, Jianqiao Lu, Qi Zhu, Jiahui Gao, Weiwen Liu, Yutai Hou, Xingshan Zeng, Yasheng Wang, Lifeng Shang, Xin Jiang, Ruifeng Xu, Qun Liu
The recent trend of using Large Language Models (LLMs) as tool agents in real-world applications underscores the necessity for comprehensive evaluations of their capabilities, particularly in complex scenarios involving planning, creating, and using tools.
1 code implementation • 30 Jan 2024 • Wai-Chung Kwan, Xingshan Zeng, Yuxin Jiang, YuFei Wang, Liangyou Li, Lifeng Shang, Xin Jiang, Qun Liu, Kam-Fai Wong
Large language models (LLMs) are increasingly relied upon for complex multi-turn conversations across diverse real-world applications.
no code implementations • 28 Jan 2024 • Jianqiao Lu, Wanjun Zhong, YuFei Wang, Zhijiang Guo, Qi Zhu, Wenyong Huang, Yanlin Wang, Fei Mi, Baojun Wang, Yasheng Wang, Lifeng Shang, Xin Jiang, Qun Liu
With the teacher's guidance, the student learns to iteratively refine its answer with feedback, and forms a robust and comprehensive understanding of the posed questions.
1 code implementation • 26 Jan 2024 • Haochen Tan, Zhijiang Guo, Zhan Shi, Lu Xu, Zhili Liu, Yunlong Feng, Xiaoguang Li, Yasheng Wang, Lifeng Shang, Qun Liu, Linqi Song
Large Language Models (LLMs) have succeeded remarkably in understanding long-form content.
1 code implementation • 17 Jan 2024 • Yu Pan, Ye Yuan, Yichun Yin, Jiaxin Shi, Zenglin Xu, Ming Zhang, Lifeng Shang, Xin Jiang, Qun Liu
The rapid progress of Transformers in artificial intelligence has come at the cost of increased resource consumption and greenhouse gas emissions due to growing model sizes.
no code implementations • 27 Dec 2023 • Yunhe Wang, Hanting Chen, Yehui Tang, Tianyu Guo, Kai Han, Ying Nie, Xutao Wang, Hailin Hu, Zheyuan Bai, Yun Wang, Fangcheng Liu, Zhicheng Liu, Jianyuan Guo, Sinan Zeng, Yinchen Zhang, Qinghua Xu, Qun Liu, Jun Yao, Chao Xu, DaCheng Tao
We then demonstrate through carefully designed ablations that the proposed approach is significantly effective for enhancing model nonlinearity; thus, we present a new efficient model architecture, namely PanGu-$\pi$.
1 code implementation • 18 Dec 2023 • Nandan Thakur, Luiz Bonifacio, Xinyu Zhang, Odunayo Ogundepo, Ehsan Kamalloo, David Alfonso-Hermelo, Xiaoguang Li, Qun Liu, Boxing Chen, Mehdi Rezagholizadeh, Jimmy Lin
NoMIRACL includes both a non-relevant and a relevant subset.
1 code implementation • 17 Dec 2023 • Jiankai Sun, Chuanyang Zheng, Enze Xie, Zhengying Liu, Ruihang Chu, Jianing Qiu, Jiaqi Xu, Mingyu Ding, Hongyang Li, Mengzhe Geng, Yue Wu, Wenhai Wang, Junsong Chen, Zhangyue Yin, Xiaozhe Ren, Jie Fu, Junxian He, Wu Yuan, Qi Liu, Xihui Liu, Yu Li, Hao Dong, Yu Cheng, Ming Zhang, Pheng Ann Heng, Jifeng Dai, Ping Luo, Jingdong Wang, Ji-Rong Wen, Xipeng Qiu, Yike Guo, Hui Xiong, Qun Liu, Zhenguo Li
Reasoning, a crucial ability for complex problem-solving, plays a pivotal role in various real-world settings such as negotiation, medical diagnosis, and criminal investigation.
no code implementations • 12 Dec 2023 • Renlong Jie, Xiaojun Meng, Xin Jiang, Qun Liu
Unlike centrality-based ranking methods, our extractive scorer can be trained in an end-to-end manner, without requiring any positional assumption.
1 code implementation • 4 Dec 2023 • Zige Wang, Wanjun Zhong, YuFei Wang, Qi Zhu, Fei Mi, Baojun Wang, Lifeng Shang, Xin Jiang, Qun Liu
This survey aims to provide a comprehensive overview of current research in data management within both the pretraining and supervised fine-tuning stages of LLMs, covering various aspects of data management strategy design.
1 code implementation • 31 Oct 2023 • Yuxin Jiang, YuFei Wang, Xingshan Zeng, Wanjun Zhong, Liangyou Li, Fei Mi, Lifeng Shang, Xin Jiang, Qun Liu, Wei Wang
To fill this research gap, in this paper, we propose FollowBench, a Multi-level Fine-grained Constraints Following Benchmark for LLMs.
1 code implementation • 30 Oct 2023 • Wai-Chung Kwan, Xingshan Zeng, YuFei Wang, Yusen Sun, Liangyou Li, Lifeng Shang, Qun Liu, Kam-Fai Wong
In this paper, we propose M4LE, a Multi-ability, Multi-range, Multi-task, Multi-domain benchmark for Long-context Evaluation.
1 code implementation • 16 Oct 2023 • Jing Xiong, Jianhao Shen, Ye Yuan, Haiming Wang, Yichun Yin, Zhengying Liu, Lin Li, Zhijiang Guo, Qingxing Cao, Yinya Huang, Chuanyang Zheng, Xiaodan Liang, Ming Zhang, Qun Liu
Automated theorem proving (ATP) has become an appealing domain for exploring the reasoning ability of the recent successful generative language models.
no code implementations • 16 Oct 2023 • Kai Chen, Chunwei Wang, Kuo Yang, Jianhua Han, Lanqing Hong, Fei Mi, Hang Xu, Zhengying Liu, Wenyong Huang, Zhenguo Li, Dit-yan Yeung, Lifeng Shang, Xin Jiang, Qun Liu
The rapid development of large language models (LLMs) has not only provided numerous opportunities but also presented significant challenges.
1 code implementation • 12 Oct 2023 • Boyang Xue, Weichao Wang, Hongru Wang, Fei Mi, Rui Wang, Yasheng Wang, Lifeng Shang, Xin Jiang, Qun Liu, Kam-Fai Wong
Inspired by previous work which identified that feed-forward networks (FFNs) within Transformers are responsible for factual knowledge expressions, we investigate two methods to efficiently improve the factual expression capability {of FFNs} by knowledge enhancement and alignment respectively.
no code implementations • 1 Oct 2023 • Jianqiao Lu, Wanjun Zhong, Wenyong Huang, YuFei Wang, Qi Zhu, Fei Mi, Baojun Wang, Weichao Wang, Xingshan Zeng, Lifeng Shang, Xin Jiang, Qun Liu
SELF initiates with a meta-skill learning process that equips the LLMs with capabilities for self-feedback and self-refinement.
1 code implementation • 8 Sep 2023 • Chengwu Liu, Jianhao Shen, Huajian Xin, Zhengying Liu, Ye Yuan, Haiming Wang, Wei Ju, Chuanyang Zheng, Yichun Yin, Lin Li, Ming Zhang, Qun Liu
We present FIMO, an innovative dataset comprising formal mathematical problem statements sourced from the International Mathematical Olympiad (IMO) Shortlisted Problems.
no code implementations • 23 Aug 2023 • Renlong Jie, Xiaojun Meng, Lifeng Shang, Xin Jiang, Qun Liu
Large language models (LLMs) like ChatGPT and GPT-4 have attracted great attention given their surprising performance on a wide range of NLP tasks.
no code implementations • 12 Aug 2023 • Siheng Li, Cheng Yang, Yichun Yin, Xinyu Zhu, Zesen Cheng, Lifeng Shang, Xin Jiang, Qun Liu, Yujiu Yang
Information-seeking conversation, which aims to help users gather information through conversation, has achieved great progress in recent years.
1 code implementation • 12 Aug 2023 • Siheng Li, Yichun Yin, Cheng Yang, Wangjie Jiang, Yiwei Li, Zesen Cheng, Lifeng Shang, Xin Jiang, Qun Liu, Yujiu Yang
In this paper, we propose a novel task, Proactive News Grounded Conversation, in which a dialogue system can proactively lead the conversation based on some key topics of the news.
1 code implementation • 24 Jul 2023 • YuFei Wang, Wanjun Zhong, Liangyou Li, Fei Mi, Xingshan Zeng, Wenyong Huang, Lifeng Shang, Xin Jiang, Qun Liu
(2) Training methodologies: a detailed review of the prevailing training methods employed for LLM alignment.
1 code implementation • 13 Jul 2023 • Pei Ke, Fei Huang, Fei Mi, Yasheng Wang, Qun Liu, Xiaoyan Zhu, Minlie Huang
Existing evaluation metrics for natural language generation (NLG) tasks face the challenges on generalization ability and interpretability.
1 code implementation • ACL 2023 • Guanhua Chen, Lu Hou, Yun Chen, Wenliang Dai, Lifeng Shang, Xin Jiang, Qun Liu, Jia Pan, Wenping Wang
Furthermore, to enhance the token- and sentence-level multilingual representation of the MTE, we propose to train it with machine translation and contrastive learning jointly before the TriKD to provide a better initialization.
1 code implementation • 3 Jul 2023 • Min Li, Hao Zhou, Qun Liu, Yabin Shao, GuoYing Wang
It uses granular balls to simulate the spatial distribution characteristics of datasets, and informed entropy is utilized to further optimize the granular-ball space.
no code implementations • 22 May 2023 • Renlong Jie, Xiaojun Meng, Lifeng Shang, Xin Jiang, Qun Liu
This study proposes a multitask learning architecture for extractive summarization with coherence boosting.
1 code implementation • 17 May 2023 • Chuang Liu, Renren Jin, Yuqi Ren, Linhao Yu, Tianyu Dong, Xiaohan Peng, Shuting Zhang, Jianxiang Peng, Peiyi Zhang, Qingqing Lyu, Xiaowen Su, Qun Liu, Deyi Xiong
Comprehensively evaluating the capability of large language models in multiple tasks is of great importance.
no code implementations • 8 May 2023 • Zenan Xu, Xiaojun Meng, Yasheng Wang, Qinliang Su, Zexuan Qiu, Xin Jiang, Qun Liu
Multimodal abstractive summarization for videos (MAS) requires generating a concise textual summary to describe the highlights of a video according to multimodal resources, in our case, the video content and its transcript.
no code implementations • 3 May 2023 • Hao Cheng, Meng Zhang, Liangyou Li, Qun Liu, Zhihua Zhang
Utilizing pivot language effectively can significantly improve low-resource machine translation.
no code implementations • 3 May 2023 • Hao Cheng, Meng Zhang, Weixuan Wang, Liangyou Li, Qun Liu, Zhihua Zhang
We can use automatic summarization or machine translation evaluation metrics for length-controllable machine translation, but this is not necessarily suitable and accurate.
no code implementations • 12 Apr 2023 • Weixuan Wang, Wei Peng, Qun Liu
Visualization methods like heatmaps, T-SNE and translation examples are also utilized to demonstrate the effects of the proposed method.
no code implementations • 20 Mar 2023 • Xiaozhe Ren, Pingyi Zhou, Xinfan Meng, Xinjing Huang, Yadao Wang, Weichao Wang, Pengfei Li, Xiaoda Zhang, Alexander Podolskiy, Grigory Arshinov, Andrey Bout, Irina Piontkovskaya, Jiansheng Wei, Xin Jiang, Teng Su, Qun Liu, Jun Yao
In this work, we develop a system for training a trillion-parameter language model on a cluster of Ascend 910 AI processors using the MindSpore framework, and present the language model with 1.085T parameters, named PanGu-$\Sigma$.
no code implementations • 24 Feb 2023 • Qiuchi Li, Benyou Wang, Yudong Zhu, Christina Lioma, Qun Liu
The emerging classical-quantum transfer learning paradigm has brought a decent performance to quantum computational models in many tasks, such as computer vision, by enabling a combination of quantum models and classical pre-trained neural networks.
1 code implementation • 29 Dec 2022 • Li Liu, Penggang Chen, Xin Li, William K. Cheung, Youmin Zhang, Qun Liu, Guoyin Wang
Aligning users across networks using graph representation learning has been found effective where the alignment is accomplished in a low-dimensional embedding space.
1 code implementation • 21 Dec 2022 • Hao Sun, Zhexin Zhang, Fei Mi, Yasheng Wang, Wei Liu, Jianwei Cui, Bin Wang, Qun Liu, Minlie Huang
In this paper, we propose a framework, MoralDial to train and evaluate moral dialogue systems.
1 code implementation • 19 Dec 2022 • Haoli Bai, Zhiguang Liu, Xiaojun Meng, Wentao Li, Shuang Liu, Nian Xie, Rongfu Zheng, Liangwei Wang, Lu Hou, Jiansheng Wei, Xin Jiang, Qun Liu
While various vision-language pre-training objectives are studied in existing solutions, the document textline, as an intrinsic granularity in VDU, has seldom been explored so far.
no code implementations • 17 Dec 2022 • Xingshan Zeng, Liangyou Li, Qun Liu
To alleviate the data scarcity problem in End-to-end speech translation (ST), pre-training on data for speech recognition and machine translation is considered as an important technique.
no code implementations • 15 Dec 2022 • Jiawei Zhou, Xiaoguang Li, Lifeng Shang, Xin Jiang, Qun Liu, Lei Chen
In light of this, we present Vocabulary Disentangled Retrieval (VDR), a retrieval-based framework that harnesses natural language as proxies of the underlying data variation to drive disentangled representation learning.
1 code implementation • 7 Dec 2022 • Zhongwei Wan, Yichun Yin, Wei zhang, Jiaxin Shi, Lifeng Shang, Guangyong Chen, Xin Jiang, Qun Liu
Recently, domain-specific PLMs have been proposed to boost the task performance of specific domains (e.g., biomedical and computer science) by continuing to pre-train general PLMs with domain-specific corpora.
no code implementations • 4 Dec 2022 • Qi Zhu, Fei Mi, Zheng Zhang, Yasheng Wang, Yitong Li, Xin Jiang, Qun Liu, Xiaoyan Zhu, Minlie Huang
For the former, the grounding knowledge consists of keywords extracted from the response.
1 code implementation • 28 Nov 2022 • Yusen Sun, Liangyou Li, Qun Liu, Dit-yan Yeung
Although lyrics generation has achieved significant progress in recent years, it has limited practical applications because the generated lyrics cannot be performed without composing compatible melodies.
no code implementations • 26 Nov 2022 • Xiaojun Meng, Wenlin Dai, Yasheng Wang, Baojun Wang, Zhiyong Wu, Xin Jiang, Qun Liu
Then we present a novel lexicon-injected semantic parser, which collects slot labels of the tree representation as a lexicon, and injects lexical features into the span representation of the parser.
1 code implementation • 13 Nov 2022 • Yufei Huang, Yujia Qin, Huadong Wang, Yichun Yin, Maosong Sun, Zhiyuan Liu, Qun Liu
Inspired by these observations, we propose Fast Prompt Tuning (FPT), which starts by conducting PT using a small-scale partial PLM, and then progressively expands its depth and width until the full-model size.
1 code implementation • 8 Nov 2022 • Hao Peng, Xiaozhi Wang, Shengding Hu, Hailong Jin, Lei Hou, Juanzi Li, Zhiyuan Liu, Qun Liu
We believe this is a critical bottleneck for realizing human-like cognition in PLMs.
no code implementations • 21 Oct 2022 • Dongsheng Chen, Chaofan Tao, Lu Hou, Lifeng Shang, Xin Jiang, Qun Liu
Recent large-scale video-language pre-trained models have shown appealing performance on various downstream tasks.
no code implementations • 20 Oct 2022 • Shaobo Li, Xiaoguang Li, Lifeng Shang, Chengjie Sun, Bingquan Liu, Zhenzhou Ji, Xin Jiang, Qun Liu
Further experiments on question-answering datasets show that trying to learn a deterministic relationship with the proposed methods can also help other knowledge-intensive tasks.
1 code implementation • 18 Oct 2022 • Xinyu Zhang, Nandan Thakur, Odunayo Ogundepo, Ehsan Kamalloo, David Alfonso-Hermelo, Xiaoguang Li, Qun Liu, Mehdi Rezagholizadeh, Jimmy Lin
MIRACL (Multilingual Information Retrieval Across a Continuum of Languages) is a multilingual dataset we have built for the WSDM 2023 Cup challenge that focuses on ad hoc retrieval across 18 different languages, which collectively encompass over three billion native speakers around the world.
no code implementations • 17 Aug 2022 • Zhihua Jin, Xingbo Wang, Furui Cheng, Chunhui Sun, Qun Liu, Huamin Qu
Since shortcuts vary in coverage, productivity, and semantic meaning, it is challenging for NLU experts to systematically understand and avoid them when creating benchmark datasets.
1 code implementation • 22 Jul 2022 • Fenia Christopoulou, Gerasimos Lampouras, Milan Gritta, Guchun Zhang, Yinpeng Guo, Zhongqi Li, Qi Zhang, Meng Xiao, Bo Shen, Lin Li, Hao Yu, Li Yan, Pingyi Zhou, Xin Wang, Yuchi Ma, Ignacio Iacobacci, Yasheng Wang, Guangtai Liang, Jiansheng Wei, Xin Jiang, Qianxiang Wang, Qun Liu
We present PanGu-Coder, a pretrained decoder-only language model adopting the PanGu-Alpha architecture for text-to-code generation, i.e., the synthesis of programming language solutions given a natural language problem description.
no code implementations • Findings (NAACL) 2022 • Yinpeng Guo, Liangyou Li, Xin Jiang, Qun Liu
However, labeled cross-lingual corpus is expensive or even inaccessible, especially in the fields where labels are private, such as diagnostic results of symptoms in medicine and user profiles in business.
1 code implementation • 24 May 2022 • Jinghui Xiao, Qun Liu, Xin Jiang, Yuanfeng Xiong, Haiteng Wu, Zhe Zhang
Pinyin-to-Character conversion (P2C) is the key task of the Input Method Engine (IME) in commercial input software for Asian languages such as Chinese, Japanese, and Thai.
no code implementations • 21 May 2022 • Abbas Ghaddar, Yimeng Wu, Sunyam Bagga, Ahmad Rashid, Khalil Bibi, Mehdi Rezagholizadeh, Chao Xing, Yasheng Wang, Duan Xinyu, Zhefeng Wang, Baoxing Huai, Xin Jiang, Qun Liu, Philippe Langlais
There is a growing body of work in recent years to develop pre-trained language models (PLMs) for the Arabic language.
1 code implementation • ICLR 2022 • Yuxin Ren, Benyou Wang, Lifeng Shang, Xin Jiang, Qun Liu
A tiny version achieves 96.7% of BERT-base's performance with 1/48 of the encoder parameters (i.e., less than 2M parameters excluding the embedding layer) and is 2.7× faster at inference.
no code implementations • CVPR 2022 • Cheng Chen, Yudong Zhu, Zhenshan Tan, Qingrong Cheng, Xin Jiang, Qun Liu, Xiaodong Gu
In this paper, we propose a contrastive learning-based framework UTC to unify and facilitate both discriminative and generative tasks in visual dialog with a single model.
2 code implementations • 31 Mar 2022 • Fei Mi, Yitong Li, Yulong Zeng, Jingyan Zhou, Yasheng Wang, Chuanfei Xu, Lifeng Shang, Xin Jiang, Shiqi Zhao, Qun Liu
We investigate different aspects of responses generated by PanGu-Bot, including response quality, knowledge, and safety.
no code implementations • Findings (ACL) 2022 • Shaobo Li, Xiaoguang Li, Lifeng Shang, Zhenhua Dong, Chengjie Sun, Bingquan Liu, Zhenzhou Ji, Xin Jiang, Qun Liu
We check the words that have three typical associations with the missing words: knowledge-dependent, positionally close, and highly co-occurring.
no code implementations • ACL 2022 • Chaofan Tao, Lu Hou, Wei zhang, Lifeng Shang, Xin Jiang, Qun Liu, Ping Luo, Ngai Wong
We find that previous quantization methods fail on generative tasks due to the homogeneous word embeddings caused by reduced capacity, and the varied distribution of weights.
no code implementations • ACL 2022 • Meng Zhang, Liangyou Li, Qun Liu
Triangular machine translation is a special case of low-resource machine translation where the language pair of interest has limited parallel data, but both languages have abundant parallel data with a pivot language.
1 code implementation • ACL 2022 • Pengfei Li, Liangyou Li, Meng Zhang, Minghao Wu, Qun Liu
To the best of our knowledge, this is the first work to pre-train a unified model for fine-tuning on both NMT tasks.
1 code implementation • ACL 2022 • Jiawei Zhou, Xiaoguang Li, Lifeng Shang, Lan Luo, Ke Zhan, Enrui Hu, Xinyu Zhang, Hao Jiang, Zhao Cao, Fan Yu, Xin Jiang, Qun Liu, Lei Chen
To alleviate the data scarcity problem in training question answering systems, recent works propose additional intermediate pre-training for dense passage retrieval (DPR).
no code implementations • Findings (ACL) 2022 • Wenliang Dai, Lu Hou, Lifeng Shang, Xin Jiang, Qun Liu, Pascale Fung
Furthermore, the original textual language understanding and generation ability of the PLM is maintained after VLKD, which makes our model versatile for both multimodal and unimodal tasks.
1 code implementation • ACL 2022 • Tianbo Ji, Yvette Graham, Gareth J. F. Jones, Chenyang Lyu, Qun Liu
Answering the distress call of competitions that have emphasized the urgent need for better evaluation techniques in dialogue, we present the successful development of human evaluation that is highly reliable while still remaining feasible and low cost.
no code implementations • Findings (ACL) 2022 • Xin Wang, Yasheng Wang, Yao Wan, Fei Mi, Yitong Li, Pingyi Zhou, Jin Liu, Hao Wu, Xin Jiang, Qun Liu
Automatically generating compilable programs with (or without) natural language descriptions has always been a touchstone problem for computational linguistics and automated software engineering.
no code implementations • 8 Mar 2022 • Zhengkun Zhang, Wenya Guo, Xiaojun Meng, Yasheng Wang, Yadao Wang, Xin Jiang, Qun Liu, Zhenglu Yang
In this paper, we design a novel unified parameter-efficient transfer learning framework that works effectively on both pure language and V&L tasks.
no code implementations • Findings (ACL) 2022 • Dan Su, Xiaoguang Li, Jindi Zhang, Lifeng Shang, Xin Jiang, Qun Liu, Pascale Fung
Long-form question answering (LFQA) aims to generate a paragraph-length answer for a given question.
Ranked #1 on Question Answering on KILT: ELI5.
no code implementations • 16 Feb 2022 • Jingyan Zhou, Jiawen Deng, Fei Mi, Yitong Li, Yasheng Wang, Minlie Huang, Xin Jiang, Qun Liu, Helen Meng
The research on open-domain dialogue systems has greatly prospered thanks to neural models trained on large-scale corpora; however, such corpora often introduce various safety problems (e.g., offensive language, biases, and toxic behaviors) that significantly hinder the deployment of dialogue systems in practice.
no code implementations • COLING 2022 • Yihe Wang, Yitong Li, Yasheng Wang, Fei Mi, Pingyi Zhou, Xin Wang, Jin Liu, Xin Jiang, Qun Liu
Experiments over publicly available datasets demonstrate that our method can help models generate better responses, even though such training data are usually regarded as low quality.
1 code implementation • ICLR 2022 • Wenyong Huang, Zhenhe Zhang, Yu Ting Yeung, Xin Jiang, Qun Liu
The student network is trained to output representation resembling that of the teacher.
no code implementations • Findings (NAACL) 2022 • Mengjie Zhao, Fei Mi, Yasheng Wang, Minglei Li, Xin Jiang, Qun Liu, Hinrich Schütze
We propose LMTurk, a novel approach that treats few-shot learners as crowdsourcing workers.
1 code implementation • 8 Dec 2021 • Abbas Ghaddar, Yimeng Wu, Ahmad Rashid, Khalil Bibi, Mehdi Rezagholizadeh, Chao Xing, Yasheng Wang, Duan Xinyu, Zhefeng Wang, Baoxing Huai, Xin Jiang, Qun Liu, Philippe Langlais
Language-specific pre-trained models have proven to be more accurate than multilingual ones in a monolingual evaluation setting; Arabic is no exception.
1 code implementation • 22 Nov 2021 • Zihan Yan, Li Liu, Xin Li, William K. Cheung, Youmin Zhang, Qun Liu, Guoyin Wang
Social network alignment aims at aligning person identities across social networks.
no code implementations • 16 Nov 2021 • Nianzu Zheng, Liqun Deng, Wenyong Huang, Yu Ting Yeung, Baohua Xu, Yuanyuan Guo, Yasheng Wang, Xiao Chen, Xin Jiang, Qun Liu
We utilize a conv-transformer structure to encode input speech in a streaming manner.
no code implementations • dialdoc (ACL) 2022 • Xinyan Zhao, Bin He, Yasheng Wang, Yitong Li, Fei Mi, Yajiao Liu, Xin Jiang, Qun Liu, Huanhuan Chen
With the advances in deep learning, tremendous progress has been made with chit-chat dialogue systems and task-oriented dialogue systems.
no code implementations • ACL 2022 • Cheng Chen, Yichun Yin, Lifeng Shang, Xin Jiang, Yujia Qin, Fengyu Wang, Zhi Wang, Xiao Chen, Zhiyuan Liu, Qun Liu
However, large language model pre-training costs intensive computational resources, and most of the models are trained from scratch without reusing existing pre-trained models, which is wasteful.
no code implementations • 29 Sep 2021 • Chao Xing, Dong Wang, LiRong Dai, Qun Liu, Anderson Avila
Overparameterized transformer-based architectures have shown remarkable performance in recent years, achieving state-of-the-art results in speech processing tasks such as speech recognition, speech synthesis, keyword spotting, and speech enhancement.
no code implementations • 28 Sep 2021 • Qianmengke Zhao, Ye Wang, Qun Liu
Although deep learning models are powerful across various applications, most remain black boxes, lacking verifiability and interpretability, which means their decision-making processes cannot be understood by human beings.
1 code implementation • EMNLP 2021 • Baojun Wang, Zhao Zhang, Kun Xu, Guang-Yuan Hao, Yuyang Zhang, Lifeng Shang, Linlin Li, Xiao Chen, Xin Jiang, Qun Liu
Incorporating lexical knowledge into deep learning models has proven to be very effective for sequence labeling tasks.
no code implementations • EMNLP 2021 • Chenyang Lyu, Lifeng Shang, Yvette Graham, Jennifer Foster, Xin Jiang, Qun Liu
Template-based QG uses linguistically-informed heuristics to transform declarative sentences into interrogatives, whereas supervised QG uses existing Question Answering (QA) datasets to train a system to generate a question given a passage and an answer.
no code implementations • 13 Sep 2021 • Zhengkun Zhang, Xiaojun Meng, Yasheng Wang, Xin Jiang, Qun Liu, Zhenglu Yang
Specifically, we adopt knowledge distillation from a vision-language pretrained model to improve image selection, which avoids any requirement on the existence and quality of image captions.
no code implementations • 10 Sep 2021 • Fei Mi, Yitong Li, Yasheng Wang, Xin Jiang, Qun Liu
As labeling cost for different modules in task-oriented dialog (ToD) systems is high, a major challenge in practice is to learn different tasks with the least amount of labeled data.
1 code implementation • 9 Sep 2021 • Yinquan Lu, Haonan Lu, Guirong Fu, Qun Liu
Incorporating factual knowledge into pre-trained language models (PLM) such as BERT is an emerging trend in recent NLP studies.
Ranked #11 on Common Sense Reasoning on ReCoRD.
no code implementations • 7 Sep 2021 • Shaobo Li, Qun Liu, Xin Jiang, Yichun Yin, Chengjie Sun, Bingquan Liu, Zhenzhou Ji, Lifeng Shang
Human-designed rules are widely used to build industry applications.
no code implementations • 7 Sep 2021 • Zhihua Jin, Xin Jiang, Xingbo Wang, Qun Liu, Yong Wang, Xiaozhe Ren, Huamin Qu
However, those models do not consider the numerical properties of numbers and cannot perform robustly on numerical reasoning tasks (e.g., math word problems and measurement estimation).
no code implementations • Findings (EMNLP) 2021 • Jianhao Shen, Yichun Yin, Lin Li, Lifeng Shang, Xin Jiang, Ming Zhang, Qun Liu
Math word problem (MWP) is a challenging and critical task in natural language processing.
Ranked #3 on Math Word Problem Solving on Math23K.
no code implementations • EMNLP 2021 • Minghao Wu, Yitong Li, Meng Zhang, Liangyou Li, Gholamreza Haffari, Qun Liu
In this work, we propose an approach, MultiUAT, that dynamically adjusts the training data usage based on the model's uncertainty on a small set of trusted clean data for multi-corpus machine translation.
no code implementations • ACL 2021 • Zhiqi Huang, Lu Hou, Lifeng Shang, Xin Jiang, Xiao Chen, Qun Liu
Transformer-based pre-trained language models like BERT, though powerful in many tasks, are expensive in both memory and computation, due to their large number of parameters.
no code implementations • ACL 2021 • Jie He, Bo Peng, Yi Liao, Qun Liu, Deyi Xiong
Each error is hence manually labeled with comprehensive annotations, including the span of the error, the associated span, minimal correction to the error, the type of the error, and rationale behind the error.
1 code implementation • ACL 2021 • Yichun Yin, Cheng Chen, Lifeng Shang, Xin Jiang, Xiao Chen, Qun Liu
Specifically, we carefully design the techniques of one-shot learning and the search space to provide an adaptive and efficient development way of tiny PLMs for various latency constraints.
1 code implementation • ACL 2021 • Zhihong Shao, Lifeng Shang, Qun Liu, Minlie Huang
This setting gives rise to the spurious solution problem: there may exist many spurious solutions that coincidentally derive the correct answer, but training on such solutions can hurt model performance (e.g., producing wrong solutions or answers).
no code implementations • 9 Jun 2021 • Yinpeng Guo, Liangyou Li, Xin Jiang, Qun Liu
Recently, pre-training multilingual language models has shown great potential in learning multilingual representation, a crucial topic of natural language processing.
no code implementations • Findings (ACL) 2021 • Xingshan Zeng, Liangyou Li, Qun Liu
To bridge the modality gap between speech and text, RealTranS gradually downsamples the input speech with interleaved convolution and unidirectional Transformer layers for acoustic modeling, and then maps speech features into text space with a weighted-shrinking operation and a semantic encoder.
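A minimal PyTorch sketch of the interleaved convolution/unidirectional-Transformer downsampling idea; layer counts and dimensions are assumptions, not the paper's configuration:

```python
import torch
import torch.nn as nn

class ConvTransformerDownsampler(nn.Module):
    """Sketch of interleaved stride-2 convolutions and causal (unidirectional)
    Transformer layers for acoustic downsampling."""
    def __init__(self, d_model=256, n_blocks=3):
        super().__init__()
        self.convs = nn.ModuleList(
            [nn.Conv1d(d_model, d_model, kernel_size=3, stride=2, padding=1)
             for _ in range(n_blocks)])
        self.encoders = nn.ModuleList(
            [nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
             for _ in range(n_blocks)])

    def forward(self, x):                      # x: (batch, frames, d_model)
        for conv, enc in zip(self.convs, self.encoders):
            x = conv(x.transpose(1, 2)).transpose(1, 2)   # halve the frame rate
            causal = nn.Transformer.generate_square_subsequent_mask(x.size(1))
            x = enc(x, src_mask=causal)                   # unidirectional attention
        return x                                # 8x fewer frames than the input

feats = torch.randn(2, 160, 256)
print(ConvTransformerDownsampler()(feats).shape)  # torch.Size([2, 20, 256])
```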
2 code implementations • 1 Jun 2021 • Chenglei Si, Zhengyan Zhang, Yingfa Chen, Fanchao Qi, Xiaozhi Wang, Zhiyuan Liu, Yasheng Wang, Qun Liu, Maosong Sun
Pronunciation-based SubChar tokenizers can encode Chinese homophones into the same transliteration sequences and produce the same tokenization output, hence being robust to homophone typos.
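A minimal sketch of the pronunciation-encoding idea, assuming the pypinyin package as the romanizer (not necessarily what the paper's pipeline used):

```python
# Homophones map to identical pinyin sequences, so a tokenizer trained on
# these transliterations is robust to homophone typos.
from pypinyin import lazy_pinyin

def to_pronunciation(text: str) -> str:
    return " ".join(lazy_pinyin(text))

print(to_pronunciation("再见"))  # zai jian
print(to_pronunciation("在建"))  # zai jian -- same transliteration
```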
no code implementations • 1 Jun 2021 • Xingshan Zeng, Liangyou Li, Qun Liu
We use a unified transformer architecture for our MultiST model, so that the data from different modalities (i.e., speech and text) and different tasks (i.e., Speech Recognition, Machine Translation, and Speech Translation) can be exploited to enhance the model's ability.
no code implementations • 24 May 2021 • Mingyang Yi, Lu Hou, Jiacheng Sun, Lifeng Shang, Xin Jiang, Qun Liu, Zhi-Ming Ma
In this paper, after defining OOD generalization via Wasserstein distance, we theoretically show that a model robust to input perturbation generalizes well on OOD data.
1 code implementation • 14 May 2021 • Zhixing Tan, Zeyuan Yang, Meng Zhang, Qun Liu, Maosong Sun, Yang Liu
With the rapid development of artificial intelligence (AI), there is a trend in moving AI applications, such as neural machine translation (NMT), from cloud to mobile devices.
1 code implementation • Findings (ACL) 2021 • Silin Gao, Ryuichi Takanobu, Wei Peng, Qun Liu, Minlie Huang
To address this task, we propose a TOD system with hybrid knowledge management, HyKnow.
4 code implementations • 26 Apr 2021 • Wei Zeng, Xiaozhe Ren, Teng Su, Hui Wang, Yi Liao, Zhiwei Wang, Xin Jiang, ZhenZhang Yang, Kaisheng Wang, Xiaoda Zhang, Chen Li, Ziyan Gong, Yifan Yao, Xinjing Huang, Jun Wang, Jianfeng Yu, Qi Guo, Yue Yu, Yan Zhang, Jin Wang, Hengtao Tao, Dasen Yan, Zexuan Yi, Fang Peng, Fangqing Jiang, Han Zhang, Lingfeng Deng, Yehong Zhang, Zhe Lin, Chao Zhang, Shaojie Zhang, Mingyue Guo, Shanzhi Gu, Gaojun Fan, YaoWei Wang, Xuefeng Jin, Qun Liu, Yonghong Tian
To enhance the generalization ability of PanGu-$\alpha$, we collect 1.1TB of high-quality Chinese data from a wide range of domains to pretrain the model.
Ranked #1 on Reading Comprehension (One-Shot) on DuReader.
no code implementations • 24 Apr 2021 • Cheng Chen, Yichun Yin, Lifeng Shang, Zhi Wang, Xin Jiang, Xiao Chen, Qun Liu
Task-agnostic knowledge distillation, a teacher-student framework, has proven effective for BERT compression.
no code implementations • 18 Apr 2021 • Krtin Kumar, Peyman Passban, Mehdi Rezagholizadeh, Yiu Sing Lau, Qun Liu
Embedding matrices are key components in neural natural language processing (NLP) models, responsible for providing numerical representations of input tokens. (In this paper, words and subwords are referred to as tokens, and the term embedding refers only to embeddings of inputs.)
no code implementations • 25 Mar 2021 • Tong Cui, Jinghui Xiao, Liangyou Li, Xin Jiang, Qun Liu
Speech-enabled systems typically first convert audio to text through an automatic speech recognition (ASR) model and then feed the text to downstream natural language processing (NLP) modules.
no code implementations • 20 Mar 2021 • Liangyou Li, Andy Way, Qun Liu
We present graph-based translation models which translate source graphs into target strings.
1 code implementation • ICLR 2021 • Mingyang Yi, Lu Hou, Lifeng Shang, Xin Jiang, Qun Liu, Zhi-Ming Ma
Inspired by adversarial training, we minimize this maximal expected loss (MMEL) and obtain a simple and interpretable closed-form solution: more attention should be paid to augmented samples with large loss values (i.e., harder examples).
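A minimal sketch of such loss-proportional reweighting over augmented samples; the temperature here is an assumed knob, not a value from the paper:

```python
import torch
import torch.nn.functional as F

def mmel_loss(per_sample_losses: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
    """Weight each augmented sample's loss by a softmax over the losses
    themselves, so harder (higher-loss) augmentations get more attention."""
    weights = F.softmax(per_sample_losses.detach() / tau, dim=0)
    return (weights * per_sample_losses).sum()

losses = torch.tensor([0.2, 1.5, 0.7], requires_grad=True)  # k augmentations
print(mmel_loss(losses))  # dominated by the hardest augmentation (1.5)
```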
no code implementations • 11 Mar 2021 • Xiaoqi Jiao, Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Linlin Li, Fang Wang, Qun Liu
The multilingual pre-trained language models (e.g., mBERT, XLM and XLM-R) have shown impressive performance on cross-lingual natural language understanding tasks.
1 code implementation • 23 Jan 2021 • Junqiu Wei, Qun Liu, Yinpeng Guo, Xin Jiang
The pre-trained language models have achieved great successes in various natural language understanding (NLU) tasks due to their capacity to capture the deep contextualized information in text by pre-training on large-scale corpora.
no code implementations • ICLR 2021 • Benyou Wang, Lifeng Shang, Christina Lioma, Xin Jiang, Hao Yang, Qun Liu, Jakob Grue Simonsen
Various Position Embeddings (PEs) have been proposed in Transformer-based architectures (e.g., BERT) to model word order.
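For reference, the classic sinusoidal absolute PE (Vaswani et al., 2017) is one member of the PE families analyzed in this line of work; a minimal sketch:

```python
import numpy as np

def sinusoidal_pe(max_len: int, d_model: int) -> np.ndarray:
    """Classic sinusoidal absolute position embedding (d_model assumed even)."""
    pos = np.arange(max_len)[:, None]            # (max_len, 1)
    dim = np.arange(0, d_model, 2)[None, :]      # (1, d_model/2)
    angle = pos / (10000 ** (dim / d_model))
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angle)                  # even dims: sine
    pe[:, 1::2] = np.cos(angle)                  # odd dims: cosine
    return pe

print(sinusoidal_pe(128, 64).shape)  # (128, 64)
```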
1 code implementation • 31 Dec 2020 • Chenglei Si, Zhengyan Zhang, Fanchao Qi, Zhiyuan Liu, Yasheng Wang, Qun Liu, Maosong Sun
In this work, we propose a simple and effective method to cover a much larger proportion of the attack search space, called Adversarial and Mixup Data Augmentation (AMDA).
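A minimal sketch of the mixup half of the method; in AMDA the interpolated pairs include adversarial examples, which are replaced by random placeholders here, and alpha is an assumed Beta-distribution parameter:

```python
import torch

def mixup(emb_a, emb_b, label_a, label_b, alpha: float = 0.2):
    """Mixup on (token-)embedding inputs and one-hot labels."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    return lam * emb_a + (1 - lam) * emb_b, lam * label_a + (1 - lam) * label_b

x = torch.randn(16, 128, 768)                  # batch of sequence embeddings
y = torch.eye(2)[torch.randint(0, 2, (16,))]   # one-hot labels
x_mix, y_mix = mixup(x, x.flip(0), y, y.flip(0))
```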
1 code implementation • ACL 2021 • Haoli Bai, Wei zhang, Lu Hou, Lifeng Shang, Jing Jin, Xin Jiang, Qun Liu, Michael Lyu, Irwin King
In this paper, we propose BinaryBERT, which pushes BERT quantization to the limit by weight binarization.
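A minimal sketch of the standard alpha·sign(W) weight binarization underlying 1-bit quantization; BinaryBERT's ternary-weight-splitting initialization, the paper's key contribution, is not shown:

```python
import torch

def binarize(weight: torch.Tensor) -> torch.Tensor:
    """1-bit quantization: alpha * sign(W), with alpha set to the mean
    absolute value so the binarized tensor matches W in L1 norm."""
    alpha = weight.abs().mean()
    return alpha * torch.sign(weight)

w = torch.randn(768, 768)
w_bin = binarize(w)
print(torch.unique(w_bin).numel())  # 2 distinct values (+/- alpha)
```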
no code implementations • 31 Dec 2020 • Shaobo Li, Xiaoguang Li, Lifeng Shang, Xin Jiang, Qun Liu, Chengjie Sun, Zhenzhou Ji, Bingquan Liu
In this paper, we propose a new retrieval target, hop, to collect the hidden reasoning evidence from Wikipedia for complex question answering.
Ranked #6 on Question Answering on HotpotQA.
no code implementations • Findings (EMNLP) 2021 • Peyman Passban, Puneeth S. M. Saladi, Qun Liu
There is a large body of work in the NMT literature on analyzing the behavior of conventional models for the problem of noise, but Transformers are relatively understudied in this context.
no code implementations • 27 Dec 2020 • Peyman Passban, Yimeng Wu, Mehdi Rezagholizadeh, Qun Liu
Knowledge distillation is considered as a training and compression strategy in which two neural networks, namely a teacher and a student, are coupled together during training.
no code implementations • 11 Dec 2020 • Xiaoqi Jiao, Huating Chang, Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Linlin Li, Fang Wang, Qun Liu
Comprehensive experiments on the evaluation benchmarks demonstrate that 1) layer mapping strategy has a significant effect on task-agnostic BERT distillation and different layer mappings can result in quite different performances; 2) the optimal layer mapping strategy from the proposed search process consistently outperforms the other heuristic ones; 3) with the optimal layer mapping, our student model achieves state-of-the-art performance on the GLUE tasks.
no code implementations • 7 Dec 2020 • Bin He, Di Zhou, Jing Xie, Jinghui Xiao, Xin Jiang, Qun Liu
Entities may have complex interactions in a knowledge graph (KG), such as multi-step relationships, which can be viewed as graph contextual information of the entities.
no code implementations • 7 Dec 2020 • Bin He, Xin Jiang, Jinghui Xiao, Qun Liu
Recent studies on pre-trained language models have demonstrated their ability to capture factual knowledge and applications in knowledge-aware downstream tasks.
no code implementations • EMNLP 2021 • Mingzhou Xu, Liangyou Li, Derek. F. Wong, Qun Liu, Lidia S. Chao
Previous works have shown that contextual information can improve the performance of neural machine translation (NMT).
no code implementations • 10 Nov 2020 • Ahmad Rashid, Alan Do-Omri, Md. Akmal Haidar, Qun Liu, Mehdi Rezagholizadeh
B-GAN is able to generate a distributed latent space representation which can be paired with an attention based decoder to generate fluent sentences.
no code implementations • 7 Nov 2020 • Zhengyan Zhang, Fanchao Qi, Zhiyuan Liu, Qun Liu, Maosong Sun
To measure the informativeness of attention heads, we train our Single-Shot Meta-Pruner (SMP) with a meta-learning paradigm aiming to maintain the distribution of text representations after pruning.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Bin He, Di Zhou, Jinghui Xiao, Xin Jiang, Qun Liu, Nicholas Jing Yuan, Tong Xu
Complex node interactions are common in knowledge graphs (KGs), and these interactions can be considered as contextualized knowledge exists in the topological structure of KGs.
3 code implementations • Findings of the Association for Computational Linguistics 2020 • Yudong Zhu, Di Zhou, Jinghui Xiao, Xin Jiang, Xiao Chen, Qun Liu
Natural language data exhibit tree-like hierarchical structures such as the hypernym-hyponym relations in WordNet.
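Such hierarchies are commonly embedded in hyperbolic space; a minimal sketch of the Poincaré-ball distance, shown as background rather than the paper's exact model:

```python
import numpy as np

def poincare_distance(u: np.ndarray, v: np.ndarray) -> float:
    """Distance in the Poincare ball (points must have norm < 1).
    Trees embed with low distortion: general concepts sit near the
    origin, specific ones near the boundary."""
    sq = np.sum((u - v) ** 2)
    denom = (1 - np.sum(u ** 2)) * (1 - np.sum(v ** 2))
    return float(np.arccosh(1 + 2 * sq / denom))

root = np.array([0.0, 0.0])            # a general concept near the origin
leaf = np.array([0.0, 0.9])            # a specific one near the boundary
print(poincare_distance(root, leaf))   # distances grow fast near the boundary
```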
no code implementations • 13 Oct 2020 • Qun Liu, Matthew Shreve, Raja Bala
Although data is abundant, data labeling is expensive.
2 code implementations • EMNLP 2020 • Yimeng Wu, Peyman Passban, Mehdi Rezagholizade, Qun Liu
With the growth of computing power, neural machine translation (NMT) models also grow accordingly and become better.
no code implementations • 2 Oct 2020 • Yang Bai, Xiaoguang Li, Gang Wang, Chaoliang Zhang, Lifeng Shang, Jun Xu, Zhaowei Wang, Fangshan Wang, Qun Liu
Term-based sparse representations dominate the first-stage text retrieval in industrial applications, due to its advantage in efficiency, interpretability, and exact term matching.
5 code implementations • EMNLP 2020 • Wei Zhang, Lu Hou, Yichun Yin, Lifeng Shang, Xiao Chen, Xin Jiang, Qun Liu
Transformer-based pre-training models like BERT have achieved remarkable performance in many natural language processing tasks. However, these models are expensive in both computation and memory, hindering their deployment on resource-constrained devices.
no code implementations • 28 Jul 2020 • Shuai Zhang, Peng Zhang, Xindian Ma, Junqiu Wei, Ningning Wang, Qun Liu
Transformer has been widely used in many Natural Language Processing (NLP) tasks, and the scaled dot-product attention between tokens is a core module of Transformer.
no code implementations • 8 May 2020 • Meng Zhang, Xin Jiang, Yang Liu, Qun Liu
In this work, we put machine translation in a cross-lingual pipeline and introduce downstream tasks to define task-specific acceptability of machine translations.
1 code implementation • EMNLP 2020 • Yun Chen, Yang Liu, Guanhua Chen, Xin Jiang, Qun Liu
Shift-Att is an interpretation method that induces alignments from the attention weights of Transformer and does not require parameter update or architecture change.
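A toy sketch of inducing hard alignments from an attention matrix; the shift-by-one reflects the method's name, while layer/head selection, a key part of the actual method, is simplified away:

```python
import numpy as np

def induce_alignment(attn: np.ndarray):
    """Toy hard-alignment induction from a (tgt_len x src_len) attention
    matrix: align each target word to its argmax source position, reading
    the attention of the *next* decoding step (hence the shift)."""
    shifted = np.vstack([attn[1:], attn[-1:]])   # row j+1 explains target j
    return [(j, int(row.argmax())) for j, row in enumerate(shifted)]

attn = np.array([[0.7, 0.2, 0.1],
                 [0.1, 0.8, 0.1],
                 [0.2, 0.2, 0.6]])
print(induce_alignment(attn))  # [(0, 1), (1, 2), (2, 2)]
```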
1 code implementation • ACL 2020 • Zhiyong Wu, Yun Chen, Ben Kao, Qun Liu
However, this approach of evaluating a language model is undermined by the uncertainty of the amount of knowledge that is learned by the probe itself.
3 code implementations • ACL 2020 • Yi Liao, Xin Jiang, Qun Liu
Masked language model and autoregressive language model are two types of language models.
3 code implementations • NeurIPS 2020 • Lu Hou, Zhiqi Huang, Lifeng Shang, Xin Jiang, Xiao Chen, Qun Liu
The pre-trained language models like BERT, though powerful in many natural language processing tasks, are expensive in both computation and memory.
no code implementations • 6 Apr 2020 • Wei Peng, Chongxuan Huang, Tian-Hao Li, Yun Chen, Qun Liu
Existing data augmentation approaches for neural machine translation (NMT) have predominantly relied on back-translating in-domain (IND) monolingual corpora.
no code implementations • 23 Jan 2020 • Qun Liu, Supratik Mukhopadhyay, Maria Ximena Bastidas Rodriguez, Xing Fu, Sushant Sahu, David Burk, Manas Gartia
Myocardial infarction (MI) is the scientific term for a heart attack.
no code implementations • 7 Jan 2020 • Supratik Mukhopadhyay, Qun Liu, Edward Collier, Yimin Zhu, Ravindra Gudishala, Chanachok Chokwitthaya, Robert DiBiano, Alimire Nabijiang, Sanaz Saeidi, Subhajit Sidhanta, Arnab Ganguly
The impacts of context factors driving human system interaction are challenging and are difficult to capture and replicate in existing design models.
1 code implementation • 18 Dec 2019 • Lei Zhang, Fanchao Qi, Zhiyuan Liu, Yasheng Wang, Qun Liu, Maosong Sun
A reverse dictionary takes the description of a target word as input and outputs the target word together with other words that match the description.
no code implementations • 8 Dec 2019 • Qun Liu, Subhashis Hazarika, John M. Patchett, James Paul Ahrens, Ayan Biswas
Data modeling and reduction for in situ analysis is important.
no code implementations • 5 Dec 2019 • Gang Chen, Yang Liu, Huanbo Luan, Meng Zhang, Qun Liu, Maosong Sun
While the use of neural networks has proven effective in improving story generation, how to learn to generate an explainable high-level plot still remains a major challenge.
no code implementations • 30 Nov 2019 • Bin He, Di Zhou, Jinghui Xiao, Xin Jiang, Qun Liu, Nicholas Jing Yuan, Tong Xu
Complex node interactions are common in knowledge graphs, and these interactions also contain rich knowledge information.
no code implementations • 20 Nov 2019 • Qun Liu, Lihua Fu, Meng Zhang
Synthetic and field data were tested to assess the performance of the proposed algorithm (DSPRecon algorithm); the advantages of using our method were evaluated by comparing it with the singular spectrum analysis (SSA) method for irregular data reconstruction and de-aliased Cadzow method for regular data reconstruction.
1 code implementation • 15 Nov 2019 • Qun Liu, Saikat Basu, Sangram Ganguly, Supratik Mukhopadhyay, Robert DiBiano, Manohar Karki, Ramakrishna Nemani
Satellite image classification is a challenging problem that lies at the crossroads of remote sensing, computer vision, and machine learning.
Ranked #1 on Satellite Image Classification on SAT-4.
no code implementations • 9 Nov 2019 • Yinpeng Guo, Yi Liao, Xin Jiang, Qing Zhang, Yibo Zhang, Qun Liu
Leveraging multilingual parallel texts to automatically generate paraphrases has drawn much attention, as the size of high-quality paraphrase corpora is limited.
no code implementations • 8 Nov 2019 • Liangyou Li, Xin Jiang, Qun Liu
Previous work on document-level NMT usually focuses on limited contexts because of degraded performance on larger contexts.
no code implementations • Asian Chapter of the Association for Computational Linguistics 2020 • Yun Chen, Liangyou Li, Xin Jiang, Xiao Chen, Qun Liu
Despite the success of neural machine translation (NMT), simultaneous neural machine translation (SNMT), the task of translating in real time before a full sentence has been observed, remains challenging due to the syntactic structure difference and simultaneity requirements.
1 code implementation • ACL 2020 • Yuan Zang, Fanchao Qi, Chenghao Yang, Zhiyuan Liu, Meng Zhang, Qun Liu, Maosong Sun
Also, further experiments show our model has higher transferability and can bring more robustness enhancement to victim models by adversarial training.
1 code implementation • 20 Oct 2019 • Yujia Qin, Fanchao Qi, Sicong Ouyang, Zhiyuan Liu, Cheng Yang, Yasheng Wang, Qun Liu, Maosong Sun
Sememes, the minimum semantic units of human languages, have been successfully utilized in various natural language processing applications.
10 code implementations • Findings of the Association for Computational Linguistics 2020 • Xiaoqi Jiao, Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Linlin Li, Fang Wang, Qun Liu
To accelerate inference and reduce model size while maintaining accuracy, we first propose a novel Transformer distillation method that is specially designed for knowledge distillation (KD) of the Transformer-based models.
Ranked #1 on Natural Language Inference on MultiNLI Dev.
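A minimal sketch of Transformer-layer distillation in this style, combining attention-map MSE with hidden-state MSE through a learnable projection; the dimensions below are illustrative:

```python
import torch
import torch.nn as nn

# Transformer-layer distillation: MSE on attention matrices plus MSE on
# student hidden states projected into the teacher's width.
d_student, d_teacher, heads, seq = 312, 768, 12, 128
proj = nn.Linear(d_student, d_teacher)   # learnable student->teacher map

def layer_distill_loss(attn_s, attn_t, hid_s, hid_t):
    attn_loss = nn.functional.mse_loss(attn_s, attn_t)
    hidden_loss = nn.functional.mse_loss(proj(hid_s), hid_t)
    return attn_loss + hidden_loss

attn_s, attn_t = torch.rand(2, heads, seq, seq), torch.rand(2, heads, seq, seq)
hid_s, hid_t = torch.randn(2, seq, d_student), torch.randn(2, seq, d_teacher)
print(layer_distill_loss(attn_s, attn_t, hid_s, hid_t))
```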
10 code implementations • 31 Aug 2019 • Junqiu Wei, Xiaozhe Ren, Xiaoguang Li, Wenyong Huang, Yi Liao, Yasheng Wang, Jiashu Lin, Xin Jiang, Xiao Chen, Qun Liu
The pre-trained language models have achieved great successes in various natural language understanding (NLU) tasks due to their capacity to capture the deep contextualized information in text by pre-training on large-scale corpora.
no code implementations • 21 Aug 2019 • Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Qun Liu
Neural dialog state trackers are generally limited due to the lack of quantity and diversity of annotated training data.
no code implementations • 11 Aug 2019 • Qun Liu, Edward Collier, Supratik Mukhopadhyay
We show that by learning the features at each resolution independently a trained model is able to accurately classify characters even in the presence of noise.
Ranked #1 on Image Classification on Noisy MNIST (AWGN).
no code implementations • WS 2019 • Wei Peng, Jianfeng Liu, Liangyou Li, Qun Liu
This paper describes Huawei's neural machine translation systems for the WMT 2019 biomedical translation shared task.
1 code implementation • ACL 2019 • Fanchao Qi, Jun-Jie Huang, Chenghao Yang, Zhiyuan Liu, Xiao Chen, Qun Liu, Maosong Sun
In this paper, we verify the effectiveness of sememes, the minimum semantic units of human languages, in modeling SC by a confirmatory experiment.
3 code implementations • 29 Jun 2019 • Yi Liao, Yasheng Wang, Qun Liu, Xin Jiang
We present a simple yet effective method for generating high-quality classical Chinese poetry with a Generative Pre-trained Language Model (GPT).