no code implementations • Findings (ACL) 2022 • Xianghong Fang, Jian Li, Lifeng Shang, Xin Jiang, Qun Liu, Dit-yan Yeung
While variational autoencoders (VAEs) have been widely applied in text generation tasks, they are troubled by two challenges: insufficient representation capacity and poor controllability.
no code implementations • ACL 2022 • Ningning Wang, Guobing Gan, Peng Zhang, Shuai Zhang, Junqiu Wei, Qun Liu, Xin Jiang
Other sparse methods use clustering patterns to select words, but the clustering process is separate from the training of the target task, which reduces effectiveness.
no code implementations • Findings (ACL) 2022 • Qiwei Bi, Jian Li, Lifeng Shang, Xin Jiang, Qun Liu, Hanfang Yang
With the adoption of large pre-trained models like BERT in news recommendation, the above way to incorporate multi-field information may encounter challenges: the shallow feature encoding to compress the category and entity information is not compatible with the deep BERT encoding.
no code implementations • Findings (ACL) 2022 • Jian Li, Jieming Zhu, Qiwei Bi, Guohao Cai, Lifeng Shang, Zhenhua Dong, Xin Jiang, Qun Liu
Accurately matching users' interests with candidate news is the key to news recommendation.
no code implementations • 20 Mar 2023 • Xiaozhe Ren, Pingyi Zhou, Xinfan Meng, Xinjing Huang, Yadao Wang, Weichao Wang, Pengfei Li, Xiaoda Zhang, Alexander Podolskiy, Grigory Arshinov, Andrey Bout, Irina Piontkovskaya, Jiansheng Wei, Xin Jiang, Teng Su, Qun Liu, Jun Yao
In this work, we develop a system that trains a trillion-parameter language model on a cluster of Ascend 910 AI processors with the MindSpore framework, and we present the resulting language model, PanGu-Σ, with 1.085T parameters.
no code implementations • 17 Mar 2023 • Jingxuan Wei, Shiyu Wu, Xin Jiang, Yequan Wang
The framework comprises a pretrained dialogue model (Blenderbot) and a diffusion model (Stable Diffusion).
no code implementations • 19 Dec 2022 • Haoli Bai, Zhiguang Liu, Xiaojun Meng, Wentao Li, Shuang Liu, Nian Xie, Rongfu Zheng, Liangwei Wang, Lu Hou, Jiansheng Wei, Xin Jiang, Qun Liu
While various vision-language pre-training objectives are studied in existing solutions, the document textline, as an intrinsic granularity in VDU, has seldom been explored so far.
no code implementations • 15 Dec 2022 • Jiawei Zhou, Xiaoguang Li, Lifeng Shang, Xin Jiang, Qun Liu, Lei Chen
Disentangled representation learning remains challenging as ground truth factors of variation do not naturally exist.
no code implementations • 7 Dec 2022 • Zhongwei Wan, Yichun Yin, Wei zhang, Jiaxin Shi, Lifeng Shang, Guangyong Chen, Xin Jiang, Qun Liu
Recently, domain-specific PLMs have been proposed to boost the task performance of specific domains (e.g., biomedical and computer science) by continuing to pre-train general PLMs with domain-specific corpora.
no code implementations • 4 Dec 2022 • Qi Zhu, Fei Mi, Zheng Zhang, Yasheng Wang, Yitong Li, Xin Jiang, Qun Liu, Xiaoyan Zhu, Minlie Huang
For the former, the grounding knowledge consists of keywords extracted from the response.
no code implementations • 26 Nov 2022 • Xiaojun Meng, Wenlin Dai, Yasheng Wang, Baojun Wang, Zhiyong Wu, Xin Jiang, Qun Liu
Then we present a novel lexicon-injected semantic parser, which collects slot labels from the tree representation as a lexicon and injects lexical features into the parser's span representations.
no code implementations • 21 Nov 2022 • Jiaru Jia, Mingzhe Liu, Jiake Xie, Xin Chen, Aiqing Yang, Xin Jiang, Hong Zhang, Yong Tang
Semantic segmentation models based on conventional neural networks can achieve remarkable performance in such tasks, while the dataset is crucial to the model training process.
no code implementations • 21 Oct 2022 • Dongsheng Chen, Chaofan Tao, Lu Hou, Lifeng Shang, Xin Jiang, Qun Liu
Recent large-scale video-language pre-trained models have shown appealing performance on various downstream tasks.
no code implementations • 20 Oct 2022 • Shaobo Li, Xiaoguang Li, Lifeng Shang, Chengjie Sun, Bingquan Liu, Zhenzhou Ji, Xin Jiang, Qun Liu
Further experiments on question-answering datasets show that trying to learn a deterministic relationship with the proposed methods can also help other knowledge-intensive tasks.
no code implementations • 22 Jul 2022 • Fenia Christopoulou, Gerasimos Lampouras, Milan Gritta, Guchun Zhang, Yinpeng Guo, Zhongqi Li, Qi Zhang, Meng Xiao, Bo Shen, Lin Li, Hao Yu, Li Yan, Pingyi Zhou, Xin Wang, Yuchi Ma, Ignacio Iacobacci, Yasheng Wang, Guangtai Liang, Jiansheng Wei, Xin Jiang, Qianxiang Wang, Qun Liu
We present PanGu-Coder, a pretrained decoder-only language model adopting the PanGu-Alpha architecture for text-to-code generation, i.e., the synthesis of programming language solutions given a natural language problem description.
1 code implementation • 17 Jun 2022 • Xin Liu, Jiayang Cheng, Yangqiu Song, Xin Jiang
We extend graph kernels and graph neural networks with dummy nodes and conduct experiments on graph classification and subgraph isomorphism matching tasks.
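As a purely illustrative sketch (not the authors' code), a dummy node can be attached to every original node of a graph so that kernels and GNN message passing gain a global pathway; the helper below uses networkx and a hypothetical label.

    import networkx as nx

    def add_dummy_node(g: nx.Graph, label="dummy"):
        # Connect one auxiliary node to every original node of the graph.
        g.add_node(label)
        g.add_edges_from((label, v) for v in list(g.nodes) if v != label)
        return g

    # Example: a 4-node path graph gains one dummy node and four extra edges.
    g = add_dummy_node(nx.path_graph(4))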
no code implementations • Findings (NAACL) 2022 • Yinpeng Guo, Liangyou Li, Xin Jiang, Qun Liu
However, labeled cross-lingual corpus is expensive or even inaccessible, especially in the fields where labels are private, such as diagnostic results of symptoms in medicine and user profiles in business.
1 code implementation • 24 May 2022 • Jinghui Xiao, Qun Liu, Xin Jiang, Yuanfeng Xiong, Haiteng Wu, Zhe Zhang
Pinyin-to-Character conversion (P2C) is the core task of the Input Method Engine (IME) in commercial input software for Asian languages such as Chinese, Japanese, and Thai.
no code implementations • 21 May 2022 • Abbas Ghaddar, Yimeng Wu, Sunyam Bagga, Ahmad Rashid, Khalil Bibi, Mehdi Rezagholizadeh, Chao Xing, Yasheng Wang, Duan Xinyu, Zhefeng Wang, Baoxing Huai, Xin Jiang, Qun Liu, Philippe Langlais
There is a growing body of work in recent years to develop pre-trained language models (PLMs) for the Arabic language.
no code implementations • 21 May 2022 • Fuzhao Xue, Jianghai Chen, Aixin Sun, Xiaozhe Ren, Zangwei Zheng, Xiaoxin He, Xin Jiang, Yang You
In this paper, we revisit these conventional configurations.
Ranked #88 on Image Classification on ImageNet
1 code implementation • ICLR 2022 • Yuxin Ren, Benyou Wang, Lifeng Shang, Xin Jiang, Qun Liu
A tiny version achieves $96.7\%$ of the performance of BERT-base with ${1}/{48}$ of the encoder parameters (i.e., fewer than 2M parameters excluding the embedding layer) and $2.7\times$ faster inference.
no code implementations • CVPR 2022 • Cheng Chen, Yudong Zhu, Zhenshan Tan, Qingrong Cheng, Xin Jiang, Qun Liu, Xiaodong Gu
In this paper, we propose a contrastive learning-based framework UTC to unify and facilitate both discriminative and generative tasks in visual dialog with a single model.
no code implementations • 12 Apr 2022 • Daxin Tan, Liqun Deng, Nianzu Zheng, Yu Ting Yeung, Xin Jiang, Xiao Chen, Tan Lee
This study proposes a fully automated system for speech correction and accent reduction.
1 code implementation • 31 Mar 2022 • Fei Mi, Yitong Li, Yulong Zeng, Jingyan Zhou, Yasheng Wang, Chuanfei Xu, Lifeng Shang, Xin Jiang, Shiqi Zhao, Qun Liu
We investigate different aspects of responses generated by PanGu-Bot, including response quality, knowledge, and safety.
no code implementations • Findings (ACL) 2022 • Shaobo Li, Xiaoguang Li, Lifeng Shang, Zhenhua Dong, Chengjie Sun, Bingquan Liu, Zhenzhou Ji, Xin Jiang, Qun Liu
We check the words that have three typical associations with the missing words: knowledge-dependent, positionally close, and highly co-occurring.
no code implementations • ACL 2022 • Chaofan Tao, Lu Hou, Wei zhang, Lifeng Shang, Xin Jiang, Qun Liu, Ping Luo, Ngai Wong
We find that previous quantization methods fail on generative tasks due to the homogeneous word embeddings caused by reduced capacity and the varied distribution of weights.
1 code implementation • ACL 2022 • Jiawei Zhou, Xiaoguang Li, Lifeng Shang, Lan Luo, Ke Zhan, Enrui Hu, Xinyu Zhang, Hao Jiang, Zhao Cao, Fan Yu, Xin Jiang, Qun Liu, Lei Chen
To alleviate the data scarcity problem in training question answering systems, recent works propose additional intermediate pre-training for dense passage retrieval (DPR).
no code implementations • Findings (ACL) 2022 • Wenliang Dai, Lu Hou, Lifeng Shang, Xin Jiang, Qun Liu, Pascale Fung
Furthermore, the original textual language understanding and generation ability of the PLM is maintained after VLKD, which makes our model versatile for both multimodal and unimodal tasks.
no code implementations • Findings (ACL) 2022 • Xin Wang, Yasheng Wang, Yao Wan, Fei Mi, Yitong Li, Pingyi Zhou, Jin Liu, Hao Wu, Xin Jiang, Qun Liu
Automatically generating compilable programs with (or without) natural language descriptions has always been a touchstone problem for computational linguistics and automated software engineering.
no code implementations • 8 Mar 2022 • Zhengkun Zhang, Wenya Guo, Xiaojun Meng, Yasheng Wang, Yadao Wang, Xin Jiang, Qun Liu, Zhenglu Yang
In this paper, we design a novel unified parameter-efficient transfer learning framework that works effectively on both pure language and V&L tasks.
no code implementations • Findings (ACL) 2022 • Dan Su, Xiaoguang Li, Jindi Zhang, Lifeng Shang, Xin Jiang, Qun Liu, Pascale Fung
Long-form question answering (LFQA) aims to generate a paragraph-length answer for a given question.
Ranked #1 on Question Answering on KILT: ELI5
1 code implementation • 16 Feb 2022 • Jingyan Zhou, Jiawen Deng, Fei Mi, Yitong Li, Yasheng Wang, Minlie Huang, Xin Jiang, Qun Liu, Helen Meng
The research of open-domain dialog systems has been greatly advanced by neural models trained on large-scale corpora; however, such corpora often introduce various safety problems (e.g., offensive language, biases, and toxic behaviors) that significantly hinder the deployment of dialog systems in practice.
1 code implementation • 14 Feb 2022 • Jiaxi Gu, Xiaojun Meng, Guansong Lu, Lu Hou, Minzhe Niu, Xiaodan Liang, Lewei Yao, Runhui Huang, Wei zhang, Xin Jiang, Chunjing Xu, Hang Xu
Experiments show that Wukong can serve as a promising Chinese pre-training dataset and benchmark for different cross-modal learning methods.
Ranked #6 on Zero-shot Image Retrieval on COCO-CN
no code implementations • COLING 2022 • Yihe Wang, Yitong Li, Yasheng Wang, Fei Mi, Pingyi Zhou, Xin Wang, Jin Liu, Xin Jiang, Qun Liu
Experiments on publicly available datasets demonstrate that our method helps models generate better responses, even though such training data are usually regarded as low-quality.
1 code implementation • ICLR 2022 • Wenyong Huang, Zhenhe Zhang, Yu Ting Yeung, Xin Jiang, Qun Liu
The student network is trained to output representation resembling that of the teacher.
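A minimal sketch of such a teacher-student objective, assuming a mean-squared distance between representations (the paper's exact loss may differ):

    import torch
    import torch.nn.functional as F

    def representation_distillation_loss(student_repr: torch.Tensor,
                                         teacher_repr: torch.Tensor) -> torch.Tensor:
        # The teacher is frozen (detach); the student is trained so that its
        # output representation resembles the teacher's.
        return F.mse_loss(student_repr, teacher_repr.detach())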
no code implementations • Findings (NAACL) 2022 • Mengjie Zhao, Fei Mi, Yasheng Wang, Minglei Li, Xin Jiang, Qun Liu, Hinrich Schütze
We propose LMTurk, a novel approach that treats few-shot learners as crowdsourcing workers.
1 code implementation • 8 Dec 2021 • Abbas Ghaddar, Yimeng Wu, Ahmad Rashid, Khalil Bibi, Mehdi Rezagholizadeh, Chao Xing, Yasheng Wang, Duan Xinyu, Zhefeng Wang, Baoxing Huai, Xin Jiang, Qun Liu, Philippe Langlais
Language-specific pre-trained models have proven to be more accurate than multilingual ones in monolingual evaluation settings, and Arabic is no exception.
no code implementations • 16 Nov 2021 • Nianzu Zheng, Liqun Deng, Wenyong Huang, Yu Ting Yeung, Baohua Xu, Yuanyuan Guo, Yasheng Wang, Xiao Chen, Xin Jiang, Qun Liu
We utilize conv-transformer structure to encode input speech in a streaming manner.
no code implementations • ICLR 2022 • Lewei Yao, Runhui Huang, Lu Hou, Guansong Lu, Minzhe Niu, Hang Xu, Xiaodan Liang, Zhenguo Li, Xin Jiang, Chunjing Xu
In this paper, we introduce a large-scale Fine-grained Interactive Language-Image Pre-training (FILIP) to achieve finer-level alignment through a cross-modal late interaction mechanism, which uses a token-wise maximum similarity between visual and textual tokens to guide the contrastive objective.
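A hedged sketch of a token-wise late-interaction score of this kind (shapes and the averaging scheme are assumptions, not the released FILIP implementation):

    import torch

    def late_interaction_similarity(img_tokens: torch.Tensor, txt_tokens: torch.Tensor):
        # img_tokens: [n_img, d], txt_tokens: [n_txt, d], both L2-normalized.
        sim = img_tokens @ txt_tokens.T        # pairwise cosine similarities
        i2t = sim.max(dim=1).values.mean()     # each image token picks its best text token
        t2i = sim.max(dim=0).values.mean()     # each text token picks its best image token
        return 0.5 * (i2t + t2i)               # symmetric score fed to the contrastive objective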
no code implementations • 26 Oct 2021 • Jin Zhang, Mingyang Zhao, Xin Jiang, Dong-Ming Yan
The proposed method assumes each data point is generated by a Laplacian Mixture Model (LMM), where its centers are determined by the corresponding points in other point sets.
no code implementations • dialdoc (ACL) 2022 • Xinyan Zhao, Bin He, Yasheng Wang, Yitong Li, Fei Mi, Yajiao Liu, Xin Jiang, Qun Liu, Huanhuan Chen
With the advances in deep learning, tremendous progress has been made with chit-chat dialogue systems and task-oriented dialogue systems.
no code implementations • ACL 2022 • Cheng Chen, Yichun Yin, Lifeng Shang, Xin Jiang, Yujia Qin, Fengyu Wang, Zhi Wang, Xiao Chen, Zhiyuan Liu, Qun Liu
However, large language model pre-training costs intensive computational resources and most of the models are trained from scratch without reusing the existing pre-trained models, which is wasteful.
no code implementations • 30 Sep 2021 • Haoli Bai, Lu Hou, Lifeng Shang, Xin Jiang, Irwin King, Michael R. Lyu
Experiments on GLUE and SQuAD benchmarks show that our proposed PTQ solution not only performs close to QAT, but also enjoys significant reductions in training time, memory overhead, and data consumption.
1 code implementation • EMNLP 2021 • Baojun Wang, Zhao Zhang, Kun Xu, Guang-Yuan Hao, Yuyang Zhang, Lifeng Shang, Linlin Li, Xiao Chen, Xin Jiang, Qun Liu
Incorporating lexical knowledge into deep learning models has been proved to be very effective for sequence labeling tasks.
no code implementations • EMNLP 2021 • Chenyang Lyu, Lifeng Shang, Yvette Graham, Jennifer Foster, Xin Jiang, Qun Liu
Template-based QG uses linguistically-informed heuristics to transform declarative sentences into interrogatives, whereas supervised QG uses existing Question Answering (QA) datasets to train a system to generate a question given a passage and an answer.
no code implementations • 13 Sep 2021 • Zhengkun Zhang, Xiaojun Meng, Yasheng Wang, Xin Jiang, Qun Liu, Zhenglu Yang
Specifically, we adopt knowledge distillation from a vision-language pretrained model to improve image selection, which avoids any requirement on the existence and quality of image captions.
no code implementations • 10 Sep 2021 • Fei Mi, Yitong Li, Yasheng Wang, Xin Jiang, Qun Liu
As labeling cost for different modules in task-oriented dialog (ToD) systems is high, a major challenge in practice is to learn different tasks with the least amount of labeled data.
no code implementations • 7 Sep 2021 • Shaobo Li, Qun Liu, Xin Jiang, Yichun Yin, Chengjie Sun, Bingquan Liu, Zhenzhou Ji, Lifeng Shang
Human-designed rules are widely used to build industry applications.
no code implementations • Findings (EMNLP) 2021 • Jianhao Shen, Yichun Yin, Lin Li, Lifeng Shang, Xin Jiang, Ming Zhang, Qun Liu
Math word problem (MWP) is a challenging and critical task in natural language processing.
Ranked #1 on Math Word Problem Solving on Math23K
no code implementations • 7 Sep 2021 • Zhihua Jin, Xin Jiang, Xingbo Wang, Qun Liu, Yong Wang, Xiaozhe Ren, Huamin Qu
However, those models do not consider the numerical properties of numbers and cannot perform robustly on numerical reasoning tasks (e.g., math word problems and measurement estimation).
no code implementations • 10 Aug 2021 • Xin Wang, Yasheng Wang, Fei Mi, Pingyi Zhou, Yao Wan, Xiao Liu, Li Li, Hao Wu, Jin Liu, Xin Jiang
Code representation learning, which aims to encode the semantics of source code into distributed vectors, plays an important role in recent deep-learning-based models for code intelligence.
no code implementations • ACL 2021 • Zhiqi Huang, Lu Hou, Lifeng Shang, Xin Jiang, Xiao Chen, Qun Liu
Transformer-based pre-trained language models like BERT, though powerful in many tasks, are expensive in both memory and computation, due to their large number of parameters.
1 code implementation • ACL 2021 • Yichun Yin, Cheng Chen, Lifeng Shang, Xin Jiang, Xiao Chen, Qun Liu
Specifically, we carefully design the techniques of one-shot learning and the search space to provide an adaptive and efficient development way of tiny PLMs for various latency constraints.
no code implementations • 15 Jul 2021 • Jiahui Gao, Hang Xu, Han Shi, Xiaozhe Ren, Philip L. H. Yu, Xiaodan Liang, Xin Jiang, Zhenguo Li
Transformer-based pre-trained language models like BERT and its variants have recently achieved promising performance in various natural language processing (NLP) tasks.
Ranked #10 on Semantic Textual Similarity on MRPC
no code implementations • 4 Jul 2021 • Daxin Tan, Liqun Deng, Yu Ting Yeung, Xin Jiang, Xiao Chen, Tan Lee
This paper presents the design, implementation and evaluation of a speech editing system, named EditSpeech, which allows a user to perform deletion, insertion and replacement of words in a given speech utterance, without causing audible degradation in speech quality and naturalness.
no code implementations • 9 Jun 2021 • Yinpeng Guo, Liangyou Li, Xin Jiang, Qun Liu
Recently, pre-training multilingual language models has shown great potential in learning multilingual representation, a crucial topic of natural language processing.
1 code implementation • ACL 2021 • Xin Liu, Jiefu Ou, Yangqiu Song, Xin Jiang
Discourse relations among arguments reveal logical structures of a debate conversation.
no code implementations • 24 May 2021 • Mingyang Yi, Lu Hou, Jiacheng Sun, Lifeng Shang, Xin Jiang, Qun Liu, Zhi-Ming Ma
In this paper, after defining OOD generalization via Wasserstein distance, we theoretically show that a model robust to input perturbation generalizes well on OOD data.
no code implementations • 12 May 2021 • Hexiong Li, Xin Jiang, Guanying Huo, Cheng Su, Bolun Wang, Yifei Hu, Zhiming Zheng
With the consideration of kinematic limitations and machining efficiency, a time-optimal feed rate adjustment algorithm is proposed to further adjust the feed rate values at breaking points.
2 code implementations • 26 Apr 2021 • Wei Zeng, Xiaozhe Ren, Teng Su, Hui Wang, Yi Liao, Zhiwei Wang, Xin Jiang, ZhenZhang Yang, Kaisheng Wang, Xiaoda Zhang, Chen Li, Ziyan Gong, Yifan Yao, Xinjing Huang, Jun Wang, Jianfeng Yu, Qi Guo, Yue Yu, Yan Zhang, Jin Wang, Hengtao Tao, Dasen Yan, Zexuan Yi, Fang Peng, Fangqing Jiang, Han Zhang, Lingfeng Deng, Yehong Zhang, Zhe Lin, Chao Zhang, Shaojie Zhang, Mingyue Guo, Shanzhi Gu, Gaojun Fan, YaoWei Wang, Xuefeng Jin, Qun Liu, Yonghong Tian
To enhance the generalization ability of PanGu-$\alpha$, we collect 1.1TB of high-quality Chinese data from a wide range of domains to pretrain the model.
Ranked #1 on Reading Comprehension (Zero-Shot) on CMRC 2018; also ranked on Cloze (multi-choices) (Few-Shot), Cloze (multi-choices) (One-Shot), and 18 more benchmarks
no code implementations • 24 Apr 2021 • Cheng Chen, Yichun Yin, Lifeng Shang, Zhi Wang, Xin Jiang, Xiao Chen, Qun Liu
Task-agnostic knowledge distillation, a teacher-student framework, has been proved effective for BERT compression.
no code implementations • 26 Mar 2021 • Yifei Hu, Xin Jiang, Guanying Huo, Cheng Su, Bolun Wang, Hexiong Li, Zhiming Zheng
The algorithm consists of three modules: bidirectional scanning module, velocity scheduling module and round-off error elimination module.
no code implementations • 25 Mar 2021 • Tong Cui, Jinghui Xiao, Liangyou Li, Xin Jiang, Qun Liu
Speech-enabled systems typically first convert audio to text through an automatic speech recognition (ASR) model and then feed the text to downstream natural language processing (NLP) modules.
Automatic Speech Recognition (ASR) (+5 more)
no code implementations • ICLR 2021 • Mingyang Yi, Lu Hou, Lifeng Shang, Xin Jiang, Qun Liu, Zhi-Ming Ma
Inspired by adversarial training, we minimize this maximal expected loss (MMEL) and obtain a simple and interpretable closed-form solution: more attention should be paid to augmented samples with large loss values (i.e., harder examples).
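A minimal sketch of that reweighting idea, assuming a softmax over per-augmentation losses (the paper's closed form may differ in detail):

    import torch

    def weighted_augmented_loss(per_sample_losses: torch.Tensor, temperature: float = 1.0):
        # Harder augmented samples (larger loss) receive larger weights; the weights
        # are detached so they act as fixed coefficients rather than gradient paths.
        weights = torch.softmax(per_sample_losses.detach() / temperature, dim=0)
        return (weights * per_sample_losses).sum()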
no code implementations • 11 Mar 2021 • Xiaoqi Jiao, Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Linlin Li, Fang Wang, Qun Liu
The multilingual pre-trained language models (e.g., mBERT, XLM and XLM-R) have shown impressive performance on cross-lingual natural language understanding tasks.
1 code implementation • 23 Jan 2021 • Junqiu Wei, Qun Liu, Yinpeng Guo, Xin Jiang
Pre-trained language models have achieved great success in various natural language understanding (NLU) tasks due to their capacity to capture deep contextualized information in text by pre-training on large-scale corpora.
1 code implementation • ICML Workshop AML 2021 • Zhengyan Zhang, Guangxuan Xiao, Yongwei Li, Tian Lv, Fanchao Qi, Zhiyuan Liu, Yasheng Wang, Xin Jiang, Maosong Sun
In this work, we demonstrate the universal vulnerability of PTMs, where fine-tuned PTMs can be easily controlled by backdoor attacks in arbitrary downstream tasks.
no code implementations • ICLR 2021 • Benyou Wang, Lifeng Shang, Christina Lioma, Xin Jiang, Hao Yang, Qun Liu, Jakob Grue Simonsen
Various Position Embeddings (PEs) have been proposed in Transformer-based architectures (e.g., BERT) to model word order.
no code implementations • 31 Dec 2020 • Shaobo Li, Xiaoguang Li, Lifeng Shang, Xin Jiang, Qun Liu, Chengjie Sun, Zhenzhou Ji, Bingquan Liu
In this paper, we propose a new retrieval target, hop, to collect the hidden reasoning evidence from Wikipedia for complex question answering.
Ranked #7 on Question Answering on HotpotQA
1 code implementation • ACL 2021 • Haoli Bai, Wei zhang, Lu Hou, Lifeng Shang, Jing Jin, Xin Jiang, Qun Liu, Michael Lyu, Irwin King
In this paper, we propose BinaryBERT, which pushes BERT quantization to the limit by weight binarization.
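As a generic illustration of weight binarization (not necessarily BinaryBERT's exact scheme), each real-valued weight can be replaced by its sign scaled by the tensor's mean absolute value:

    import torch

    def binarize(w: torch.Tensor) -> torch.Tensor:
        # 1-bit weights: sign(w) scaled so the binarized tensor roughly
        # preserves the magnitude of the full-precision weights.
        alpha = w.abs().mean()
        return alpha * torch.sign(w)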
no code implementations • 12 Dec 2020 • Jiarong Xu, Yizhou Sun, Xin Jiang, Yanhao Wang, Yang Yang, Chunping Wang, Jiangang Lu
To bridge the gap between theoretical graph attacks and real-world scenarios, in this work, we propose a novel and more realistic setting: strict black-box graph attack, in which the attacker has no knowledge about the victim model at all and is not allowed to send any queries.
no code implementations • 11 Dec 2020 • Xiaoqi Jiao, Huating Chang, Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Linlin Li, Fang Wang, Qun Liu
Comprehensive experiments on the evaluation benchmarks demonstrate that 1) layer mapping strategy has a significant effect on task-agnostic BERT distillation and different layer mappings can result in quite different performances; 2) the optimal layer mapping strategy from the proposed search process consistently outperforms the other heuristic ones; 3) with the optimal layer mapping, our student model achieves state-of-the-art performance on the GLUE tasks.
no code implementations • 7 Dec 2020 • Bin He, Xin Jiang, Jinghui Xiao, Qun Liu
Recent studies on pre-trained language models have demonstrated their ability to capture factual knowledge and applications in knowledge-aware downstream tasks.
no code implementations • 7 Dec 2020 • Bin He, Di Zhou, Jing Xie, Jinghui Xiao, Xin Jiang, Qun Liu
Entities may have complex interactions in a knowledge graph (KG), such as multi-step relationships, which can be viewed as graph contextual information of the entities.
no code implementations • 4 Dec 2020 • Jiarong Xu, Yang Yang, Junru Chen, Chunping Wang, Xin Jiang, Jiangang Lu, Yizhou Sun
Additionally, we explore a provable connection between the robustness of the unsupervised graph encoder and that of models on downstream tasks.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Bin He, Di Zhou, Jinghui Xiao, Xin Jiang, Qun Liu, Nicholas Jing Yuan, Tong Xu
Complex node interactions are common in knowledge graphs (KGs), and these interactions can be considered as contextualized knowledge existing in the topological structure of KGs.
3 code implementations • Findings of the Association for Computational Linguistics 2020 • Yudong Zhu, Di Zhou, Jinghui Xiao, Xin Jiang, Xiao Chen, Qun Liu
Natural language data exhibit tree-like hierarchical structures such as the hypernym-hyponym relations in WordNet.
1 code implementation • 28 Sep 2020 • Bolun Wang, Zachary Ferguson, Teseo Schneider, Xin Jiang, Marco Attene, Daniele Panozzo
We introduce a large scale benchmark for continuous collision detection (CCD) algorithms, composed of queries manually constructed to highlight challenging degenerate cases and automatically generated using existing simulators to cover common cases.
Graphics
2 code implementations • EMNLP 2020 • Wei Zhang, Lu Hou, Yichun Yin, Lifeng Shang, Xiao Chen, Xin Jiang, Qun Liu
Transformer-based pre-training models like BERT have achieved remarkable performance in many natural language processing tasks. However, these models are expensive in both computation and memory, hindering their deployment on resource-constrained devices.
no code implementations • NeurIPS 2020 • Jingjing Li, Zichao Li, Lili Mou, Xin Jiang, Michael R. Lyu, Irwin King
In this work, we present TGLS, a novel framework for unsupervised Text Generation by Learning from Search.
no code implementations • 8 May 2020 • Meng Zhang, Xin Jiang, Yang Liu, Qun Liu
In this work, we put machine translation in a cross-lingual pipeline and introduce downstream tasks to define task-specific acceptability of machine translations.
no code implementations • EMNLP 2020 • Yun Chen, Yang Liu, Guanhua Chen, Xin Jiang, Qun Liu
Shift-Att is an interpretation method that induces alignments from the attention weights of Transformer and does not require parameter update or architecture change.
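A simplified sketch of inducing word alignments from decoder cross-attention (the shift that gives Shift-Att its name is noted but omitted here):

    import torch

    def alignments_from_attention(cross_attn: torch.Tensor) -> torch.Tensor:
        # cross_attn: [tgt_len, src_len] attention weights from a chosen decoder layer.
        # Each target position is aligned to the source position it attends to most;
        # Shift-Att additionally shifts the target index because the decoder predicts
        # token t while reading token t-1.
        return cross_attn.argmax(dim=-1)       # [tgt_len] aligned source indices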
1 code implementation • 27 Apr 2020 • Xin Liu, Jiefu Ou, Yangqiu Song, Xin Jiang
Implicit discourse relation classification is one of the most difficult parts in shallow discourse parsing as the relation prediction without explicit connectives requires the language understanding at both the text span level and the sentence level.
3 code implementations • ACL 2020 • Yi Liao, Xin Jiang, Qun Liu
Masked language model and autoregressive language model are two types of language models.
3 code implementations • NeurIPS 2020 • Lu Hou, Zhiqi Huang, Lifeng Shang, Xin Jiang, Xiao Chen, Qun Liu
The pre-trained language models like BERT, though powerful in many natural language processing tasks, are expensive in both computation and memory.
1 code implementation • 25 Dec 2019 • Xin Liu, Haojie Pan, Mutian He, Yangqiu Song, Xin Jiang, Lifeng Shang
In this paper, we study a new graph learning problem: learning to count subgraph isomorphisms.
no code implementations • 30 Nov 2019 • Bin He, Di Zhou, Jinghui Xiao, Xin Jiang, Qun Liu, Nicholas Jing Yuan, Tong Xu
Complex node interactions are common in knowledge graphs, and these interactions also contain rich knowledge information.
no code implementations • 12 Nov 2019 • Weiguo Zhou, Xin Jiang, Chen Chen, Sijia Mei, Yun-hui Liu
In this paper, we propose a method that takes advantage of human hand morphological topology (HMT) structure to improve the pose estimation performance.
Robotics, Human-Computer Interaction
no code implementations • 9 Nov 2019 • Yinpeng Guo, Yi Liao, Xin Jiang, Qing Zhang, Yibo Zhang, Qun Liu
Leveraging multilingual parallel texts to automatically generate paraphrases has drawn much attention, as the size of high-quality paraphrase corpora is limited.
no code implementations • Asian Chapter of the Association for Computational Linguistics 2020 • Yun Chen, Liangyou Li, Xin Jiang, Xiao Chen, Qun Liu
Despite the success of neural machine translation (NMT), simultaneous neural machine translation (SNMT), the task of translating in real time before a full sentence has been observed, remains challenging due to the syntactic structure difference and simultaneity requirements.
no code implementations • 8 Nov 2019 • Liangyou Li, Xin Jiang, Qun Liu
Previous work on document-level NMT usually focuses on limited contexts because of degraded performance on larger contexts.
no code implementations • IJCNLP 2019 • Lihua Qian, Lin Qiu, Wei-Nan Zhang, Xin Jiang, Yong Yu
Paraphrasing plays an important role in various natural language processing (NLP) tasks, such as question answering, information retrieval and sentence simplification.
3 code implementations • 20 Oct 2019 • Mengqi Zhang, Shu Wu, Meng Gao, Xin Jiang, Ke Xu, Liang Wang
The other is Dot-Product Attention mechanism, which draws on the Transformer net to explicitly model the effect of historical sessions on the current session.
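An illustrative sketch of such a scaled dot-product attention, with the current session embedding as the query and historical session embeddings as keys and values (names and shapes are assumptions):

    import torch

    def session_attention(current: torch.Tensor, history: torch.Tensor) -> torch.Tensor:
        # current: [d], history: [n_sessions, d]
        scores = history @ current / history.shape[-1] ** 0.5   # scaled dot products
        weights = torch.softmax(scores, dim=0)                   # importance of each past session
        return weights @ history                                 # context vector for the current session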
6 code implementations • Findings of the Association for Computational Linguistics 2020 • Xiaoqi Jiao, Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Linlin Li, Fang Wang, Qun Liu
To accelerate inference and reduce model size while maintaining accuracy, we first propose a novel Transformer distillation method that is specially designed for knowledge distillation (KD) of the Transformer-based models.
Ranked #1 on Natural Language Inference on MultiNLI Dev
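A hedged sketch of one ingredient of such Transformer distillation, matching a student attention map to the teacher's (TinyBERT also distills embeddings, hidden states, and predictions):

    import torch
    import torch.nn.functional as F

    def attention_distillation_loss(student_attn: torch.Tensor,
                                    teacher_attn: torch.Tensor) -> torch.Tensor:
        # Both tensors: [heads, seq_len, seq_len] attention maps of a mapped layer pair.
        return F.mse_loss(student_attn, teacher_attn.detach())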
no code implementations • 19 Sep 2019 • Jie Zhao, Xin Jiang, Xiaoman Wang, Shengfan Wang, Yun-hui Liu
The proposal in this paper is verified by a simulated assembly in which a robot arm completes the process of picking parts from a bin followed by a peg-in-hole assembly.
8 code implementations • 31 Aug 2019 • Junqiu Wei, Xiaozhe Ren, Xiaoguang Li, Wenyong Huang, Yi Liao, Yasheng Wang, Jiashu Lin, Xin Jiang, Xiao Chen, Qun Liu
Pre-trained language models have achieved great success in various natural language understanding (NLU) tasks due to their capacity to capture deep contextualized information in text by pre-training on large-scale corpora.
no code implementations • 21 Aug 2019 • Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Qun Liu
Neural dialog state trackers are generally limited due to the lack of quantity and diversity of annotated training data.
1 code implementation • 29 Jun 2019 • Yi Liao, Yasheng Wang, Qun Liu, Xin Jiang
We present a simple yet effective method for generating high quality classical Chinese poetry with Generative Pre-trained Language Model (GPT).
no code implementations • ACL 2019 • Zichao Li, Xin Jiang, Lifeng Shang, Qun Liu
Paraphrasing exists at different granularity levels, such as lexical level, phrasal level and sentential level.
1 code implementation • 25 May 2019 • Yaoming Zhu, Juncheng Wan, Zhiming Zhou, Liheng Chen, Lin Qiu, Wei-Nan Zhang, Xin Jiang, Yong Yu
A knowledge base is one of the main forms of representing information in a structured way.
1 code implementation • ACL 2019 • Zhengyan Zhang, Xu Han, Zhiyuan Liu, Xin Jiang, Maosong Sun, Qun Liu
Neural language representation models such as BERT pre-trained on large-scale corpora can well capture rich semantic patterns from plain text, and be fine-tuned to consistently improve the performance of various NLP tasks.
Ranked #1 on Relation Extraction on FewRel
no code implementations • 24 Feb 2019 • Shengfan Wang, Xin Jiang, Jie Zhao, Xiaoman Wang, Weiguo Zhou, Yun-hui Liu
This paper presents an efficient neural network model to generate robotic grasps from high-resolution images.
Robotics
1 code implementation • 26 Dec 2018 • Yangbin Chen, Tom Ko, Lifeng Shang, Xiao Chen, Xin Jiang, Qing Li
In this paper, we investigate the feasibility of applying few-shot learning algorithms to a speech task.
1 code implementation • ICLR 2020 • Nabiha Asghar, Lili Mou, Kira A. Selby, Kevin D. Pantasdo, Pascal Poupart, Xin Jiang
The memory bank provides a natural way of IDA: when adapting our model to a new domain, we progressively add new slots to the memory bank, which increases the number of parameters, and thus the model capacity.
no code implementations • COLING 2018 • Xin Jiang, Hai Ye, Zhunchen Luo, WenHan Chao, Wenjia Ma
This paper proposes a neural-based system to address the essential interpretability problem in text classification, especially in the charge prediction task.
no code implementations • COLING 2018 • Wenjia Ma, WenHan Chao, Zhunchen Luo, Xin Jiang
For controversial topics, collecting argumentation-containing tweets which tend to be more convincing will help researchers analyze public opinions.
1 code implementation • NAACL 2018 • Hai Ye, Xin Jiang, Zhunchen Luo, WenHan Chao
In this paper, we propose to study the problem of COURT VIEW GENeration from the fact description in a criminal case.
no code implementations • EMNLP 2018 • Zichao Li, Xin Jiang, Lifeng Shang, Hang Li
The generator, built as a sequence-to-sequence learning model, can produce paraphrases given a sentence.
no code implementations • 12 Sep 2017 • Nabiha Asghar, Pascal Poupart, Jesse Hoey, Xin Jiang, Lili Mou
Existing neural conversational models process natural language primarily on a lexico-syntactic level, thereby ignoring one of the most crucial components of human-to-human dialogue: its affective content.
no code implementations • SEMEVAL 2017 • Nabiha Asghar, Pascal Poupart, Xin Jiang, Hang Li
We propose an online, end-to-end, neural generative conversational model for open-domain dialogue.
no code implementations • 12 Sep 2016 • Xin Jiang, Rebecca Willett
At the heart of this proposed approach is an online anomaly detection method based on dynamic, low-rank Gaussian mixture models.
1 code implementation • WS 2016 • Jun Yin, Xin Jiang, Zhengdong Lu, Lifeng Shang, Hang Li, Xiaoming Li
Empirical study shows the proposed model can effectively deal with the variations of questions and answers, and generate correct and natural answers by referring to facts in the knowledge base.