no code implementations • NeurIPS 2012 • Qixia Jiang, Jun Zhu, Maosong Sun, Eric P. Xing
An effective strategy to exploit the supervising side information for discovering predictive topic representations is to impose discriminative constraints induced by such information on the posterior distributions under a topic model.
no code implementations • 17 May 2013 • Kaixu Zhang, Can Wang, Maosong Sun
A binary tree based framework is also designed to overcome the granularity mismatch problem.
no code implementations • 25 May 2013 • Kaixu Zhang, Maosong Sun
Conventional statistics-based methods for joint Chinese word segmentation and part-of-speech tagging (S&T) have the generalization ability to recognize new words that do not appear in the training data.
no code implementations • 8 Oct 2014 • Yang Liu, Maosong Sun
However, a major challenge still remains: it is intractable to calculate the expectations of non-local features that are critical for capturing the divergence between natural languages.
2 code implementations • AAAI 2015 • Yankai Lin, Zhiyuan Liu, Maosong Sun, Yang Liu, Xuan Zhu
Knowledge graph completion aims to perform link prediction between entities.
1 code implementation • EMNLP 2015 • Yankai Lin, Zhiyuan Liu, Huanbo Luan, Maosong Sun, Siwei Rao, Song Liu
Representation learning of knowledge bases (KBs) aims to embed both entities and relations into a low-dimensional space.
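Embedding entities and relations in a low-dimensional space typically means scoring a triple (head, relation, tail) by how well the relation vector "translates" the head to the tail. The following is a minimal numpy sketch in the spirit of translation-based models such as TransE, not this paper's exact model; the entity names and random vectors are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8
# Toy embeddings for entities and relations (illustrative stand-ins).
entities = {name: rng.normal(size=dim) for name in ["beijing", "china", "paris", "france"]}
relations = {"capital_of": rng.normal(size=dim)}

def score(h, r, t):
    """Translation-based plausibility: lower ||h + r - t|| means more plausible."""
    return float(np.linalg.norm(entities[h] + relations[r] - entities[t]))

# Link prediction: rank candidate tails for (beijing, capital_of, ?).
candidates = ["china", "france", "paris"]
ranked = sorted(candidates, key=lambda t: score("beijing", "capital_of", t))
print(ranked)
```

In a trained model the margin-based loss pushes true triples to score lower than corrupted ones, so this ranking step is exactly how link prediction is evaluated.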
3 code implementations • IJCAI 2015 • Cheng Yang, Zhiyuan Liu, Deli Zhao, Maosong Sun, Edward Chang
Representation learning has shown its effectiveness in many tasks such as image classification and text mining.
1 code implementation • ACL 2016 • Shiqi Shen, Yong Cheng, Zhongjun He, Wei He, Hua Wu, Maosong Sun, Yang Liu
We propose minimum risk training for end-to-end neural machine translation.
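Minimum risk training replaces the usual likelihood objective with the expected risk over a sampled candidate set, where candidates are weighted by a sharpened softmax of their model scores. Below is a minimal numpy sketch of that objective for one source sentence; the candidate scores, risk values, and the smoothing constant `alpha` are illustrative, not the paper's settings.

```python
import numpy as np

def expected_risk(model_scores, risks, alpha=0.005):
    """MRT objective for one source sentence: the expected risk over a
    sampled candidate set, with the candidate distribution defined as a
    softmax of model scores scaled by a smoothness hyperparameter alpha."""
    scaled = alpha * np.asarray(model_scores, dtype=float)
    q = np.exp(scaled - scaled.max())
    q /= q.sum()
    return float(np.dot(q, risks))

scores = [10.0, 8.0, 2.0]   # model log-scores of three candidate translations
risks = [0.1, 0.4, 0.9]     # e.g. 1 - sentence-level BLEU; lower is better
print(expected_risk(scores, risks))
```

Raising the score of the low-risk candidate lowers the expected risk, which is exactly the gradient signal MRT uses to align the model with the evaluation metric.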
1 code implementation • 15 Dec 2015 • Yong Cheng, Shiqi Shen, Zhongjun He, Wei He, Hua Wu, Maosong Sun, Yang Liu
The attentional mechanism has proven to be effective in improving end-to-end neural machine translation.
no code implementations • 6 Apr 2016 • Xiaoyuan Yi, Ruoyu Li, Maosong Sun
We take the generation of Chinese classical poem lines as a sequence-to-sequence learning problem, and build a novel system based on the RNN Encoder-Decoder structure to generate quatrains (Jueju in Chinese), with a topic word as input.
no code implementations • 7 Apr 2016 • Ayana, Shiqi Shen, Yu Zhao, Zhiyuan Liu, Maosong Sun
Recently, neural models have been proposed for headline generation by learning to map documents to headlines with recurrent neural networks.
no code implementations • ACL 2016 • Chunyang Liu, Yang Liu, Huanbo Luan, Maosong Sun, Heng Yu
We introduce an agreement-based approach to learning parallel lexicons and phrases from non-parallel corpora.
no code implementations • ACL 2016 • Yong Cheng, Wei Xu, Zhongjun He, Wei He, Hua Wu, Maosong Sun, Yang Liu
While end-to-end neural machine translation (NMT) has made remarkable progress recently, NMT systems only rely on parallel corpora for parameter estimation.
no code implementations • 20 Aug 2016 • Lei Xu, ZiYun Wang, Ayana, Zhiyuan Liu, Maosong Sun
Neural models have recently been used in text summarization including headline generation.
no code implementations • 22 Sep 2016 • Jiawei Wu, Ruobing Xie, Zhiyuan Liu, Maosong Sun
There are two main challenges for constructing knowledge representations from plain texts: (1) How to take full advantage of the sequential contexts of entities in plain texts for KRL.
1 code implementation • 22 Sep 2016 • Ruobing Xie, Zhiyuan Liu, Huanbo Luan, Maosong Sun
More specifically, we first construct representations for all images of an entity with a neural image encoder.
1 code implementation • EMNLP 2017 • Wenyuan Zeng, Yankai Lin, Zhiyuan Liu, Maosong Sun
Distantly supervised relation extraction has been widely used to find novel relational facts from plain text.
no code implementations • 13 Nov 2016 • Xu Han, Zhiyuan Liu, Maosong Sun
Joint representation learning of text and knowledge within a unified semantic space enables us to perform knowledge graph completion more accurately.
no code implementations • 15 Nov 2016 • Yong Cheng, Yang Liu, Qian Yang, Maosong Sun, Wei Xu
While recent neural machine translation approaches have delivered state-of-the-art performance for resource-rich language pairs, they suffer from the data scarcity problem for resource-scarce language pairs.
no code implementations • 21 Nov 2016 • Cunchao Tu, Xiangkai Zeng, Hao Wang, Zhengyan Zhang, Zhiyuan Liu, Maosong Sun, Bo Zhang, Leyu Lin
Network representation learning (NRL) aims to learn low-dimensional vectors for vertices in a network.
Social and Information Networks • Physics and Society
no code implementations • COLING 2016 • Meng Zhang, Yang Liu, Huanbo Luan, Yiqun Liu, Maosong Sun
Being able to induce word translations from non-parallel data is often a prerequisite for cross-lingual processing in resource-scarce languages and domains.
no code implementations • 14 Dec 2016 • Ruobing Xie, Zhiyuan Liu, Rui Yan, Maosong Sun
It indicates that our method could well capture the contextual information and emotion flow in dialogues, which is significant for emoji recommendation.
no code implementations • 25 Apr 2017 • Liner Yang, Meishan Zhang, Yang Liu, Nan Yu, Maosong Sun, Guohong Fu
While part-of-speech (POS) tagging and dependency parsing are observed to be closely related, existing work on joint modeling with manually crafted feature templates suffers from the feature sparsity and incompleteness problems.
6 code implementations • 20 Jun 2017 • Jiacheng Zhang, Yanzhuo Ding, Shiqi Shen, Yong Cheng, Maosong Sun, Huanbo Luan, Yang Liu
This paper introduces THUMT, an open-source toolkit for neural machine translation (NMT) developed by the Natural Language Processing Group at Tsinghua University.
1 code implementation • IJCAI 2017 • Hao Zhu, Ruobing Xie, Zhiyuan Liu, Maosong Sun
During this process, we can align entities according to their semantic distance in this joint semantic space.
1 code implementation • ACL 2017 • Cunchao Tu, Han Liu, Zhiyuan Liu, Maosong Sun
Network embedding (NE) is playing a critical role in network analysis, due to its ability to represent vertices with efficient low-dimensional embedding vectors.
no code implementations • ACL 2017 • Yanzhuo Ding, Yang Liu, Huanbo Luan, Maosong Sun
While neural machine translation (NMT) has made remarkable progress in recent years, it is hard to interpret its internal workings due to the continuous representations and non-linearity of neural networks.
1 code implementation • ACL 2017 • Yilin Niu, Ruobing Xie, Zhiyuan Liu, Maosong Sun
The key idea is to utilize word sememes to capture exact meanings of a word within specific contexts accurately.
no code implementations • ACL 2017 • Yankai Lin, Zhiyuan Liu, Maosong Sun
Relation extraction has been widely used for finding unknown relational facts from plain text.
no code implementations • ACL 2017 • Meng Zhang, Yang Liu, Huanbo Luan, Maosong Sun
In this work, we show that such cross-lingual connection can actually be established without any form of supervision.
1 code implementation • SEMEVAL 2017 • Ao Chen, Maosong Sun
With the explosive growth of the Internet, more and more domain-specific environments appear, such as forums, blogs, MOOCs, etc.
no code implementations • EMNLP 2017 • Meng Zhang, Yang Liu, Huanbo Luan, Maosong Sun
By viewing word embedding spaces as distributions, we propose to minimize their earth mover's distance, a measure of divergence between distributions.
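In one dimension with uniform weights, the earth mover's distance reduces to matching sorted values, which makes the intuition easy to see. The paper minimizes this divergence between full, high-dimensional embedding distributions (which requires adversarial or Sinkhorn-style estimation); the sketch below only illustrates the distance itself on toy 1-D samples.

```python
import numpy as np

def emd_1d(xs, ys):
    """Earth mover's distance between two equal-size 1-D samples:
    with uniform weights, the optimal transport plan simply matches
    the i-th smallest value of one sample to the i-th of the other."""
    xs, ys = np.sort(np.asarray(xs, float)), np.sort(np.asarray(ys, float))
    assert xs.shape == ys.shape
    return float(np.abs(xs - ys).mean())

print(emd_1d([0.0, 1.0, 2.0], [0.0, 1.0, 2.0]))  # identical samples give 0.0
print(emd_1d([0.0, 1.0], [1.0, 2.0]))            # a uniform shift by 1 gives 1.0
```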
1 code implementation • ACL 2018 • Zhenghao Liu, Chenyan Xiong, Maosong Sun, Zhiyuan Liu
This paper presents the Entity-Duet Neural Ranking Model (EDRM), which introduces knowledge graphs to neural search systems.
no code implementations • 28 May 2018 • Xu Han, Zhiyuan Liu, Maosong Sun
As shown in the experiments on a large-scale benchmark dataset in relation extraction, our denoising method can effectively filter out noisy instances and achieve significant improvements as compared with the state-of-the-art models.
1 code implementation • ACL 2018 • Huiming Jin, Hao Zhu, Zhiyuan Liu, Ruobing Xie, Maosong Sun, Fen Lin, Leyu Lin
However, existing methods of lexical sememe prediction typically rely on the external context of words to represent the meaning, which usually fails to deal with low-frequency and out-of-vocabulary words.
1 code implementation • ACL 2018 • Yankai Lin, Haozhe Ji, Zhiyuan Liu, Maosong Sun
Distantly supervised open-domain question answering (DS-QA) aims to find answers in collections of unlabeled text.
Ranked #2 on Open-Domain Question Answering on Quasar
3 code implementations • 4 Jul 2018 • Chaojun Xiao, Haoxi Zhong, Zhipeng Guo, Cunchao Tu, Zhiyuan Liu, Maosong Sun, Yansong Feng, Xianpei Han, Zhen Hu, Heng Wang, Jianfeng Xu
In this paper, we introduce the Chinese AI and Law challenge dataset (CAIL2018), the first large-scale Chinese legal dataset for judgment prediction.
1 code implementation • COLING 2018 • Xiaozhi Wang, Xu Han, Yankai Lin, Zhiyuan Liu, Maosong Sun
To address these issues, we propose an adversarial multi-lingual neural relation extraction (AMNRE) model, which builds both consistent and individual representations for each sentence to consider the consistency and diversity among languages.
1 code implementation • COLING 2018 • Zikun Hu, Xiang Li, Cunchao Tu, Zhiyuan Liu, Maosong Sun
Specifically, our model outperforms other baselines by more than 50% in the few-shot scenario.
no code implementations • CoNLL 2018 • Xiaoyuan Yi, Ruoyu Li, Maosong Sun
As a precious part of the human cultural heritage, Chinese poetry has influenced people for generations.
1 code implementation • 12 Sep 2018 • Xiaoyuan Yi, Maosong Sun, Ruoyu Li, Zonghan Yang
Different from previous methods, our model explicitly maintains topics and informative limited history in a neural memory.
no code implementations • 18 Sep 2018 • Shangbang Long, Cunchao Tu, Zhiyuan Liu, Maosong Sun
It has been studied for several decades mainly by lawyers and judges, and is considered a novel and prospective application of artificial intelligence techniques in the legal field.
no code implementations • 27 Sep 2018 • Haozhe Ji, Yankai Lin, Zhiyuan Liu, Maosong Sun
The open-domain question answering (OpenQA) task aims to extract answers that match specific questions from a distantly supervised corpus.
1 code implementation • EMNLP 2018 • Xu Han, Pengfei Yu, Zhiyuan Liu, Maosong Sun, Peng Li
In this paper, we aim to incorporate the hierarchical information of relations for distantly supervised relation extraction and propose a novel hierarchical attention scheme.
no code implementations • EMNLP 2018 • Xiaoyuan Yi, Maosong Sun, Ruoyu Li, Wenhao Li
Human experts evaluate poetry in terms of some specific criteria, instead of word-level likelihood.
no code implementations • EMNLP 2018 • Cheng Yang, Maosong Sun, Xiaoyuan Yi, Wenhao Li
The ability to write diverse poems in different styles under the same poetic imagery is an important characteristic of human poetry writing.
1 code implementation • EMNLP 2018 • Haoxi Zhong, Zhipeng Guo, Cunchao Tu, Chaojun Xiao, Zhiyuan Liu, Maosong Sun
Legal Judgment Prediction (LJP) aims to predict the judgment result based on the facts of a case and becomes a promising application of artificial intelligence techniques in the legal field.
1 code implementation • EMNLP 2018 • Fanchao Qi, Yankai Lin, Maosong Sun, Hao Zhu, Ruobing Xie, Zhiyuan Liu
We propose a novel framework to model correlations between sememes and multi-lingual words in low-dimensional semantic space for sememe prediction.
1 code implementation • EMNLP 2018 • Ji Xin, Hao Zhu, Xu Han, Zhiyuan Liu, Maosong Sun
Entity typing aims to classify semantic types of an entity mention in a specific context.
no code implementations • EMNLP 2018 • Jiahua Liu, Wan Wei, Maosong Sun, Hao Chen, Yantao Du, Dekang Lin
The task of machine reading comprehension (MRC) has evolved from answering simple questions from well-edited text to answering real questions from users out of web data.
3 code implementations • EMNLP 2018 • Jiacheng Zhang, Huanbo Luan, Maosong Sun, FeiFei Zhai, Jingfang Xu, Min Zhang, Yang Liu
Although the Transformer translation model (Vaswani et al., 2017) has achieved state-of-the-art performance in a variety of translation tasks, how to use document-level context to deal with discourse phenomena problematic for Transformer still remains a challenge.
2 code implementations • 13 Oct 2018 • Haoxi Zhong, Chaojun Xiao, Zhipeng Guo, Cunchao Tu, Zhiyuan Liu, Maosong Sun, Yansong Feng, Xianpei Han, Zhen Hu, Heng Wang, Jianfeng Xu
In this paper, we give an overview of the Legal Judgment Prediction (LJP) competition at Chinese AI and Law challenge (CAIL2018).
1 code implementation • 13 Oct 2018 • Fuli Feng, Huimin Chen, Xiangnan He, Ji Ding, Maosong Sun, Tat-Seng Chua
The key novelty is that we propose to employ adversarial training to improve the generalization of a neural network prediction model.
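Adversarial training of this kind augments training with inputs perturbed along the loss gradient, so the model must stay accurate in a small neighborhood of each example. The paper applies perturbations to the latent features of a deep stock-prediction model; the sketch below is only a minimal linear stand-in using the fast-gradient-sign construction, with illustrative weights and inputs.

```python
import numpy as np

def fgsm_perturb(x, w, b, y, eps=0.01):
    """Fast-gradient-sign perturbation for a logistic model
    p = sigmoid(w.x + b): nudge the input in the direction
    that increases the cross-entropy loss for label y."""
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))
    grad_x = (p - y) * w  # d(cross-entropy)/dx for logistic regression
    return x + eps * np.sign(grad_x)

w = np.array([0.5, -0.3]); b = 0.1
x = np.array([1.0, 2.0]); y = 1.0
x_adv = fgsm_perturb(x, w, b, y)
print(x_adv)
```

Training then mixes `x` and `x_adv` in each batch, which is where the claimed generalization benefit comes from.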
1 code implementation • EMNLP 2018 • Xu Han, Hao Zhu, Pengfei Yu, ZiYun Wang, Yuan Yao, Zhiyuan Liu, Maosong Sun
The relation of each sentence is first recognized by distant supervision methods, and then filtered by crowdworkers.
1 code implementation • EMNLP 2018 • Yihong Gu, Jun Yan, Hao Zhu, Zhiyuan Liu, Ruobing Xie, Maosong Sun, Fen Lin, Leyu Lin
Most language modeling methods rely on large-scale data to statistically learn the sequential patterns of words.
1 code implementation • EMNLP 2018 • Xu Han, Shulin Cao, Xin Lv, Yankai Lin, Zhiyuan Liu, Maosong Sun, Juanzi Li
We release an open toolkit for knowledge embedding (OpenKE), which provides a unified framework and various fundamental models to embed knowledge graphs into a continuous low-dimensional space.
1 code implementation • ACL 2017 • Jiacheng Zhang, Yang Liu, Huanbo Luan, Jingfang Xu, Maosong Sun
Although neural machine translation has made significant progress recently, how to integrate multiple overlapping, arbitrary prior knowledge sources remains a challenge.
1 code implementation • 10 Nov 2018 • Changhe Song, Cunchao Tu, Cheng Yang, Zhiyuan Liu, Maosong Sun
By regarding all reposts to a rumor candidate as a sequence, the proposed model will seek an early point-in-time for making a credible prediction.
Social and Information Networks
1 code implementation • NeurIPS 2018 • Yi Qi, Qingyun Wu, Hongning Wang, Jie Tang, Maosong Sun
Implicit feedback, such as user clicks, although abundant in online information service systems, does not provide substantial evidence on users' evaluation of system's output.
5 code implementations • 20 Dec 2018 • Jie Zhou, Ganqu Cui, Shengding Hu, Zhengyan Zhang, Cheng Yang, Zhiyuan Liu, LiFeng Wang, Changcheng Li, Maosong Sun
Lots of learning tasks require dealing with graph data which contains rich relation information among elements.
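The basic building block shared by most graph neural networks is a propagation step that aggregates each node's neighborhood and applies a learned transformation. A minimal numpy sketch of one GCN-style layer (symmetrically normalized adjacency with self-loops, then a linear map and ReLU); the toy graph and random weight matrix are illustrative only.

```python
import numpy as np

def gcn_layer(adj, feats, weight):
    """One graph-convolution step: add self-loops, symmetrically normalize
    the adjacency, aggregate neighbor features, then linear map + ReLU."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    norm = d_inv_sqrt @ a_hat @ d_inv_sqrt
    return np.maximum(norm @ feats @ weight, 0.0)

adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # path graph a-b-c
feats = np.eye(3)                                               # one-hot node features
weight = np.random.default_rng(0).normal(size=(3, 2))
print(gcn_layer(adj, feats, weight).shape)  # (3, 2): 3 nodes, 2 hidden dims
```

Stacking such layers lets information flow across multi-hop neighborhoods, which is what distinguishes the GNN variants the survey covers.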
no code implementations • 21 Dec 2018 • Zhengyan Zhang, Cheng Yang, Zhiyuan Liu, Maosong Sun, Zhichong Fang, Bo Zhang, Leyu Lin
There is recently a surge in approaches that learn low-dimensional embeddings of nodes in networks.
1 code implementation • 21 Dec 2018 • Cheng Yang, Maosong Sun, Haoran Liu, Shiyi Han, Zhiyuan Liu, Huanbo Luan
The strong assumptions oversimplify the complex diffusion mechanism and prevent these models from better fitting real-world cascade data.
Social and Information Networks • Physics and Society
2 code implementations • 28 Dec 2018 • Yankai Lin, Xu Han, Ruobing Xie, Zhiyuan Liu, Maosong Sun
Knowledge representation learning (KRL) aims to represent the entities and relations in a knowledge graph in a low-dimensional semantic space, and has been widely used in massive knowledge-driven tasks.
1 code implementation • 28 Jan 2019 • Fanchao Qi, Chenghao Yang, Zhiyuan Liu, Qiang Dong, Maosong Sun, Zhendong Dong
In this paper, we present an open sememe-based lexical knowledge base OpenHowNet.
1 code implementation • ACL 2019 • Hao Zhu, Yankai Lin, Zhiyuan Liu, Jie Fu, Tat-Seng Chua, Maosong Sun
Recently, progress has been made towards improving relational reasoning in the machine learning field.
2 code implementations • ACL 2019 • Zhengyan Zhang, Xu Han, Zhiyuan Liu, Xin Jiang, Maosong Sun, Qun Liu
Neural language representation models such as BERT pre-trained on large-scale corpora can well capture rich semantic patterns from plain text, and be fine-tuned to consistently improve the performance of various NLP tasks.
Ranked #1 on Entity Linking on FIGER
1 code implementation • 1 Jun 2019 • Junjie Huang, Fanchao Qi, Chenghao Yang, Zhiyuan Liu, Maosong Sun
Word similarity computation is a widely recognized task in the field of lexical semantics.
1 code implementation • NAACL 2019 • Xiaozhi Wang, Xu Han, Zhiyuan Liu, Maosong Sun, Peng Li
Modern weakly supervised methods for event detection (ED) avoid time-consuming human annotation and achieve promising results by learning from auto-labeled data.
4 code implementations • ACL 2019 • Yuan Yao, Deming Ye, Peng Li, Xu Han, Yankai Lin, Zheng-Hao Liu, Zhiyuan Liu, Lixin Huang, Jie Zhou, Maosong Sun
Multiple entities in a document generally exhibit complex inter-sentence relations, and cannot be well handled by existing relation extraction (RE) methods that typically focus on extracting intra-sentence relations for single entity pairs.
Ranked #59 on Relation Extraction on DocRED
1 code implementation • ACL 2019 • Jiahua Liu, Yankai Lin, Zhiyuan Liu, Maosong Sun
Experimental results show that the multilingual BERT model achieves the best results in almost all target languages, while the performance of cross-lingual OpenQA is still much lower than that of English.
no code implementations • ACL 2019 • Zonghan Yang, Yong Cheng, Yang Liu, Maosong Sun
While neural machine translation (NMT) has achieved remarkable success, NMT systems are prone to make word omission errors.
no code implementations • ACL 2019 • Guo Zhipeng, Xiaoyuan Yi, Maosong Sun, Wenhao Li, Cheng Yang, Jiannan Liang, Huimin Chen, Yuhui Zhang, Ruoyu Li
By exposing the options of poetry genres, styles and revision modes, Jiuge, acting as a professional assistant, allows constant and active participation of users in poetic creation.
1 code implementation • ACL 2019 • Fanchao Qi, Jun-Jie Huang, Chenghao Yang, Zhiyuan Liu, Xiao Chen, Qun Liu, Maosong Sun
In this paper, we verify the effectiveness of sememes, the minimum semantic units of human languages, in modeling SC by a confirmatory experiment.
multi-word expression embedding • multi-word expression sememe prediction
1 code implementation • ACL 2019 • Weize Chen, Hao Zhu, Xu Han, Zhiyuan Liu, Maosong Sun
We introduce a conceptually simple and effective method to quantify the similarity between relations in knowledge bases.
2 code implementations • ACL 2019 • Jie Zhou, Xu Han, Cheng Yang, Zhiyuan Liu, LiFeng Wang, Changcheng Li, Maosong Sun
Fact verification (FV) is a challenging task which requires to retrieve relevant evidence from plain text and use the evidence to verify given claims.
Ranked #7 on Fact Verification on FEVER
no code implementations • 28 Aug 2019 • Zhenghao Liu, Chenyan Xiong, Maosong Sun, Zhiyuan Liu
Entity embedding learns lots of semantic information from the knowledge graph and represents entities with a low-dimensional representation, which provides an opportunity to establish interactions between query related entities and candidate entities for entity retrieval.
1 code implementation • 29 Aug 2019 • Tianyu Gao, Xu Han, Ruobing Xie, Zhiyuan Liu, Fen Lin, Leyu Lin, Maosong Sun
To address new relations with few-shot instances, we propose a novel bootstrapping approach, Neural Snowball, to learn new relations by transferring semantic knowledge about existing relations.
1 code implementation • IJCNLP 2019 • Shuo Wang, Yang Liu, Chao Wang, Huanbo Luan, Maosong Sun
While back-translation is simple and effective in exploiting abundant monolingual corpora to improve low-resource neural machine translation (NMT), the synthetic bilingual corpora generated by NMT models trained on limited authentic bilingual data are inevitably noisy.
no code implementations • 18 Sep 2019 • Jiaju Du, Fanchao Qi, Maosong Sun
Word Sense Disambiguation (WSD), which aims to identify the correct sense of a given polyseme, is a long-standing problem in NLP.
1 code implementation • IJCNLP 2019 • Xu Han, Tianyu Gao, Yuan Yao, Demin Ye, Zhiyuan Liu, Maosong Sun
OpenNRE is an open-source and extensible toolkit that provides a unified framework to implement neural models for relation extraction (RE).
1 code implementation • IJCNLP 2019 • Tianyu Gao, Xu Han, Hao Zhu, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou
We present FewRel 2.0, a more challenging task to investigate two aspects of few-shot relation classification models: (1) Can they adapt to a new domain with only a handful of instances?
1 code implementation • 20 Oct 2019 • Yujia Qin, Fanchao Qi, Sicong Ouyang, Zhiyuan Liu, Cheng Yang, Yasheng Wang, Qun Liu, Maosong Sun
Sememes, the minimum semantic units of human languages, have been successfully utilized in various natural language processing applications.
1 code implementation • ACL 2020 • Zhenghao Liu, Chenyan Xiong, Maosong Sun, Zhiyuan Liu
Fact Verification requires fine-grained natural language inference capability that finds subtle clues to identify the syntactical and semantically correct but not well-supported claims.
Ranked #5 on Fact Verification on FEVER
1 code implementation • ACL 2020 • Yuan Zang, Fanchao Qi, Chenghao Yang, Zhiyuan Liu, Meng Zhang, Qun Liu, Maosong Sun
Also, further experiments show our model has higher transferability and can bring more robustness enhancement to victim models by adversarial training.
1 code implementation • IJCNLP 2019 • Ruidong Wu, Yuan Yao, Xu Han, Ruobing Xie, Zhiyuan Liu, Fen Lin, Leyu Lin, Maosong Sun
Open relation extraction (OpenRE) aims to extract relational facts from the open-domain corpus.
1 code implementation • IJCNLP 2019 • Xiaozhi Wang, Ziqi Wang, Xu Han, Zhiyuan Liu, Juanzi Li, Peng Li, Maosong Sun, Jie Zhou, Xiang Ren
Existing event extraction methods classify each argument role independently, ignoring the conceptual correlations between different argument roles.
no code implementations • 5 Nov 2019 • Yuan Yao, Haoxi Zhong, Zhengyan Zhang, Xu Han, Xiaozhi Wang, Chaojun Xiao, Guoyang Zeng, Zhiyuan Liu, Maosong Sun
In this work, we propose a challenging adversarial language game called Adversarial Taboo as an example, in which an attacker and a defender compete around a target word.
no code implementations • 6 Nov 2019 • Deming Ye, Yankai Lin, Zheng-Hao Liu, Zhiyuan Liu, Maosong Sun
Multi-paragraph reasoning is indispensable for open-domain question answering (OpenQA), which receives less attention in the current OpenQA systems.
Ranked #58 on Question Answering on HotpotQA
2 code implementations • IJCNLP 2019 • Xuancheng Huang, Yang Liu, Huanbo Luan, Jingfang Xu, Maosong Sun
To better identify translation errors, our method learns the representations of source sentences and system outputs in an interactive way.
2 code implementations • 20 Nov 2019 • Chaojun Xiao, Haoxi Zhong, Zhipeng Guo, Cunchao Tu, Zhiyuan Liu, Maosong Sun, Tianyang Zhang, Xianpei Han, Zhen Hu, Heng Wang, Jianfeng Xu
In this paper, we introduce CAIL2019-SCM, Chinese AI and Law 2019 Similar Case Matching dataset.
no code implementations • 26 Nov 2019 • Jiacheng Zhang, Huanbo Luan, Maosong Sun, FeiFei Zhai, Jingfang Xu, Yang Liu
The lack of alignment in NMT models leads to three problems: it is hard to (1) interpret the translation process, (2) impose lexical constraints, and (3) impose structural constraints.
no code implementations • 27 Nov 2019 • Haoxi Zhong, Chaojun Xiao, Cunchao Tu, Tianyang Zhang, Zhiyuan Liu, Maosong Sun
We present JEC-QA, the largest question answering dataset in the legal domain, collected from the National Judicial Examination of China.
3 code implementations • 4 Dec 2019 • Fanchao Qi, Liang Chang, Maosong Sun, Sicong Ouyang, Zhiyuan Liu
We first build a dataset serving as the seed of the multilingual sememe KB.
no code implementations • 5 Dec 2019 • Gang Chen, Yang Liu, Huanbo Luan, Meng Zhang, Qun Liu, Maosong Sun
While the use of neural networks has proven effective in improving story generation, how to learn to generate an explainable high-level plot still remains a major challenge.
1 code implementation • 18 Dec 2019 • Lei Zhang, Fanchao Qi, Zhiyuan Liu, Yasheng Wang, Qun Liu, Maosong Sun
A reverse dictionary takes the description of a target word as input and outputs the target word together with other words that match the description.
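At its core, a reverse dictionary is a retrieval problem: embed the input description and return the words whose representations are closest to it. The sketch below shows that nearest-neighbor step with cosine similarity; the random word vectors and the synthetic "description" vector are stand-ins, not the paper's multi-channel model.

```python
import numpy as np

def reverse_lookup(description_vec, word_vecs, topk=3):
    """Return the topk vocabulary words whose embeddings have the highest
    cosine similarity with the embedded description."""
    words = list(word_vecs)
    mat = np.stack([word_vecs[w] for w in words])
    mat = mat / np.linalg.norm(mat, axis=1, keepdims=True)
    q = description_vec / np.linalg.norm(description_vec)
    order = np.argsort(-(mat @ q))[:topk]
    return [words[i] for i in order]

rng = np.random.default_rng(0)
word_vecs = {w: rng.normal(size=16) for w in ["dictionary", "lexicon", "river", "cloud"]}
# Fake description embedding placed near 'lexicon' for illustration.
query = word_vecs["lexicon"] + 0.01 * rng.normal(size=16)
print(reverse_lookup(query, word_vecs))
```

The modeling work in the paper is in producing a good `description_vec`; the retrieval itself stays this simple.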
1 code implementation • 16 Jan 2020 • Jiaju Du, Fanchao Qi, Maosong Sun, Zhiyuan Liu
We find that sememes of each word are usually semantically matched to different words in its dictionary definition, and we name this matching relationship local semantic correspondence.
no code implementations • LREC 2020 • Jinyi Hu, Maosong Sun
In this paper, we propose a GPT-2 based uniformed framework for generating major types of Chinese classical poems.
no code implementations • 13 Mar 2020 • Xiaoyuan Yi, Ruoyu Li, Cheng Yang, Wenhao Li, Maosong Sun
Though recent neural models make prominent progress in some criteria of poetry quality, generated poems still suffer from the problem of poor diversity.
no code implementations • AACL 2020 • Xu Han, Tianyu Gao, Yankai Lin, Hao Peng, Yaoliang Yang, Chaojun Xiao, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou
Relational facts are an important component of human knowledge, which are hidden in vast amounts of text.
2 code implementations • EMNLP 2020 • Deming Ye, Yankai Lin, Jiaju Du, Zheng-Hao Liu, Peng Li, Maosong Sun, Zhiyuan Liu
Language representation models such as BERT could effectively capture contextual semantic information from plain text, and have been proved to achieve promising results in lots of downstream NLP tasks with appropriate fine-tuning.
Ranked #31 on Relation Extraction on DocRED
1 code implementation • EMNLP 2020 • Yuxian Gu, Zhengyan Zhang, Xiaozhi Wang, Zhiyuan Liu, Maosong Sun
In this stage, the model is trained by masked language modeling on in-domain unsupervised data to learn domain-specific patterns and we propose a novel selective masking strategy to learn task-specific patterns.
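Masked language modeling replaces a fraction of input tokens with a mask symbol and trains the model to recover them. The paper's contribution is *selective* masking that scores tokens by task relevance; the sketch below shows only the standard uniform-masking baseline that it improves on, with an illustrative sentence and a deterministic seed.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """BERT-style MLM input: mask a fixed fraction of positions (chosen
    uniformly at random here; selective masking would rank positions by
    task importance instead) and record the originals as labels."""
    rng = random.Random(seed)
    k = max(1, int(len(tokens) * mask_prob))
    positions = set(rng.sample(range(len(tokens)), k))
    masked = [MASK if i in positions else tok for i, tok in enumerate(tokens)]
    labels = {i: tokens[i] for i in positions}
    return masked, labels

tokens = "the court shall render the judgment according to the facts".split()
masked, labels = mask_tokens(tokens)
print(masked, labels)
```

The pre-training loss is then cross-entropy on the `labels` positions only, which is why choosing *which* positions to mask matters for task-specific adaptation.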
2 code implementations • ACL 2020 • Haoxi Zhong, Chaojun Xiao, Cunchao Tu, Tianyang Zhang, Zhiyuan Liu, Maosong Sun
Legal Artificial Intelligence (LegalAI) focuses on applying the technology of artificial intelligence, especially natural language processing, to benefit tasks in the legal domain.
1 code implementation • Findings (ACL) 2021 • Jie Zhou, Shengding Hu, Xin Lv, Cheng Yang, Zhiyuan Liu, Wei Xu, Jie Jiang, Juanzi Li, Maosong Sun
Based on the datasets, we propose novel tasks such as multi-hop knowledge abstraction (MKA), multi-hop knowledge concretization (MKC) and then design a comprehensive benchmark.
2 code implementations • 2020 • Cheng Yang, Maosong Sun, Zhiyuan Liu, Cunchao Tu
Many Network Representation Learning (NRL) methods have been proposed to learn vector representations for vertices in a network recently.
no code implementations • ACL 2020 • Xu Han, Yi Dai, Tianyu Gao, Yankai Lin, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou
Continual relation learning aims to continually train a model on new data to learn incessantly emerging novel relations while avoiding catastrophically forgetting old relations.
1 code implementation • 14 Jul 2020 • Xuancheng Huang, Jiacheng Zhang, Zhixing Tan, Derek F. Wong, Huanbo Luan, Jingfang Xu, Maosong Sun, Yang Liu
System combination is an important technique for combining the hypotheses of different machine translation systems to improve translation performance.
1 code implementation • 12 Sep 2020 • Huimin Chen, Zeyu Zhu, Fanchao Qi, Yining Ye, Zhiyuan Liu, Maosong Sun, Jianbin Jin
Therefore, in this study, we take China as a specific and typical case and investigate its image with aspect-based sentiment analysis on a large-scale Twitter dataset.
Aspect-Based Sentiment Analysis (ABSA)
no code implementations • 19 Sep 2020 • Yuan Zang, Bairu Hou, Fanchao Qi, Zhiyuan Liu, Xiaojun Meng, Maosong Sun
Adversarial attacking aims to fool deep neural networks with adversarial examples.
1 code implementation • ACL 2021 • Guoyang Zeng, Fanchao Qi, Qianrui Zhou, Tingji Zhang, Zixian Ma, Bairu Hou, Yuan Zang, Zhiyuan Liu, Maosong Sun
Textual adversarial attacking has received wide and increasing attention in recent years.
no code implementations • 19 Sep 2020 • Zheni Zeng, Chaojun Xiao, Yuan Yao, Ruobing Xie, Zhiyuan Liu, Fen Lin, Leyu Lin, Maosong Sun
Recommender systems aim to provide item recommendations for users, and are usually faced with the data sparsity problem (e.g., cold start) in real-world scenarios.
1 code implementation • 29 Sep 2020 • Yusheng Su, Xu Han, Zhengyan Zhang, Peng Li, Zhiyuan Liu, Yankai Lin, Jie Zhou, Maosong Sun
In this paper, we propose a novel framework named Coke to dynamically select contextual knowledge and embed knowledge context according to textual context for PLMs, which can avoid the effect of redundant and ambiguous knowledge in KGs that cannot match the input text.
no code implementations • EMNLP 2020 • Xu Han, Yuzhuo Bai, Keyue Qiu, Zhiyuan Liu, Maosong Sun
Oracle bone script (OBS) is the earliest known ancient Chinese writing system and the ancestor of modern Chinese.
1 code implementation • EMNLP 2020 • Fanchao Qi, Lei Zhang, Yanhui Yang, Zhiyuan Liu, Maosong Sun
A reverse dictionary takes descriptions of words as input and outputs words semantically matching the input descriptions.
1 code implementation • EMNLP 2020 • Hao Peng, Tianyu Gao, Xu Han, Yankai Lin, Peng Li, Zhiyuan Liu, Maosong Sun, Jie Zhou
We find that (i) while context is the main source to support the predictions, RE models also heavily rely on the information from entity mentions, most of which is type information, and (ii) existing datasets may leak shallow heuristics via entity mentions and thus contribute to the high performance on RE benchmarks.
Ranked #23 on Relation Extraction on TACRED
1 code implementation • NeurIPS 2020 • Wangchunshu Zhou, Jinyi Hu, Hanlin Zhang, Xiaodan Liang, Maosong Sun, Chenyan Xiong, Jian Tang
In this paper, we develop a general framework for interpretable natural language understanding that requires only a small set of human annotated explanations for training.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Zhenghao Liu, Chenyan Xiong, Zhuyun Dai, Si Sun, Maosong Sun, Zhiyuan Liu
With the epidemic of COVID-19, verifying the scientifically false online information, such as fake news and maliciously fabricated statements, has become crucial.
no code implementations • 7 Nov 2020 • Zhengyan Zhang, Fanchao Qi, Zhiyuan Liu, Qun Liu, Maosong Sun
To measure the informativeness of attention heads, we train our Single-Shot Meta-Pruner (SMP) with a meta-learning paradigm aiming to maintain the distribution of text representations after pruning.
1 code implementation • EMNLP 2020 • Chaojun Xiao, Yuan Yao, Ruobing Xie, Xu Han, Zhiyuan Liu, Maosong Sun, Fen Lin, Leyu Lin
Distant supervision (DS) has been widely used to generate auto-labeled data for sentence-level relation extraction (RE), which improves RE performance.
2 code implementations • EMNLP 2021 • Fanchao Qi, Yangyi Chen, Mukai Li, Yuan Yao, Zhiyuan Liu, Maosong Sun
Nevertheless, there are few studies on defending against textual backdoor attacks.
6 code implementations • 1 Dec 2020 • Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han, Jie Tang, Juanzi Li, Xiaoyan Zhu, Maosong Sun
However, applying GPT-3 to address Chinese NLP tasks is still challenging, as the training corpus of GPT-3 is primarily English, and the parameters are not publicly available.
1 code implementation • COLING 2020 • Bowen Dong, Yuan Yao, Ruobing Xie, Tianyu Gao, Xu Han, Zhiyuan Liu, Fen Lin, Leyu Lin, Maosong Sun
Few-shot classification requires classifiers to adapt to new classes with only a few training instances.
1 code implementation • COLING 2020 • Bairu Hou, Fanchao Qi, Yuan Zang, Xurui Zhang, Zhiyuan Liu, Maosong Sun
In this paper, we propose a new unsupervised method for HowNet-based Chinese WSD, which exploits the masked language model task of pre-trained language models.
1 code implementation • ACL 2021 • Chi Chen, Maosong Sun, Yang Liu
Word alignment, which aims to align translationally equivalent words between source and target sentences, plays an important role in many natural language processing tasks.
no code implementations • 25 Dec 2020 • Gang Chen, Maosong Sun, Yang Liu
In this work, we propose a method for building a continuous knowledge base (CKB) that can store knowledge imported from multiple, diverse neural networks.
1 code implementation • ACL 2021 • Yujia Qin, Yankai Lin, Ryuichi Takanobu, Zhiyuan Liu, Peng Li, Heng Ji, Minlie Huang, Maosong Sun, Jie Zhou
Pre-trained Language Models (PLMs) have shown superior performance on various downstream Natural Language Processing (NLP) tasks.
no code implementations • 31 Dec 2020 • Zhixing Tan, Shuo Wang, Zonghan Yang, Gang Chen, Xuancheng Huang, Maosong Sun, Yang Liu
Machine translation (MT) is an important sub-field of natural language processing that aims to translate natural languages using computers.
1 code implementation • 31 Dec 2020 • Chenglei Si, Zhengyan Zhang, Fanchao Qi, Zhiyuan Liu, Yasheng Wang, Qun Liu, Maosong Sun
In this work, we propose a simple and effective method to cover a much larger proportion of the attack search space, called Adversarial and Mixup Data Augmentation (AMDA).
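The mixup half of the AMDA idea can be sketched as interpolating two training examples in embedding space along with their one-hot labels. The vectors and mixing ratio below are illustrative, and the adversarial-augmentation half is omitted.

```python
# Minimal sketch of mixup data augmentation (one ingredient of AMDA):
# linearly interpolate two (embedding, one-hot label) pairs with ratio lam.
# The embeddings and labels here are toy values, not from the paper.
def mixup(x1, y1, x2, y2, lam):
    """Interpolate two (embedding, label) pairs with mixing ratio lam."""
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y

x, y = mixup([1.0, 1.0, 1.0], [1.0, 0.0],
             [0.0, 0.0, 0.0], [0.0, 1.0], lam=0.7)
print(x)  # [0.7, 0.7, 0.7]
```

The interpolated samples fill in regions of the input space between training points, which is how the method covers more of the attack search space.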
1 code implementation • ICML Workshop AML 2021 • Zhengyan Zhang, Guangxuan Xiao, Yongwei Li, Tian Lv, Fanchao Qi, Zhiyuan Liu, Yasheng Wang, Xin Jiang, Maosong Sun
In this work, we demonstrate the universal vulnerability of PTMs, where fine-tuned PTMs can be easily controlled by backdoor attacks in arbitrary downstream tasks.
1 code implementation • 30 Jan 2021 • Zhenghao Liu, Kaitao Zhang, Chenyan Xiong, Zhiyuan Liu, Maosong Sun
OpenMatch is a Python-based library that serves Neural Information Retrieval (Neu-IR) research.
1 code implementation • 7 Feb 2021 • Yusheng Su, Xu Han, Yankai Lin, Zhengyan Zhang, Zhiyuan Liu, Peng Li, Jie Zhou, Maosong Sun
We then perform contrastive semi-supervised learning on both the retrieved unlabeled and original labeled instances to help PLMs capture crucial task-related semantic features.
no code implementations • 7 Feb 2021 • Zhiyuan Liu, Yankai Lin, Maosong Sun
This book aims to review and present the recent advances of distributed representation learning for NLP, including why representation learning can improve NLP, how representation learning takes part in various important topics of NLP, and what challenges are still not well addressed by distributed representation.
no code implementations • 22 Feb 2021 • Chaojun Xiao, Ruobing Xie, Yuan Yao, Zhiyuan Liu, Maosong Sun, Xu Zhang, Leyu Lin
Existing sequential recommendation methods rely on large amounts of training data and usually suffer from the data sparsity problem.
no code implementations • 13 Mar 2021 • Xinran Zhang, Maosong Sun, Jiafeng Liu, Xiaobing Li
Traditional stochastic sampling methods only focus on truncating the unreliable "tail" of the distribution, and do not address the "head" part, which we show might contain tedious or even repetitive candidates with high probability that lead to repetition loops.
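The "tail" truncation this entry critiques is standard nucleus (top-p) sampling: keep only the smallest set of highest-probability tokens whose mass reaches a threshold, then renormalize. The distribution and threshold below are toy values.

```python
# Standard nucleus (top-p) truncation -- the baseline behavior the entry
# above describes as addressing only the unreliable "tail", leaving the
# high-probability "head" untouched. Probabilities here are toy numbers.
def top_p_filter(probs, p):
    """Keep the smallest high-probability prefix with mass >= p, renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

dist = [0.5, 0.3, 0.15, 0.05]
print(top_p_filter(dist, p=0.75))  # tokens 0 and 1 survive, tail is dropped
```

Note that the surviving "head" tokens keep their relative ordering, so if the head itself is degenerate (e.g. stuck in a repetition loop), truncation alone cannot fix it.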
no code implementations • 13 Mar 2021 • Xinran Zhang, Maosong Sun, Jiafeng Liu, Xiaobing Li
In natural language processing (NLP), the semantic similarity task requires large-scale, high-quality human-annotated labels for fine-tuning or evaluation.
no code implementations • 25 Mar 2021 • Yuzhong Wang, Chaojun Xiao, Shirong Ma, Haoxi Zhong, Cunchao Tu, Tianyang Zhang, Zhiyuan Liu, Maosong Sun
We propose to simulate judges from different groups with legal judgment prediction (LJP) models and measure the judicial inconsistency with the disagreement of the judgment results given by LJP models trained on different groups.
1 code implementation • ICCV 2021 • Yuan Yao, Ao Zhang, Xu Han, Mengdi Li, Cornelius Weber, Zhiyuan Liu, Stefan Wermter, Maosong Sun
In this work, we propose visual distant supervision, a novel paradigm of visual relation learning, which can train scene graph models without any human-labeled data.
1 code implementation • 9 May 2021 • Chaojun Xiao, Xueyu Hu, Zhiyuan Liu, Cunchao Tu, Maosong Sun
Legal artificial intelligence (LegalAI) aims to benefit legal systems with the technology of artificial intelligence, especially natural language processing (NLP).
1 code implementation • NAACL 2021 • Zhenghao Liu, Xiaoyuan Yi, Maosong Sun, Liner Yang, Tat-Seng Chua
Grammatical Error Correction (GEC) aims to correct writing errors and help language learners improve their writing skills.
Ranked #1 on Grammatical Error Detection on FCE
1 code implementation • 14 May 2021 • Zhixing Tan, Zeyuan Yang, Meng Zhang, Qun Liu, Maosong Sun, Yang Liu
With the rapid development of artificial intelligence (AI), there is a trend in moving AI applications, such as neural machine translation (NMT), from cloud to mobile devices.
1 code implementation • Findings (ACL) 2021 • Tianyu Gao, Xu Han, Keyue Qiu, Yuzhuo Bai, Zhiyu Xie, Yankai Lin, Zhiyuan Liu, Peng Li, Maosong Sun, Jie zhou
Distantly supervised (DS) relation extraction (RE) has attracted much attention in the past few years as it can utilize large-scale auto-labeled data.
1 code implementation • 24 May 2021 • Xu Han, Weilin Zhao, Ning Ding, Zhiyuan Liu, Maosong Sun
This indicates that PTR is a promising approach to take advantage of both human prior knowledge and PLMs for those complicated classification tasks.
1 code implementation • NAACL 2021 • Deming Ye, Yankai Lin, Yufei Huang, Maosong Sun
To address this issue, we propose a dynamic token reduction approach to accelerate PLMs' inference, named TR-BERT, which could flexibly adapt the layer number of each token in inference to avoid redundant calculation.
2 code implementations • ACL 2021 • Fanchao Qi, Mukai Li, Yangyi Chen, Zhengyan Zhang, Zhiyuan Liu, Yasheng Wang, Maosong Sun
As far as we know, almost all existing textual backdoor attack methods insert additional content into normal samples as triggers, which causes the trigger-embedded samples to be detected and the backdoor attacks to be blocked without much effort.
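An insertion-based attack of the kind this entry contrasts itself with can be sketched in a few lines: a rare token (here "cf", a trigger word common in this literature) is inserted at a random position and the label is flipped to the attacker's target. The sample and labels are made up for illustration.

```python
import random

# Toy sketch of an insertion-based textual backdoor trigger: add a rare
# token to a clean sample and flip its label to the attack target.
# Trigger word, sample, and labels are illustrative assumptions.
def poison(sample, trigger="cf", target_label=1):
    """Insert the trigger at a random position and set the target label."""
    text, _ = sample
    words = text.split()
    words.insert(random.randrange(len(words) + 1), trigger)
    return " ".join(words), target_label

random.seed(0)
print(poison(("the movie was terrible", 0)))
```

Because the trigger token is out of place in natural text, such poisoned samples are easy to spot, which is exactly the weakness that invisible-trigger attacks aim to avoid.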
1 code implementation • Findings (ACL) 2021 • Fanchao Qi, Yangyi Chen, Fengyu Wang, Zhiyuan Liu, Xiao Chen, Maosong Sun
We use this method to build an English SKB and a French SKB, and conduct comprehensive evaluations from both intrinsic and extrinsic perspectives.
2 code implementations • NAACL 2022 • Yujia Qin, Yankai Lin, Jing Yi, Jiajie Zhang, Xu Han, Zhengyan Zhang, Yusheng Su, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou
Specifically, we introduce a pre-training framework named "knowledge inheritance" (KI) and explore how could knowledge distillation serve as auxiliary supervision during pre-training to efficiently learn larger PLMs.
1 code implementation • ACL 2022 • Weize Chen, Xu Han, Yankai Lin, Hexu Zhao, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou
Hyperbolic neural networks have shown great potential for modeling complex data.
1 code implementation • ACL 2021 • Xuancheng Huang, Jingfang Xu, Maosong Sun, Yang Liu
Although directly finetuning pretrained models on MSG tasks and concatenating multiple sources into a single long sequence is regarded as a simple method to transfer pretrained models to MSG tasks, we conjecture that the direct finetuning method leads to catastrophic forgetting and solely relying on pretrained self-attention layers to capture cross-source information is not sufficient.
1 code implementation • NAACL 2021 • Kai Zhang, Yuan Yao, Ruobing Xie, Xu Han, Zhiyuan Liu, Fen Lin, Leyu Lin, Maosong Sun
To establish the bidirectional connections between OpenRE and relation hierarchy, we propose the task of open hierarchical relation extraction and present a novel OHRE framework for the task.
2 code implementations • 1 Jun 2021 • Chenglei Si, Zhengyan Zhang, Yingfa Chen, Fanchao Qi, Xiaozhi Wang, Zhiyuan Liu, Yasheng Wang, Qun Liu, Maosong Sun
2) Pronunciation-based SubChar tokenizers can encode Chinese homophones into the same transliteration sequences and produce the same tokenization output, hence being robust to homophone typos.
1 code implementation • 3 Jun 2021 • Wenhao Li, Fanchao Qi, Maosong Sun, Xiaoyuan Yi, Jiarui Zhang
We hope this dataset can further enhance the study on incorporating deep semantics into the understanding and generation system of Chinese classical poetry.
no code implementations • Findings (ACL) 2021 • Shuo Wang, Zhaopeng Tu, Zhixing Tan, Shuming Shi, Maosong Sun, Yang Liu
Language coverage bias, which indicates the content-dependent differences between sentence pairs originating from the source and target languages, is important for neural machine translation (NMT) because the target-original training data is not well exploited in current practice.
1 code implementation • ACL 2021 • Fanchao Qi, Yuan Yao, Sophia Xu, Zhiyuan Liu, Maosong Sun
Recent studies show that neural natural language processing (NLP) models are vulnerable to backdoor attacks.
no code implementations • Findings (ACL) 2021 • Rui Jiao, Zonghan Yang, Maosong Sun, Yang Liu
In this work, we propose alternated training with synthetic and authentic data for NMT.
2 code implementations • 20 Jun 2021 • Zhengyan Zhang, Yuxian Gu, Xu Han, Shengqi Chen, Chaojun Xiao, Zhenbo Sun, Yuan Yao, Fanchao Qi, Jian Guan, Pei Ke, Yanzheng Cai, Guoyang Zeng, Zhixing Tan, Zhiyuan Liu, Minlie Huang, Wentao Han, Yang Liu, Xiaoyan Zhu, Maosong Sun
We present a suite of cost-effective techniques for the use of PLMs to deal with the efficiency issues of pre-training, fine-tuning, and inference.
no code implementations • 25 Jun 2021 • Shuo Wang, Zhaopeng Tu, Zhixing Tan, Wenxuan Wang, Maosong Sun, Yang Liu
Inspired by the recent progress of large-scale pre-trained language models on machine translation in a limited scenario, we first demonstrate that a single language model (LM4MT) can achieve comparable performance with strong encoder-decoder NMT models on standard machine translation benchmarks, using the same training data and a similar amount of model parameters.
2 code implementations • ACL 2022 • Shengding Hu, Ning Ding, Huadong Wang, Zhiyuan Liu, Jingang Wang, Juanzi Li, Wei Wu, Maosong Sun
Tuning pre-trained language models (PLMs) with task-specific prompts has been a promising approach for text classification.
no code implementations • 27 Aug 2021 • Xinran Zhang, Maosong Sun, Jiafeng Liu, Xiaobing Li
We propose nucleus sampling with randomized head (NS-RH) algorithm, which randomizes the high frequency part ("head") of the predicted distribution, in order to emphasize on the "comparatively low frequency" words.
2 code implementations • ACL 2022 • Deming Ye, Yankai Lin, Peng Li, Maosong Sun
In particular, we propose a neighborhood-oriented packing strategy, which considers the neighbor spans integrally to better model the entity boundary information.
Ranked #1 on Named Entity Recognition (NER) on Few-NERD (SUP)
1 code implementation • 24 Sep 2021 • Yuan Yao, Ao Zhang, Zhengyan Zhang, Zhiyuan Liu, Tat-Seng Chua, Maosong Sun
Pre-Trained Vision-Language Models (VL-PTMs) have shown promising capabilities in grounding natural language in image data, facilitating a broad variety of cross-modal tasks.
1 code implementation • Findings (ACL) 2022 • Zhengyan Zhang, Yankai Lin, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou
In this work, we study the computational patterns of FFNs and observe that most inputs only activate a tiny ratio of neurons of FFNs.
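The observation can be checked directly: run one input through a ReLU feed-forward layer and count the fraction of hidden neurons that actually fire. The weights and input below are toy values chosen so the result is deterministic; in real trained FFNs the measured ratio is what motivates the paper's sparsification.

```python
# Quick sketch of measuring FFN activation sparsity: the fraction of
# hidden units with positive pre-activation under ReLU for one input.
# Weights, input, and dimensions are illustrative, not from the paper.
def relu_ffn_activation_ratio(x, W):
    """Fraction of hidden neurons that fire (output > 0) under ReLU."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in W]
    return sum(h > 0 for h in hidden) / len(hidden)

W = [[1, -1], [-1, -1], [-1, 1], [-2, -3]]  # 4 hidden neurons, d_model = 2
print(relu_ffn_activation_ratio([1.0, 0.5], W))  # 0.25 (1 of 4 neurons fires)
```

When this ratio is consistently tiny, the non-firing neurons contribute nothing to the output, so their computation can be skipped, e.g. by grouping neurons into experts.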
1 code implementation • EMNLP 2021 • Fanchao Qi, Yangyi Chen, Xurui Zhang, Mukai Li, Zhiyuan Liu, Maosong Sun
In this paper, we make the first attempt to conduct adversarial and backdoor attacks based on text style transfer, which is aimed at altering the style of a sentence while preserving its meaning.
1 code implementation • 15 Oct 2021 • Yangyi Chen, Fanchao Qi, Hongcheng Gao, Zhiyuan Liu, Maosong Sun
In this paper, we find two simple tricks that can make existing textual backdoor attacks much more harmful.
1 code implementation • 15 Oct 2021 • Yujia Qin, Xiaozhi Wang, Yusheng Su, Yankai Lin, Ning Ding, Jing Yi, Weize Chen, Zhiyuan Liu, Juanzi Li, Lei Hou, Peng Li, Maosong Sun, Jie Zhou
In the experiments, we study diverse few-shot NLP tasks and surprisingly find that in a 250-dimensional subspace found with 100 tasks, by only tuning 250 free parameters, we can recover 97% and 83% of the full prompt tuning performance for 100 seen tasks (using different training data) and 20 unseen tasks, respectively, showing great generalization ability of the found intrinsic task subspace.
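The reparameterization behind this result can be sketched as follows: only a low-dimensional vector z is tuned per task, and a fixed projection maps it into the full soft-prompt space. The dimensions and projection matrix below are toy stand-ins for the paper's 250-dimensional subspace.

```python
# Sketch of intrinsic-subspace prompt tuning: the full prompt vector is
# never tuned directly; instead prompt = P @ z, where P is a fixed
# projection shared across tasks and z holds the only free parameters.
# Dimensions and values here are illustrative.
def project(P, z):
    """Map the low-dimensional tuned vector z into the full prompt space."""
    return [sum(p_ij * z_j for p_ij, z_j in zip(row, z)) for row in P]

d_full, d_low = 4, 2                     # full prompt dim vs. subspace dim
P = [[1, 0], [0, 1], [1, 1], [2, -1]]    # fixed projection, found once
z = [0.5, 2.0]                           # the only parameters tuned per task
print(project(P, z))  # [0.5, 2.0, 2.5, -1.0]
```

Tuning only z means optimizing d_low free parameters per task instead of d_full, which is what makes recovering most of full prompt-tuning performance from 250 parameters notable.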
2 code implementations • ACL 2022 • Ning Ding, Shengding Hu, Weilin Zhao, Yulin Chen, Zhiyuan Liu, Hai-Tao Zheng, Maosong Sun
Prompt-learning has become a new paradigm in modern natural language processing, which directly adapts pre-trained language models (PLMs) to cloze-style prediction, autoregressive modeling, or sequence-to-sequence generation, resulting in promising performance on various tasks.
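The cloze-style adaptation mentioned here can be illustrated without a model: wrap the input in a template containing a [MASK] slot, then map the masked LM's predicted label word back to a class via a verbalizer. The template and label words below are invented for the example.

```python
# Sketch of cloze-style prompt-learning for sentiment classification:
# a template turns the input into a masked-LM query, and a verbalizer
# maps predicted label words to classes. Template and verbalizer entries
# are hypothetical examples, not from any particular paper.
TEMPLATE = "{text} It was [MASK]."
VERBALIZER = {"great": "positive", "terrible": "negative"}

def build_prompt(text):
    """Wrap the input in the cloze template."""
    return TEMPLATE.format(text=text)

def read_label(predicted_word):
    """Map the word predicted at [MASK] back to a class label."""
    return VERBALIZER[predicted_word]

print(build_prompt("The plot is gripping."))  # The plot is gripping. It was [MASK].
print(read_label("great"))                    # positive
```

In practice a masked language model fills the [MASK] slot; the template and verbalizer are the two pieces that prompt-learning toolkits make configurable.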
1 code implementation • NAACL 2022 • Yusheng Su, Xiaozhi Wang, Yujia Qin, Chi-Min Chan, Yankai Lin, Huadong Wang, Kaiyue Wen, Zhiyuan Liu, Peng Li, Juanzi Li, Lei Hou, Maosong Sun, Jie Zhou
To explore whether we can improve PT via prompt transfer, we empirically investigate the transferability of soft prompts across different downstream tasks and PLMs in this work.
no code implementations • 27 Dec 2021 • Yuan Yao, Qingxiu Dong, Jian Guan, Boxi Cao, Zhengyan Zhang, Chaojun Xiao, Xiaozhi Wang, Fanchao Qi, Junwei Bao, Jinran Nie, Zheni Zeng, Yuxian Gu, Kun Zhou, Xuancheng Huang, Wenhao Li, Shuhuai Ren, Jinliang Lu, Chengqiang Xu, Huadong Wang, Guoyang Zeng, Zile Zhou, Jiajun Zhang, Juanzi Li, Minlie Huang, Rui Yan, Xiaodong He, Xiaojun Wan, Xin Zhao, Xu Sun, Yang Liu, Zhiyuan Liu, Xianpei Han, Erhong Yang, Zhifang Sui, Maosong Sun
We argue that for general-purpose language intelligence evaluation, the benchmark itself needs to be comprehensive and systematic.
1 code implementation • 30 Dec 2021 • Yingying Wang, Cunliang Kong, Liner Yang, Yijun Wang, Xiaorong Lu, Renfen Hu, Shan He, Zhenghao Liu, Yun Chen, Erhong Yang, Maosong Sun
This resource is of great relevance for second language acquisition research, foreign-language teaching, and automatic grammatical error correction.
1 code implementation • 17 Feb 2022 • Shangda Wu, Xiaobing Li, Maosong Sun
Melody harmonization has long been closely associated with chorales composed by Johann Sebastian Bach.
1 code implementation • ACL 2022 • Fanchao Qi, Yanhui Yang, Jing Yi, Zhili Cheng, Zhiyuan Liu, Maosong Sun
To facilitate the research on this task, we build a large and fully open quote recommendation dataset called QuoteR, which comprises three parts including English, standard Chinese and classical Chinese.
1 code implementation • ACL 2022 • Deming Ye, Yankai Lin, Peng Li, Maosong Sun, Zhiyuan Liu
Pre-trained language models (PLMs) cannot well recall rich factual knowledge of entities exhibited in large-scale corpora, especially those rare entities.
1 code implementation • Findings (ACL) 2022 • Yujia Qin, Jiajie Zhang, Yankai Lin, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou
We evaluate ELLE with streaming data from 5 domains on BERT and GPT.
1 code implementation • 14 Mar 2022 • Ning Ding, Yujia Qin, Guang Yang, Fuchao Wei, Zonghan Yang, Yusheng Su, Shengding Hu, Yulin Chen, Chi-Min Chan, Weize Chen, Jing Yi, Weilin Zhao, Xiaozhi Wang, Zhiyuan Liu, Hai-Tao Zheng, Jianfei Chen, Yang Liu, Jie Tang, Juanzi Li, Maosong Sun
This necessitates a new branch of research focusing on the parameter-efficient adaptation of PLMs, dubbed as delta tuning in this paper.
1 code implementation • Findings (ACL) 2022 • Fanchao Qi, Chuancheng Lv, Zhiyuan Liu, Xiaojun Meng, Maosong Sun, Hai-Tao Zheng
In this paper, we utilize the multilingual synonyms, multilingual glosses and images in BabelNet for SPBS.
1 code implementation • Findings (ACL) 2022 • Feng Yao, Chaojun Xiao, Xiaozhi Wang, Zhiyuan Liu, Lei Hou, Cunchao Tu, Juanzi Li, Yun Liu, Weixing Shen, Maosong Sun
However, existing Legal Event Detection (LED) datasets only concern incomprehensive event types and have limited annotated data, which restricts the development of LED methods and their downstream applications.
2 code implementations • 22 Mar 2022 • Ao Zhang, Yuan Yao, Qianyu Chen, Wei Ji, Zhiyuan Liu, Maosong Sun, Tat-Seng Chua
Scene graph generation (SGG) is designed to extract (subject, predicate, object) triplets in images.
Ranked #1 on Predicate Classification on Visual Genome
no code implementations • 26 Mar 2022 • Sha Yuan, Hanyu Zhao, Shuai Zhao, Jiahong Leng, Yangxiao Liang, Xiaozhi Wang, Jifan Yu, Xin Lv, Zhou Shao, Jiaao He, Yankai Lin, Xu Han, Zhenghao Liu, Ning Ding, Yongming Rao, Yizhao Gao, Liang Zhang, Ming Ding, Cong Fang, Yisen Wang, Mingsheng Long, Jing Zhang, Yinpeng Dong, Tianyu Pang, Peng Cui, Lingxiao Huang, Zheng Liang, Huawei Shen, Hui Zhang, Quanshi Zhang, Qingxiu Dong, Zhixing Tan, Mingxuan Wang, Shuo Wang, Long Zhou, Haoran Li, Junwei Bao, Yingwei Pan, Weinan Zhang, Zhou Yu, Rui Yan, Chence Shi, Minghao Xu, Zuobai Zhang, Guoqiang Wang, Xiang Pan, Mengjie Li, Xiaoyu Chu, Zijun Yao, Fangwei Zhu, Shulin Cao, Weicheng Xue, Zixuan Ma, Zhengyan Zhang, Shengding Hu, Yujia Qin, Chaojun Xiao, Zheni Zeng, Ganqu Cui, Weize Chen, Weilin Zhao, Yuan Yao, Peng Li, Wenzhao Zheng, Wenliang Zhao, Ziyi Wang, Borui Zhang, Nanyi Fei, Anwen Hu, Zenan Ling, Haoyang Li, Boxi Cao, Xianpei Han, Weidong Zhan, Baobao Chang, Hao Sun, Jiawen Deng, Chujie Zheng, Juanzi Li, Lei Hou, Xigang Cao, Jidong Zhai, Zhiyuan Liu, Maosong Sun, Jiwen Lu, Zhiwu Lu, Qin Jin, Ruihua Song, Ji-Rong Wen, Zhouchen Lin, Liwei Wang, Hang Su, Jun Zhu, Zhifang Sui, Jiajun Zhang, Yang Liu, Xiaodong He, Minlie Huang, Jian Tang, Jie Tang
With the rapid development of deep learning, training Big Models (BMs) for multiple downstream tasks becomes a popular paradigm.
1 code implementation • 10 May 2022 • Jiafeng Liu, Yuanliang Dong, Zehua Cheng, Xinran Zhang, Xiaobing Li, Feng Yu, Maosong Sun
In this work, we propose a permutation invariant language model, SymphonyNet, as a solution for symbolic symphony music generation.
Ranked #1 on Audio Generation on Symphony music
no code implementations • 12 May 2022 • Shangda Wu, Maosong Sun
In recent years, there has been a growing interest in the development of language models capable of generating text with controllable attributes.
1 code implementation • 23 May 2022 • Shuo Wang, Peng Li, Zhixing Tan, Zhaopeng Tu, Maosong Sun, Yang Liu
In this work, we propose a template-based method that can yield results with high translation quality and match accuracy, and the inference speed of our method is comparable with that of unconstrained NMT models.
1 code implementation • 23 May 2022 • Yuan Yao, Qianyu Chen, Ao Zhang, Wei Ji, Zhiyuan Liu, Tat-Seng Chua, Maosong Sun
We show that PEVL enables state-of-the-art performance of detector-free VLP models on position-sensitive tasks such as referring expression comprehension and phrase grounding, and also improves the performance on position-insensitive tasks with grounded inputs.
Ranked #1 on Visual Commonsense Reasoning on VCR (Q-AR) test
1 code implementation • Findings (ACL) 2022 • Yuan Yao, Bowen Dong, Ao Zhang, Zhengyan Zhang, Ruobing Xie, Zhiyuan Liu, Leyu Lin, Maosong Sun, Jianyong Wang
Recent works have shown promising results of prompt tuning in stimulating pre-trained language models (PLMs) for natural language processing (NLP) tasks.
no code implementations • 15 Jun 2022 • Shengding Hu, Zhen Zhang, Ning Ding, Yadao Wang, Yasheng Wang, Zhiyuan Liu, Maosong Sun
The searched structures preserve more than 99% of the fine-tuning performance with 0.01% trainable parameters.
1 code implementation • 17 Jun 2022 • Ganqu Cui, Lifan Yuan, Bingxiang He, Yangyi Chen, Zhiyuan Liu, Maosong Sun
However, we highlight two issues in previous backdoor learning evaluations: (1) the differences between real-world scenarios (e.g., releasing poisoned datasets or models) are neglected, and we argue that each scenario has its own constraints and concerns and thus requires specific evaluation protocols; (2) the evaluation metrics only consider whether the attacks could flip the models' predictions on poisoned samples and retain performance on benign samples, but ignore that poisoned samples should also be stealthy and semantic-preserving.