1 code implementation • 1 Apr 2022 • Ziyun Xu, Chengyu Wang, Minghui Qiu, Fuli Luo, Runxin Xu, Songfang Huang, Jun Huang
Pre-trained Language Models (PLMs) have achieved remarkable performance on various language understanding tasks in IR systems, which typically require fine-tuning on labeled training data.
3 code implementations • EMNLP 2021 • Runxin Xu, Fuli Luo, Zhiyuan Zhang, Chuanqi Tan, Baobao Chang, Songfang Huang, Fei Huang
Recent pre-trained language models have grown from millions to billions of parameters.
2 code implementations • 14 Dec 2021 • Runxin Xu, Fuli Luo, Chengyu Wang, Baobao Chang, Jun Huang, Songfang Huang, Fei Huang
Unified under contrastive learning, CAP enables the pruned model to learn task-agnostic knowledge from the pre-trained model and task-specific knowledge from the fine-tuned model.
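The snippet above is terse; as a rough, hypothetical illustration of contrastive distillation from two teachers (an InfoNCE-style objective with in-batch negatives — the function name, shapes, and temperature are assumptions, not the paper's actual CAP implementation):

```python
import torch
import torch.nn.functional as F

def contrastive_distill_loss(student_emb, pretrained_emb, finetuned_emb, temperature=0.1):
    """Pull the pruned (student) representations toward both teachers:
    the pre-trained model (task-agnostic knowledge) and the fine-tuned
    model (task-specific knowledge), averaging the two contrastive terms."""
    s = F.normalize(student_emb, dim=-1)
    losses = []
    for teacher_emb in (pretrained_emb, finetuned_emb):
        t = F.normalize(teacher_emb, dim=-1)
        logits = s @ t.T / temperature        # (batch, batch) similarity matrix
        targets = torch.arange(s.size(0))     # positives lie on the diagonal
        losses.append(F.cross_entropy(logits, targets))
    return sum(losses) / len(losses)
```

Each example's own teacher representation acts as the positive; every other example in the batch serves as a negative.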
1 code implementation • 5 Feb 2024 • Zhihong Shao, Peiyi Wang, Qihao Zhu, Runxin Xu, Junxiao Song, Mingchuan Zhang, Y. K. Li, Y. Wu, Daya Guo
Mathematical reasoning poses a significant challenge for language models due to its complex and structured nature.
Ranked #11 on Math Word Problem Solving on MATH (using extra training data)
2 code implementations • ACL 2021 • Runxin Xu, Tianyu Liu, Lei Li, Baobao Chang
Existing methods are not effective due to two challenges of this task: a) the target event arguments are scattered across sentences; b) the correlation among events in a document is non-trivial to model.
Ranked #2 on Document-level Event Extraction on ChFinAnn
2 code implementations • EMNLP 2020 • Shuang Zeng, Runxin Xu, Baobao Chang, Lei Li
Document-level relation extraction aims to extract relations among entities within a document.
Ranked #12 on Relation Extraction on DocRED
1 code implementation • NAACL 2022 • Peiyi Wang, Runxin Xu, Tianyu Liu, Qingyu Zhou, Yunbo Cao, Baobao Chang, Zhifang Sui
Few-Shot Sequence Labeling (FSSL) is a canonical paradigm for tagging models, e.g., named entity recognition and slot filling, to generalize to an emerging, resource-scarce domain.
Ranked #6 on Few-shot NER on Few-NERD (INTER)
2 code implementations • ACL 2022 • Liang Chen, Runxin Xu, Baobao Chang
Label smoothing and vocabulary sharing are two widely used techniques in neural machine translation models.
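Label smoothing itself is a standard technique and can be sketched independently of the paper (a minimal version in the fairseq style, spreading the smoothing mass uniformly over the vocabulary — not the authors' code):

```python
import torch
import torch.nn.functional as F

def label_smoothed_nll(logits, target, eps=0.1):
    """Cross-entropy with label smoothing: the gold token keeps weight
    1 - eps, and the remaining eps mass is spread uniformly over the vocab."""
    log_probs = F.log_softmax(logits, dim=-1)
    nll = -log_probs.gather(-1, target.unsqueeze(-1)).squeeze(-1)  # gold-token term
    smooth = -log_probs.mean(dim=-1)                               # uniform term
    return ((1.0 - eps) * nll + eps * smooth).mean()
```

With `eps=0` this reduces exactly to ordinary cross-entropy.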
1 code implementation • NAACL 2022 • Runxin Xu, Peiyi Wang, Tianyu Liu, Shuang Zeng, Baobao Chang, Zhifang Sui
In this paper, we focus on extracting event arguments from an entire document, which mainly faces two critical problems: a) the long-distance dependency between trigger and arguments over sentences; b) the distracting context towards an event in the document.
2 code implementations • Findings (NAACL) 2022 • Liang Chen, Peiyi Wang, Runxin Xu, Tianyu Liu, Zhifang Sui, Baobao Chang
As Abstract Meaning Representation (AMR) implicitly involves compound semantic annotations, we hypothesize auxiliary tasks which are semantically or formally related can better enhance AMR parsing.
Ranked #7 on AMR Parsing on LDC2020T02 (using extra training data)
1 code implementation • 29 Aug 2021 • Peiyi Wang, Runxin Xu, Tianyu Liu, Damai Dai, Baobao Chang, Zhifang Sui
However, we find that they suffer from trigger biases, i.e., a statistical homogeneity between certain trigger words and target event types, which we summarize as trigger overlapping and trigger separability.
1 code implementation • NAACL 2022 • Ce Zheng, Xudong Chen, Runxin Xu, Baobao Chang
In this paper, we propose a Knowledge-guided Incremental semantic parser with Double-graph (KID).
1 code implementation • 12 Jun 2020 • Xunpeng Huang, Runxin Xu, Hao Zhou, Zhe Wang, Zhengyang Liu, Lei Li
Due to its simplicity and outstanding ability to generalize, stochastic gradient descent (SGD) is still the most widely used optimization method despite its slow convergence.
no code implementations • 12 Jun 2020 • Xunpeng Huang, Hao Zhou, Runxin Xu, Zhe Wang, Lei Li
Adaptive gradient methods have attracted much attention in the machine learning community due to their high efficiency.
no code implementations • ACL 2020 • Runxin Xu, Jun Cao, Mingxuan Wang, Jiaze Chen, Hao Zhou, Ying Zeng, Yu-Ping Wang, Li Chen, Xiang Yin, Xijin Zhang, Songcheng Jiang, Yuxuan Wang, Lei Li
This paper presents Xiaomingbot, an intelligent, multilingual, and multimodal software robot equipped with four integral capabilities: news generation, news translation, news reading, and avatar animation.
no code implementations • WMT (EMNLP) 2020 • Runxin Xu, Zhuo Zhi, Jun Cao, Mingxuan Wang, Lei Li
In this paper, we describe our submissions to the WMT20 shared task on parallel corpus filtering and alignment for low-resource conditions.
no code implementations • 21 Jun 2021 • Peiyi Wang, Tianyu Liu, Damai Dai, Runxin Xu, Baobao Chang, Zhifang Sui
The table encoder extracts sentiment at the token-pair level, so that compositional features between targets and opinions can be easily captured.
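A token-pair table of the kind described can be illustrated generically (a hypothetical sketch of the data structure only, not the authors' actual encoder): each cell (i, j) concatenates the representations of tokens i and j, so a classifier over cells can score target-opinion pairs directly.

```python
import torch

def token_pair_table(token_states):
    """Build a (seq, seq, 2*hidden) table whose cell (i, j) concatenates
    the hidden states of tokens i and j."""
    n, h = token_states.shape
    rows = token_states.unsqueeze(1).expand(n, n, h)  # token i broadcast over j
    cols = token_states.unsqueeze(0).expand(n, n, h)  # token j broadcast over i
    return torch.cat([rows, cols], dim=-1)            # (n, n, 2h)
```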
no code implementations • ACL 2022 • Yanyang Li, Fuli Luo, Runxin Xu, Songfang Huang, Fei Huang, LiWei Wang
Structured pruning has been extensively studied on monolingual pre-trained language models and is yet to be fully evaluated on their multilingual counterparts.
no code implementations • 17 Apr 2022 • Cunxiang Wang, Fuli Luo, Yanyang Li, Runxin Xu, Fei Huang, Yue Zhang
Pre-trained language models (PLMs) like BERT have driven significant progress on various downstream NLP tasks.
no code implementations • ACL 2022 • Runxin Xu, Fuli Luo, Baobao Chang, Songfang Huang, Fei Huang
The emergence of multilingual pre-trained language models makes it possible to adapt to target languages with only a few labeled examples. However, vanilla fine-tuning tends to achieve degenerated and unstable results, owing to Language Interference among different languages and Parameter Overload under few-sample transfer learning scenarios. To address these two problems, we propose S^4-Tuning, a Simple Cross-lingual Sub-network Tuning method.
no code implementations • 24 May 2023 • Shoujie Tong, Heming Xia, Damai Dai, Runxin Xu, Tianyu Liu, Binghuai Lin, Yunbo Cao, Zhifang Sui
Also, Bi-Drop needs only one mini-batch to estimate the sub-net, so it achieves higher utility of the training data.
no code implementations • 1 Mar 2024 • Lei Li, Yuqi Wang, Runxin Xu, Peiyi Wang, Xiachong Feng, Lingpeng Kong, Qi Liu
To fill this gap, we introduce Multimodal ArXiv, consisting of ArXivCap and ArXivQA, for enhancing LVLMs' scientific comprehension.