1 code implementation • EMNLP 2021 • Peijie Jiang, Dingkun Long, Yueheng Sun, Meishan Zhang, Guangwei Xu, Pengjun Xie
Self-training is one promising solution, the key challenge of which is to construct a set of high-quality pseudo training instances for the target domain.
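As background for the snippet above, a minimal self-training loop might look as follows (an illustrative Python sketch; the function names are hypothetical, not the paper's implementation): train on source-domain data, pseudo-label target-domain text, and keep only high-confidence predictions as pseudo instances.

```python
# Minimal self-training loop for domain adaptation (illustrative sketch;
# train_fn/predict_fn are hypothetical stand-ins, not the paper's code).

def self_train(train_fn, predict_fn, source_data, target_texts,
               threshold=0.9, rounds=3):
    """Iteratively grow the training set with confident target-domain
    predictions (pseudo training instances)."""
    labeled = list(source_data)            # (text, label) pairs
    for _ in range(rounds):
        model = train_fn(labeled)          # retrain on current data
        pseudo = []
        for text in target_texts:
            label, confidence = predict_fn(model, text)
            if confidence >= threshold:    # keep only high-quality guesses
                pseudo.append((text, label))
        labeled = list(source_data) + pseudo
    return train_fn(labeled)
```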
no code implementations • Findings (EMNLP) 2021 • Ying Li, Meishan Zhang, Zhenghua Li, Min Zhang, Zhefeng Wang, Baoxing Huai, Nicholas Jing Yuan
Thanks to the strong representation learning capability of deep learning, especially pre-training with language-model losses, dependency parsing has achieved large performance gains in the in-domain scenario, where abundant labeled training data is available for target domains.
1 code implementation • EMNLP 2021 • Ranran Zhen, Rui Wang, Guohong Fu, Chengguo Lv, Meishan Zhang
Opinion Role Labeling (ORL), which aims to identify the key roles of an opinion, has received increasing interest.
1 code implementation • ACL 2022 • Nan Yu, Meishan Zhang, Guohong Fu, Min Zhang
Pre-trained language models (PLMs) have shown great potential in natural language processing (NLP), including rhetorical structure theory (RST) discourse parsing. Current PLMs are obtained by sentence-level pre-training, which differs from the basic processing unit of RST parsing, i.e., the elementary discourse unit (EDU). To this end, we propose a second-stage EDU-level pre-training approach in this work, which presents two novel tasks to continually learn effective EDU representations on top of well pre-trained language models. Concretely, the two tasks are (1) next EDU prediction (NEP) and (2) discourse marker prediction (DMP). We take a state-of-the-art transition-based neural parser as our baseline and adapt it with a light bi-gram EDU modification to effectively exploit the EDU-level pre-trained representations. Experimental results on a benchmark dataset show that our method is highly effective, leading to a 2.1-point improvement in F1-score. All codes and pre-trained models will be released publicly to facilitate future studies.
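For concreteness, both pre-training tasks reduce to classification over pooled EDU representations; below is a schematic sketch (hypothetical PyTorch code, not the released implementation).

```python
import torch
import torch.nn as nn

class EDUPretrainingHeads(nn.Module):
    """Schematic heads for the two second-stage tasks (illustrative only).

    NEP: binary choice -- is candidate EDU b the true successor of EDU a?
    DMP: multi-class   -- which discourse marker connects EDUs a and b?
    """
    def __init__(self, hidden=768, num_markers=100):
        super().__init__()
        self.nep = nn.Linear(2 * hidden, 2)            # next-EDU prediction
        self.dmp = nn.Linear(2 * hidden, num_markers)  # marker prediction

    def forward(self, edu_a, edu_b):
        pair = torch.cat([edu_a, edu_b], dim=-1)       # pooled EDU vectors
        return self.nep(pair), self.dmp(pair)
```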
no code implementations • CCL 2020 • Meishan Zhang, Yue Zhang
Recent advances of multilingual word representations weaken the input divergences across languages, making cross-lingual transfer similar to the monolingual cross-domain and semi-supervised settings.
no code implementations • 28 Nov 2023 • Longhui Zhang, Yanzhao Zhang, Dingkun Long, Pengjun Xie, Meishan Zhang, Min Zhang
Text ranking is a critical task in various information retrieval applications, and the recent success of Large Language Models (LLMs) in natural language processing has sparked interest in their application to text ranking.
1 code implementation • 5 Nov 2023 • Jianling Li, Meishan Zhang, Peiming Guo, Min Zhang, Yue Zhang
Our experimental results demonstrate that self-training for constituency parsing, equipped with an LLM, outperforms traditional methods regardless of the LLM's performance.
1 code implementation • 12 Oct 2023 • Xin Zhang, Zehan Li, Yanzhao Zhang, Dingkun Long, Pengjun Xie, Meishan Zhang, Min Zhang
As such cases span from English to other natural or programming languages, from retrieval to classification and beyond, it is desirable to build a unified embedding model rather than dedicated ones for each scenario.
1 code implementation • ICCV 2023 • Yibo Cui, Liang Xie, Yakun Zhang, Meishan Zhang, Ye Yan, Erwei Yin
To address this problem, we propose a novel Grounded Entity-Landmark Adaptive (GELA) pre-training paradigm for VLN tasks.
no code implementations • 9 Aug 2023 • Yu Zhao, Hao Fei, Yixin Cao, Bobo Li, Meishan Zhang, Jianguo Wei, Min Zhang, Tat-Seng Chua
A scene-event mapping mechanism is first designed to bridge the gap between the underlying scene structure and the high-level event semantic structure, resulting in an overall hierarchical scene-event (termed ICE) graph structure.
no code implementations • 7 Aug 2023 • Zehan Li, Xin Zhang, Yanzhao Zhang, Dingkun Long, Pengjun Xie, Meishan Zhang
We present GTE, a general-purpose text embedding model trained with multi-stage contrastive learning.
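Multi-stage contrastive training of this kind typically optimizes an in-batch InfoNCE objective; the sketch below shows that standard loss (an assumed formulation, not necessarily GTE's exact recipe).

```python
import torch
import torch.nn.functional as F

def in_batch_infonce(queries, passages, temperature=0.05):
    """Contrastive loss over a batch of (query, passage) embedding pairs.

    queries, passages: [batch, dim] tensors; pair i is the positive,
    and all other passages in the batch serve as negatives.
    """
    q = F.normalize(queries, dim=-1)
    p = F.normalize(passages, dim=-1)
    logits = q @ p.T / temperature                     # cosine similarities
    labels = torch.arange(q.size(0), device=q.device)  # diagonal is positive
    return F.cross_entropy(logits, labels)
```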
no code implementations • 3 Aug 2023 • Hao Fei, Meishan Zhang, Min Zhang, Tat-Seng Chua
Structured Natural Language Processing (XNLP) is an important subset of NLP that entails understanding the underlying semantic or syntactic structure of texts, which serves as a foundational component for many downstream applications.
1 code implementation • 25 Jul 2023 • Hexuan Deng, Xin Zhang, Meishan Zhang, Xuebo Liu, Min Zhang
In this paper, we conduct a holistic exploration of Universal Decompositional Semantics (UDS) parsing.
1 code implementation • 21 May 2023 • Gongyao Jiang, Shuang Liu, Meishan Zhang, Min Zhang
Dialogue-level dependency parsing has received insufficient attention, especially for Chinese.
no code implementations • 20 May 2023 • Hao Fei, Meishan Zhang, Min Zhang, Tat-Seng Chua
The latest efforts on cross-lingual relation extraction (XRE) aggressively leverage language-consistent structural features from the universal dependency (UD) resource, but they may suffer heavily from biased transfer (e.g., either target-biased or source-biased) due to the inevitable linguistic disparity between languages.
1 code implementation • 20 May 2023 • Hao Fei, Qian Liu, Meishan Zhang, Min Zhang, Tat-Seng Chua
In this work, we investigate a more realistic unsupervised multimodal machine translation (UMMT) setup, inference-time image-free UMMT, where the model is trained with source-text image pairs, and tested with only source-text inputs.
1 code implementation • 19 May 2023 • Yu Zhao, Hao Fei, Wei Ji, Jianguo Wei, Meishan Zhang, Min Zhang, Tat-Seng Chua
With an external 3D scene extractor, we obtain the 3D objects and scene features for input images, based on which we construct a target object-centered 3D spatial scene graph (Go3D-S2G), such that we model the spatial semantics of target objects within the holistic 3D scenes.
no code implementations • 19 Apr 2023 • Hao Fei, Tat-Seng Chua, Chenliang Li, Donghong Ji, Meishan Zhang, Yafeng Ren
In this study, we propose to enhance the ABSA robustness by systematically rethinking the bottlenecks from all possible angles, including model, data, and training.
Aspect-Based Sentiment Analysis (ABSA) • Contrastive Learning
1 code implementation • 13 Apr 2023 • Hao Fei, Shengqiong Wu, Jingye Li, Bobo Li, Fei Li, Libo Qin, Meishan Zhang, Min Zhang, Tat-Seng Chua
The latest studies have revealed the great potential of universally modeling all typical information extraction tasks (UIE) with one generative language model (GLM), where various IE predictions are unified into a linearized hierarchical expression under the GLM.
1 code implementation • 20 Feb 2023 • Xiang Wei, Xingyu Cui, Ning Cheng, Xiaobin Wang, Xin Zhang, Shen Huang, Pengjun Xie, Jinan Xu, Yufeng Chen, Meishan Zhang, Yong Jiang, Wenjuan Han
Zero-shot information extraction (IE) aims to build IE systems from unannotated text.
1 code implementation • 2 Dec 2022 • Hexuan Deng, Liang Ding, Xuebo Liu, Meishan Zhang, DaCheng Tao, Min Zhang
Preliminary experiments on En-Zh and En-Ja news domain corpora demonstrate that monolingual data can significantly improve translation quality (e.g., +3.15 BLEU on En-Zh).
2 code implementations • 27 Oct 2022 • Peijie Jiang, Dingkun Long, Yanzhao Zhang, Pengjun Xie, Meishan Zhang, Min Zhang
We apply BABERT for feature induction of Chinese sequence labeling tasks.
Ranked #1 on Chinese Word Segmentation on MSRA
Chinese Named Entity Recognition • Chinese Word Segmentation
1 code implementation • 23 Oct 2022 • Panzhong Lu, Xin Zhang, Meishan Zhang, Min Zhang
First, we construct a dataset of phrase grounding with both noun phrases and pronouns to image regions.
1 code implementation • 20 Oct 2022 • Yu Zhao, Jianguo Wei, Zhichao Lin, Yueheng Sun, Meishan Zhang, Min Zhang
Accordingly, we manually annotate a dataset to facilitate the investigation of the newly-introduced task and build several benchmark encoder-decoder models by using VL-BART and VL-T5 as backbones.
no code implementations • 6 Oct 2022 • Hao Fei, Shengqiong Wu, Meishan Zhang, Yafeng Ren, Donghong Ji
In this work, we investigate the integration of a latent graph for CSRL.
1 code implementation • COLING 2022 • Xin Zhang, Yong Jiang, Xiaobin Wang, Xuming Hu, Yueheng Sun, Pengjun Xie, Meishan Zhang
Successful machine-learning-based Named Entity Recognition models can fail on texts from some special domains, for instance, Chinese addresses and e-commerce titles, which require adequate background knowledge.
1 code implementation • NAACL 2022 • Linzhi Wu, Pengjun Xie, Jie zhou, Meishan Zhang, Chunping Ma, Guangwei Xu, Min Zhang
Prior research has mainly resorted to heuristic rule-based constraints to reduce the noise for specific self-augmentation methods individually.
1 code implementation • ACL 2022 • Xin Zhang, Guangwei Xu, Yueheng Sun, Meishan Zhang, Xiaobin Wang, Min Zhang
Recent works of opinion expression identification (OEI) rely heavily on the quality and scale of the manually-constructed training corpus, which could be extremely difficult to satisfy.
1 code implementation • COLING 2022 • Zebin Ou, Meishan Zhang, Yue Zhang
Word ordering is a constrained language generation task taking unordered words as input.
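The constraint is that the output must be a permutation of the input words; a toy sketch of search under this constraint follows (illustrative only; real systems search incrementally with a learned scorer rather than enumerating permutations).

```python
from itertools import permutations

def order_words(words, score_fn):
    """Brute-force word ordering: return the permutation the scorer prefers.

    score_fn(sequence) -> float, e.g. a language-model log-probability.
    """
    return max(permutations(words), key=score_fn)

# Toy usage: a scorer that rewards two known bigrams.
bigrams = {("the", "dog"): 1.0, ("dog", "ran"): 1.0}
score = lambda seq: sum(bigrams.get(pair, 0.0) for pair in zip(seq, seq[1:]))
print(order_words(["ran", "dog", "the"], score))  # ('the', 'dog', 'ran')
```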
1 code implementation • 17 Feb 2022 • Boli Chen, Guangwei Xu, Xiaobin Wang, Pengjun Xie, Meishan Zhang, Fei Huang
Named Entity Recognition (NER) from speech is one of the Spoken Language Understanding (SLU) tasks, aiming to extract semantic information from the speech signal.
Automatic Speech Recognition (ASR)
no code implementations • 11 Jan 2022 • Yuting Yang, Pei Huang, Feifei Ma, Juan Cao, Meishan Zhang, Jian Zhang, Jintao Li
Deep-learning-based NLP models are found to be vulnerable to word substitution perturbations.
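A word-substitution perturbation of this kind can be sketched in a few lines (the synonym table and search below are hypothetical; practical attacks pick substitutions via embeddings or WordNet and search for ones that flip the model's prediction).

```python
import itertools

# Hypothetical synonym table; real attacks derive candidates from
# embeddings or lexical resources such as WordNet.
SYNONYMS = {"good": ["great", "fine"], "movie": ["film"]}

def substitution_candidates(tokens):
    """Yield every sentence obtainable by swapping words for synonyms."""
    options = [[t] + SYNONYMS.get(t, []) for t in tokens]
    for combo in itertools.product(*options):
        if list(combo) != tokens:
            yield list(combo)

def attack(tokens, predict_fn, original_label):
    """Return the first substitution that changes the model's prediction."""
    for candidate in substitution_candidates(tokens):
        if predict_fn(candidate) != original_label:
            return candidate
    return None  # robust to this synonym table
```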
1 code implementation • 19 Dec 2021 • Jingye Li, Hao Fei, Jiang Liu, Shengqiong Wu, Meishan Zhang, Chong Teng, Donghong Ji, Fei Li
So far, named entity recognition (NER) has involved three major types: flat, overlapped (aka nested), and discontinuous NER.
Ranked #2 on Chinese Named Entity Recognition on OntoNotes 4
1 code implementation • 5 Oct 2021 • Shengqiong Wu, Hao Fei, Fei Li, Donghong Ji, Meishan Zhang, Yijiang Liu, Chong Teng
Unified opinion role labeling (ORL) aims to detect all possible opinion structures of 'opinion-holder-target' in one shot, given a text.
Ranked #1 on Fine-Grained Opinion Analysis on MPQA (F1 (Opinion) metric)
1 code implementation • EMNLP 2021 • Zhichao Lin, Yueheng Sun, Meishan Zhang
The three subtasks are closely related, yet previous studies model them individually, which ignores their internal connections and also induces an error propagation problem.
1 code implementation • ACL 2021 • Fei Li, Zhichao Lin, Meishan Zhang, Donghong Ji
Second, we perform relation classification to judge whether a given pair of entity fragments is overlapping or in succession.
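A simplified view of the decoding step implied by this snippet (illustrative Python, not the paper's code; real decoding also chains longer fragment sequences):

```python
from itertools import combinations

def assemble_entities(fragments, relation_fn):
    """Assemble entities from predicted fragments and pairwise relations.

    fragments: list of (start, end) spans.
    relation_fn(a, b): returns 'succession', 'overlapping', or None.
    Simplification: only two-fragment discontinuous entities are built.
    """
    entities, used = [], set()
    for a, b in combinations(sorted(fragments), 2):
        if relation_fn(a, b) == "succession":   # parts of one entity
            entities.append((a, b))
            used.update([a, b])
    # Remaining fragments stand alone as contiguous entities.
    entities += [(f,) for f in fragments if f not in used]
    return entities
```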
1 code implementation • ACL 2021 • Xin Zhang, Guangwei Xu, Yueheng Sun, Meishan Zhang, Pengjun Xie
Crowdsourcing is regarded as one promising solution for effective supervised learning, aiming to build large-scale annotated training data with crowd workers.
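For background, the simplest way to turn crowd labels into training data is token-level majority voting, sketched below; note the paper itself goes beyond such aggregation and learns from the raw worker annotations.

```python
from collections import Counter

def majority_vote(worker_taggings):
    """Aggregate per-token BIO tags from several crowd workers.

    worker_taggings: list of equal-length tag sequences, one per worker.
    Returns the most common tag at each position (baseline only; the
    paper instead learns from the unaggregated worker labels).
    """
    return [Counter(tags).most_common(1)[0][0]
            for tags in zip(*worker_taggings)]

# Example: three workers tagging four tokens.
print(majority_vote([["B", "I", "O", "O"],
                     ["B", "O", "O", "O"],
                     ["B", "I", "O", "B"]]))  # ['B', 'I', 'O', 'O']
```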
1 code implementation • 2 Jan 2021 • Hao Fei, Meishan Zhang, Bobo Li, Donghong Ji
It jointly performs the two subtasks of SRL: predicate identification and argument role labeling (see the sketch below).
Ranked #1 on Semantic Role Labeling on CoNLL-2009
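The joint formulation above can be pictured as two heads over shared encoder states, one scoring predicates and one scoring predicate-argument role pairs; the sketch below is a generic illustration (hypothetical PyTorch code, not the paper's transition-based model).

```python
import torch
import torch.nn as nn

class JointSRLHeads(nn.Module):
    """Schematic joint SRL heads (illustrative only): score every token as
    a predicate, and every (predicate, argument) token pair for a role."""
    def __init__(self, hidden=768, num_roles=30):
        super().__init__()
        self.pred = nn.Linear(hidden, 2)                   # predicate or not
        self.role = nn.Bilinear(hidden, hidden, num_roles)

    def forward(self, states):                             # [seq, hidden]
        n = states.size(0)
        pred_logits = self.pred(states)                    # [seq, 2]
        a = states.unsqueeze(1).expand(n, n, -1).reshape(n * n, -1)
        b = states.unsqueeze(0).expand(n, n, -1).reshape(n * n, -1)
        role_logits = self.role(a, b).view(n, n, -1)       # [seq, seq, roles]
        return pred_logits, role_logits
```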
no code implementations • COLING 2020 • Jingye Li, Donghong Ji, Fei Li, Meishan Zhang, Yijiang Liu
Emotion detection in conversations (EDC) aims to detect the emotion of each utterance in conversations with multiple speakers.
Ranked #17 on Emotion Recognition in Conversation on EmoryNLP
no code implementations • 24 Aug 2020 • Hao Fei, Meishan Zhang, Fei Li, Donghong Ji
In this paper, we fill the gap of cross-lingual SRL by proposing an end-to-end SRL model that incorporates a variety of universal features and transfer methods.
no code implementations • 19 Jun 2020 • Meishan Zhang
This article briefly reviews the representative models of constituent parsing and dependency parsing, and also dependency graph parsing with rich semantics.
no code implementations • ACL 2020 • Qiankun Fu, Yue Zhang, Jiangming Liu, Meishan Zhang
Discourse representation tree structure (DRTS) parsing is a novel semantic parsing task that has attracted attention only recently.
1 code implementation • ACL 2020 • Hao Fei, Meishan Zhang, Donghong Ji
Much research effort has been devoted to semantic role labeling (SRL), which is crucial for natural language understanding.
no code implementations • COLING 2020 • Yijiang Liu, Meishan Zhang, Donghong Ji
In this paper, we present Chinese lexical fusion recognition, a new task which could be regarded as one kind of coreference recognition.
no code implementations • 3 Mar 2020 • Jingye Li, Meishan Zhang, Donghong Ji, Yijiang Liu
Conversational emotion recognition (CER) has attracted increasing interest in the natural language processing (NLP) community.
Ranked #20 on Emotion Recognition in Conversation on EmoryNLP
1 code implementation • 22 Jul 2019 • Qingrong Xia, Zhenghua Li, Min Zhang, Meishan Zhang, Guohong Fu, Rui Wang, Luo Si
Semantic role labeling (SRL), also known as shallow semantic parsing, is an important yet challenging task in NLP.
1 code implementation • NAACL 2019 • Meishan Zhang, Peili Liang, Guohong Fu
Opinion role labeling (ORL) is an important task for fine-grained opinion mining, which identifies important opinion arguments such as holder and target for a given opinion trigger.
Ranked #1 on Fine-Grained Opinion Analysis on MPQA (using extra training data)
no code implementations • NAACL 2019 • Meishan Zhang, Zhenghua Li, Guohong Fu, Min Zhang
Syntax has been demonstrated to be highly effective in neural machine translation (NMT).
Ranked #8 on Machine Translation on IWSLT2015 English-Vietnamese
1 code implementation • COLING 2018 • Nan Yu, Meishan Zhang, Guohong Fu
Syntax has been a useful source of information for statistical RST discourse parsing.
Ranked #7 on Discourse Parsing on RST-DT
no code implementations • 16 Jan 2018 • YaoSheng Yang, Meishan Zhang, Wenliang Chen, Wei zhang, Haofen Wang, Min Zhang
To quickly obtain new labeled data, crowdsourcing is an alternative that offers lower cost and faster turnaround.
Chinese Named Entity Recognition
no code implementations • 11 Jan 2018 • Zhengqiu He, Wenliang Chen, Zhenghua Li, Meishan Zhang, Wei zhang, Min Zhang
First, we encode the context of entities on a dependency tree as sentence-level entity embedding based on tree-GRU.
1 code implementation • EMNLP 2017 • Shaolei Wang, Wanxiang Che, Yue Zhang, Meishan Zhang, Ting Liu
In this paper, we model the problem of disfluency detection using a transition-based framework, which incrementally constructs and labels the disfluency chunk of input sentences using a new transition system without syntax information.
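A minimal transition system of this flavor can be sketched as follows (illustrative; the action set and oracle here are hypothetical stand-ins for the paper's transition system).

```python
# Schematic transition system for disfluency chunking (illustrative only).
# Actions consume the buffer left to right, labeling each token as fluent
# ('OUT') or as part of a disfluency chunk ('DEL').

def run_transitions(tokens, action_fn):
    """action_fn(state) -> 'OUT' (fluent) or 'DEL' (inside a disfluency)."""
    fluent, disfluent, buffer = [], [], list(tokens)
    while buffer:
        state = (tuple(fluent), tuple(buffer))
        word = buffer.pop(0)
        if action_fn(state) == "DEL":
            disfluent.append(word)    # word belongs to a disfluency chunk
        else:
            fluent.append(word)       # word survives into the clean output
    return fluent, disfluent

# Toy oracle removing the repeated "I": "I I want to go" -> "I want to go".
toks = ["I", "I", "want", "to", "go"]
oracle = lambda s: "DEL" if s[1][0] == "I" and len(s[1]) == 5 else "OUT"
print(run_transitions(toks, oracle))  # (['I', 'want', 'to', 'go'], ['I'])
```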
no code implementations • EMNLP 2017 • Meishan Zhang, Yue Zhang, Guohong Fu
Neural networks have shown promising results for relation extraction.
Ranked #1 on Relation Extraction on ACE 2005 (Sentence Encoder metric)
1 code implementation • 24 Aug 2017 • Jie Yang, Zhiyang Teng, Meishan Zhang, Yue Zhang
Our results on standard benchmarks show that state-of-the-art neural models can give accuracies comparable to the best discrete models in the literature for most tasks, and that combining discrete and neural features consistently yields better results.
no code implementations • Pattern Recognition Letters 2017 • Fei Li, Meishan Zhang, Bo Tian, Bo Chen, Guohong Fu, Donghong Ji
We evaluate our models on two datasets for recognizing regular and irregular biomedical entities.
no code implementations • 25 Apr 2017 • Liner Yang, Meishan Zhang, Yang Liu, Nan Yu, Maosong Sun, Guohong Fu
While part-of-speech (POS) tagging and dependency parsing are observed to be closely related, existing work on joint modeling with manually crafted feature templates suffers from the feature sparsity and incompleteness problems.
1 code implementation • COLING 2016 • Meishan Zhang, Yue Zhang, Guohong Fu
We investigate the use of neural networks for tweet sarcasm detection, and compare the effects of continuous automatic features with those of discrete manual features.
no code implementations • 27 Aug 2016 • Fei Li, Meishan Zhang, Guohong Fu, Tao Qian, Donghong Ji
This model divides a sentence or text segment into five parts, namely two target entities and their three contexts.
no code implementations • LREC 2016 • Meishan Zhang, Jie Yang, Zhiyang Teng, Yue Zhang
We present a light-weight machine learning tool for NLP research.