1 code implementation • EMNLP 2021 • Peijie Jiang, Dingkun Long, Yueheng Sun, Meishan Zhang, Guangwei Xu, Pengjun Xie
Self-training is one promising solution for this, where the key challenge is to construct a set of high-quality pseudo training instances for the target domain.
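As a rough illustration of this setup, the sketch below shows a generic self-training loop that keeps only high-confidence pseudo labels; the scikit-learn classifier, threshold, and selection rule are illustrative assumptions, not the paper's actual pseudo-instance construction.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X_src, y_src, X_tgt, rounds=3, threshold=0.9):
    """Iteratively add confident target-domain predictions as pseudo-labeled data."""
    model = LogisticRegression(max_iter=1000)
    X_train, y_train = X_src, y_src
    for _ in range(rounds):
        model.fit(X_train, y_train)
        probs = model.predict_proba(X_tgt)
        keep = probs.max(axis=1) >= threshold            # high-confidence instances only
        if not keep.any():
            break
        y_pseudo = model.classes_[probs[keep].argmax(axis=1)]
        X_train = np.vstack([X_src, X_tgt[keep]])        # source data + pseudo target data
        y_train = np.concatenate([y_src, y_pseudo])
    return model

# toy usage with random data
rng = np.random.default_rng(0)
X_src = rng.normal(size=(100, 5)); y_src = (X_src[:, 0] > 0).astype(int)
X_tgt = rng.normal(loc=0.3, size=(50, 5))
clf = self_train(X_src, y_src, X_tgt)
```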
1 code implementation • EMNLP 2021 • Ranran Zhen, Rui Wang, Guohong Fu, Chengguo Lv, Meishan Zhang
Opinion Role Labeling (ORL), aiming to identify the key roles of opinion, has received increasing interest.
1 code implementation • ACL 2022 • Nan Yu, Meishan Zhang, Guohong Fu, Min Zhang
Pre-trained language models (PLMs) have shown great potential in natural language processing (NLP), including rhetorical structure theory (RST) discourse parsing. Current PLMs are obtained by sentence-level pre-training, which differs from the basic processing unit of RST, i.e., the elementary discourse unit (EDU). To this end, we propose a second-stage EDU-level pre-training approach in this work, which presents two novel tasks to continually learn effective EDU representations on top of well pre-trained language models. Concretely, the two tasks are (1) next EDU prediction (NEP) and (2) discourse marker prediction (DMP). We take a state-of-the-art transition-based neural parser as the baseline and adapt it with a light bi-gram EDU modification to effectively exploit the EDU-level pre-trained representations. Experimental results on a benchmark dataset show that our method is highly effective, leading to a 2.1-point improvement in F1-score. All codes and pre-trained models will be released publicly to facilitate future studies.
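As a minimal sketch of the NEP objective, the following shows how next-EDU-prediction pairs might be built from an EDU-segmented document; the sampling scheme and function names are assumptions for illustration, not the paper's exact recipe.

```python
import random

def build_nep_pairs(edus, neg_per_pos=1, seed=0):
    """Build (edu_a, edu_b, label) pairs for a next-EDU-prediction objective.

    label 1: edu_b immediately follows edu_a in the document;
    label 0: edu_b is a randomly sampled non-adjacent EDU.
    """
    rng = random.Random(seed)
    pairs = []
    for i in range(len(edus) - 1):
        pairs.append((edus[i], edus[i + 1], 1))           # positive: true next EDU
        for _ in range(neg_per_pos):
            j = rng.randrange(len(edus))
            while j in (i, i + 1):                        # avoid the anchor and its true next EDU
                j = rng.randrange(len(edus))
            pairs.append((edus[i], edus[j], 0))           # negative: random EDU
    return pairs

edus = ["Although it was raining ,", "we went hiking ,", "because the trail was short ."]
for a, b, y in build_nep_pairs(edus):
    print(y, "|", a, "=>", b)
```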
no code implementations • Findings (EMNLP) 2021 • Ying Li, Meishan Zhang, Zhenghua Li, Min Zhang, Zhefeng Wang, Baoxing Huai, Nicholas Jing Yuan
Thanks to the strong representation learning capability of deep learning, especially pre-training techniques with a language model loss, dependency parsing has achieved a great performance boost in the in-domain scenario with abundant labeled training data for target domains.
no code implementations • CCL 2020 • Meishan Zhang, Yue Zhang
Recent advances of multilingual word representations weaken the input divergences across languages, making cross-lingual transfer similar to the monolingual cross-domain and semi-supervised settings.
1 code implementation • 3 Dec 2024 • Haidong Xu, Meishan Zhang, Hao Ju, Zhedong Zheng, Hongyuan Zhu, Erik Cambria, Min Zhang, Hao Fei
T3DEM is the most crucial step in determining the quality of Emo3D generation and encompasses three key challenges: Expression Diversity, Emotion-Content Consistency, and Expression Fluidity.
no code implementations • 20 Oct 2024 • Yu Zhao, Hao Fei, Xiangtai Li, Libo Qin, Jiayi Ji, Hongyuan Zhu, Meishan Zhang, Min Zhang, Jianguo Wei
In the visual spatial understanding (VSU) area, spatial image-to-text (SI2T) and spatial text-to-image (ST2I) are two fundamental tasks that appear in dual form.
no code implementations • 1 Oct 2024 • Yu Zhao, Hao Fei, Shengqiong Wu, Meishan Zhang, Min Zhang, Tat-Seng Chua
Grammar Induction could benefit from rich heterogeneous signals, such as text, vision, and acoustics.
no code implementations • 16 Aug 2024 • Peiming Guo, Sinuo Liu, Yanzhao Zhang, Dingkun Long, Pengjun Xie, Meishan Zhang, Min Zhang
We propose the first end-to-end model for photo-sharing multi-modal dialogue generation, which integrates an image perceptron and an image generator with a large language model.
1 code implementation • 29 Jul 2024 • Xin Zhang, Yanzhao Zhang, Dingkun Long, Wen Xie, Ziqi Dai, Jialong Tang, Huan Lin, Baosong Yang, Pengjun Xie, Fei Huang, Meishan Zhang, Wenjie Li, Min Zhang
We first introduce a text encoder (base size) enhanced with RoPE and unpadding, pre-trained in a native 8192-token context (longer than the 512-token limit of previous multilingual encoders).
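For reference, here is a compact numpy sketch of rotary position embeddings (RoPE) in one common "rotate-half" formulation; the dimensions and base are illustrative defaults, not necessarily those of this encoder.

```python
import numpy as np

def rope(x, base=10000.0):
    """Apply rotary position embeddings to x of shape (seq_len, dim), dim even.

    Each feature pair is rotated by an angle that grows with the token position,
    so relative position is encoded multiplicatively in the query/key vectors.
    """
    seq_len, dim = x.shape
    half = dim // 2
    pos = np.arange(seq_len)[:, None]            # (seq_len, 1)
    freqs = base ** (-np.arange(half) / half)    # (half,)
    angles = pos * freqs                         # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)

q = np.random.randn(8, 64)
print(rope(q).shape)  # (8, 64)
```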
no code implementations • 27 Jun 2024 • Hao Fei, Shengqiong Wu, Meishan Zhang, Min Zhang, Tat-Seng Chua, Shuicheng Yan
Then, an SG-based framework is built, where the textual SG (TSG) is encoded with a graph Transformer, while the video dynamic SG (DSG) and the HSG are modeled with a novel recurrent graph Transformer for spatial and temporal feature propagation.
no code implementations • 26 Jun 2024 • Bonian Jia, Huiyao Chen, Yueheng Sun, Meishan Zhang, Min Zhang
We introduce a novel multimodal OEI (MOEI) task, integrating text and speech to mirror real-world scenarios.
1 code implementation • 25 Jun 2024 • Huiyao Chen, Yu Zhao, Zulong Chen, Mengjia Wang, Liangyue Li, Meishan Zhang, Min Zhang
Hierarchical text classification (HTC) is an important task with broad applications, while few-shot HTC has gained increasing interest recently.
1 code implementation • 10 Jun 2024 • Yidong Wang, Qi Guo, Wenjin Yao, Hongbo Zhang, Xin Zhang, Zhen Wu, Meishan Zhang, Xinyu Dai, Min Zhang, Qingsong Wen, Wei Ye, Shikun Zhang, Yue Zhang
This paper introduces AutoSurvey, a speedy and well-organized methodology for automating the creation of comprehensive literature surveys in rapidly evolving fields like artificial intelligence.
1 code implementation • 8 Apr 2024 • Longhui Zhang, Dingkun Long, Meishan Zhang, Yanzhao Zhang, Pengjun Xie, Min Zhang
Experimental results on Chinese sequence labeling datasets demonstrate that the improved BABERT variant outperforms the vanilla version, not only on these tasks but also more broadly across a range of Chinese natural language understanding tasks.
no code implementations • 26 Feb 2024 • Jingsi Yu, Cunliang Kong, Liner Yang, Meishan Zhang, Lin Zhu, Yujie Wang, Haozhe Lin, Maosong Sun, Erhong Yang
Sentence Pattern Structure (SPS) parsing is a syntactic analysis method primarily employed in language teaching. Existing SPS parsers rely heavily on textbook corpora for training, lacking cross-domain capability. To overcome this constraint, this paper proposes an innovative approach leveraging large language models (LLMs) within a self-training framework.
no code implementations • 2 Feb 2024 • Meishan Zhang, Bin Wang, Hao Fei, Min Zhang
In nested named entity recognition (NER), entities are nested within each other, thus requiring more data annotation to address.
1 code implementation • 28 Nov 2023 • Longhui Zhang, Yanzhao Zhang, Dingkun Long, Pengjun Xie, Meishan Zhang, Min Zhang
In this work, we propose a two-stage progressive paradigm to better adapt LLMs to text ranking.
1 code implementation • 5 Nov 2023 • Jianling Li, Meishan Zhang, Peiming Guo, Min Zhang, Yue Zhang
Our experimental results demonstrate that self-training for constituency parsing, equipped with an LLM, outperforms traditional methods regardless of the LLM's performance.
1 code implementation • 12 Oct 2023 • Xin Zhang, Zehan Li, Yanzhao Zhang, Dingkun Long, Pengjun Xie, Meishan Zhang, Min Zhang
As such cases span from English to other natural or programming languages, from retrieval to classification and beyond, it is desirable to build a unified embedding model rather than dedicated ones for each scenario.
1 code implementation • ICCV 2023 • Yibo Cui, Liang Xie, Yakun Zhang, Meishan Zhang, Ye Yan, Erwei Yin
To address this problem, we propose a novel Grounded Entity-Landmark Adaptive (GELA) pre-training paradigm for VLN tasks.
no code implementations • 9 Aug 2023 • Yu Zhao, Hao Fei, Yixin Cao, Bobo Li, Meishan Zhang, Jianguo Wei, Min Zhang, Tat-Seng Chua
A scene-event mapping mechanism is first designed to bridge the gap between the underlying scene structure and the high-level event semantic structure, resulting in an overall hierarchical scene-event (termed ICE) graph structure.
no code implementations • 7 Aug 2023 • Zehan Li, Xin Zhang, Yanzhao Zhang, Dingkun Long, Pengjun Xie, Meishan Zhang
We present GTE, a general-purpose text embedding model trained with multi-stage contrastive learning.
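Multi-stage contrastive training of this kind typically builds on an in-batch InfoNCE objective; below is a minimal numpy sketch of that loss, with the temperature and batch layout as illustrative assumptions rather than GTE's exact configuration.

```python
import numpy as np

def info_nce(queries, docs, temperature=0.05):
    """In-batch contrastive loss: queries[i] is paired with docs[i];
    all other documents in the batch act as negatives."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    d = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    logits = q @ d.T / temperature                      # (batch, batch) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)         # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))                 # positives sit on the diagonal

q = np.random.randn(4, 128); d = np.random.randn(4, 128)
print(info_nce(q, d))
```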
no code implementations • 3 Aug 2023 • Hao Fei, Meishan Zhang, Min Zhang, Tat-Seng Chua
Structured Natural Language Processing (XNLP) is an important subset of NLP that entails understanding the underlying semantic or syntactic structure of texts, which serves as a foundational component for many downstream applications.
1 code implementation • 25 Jul 2023 • Hexuan Deng, Xin Zhang, Meishan Zhang, Xuebo Liu, Min Zhang
In this paper, we conduct a holistic exploration of Universal Decompositional Semantics (UDS) parsing.
1 code implementation • 21 May 2023 • Gongyao Jiang, Shuang Liu, Meishan Zhang, Min Zhang
Dialogue-level dependency parsing has received insufficient attention, especially for Chinese.
1 code implementation • 20 May 2023 • Hao Fei, Qian Liu, Meishan Zhang, Min Zhang, Tat-Seng Chua
In this work, we investigate a more realistic unsupervised multimodal machine translation (UMMT) setup, inference-time image-free UMMT, where the model is trained with source-text image pairs, and tested with only source-text inputs.
no code implementations • 20 May 2023 • Hao Fei, Meishan Zhang, Min Zhang, Tat-Seng Chua
Latest efforts on cross-lingual relation extraction (XRE) aggressively leverage the language-consistent structural features from the universal dependency (UD) resource, while they may largely suffer from biased transfer (e.g., either target-biased or source-biased) due to the inevitable linguistic disparity between languages.
1 code implementation • 19 May 2023 • Yu Zhao, Hao Fei, Wei Ji, Jianguo Wei, Meishan Zhang, Min Zhang, Tat-Seng Chua
With an external 3D scene extractor, we obtain the 3D objects and scene features for input images, based on which we construct a target object-centered 3D spatial scene graph (Go3D-S2G), such that we model the spatial semantics of target objects within the holistic 3D scenes.
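A toy sketch of building a target-object-centered spatial graph from 3D object positions follows; the relation inventory and the dominant-axis heuristic are invented for illustration and are not the paper's Go3D-S2G construction.

```python
from dataclasses import dataclass

@dataclass
class Obj3D:
    name: str
    x: float; y: float; z: float   # object center in a shared 3D frame

def spatial_relation(anchor, other):
    """Pick the dominant axis difference as a coarse spatial relation of `other` w.r.t. `anchor`."""
    dx, dy, dz = other.x - anchor.x, other.y - anchor.y, other.z - anchor.z
    rel, val = max((("right of", dx), ("in front of", dy), ("above", dz)),
                   key=lambda t: abs(t[1]))
    if val < 0:
        rel = {"right of": "left of", "in front of": "behind", "above": "below"}[rel]
    return rel

def build_scene_graph(target, others):
    """Edges from every other object to the target object (target-centered graph)."""
    return [(o.name, spatial_relation(target, o), target.name) for o in others]

target = Obj3D("mug", 0.0, 0.0, 0.8)
others = [Obj3D("lamp", -0.5, 0.1, 1.2), Obj3D("book", 0.3, -0.4, 0.8)]
for edge in build_scene_graph(target, others):
    print(edge)   # e.g. ('lamp', 'left of', 'mug')
```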
no code implementations • 19 Apr 2023 • Hao Fei, Tat-Seng Chua, Chenliang Li, Donghong Ji, Meishan Zhang, Yafeng Ren
In this study, we propose to enhance the ABSA robustness by systematically rethinking the bottlenecks from all possible angles, including model, data, and training.
Aspect-Based Sentiment Analysis • Aspect-Based Sentiment Analysis (ABSA) • +2
1 code implementation • 13 Apr 2023 • Hao Fei, Shengqiong Wu, Jingye Li, Bobo Li, Fei Li, Libo Qin, Meishan Zhang, Min Zhang, Tat-Seng Chua
Universally modeling all typical information extraction tasks (UIE) with one generative language model (GLM) has shown great potential in recent studies, where various IE predictions are unified into a linearized hierarchical expression under a GLM.
1 code implementation • 20 Feb 2023 • Xiang Wei, Xingyu Cui, Ning Cheng, Xiaobin Wang, Xin Zhang, Shen Huang, Pengjun Xie, Jinan Xu, Yufeng Chen, Meishan Zhang, Yong Jiang, Wenjuan Han
Zero-shot information extraction (IE) aims to build IE systems from unannotated text.
1 code implementation • 2 Dec 2022 • Hexuan Deng, Liang Ding, Xuebo Liu, Meishan Zhang, DaCheng Tao, Min Zhang
Preliminary experiments on En-Zh and En-Ja news domain corpora demonstrate that monolingual data can significantly improve translation quality (e.g., +3.15 BLEU on En-Zh).
2 code implementations • 27 Oct 2022 • Peijie Jiang, Dingkun Long, Yanzhao Zhang, Pengjun Xie, Meishan Zhang, Min Zhang
We apply BABERT for feature induction of Chinese sequence labeling tasks.
Ranked #1 on Chinese Word Segmentation on MSRA
Chinese Named Entity Recognition • Chinese Word Segmentation • +3
1 code implementation • 23 Oct 2022 • Panzhong Lu, Xin Zhang, Meishan Zhang, Min Zhang
First, we construct a dataset of phrase grounding with both noun phrases and pronouns to image regions.
1 code implementation • 20 Oct 2022 • Yu Zhao, Jianguo Wei, Zhichao Lin, Yueheng Sun, Meishan Zhang, Min Zhang
Accordingly, we manually annotate a dataset to facilitate the investigation of the newly-introduced task and build several benchmark encoder-decoder models by using VL-BART and VL-T5 as backbones.
no code implementations • 6 Oct 2022 • Hao Fei, Shengqiong Wu, Meishan Zhang, Yafeng Ren, Donghong Ji
In this work, we investigate the integration of a latent graph for CSRL.
1 code implementation • COLING 2022 • Xin Zhang, Yong Jiang, Xiaobin Wang, Xuming Hu, Yueheng Sun, Pengjun Xie, Meishan Zhang
Successful machine learning based Named Entity Recognition models could fail on texts from some special domains, for instance, Chinese addresses and e-commerce titles, which require adequate background knowledge.
1 code implementation • NAACL 2022 • Linzhi Wu, Pengjun Xie, Jie zhou, Meishan Zhang, Chunping Ma, Guangwei Xu, Min Zhang
Prior research has mainly resorted to heuristic rule-based constraints to reduce the noise for specific self-augmentation methods individually.
1 code implementation • ACL 2022 • Xin Zhang, Guangwei Xu, Yueheng Sun, Meishan Zhang, Xiaobin Wang, Min Zhang
Recent works of opinion expression identification (OEI) rely heavily on the quality and scale of the manually-constructed training corpus, which could be extremely difficult to satisfy.
1 code implementation • COLING 2022 • Zebin Ou, Meishan Zhang, Yue Zhang
Word ordering is a constrained language generation task taking unordered words as input.
1 code implementation • 17 Feb 2022 • Boli Chen, Guangwei Xu, Xiaobin Wang, Pengjun Xie, Meishan Zhang, Fei Huang
Named Entity Recognition (NER) from speech is among Spoken Language Understanding (SLU) tasks, aiming to extract semantic information from the speech signal.
Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +5
no code implementations • 11 Jan 2022 • Yuting Yang, Pei Huang, Feifei Ma, Juan Cao, Meishan Zhang, Jian Zhang, Jintao Li
Deep-learning-based NLP models are found to be vulnerable to word substitution perturbations.
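A toy example of the kind of word-substitution perturbation referred to here; the synonym table and substitution policy are made up for the example.

```python
import itertools

# Toy synonym table; a real attack would use embeddings or a thesaurus.
SYNONYMS = {
    "good": ["great", "fine"],
    "movie": ["film"],
    "boring": ["dull", "tedious"],
}

def substitution_perturbations(sentence, max_subs=2):
    """Enumerate sentences that differ from the input by up to max_subs word swaps."""
    tokens = sentence.split()
    swappable = [i for i, t in enumerate(tokens) if t.lower() in SYNONYMS]
    results = []
    for k in range(1, max_subs + 1):
        for positions in itertools.combinations(swappable, k):
            choices = [SYNONYMS[tokens[i].lower()] for i in positions]
            for combo in itertools.product(*choices):
                out = list(tokens)
                for i, w in zip(positions, combo):
                    out[i] = w
                results.append(" ".join(out))
    return results

for s in substitution_perturbations("the movie was good but boring"):
    print(s)
```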
1 code implementation • 19 Dec 2021 • Jingye Li, Hao Fei, Jiang Liu, Shengqiong Wu, Meishan Zhang, Chong Teng, Donghong Ji, Fei Li
So far, named entity recognition (NER) has been involved with three major types, including flat, overlapped (aka. nested), and discontinuous NER.
Ranked #2 on Chinese Named Entity Recognition on OntoNotes 4
1 code implementation • 5 Oct 2021 • Shengqiong Wu, Hao Fei, Fei Li, Donghong Ji, Meishan Zhang, Yijiang Liu, Chong Teng
Unified opinion role labeling (ORL) aims to detect all possible opinion structures of 'opinion-holder-target' in one shot, given a text.
Ranked #1 on Fine-Grained Opinion Analysis on MPQA (F1 (Opinion) metric)
1 code implementation • EMNLP 2021 • Zhichao Lin, Yueheng Sun, Meishan Zhang
The three subtasks are closely related, while previous studies model them individually, which ignores their internal connections and meanwhile induces the error propagation problem.
1 code implementation • ACL 2021 • Fei Li, Zhichao Lin, Meishan Zhang, Donghong Ji
Second, we perform relation classification to judge whether a given pair of entity fragments is overlapping or in succession.
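A toy sketch of how such fragment-pair relations could be decoded into entities, merging "succession" fragments with a union-find pass; the data structures are invented for illustration rather than taken from the paper.

```python
def decode_entities(fragments, relations):
    """fragments: list of (start, end) token spans.
    relations: dict mapping fragment-index pairs to 'overlapping' or 'succession'.
    Fragments linked by 'succession' are merged into one (possibly discontinuous) entity."""
    parent = list(range(len(fragments)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path compression
            i = parent[i]
        return i

    for (i, j), rel in relations.items():
        if rel == "succession":             # merge fragments of the same entity
            parent[find(i)] = find(j)

    groups = {}
    for idx in range(len(fragments)):
        groups.setdefault(find(idx), []).append(fragments[idx])
    return [sorted(spans) for spans in groups.values()]

frags = [(0, 1), (3, 4), (3, 5)]                       # candidate entity fragments
rels = {(0, 1): "succession", (1, 2): "overlapping"}
print(decode_entities(frags, rels))                    # [[(0, 1), (3, 4)], [(3, 5)]]
```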
1 code implementation • ACL 2021 • Xin Zhang, Guangwei Xu, Yueheng Sun, Meishan Zhang, Pengjun Xie
Crowdsourcing is regarded as one prospective solution for effective supervised learning, aiming to build large-scale annotated training data by crowd workers.
1 code implementation • 2 Jan 2021 • Hao Fei, Meishan Zhang, Bobo Li, Donghong Ji
It performs the two subtasks of SRL: predicate identification and argument role labeling, jointly.
Ranked #1 on Semantic Role Labeling on CoNLL-2009
no code implementations • COLING 2020 • Jingye Li, Donghong Ji, Fei Li, Meishan Zhang, Yijiang Liu
Emotion detection in conversations (EDC) aims to detect the emotion of each utterance in conversations with multiple speakers.
Ranked #21 on Emotion Recognition in Conversation on EmoryNLP
no code implementations • 24 Aug 2020 • Hao Fei, Meishan Zhang, Fei Li, Donghong Ji
In this paper, we fill the gap of cross-lingual SRL by proposing an end-to-end SRL model that incorporates a variety of universal features and transfer methods.
no code implementations • 19 Jun 2020 • Meishan Zhang
This article briefly reviews the representative models of constituent parsing and dependency parsing, and also dependency graph parsing with rich semantics.
no code implementations • ACL 2020 • Qiankun Fu, Yue Zhang, Jiangming Liu, Meishan Zhang
Discourse representation tree structure (DRTS) parsing is a novel semantic parsing task which has attracted attention only recently.
1 code implementation • ACL 2020 • Hao Fei, Meishan Zhang, Donghong Ji
Many efforts of research are devoted to semantic role labeling (SRL) which is crucial for natural language understanding.
no code implementations • COLING 2020 • Yijiang Liu, Meishan Zhang, Donghong Ji
In this paper, we present Chinese lexical fusion recognition, a new task which could be regarded as one kind of coreference recognition.
no code implementations • 3 Mar 2020 • Jingye Li, Meishan Zhang, Donghong Ji, Yijiang Liu
Conversational emotion recognition (CER) has attracted increasing interest in the natural language processing (NLP) community.
Ranked #24 on Emotion Recognition in Conversation on EmoryNLP
1 code implementation • 22 Jul 2019 • Qingrong Xia, Zhenghua Li, Min Zhang, Meishan Zhang, Guohong Fu, Rui Wang, Luo Si
Semantic role labeling (SRL), also known as shallow semantic parsing, is an important yet challenging task in NLP.
1 code implementation • NAACL 2019 • Meishan Zhang, Peili Liang, Guohong Fu
Opinion role labeling (ORL) is an important task for fine-grained opinion mining, which identifies important opinion arguments such as holder and target for a given opinion trigger.
Ranked #1 on Fine-Grained Opinion Analysis on MPQA (using extra training data)
no code implementations • NAACL 2019 • Meishan Zhang, Zhenghua Li, Guohong Fu, Min Zhang
Syntax has been demonstrated to be highly effective in neural machine translation (NMT).
Ranked #8 on Machine Translation on IWSLT2015 English-Vietnamese
1 code implementation • COLING 2018 • Nan Yu, Meishan Zhang, Guohong Fu
Syntax has been a useful source of information for statistical RST discourse parsing.
Ranked #3 on Discourse Parsing on RST-DT (RST-Parseval (Full) metric)
no code implementations • 16 Jan 2018 • YaoSheng Yang, Meishan Zhang, Wenliang Chen, Wei zhang, Haofen Wang, Min Zhang
To quickly obtain new labeled data, crowdsourcing offers an alternative way to collect annotations at lower cost and in a short time.
Chinese Named Entity Recognition • named-entity-recognition • +2
no code implementations • 11 Jan 2018 • Zhengqiu He, Wenliang Chen, Zhenghua Li, Meishan Zhang, Wei zhang, Min Zhang
First, we encode the context of entities on a dependency tree as sentence-level entity embedding based on tree-GRU.
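A simplified sketch of bottom-up composition over a dependency tree, using plain averaging of child states in place of the learned tree-GRU cell described here; the interface is an assumption for illustration.

```python
import numpy as np

def encode_entity_on_tree(heads, embeddings, entity_idx):
    """Compose the subtree rooted at the entity token, bottom-up.

    heads[i] is the index of token i's head (-1 for the root). Each node's state
    is the average of its own embedding and its children's composed states --
    a stand-in for a learned tree-GRU cell.
    """
    children = {i: [] for i in range(len(heads))}
    for i, h in enumerate(heads):
        if h >= 0:
            children[h].append(i)

    def compose(i):
        states = [embeddings[i]] + [compose(c) for c in children[i]]
        return np.mean(states, axis=0)

    return compose(entity_idx)

heads = [2, 2, -1, 2]                                  # token 2 is the root
emb = np.random.randn(4, 8)
print(encode_entity_on_tree(heads, emb, 2).shape)      # (8,)
```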
1 code implementation • EMNLP 2017 • Shaolei Wang, Wanxiang Che, Yue Zhang, Meishan Zhang, Ting Liu
In this paper, we model the problem of disfluency detection using a transition-based framework, which incrementally constructs and labels the disfluency chunk of input sentences using a new transition system without syntax information.
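A toy transition system in the spirit described here, with two actions (OUT keeps a fluent token, DEL drops a disfluent one) driven by an oracle; the action set and oracle are simplified assumptions, not the paper's transition system.

```python
def run_transitions(tokens, gold_disfluent):
    """Process tokens left to right; each step applies OUT (keep) or DEL (mark the
    token as part of a disfluency chunk). Here an oracle supplies the actions."""
    output, actions = [], []
    for i, tok in enumerate(tokens):
        action = "DEL" if i in gold_disfluent else "OUT"
        actions.append(action)
        if action == "OUT":
            output.append(tok)            # fluent token flows to the output
        # DEL: token is dropped as part of a disfluency chunk
    return output, actions

tokens = "I want a flight to Boston uh to Denver".split()
disfluent = {4, 5, 6}                     # "to Boston uh" is the reparandum + filler
print(run_transitions(tokens, disfluent))
```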
no code implementations • EMNLP 2017 • Meishan Zhang, Yue Zhang, Guohong Fu
Neural networks have shown promising results for relation extraction.
Ranked #1 on Relation Extraction on ACE 2005 (Sentence Encoder metric)
1 code implementation • 24 Aug 2017 • Jie Yang, Zhiyang Teng, Meishan Zhang, Yue Zhang
Our results on standard benchmarks show that state-of-the-art neural models can give accuracies comparable to the best discrete models in the literature for most tasks, and that combining discrete and neural features consistently yields better results.
no code implementations • Pattern Recognition Letters 2017 • Fei Li, Meishan Zhang, Bo Tian, Bo Chen, Guohong Fu, Donghong Ji
We evaluate our models on two datasets for recognizing regular and irregular biomedical entities.
no code implementations • 25 Apr 2017 • Liner Yang, Meishan Zhang, Yang Liu, Nan Yu, Maosong Sun, Guohong Fu
While part-of-speech (POS) tagging and dependency parsing are observed to be closely related, existing work on joint modeling with manually crafted feature templates suffers from the feature sparsity and incompleteness problems.
1 code implementation • COLING 2016 • Meishan Zhang, Yue Zhang, Guohong Fu
We investigate the use of neural network for tweet sarcasm detection, and compare the effects of the continuous automatic features with discrete manual features.
no code implementations • 27 Aug 2016 • Fei Li, Meishan Zhang, Guohong Fu, Tao Qian, Donghong Ji
This model divides a sentence or text segment into five parts, namely two target entities and their three contexts.
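A small sketch of this five-part split (before-context / entity 1 / middle-context / entity 2 / after-context); the span-based interface is an assumption for illustration.

```python
def split_five_parts(tokens, e1, e2):
    """Split a token list into five parts around two entity spans.

    e1 and e2 are (start, end) token indices (end exclusive).
    Returns: before-context, entity 1, middle-context, entity 2, after-context.
    """
    (s1, t1), (s2, t2) = sorted([e1, e2])
    return (tokens[:s1], tokens[s1:t1], tokens[t1:s2], tokens[s2:t2], tokens[t2:])

tokens = "The enzyme ALDH2 metabolizes acetaldehyde in the liver".split()
parts = split_five_parts(tokens, (2, 3), (4, 5))
for name, part in zip(["before", "entity1", "middle", "entity2", "after"], parts):
    print(name, ":", " ".join(part))
```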
no code implementations • LREC 2016 • Meishan Zhang, Jie Yang, Zhiyang Teng, Yue Zhang
We present a light-weight machine learning tool for NLP research.