Search Results for author: Zhenghua Li

Found 57 papers, 23 papers with code

数据标注方法比较研究:以依存句法树标注为例(Comparison Study on Data Annotation Approaches: Dependency Tree Annotation as Case Study)

no code implementations CCL 2021 Mingyue Zhou, Chen Gong, Zhenghua Li, Min Zhang

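The most important considerations in data annotation are data quality and annotation cost. Our survey finds that data annotation work in natural language processing usually adopts machine annotation followed by human correction to reduce cost; meanwhile, few studies rigorously compare different annotation approaches to examine how the approach affects annotation quality and cost. With the help of a mature annotation team, and taking dependency syntax annotation as a case study, this paper experimentally compares machine annotation with human correction, independent annotation by two humans, and a human-machine independent annotation approach newly proposed here by combining the first two, and draws some preliminary conclusions.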

APGN: Adversarial and Parameter Generation Networks for Multi-Source Cross-Domain Dependency Parsing

no code implementations Findings (EMNLP) 2021 Ying Li, Meishan Zhang, Zhenghua Li, Min Zhang, Zhefeng Wang, Baoxing Huai, Nicholas Jing Yuan

Thanks to the strong representation learning capability of deep learning, especially pre-training techniques with language model loss, dependency parsing has achieved great performance boost in the in-domain scenario with abundant labeled training data for target domains.

Dependency Parsing Language Modelling +1

A Coarse-to-Fine Labeling Framework for Joint Word Segmentation, POS Tagging, and Constituent Parsing

1 code implementation CoNLL (EMNLP) 2021 Yang Hou, Houquan Zhou, Zhenghua Li, Yu Zhang, Min Zhang, Zhefeng Wang, Baoxing Huai, Nicholas Jing Yuan

In the coarse labeling stage, the joint model outputs a bracketed tree, in which each node corresponds to one of four labels (i.e., phrase, subphrase, word, subword).

Part-Of-Speech Tagging POS +2

Stacked AMR Parsing with Silver Data

1 code implementation Findings (EMNLP) 2021 Qingrong Xia, Zhenghua Li, Rui Wang, Min Zhang

In particular, one recent seq-to-seq work directly fine-tunes AMR graph sequences on the encoder-decoder pre-trained language model and achieves new state-of-the-art results, outperforming previous works by a large margin.

AMR Parsing Language Modelling

How Well Do Large Language Models Understand Syntax? An Evaluation by Asking Natural Language Questions

1 code implementation 14 Nov 2023 Houquan Zhou, Yang Hou, Zhenghua Li, Xuebin Wang, Zhefeng Wang, Xinyu Duan, Min Zhang

While recent advancements in large language models (LLMs) bring us closer to achieving artificial general intelligence, the question persists: Do LLMs truly understand language, or do they merely mimic comprehension through pattern recognition?

Prepositional Phrase Attachment Question Answering +1

Improving Seq2Seq Grammatical Error Correction via Decoding Interventions

1 code implementation 23 Oct 2023 Houquan Zhou, Yumeng Liu, Zhenghua Li, Min Zhang, Bo Zhang, Chen Li, Ji Zhang, Fei Huang

In this paper, we propose a unified decoding intervention framework that employs an external critic to assess the appropriateness of the token to be generated incrementally, and then dynamically influence the choice of the next token.

Grammatical Error Correction Language Modelling

High-order Joint Constituency and Dependency Parsing

1 code implementation 21 Sep 2023 Yanggan Gu, Yang Hou, Zhefeng Wang, Xinyu Duan, Zhenghua Li

Compared to their work, we make progress in three aspects: (1) adopting a much more efficient decoding algorithm of $O(n^4)$ time complexity, (2) exploring joint modeling at the training phase, instead of only at the inference phase, (3) proposing high-order scoring components to promote constituent-dependency interaction.

Dependency Parsing Multi-Task Learning

NaSGEC: a Multi-Domain Chinese Grammatical Error Correction Dataset from Native Speaker Texts

1 code implementation 25 May 2023 Yue Zhang, Bo Zhang, Haochen Jiang, Zhenghua Li, Chen Li, Fei Huang, Min Zhang

We introduce NaSGEC, a new dataset to facilitate research on Chinese grammatical error correction (CGEC) for native speaker texts from multiple domains.

Grammatical Error Correction

CopyNE: Better Contextual ASR by Copying Named Entities

no code implementations 22 May 2023 Shilin Zhou, Zhenghua Li, Yu Hong, Min Zhang, Zhefeng Wang, Baoxing Huai

However, traditional token-level ASR models have struggled with accurately transcribing entities due to the problem of homophonic and near-homophonic tokens.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

CSynGEC: Incorporating Constituent-based Syntax for Grammatical Error Correction with a Tailored GEC-Oriented Parser

no code implementations 15 Nov 2022 Yue Zhang, Zhenghua Li

Recently, Zhang et al. (2022) propose a syntax-aware grammatical error correction (GEC) approach, named SynGEC, showing that incorporating tailored dependency-based syntax of the input sentence is quite beneficial to GEC.

Grammatical Error Correction Sentence

Mining Word Boundaries in Speech as Naturally Annotated Word Segmentation Data

no code implementations 31 Oct 2022 Lei Zhang, Zhenghua Li, Shilin Zhou, Chen Gong, Zhefeng Wang, Baoxing Huai, Min Zhang

Inspired by early research on exploring naturally annotated data for Chinese word segmentation (CWS), and also by recent research on integration of speech and text processing, this work for the first time proposes to mine word boundaries from parallel speech/text data.

Chinese Word Segmentation

SeSQL: Yet Another Large-scale Session-level Chinese Text-to-SQL Dataset

no code implementations 26 Aug 2022 Saihao Huang, Lijie Wang, Zhenghua Li, Zeyang Liu, Chenhui Dou, Fukang Yan, Xinyan Xiao, Hua Wu, Min Zhang

As the first session-level Chinese dataset, CHASE contains two separate parts, i.e., 2,003 sessions manually constructed from scratch (CHASE-C), and 3,456 sessions translated from English SParC (CHASE-T).

SQL Parsing Text-To-SQL

Faster and Better Grammar-based Text-to-SQL Parsing via Clause-level Parallel Decoding and Alignment Loss

no code implementations 26 Apr 2022 Kun Wu, Lijie Wang, Zhenghua Li, Xinyan Xiao

Grammar-based parsers have achieved high performance in the cross-domain text-to-SQL parsing task, but suffer from low decoding efficiency due to the much larger number of actions for grammar selection than that of tokens in SQL queries.

SQL Parsing Text-To-SQL

MuCGEC: a Multi-Reference Multi-Source Evaluation Dataset for Chinese Grammatical Error Correction

2 code implementations NAACL 2022 Yue Zhang, Zhenghua Li, Zuyi Bao, Jiacheng Li, Bo Zhang, Chen Li, Fei Huang, Min Zhang

This paper presents MuCGEC, a multi-reference multi-source evaluation dataset for Chinese Grammatical Error Correction (CGEC), consisting of 7,063 sentences collected from three Chinese-as-a-Second-Language (CSL) learner sources.

Grammatical Error Correction Sentence

Fast and Accurate End-to-End Span-based Semantic Role Labeling as Word-based Graph Parsing

1 code implementation COLING 2022 Shilin Zhou, Qingrong Xia, Zhenghua Li, Yu Zhang, Yu Hong, Min Zhang

Moreover, we propose a simple constrained Viterbi procedure to ensure the legality of the output graph according to the constraints of the SRL structure.

Chinese Word Segmentation named-entity-recognition +3

An In-depth Study on Internal Structure of Chinese Words

1 code implementation ACL 2021 Chen Gong, Saihao Huang, Houquan Zhou, Zhenghua Li, Min Zhang, Zhefeng Wang, Baoxing Huai, Nicholas Jing Yuan

Several previous works on syntactic parsing propose to annotate shallow word-internal structures for better utilizing character-level information.

Sentence

A Unified Span-Based Approach for Opinion Mining with Syntactic Constituents

1 code implementation NAACL 2021 Qingrong Xia, Bo Zhang, Rui Wang, Zhenghua Li, Yue Zhang, Fei Huang, Luo Si, Min Zhang

Fine-grained opinion mining (OM) has achieved increasing attraction in the natural language processing (NLP) community, which aims to find the opinion structures of “Who expressed what opinions towards what” in one sentence.

Multi-Task Learning Opinion Mining +1

Semi-supervised Domain Adaptation for Dependency Parsing via Improved Contextualized Word Representations

no code implementations COLING 2020 Ying Li, Zhenghua Li, Min Zhang

The major challenge for current parsing research is to improve parsing performance on out-of-domain texts that are very different from the in-domain training data when only small-scale out-of-domain labeled data is available.

Dependency Parsing Domain Adaptation +2

Semantic Role Labeling with Heterogeneous Syntactic Knowledge

1 code implementation COLING 2020 Qingrong Xia, Rui Wang, Zhenghua Li, Yue Zhang, Min Zhang

Recently, due to the interplay between syntax and semantics, incorporating syntactic knowledge into neural semantic role labeling (SRL) has achieved much attention.

Semantic Role Labeling

Multi-grained Chinese Word Segmentation with Weakly Labeled Data

no code implementations COLING 2020 Chen Gong, Zhenghua Li, Bowei Zou, Min Zhang

Detailed evaluation shows that our proposed model with weakly labeled data significantly outperforms the state-of-the-art MWS model by 1.12 and 5.97 on NEWS and BAIKE data in F1.

Chinese Word Segmentation Sentence

Syntax-Aware Opinion Role Labeling with Dependency Graph Convolutional Networks

no code implementations ACL 2020 Bo Zhang, Yue Zhang, Rui Wang, Zhenghua Li, Min Zhang

The experimental results show that syntactic information is highly valuable for ORL, and our final MTL model effectively boosts the F1 score by 9.29 over the syntax-agnostic baseline.

Fine-Grained Opinion Analysis Multi-Task Learning

Efficient Second-Order TreeCRF for Neural Dependency Parsing

2 code implementations ACL 2020 Yu Zhang, Zhenghua Li, Min Zhang

Experiments and analysis on 27 datasets from 13 languages clearly show that techniques developed before the DL era, such as structural learning (global TreeCRF loss) and high-order modeling are still useful, and can further boost parsing performance over the state-of-the-art biaffine parser, especially for partially annotated training data.

Chinese Dependency Parsing Dependency Parsing

Is POS Tagging Necessary or Even Helpful for Neural Dependency Parsing?

1 code implementation 6 Mar 2020 Houquan Zhou, Yu Zhang, Zhenghua Li, Min Zhang

In the pre-deep-learning era, part-of-speech tags were considered indispensable ingredients for feature engineering in dependency parsing.

Dependency Parsing Feature Engineering +4

Syntax-aware Neural Semantic Role Labeling

1 code implementation 22 Jul 2019 Qingrong Xia, Zhenghua Li, Min Zhang, Meishan Zhang, Guohong Fu, Rui Wang, Luo Si

Semantic role labeling (SRL), also known as shallow semantic parsing, is an important yet challenging task in NLP.

Semantic Parsing Semantic Role Labeling +1

Semi-supervised Domain Adaptation for Dependency Parsing

1 code implementation ACL 2019 Zhenghua Li, Xue Peng, Min Zhang, Rui Wang, Luo Si

During the past decades, due to the lack of sufficient labeled data, most studies on cross-domain parsing focus on unsupervised domain adaptation, assuming there is no target-domain training data.

Chinese Dependency Parsing Dependency Parsing +3

HLT@SUDA at SemEval-2019 Task 1: UCCA Graph Parsing as Constituent Tree Parsing

no code implementations SEMEVAL 2019 Wei Jiang, Zhenghua Li, Yu Zhang, Min Zhang

The key idea is to convert a UCCA semantic graph into a constituent tree, in which extra labels are deliberately designed to mark remote edges and discontinuous nodes for future recovery.

General Classification Multi-Task Learning +1

HLT@SUDA at SemEval 2019 Task 1: UCCA Graph Parsing as Constituent Tree Parsing

no code implementations 11 Mar 2019 Wei Jiang, Zhenghua Li, Yu Zhang, Min Zhang

The key idea is to convert a UCCA semantic graph into a constituent tree, in which extra labels are deliberately designed to mark remote edges and discontinuous nodes for future recovery.

General Classification UCCA Parsing

Supervised Treebank Conversion: Data and Approaches

no code implementations ACL 2018 Xinzhou Jiang, Zhenghua Li, Bo Zhang, Min Zhang, Sheng Li, Luo Si

Treebank conversion is a straightforward and effective way to exploit various heterogeneous treebanks for boosting parsing performance.

Dependency Parsing Multi-Task Learning +1

SEE: Syntax-aware Entity Embedding for Neural Relation Extraction

no code implementations 11 Jan 2018 Zhengqiu He, Wenliang Chen, Zhenghua Li, Meishan Zhang, Wei Zhang, Min Zhang

First, we encode the context of entities on a dependency tree as sentence-level entity embedding based on tree-GRU.

Relation Relation Classification +3

Multi-Grained Chinese Word Segmentation

no code implementations EMNLP 2017 Chen Gong, Zhenghua Li, Min Zhang, Xinzhou Jiang

Traditionally, word segmentation (WS) adopts the single-grained formalism, where a sentence corresponds to a single word sequence.

Chinese Word Segmentation Language Modelling +2

Distributed Representations for Building Profiles of Users and Items from Text Reviews

no code implementations COLING 2016 Wenliang Chen, Zhenjie Zhang, Zhenghua Li, Min Zhang

In this paper, we propose an approach to learn distributed representations of users and items from text comments for recommendation systems.

Collaborative Filtering Decision Making +3

Training Dependency Parsers with Partial Annotation

no code implementations 29 Sep 2016 Zhenghua Li, Yue Zhang, Jiayuan Chao, Min Zhang

The first approach is previously proposed to directly train a log-linear graph-based parser (LLGPar) with PA based on a forest-based objective.

Dependency Parsing

Word Segmentation on Micro-blog Texts with External Lexicon and Heterogeneous Data

no code implementations 4 Aug 2016 Qingrong Xia, Zhenghua Li, Jiayuan Chao, Min Zhang

This paper describes our system designed for the NLPCC 2016 shared task on word segmentation on micro-blog texts.

Segmentation
