Search Results for author: Chong Feng

Found 11 papers, 0 papers with code

面向司法领域的高质量开源藏汉平行语料库构建(A High-quality Open Source Tibetan-Chinese Parallel Corpus Construction of Judicial Domain)

no code implementations CCL 2020 Jiu Sha, Luqin Zhou, Chong Feng, Hongzheng Li, Tianfu Zhang, Hui Hui

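Tibetan-Chinese machine translation for the judicial domain faces a severe data sparsity problem. This paper studies the problem from two angles. First, compared with the general domain, judicial Tibetan requires more rigorous logical expression and more specialized terminology; however, existing Tibetan resources lack corresponding judicial-domain corpora, and specialized terms and syntactic structures are scarce. Second, the distinctive lexical expressions and specific syntactic structures of Tibetan make it difficult to build a Tibetan-Chinese parallel corpus with general-purpose corpus construction methods. We therefore propose a lightweight method for constructing a Tibetan-Chinese parallel corpus for the judicial domain. First, we manually annotate a medium-sized Tibetan-Chinese glossary of judicial terms to serve as a prior knowledge base, which avoids the logical-expression problems and missing domain terms that arise when the corpus strays outside the domain. Second, we collect real-world corpus data, such as court judgment documents, from the official websites of local courts across the country. We look for Tibetan source data first and Chinese second, so that distinctive lexical expressions and sentence structures are not lost by constructing Tibetan sentences afterwards. Following these principles, we collect Tibetan and Chinese text and build a high-quality parallel corpus; the concrete steps are crawling the corpus, rule-based passage segmentation and alignment checking, sentence boundary detection, and automatic corpus cleaning. In the end, we built a 160,000-scale Tibetan-Chinese judicial-domain corpus and verified its high quality and robustness with several translation models and cross-experiments. The corpus will be open-sourced for use by researchers in related work.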

Enlivening Redundant Heads in Multi-head Self-attention for Machine Translation

no code implementations EMNLP 2021 Tianfu Zhang, Heyan Huang, Chong Feng, Longbing Cao

Multi-head self-attention has recently attracted enormous interest owing to its specialized functions, highly parallelizable computation, and flexible extensibility.

Machine Translation, Translation

RAAMove: A Corpus for Analyzing Moves in Research Article Abstracts

no code implementations 23 Mar 2024 Hongzheng Li, Ruojin Wang, Ge Shi, Xing Lv, Lei Lei, Chong Feng, Fang Liu, JinKun Lin, Yangguang Mei, Lingnan Xu

In this paper, we introduce RAAMove, a comprehensive multi-domain corpus dedicated to the annotation of move structures in RA abstracts.

Boosting Event Extraction with Denoised Structure-to-Text Augmentation

no code implementations 16 May 2023 Bo Wang, Heyan Huang, Xiaochi Wei, Ge Shi, Xiao Liu, Chong Feng, Tong Zhou, Shuaiqiang Wang, Dawei Yin

Event extraction aims to recognize pre-defined event triggers and arguments from texts, which suffer from the lack of high-quality annotations.

Event Extraction, Text Augmentation, +1

Distant Supervision for Relation Extraction with Linear Attenuation Simulation and Non-IID Relevance Embedding

no code implementations 22 Dec 2018 Changsen Yuan, He-Yan Huang, Chong Feng, Xiao Liu, Xiaochi Wei

Distant supervision for relation extraction is an efficient method to reduce labor costs and has been widely used to seek novel relational facts in large corpora, which can be identified as a multi-instance multi-label problem.

Relation, Relation Extraction, +2

Genre Separation Network with Adversarial Training for Cross-genre Relation Extraction

no code implementations EMNLP 2018 Ge Shi, Chong Feng, Lifu Huang, Boliang Zhang, Heng Ji, Lejian Liao, He-Yan Huang

Relation extraction suffers a dramatic performance decrease when a model is trained on one genre and applied directly to a new genre, owing to the distinct feature distributions.

Feature Engineering, Relation, +2
