Search Results for author: Zhefeng Wang

Found 29 papers, 14 papers with code

JABER and SABER: Junior and Senior Arabic BERt

1 code implementation • 8 Dec 2021 • Abbas Ghaddar, Yimeng Wu, Ahmad Rashid, Khalil Bibi, Mehdi Rezagholizadeh, Chao Xing, Yasheng Wang, Duan Xinyu, Zhefeng Wang, Baoxing Huai, Xin Jiang, Qun Liu, Philippe Langlais

Language-specific pre-trained models have proven to be more accurate than multilingual ones in a monolingual evaluation setting, and Arabic is no exception.

Language Modelling • NER

2,953


Robust Estimation of Similarity Transformation for Visual Object Tracking

2 code implementations • 14 Dec 2017 • Yang Li, Jianke Zhu, Steven C. H. Hoi, Wenjie Song, Zhefeng Wang, Hantang Liu

In order to efficiently search in such a large 4-DoF space in real-time, we formulate the problem into two 2-DoF sub-problems and apply an efficient Block Coordinates Descent solver to optimize the estimation result.

Object • Visual Object Tracking

506


Efficient Document-level Event Extraction via Pseudo-Trigger-aware Pruned Complete Graph

1 code implementation • 11 Dec 2021 • Tong Zhu, Xiaoye Qu, Wenliang Chen, Zhefeng Wang, Baoxing Huai, Nicholas Jing Yuan, Min Zhang

Most previous studies of document-level event extraction mainly focus on building argument chains in an autoregressive way, which achieves a certain success but is inefficient in both training and inference.

Ranked #3 on Document-level Event Extraction on ChFinAnn

Document-level Event Extraction • Event Extraction

223


Mirror: A Universal Framework for Various Information Extraction Tasks

1 code implementation • 9 Nov 2023 • Tong Zhu, Junfei Ren, Zijian Yu, Mengsong Wu, Guoliang Zhang, Xiaoye Qu, Wenliang Chen, Zhefeng Wang, Baoxing Huai, Min Zhang

Sharing knowledge between information extraction tasks has always been a challenge due to the diverse data formats and task variations.

Machine Reading Comprehension


MMEA: Entity Alignment for Multi-Modal Knowledge Graphs

1 code implementation • 20 Aug 2020 • Liyi Chen, Zhi Li, Yijun Wang, Tong Xu, Zhefeng Wang, Enhong Chen

To that end, in this paper, we propose a novel solution called Multi-Modal Entity Alignment (MMEA) to address the problem of entity alignment in a multi-modal view.

Knowledge Graphs • Multimodal Deep Learning +1


Multi-modal Siamese Network for Entity Alignment

1 code implementation • KDD 2022 • Liyi Chen, Zhi Li, Tong Xu, Han Wu, Zhefeng Wang, Nicholas Jing Yuan, Enhong Chen

To deal with that problem, in this paper, we propose a novel Multi-modal Siamese Network for Entity Alignment (MSNEA) to align entities in different MMKGs, in which multi-modal knowledge could be comprehensively leveraged by the exploitation of inter-modal effect.

Ranked #7 on Multi-modal Entity Alignment on UMVM-oea-d-w-v1 (using extra training data)

Attribute • Contrastive Learning +3


HacRED: A Large-Scale Relation Extraction Dataset Toward Hard Cases in Practical Applications

1 code implementation • Findings (ACL) 2021 • Qiao Cheng, Juntao Liu, Xiaoye Qu, Jin Zhao, Jiaqing Liang, Zhefeng Wang, Baoxing Huai, Nicholas Jing Yuan, Yanghua Xiao

Relation • Relation Extraction


An In-depth Study on Internal Structure of Chinese Words

1 code implementation • ACL 2021 • Chen Gong, Saihao Huang, Houquan Zhou, Zhenghua Li, Min Zhang, Zhefeng Wang, Baoxing Huai, Nicholas Jing Yuan

Several previous works on syntactic parsing propose to annotate shallow word-internal structures for better utilizing character-level information.

Sentence


CED: Catalog Extraction from Documents

1 code implementation • 28 Apr 2023 • Tong Zhu, Guoliang Zhang, Zechang Li, Zijian Yu, Junfei Ren, Mengsong Wu, Zhefeng Wang, Baoxing Huai, Pingfu Chao, Wenliang Chen

To address this problem, we build a large manually annotated corpus, which is the first dataset for the Catalog Extraction from Documents (CED) task.

Ranked #1 on Catalog Extraction on ChCatExt

Catalog Extraction • Sentence


Distantly-Supervised Named Entity Recognition with Adaptive Teacher Learning and Fine-grained Student Ensemble

1 code implementation • 13 Dec 2022 • Xiaoye Qu, Jun Zeng, Daizong Liu, Zhefeng Wang, Baoxing Huai, Pan Zhou

Distantly-Supervised Named Entity Recognition (DS-NER) effectively alleviates the data scarcity problem in NER by automatically generating training samples.

named-entity-recognition • Named Entity Recognition +1


How Well Do Large Language Models Understand Syntax? An Evaluation by Asking Natural Language Questions

1 code implementation • 14 Nov 2023 • Houquan Zhou, Yang Hou, Zhenghua Li, Xuebin Wang, Zhefeng Wang, Xinyu Duan, Min Zhang

While recent advancements in large language models (LLMs) bring us closer to achieving artificial general intelligence, the question persists: Do LLMs truly understand language, or do they merely mimic comprehension through pattern recognition?

Prepositional Phrase Attachment • Question Answering +1


A Coarse-to-Fine Labeling Framework for Joint Word Segmentation, POS Tagging, and Constituent Parsing

1 code implementation • CoNLL (EMNLP) 2021 • Yang Hou, Houquan Zhou, Zhenghua Li, Yu Zhang, Min Zhang, Zhefeng Wang, Baoxing Huai, Nicholas Jing Yuan

In the coarse labeling stage, the joint model outputs a bracketed tree, in which each node corresponds to one of four labels (i.e., phrase, subphrase, word, subword).

Part-Of-Speech Tagging • POS +2


Reference Matters: Benchmarking Factual Error Correction for Dialogue Summarization with Fine-grained Evaluation Framework

1 code implementation • 8 Jun 2023 • Mingqi Gao, Xiaojun Wan, Jia Su, Zhefeng Wang, Baoxing Huai

To address this problem, we are the first to manually annotate a FEC dataset for dialogue summarization containing 4000 items and propose FERRANTI, a fine-grained evaluation framework based on reference correction that automatically evaluates the performance of FEC models on different error categories.

Benchmarking


Finding Theme Communities from Database Networks

no code implementations • 23 Sep 2017 • Lingyang Chu, Zhefeng Wang, Jian Pei, Yanyan Zhang, Yu Yang, Enhong Chen

Given a database network where each vertex is associated with a transaction database, we are interested in finding theme communities.


Read, Retrospect, Select: An MRC Framework to Short Text Entity Linking

no code implementations • 7 Jan 2021 • Yingjie Gu, Xiaoye Qu, Zhefeng Wang, Baoxing Huai, Nicholas Jing Yuan, Xiaolin Gui

Entity linking (EL) for the rapidly growing short text (e.g., search queries and news titles) is critical to industrial applications.

Entity Linking • Machine Reading Comprehension +1


A High Precision Pipeline for Financial Knowledge Graph Construction

no code implementations • COLING 2020 • Sarah Elhammadi, Laks V.S. Lakshmanan, Raymond Ng, Michael Simpson, Baoxing Huai, Zhefeng Wang, Lanjun Wang

This pipeline combines multiple information extraction techniques with a financial dictionary that we built, all working together to produce over 342,000 compact extractions from over 288,000 financial news articles, with a precision of 78% at the top-100 extractions. The extracted triples are stored in a knowledge graph, making them readily available for use in downstream applications.

Data Integration • Fact Checking +4


SingGAN: Generative Adversarial Network For High-Fidelity Singing Voice Generation

no code implementations • 14 Oct 2021 • Rongjie Huang, Chenye Cui, Feiyang Chen, Yi Ren, Jinglin Liu, Zhou Zhao, Baoxing Huai, Zhefeng Wang

In this work, we propose SingGAN, a generative adversarial network designed for high-fidelity singing voice synthesis.

Generative Adversarial Network • Singing Voice Synthesis +2


APGN: Adversarial and Parameter Generation Networks for Multi-Source Cross-Domain Dependency Parsing

no code implementations • Findings (EMNLP) 2021 • Ying Li, Meishan Zhang, Zhenghua Li, Min Zhang, Zhefeng Wang, Baoxing Huai, Nicholas Jing Yuan

Thanks to the strong representation learning capability of deep learning, especially pre-training techniques with language model loss, dependency parsing has achieved great performance boost in the in-domain scenario with abundant labeled training data for target domains.

Dependency Parsing • Language Modelling +1


Delving Deep into Regularity: A Simple but Effective Method for Chinese Named Entity Recognition

no code implementations • Findings (NAACL) 2022 • Yingjie Gu, Xiaoye Qu, Zhefeng Wang, Yi Zheng, Baoxing Huai, Nicholas Jing Yuan

Recent years have witnessed the improving performance of Chinese Named Entity Recognition (NER) from proposing new frameworks or incorporating word lexicons.

Chinese Named Entity Recognition • named-entity-recognition +3


Revisiting Pre-trained Language Models and their Evaluation for Arabic Natural Language Understanding

no code implementations • 21 May 2022 • Abbas Ghaddar, Yimeng Wu, Sunyam Bagga, Ahmad Rashid, Khalil Bibi, Mehdi Rezagholizadeh, Chao Xing, Yasheng Wang, Duan Xinyu, Zhefeng Wang, Baoxing Huai, Xin Jiang, Qun Liu, Philippe Langlais

There is a growing body of work in recent years to develop pre-trained language models (PLMs) for the Arabic language.

Natural Language Understanding


Mining Word Boundaries in Speech as Naturally Annotated Word Segmentation Data

no code implementations • 31 Oct 2022 • Lei Zhang, Zhenghua Li, Shilin Zhou, Chen Gong, Zhefeng Wang, Baoxing Huai, Min Zhang

Inspired by early research on exploring naturally annotated data for Chinese word segmentation (CWS), and also by recent research on integration of speech and text processing, this work for the first time proposes to mine word boundaries from parallel speech/text data.

Chinese Word Segmentation


A Survey on Arabic Named Entity Recognition: Past, Recent Advances, and Future Trends

no code implementations • 7 Feb 2023 • Xiaoye Qu, Yingjie Gu, Qingrong Xia, Zechang Li, Zhefeng Wang, Baoxing Huai

In this paper, we provide a comprehensive review of the development of Arabic NER, especially the recent advances in deep learning and pre-trained language model.

Feature Engineering • Language Modelling +4


CopyNE: Better Contextual ASR by Copying Named Entities

no code implementations • 22 May 2023 • Shilin Zhou, Zhenghua Li, Yu Hong, Min Zhang, Zhefeng Wang, Baoxing Huai

However, traditional token-level ASR models have struggled with accurately transcribing entities due to the problem of homophonic and near-homophonic tokens.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) +1


AraMUS: Pushing the Limits of Data and Model Scale for Arabic Natural Language Processing

no code implementations • 11 Jun 2023 • Asaad Alghamdi, Xinyu Duan, Wei Jiang, Zhenhai Wang, Yimeng Wu, Qingrong Xia, Zhefeng Wang, Yi Zheng, Mehdi Rezagholizadeh, Baoxing Huai, Peilun Cheng, Abbas Ghaddar

Developing monolingual large Pre-trained Language Models (PLMs) is shown to be very successful in handling different tasks in Natural Language Processing (NLP).

Few-Shot Learning


Recognizing Unseen Objects via Multimodal Intensive Knowledge Graph Propagation

no code implementations • 14 Jun 2023 • Likang Wu, Zhi Li, Hongke Zhao, Zhefeng Wang, Qi Liu, Baoxing Huai, Nicholas Jing Yuan, Enhong Chen

Zero-Shot Learning (ZSL), which aims at automatically recognizing unseen objects, is a promising learning paradigm to understand new real-world knowledge for machines continuously.

Attribute • Knowledge Graphs +2


High-order Joint Constituency and Dependency Parsing

1 code implementation • 21 Sep 2023 • Yanggan Gu, Yang Hou, Zhefeng Wang, Xinyu Duan, Zhenghua Li

Compared to their work, we make progress in three aspects: (1) adopting a much more efficient decoding algorithm of $O(n^4)$ time complexity, (2) exploring joint modeling at the training phase, instead of only at the inference phase, (3) proposing high-order scoring components to promote constituent-dependency interaction.

Dependency Parsing • Multi-Task Learning


Shai: A large language model for asset management

no code implementations • 21 Dec 2023 • Zhongyang Guo, Guanran Jiang, Zhongdan Zhang, Peng Li, Zhefeng Wang, Yinchun Wang

This paper introduces "Shai", a 10B-level large language model specifically designed for the asset management industry, built upon an open-source foundational model.

Asset Management • Language Modelling +2


A General and Flexible Multi-concept Parsing Framework for Multilingual Semantic Matching

no code implementations • 5 Mar 2024 • Dong Yao, Asaad Alghamdi, Qingrong Xia, Xiaoye Qu, Xinyu Duan, Zhefeng Wang, Yi Zheng, Baoxing Huai, Peilun Cheng, Zhou Zhao

Although DC-Match is a simple yet effective method for semantic matching, it highly depends on the external NER techniques to identify the keywords of sentences, which limits the performance of semantic matching for minor languages since satisfactory NER tools are usually hard to obtain.

Chatbot • Community Question Answering +4


Adapprox: Adaptive Approximation in Adam Optimization via Randomized Low-Rank Matrices

no code implementations • 22 Mar 2024 • Pengxiang Zhao, Ping Li, Yingjie Gu, Yi Zheng, Stephan Ludger Kölker, Zhefeng Wang, Xiaoming Yuan

As deep learning models exponentially increase in size, optimizers such as Adam encounter significant memory consumption challenges due to the storage of first and second moment data.

