Search Results for author: Hai Zhao

Found 263 papers, 112 papers with code

What Works and Doesn’t Work, A Deep Decoder for Neural Machine Translation

no code implementations Findings (ACL) 2022 Zuchao Li, Yiran Wang, Masao Utiyama, Eiichiro Sumita, Hai Zhao, Taro Watanabe

Inspired by this discovery, we then propose approaches to improving it, with respect to model structure and model training, to make the deep decoder practical in NMT.

Decoder Language Modelling +3

Restricted or Not: A General Training Framework for Neural Machine Translation

no code implementations ACL 2022 Zuchao Li, Masao Utiyama, Eiichiro Sumita, Hai Zhao

Although this can satisfy the requirements overall, it usually requires a larger beam size and far longer decoding time than unrestricted translation, which limits the concurrent processing ability of the translation model in deployment, and thus its practicality.

Machine Translation Translation

Nested Named Entity Recognition as Corpus Aware Holistic Structure Parsing

no code implementations COLING 2022 Yifei Yang, Zuchao Li, Hai Zhao

Thus in order to address this mismatch, this work models the full nested NEs in a sentence as a holistic structure, then we propose a holistic structure parsing algorithm to disclose the entire NEs once for all.

Domain Adaptation named-entity-recognition +4

Aspect-based Sentiment Analysis as Machine Reading Comprehension

no code implementations COLING 2022 Yifei Yang, Hai Zhao

Existing studies typically handle aspect-based sentiment analysis by stacking multiple neural modules, which inevitably results in severe error propagation.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +1

BiBL: AMR Parsing and Generation with Bidirectional Bayesian Learning

1 code implementation COLING 2022 Ziming Cheng, Zuchao Li, Hai Zhao

Abstract Meaning Representation (AMR) offers a unified semantic representation for natural language sentences.

 Ranked #1 on AMR-to-Text Generation on LDC2017T10 (using extra training data)

AMR Parsing AMR-to-Text Generation +1

What If Sentence-hood is Hard to Define: A Case Study in Chinese Reading Comprehension

no code implementations Findings (EMNLP) 2021 Jiawei Wang, Hai Zhao, Yinggong Zhao, Libin Shen

Machine reading comprehension (MRC) is a challenging NLP task because it requires careful handling of all linguistic granularities, from word and sentence to passage.

Chinese Reading Comprehension Machine Reading Comprehension +1

Syntax in End-to-End Natural Language Processing

no code implementations EMNLP (ACL) 2021 Hai Zhao, Rui Wang, Kehai Chen

This tutorial surveys the latest technical progress in syntactic parsing and the role of syntax in end-to-end natural language processing (NLP) tasks. Semantic role labeling (SRL) and machine translation (MT) are the representative tasks, as both have long benefited from informative syntactic clues, though advances in end-to-end deep learning models show new results.

Machine Translation NMT +2

Hypergraph based Understanding for Document Semantic Entity Recognition

no code implementations 9 Jul 2024 Qiwei Li, Zuchao Li, Ping Wang, Haojun Ai, Hai Zhao

We build a novel hypergraph attention document semantic entity recognition framework, HGA, which uses hypergraph attention to focus on entity boundaries and entity categories at the same time.

document understanding

Venturing into Uncharted Waters: The Navigation Compass from Transformer to Mamba

no code implementations 24 Jun 2024 Yuchen Zou, Yineng Chen, Zuchao Li, Lefei Zhang, Hai Zhao

Transformer, a deep neural network architecture, has long dominated the field of natural language processing and beyond.

The Music Maestro or The Musically Challenged, A Massive Music Evaluation Benchmark for Large Language Models

1 code implementation 22 Jun 2024 Jiajia Li, Lu Yang, Mingni Tang, Cong Chen, Zuchao Li, Ping Wang, Hai Zhao

By leveraging ZIQI-Eval, we conduct a comprehensive evaluation over 16 LLMs to evaluate and analyze LLMs' performance in the domain of music.

Vript: A Video Is Worth Thousands of Words

1 code implementation 10 Jun 2024 Dongjie Yang, Suyuan Huang, Chengqiang Lu, Xiaodong Han, Haoxin Zhang, Yan Gao, Yao Hu, Hai Zhao

Vriptor is also a powerful model capable of end-to-end generation of dense and detailed captions for long videos.

Video Captioning Video Understanding

GKT: A Novel Guidance-Based Knowledge Transfer Framework For Efficient Cloud-edge Collaboration LLM Deployment

1 code implementation 30 May 2024 Yao Yao, Zuchao Li, Hai Zhao

The results highlight substantial enhancements in accuracy and processing speed on the GSM8K and CSQA datasets, surpassing the performance of using either the student or teacher models in isolation.

GSM8K Knowledge Distillation +1

From Role-Play to Drama-Interaction: An LLM Solution

no code implementations 23 May 2024 Weiqi Wu, Hongqiu Wu, Lai Jiang, XingYuan Liu, Jiale Hong, Hai Zhao, Min Zhang

Drama is a form of storytelling inspired by human creativity, proceeding with a predefined storyline, carrying emotions and thoughts.

Instruction Following

PyramidInfer: Pyramid KV Cache Compression for High-throughput LLM Inference

no code implementations 21 May 2024 Dongjie Yang, Xiaodong Han, Yan Gao, Yao Hu, Shilin Zhang, Hai Zhao

To accelerate inference, we store computed keys and values (KV cache) in the GPU memory.

SirLLM: Streaming Infinite Retentive LLM

1 code implementation 21 May 2024 Yao Yao, Zuchao Li, Hai Zhao

However, the one-off input of overly long texts is limited, as studies have shown that when input lengths exceed the LLMs' pre-trained text length, there is a dramatic decline in text generation capabilities.

Text Generation

Communication-Efficient Hybrid Federated Learning for E-health with Horizontal and Vertical Data Partitioning

no code implementations 15 Apr 2024 Chong Yu, Shuaiqi Shen, Shiqiang Wang, Kuan Zhang, Hai Zhao

In this paper, we provide a thorough study on an effective integration of HFL and VFL, to achieve communication efficiency and overcome the above limitations when data is both horizontally and vertically partitioned.

Vertical Federated Learning

Instruction-Driven Game Engines on Large Language Models

1 code implementation 30 Mar 2024 Hongqiu Wu, Y. Wang, XingYuan Liu, Hai Zhao, Min Zhang

The Instruction-Driven Game Engine (IDGE) project aims to democratize game development by enabling a large language model (LLM) to follow free-form game rules and autonomously generate game-play processes.

Language Modelling Large Language Model

Mitigating Misleading Chain-of-Thought Reasoning with Selective Filtering

1 code implementation 28 Mar 2024 Yexin Wu, Zhuosheng Zhang, Hai Zhao

However, flawless CoT reasoning cannot be guaranteed due to the presence of indecomposable questions and the potential for erroneous reasoning chains, particularly in the case of small-scale language models.

Multi-modal Auto-regressive Modeling via Visual Words

1 code implementation 12 Mar 2024 Tianshuo Peng, Zuchao Li, Lefei Zhang, Hai Zhao, Ping Wang, Bo Du

Large Language Models (LLMs), benefiting from the auto-regressive modelling approach performed on massive unannotated text corpora, demonstrate powerful perceptual and reasoning capabilities.

Visual Question Answering

Hypertext Entity Extraction in Webpage

no code implementations 4 Mar 2024 Yifei Yang, Tianqiao Liu, Bo Shao, Hai Zhao, Linjun Shou, Ming Gong, Daxin Jiang

Webpage entity extraction is a fundamental natural language processing task in both research and applications.

Unveiling Vulnerability of Self-Attention

1 code implementation 26 Feb 2024 Khai Jiet Liong, Hongqiu Wu, Hai Zhao

(2) We introduce S-Attend, a novel smoothing technique that effectively makes SA robust via structural perturbations.

Head-wise Shareable Attention for Large Language Models

1 code implementation 19 Feb 2024 Zouying Cao, Yifei Yang, Hai Zhao

In this paper, we present a perspective on head-wise shareable attention for large language models.

CoCo-Agent: A Comprehensive Cognitive MLLM Agent for Smartphone GUI Automation

2 code implementations 19 Feb 2024 Xinbei Ma, Zhuosheng Zhang, Hai Zhao

We propose a Comprehensive Cognitive LLM Agent, CoCo-Agent, with two novel approaches, comprehensive environment perception (CEP) and conditional action prediction (CAP), to systematically improve the GUI automation performance.

Type prediction

Dissecting Human and LLM Preferences

1 code implementation 17 Feb 2024 Junlong Li, Fan Zhou, Shichao Sun, Yikai Zhang, Hai Zhao, PengFei Liu

As a relative quality comparison of model responses, human and Large Language Model (LLM) preferences serve as common alignment goals in model fine-tuning and criteria in evaluation.

Language Modelling Large Language Model

LaCo: Large Language Model Pruning via Layer Collapse

1 code implementation 17 Feb 2024 Yifei Yang, Zouying Cao, Hai Zhao

Large language models (LLMs) based on transformer are witnessing a notable trend of size expansion, which brings considerable costs to both model training and inference.

Knowledge Distillation Language Modelling +2

GLaPE: Gold Label-agnostic Prompt Evaluation and Optimization for Large Language Model

1 code implementation 4 Feb 2024 Xuanchang Zhang, Zhuosheng Zhang, Hai Zhao

Despite the rapid progress of large language models (LLMs), their task performance remains sensitive to prompt design.

Language Modelling Large Language Model

Sparse is Enough in Fine-tuning Pre-trained Large Language Models

1 code implementation 19 Dec 2023 Weixi Song, Zuchao Li, Lefei Zhang, Hai Zhao, Bo Du

With the prevalence of the pre-training-fine-tuning paradigm, how to efficiently adapt the pre-trained model to downstream tasks has been an intriguing issue.

Language Modelling Large Language Model

A Novel Energy based Model Mechanism for Multi-modal Aspect-Based Sentiment Analysis

1 code implementation 13 Dec 2023 Tianshuo Peng, Zuchao Li, Ping Wang, Lefei Zhang, Hai Zhao

However, previous methods still have certain limitations: (i) They ignore the difference in the focus of visual information between different analysis targets (aspect or sentiment).

Aspect-Based Sentiment Analysis Sentiment Analysis

Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reasoning to Language Agents

1 code implementation 20 Nov 2023 Zhuosheng Zhang, Yao Yao, Aston Zhang, Xiangru Tang, Xinbei Ma, Zhiwei He, Yiming Wang, Mark Gerstein, Rui Wang, Gongshen Liu, Hai Zhao

Large language models (LLMs) have dramatically enhanced the field of language intelligence, as demonstrably evidenced by their formidable empirical performance across a spectrum of complex reasoning tasks.

Multi-grained Evidence Inference for Multi-choice Reading Comprehension

no code implementations 27 Oct 2023 Yilin Zhao, Hai Zhao, Sufeng Duan

Multi-choice Machine Reading Comprehension (MRC) is a major and challenging task for machines to answer questions according to provided options.

Machine Reading Comprehension Multi-Choice MRC +1

Self-prompted Chain-of-Thought on Large Language Models for Open-domain Multi-hop Reasoning

1 code implementation 20 Oct 2023 JinYuan Wang, Junlong Li, Hai Zhao

To further extend this task, we officially introduce open-domain multi-hop reasoning (ODMR) by answering multi-hop questions with explicit reasoning steps in open-domain setting.

In-Context Learning Multi-hop Question Answering +1

Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models

1 code implementation 10 Oct 2023 Anni Zou, Zhuosheng Zhang, Hai Zhao, Xiangru Tang

Large language models (LLMs) have unveiled remarkable reasoning capabilities by exploiting chain-of-thought (CoT) prompting, which generates intermediate reasoning chains to serve as the rationale for deriving the answer.

Empower Nested Boolean Logic via Self-Supervised Curriculum Learning

1 code implementation 9 Oct 2023 Hongqiu Wu, Linfeng Liu, Hai Zhao, Min Zhang

Beyond the great cognitive powers showcased by language models, it is crucial to scrutinize whether their reasoning capabilities stem from strong generalization or merely exposure to relevant data.

Logical Reasoning Self-Supervised Learning

Generative Judge for Evaluating Alignment

1 code implementation 9 Oct 2023 Junlong Li, Shichao Sun, Weizhe Yuan, Run-Ze Fan, Hai Zhao, PengFei Liu

The rapid development of Large Language Models (LLMs) has substantially expanded the range of tasks they can address.

AutoHall: Automated Hallucination Dataset Generation for Large Language Models

no code implementations 30 Sep 2023 Zouying Cao, Yifei Yang, Hai Zhao

While large language models (LLMs) have garnered widespread applications across various domains due to their powerful language understanding and generation capabilities, work on detecting non-factual or hallucinatory content generated by LLMs remains scarce.

Fact Checking Hallucination

Multi-turn Dialogue Comprehension from a Topic-aware Perspective

no code implementations 18 Sep 2023 Xinbei Ma, Yi Xu, Hai Zhao, Zhuosheng Zhang

On the other hand, the split segments are an appropriate element of multi-turn dialogue response selection.

Machine Reading Comprehension

CSPRD: A Financial Policy Retrieval Dataset for Chinese Stock Market

1 code implementation 8 Sep 2023 JinYuan Wang, Hai Zhao, Zhong Wang, Zeyang Zhu, Jinhao Xie, Yong Yu, Yongjian Fei, Yue Huang, Dawei Cheng

In recent years, great advances in pre-trained language models (PLMs) have sparked considerable research interest and achieved promising performance in dense passage retrieval, which aims to retrieve relevant passages from a massive corpus for given questions.

Passage Retrieval Retrieval

Chinese Spelling Correction as Rephrasing Language Model

2 code implementations 17 Aug 2023 Linfeng Liu, Hongqiu Wu, Hai Zhao

However, we note a critical flaw in the process of tagging one character to another, that the correction is excessively conditioned on the error.

Language Modelling Sentence +1

Enhancing Visually-Rich Document Understanding via Layout Structure Modeling

1 code implementation 15 Aug 2023 Qiwei Li, Zuchao Li, Xiantao Cai, Bo Du, Hai Zhao

In this paper, we propose GraphLayoutLM, a novel document understanding model that leverages the modeling of layout structure graph to inject document layout knowledge into the model.

document understanding

Bidirectional Looking with A Novel Double Exponential Moving Average to Adaptive and Non-adaptive Momentum Optimizers

1 code implementation 2 Jul 2023 Yineng Chen, Zuchao Li, Lefei Zhang, Bo Du, Hai Zhao

SGD and Adam are two classical and effective optimizers on which researchers have proposed many variants, such as SGDM and RAdam.

BatGPT: A Bidirectional Autoregessive Talker from Generative Pre-trained Transformer

1 code implementation 1 Jul 2023 Zuchao Li, Shitou Zhang, Hai Zhao, Yifei Yang, Dongjie Yang

BatGPT is a large-scale language model designed and trained jointly by Wuhan University and Shanghai Jiao Tong University.

Language Modelling Question Answering +1

Modeling Hierarchical Reasoning Chains by Linking Discourse Units and Key Phrases for Reading Comprehension

1 code implementation COLING 2022 Jialin Chen, Zhuosheng Zhang, Hai Zhao

Machine reading comprehension (MRC) poses new challenges over logical reasoning, which aims to understand the implicit logical relations entailed in the given contexts and perform inference over them.

Logical Reasoning Machine Reading Comprehension +2

FSUIE: A Novel Fuzzy Span Mechanism for Universal Information Extraction

1 code implementation 19 Jun 2023 Tianshuo Peng, Zuchao Li, Lefei Zhang, Bo Du, Hai Zhao

To address these deficiencies, we propose the Fuzzy Span Universal Information Extraction (FSUIE) framework.

UIE

CMMLU: Measuring massive multitask language understanding in Chinese

1 code implementation 15 Jun 2023 Haonan Li, Yixuan Zhang, Fajri Koto, Yifei Yang, Hai Zhao, Yeyun Gong, Nan Duan, Timothy Baldwin

As the capabilities of large language models (LLMs) continue to advance, evaluating their performance becomes increasingly crucial and challenging.

Large Language Model

Rethinking Masked Language Modeling for Chinese Spelling Correction

1 code implementation 28 May 2023 Hongqiu Wu, Shaohua Zhang, Yuchen Zhang, Hai Zhao

In this paper, we study Chinese Spelling Correction (CSC) as a joint decision made by two separate models: a language model and an error model.

Diversity Domain Generalization +3

Beyond Chain-of-Thought, Effective Graph-of-Thought Reasoning in Language Models

2 code implementations 26 May 2023 Yao Yao, Zuchao Li, Hai Zhao

Therefore, we propose Graph-of-Thought (GoT) reasoning, which models human thought processes not only as a chain but also as a graph.

GSM8K Multimodal Reasoning +1

Pre-training Multi-party Dialogue Models with Latent Discourse Inference

1 code implementation 24 May 2023 Yiyang Li, Xinting Huang, Wei Bi, Hai Zhao

Multi-party dialogues are more difficult for models to understand than one-to-one two-party dialogues, since they involve multiple interlocutors, resulting in interweaving reply-to relations and information flows.

RefGPT: Dialogue Generation of GPT, by GPT, and for GPT

1 code implementation 24 May 2023 Dongjie Yang, Ruifeng Yuan, Yuantao Fan, Yifei Yang, Zili Wang, Shusen Wang, Hai Zhao

Therefore, we propose a method called RefGPT to generate enormous truthful and customized dialogues without worrying about factual errors caused by the model hallucination.

Dialogue Generation Hallucination

Query Rewriting for Retrieval-Augmented Large Language Models

no code implementations 23 May 2023 Xinbei Ma, Yeyun Gong, Pengcheng He, Hai Zhao, Nan Duan

Furthermore, to better align the query to the frozen modules, we propose a trainable scheme for our pipeline.

Language Modelling Multiple-choice +1

Extrapolating Multilingual Understanding Models as Multilingual Generators

no code implementations 22 May 2023 Bohong Wu, Fei Yuan, Hai Zhao, Lei LI, Jingjing Xu

Considering that encoder-based models have the advantage of efficient generation and self-correction abilities, this paper explores methods to endow multilingual understanding models with generation abilities, yielding a unified model.

Denoising Machine Translation +5

EM Pre-training for Multi-party Dialogue Response Generation

1 code implementation 21 May 2023 Yiyang Li, Hai Zhao

Dialogue response generation requires an agent to generate a response according to the current dialogue history. Two-party dialogues have been well studied in this respect, but a great gap remains for multi-party dialogues.

Response Generation

PROM: A Phrase-level Copying Mechanism with Pre-training for Abstractive Summarization

1 code implementation 11 May 2023 Xinbei Ma, Yeyun Gong, Pengcheng He, Hai Zhao, Nan Duan

Based on the remarkable achievements of pre-trained language models in abstractive summarization, the copying mechanism has proved helpful by improving the factuality, stability, and overall performance.

Abstractive Text Summarization

Decker: Double Check with Heterogeneous Knowledge for Commonsense Fact Verification

1 code implementation 10 May 2023 Anni Zou, Zhuosheng Zhang, Hai Zhao

Commonsense fact verification, as a challenging branch of commonsense question-answering (QA), aims to verify through facts whether a given commonsense claim is correct or not.

Fact Verification Question Answering

Attack Named Entity Recognition by Entity Boundary Interference

no code implementations 9 May 2023 Yifei Yang, Hongqiu Wu, Hai Zhao

This is due to the fine-grained nature of NER, as even minor word changes in the sentence can result in the emergence or mutation of any entities, resulting in invalid adversarial examples.

named-entity-recognition Named Entity Recognition +3

Toward Adversarial Training on Contextualized Language Representation

1 code implementation 8 May 2023 Hongqiu Wu, Yongxiang Liu, Hanwen Shi, Hai Zhao, Min Zhang

Based on the observation, we propose simple yet effective Contextualized representation-Adversarial Training (CreAT), in which the attack is explicitly optimized to deviate the contextualized representation of the encoder.

Decoder named-entity-recognition +1

Multimodal Chain-of-Thought Reasoning in Language Models

3 code implementations 2 Feb 2023 Zhuosheng Zhang, Aston Zhang, Mu Li, Hai Zhao, George Karypis, Alex Smola

Experimental results on ScienceQA and A-OKVQA benchmark datasets show the effectiveness of our proposed approach.

Hallucination Language Modelling +1

Channel-aware Decoupling Network for Multi-turn Dialogue Comprehension

no code implementations 10 Jan 2023 Zhuosheng Zhang, Hai Zhao, Longxiang Liu

We decouple the contextualized word representations by masking mechanisms in Transformer-based PrLM, making each word only focus on the words in current utterance, other utterances, and two speaker roles (i.e., utterances of sender and utterances of the receiver), respectively.

Self-Prompting Large Language Models for Zero-Shot Open-Domain QA

1 code implementation 16 Dec 2022 Junlong Li, JinYuan Wang, Zhuosheng Zhang, Hai Zhao

This task becomes notably challenging in a zero-shot setting where no data is available to train tailored retrieval-reader models.

In-Context Learning Open-Domain Question Answering +1

Language Model Pre-training on True Negatives

no code implementations 1 Dec 2022 Zhuosheng Zhang, Hai Zhao, Masao Utiyama, Eiichiro Sumita

Discriminative pre-trained language models (PLMs) learn to predict original texts from intentionally corrupted ones.

Language Modelling

Forging Multiple Training Objectives for Pre-trained Language Models via Meta-Learning

2 code implementations 19 Oct 2022 Hongqiu Wu, Ruixue Ding, Hai Zhao, Boli Chen, Pengjun Xie, Fei Huang, Min Zhang

Multiple pre-training objectives fill the vacancy of the understanding capability of single-objective language modeling, which serves the ultimate purpose of pre-trained language models (PrLMs), generalizing well on a mass of scenarios.

Language Modelling Meta-Learning

Sentence Representation Learning with Generative Objective rather than Contrastive Objective

1 code implementation 16 Oct 2022 Bohong Wu, Hai Zhao

Though offering amazing contextualized token-level representations, current pre-trained language models pay less attention to accurately acquiring sentence-level representations during their self-supervised pre-training.

Representation Learning Retrieval +4

Towards End-to-End Open Conversational Machine Reading

no code implementations 13 Oct 2022 Sizhe Zhou, Siru Ouyang, Zhuosheng Zhang, Hai Zhao

In open-retrieval conversational machine reading (OR-CMR) task, machines are required to do multi-turn question answering given dialogue history and a textual knowledge base.

Decision Making Question Answering +4

Task Compass: Scaling Multi-task Pre-training with Task Prefix

1 code implementation 12 Oct 2022 Zhuosheng Zhang, Shuohang Wang, Yichong Xu, Yuwei Fang, Wenhao Yu, Yang Liu, Hai Zhao, Chenguang Zhu, Michael Zeng

Leveraging task-aware annotated data as supervised signals to assist with self-supervised learning on large-scale unlabeled data has become a new trend in pre-training language models.

Common Sense Reasoning Data Augmentation +4

Instance Regularization for Discriminative Language Model Pre-training

1 code implementation 11 Oct 2022 Zhuosheng Zhang, Hai Zhao, Ming Zhou

They treat training instances equally throughout the training process, with little attention on the individual contribution of those instances.

Denoising Language Modelling +2

Semantic-Preserving Adversarial Code Comprehension

1 code implementation COLING 2022 Yiyang Li, Hongqiu Wu, Hai Zhao

Based on the tremendous success of pre-trained language models (PrLMs) for source code comprehension tasks, current literature studies either ways to further improve the performance (generalization) of PrLMs, or their robustness against adversarial attacks.

Evaluate Confidence Instead of Perplexity for Zero-shot Commonsense Reasoning

no code implementations 23 Aug 2022 Letian Peng, Zuchao Li, Hai Zhao

In detail, it works on PLMs according to the Replaced Token Detection (RTD) pre-training objective in ELECTRA, in which the corruption detection objective reflects the confidence on contextual integrity that is more relevant to commonsense reasoning than existing probability.

Language Modelling Question Answering +1

Learning Better Masking for Better Language Model Pre-training

1 code implementation 23 Aug 2022 Dongjie Yang, Zhuosheng Zhang, Hai Zhao

Masked Language Modeling (MLM) has been widely used as the denoising objective in pre-training language models (PrLMs).

Denoising Language Modelling +1

Rethinking Textual Adversarial Defense for Pre-trained Language Models

no code implementations 21 Jul 2022 Jiayi Wang, Rongzhou Bao, Zhuosheng Zhang, Hai Zhao

However, we find that most existing textual adversarial examples are unnatural and can be easily distinguished by both humans and machines.

Adversarial Attack Adversarial Defense +1

Adversarial Self-Attention for Language Understanding

1 code implementation 25 Jun 2022 Hongqiu Wu, Ruixue Ding, Hai Zhao, Pengjun Xie, Fei Huang, Min Zhang

Deep neural models (e.g., Transformer) naturally learn spurious features, which create a "shortcut" between the labels and inputs, thus impairing the generalization and robustness.

Machine Reading Comprehension Named Entity Recognition (NER) +4

Generative or Contrastive? Phrase Reconstruction for Better Sentence Representation Learning

no code implementations 20 Apr 2022 Bohong Wu, Hai Zhao

If self-supervised learning can be divided into two subcategories, generative and contrastive, then most existing studies show that sentence representation learning may benefit more from contrastive methods than from generative methods.

Contrastive Learning Representation Learning +5

Back to the Future: Bidirectional Information Decoupling Network for Multi-turn Dialogue Modeling

1 code implementation 18 Apr 2022 Yiyang Li, Hai Zhao, Zhuosheng Zhang

Multi-turn dialogue modeling, as a challenging branch of natural language understanding (NLU), aims to build representations for machines to understand human dialogues, which provides a solid foundation for multiple downstream tasks.

Natural Language Understanding

Nested Named Entity Recognition as Holistic Structure Parsing

no code implementations 17 Apr 2022 Yifei Yang, Zuchao Li, Hai Zhao

Thus in order to address this mismatch, this work models the full nested NEs in a sentence as a holistic structure, then we propose a holistic structure parsing algorithm to disclose the entire NEs once for all.

Domain Adaptation named-entity-recognition +4

Lite Unified Modeling for Discriminative Reading Comprehension

1 code implementation ACL 2022 Yilin Zhao, Hai Zhao, Libin Shen, Yinggong Zhao

As a broad and major category in machine reading comprehension (MRC), the generalized goal of discriminative MRC is answer prediction from the given materials.

Decoder Machine Reading Comprehension +2

Distinguishing Non-natural from Natural Adversarial Samples for More Robust Pre-trained Language Model

1 code implementation Findings (ACL) 2022 Jiayi Wang, Rongzhou Bao, Zhuosheng Zhang, Hai Zhao

We question the validity of current evaluation of robustness of PrLMs based on these non-natural adversarial samples and propose an anomaly detector to evaluate the robustness of PrLMs with more natural adversarial samples.

Data Augmentation Language Modelling

Semantics-Preserved Distortion for Personal Privacy Protection in Information Management

no code implementations 4 Jan 2022 Jiajia Li, Lu Yang, Letian Peng, Shitou Zhang, Ping Wang, Zuchao Li, Hai Zhao

In recent years, machine learning - particularly deep learning - has significantly impacted the field of information management.

Attribute Constituency Parsing +7

ArT: All-round Thinker for Unsupervised Commonsense Question-Answering

1 code implementation 26 Dec 2021 Jiawei Wang, Hai Zhao

In detail, our model first focuses on key parts in the given context, and then generates highly related knowledge on such a basis in an association way like human thinking.

Question Answering

Multilingual Pre-training with Universal Dependency Learning

no code implementations NeurIPS 2021 Kailai Sun, Zuchao Li, Hai Zhao

The pre-trained language model (PrLM) demonstrates domination in downstream natural language processing tasks, in which multilingual PrLM takes advantage of language universality to alleviate the issue of limited resources for low-resource languages.

Dependency Parsing Natural Language Understanding +1

Seeking Common but Distinguishing Difference, A Joint Aspect-based Sentiment Analysis Model

1 code implementation EMNLP 2021 Hongjiang Jing, Zuchao Li, Hai Zhao, Shu Jiang

Therefore, we propose a joint ABSA model, which not only enjoys the benefits of encoder sharing but also focuses on the difference to improve the effectiveness of the model.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +1

Tracing Origins: Coreference-aware Machine Reading Comprehension

1 code implementation ACL 2022 Baorong Huang, Zhuosheng Zhang, Hai Zhao

In this paper, we imitate the human reading process in connecting the anaphoric expressions and explicitly leverage the coreference information of the entities to enhance the word embeddings from the pre-trained language model, in order to highlight the coreference mentions of the entities that must be identified for coreference-intensive question answering in QUOREF, a relatively new dataset that is specifically designed to evaluate the coreference-related performance of a model.

Language Modelling Machine Reading Comprehension +2

Structural Characterization for Dialogue Disentanglement

1 code implementation ACL 2022 Xinbei Ma, Zhuosheng Zhang, Hai Zhao

Tangled multi-party dialogue contexts lead to challenges for dialogue reading comprehension, where multiple dialogue threads flow simultaneously within a common dialogue record, increasing difficulties in understanding the dialogue history for both human and machine.

Disentanglement Feature Engineering +1

Sentence-aware Contrastive Learning for Open-Domain Passage Retrieval

no code implementations ACL 2022 Bohong Wu, Zhuosheng Zhang, JinYuan Wang, Hai Zhao

In detail, we introduce an in-passage negative sampling strategy to encourage a diverse generation of sentence representations within the same passage.

Contrastive Learning Passage Retrieval +2

Advances in Multi-turn Dialogue Comprehension: A Survey

no code implementations 11 Oct 2021 Zhuosheng Zhang, Hai Zhao

In this paper, we review the previous methods from the technical perspective of dialogue modeling for the dialogue comprehension task.

Diversity Reading Comprehension

Multi-tasking Dialogue Comprehension with Discourse Parsing

1 code implementation PACLIC 2021 Yuchen He, Zhuosheng Zhang, Hai Zhao

Multi-party dialogue machine reading comprehension (MRC) raises an even more challenging understanding goal on dialogue with more than two involved speakers, compared with the traditional plain passage style MRC.

Discourse Parsing Machine Reading Comprehension +1

Contextualized Semantic Distance between Highly Overlapped Texts

1 code implementation 4 Oct 2021 Letian Peng, Zuchao Li, Hai Zhao

Overlapping frequently occurs in paired texts in natural language processing tasks like text editing and semantic similarity evaluation.

Domain Adaptation Language Modelling +8

Logic Pre-Training of Language Models

no code implementations 29 Sep 2021 Siru Ouyang, Zhuosheng Zhang, Hai Zhao

Pre-trained language models (PrLMs) have been shown useful for enhancing a broad range of natural language understanding (NLU) tasks.

Logical Reasoning Machine Reading Comprehension +4

Sparse Fuzzy Attention for Structured Sentiment Analysis

no code implementations 14 Sep 2021 Letian Peng, Zuchao Li, Hai Zhao

Attention scorers have achieved success in parsing tasks like semantic and syntactic dependency parsing.

Dependency Parsing Sentiment Analysis

Enhanced Speaker-aware Multi-party Multi-turn Dialogue Comprehension

no code implementations 9 Sep 2021 Xinbei Ma, Zhuosheng Zhang, Hai Zhao

Multi-party multi-turn dialogue comprehension brings unprecedented challenges on handling the complicated scenarios from multiple speakers and criss-crossed discourse relationship among speaker-aware utterances.

Question Answering

Self- and Pseudo-self-supervised Prediction of Speaker and Key-utterance for Multi-party Dialogue Reading Comprehension

1 code implementation Findings (EMNLP) 2021 Yiyang Li, Hai Zhao

Multi-party dialogue machine reading comprehension (MRC) brings tremendous challenge since it involves multiple speakers at one dialogue, resulting in intricate speaker information flows and noisy dialogue contexts.

Machine Reading Comprehension Question Answering

Unsupervised Open-Domain Question Answering

no code implementations 31 Aug 2021 Pengfei Zhu, Xiaoguang Li, Jian Li, Hai Zhao

Open-domain Question Answering (ODQA) has achieved significant results in the supervised learning setting.

Machine Reading Comprehension Open-Domain Question Answering

Span Fine-tuning for Pre-trained Language Models

no code implementations Findings (EMNLP) 2021 Rongzhou Bao, Zhuosheng Zhang, Hai Zhao

Pre-trained language models (PrLM) have to carefully manage input units when training on a very large text with a vocabulary consisting of millions of words.

Smoothing Dialogue States for Open Conversational Machine Reading

1 code implementation EMNLP 2021 Zhuosheng Zhang, Siru Ouyang, Hai Zhao, Masao Utiyama, Eiichiro Sumita

In this work, we propose an effective gating strategy by smoothing the two dialogue states in only one decoder and bridge decision making and question generation to provide a richer dialogue state reference.

Decision Making Decoder +3

Cross-lingual Transferring of Pre-trained Contextualized Language Models

no code implementations 27 Jul 2021 Zuchao Li, Kevin Parnow, Hai Zhao, Zhuosheng Zhang, Rui Wang, Masao Utiyama, Eiichiro Sumita

Though the pre-trained contextualized language model (PrLM) has made a significant impact on NLP, training PrLMs in languages other than English can be impractical for two reasons: other languages often lack corpora sufficient for training powerful PrLMs, and because of the commonalities among human languages, computationally expensive PrLM training for different languages is somewhat redundant.

Language Modelling Machine Translation +1

Graph-free Multi-hop Reading Comprehension: A Select-to-Guide Strategy

no code implementations 25 Jul 2021 Bohong Wu, Zhuosheng Zhang, Hai Zhao

Multi-hop reading comprehension (MHRC) requires not only predicting the correct answer span in the given passage, but also providing a chain of supporting evidence for reasoning interpretability.

Multi-Hop Reading Comprehension

Dialogue-oriented Pre-training

1 code implementation Findings (ACL) 2021 Yi Xu, Hai Zhao

Pre-trained language models (PrLMs) have been shown to be powerful in enhancing a broad range of downstream tasks, including various dialogue-related ones.

Language Modelling

Pre-training Universal Language Representation

no code implementations ACL 2021 Yian Li, Hai Zhao

Despite the well-developed cutting-edge representation learning for language, most language representation models usually focus on specific levels of linguistic units.

Question Answering Representation Learning

Defending Pre-trained Language Models from Adversarial Word Substitutions Without Performance Sacrifice

1 code implementation 30 May 2021 Rongzhou Bao, Jiayi Wang, Hai Zhao

In detail, we design an auxiliary anomaly detection classifier and adopt a multi-task learning procedure, by which PrLMs are able to distinguish adversarial input samples.

Adversarial Attack Anomaly Detection +2

Grammatical Error Correction as GAN-like Sequence Labeling

no code implementations Findings (ACL) 2021 Kevin Parnow, Zuchao Li, Hai Zhao

In Grammatical Error Correction (GEC), sequence labeling models enjoy fast inference compared to sequence-to-sequence models; however, inference in sequence labeling GEC models is an iterative process, as sentences are passed to the model for multiple rounds of correction, which exposes the model to sentences with progressively fewer errors at each round.

Grammatical Error Correction

Structural Pre-training for Dialogue Comprehension

no code implementations ACL 2021 Zhuosheng Zhang, Hai Zhao

Pre-trained language models (PrLMs) have demonstrated superior performance due to their strong ability to learn universal language representations from self-supervised pre-training.

Sentence

Fact-driven Logical Reasoning for Machine Reading Comprehension

2 code implementations NeurIPS 2021 Siru Ouyang, Zhuosheng Zhang, Hai Zhao

Recent years have witnessed an increasing interest in training machines with reasoning ability, which deeply relies on accurately and clearly presented clue forms.

Logical Reasoning Machine Reading Comprehension +1

Head-driven Phrase Structure Parsing in O($n^3$) Time Complexity

no code implementations 20 May 2021 Zuchao Li, Junru Zhou, Hai Zhao, Kevin Parnow

Constituent and dependency parsing, the two classic forms of syntactic parsing, have been found to benefit from joint training and decoding under a uniform formalism, Head-driven Phrase Structure Grammar (HPSG).

Dependency Parsing

Neural Unsupervised Semantic Role Labeling

no code implementations 19 Apr 2021 Kashif Munir, Hai Zhao, Zuchao Li

To decompose the task into two argument-related subtasks, identification and clustering, we propose a pipeline that correspondingly consists of two neural modules.

Clustering Semantic Role Labeling +1

Not All Attention Is All You Need

no code implementations NeurIPS 2021 Hongqiu Wu, Hai Zhao, Min Zhang

Beyond the success story of pre-trained language models (PrLMs) in recent natural language processing, they are susceptible to over-fitting due to their unusually large model size.

Document Classification Named Entity Recognition (NER) +1

Advances and Challenges in Unsupervised Neural Machine Translation

no code implementations EACL 2021 Rui Wang, Hai Zhao

Unsupervised cross-lingual language representation initialization methods, together with mechanisms such as denoising and back-translation, have advanced unsupervised neural machine translation (UNMT), which has achieved impressive results.

Denoising Machine Translation +1

Advances in Multi-turn Dialogue Comprehension: A Survey

no code implementations 4 Mar 2021 Zhuosheng Zhang, Hai Zhao

In this paper, we review the previous methods from the technical perspective of dialogue modeling for the dialogue comprehension task.

Diversity Language Modelling +2

Text Compression-aided Transformer Encoding

no code implementations 11 Feb 2021 Zuchao Li, Zhuosheng Zhang, Hai Zhao, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita

In this paper, we propose explicit and implicit text compression approaches to enhance the Transformer encoding and evaluate models using this approach on several typical downstream tasks that rely on the encoding heavily.

Text Compression

Multi-turn Dialogue Reading Comprehension with Pivot Turns and Knowledge

no code implementations 10 Feb 2021 Zhuosheng Zhang, Junlong Li, Hai Zhao

Experimental results on four dialogue comprehension benchmark tasks show that our proposed model achieves great improvements over baselines.

Reading Comprehension

To Understand Representation of Layer-aware Sequence Encoders as Multi-order-graph

no code implementations 16 Jan 2021 Sufeng Duan, Hai Zhao

We also propose a revisited multigraph called Multi-order-Graph (MoG) based on our explanation to model the graph structures in the SAN-based model as subgraphs in MoG and convert the encoding of SAN-based model to the generation of MoG.

Machine Translation Sentence +1

Cross-lingual Transfer Learning for Pre-trained Contextualized Language Models

no code implementations 1 Jan 2021 Zuchao Li, Kevin Barry Parnow, Hai Zhao, Zhuosheng Zhang, Rui Wang, Masao Utiyama, Eiichiro Sumita

Though the pre-trained contextualized language model (PrLM) has made a significant impact on NLP, training PrLMs in languages other than English can be impractical for two reasons: other languages often lack corpora sufficient for training powerful PrLMs, and because of the commonalities among human languages, computationally expensive PrLM training for different languages is somewhat redundant.

Cross-Lingual Transfer Language Modelling +3

Switching-Aligned-Words Data Augmentation for Neural Machine Translation

no code implementations 1 Jan 2021 Fengshun Xiao, Zuchao Li, Hai Zhao

In neural machine translation (NMT), data augmentation methods such as back-translation make it possible to use extra monolingual data to help improve translation performance, but this requires extra training data, and in-domain monolingual data is not always available.

Data Augmentation Machine Translation +3

Efficient Neural Machine Translation with Prior Word Alignment

no code implementations 1 Jan 2021 Jeonghyeok Park, Hai Zhao

In this paper, we propose a novel method that infuses prior word alignment information into neural machine translation (NMT) to provide hints or guidelines for the target sentence at running time.

Machine Translation NMT +3

Later Span Adaptation for Language Understanding

no code implementations 1 Jan 2021 Rongzhou Bao, Zhuosheng Zhang, Hai Zhao

Instead of fixing the linguistic unit input too early, as nearly all previous work did, we propose a novel method that combines span-level information into the representations generated by PrLMs during the fine-tuning phase for better flexibility.

Natural Language Understanding Sentence

Enhancing Pre-trained Language Model with Lexical Simplification

no code implementations 30 Dec 2020 Rongzhou Bao, Jiayi Wang, Zhuosheng Zhang, Hai Zhao

By substituting complex words with simple alternatives, lexical simplification (LS) is a recognized method to reduce such lexical diversity, and therefore to improve the understandability of sentences.

Diversity General Classification +5

Code Summarization with Structure-induced Transformer

1 code implementation Findings (ACL) 2021 Hongqiu Wu, Hai Zhao, Min Zhang

Code summarization (CS) is becoming a promising area in recent language understanding; it aims to automatically generate sensible human language for programming language in the form of source code, for the convenience of programmers during development.

Code Summarization Graph Neural Network +2

BURT: BERT-inspired Universal Representation from Learning Meaningful Segment

no code implementations 28 Dec 2020 Yian Li, Hai Zhao

We present a universal representation model, BURT (BERT-inspired Universal Representation from learning meaningful segmenT), to encode different levels of linguistic unit into the same vector space.

Information Retrieval Question Answering +4

SG-Net: Syntax Guided Transformer for Language Representation

no code implementations 27 Dec 2020 Zhuosheng Zhang, Yuwei Wu, Junru Zhou, Sufeng Duan, Hai Zhao, Rui Wang

In detail, for self-attention network (SAN) sponsored Transformer-based encoder, we introduce syntactic dependency of interest (SDOI) design into the SAN to form an SDOI-SAN with syntax-guided self-attention.

Machine Reading Comprehension Machine Translation +2

Adaptive Convolution for Semantic Role Labeling

no code implementations 27 Dec 2020 Kashif Munir, Hai Zhao, Zuchao Li

Semantic role labeling (SRL) aims at elaborating the meaning of a sentence by forming a predicate-argument structure.

Semantic Role Labeling Sentence

Cross-lingual Universal Dependency Parsing Only from One Monolingual Treebank

no code implementations 24 Dec 2020 Kailai Sun, Zuchao Li, Hai Zhao

As it is unlikely that a treebank can be obtained for every human language, in this work we propose an effective cross-lingual UD parsing framework for transferring a parser from only one source monolingual treebank to any other target language without an available treebank.

Cross-Lingual Transfer Dependency Parsing +3

Reference Knowledgeable Network for Machine Reading Comprehension

1 code implementation 7 Dec 2020 Yilin Zhao, Zhuosheng Zhang, Hai Zhao

Thus we propose a novel reference-based knowledge enhancement model called Reference Knowledgeable Network (RekNet), which simulates human reading strategies to refine critical information from the passage and quote explicit knowledge in necessity.

Machine Reading Comprehension Multi-Choice MRC

LIMIT-BERT : Linguistics Informed Multi-Task BERT

1 code implementation Findings of the Association for Computational Linguistics 2020 Junru Zhou, Zhuosheng Zhang, Hai Zhao, Shuailiang Zhang

Besides, LIMIT-BERT takes a semi-supervised learning strategy to offer the same large amount of linguistics task data as that for the language model training.

Language Modelling Multi-Task Learning +3

Topic-Aware Multi-turn Dialogue Modeling

1 code implementation 26 Sep 2020 Yi Xu, Hai Zhao, Zhuosheng Zhang

In the retrieval-based multi-turn dialogue modeling, it remains a challenge to select the most appropriate response according to extracting salient features in context utterances.

Retrieval

Graph-to-Sequence Neural Machine Translation

no code implementations 16 Sep 2020 Sufeng Duan, Hai Zhao, Rui Wang

Given that current NMT models more or less capture graph information among the sequence in a latent way, we present a graph-to-sequence model that facilitates explicit graph information capturing.

Graph-to-Sequence Machine Translation +3

Document-level Neural Machine Translation with Document Embeddings

no code implementations 16 Sep 2020 Shu Jiang, Hai Zhao, Zuchao Li, Bao-liang Lu

Standard neural machine translation (NMT) assumes that translation is independent of document-level context.

Machine Translation NMT +1

Filling the Gap of Utterance-aware and Speaker-aware Representation for Multi-turn Dialogue

1 code implementation 14 Sep 2020 Longxiang Liu, Zhuosheng Zhang, Hai Zhao, Xi Zhou, Xiang Zhou

A multi-turn dialogue is composed of multiple utterances from two or more different speaker roles.

Retrieval

Composing Answer from Multi-spans for Reading Comprehension

no code implementations 14 Sep 2020 Zhuosheng Zhang, Yiqing Zhang, Hai Zhao, Xi Zhou, Xiang Zhou

This paper presents a novel method to generate answers for non-extraction machine reading comprehension (MRC) tasks whose answers cannot be simply extracted as one span from the given passages.

Decoder Machine Reading Comprehension

Syntax Role for Neural Semantic Role Labeling

no code implementations CL (ACL) 2021 Zuchao Li, Hai Zhao, Shexia He, Jiaxun Cai

Semantic role labeling (SRL) is dedicated to recognizing the semantic predicate-argument structure of a sentence.

Semantic Role Labeling Sentence

Dialogue-adaptive Language Model Pre-training From Quality Estimation

1 code implementation 10 Sep 2020 Junlong Li, Zhuosheng Zhang, Hai Zhao

Pre-trained language models (PrLMs) have achieved great success on a wide range of natural language processing tasks by virtue of the universal language representation ability obtained by self-supervised learning on a large corpus.

Informativeness Language Modelling +2

Learning Universal Representations from Word to Sentence

no code implementations 10 Sep 2020 Yian Li, Hai Zhao

Despite the well-developed cutting-edge representation learning for language, most language representation models usually focus on a specific level of linguistic unit, which causes great inconvenience when multiple layers of linguistic objects must be handled in a unified way.

Representation Learning Sentence

Machine Reading Comprehension: The Role of Contextualized Language Models and Beyond

1 code implementation 13 May 2020 Zhuosheng Zhang, Hai Zhao, Rui Wang

In this survey, we provide a comprehensive and comparative review on MRC covering overall research topics about 1) the origin and development of MRC and CLM, with a particular focus on the role of CLMs; 2) the impact of MRC and CLM to the NLP community; 3) the definition, datasets, and evaluation of MRC; 4) general MRC architecture and technical methods in the view of two-stage Encoder-Decoder solving architecture from the insights of the cognitive process of humans; 5) previous highlights, emerging topics, and our empirical analysis, among which we especially focus on what works in different periods of MRC researches.

Decoder Machine Reading Comprehension +1

Data-dependent Gaussian Prior Objective for Language Generation

no code implementations ICLR 2020 Zuchao Li, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Zhuosheng Zhang, Hai Zhao

However, MLE focuses on once-to-all matching between the predicted sequence and gold-standard, consequently treating all incorrect predictions as being equally incorrect.

Diversity Image Captioning +5

Bipartite Flat-Graph Network for Nested Named Entity Recognition

1 code implementation ACL 2020 Ying Luo, Hai Zhao

In this paper, we propose a novel bipartite flat-graph network (BiFlaG) for nested named entity recognition (NER), which contains two subgraph modules: a flat NER module for outermost entities and a graph module for all the entities located in inner layers.

named-entity-recognition Named Entity Recognition +3

Neural Machine Translation with Universal Visual Representation

1 code implementation ICLR 2020 Zhuosheng Zhang, Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita, Zuchao Li, Hai Zhao

Though visual information has been introduced for enhancing neural machine translation (NMT), its effectiveness strongly relies on the availability of large amounts of bilingual parallel sentence pairs with manual image annotations.

Decoder Machine Translation +3

Capsule-Transformer for Neural Machine Translation

no code implementations 30 Apr 2020 Sufeng Duan, Juncheng Cao, Hai Zhao

In this paper, we thus propose the capsule-Transformer, which extends the linear transformation into a more general capsule routing algorithm by taking SAN as a special case of capsule network.

Machine Translation Translation

BURT: BERT-inspired Universal Representation from Twin Structure

no code implementations 29 Apr 2020 Yian Li, Hai Zhao

Pre-trained contextualized language models such as BERT have shown great effectiveness in a wide range of downstream Natural Language Processing (NLP) tasks.

Natural Language Inference Sentence +3

Knowledgeable Dialogue Reading Comprehension on Key Turns

no code implementations 29 Apr 2020 Junlong Li, Zhuosheng Zhang, Hai Zhao

In this paper, the relevance of each turn to the question is calculated to choose key turns.

Answer Selection Language Modelling +1

Syntax-aware Data Augmentation for Neural Machine Translation

no code implementations 29 Apr 2020 Sufeng Duan, Hai Zhao, Dong-dong Zhang, Rui Wang

Data augmentation is an effective performance enhancement in neural machine translation (NMT) by generating additional bilingual data.

Data Augmentation Machine Translation +3

Semantics-Aware Inferential Network for Natural Language Understanding

no code implementations 28 Apr 2020 Shuailiang Zhang, Hai Zhao, Junru Zhou

Taking explicit contextualized semantics as a complementary input, the inferential module of SAIN enables a series of reasoning steps over semantic clues through an attention mechanism.

Machine Reading Comprehension Natural Language Inference +1

Reference Language based Unsupervised Neural Machine Translation

1 code implementation Findings of the Association for Computational Linguistics 2020 Zuchao Li, Hai Zhao, Rui Wang, Masao Utiyama, Eiichiro Sumita

Further enriching the idea of pivot translation by extending the use of parallel corpora beyond the source-target paradigm, we propose a new reference language-based framework for UNMT, RUNMT, in which the reference language only shares a parallel corpus with the source, but this corpus still indicates a signal clear enough to help the reconstruction training of UNMT through a proposed reference agreement mechanism.

Machine Translation Translation

Retrospective Reader for Machine Reading Comprehension

2 code implementations 27 Jan 2020 Zhuosheng Zhang, Junjie Yang, Hai Zhao

Inspired by how humans solve reading comprehension questions, we propose a retrospective reader (Retro-Reader) that integrates two stages of reading and verification strategies: 1) sketchy reading that briefly investigates the overall interactions of passage and question and yields an initial judgment; 2) intensive reading that verifies the answer and gives the final prediction.

Machine Reading Comprehension Question Answering

DUMA: Reading Comprehension with Transposition Thinking

3 code implementations 26 Jan 2020 Pengfei Zhu, Hai Zhao, Xiaoguang Li

Multi-choice Machine Reading Comprehension (MRC) requires a model to decide the correct answer from a set of answer options when given a passage and a question.

Language Modelling Machine Reading Comprehension +1

Dual Multi-head Co-attention for Multi-choice Reading Comprehension

no code implementations 1 Jan 2020 Pengfei Zhu, Hai Zhao, Xiaoguang Li

Multi-choice Machine Reading Comprehension (MRC) requires a model to decide the correct answer from a set of answer options when given a passage and a question.

Language Modelling Machine Reading Comprehension +1

Explicit Sentence Compression for Neural Machine Translation

1 code implementation 27 Dec 2019 Zuchao Li, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Zhuosheng Zhang, Hai Zhao

In this paper, we propose an explicit sentence compression method to enhance the source sentence representation for NMT.

Decoder Machine Translation +4

Korean-to-Chinese Machine Translation using Chinese Character as Pivot Clue

2 code implementations 25 Nov 2019 Jeonghyeok Park, Hai Zhao

Korean-Chinese is a low resource language pair, but Korean and Chinese have a lot in common in terms of vocabulary.

Machine Translation Translation

Global Greedy Dependency Parsing

1 code implementation 20 Nov 2019 Zuchao Li, Hai Zhao, Kevin Parnow

Most syntactic dependency parsing models may fall into one of two categories: transition- and graph-based models.

Dependency Parsing Re-Ranking +1

Dependency and Span, Cross-Style Semantic Role Labeling on PropBank and NomBank

no code implementations 7 Nov 2019 Zuchao Li, Hai Zhao, Junru Zhou, Kevin Parnow, Shexia He

In this paper, we define a new cross-style semantic role label convention and propose a new cross-style joint optimization model designed around the most basic linguistic meaning of a semantic role, providing a solution to make the results of the two styles more comparable and allowing both formalisms of SRL to benefit from their natural connections in both linguistics and computation.

Semantic Role Labeling

Probing Contextualized Sentence Representations with Visual Awareness

no code implementations 7 Nov 2019 Zhuosheng Zhang, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Hai Zhao

We present a universal framework to model contextualized sentence representations with visual awareness that is motivated to overcome the shortcomings of the multimodal parallel data with manual annotations.

Diversity Machine Translation +3

Hierarchical Contextualized Representation for Named Entity Recognition

1 code implementation 6 Nov 2019 Ying Luo, Fengshun Xiao, Hai Zhao

In this paper, we address these two deficiencies and propose a model augmented with hierarchical contextualized representation: sentence-level representation and document-level representation.

Ranked #13 on Named Entity Recognition (NER) on Ontonotes v5 (English) (using extra training data)

named-entity-recognition Named Entity Recognition +2

Deepening Hidden Representations from Pre-trained Language Models

no code implementations 5 Nov 2019 Junjie Yang, Hai Zhao

Transformer-based pre-trained language models have proven to be effective for learning contextualized language representation.

Natural Language Understanding

SJTU-NICT at MRP 2019: Multi-Task Learning for End-to-End Uniform Semantic Graph Parsing

no code implementations CONLL 2019 Zuchao Li, Hai Zhao, Zhuosheng Zhang, Rui Wang, Masao Utiyama, Eiichiro Sumita

This paper describes our SJTU-NICT's system for participating in the shared task on Cross-Framework Meaning Representation Parsing (MRP) at the 2019 Conference for Computational Language Learning (CoNLL).

Multi-Task Learning

SJTU at MRP 2019: A Transition-Based Multi-Task Parser for Cross-Framework Meaning Representation Parsing

no code implementations CONLL 2019 Hongxiao Bai, Hai Zhao

This paper describes the system of our team SJTU for our participation in the CoNLL 2019 Shared Task: Cross-Framework Meaning Representation Parsing.

Sentence

LIMIT-BERT : Linguistic Informed Multi-Task BERT

no code implementations 31 Oct 2019 Junru Zhou, Zhuosheng Zhang, Hai Zhao, Shuailiang Zhang

In this paper, we present a Linguistic Informed Multi-Task BERT (LIMIT-BERT) for learning language representations across multiple linguistic tasks by Multi-Task Learning (MTL).

Multi-Task Learning POS +2

Attention Is All You Need for Chinese Word Segmentation

1 code implementation EMNLP 2020 Sufeng Duan, Hai Zhao

Taking the greedy decoding algorithm as given, this work focuses on further strengthening the model itself for Chinese word segmentation (CWS), which results in an even faster and more accurate CWS model.

Chinese Word Segmentation Decoder +1