no code implementations • 15 Feb 2024 • Xinran Chen, Sufeng Duan, Gongshen Liu
Being one of the IR-NAT (Iterative-refinement-based NAT) frameworks, the Conditional Masked Language Model (CMLM) adopts the mask-predict paradigm to re-predict the masked low-confidence tokens.
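The mask-predict loop can be sketched as follows. This is a minimal illustration, not CMLM's exact implementation: `decoder` is a hypothetical model call returning per-position probability distributions, and the linear re-masking schedule is an assumption.

```python
import numpy as np

def mask_predict(decoder, src, tgt_len, iterations=4, mask_id=-1):
    """Minimal mask-predict sketch: iteratively re-predict the
    lowest-confidence tokens.

    `decoder(src, tokens)` is a hypothetical model call returning a
    (tgt_len, vocab) array of per-position probabilities.
    """
    tokens = np.full(tgt_len, mask_id)                # start fully masked
    probs = np.zeros(tgt_len)
    for t in range(iterations):
        dist = decoder(src, tokens)                   # (tgt_len, vocab)
        update = tokens == mask_id                    # only refill masked slots
        tokens = np.where(update, dist.argmax(axis=-1), tokens)
        probs = np.where(update, dist.max(axis=-1), probs)
        # Linearly decay how many tokens get re-masked next round
        n_mask = int(tgt_len * (1 - (t + 1) / iterations))
        if n_mask == 0:
            break
        lowest = np.argsort(probs)[:n_mask]           # lowest-confidence slots
        tokens[lowest] = mask_id
        probs[lowest] = 0.0
    return tokens
```

The key property shown is that confident predictions are kept across iterations while low-confidence positions are masked and re-predicted.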
no code implementations • 27 Oct 2023 • Yilin Zhao, Hai Zhao, Sufeng Duan
Multi-choice Machine Reading Comprehension (MRC) is a major and challenging task that requires machines to answer questions by selecting from provided options.
no code implementations • 16 Jan 2021 • Sufeng Duan, Hai Zhao
We also propose a revisited multigraph called Multi-order-Graph (MoG), based on our explanation, to model the graph structures in the SAN-based model as subgraphs in MoG and to convert the encoding of the SAN-based model into the generation of MoG.
no code implementations • 27 Dec 2020 • Zhuosheng Zhang, Yuwei Wu, Junru Zhou, Sufeng Duan, Hai Zhao, Rui Wang
In detail, for the self-attention network (SAN) that powers the Transformer-based encoder, we introduce a syntactic dependency of interest (SDOI) design into the SAN to form an SDOI-SAN with syntax-guided self-attention.
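As an illustrative sketch (not the paper's exact formulation), syntax-guided self-attention can be realized as an attention mask that restricts each token to attend only to itself and its dependency ancestors. Here `heads` gives each token's assumed dependency head index, with -1 marking the root; learned projections and multiple heads are omitted.

```python
import numpy as np

def sdoi_attention(Q, K, V, heads):
    """Single-head attention with a syntax-derived mask: token i may
    attend only to itself and its ancestors in the dependency tree.
    Illustrative sketch only, not the SDOI-SAN implementation.
    """
    n, d = Q.shape
    mask = np.zeros((n, n), dtype=bool)
    for i in range(n):
        j = i
        while j != -1:                       # walk up to the root
            mask[i, j] = True
            j = heads[j]
    scores = Q @ K.T / np.sqrt(d)
    scores = np.where(mask, scores, -1e9)    # block non-SDOI positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V
```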
no code implementations • 16 Sep 2020 • Sufeng Duan, Hai Zhao, Rui Wang
Given that current NMT models more or less capture graph information among the sequence in a latent way, we present a graph-to-sequence model that facilitates explicit graph information capturing.
no code implementations • 30 Apr 2020 • Sufeng Duan, Juncheng Cao, Hai Zhao
In this paper, we thus propose the capsule-Transformer, which extends the linear transformation into a more general capsule routing algorithm by taking SAN as a special case of capsule network.
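The routing step that generalizes SAN's linear transformation can be sketched with the classic dynamic routing-by-agreement procedure from capsule networks. This is a minimal sketch of capsule routing in general, not the capsule-Transformer's exact algorithm; `u_hat` holds the prediction vectors from input to output capsules.

```python
import numpy as np

def squash(s, axis=-1):
    """Capsule nonlinearity: shrink vectors to norm < 1."""
    norm2 = (s ** 2).sum(axis=axis, keepdims=True)
    return norm2 / (1 + norm2) * s / np.sqrt(norm2 + 1e-9)

def dynamic_routing(u_hat, iterations=3):
    """Routing-by-agreement over prediction vectors u_hat of shape
    (n_in, n_out, d): coupling coefficients are iteratively refined
    toward output capsules that agree with their inputs.
    """
    n_in, n_out, d = u_hat.shape
    b = np.zeros((n_in, n_out))                           # routing logits
    for _ in range(iterations):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)  # softmax
        s = (c[..., None] * u_hat).sum(axis=0)            # (n_out, d)
        v = squash(s)
        b = b + (u_hat * v[None]).sum(axis=-1)            # agreement update
    return v
```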
no code implementations • 29 Apr 2020 • Sufeng Duan, Hai Zhao, Dong-dong Zhang, Rui Wang
Data augmentation effectively enhances performance in neural machine translation (NMT) by generating additional bilingual data.
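A toy token-blanking scheme (not this paper's method) illustrates the general idea of generating additional bilingual pairs from existing ones:

```python
import random

def augment_pair(src_tokens, tgt_tokens, p=0.1, blank="<blank>", seed=None):
    """Toy augmentation: randomly blank source tokens to yield an extra
    bilingual training pair. Illustrative only; the blank token and
    blanking probability are assumptions.
    """
    rng = random.Random(seed)
    new_src = [blank if rng.random() < p else t for t in src_tokens]
    return new_src, list(tgt_tokens)
```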
1 code implementation • EMNLP 2020 • Sufeng Duan, Hai Zhao
Taking the greedy decoding algorithm as given, this work focuses on further strengthening the model itself for Chinese word segmentation (CWS), resulting in an even faster and more accurate CWS model.
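Greedy decoding for CWS can be sketched as follows, assuming a BMES tag scheme (an assumption for illustration): take the argmax tag per character and cut a new word at every B or S.

```python
def greedy_segment(chars, tag_scores):
    """Greedy BMES decoding sketch for CWS: argmax tag per character,
    word boundaries cut at B/S. Tag order assumed: 0=B, 1=M, 2=E, 3=S.
    """
    tags = [max(range(4), key=lambda k: row[k]) for row in tag_scores]
    words, cur = [], ""
    for ch, t in zip(chars, tags):
        if t in (0, 3) and cur:      # B or S starts a new word
            words.append(cur)
            cur = ""
        cur += ch
    if cur:
        words.append(cur)
    return words
```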
1 code implementation • 14 Aug 2019 • Zhuosheng Zhang, Yuwei Wu, Junru Zhou, Sufeng Duan, Hai Zhao, Rui Wang
In detail, for the self-attention network (SAN) that powers the Transformer-based encoder, we introduce a syntactic dependency of interest (SDOI) design into the SAN to form an SDOI-SAN with syntax-guided self-attention.
Ranked #5 on Question Answering on SQuAD2.0 dev
no code implementations • 6 Nov 2018 • Sufeng Duan, Jiangtong Li, Hai Zhao
Rapidly developed neural models have achieved competitive performance in Chinese word segmentation (CWS) compared with their traditional counterparts.