no code implementations • EMNLP (ACL) 2021 • Hai Zhao, Rui Wang, Kehai Chen
This tutorial surveys the latest technical progress of syntactic parsing and the role of syntax in end-to-end natural language processing (NLP) tasks, in which semantic role labeling (SRL) and machine translation (MT) are representative NLP tasks that have long benefited from informative syntactic clues, even as end-to-end deep learning models continue to produce new results.
no code implementations • Findings (ACL) 2022 • Kehai Chen, Masao Utiyama, Eiichiro Sumita, Rui Wang, Min Zhang
Machine translation typically adopts an encoder-decoder framework, in which the decoder generates the target sentence word-by-word in an auto-regressive manner.
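As a reference point only, a minimal greedy decoding loop illustrates the auto-regressive, word-by-word generation described above; the `model.encode` and `model.decode_step` interfaces are hypothetical stand-ins, not the paper's implementation.

```python
# Minimal sketch of auto-regressive decoding in an encoder-decoder model.
# `model.encode` / `model.decode_step` are hypothetical interfaces for illustration.
import torch

def greedy_decode(model, src_ids, bos_id, eos_id, max_len=64):
    memory = model.encode(src_ids)                   # encode the source sentence once
    ys = [bos_id]
    for _ in range(max_len):
        logits = model.decode_step(torch.tensor([ys]), memory)  # condition on all previously generated words
        next_id = int(logits[0, -1].argmax())                   # greedily pick the most likely next word
        ys.append(next_id)
        if next_id == eos_id:                        # stop at the end-of-sentence token
            break
    return ys
```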
no code implementations • WMT (EMNLP) 2020 • Zuchao Li, Hai Zhao, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita
In this paper, we introduced our joint team SJTU-NICT's participation in the WMT 2020 machine translation shared task.
1 code implementation • 21 May 2025 • Hongli Zhou, Hui Huang, Ziqing Zhao, Lvyuan Han, Huicheng Wang, Kehai Chen, Muyun Yang, Wei Bao, Jian Dong, Bing Xu, Conghui Zhu, Hailong Cao, Tiejun Zhao
The evaluation of large language models (LLMs) via benchmarks is widespread, yet inconsistencies between different leaderboards and poor separability among top models raise concerns about their ability to accurately reflect authentic model capabilities.
no code implementations • 18 Mar 2025 • Zhengsheng Guo, Linwei Zheng, Xinyang Chen, Xuefeng Bai, Kehai Chen, Min Zhang
While human cognition inherently retrieves information from diverse and specialized knowledge sources during decision-making processes, current Retrieval-Augmented Generation (RAG) systems typically operate through single-source knowledge retrieval, leading to a cognitive-algorithmic discrepancy.
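To make the contrast with single-source retrieval concrete, the sketch below queries several specialized retrievers and merges their evidence before generation; the `retrievers`, `search`, and `llm.generate` names are assumptions for illustration, not the proposed system.

```python
# Generic sketch of multi-source retrieval-augmented generation (illustrative only).
def multi_source_rag(question, retrievers, llm, top_k=3):
    evidence = []
    for source_name, retriever in retrievers.items():
        # query each specialized knowledge source independently
        for passage in retriever.search(question, top_k=top_k):
            evidence.append(f"[{source_name}] {passage}")
    prompt = ("Answer the question using the evidence below.\n"
              + "\n".join(evidence)
              + f"\nQuestion: {question}")
    return llm.generate(prompt)
```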
no code implementations • 13 Mar 2025 • Qiyuan Deng, Xuefeng Bai, Kehai Chen, YaoWei Wang, Liqiang Nie, Min Zhang
Reinforcement Learning (RL) algorithms for safety alignment of Large Language Models (LLMs), such as Direct Preference Optimization (DPO), encounter the challenge of distribution shift.
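For context, a standard DPO objective (as commonly formulated, not this paper's contribution) pushes the policy to prefer the chosen response over the rejected one relative to a frozen reference model:

```python
# Standard DPO loss over summed log-probabilities of chosen/rejected responses.
import torch.nn.functional as F

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    # implicit rewards: policy log-prob margins relative to the reference model
    chosen_reward = beta * (logp_chosen - ref_logp_chosen)
    rejected_reward = beta * (logp_rejected - ref_logp_rejected)
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()
```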
no code implementations • 13 Mar 2025 • Henglyu Liu, Andong Chen, Kehai Chen, Xuefeng Bai, Meizhi Zhong, Yuan Qiu, Min Zhang
Recent advancement of large language models (LLMs) has led to significant breakthroughs across various tasks, laying the foundation for the development of LLM-based speech translation systems.
no code implementations • 10 Mar 2025 • Zhenyu Li, Kehai Chen, Yunfei Long, Xuefeng Bai, Yaoyin Zhang, Xuchen Wei, Juntao Li, Min Zhang
Large Language Models (LLMs) have demonstrated remarkable instruction-following capabilities across various applications.
no code implementations • 8 Mar 2025 • Dingkun Zhang, Shuhan Qi, Xinyu Xiao, Kehai Chen, Xuan Wang
Considering the heavy cost of training multimodal large language models (MLLMs), it is necessary to reuse existing ones and further extend them to more modalities through Modality-incremental Continual Learning (MCL).
1 code implementation • 4 Mar 2025 • Xingzuo Li, Kehai Chen, Yunfei Long, Xuefeng Bai, Yong Xu, Min Zhang
Large language model (LLM) agents typically adopt a step-by-step reasoning framework, in which they interleave the processes of thinking and acting to accomplish the given task.
no code implementations • 28 Feb 2025 • Yihong Tang, Kehai Chen, Xuefeng Bai, ZhengYu Niu, Bo wang, Jie Liu, Min Zhang
Large Language Models (LLMs) have made remarkable advances in role-playing dialogue agents, demonstrating their utility in character simulations.
no code implementations • 25 Feb 2025 • Zhiyu Yin, Kehai Chen, Xuefeng Bai, Ruili Jiang, Juntao Li, Hongdong Li, Jin Liu, Yang Xiang, Jun Yu, Min Zhang
Video generation, by leveraging a dynamic visual generation method, pushes the boundaries of Artificial Intelligence Generated Content (AIGC).
no code implementations • 17 Feb 2025 • Andong Chen, Yuchen Song, Wenxin Zhu, Kehai Chen, Muyun Yang, Tiejun Zhao, Min Zhang
o1-like LLMs are transforming AI by simulating human cognitive processes, but their performance in multilingual machine translation (MMT) remains underexplored.
no code implementations • 17 Feb 2025 • Hongbin Zhang, Kehai Chen, Xuefeng Bai, Xiucheng Li, Min Zhang
Large language models (LLMs) have succeeded remarkably in multilingual translation tasks.
1 code implementation • 18 Dec 2024 • Yingjie Zhu, Xuefeng Bai, Kehai Chen, Yang Xiang, Min Zhang
To study the reason behind these limitations, we propose VGCure, a comprehensive benchmark covering 22 tasks for examining the fundamental graph understanding and reasoning capacities of LVLMs.
no code implementations • 17 Dec 2024 • Hongbin Zhang, Kehai Chen, Xuefeng Bai, Yang Xiang, Min Zhang
Large language models (LLMs) have demonstrated impressive multilingual understanding and reasoning capabilities, driven by extensive pre-training on multilingual corpora and fine-tuning on instruction data.
no code implementations • 17 Dec 2024 • Andong Chen, Yuchen Song, Kehai Chen, Muyun Yang, Tiejun Zhao, Min Zhang
Visual information has been introduced for enhancing machine translation (MT), and its effectiveness heavily relies on the availability of large amounts of bilingual parallel sentence pairs with manual image annotations.
no code implementations • 17 Dec 2024 • Mufan Xu, Kehai Chen, Xuefeng Bai, Muyun Yang, Tiejun Zhao, Min Zhang
Large language models (LLMs) based on the generative pre-trained Transformer have achieved remarkable performance on knowledge graph question-answering (KGQA) tasks.
no code implementations • 12 Dec 2024 • Meizhi Zhong, Xikai Liu, Chen Zhang, Yikun Lei, Yan Gao, Yao Hu, Kehai Chen, Min Zhang
To accelerate LLM inference, storing computed key-value (KV) caches in memory has become the standard technique.
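The standard technique referred to here is KV caching; a minimal decoding loop in a HuggingFace-style interface (assumed for illustration) looks like this:

```python
# Sketch of decoding with a KV cache: keys/values from earlier steps are stored
# and reused so each new token is processed without recomputing past states.
import torch

def decode_with_kv_cache(model, input_ids, max_new_tokens=32):
    past_key_values = None
    generated = input_ids
    for _ in range(max_new_tokens):
        # first step processes the whole prompt; later steps feed only the newest token
        inputs = generated if past_key_values is None else generated[:, -1:]
        out = model(inputs, past_key_values=past_key_values, use_cache=True)
        past_key_values = out.past_key_values        # cache grows with each decoded token
        next_token = out.logits[:, -1].argmax(dim=-1, keepdim=True)
        generated = torch.cat([generated, next_token], dim=-1)
    return generated
```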
1 code implementation • 10 Dec 2024 • Shaoqing Zhang, Zhuosheng Zhang, Kehai Chen, Rongxiang Weng, Muyun Yang, Tiejun Zhao, Min Zhang
This vulnerability poses significant risks to real-world applications.
no code implementations • 18 Oct 2024 • Chen Zhang, Meizhi Zhong, Qimeng Wang, Xuantao Lu, Zheyu Ye, Chengqiang Lu, Yan Gao, Yao Hu, Kehai Chen, Min Zhang, Dawei Song
Long-context efficiency has recently become a trending topic in serving large language models (LLMs).
no code implementations • 16 Oct 2024 • Andong Chen, Kehai Chen, Yang Xiang, Xuefeng Bai, Muyun Yang, Yang Feng, Tiejun Zhao, Min Zhang
The remarkable understanding and generation capabilities of large language models (LLMs) have greatly improved translation performance.
1 code implementation • 2 Oct 2024 • Yu Zhang, Kehai Chen, Xuefeng Bai, Zhao Kang, Quanjiang Guo, Min Zhang
Knowledge graph question answering (KGQA) involves answering natural language questions by leveraging structured information stored in a knowledge graph.
1 code implementation • 1 Oct 2024 • Shaoqing Zhang, Zhuosheng Zhang, Kehai Chen, Xinbei Ma, Muyun Yang, Tiejun Zhao, Min Zhang
However, a key challenge lies in devising effective plans to guide action prediction in GUI tasks, even though planning has been widely recognized as effective for decomposing complex tasks into a series of steps.
1 code implementation • 30 Aug 2024 • Weijie Liu, Zecheng Tang, Juntao Li, Kehai Chen, Min Zhang
This work introduces MemLong: Memory-Augmented Retrieval for Long Text Generation, a method designed to enhance the capabilities of long-context language modeling by utilizing an external retriever for historical information retrieval.
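As a rough illustration of the general idea (not MemLong's actual implementation), past context can be chunked into an external store and the most similar chunks retrieved before generating the next segment; `embed` and `llm.generate` are assumed helpers.

```python
# Generic sketch of memory-augmented long-text generation with an external retriever.
import numpy as np

class ChunkMemory:
    def __init__(self, embed):
        self.embed = embed                      # hypothetical text -> vector function
        self.chunks, self.vectors = [], []

    def add(self, text):
        self.chunks.append(text)
        self.vectors.append(self.embed(text))

    def retrieve(self, query, top_k=2):
        q = self.embed(query)
        sims = np.array([v @ q for v in self.vectors])
        return [self.chunks[i] for i in sims.argsort()[::-1][:top_k]]

def generate_with_memory(llm, memory, prompt):
    context = "\n".join(memory.retrieve(prompt))  # retrieved historical information
    return llm.generate(context + "\n" + prompt)
```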
no code implementations • 26 Aug 2024 • Zelin Li, Kehai Chen, Lemao Liu, Xuefeng Bai, Mingming Yang, Yang Xiang, Min Zhang
In this paper, we analyze the core mechanisms of previous predominant adversarial attack methods, revealing that 1) the distributions of importance scores differ markedly among victim models, restricting transferability; 2) the sequential attack process induces substantial time overhead.
1 code implementation • 25 Aug 2024 • Xingzuo Li, Kehai Chen, Yunfei Long, Min Zhang
Large language models (LLMs) have created a new paradigm for natural language processing.
no code implementations • 19 Aug 2024 • Andong Chen, Lianzhang Lou, Kehai Chen, Xuefeng Bai, Yang Xiang, Muyun Yang, Tiejun Zhao, Min Zhang
Different from the traditional translation tasks, classical Chinese poetry translation requires both adequacy and fluency in translating culturally and historically significant content and linguistic poetic elegance.
no code implementations • 19 Jun 2024 • Meizhi Zhong, Chen Zhang, Yikun Lei, Xikai Liu, Yan Gao, Yao Hu, Kehai Chen, Min Zhang
Enabling LLMs to handle lengthy context is currently a research hotspot.
no code implementations • 17 Jun 2024 • Ruili Jiang, Kehai Chen, Xuefeng Bai, Zhixuan He, Juntao Li, Muyun Yang, Tiejun Zhao, Liqiang Nie, Min Zhang
In this survey, we review the progress in exploring human preference learning for LLMs from a preference-centered perspective, covering the sources and formats of preference feedback, the modeling and usage of preference signals, as well as the evaluation of the aligned LLMs.
1 code implementation • 11 Jun 2024 • Meizhi Zhong, Kehai Chen, Zhengshan Xue, Lemao Liu, Mingming Yang, Min Zhang
It is widely known that hallucination is a critical issue in Simultaneous Machine Translation (SiMT) due to the absence of source-side information.
1 code implementation • 11 Jun 2024 • Andong Chen, Lianzhang Lou, Kehai Chen, Xuefeng Bai, Yang Xiang, Muyun Yang, Tiejun Zhao, Min Zhang
Recently, large language models (LLMs) enhanced by self-reflection have achieved promising performance on machine translation.
1 code implementation • 11 Jun 2024 • Hongbin Zhang, Kehai Chen, Xuefeng Bai, Yang Xiang, Min Zhang
Large language models (LLMs) have showcased impressive multilingual machine translation ability.
no code implementations • 17 May 2024 • Yixin Ji, Yang Xiang, Juntao Li, Wei Chen, Zhongyi Liu, Kehai Chen, Min Zhang
To address the challenges of low-rank compression in LLMs, we conduct empirical research on the low-rank characteristics of large models.
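One common form of low-rank compression (shown here only as a generic baseline, not the paper's method) factorizes a weight matrix with a truncated SVD:

```python
# Replace a dense weight W with two thin factors A and B so that W ≈ A @ B,
# cutting parameters when the rank is much smaller than the matrix dimensions.
import torch

def low_rank_factorize(W, rank):
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]                  # (out_dim, rank)
    B = Vh[:rank, :]                            # (rank, in_dim)
    return A, B

W = torch.randn(1024, 1024)
A, B = low_rank_factorize(W, rank=128)
rel_error = torch.norm(W - A @ B) / torch.norm(W)   # quality of the approximation
```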
no code implementations • 12 Feb 2024 • Zhengsheng Guo, Zhiwei He, Wenxiang Jiao, Xing Wang, Rui Wang, Kehai Chen, Zhaopeng Tu, Yong Xu, Min Zhang
Motivated by the success of unsupervised neural machine translation (UNMT), we introduce an unsupervised sign language translation and generation network (USLNet), which learns from abundant single-modality (text and video) data without parallel sign language data.
1 code implementation • 13 Nov 2023 • Meizhi Zhong, Lemao Liu, Kehai Chen, Mingming Yang, Min Zhang
Simultaneous Machine Translation (SiMT) aims to yield a real-time partial translation with a monotonically growing source-side context.
no code implementations • 9 Jan 2023 • Zhuosheng Zhang, Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita, Zuchao Li, Hai Zhao
Representation learning is the foundation of natural language processing (NLP).
1 code implementation • NAACL 2022 • Wang Xu, Kehai Chen, Lili Mou, Tiejun Zhao
Document-level relation extraction (DocRE) aims to determine the relation between two entities from a document of multiple sentences.
Ranked #5 on Dialog Relation Extraction on DialogRE (F1c (v1) metric)
2 code implementations • Findings (ACL) 2021 • Wang Xu, Kehai Chen, Tiejun Zhao
Document-level relation extraction (DocRE) models generally use graph networks to implicitly model the reasoning skill (i.e., pattern recognition, logical reasoning, coreference reasoning, etc.)
Ranked #24 on Relation Extraction on DocRED
no code implementations • 11 Feb 2021 • Zuchao Li, Zhuosheng Zhang, Hai Zhao, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita
In this paper, we propose explicit and implicit text compression approaches to enhance the Transformer encoding and evaluate models using this approach on several typical downstream tasks that rely on the encoding heavily.
no code implementations • 1 Jan 2021 • Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita
Self-attention networks (SANs) have shown promising empirical results in various natural language processing tasks.
1 code implementation • 21 Dec 2020 • Wang Xu, Kehai Chen, Tiejun Zhao
In document-level relation extraction (DocRE), graph structure is generally used to encode relation information in the input document to classify the relation category between each entity pair, and has greatly advanced the DocRE task over the past several years.
Ranked #35 on Relation Extraction on DocRED
no code implementations • COLING 2020 • Zhenyu Zhao, Shuangzhi Wu, Muyun Yang, Kehai Chen, Tiejun Zhao
Neural models, which are typically trained on hard labels, have achieved great success on the task of machine reading comprehension (MRC).
no code implementations • 11 Oct 2020 • Zuchao Li, Hai Zhao, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita
In this paper, we introduced our joint team SJTU-NICT's participation in the WMT 2020 machine translation shared task.
no code implementations • ACL 2020 • Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita
Neural machine translation (NMT) encodes the source sentence in a universal way to generate the target sentence word-by-word.
1 code implementation • ICLR 2020 • Zhuosheng Zhang, Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita, Zuchao Li, Hai Zhao
Though visual information has been introduced for enhancing neural machine translation (NMT), its effectiveness strongly relies on the availability of large amounts of bilingual parallel sentence pairs with manual image annotations.
no code implementations • ICLR 2020 • Zuchao Li, Rui Wang, Kehai Chen, Masso Utiyama, Eiichiro Sumita, Zhuosheng Zhang, Hai Zhao
However, MLE focuses on once-to-all matching between the predicted sequence and the gold standard, consequently treating all incorrect predictions as equally incorrect.
no code implementations • ACL 2020 • Haipeng Sun, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Tiejun Zhao
Unsupervised neural machine translation (UNMT) has recently achieved remarkable results for several language pairs.
no code implementations • NAACL 2021 • Haipeng Sun, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Tiejun Zhao
Unsupervised neural machine translation (UNMT) that relies solely on massive monolingual corpora has achieved remarkable results in several translation tasks.
no code implementations • 8 Apr 2020 • Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita
Thus, we propose a novel reordering method to explicitly model this reordering information for Transformer-based NMT.
no code implementations • COLING 2020 • Haipeng Sun, Rui Wang, Kehai Chen, Xugang Lu, Masao Utiyama, Eiichiro Sumita, Tiejun Zhao
Unsupervised neural machine translation (UNMT) has recently attracted great interest in the machine translation community.
no code implementations • 28 Feb 2020 • Chaoqun Duan, Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita, Conghui Zhu, Tiejun Zhao
Existing neural machine translation (NMT) systems utilize sequence-to-sequence neural networks to generate target translations word by word, and then encourage the generated word at each time-step to be as consistent as possible with its counterpart in the references.
1 code implementation • 27 Dec 2019 • Zuchao Li, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Zhuosheng Zhang, Hai Zhao
In this paper, we propose an explicit sentence compression method to enhance the source sentence representation for NMT.
no code implementations • 7 Nov 2019 • Zhuosheng Zhang, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Hai Zhao
We present a universal framework to model contextualized sentence representations with visual awareness that is motivated to overcome the shortcomings of the multimodal parallel data with manual annotations.
no code implementations • IJCNLP 2019 • Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita
To address this issue, this work proposes a recurrent positional embedding approach based on word vectors.
no code implementations • WS 2019 • Rui Wang, Haipeng Sun, Kehai Chen, Chenchen Ding, Masao Utiyama, Eiichiro Sumita
This paper presents NICT's participation (team ID: NICT) in the 6th Workshop on Asian Translation (WAT-2019) shared translation task, specifically the Myanmar (Burmese) - English task in both translation directions.
no code implementations • 31 Oct 2019 • Shu Jiang, Rui Wang, Zuchao Li, Masao Utiyama, Kehai Chen, Eiichiro Sumita, Hai Zhao, Bao-liang Lu
Most existing document-level NMT approaches are satisfied with only a superficial sense of global document-level information, while this work focuses on exploiting detailed document-level context in terms of a memory network.
no code implementations • 26 Aug 2019 • Haipeng Sun, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Tiejun Zhao, Chenhui Chu
However, it has not been well-studied for unsupervised neural machine translation (UNMT), although UNMT has recently achieved remarkable results in several domain-specific language pairs.
no code implementations • WS 2019 • Raj Dabre, Kehai Chen, Benjamin Marie, Rui Wang, Atsushi Fujita, Masao Utiyama, Eiichiro Sumita
In this paper, we describe our supervised neural machine translation (NMT) systems that we developed for the news translation task for Kazakh↔English, Gujarati↔English, Chinese↔English, and English→Finnish translation directions.
no code implementations • WS 2019 • Benjamin Marie, Haipeng Sun, Rui Wang, Kehai Chen, Atsushi Fujita, Masao Utiyama, Eiichiro Sumita
This paper presents NICT's participation in the WMT19 unsupervised news translation task.
no code implementations • ACL 2019 • Mingming Yang, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Min Zhang, Tiejun Zhao
The training objective of neural machine translation (NMT) is to minimize the loss between the words in the translated sentences and those in the references.
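For reference, that objective is the usual token-level cross-entropy against the reference translation:

```latex
\mathcal{L}(\theta) = -\sum_{t=1}^{|y|} \log p_{\theta}\!\left(y_t \mid y_{<t}, x\right)
```

where x is the source sentence and y the reference translation.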
no code implementations • ACL 2019 • Haipeng Sun, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Tiejun Zhao
In previous methods, unsupervised bilingual word embedding (UBWE) is first trained using non-parallel monolingual corpora, and then this pre-trained UBWE is used to initialize the word embeddings in the encoder and decoder of UNMT.
no code implementations • ACL 2019 • Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita
The reordering model plays an important role in phrase-based statistical machine translation.
no code implementations • ACL 2019 • Fengshun Xiao, Jiangtong Li, Hai Zhao, Rui Wang, Kehai Chen
To integrate different segmentations with the state-of-the-art NMT model, Transformer, we propose lattice-based encoders to explore effective word or subword representation in an automatic way during training.
no code implementations • 12 Nov 2017 • Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita, Tiejun Zhao
In this paper, we extend local attention with a syntax-distance constraint to focus on source words syntactically related to the predicted target word, thus learning a more effective context vector for word prediction.
no code implementations • IJCNLP 2017 • Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita, Tiejun Zhao
In Neural Machine Translation (NMT), each word is represented as a low-dimensional, real-valued vector for encoding its syntax and semantic information.
1 code implementation • EMNLP 2017 • Rui Wang, Masao Utiyama, Lemao Liu, Kehai Chen, Eiichiro Sumita
Instance weighting has been widely applied to phrase-based machine translation domain adaptation.
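A generic form of instance weighting (an assumed formulation for illustration; `logprob` and `negative_log_likelihood` are hypothetical helpers, not the paper's exact method) scores each training sentence by how in-domain it looks and scales its loss accordingly:

```python
# Sketch of instance-weighted training for domain adaptation: sentences that are
# more likely under an in-domain language model than an out-of-domain one get
# higher weight in the training loss.
import math

def instance_weight(sentence, in_domain_lm, out_domain_lm):
    return math.exp(in_domain_lm.logprob(sentence) - out_domain_lm.logprob(sentence))

def weighted_batch_loss(batch, model, in_domain_lm, out_domain_lm):
    total = 0.0
    for src, tgt in batch:
        w = instance_weight(src, in_domain_lm, out_domain_lm)
        total += w * model.negative_log_likelihood(src, tgt)
    return total / len(batch)
```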
no code implementations • EMNLP 2017 • Kehai Chen, Rui Wang, Masao Utiyama, Lemao Liu, Akihiro Tamura, Eiichiro Sumita, Tiejun Zhao
Source dependency information has been successfully introduced into statistical machine translation.