Search Results for author: Kehai Chen

Found 68 papers, 18 papers with code

Syntax in End-to-End Natural Language Processing

no code implementations EMNLP (ACL) 2021 Hai Zhao, Rui Wang, Kehai Chen

This tutorial surveys the latest technical progress of syntactic parsing and the role of syntax in end-to-end natural language processing (NLP) tasks, in which semantic role labeling (SRL) and machine translation (MT) are representative NLP tasks that have long benefited from informative syntactic clues, even as advances in end-to-end deep learning models continue to produce new results.

Machine Translation NMT +2

Synchronous Refinement for Neural Machine Translation

no code implementations Findings (ACL) 2022 Kehai Chen, Masao Utiyama, Eiichiro Sumita, Rui Wang, Min Zhang

Machine translation typically adopts an encoder-to-decoder framework, in which the decoder generates the target sentence word-by-word in an auto-regressive manner.

Decoder Machine Translation +2
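
The word-by-word auto-regressive generation described above can be sketched as a simple greedy decoding loop. This is a generic illustration, not the paper's synchronous refinement method; `encode` and `decode_step` are hypothetical placeholders for a trained model.

```python
# Minimal sketch of greedy auto-regressive decoding in an
# encoder-to-decoder framework. `encode` and `decode_step` are
# hypothetical placeholders standing in for a trained NMT model.
from typing import Callable, List

BOS, EOS = 1, 2  # assumed special token ids

def greedy_decode(src_ids: List[int],
                  encode: Callable[[List[int]], object],
                  decode_step: Callable[[object, List[int]], List[float]],
                  max_len: int = 50) -> List[int]:
    memory = encode(src_ids)                   # encode the source sentence once
    tgt = [BOS]
    for _ in range(max_len):
        logits = decode_step(memory, tgt)      # scores over the vocabulary
        next_id = max(range(len(logits)), key=logits.__getitem__)
        tgt.append(next_id)                    # each word conditions on all
        if next_id == EOS:                     # previously generated words
            break
    return tgt[1:]
```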

Lost in Benchmarks? Rethinking Large Language Model Benchmarking with Item Response Theory

1 code implementation 21 May 2025 Hongli Zhou, Hui Huang, Ziqing Zhao, Lvyuan Han, Huicheng Wang, Kehai Chen, Muyun Yang, Wei Bao, Jian Dong, Bing Xu, Conghui Zhu, Hailong Cao, Tiejun Zhao

The evaluation of large language models (LLMs) via benchmarks is widespread, yet inconsistencies between different leaderboards and poor separability among top models raise concerns about their ability to accurately reflect authentic model capabilities.

Benchmarking Language Modeling +2
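
For background, item response theory analyses of this kind commonly build on the two-parameter logistic (2PL) model, which relates a model's latent ability to its probability of answering a benchmark item correctly; the paper's exact formulation may differ.

```latex
% Standard 2PL item response model (general background; not
% necessarily the paper's exact formulation):
% a_i: discrimination of benchmark item i, b_i: its difficulty,
% \theta: latent ability of the evaluated model.
P(x_i = 1 \mid \theta) = \frac{1}{1 + \exp\bigl(-a_i(\theta - b_i)\bigr)}
```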

MoK-RAG: Mixture of Knowledge Paths Enhanced Retrieval-Augmented Generation for Embodied AI Environments

no code implementations 18 Mar 2025 Zhengsheng Guo, Linwei Zheng, Xinyang Chen, Xuefeng Bai, Kehai Chen, Min Zhang

While human cognition inherently retrieves information from diverse and specialized knowledge sources during decision-making processes, current Retrieval-Augmented Generation (RAG) systems typically operate through single-source knowledge retrieval, leading to a cognitive-algorithmic discrepancy.

Decision Making Language Modeling +4

Representation-based Reward Modeling for Efficient Safety Alignment of Large Language Model

no code implementations 13 Mar 2025 Qiyuan Deng, Xuefeng Bai, Kehai Chen, YaoWei Wang, Liqiang Nie, Min Zhang

Reinforcement Learning (RL) algorithms for safety alignment of Large Language Models (LLMs), such as Direct Preference Optimization (DPO), encounter the challenge of distribution shift.

Language Modeling Language Modelling +4

Adaptive Inner Speech-Text Alignment for LLM-based Speech Translation

no code implementations 13 Mar 2025 Henglyu Liu, Andong Chen, Kehai Chen, Xuefeng Bai, Meizhi Zhong, Yuan Qiu, Min Zhang

Recent advancement of large language models (LLMs) has led to significant breakthroughs across various tasks, laying the foundation for the development of LLM-based speech translation systems.

Cross-Modal Retrieval Translation

Merge then Realign: Simple and Effective Modality-Incremental Continual Learning for Multimodal LLMs

no code implementations 8 Mar 2025 Dingkun Zhang, Shuhan Qi, Xinyu Xiao, Kehai Chen, Xuan Wang

Considering the heavy cost of training MLLMs, it is necessary to reuse the existing ones and further extend them to more modalities through Modality-incremental Continual Learning (MCL).

Continual Learning

Generator-Assistant Stepwise Rollback Framework for Large Language Model Agent

1 code implementation 4 Mar 2025 Xingzuo Li, Kehai Chen, Yunfei Long, Xuefeng Bai, Yong Xu, Min Zhang

Large language model (LLM) agents typically adopt a step-by-step reasoning framework, in which they interleave the processes of thinking and acting to accomplish the given task.

Decision Making Language Modeling +2
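
A minimal sketch of the interleaved think-act loop referred to above, in a generic ReAct style rather than the paper's generator-assistant rollback mechanism; `llm`, `run_tool`, and the `FINISH:` stopping convention are assumptions for illustration.

```python
from typing import Callable

# Generic step-by-step agent loop interleaving thinking and acting.
# `llm` maps a prompt to text and `run_tool` executes an action string;
# both are hypothetical placeholders, and the "FINISH:" convention is
# assumed purely for illustration.
def agent_loop(task: str, llm: Callable[[str], str],
               run_tool: Callable[[str], str], max_steps: int = 8) -> str:
    trace = f"Task: {task}\n"
    for _ in range(max_steps):
        thought = llm(trace + "Thought:")                       # reasoning step
        action = llm(trace + f"Thought: {thought}\nAction:")    # acting step
        if action.startswith("FINISH:"):
            return action.removeprefix("FINISH:").strip()
        observation = run_tool(action)
        trace += (f"Thought: {thought}\nAction: {action}\n"
                  f"Observation: {observation}\n")
    return "max steps reached"
```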

The Rise of Darkness: Safety-Utility Trade-Offs in Role-Playing Dialogue Agents

no code implementations 28 Feb 2025 Yihong Tang, Kehai Chen, Xuefeng Bai, ZhengYu Niu, Bo Wang, Jie Liu, Min Zhang

Large Language Models (LLMs) have made remarkable advances in role-playing dialogue agents, demonstrating their utility in character simulations.

A Survey: Spatiotemporal Consistency in Video Generation

no code implementations 25 Feb 2025 Zhiyu Yin, Kehai Chen, Xuefeng Bai, Ruili Jiang, Juntao Li, Hongdong Li, Jin Liu, Yang Xiang, Jun Yu, Min Zhang

Video generation, by leveraging a dynamic visual generation method, pushes the boundaries of Artificial Intelligence Generated Content (AIGC).

Image Generation Video Generation

Evaluating o1-Like LLMs: Unlocking Reasoning for Translation through Comprehensive Analysis

no code implementations 17 Feb 2025 Andong Chen, Yuchen Song, Wenxin Zhu, Kehai Chen, Muyun Yang, Tiejun Zhao, Min Zhang

The o1-Like LLMs are transforming AI by simulating human cognitive processes, but their performance in multilingual machine translation (MMT) remains underexplored.

Machine Translation Translation

Exploring Translation Mechanism of Large Language Models

no code implementations 17 Feb 2025 Hongbin Zhang, Kehai Chen, Xuefeng Bai, Xiucheng Li, Min Zhang

Large language models (LLMs) have succeeded remarkably in multilingual translation tasks.

Translation

Benchmarking and Improving Large Vision-Language Models for Fundamental Visual Graph Understanding and Reasoning

1 code implementation 18 Dec 2024 Yingjie Zhu, Xuefeng Bai, Kehai Chen, Yang Xiang, Min Zhang

To study the reason behind these limitations, we propose VGCure, a comprehensive benchmark covering 22 tasks for examining the fundamental graph understanding and reasoning capacities of LVLMs.

Benchmarking Graph Learning +1

LinguaLIFT: An Effective Two-stage Instruction Tuning Framework for Low-Resource Language Tasks

no code implementations 17 Dec 2024 Hongbin Zhang, Kehai Chen, Xuefeng Bai, Yang Xiang, Min Zhang

Large language models (LLMs) have demonstrated impressive multilingual understanding and reasoning capabilities, driven by extensive multilingual pre-training corpora and instruction fine-tuning data.

Math

Make Imagination Clearer! Stable Diffusion-based Visual Imagination for Multimodal Machine Translation

no code implementations 17 Dec 2024 Andong Chen, Yuchen Song, Kehai Chen, Muyun Yang, Tiejun Zhao, Min Zhang

Visual information has been introduced for enhancing machine translation (MT), and its effectiveness heavily relies on the availability of large amounts of bilingual parallel sentence pairs with manual image annotations.

Language Modeling Language Modelling +4

LLM-based Discriminative Reasoning for Knowledge Graph Question Answering

no code implementations 17 Dec 2024 Mufan Xu, Kehai Chen, Xuefeng Bai, Muyun Yang, Tiejun Zhao, Min Zhang

Large language models (LLMs) based on generative pre-trained Transformer have achieved remarkable performance on knowledge graph question-answering (KGQA) tasks.

Graph Question Answering Question Answering

ZigZagkv: Dynamic KV Cache Compression for Long-context Modeling based on Layer Uncertainty

no code implementations 12 Dec 2024 Meizhi Zhong, Xikai Liu, Chen Zhang, Yikun Lei, Yan Gao, Yao Hu, Kehai Chen, Min Zhang

To accelerate LLM inference, storing computed key-value caches in memory has become the standard technique.
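
As background, the cache in question is the attention key-value (KV) cache; a minimal sketch of the idea follows (generic incremental attention over cached entries, not the ZigZagkv compression scheme itself).

```python
import numpy as np

# Minimal sketch of a per-layer KV cache for incremental decoding:
# keys/values for past tokens are stored so each new token only
# computes attention against cached entries instead of recomputing
# them. Generic illustration, not ZigZagkv's compression.
class KVCache:
    def __init__(self):
        self.keys, self.values = [], []

    def append(self, k: np.ndarray, v: np.ndarray) -> None:
        self.keys.append(k)
        self.values.append(v)

    def attend(self, q: np.ndarray) -> np.ndarray:
        K = np.stack(self.keys)            # (t, d): cached keys
        V = np.stack(self.values)          # (t, d): cached values
        scores = K @ q / np.sqrt(q.size)   # scaled dot-product scores
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()           # softmax over past tokens
        return weights @ V                 # attention output for the new token
```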

MoDification: Mixture of Depths Made Easy

no code implementations 18 Oct 2024 Chen Zhang, Meizhi Zhong, Qimeng Wang, Xuantao Lu, Zheyu Ye, Chengqiang Lu, Yan Gao, Yao Hu, Kehai Chen, Min Zhang, Dawei Song

Long-context efficiency has recently become a trending topic in serving large language models (LLMs).

LLM-based Translation Inference with Iterative Bilingual Understanding

no code implementations 16 Oct 2024 Andong Chen, Kehai Chen, Yang Xiang, Xuefeng Bai, Muyun Yang, Yang Feng, Tiejun Zhao, Min Zhang

The remarkable understanding and generation capabilities of large language models (LLMs) have greatly improved translation performance.

Sentence Translation

Question-guided Knowledge Graph Re-scoring and Injection for Knowledge Graph Question Answering

1 code implementation 2 Oct 2024 Yu Zhang, Kehai Chen, Xuefeng Bai, Zhao Kang, Quanjiang Guo, Min Zhang

Knowledge graph question answering (KGQA) involves answering natural language questions by leveraging structured information stored in a knowledge graph.

Graph Question Answering Question Answering

Dynamic Planning for LLM-based Graphical User Interface Automation

1 code implementation 1 Oct 2024 Shaoqing Zhang, Zhuosheng Zhang, Kehai Chen, Xinbei Ma, Muyun Yang, Tiejun Zhao, Min Zhang

However, a key challenge lies in devising effective plans to guide action prediction in GUI tasks, though planning has been widely recognized as effective for decomposing complex tasks into a series of steps.

MemLong: Memory-Augmented Retrieval for Long Text Modeling

1 code implementation 30 Aug 2024 Weijie Liu, Zecheng Tang, Juntao Li, Kehai Chen, Min Zhang

This work introduces MemLong: Memory-Augmented Retrieval for Long Text Generation, a method designed to enhance the capabilities of long-context language modeling by utilizing an external retriever for historical information retrieval.

4k Decoder +5
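
A toy sketch of the retrieval idea described above: past chunks are scored by embedding similarity to the current query and the top-k are returned. The `embed` function here is a random stand-in for a real encoder, and this is not MemLong's actual retriever.

```python
import numpy as np

# Hypothetical sketch of retrieving historical context chunks by
# similarity, in the spirit of memory-augmented retrieval. `embed`
# is a deterministic-per-run toy stand-in for a real sentence encoder.
def embed(text: str) -> np.ndarray:
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(64)
    return v / np.linalg.norm(v)           # unit-normalized vector

def retrieve(query: str, history: list[str], k: int = 2) -> list[str]:
    q = embed(query)
    scored = sorted(history, key=lambda chunk: -float(embed(chunk) @ q))
    return scored[:k]                      # top-k most similar past chunks

history = ["chapter one ...", "chapter two ...", "chapter three ..."]
print(retrieve("what happened in chapter two?", history))
```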

TF-Attack: Transferable and Fast Adversarial Attacks on Large Language Models

no code implementations 26 Aug 2024 Zelin Li, Kehai Chen, Lemao Liu, Xuefeng Bai, Mingming Yang, Yang Xiang, Min Zhang

In this paper, we analyze the core mechanisms of previous predominant adversarial attack methods, revealing that 1) the distributions of importance scores differ markedly across victim models, restricting transferability; and 2) the sequential attack process induces substantial time overhead.

Adversarial Attack

Large Language Models for Classical Chinese Poetry Translation: Benchmarking, Evaluating, and Improving

no code implementations 19 Aug 2024 Andong Chen, Lianzhang Lou, Kehai Chen, Xuefeng Bai, Yang Xiang, Muyun Yang, Tiejun Zhao, Min Zhang

Different from the traditional translation tasks, classical Chinese poetry translation requires both adequacy and fluency in translating culturally and historically significant content and linguistic poetic elegance.

Benchmarking Machine Translation +1

A Survey on Human Preference Learning for Large Language Models

no code implementations 17 Jun 2024 Ruili Jiang, Kehai Chen, Xuefeng Bai, Zhixuan He, Juntao Li, Muyun Yang, Tiejun Zhao, Liqiang Nie, Min Zhang

In this survey, we review the progress in exploring human preference learning for LLMs from a preference-centered perspective, covering the sources and formats of preference feedback, the modeling and usage of preference signals, as well as the evaluation of the aligned LLMs.

On the Hallucination in Simultaneous Machine Translation

1 code implementation 11 Jun 2024 Meizhi Zhong, Kehai Chen, Zhengshan Xue, Lemao Liu, Mingming Yang, Min Zhang

It is widely known that hallucination is a critical issue in Simultaneous Machine Translation (SiMT) due to the absence of source-side information.

Hallucination Machine Translation +1

Feature-based Low-Rank Compression of Large Language Models via Bayesian Optimization

no code implementations 17 May 2024 Yixin Ji, Yang Xiang, Juntao Li, Wei Chen, Zhongyi Liu, Kehai Chen, Min Zhang

To address the challenges of low-rank compression in LLMs, we conduct empirical research on the low-rank characteristics of large models.

Bayesian Optimization Low-rank compression
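
For context, the generic form of low-rank compression replaces a weight matrix with a truncated-SVD factorization; a minimal sketch follows (the standard technique, not the paper's feature-based, Bayesian-optimized variant).

```python
import numpy as np

# Generic truncated-SVD low-rank compression of a weight matrix W:
# W (m x n) is replaced by A (m x r) @ B (r x n), cutting parameters
# from m*n to r*(m+n). Not the paper's feature-based variant.
def low_rank_compress(W: np.ndarray, r: int):
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :r] * S[:r]        # absorb singular values into A
    B = Vt[:r, :]
    return A, B

W = np.random.randn(256, 256)
A, B = low_rank_compress(W, r=32)
err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print(f"rank-32 relative reconstruction error: {err:.3f}")
```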

Unsupervised Sign Language Translation and Generation

no code implementations 12 Feb 2024 Zhengsheng Guo, Zhiwei He, Wenxiang Jiao, Xing Wang, Rui Wang, Kehai Chen, Zhaopeng Tu, Yong Xu, Min Zhang

Motivated by the success of unsupervised neural machine translation (UNMT), we introduce an unsupervised sign language translation and generation network (USLNet), which learns from abundant single-modality (text and video) data without parallel sign language data.

Machine Translation Sign Language Translation +1

Context Consistency between Training and Testing in Simultaneous Machine Translation

1 code implementation 13 Nov 2023 Meizhi Zhong, Lemao Liu, Kehai Chen, Mingming Yang, Min Zhang

Simultaneous Machine Translation (SiMT) aims to yield a real-time partial translation with a monotonically growing source-side context.

Machine Translation Translation
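
The monotonically growing source-side context can be illustrated with a wait-k style read schedule, a common SiMT policy used here purely as an illustration of growing context, not as the paper's method.

```python
# Illustration of a monotonically growing source-side context via a
# wait-k style schedule: the decoder emits its t-th target word after
# reading the first t+k-1 source words. Generic SiMT illustration only.
def visible_source_prefix(src_tokens, t: int, k: int = 3):
    return src_tokens[: min(t + k - 1, len(src_tokens))]

src = "the cat sat on the mat".split()
for t in range(1, 5):
    print(t, visible_source_prefix(src, t))   # prefix grows with each step
```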

Discriminative Reasoning for Document-level Relation Extraction

2 code implementations Findings (ACL) 2021 Wang Xu, Kehai Chen, Tiejun Zhao

Document-level relation extraction (DocRE) models generally use graph networks to implicitly model the reasoning skill (i.e., pattern recognition, logical reasoning, coreference reasoning, etc.)

Document-level Relation Extraction Logical Reasoning +1

Text Compression-aided Transformer Encoding

no code implementations 11 Feb 2021 Zuchao Li, Zhuosheng Zhang, Hai Zhao, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita

In this paper, we propose explicit and implicit text compression approaches to enhance the Transformer encoding and evaluate models using this approach on several typical downstream tasks that rely on the encoding heavily.

Text Compression

Prior Knowledge Representation for Self-Attention Networks

no code implementations 1 Jan 2021 Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita

Self-attention networks (SANs) have shown promising empirical results in various natural language processing tasks.

Translation

Document-Level Relation Extraction with Reconstruction

1 code implementation 21 Dec 2020 Wang Xu, Kehai Chen, Tiejun Zhao

In document-level relation extraction (DocRE), graph structure is generally used to encode relation information in the input document to classify the relation category between each entity pair, and has greatly advanced the DocRE task over the past several years.

Document-level Relation Extraction Relation +1

Robust Machine Reading Comprehension by Learning Soft Labels

no code implementations COLING 2020 Zhenyu Zhao, Shuangzhi Wu, Muyun Yang, Kehai Chen, Tiejun Zhao

Neural models, which are typically trained on hard labels, have achieved great success on the task of machine reading comprehension (MRC).

Machine Reading Comprehension

Content Word Aware Neural Machine Translation

no code implementations ACL 2020 Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita

Neural machine translation (NMT) encodes the source sentence in a universal way to generate the target sentence word-by-word.

Machine Translation NMT +2

Neural Machine Translation with Universal Visual Representation

1 code implementation ICLR 2020 Zhuosheng Zhang, Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita, Zuchao Li, Hai Zhao

Though visual information has been introduced for enhancing neural machine translation (NMT), its effectiveness strongly relies on the availability of large amounts of bilingual parallel sentence pairs with manual image annotations.

Decoder Machine Translation +3

Data-dependent Gaussian Prior Objective for Language Generation

no code implementations ICLR 2020 Zuchao Li, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Zhuosheng Zhang, Hai Zhao

However, MLE focuses on once-to-all matching between the predicted sequence and the gold standard, consequently treating all incorrect predictions as equally incorrect.

Diversity Image Captioning +5
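
The "once-to-all matching" being criticized is the standard token-level maximum-likelihood objective, which places all credit on the single gold token at each step and penalizes every other prediction equally.

```latex
% Standard MLE objective for sequence generation: at each step t the
% gold token y_t receives all the credit, so all incorrect tokens are
% treated as equally wrong.
\mathcal{L}_{\mathrm{MLE}}(\theta) = -\sum_{t=1}^{T} \log p_\theta\bigl(y_t \mid y_{<t}, x\bigr)
```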

Self-Training for Unsupervised Neural Machine Translation in Unbalanced Training Data Scenarios

no code implementations NAACL 2021 Haipeng Sun, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Tiejun Zhao

Unsupervised neural machine translation (UNMT) that relies solely on massive monolingual corpora has achieved remarkable results in several translation tasks.

Machine Translation Translation

Explicit Reordering for Neural Machine Translation

no code implementations 8 Apr 2020 Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita

Thus, we propose a novel reordering method to explicitly model this reordering information for Transformer-based NMT.

Machine Translation NMT +2

Modeling Future Cost for Neural Machine Translation

no code implementations 28 Feb 2020 Chaoqun Duan, Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita, Conghui Zhu, Tiejun Zhao

Existing neural machine translation (NMT) systems utilize sequence-to-sequence neural networks to generate target translations word by word, and then encourage the generated word at each time-step to be as consistent as possible with its counterpart in the reference.

Machine Translation NMT +1

Explicit Sentence Compression for Neural Machine Translation

1 code implementation 27 Dec 2019 Zuchao Li, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Zhuosheng Zhang, Hai Zhao

In this paper, we propose an explicit sentence compression method to enhance the source sentence representation for NMT.

Decoder Machine Translation +4

Probing Contextualized Sentence Representations with Visual Awareness

no code implementations 7 Nov 2019 Zhuosheng Zhang, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Hai Zhao

We present a universal framework to model contextualized sentence representations with visual awareness that is motivated to overcome the shortcomings of the multimodal parallel data with manual annotations.

Diversity Machine Translation +3

English-Myanmar Supervised and Unsupervised NMT: NICT's Machine Translation Systems at WAT-2019

no code implementations WS 2019 Rui Wang, Haipeng Sun, Kehai Chen, Chenchen Ding, Masao Utiyama, Eiichiro Sumita

This paper presents NICT's participation (team ID: NICT) in the 6th Workshop on Asian Translation (WAT-2019) shared translation task, specifically the Myanmar (Burmese) - English task in both translation directions.

Language Modeling Language Modelling +3

Document-level Neural Machine Translation with Associated Memory Network

no code implementations 31 Oct 2019 Shu Jiang, Rui Wang, Zuchao Li, Masao Utiyama, Kehai Chen, Eiichiro Sumita, Hai Zhao, Bao-liang Lu

Most existing document-level NMT approaches settle for a superficial sense of global document-level information, while this work focuses on exploiting detailed document-level context in terms of a memory network.

Machine Translation NMT +2

Revisiting Simple Domain Adaptation Methods in Unsupervised Neural Machine Translation

no code implementations 26 Aug 2019 Haipeng Sun, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Tiejun Zhao, Chenhui Chu

However, it has not been well-studied for unsupervised neural machine translation (UNMT), although UNMT has recently achieved remarkable results in several domain-specific language pairs.

Domain Adaptation Machine Translation +1

NICT's Supervised Neural Machine Translation Systems for the WMT19 News Translation Task

no code implementations WS 2019 Raj Dabre, Kehai Chen, Benjamin Marie, Rui Wang, Atsushi Fujita, Masao Utiyama, Eiichiro Sumita

In this paper, we describe our supervised neural machine translation (NMT) systems that we developed for the news translation task for Kazakh↔English, Gujarati↔English, Chinese↔English, and English→Finnish translation directions.

Machine Translation NMT +2

Sentence-Level Agreement for Neural Machine Translation

no code implementations ACL 2019 Mingming Yang, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Min Zhang, Tiejun Zhao

The training objective of neural machine translation (NMT) is to minimize the loss between the words in the translated sentences and those in the references.

Machine Translation NMT +2
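
One illustrative way to add such a sentence-level term on top of the word-level loss is to penalize disagreement between mean-pooled source and target representations; this is a sketch of the general idea, not necessarily the paper's exact objective.

```latex
% Illustrative sentence-level agreement term added to the word-level
% NMT loss (an assumed formulation, not necessarily the paper's):
\mathcal{L} = \mathcal{L}_{\mathrm{word}}
+ \lambda \left\lVert \frac{1}{|x|}\sum_{i=1}^{|x|} \mathbf{h}^{\mathrm{src}}_{i}
- \frac{1}{|y|}\sum_{j=1}^{|y|} \mathbf{h}^{\mathrm{tgt}}_{j} \right\rVert_2^2
```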

Unsupervised Bilingual Word Embedding Agreement for Unsupervised Neural Machine Translation

no code implementations ACL 2019 Haipeng Sun, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Tiejun Zhao

In previous methods, UBWE is first trained using non-parallel monolingual corpora and then this pre-trained UBWE is used to initialize the word embedding in the encoder and decoder of UNMT.

Decoder Denoising +2

Lattice-Based Transformer Encoder for Neural Machine Translation

no code implementations ACL 2019 Fengshun Xiao, Jiangtong Li, Hai Zhao, Rui Wang, Kehai Chen

To integrate different segmentations with the state-of-the-art NMT model, Transformer, we propose lattice-based encoders to explore effective word or subword representation in an automatic way during training.

Diversity Machine Translation +2

Syntax-Directed Attention for Neural Machine Translation

no code implementations 12 Nov 2017 Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita, Tiejun Zhao

In this paper, we extend local attention with syntax-distance constraint, to focus on syntactically related source words with the predicted target word, thus learning a more effective context vector for word prediction.

Machine Translation NMT +1
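
A minimal sketch of attention restricted by a syntax-distance constraint, in the spirit of the description above: source positions whose syntactic distance from the focus position exceeds a threshold are masked out before the softmax. The distance matrix (e.g., tree hops between source words) is assumed precomputed; this is illustrative, not the paper's exact formulation.

```python
import numpy as np

# Local attention with a syntax-distance constraint: positions whose
# syntactic distance from `focus` exceeds `max_dist` are masked out
# before softmax, concentrating attention on syntactically related
# source words. Illustrative only; syn_dist is assumed precomputed.
def syntax_local_attention(scores: np.ndarray, syn_dist: np.ndarray,
                           focus: int, max_dist: int = 2) -> np.ndarray:
    masked = np.where(syn_dist[focus] <= max_dist, scores, -np.inf)
    weights = np.exp(masked - masked.max())
    return weights / weights.sum()

scores = np.array([0.5, 1.2, -0.3, 0.8])       # raw attention scores
syn_dist = np.array([[0, 1, 3, 2],             # pairwise syntactic
                     [1, 0, 2, 1],             # distances between the
                     [3, 2, 0, 1],             # four source words
                     [2, 1, 1, 0]])
print(syntax_local_attention(scores, syn_dist, focus=0))
```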

Context-Aware Smoothing for Neural Machine Translation

no code implementations IJCNLP 2017 Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita, Tiejun Zhao

In Neural Machine Translation (NMT), each word is represented as a low-dimension, real-value vector for encoding its syntax and semantic information.

Machine Translation NMT +3
