no code implementations • Findings (EMNLP) 2021 • Xin Huang, Jiajun Zhang, Chengqing Zong
Inspired by the finding of (CITATION) that entities are the most informative elements in an image, we propose an explicit entity-level cross-modal learning approach that aims to augment the entity representation.
no code implementations • LREC 2022 • Xiaohan Zhang, Shaonan Wang, Chengqing Zong
Based on these results, we suggest a block-wise cross-validation training method and an adequate data size for increasing the performance of linear encoding models.
1 code implementation • Findings (ACL) 2022 • Shuxian Zou, Shaonan Wang, Jiajun Zhang, Chengqing Zong
More importantly, it demonstrates that it is feasible to decode a specific word from a large vocabulary based on its corresponding brain activity.
no code implementations • EMNLP 2020 • Jinghui Yan, Yining Wang, Lu Xiang, Yu Zhou, Chengqing Zong
Medical entity normalization, which links medical mentions in the text to entities in knowledge bases, is an important research topic in medical natural language processing.
1 code implementation • 7 Apr 2024 • Junhong Wu, Yuchen Liu, Chengqing Zong
In the evolving landscape of Neural Machine Translation (NMT), the pretrain-then-finetune paradigm has yielded impressive results.
no code implementations • 26 Mar 2024 • Xinpei Zhao, Jingyuan Sun, Shaonan Wang, Jing Ye, Xiaohan Zhang, Chengqing Zong
In contrast, we propose a simple yet effective method that guides text reconstruction by directly comparing the generated text with the predicted text embeddings mapped from brain activities.
no code implementations • 20 Mar 2024 • Shaonan Wang, Jingyuan Sun, Yunhao Zhang, Nan Lin, Marie-Francine Moens, Chengqing Zong
Despite differing from the human language processing mechanism in implementation and algorithms, current language models demonstrate remarkable human-like or surpassing language capabilities.
no code implementations • 2 Mar 2024 • Yunhao Zhang, Xiaohan Zhang, Chong Li, Shaonan Wang, Chengqing Zong
Results show that language models share significant similarities with human cognitive data and the similarity patterns are modulated by the data modality and stimuli complexity.
1 code implementation • 27 Nov 2023 • Qianlong Du, Chengqing Zong, Jiajun Zhang
First, our approach uses a quality evaluation model to filter the original instruction dataset down to a high-quality subset, and then applies an algorithm to select from that subset a seed instruction dataset with good coverage.
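The two-stage selection described above (quality filtering, then coverage-driven seed selection) can be sketched roughly as follows. This is an illustrative toy, not the paper's actual algorithm: the quality scorer, the tag-based notion of coverage, and the greedy selection strategy are all assumptions.

```python
def select_seed_instructions(dataset, quality, threshold, k):
    """Two-stage instruction selection (illustrative sketch).

    Stage 1: keep only examples whose quality score passes the threshold.
    Stage 2: greedily pick k examples that maximize coverage, where
    coverage is approximated here by a toy set of per-example tags.
    """
    # Stage 1: quality filtering.
    pool = [ex for ex in dataset if quality(ex) >= threshold]

    # Stage 2: greedy coverage-maximizing selection of the seed set.
    seed, covered = [], set()
    for _ in range(min(k, len(pool))):
        best = max(pool, key=lambda ex: len(set(ex["tags"]) - covered))
        pool.remove(best)
        seed.append(best)
        covered |= set(best["tags"])
    return seed
```

In practice the quality model would be learned and coverage would be measured in an embedding space; the greedy set-cover loop above only illustrates the shape of the pipeline.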
no code implementations • 14 Nov 2023 • Chong Li, Shaonan Wang, Jiajun Zhang, Chengqing Zong
It aligns the internal sentence representations across different languages via multilingual contrastive learning and aligns model outputs by answering prompts in different languages.
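Aligning sentence representations across languages via contrastive learning, as described above, is commonly done with an InfoNCE-style objective. A minimal sketch follows; the cosine similarity, the temperature value, and the pure-Python formulation are illustrative assumptions, not the paper's implementation.

```python
import math

def info_nce(src, tgt, temperature=0.1):
    """Contrastive alignment loss over paired sentence vectors (sketch).

    src[i] and tgt[i] are representations of the same sentence in two
    languages; each source is pulled toward its paired target and pushed
    away from the other targets in the batch.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def cos(a, b):
        return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

    loss = 0.0
    for i, s in enumerate(src):
        # Similarities of this source sentence to every target in the batch.
        logits = [cos(s, t) / temperature for t in tgt]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the i-th target as the positive pair.
        loss += -(logits[i] - log_z)
    return loss / len(src)
```

Minimizing this loss drives translation pairs to nearby points in the shared representation space, which is the alignment effect the entry refers to.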
1 code implementation • 2 Nov 2023 • Jianghao Chen, Pu Jian, Tengxiao Xi, Dongyi Yi, Qianlong Du, Chenglin Ding, Guibo Zhu, Chengqing Zong, Jinqiao Wang, Jiajun Zhang
Using our proposed approach, we release ChineseWebText, the largest and most recent large-scale, high-quality Chinese web text dataset, which comprises 1.42 TB of text; each text is associated with a quality score, enabling LLM researchers to select data according to their desired quality thresholds.
1 code implementation • 16 Oct 2023 • Chong Li, Shaonan Wang, Yunhao Zhang, Jiajun Zhang, Chengqing Zong
We further propose a simple multi-task training method to increase functional specialization and mitigate negative information transfer in multi-task learning.
1 code implementation • 2 Sep 2023 • Chen Wang, Minpeng Liao, Zhongqiang Huang, Jinliang Lu, Junhong Wu, Yuchen Liu, Chengqing Zong, Jiajun Zhang
One is a cascaded approach where outputs (tokens or states) of a separately trained speech recognition system are used as inputs for LLMs, which limits their potential in modeling alignment between speech and text.
1 code implementation • 6 Jul 2023 • Min Xiao, Junnan Zhu, Haitao Lin, Yu Zhou, Chengqing Zong
Therefore, we propose a novel Coarse-to-Fine contribution network for multimodal Summarization (CFSum) to consider different contributions of images for summarization.
2 code implementations • 29 May 2023 • Wen Yang, Chong Li, Jiajun Zhang, Chengqing Zong
Second, we continue training the model with a large-scale parallel dataset that covers 102 natural languages.
1 code implementation • 9 May 2023 • Cong Ma, Yaping Zhang, Mei Tu, Yang Zhao, Yu Zhou, Chengqing Zong
Furthermore, the ablation studies verify the generalization of our method, where the proposed modal adapter is effective to bridge various OCR and MT models.
no code implementations • 9 May 2023 • Cong Ma, Yaping Zhang, Mei Tu, Yang Zhao, Yu Zhou, Chengqing Zong
Text image machine translation (TIMT), which translates source-language text embedded in images into sentences in a target language, has been widely used in various real-world applications.
no code implementations • 12 Jan 2023 • Shaonan Wang, Nai Ding, Nan Lin, Jiajun Zhang, Chengqing Zong
Language understanding is a key scientific issue in the fields of cognitive and computer science.
no code implementations • 6 Dec 2022 • Yang Zhao, Junnan Zhu, Lu Xiang, Jiajun Zhang, Yu Zhou, FeiFei Zhai, Chengqing Zong
To alleviate the CF, we investigate knowledge distillation based life-long learning methods.
1 code implementation • 18 Oct 2022 • Chen Wang, Yuchen Liu, Boxing Chen, Jiajun Zhang, Wei Luo, Zhongqiang Huang, Chengqing Zong
Existing zero-shot methods fail to align the two modalities of speech and text into a shared semantic space, resulting in much worse performance compared to the supervised ST methods.
2 code implementations • ACL 2022 • Haitao Lin, Junnan Zhu, Lu Xiang, Yu Zhou, Jiajun Zhang, Chengqing Zong
Therefore, we propose a novel role interaction enhanced method for role-oriented dialogue summarization.
1 code implementation • 18 Jan 2022 • Feihu Jin, Jinliang Lu, Jiajun Zhang, Chengqing Zong
Specifically, we suppose that each learnable prompt token has a different contribution to different instances, and we learn the contribution by calculating the relevance score between an instance and each prompt token.
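The instance-dependent prompt weighting described above can be sketched as follows. The dot-product relevance score and softmax weighting are illustrative assumptions about one plausible realization, not the paper's actual formulation.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def weight_prompt_tokens(instance_vec, prompt_tokens):
    """Reweight learnable prompt tokens per instance (sketch).

    instance_vec: pooled representation of one input instance.
    prompt_tokens: list of learnable prompt token embeddings.
    Each token's relevance to the instance is scored, and the scores
    determine how strongly that token contributes for this instance.
    """
    # Relevance score: dot product between the instance and each token.
    scores = [sum(i * p for i, p in zip(instance_vec, tok))
              for tok in prompt_tokens]
    weights = softmax(scores)
    # Scale each prompt token by its instance-specific weight.
    return [[w * x for x in tok] for tok, w in zip(prompt_tokens, weights)]
```

In a real model the weighted tokens would be prepended to the input sequence and the prompt embeddings trained end-to-end; the sketch only shows the per-instance weighting step.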
no code implementations • 8 Dec 2021 • Jian Sun, Yu Zhou, Chengqing Zong
To address this problem, we propose a novel model, DyMen, which uses reinforcement learning to dynamically adjust the subsequent linking target based on previously linked entities, enabling the model to select a link target that makes full use of previously linked information.
no code implementations • NeurIPS Workshop AI4Scien 2021 • Shuxian Zou, Shaonan Wang, Jiajun Zhang, Chengqing Zong
However, most existing studies have focused on discriminating which of two stimuli corresponds to a given brain image, which is far from directly generating text from neural activities.
2 code implementations • EMNLP 2021 • Haitao Lin, Liqun Ma, Junnan Zhu, Lu Xiang, Yu Zhou, Jiajun Zhang, Chengqing Zong
Therefore, in this paper, we introduce a novel Chinese dataset for Customer Service Dialogue Summarization (CSDS).
1 code implementation • 19 Aug 2021 • Haitao Lin, Lu Xiang, Yu Zhou, Jiajun Zhang, Chengqing Zong
We propose two strategies for the fine-tuning process: value-based and context-based augmentation.
no code implementations • ACL 2021 • Xiangyu Wang, Chengqing Zong
Emotion categories are usually defined by humans, but it is difficult to clearly distinguish and define the boundaries between different emotion categories.
no code implementations • AACL 2020 • Qian Wang, Jiajun Zhang, Lemao Liu, Guoping Huang, Chengqing Zong
We propose a touch-based editing method for translation, which is more flexible than traditional keyboard-and-mouse translation post-editing.
no code implementations • COLING 2020 • Haoran Li, Junnan Zhu, Jiajun Zhang, Xiaodong He, Chengqing Zong
Thus, we propose a multimodal selective gate network that considers reciprocal relationships between textual and multi-level visual features, including global image descriptor, activation grids, and object proposals, to select highlights of the event when encoding the source sentence.
no code implementations • COLING 2020 • Yang Zhao, Lu Xiang, Junnan Zhu, Jiajun Zhang, Yu Zhou, Chengqing Zong
Previous studies combining knowledge graphs (KGs) with neural machine translation (NMT) have two problems: i) Knowledge under-utilization: they focus only on entities that appear in both the KG and the training sentence pairs, leaving much of the knowledge in the KG underutilized.
no code implementations • COLING 2020 • Jingyuan Sun, Shaonan Wang, Jiajun Zhang, Chengqing Zong
The framework is based on language models and can be smoothly built with different language model architectures.
no code implementations • COLING 2020 • Jian Sun, Yu Zhou, Chengqing Zong
The hierarchical attention adaptively aggregates the low-hierarchy and the high-hierarchy information, which is beneficial to balance the neighborhood information of counterpart entities and distinguish non-counterpart entities with similar structures.
no code implementations • 28 Oct 2020 • Yuchen Liu, Junnan Zhu, Jiajun Zhang, Chengqing Zong
End-to-end speech translation aims to translate speech in one language into text in another language in an end-to-end manner.
no code implementations • EMNLP 2020 • Xiaomian Kang, Yang Zhao, Jiajun Zhang, Chengqing Zong
Specifically, we introduce a selection module that is independent of the translation module to score each candidate context sentence.