no code implementations • Findings (EMNLP) 2021 • Xin Huang, Jiajun Zhang, Chengqing Zong
Inspired by the finding of (CITATION) that entities are the most informative elements in an image, we propose an explicit entity-level cross-modal learning approach that aims to augment the entity representation.
no code implementations • LREC 2022 • Xiaohan Zhang, Shaonan Wang, Chengqing Zong
Based on these results, we suggest a block-wise cross-validation training method and an adequate data size for increasing the performance of linear encoding models.
1 code implementation • Findings (ACL) 2022 • Shuxian Zou, Shaonan Wang, Jiajun Zhang, Chengqing Zong
More importantly, it demonstrates that it is feasible to decode a specific word from a large vocabulary based on its corresponding brain activity.
no code implementations • EMNLP 2020 • Jinghui Yan, Yining Wang, Lu Xiang, Yu Zhou, Chengqing Zong
Medical entity normalization, which links medical mentions in the text to entities in knowledge bases, is an important research topic in medical natural language processing.
1 code implementation • 7 Apr 2024 • Junhong Wu, Yuchen Liu, Chengqing Zong
In the evolving landscape of Neural Machine Translation (NMT), the pretrain-then-finetune paradigm has yielded impressive results.
no code implementations • 26 Mar 2024 • Xinpei Zhao, Jingyuan Sun, Shaonan Wang, Jing Ye, Xiaohan Zhang, Chengqing Zong
In contrast, we propose a simple yet effective method that guides text reconstruction by directly comparing the generated text with the predicted text embeddings mapped from brain activities.
no code implementations • 20 Mar 2024 • Shaonan Wang, Jingyuan Sun, Yunhao Zhang, Nan Lin, Marie-Francine Moens, Chengqing Zong
Despite differing from the human language processing mechanism in implementation and algorithms, current language models demonstrate remarkable human-like or surpassing language capabilities.
no code implementations • 2 Mar 2024 • Yunhao Zhang, Xiaohan Zhang, Chong Li, Shaonan Wang, Chengqing Zong
Results show that language models share significant similarities with human cognitive data and the similarity patterns are modulated by the data modality and stimuli complexity.
1 code implementation • 27 Nov 2023 • Qianlong Du, Chengqing Zong, Jiajun Zhang
First, our approach uses a quality evaluation model to filter the original instruction dataset down to a high-quality subset, and then applies an algorithm to select from that subset a seed instruction dataset with good coverage.
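The two-stage selection described above (quality filtering, then coverage-driven seed selection) can be sketched roughly as follows. This is an illustrative toy, not the paper's actual algorithm: the quality scorer, the tag-based notion of coverage, and the greedy selection strategy are all assumptions.

```python
def select_seed_instructions(dataset, quality, threshold, k):
    """Two-stage instruction selection (illustrative sketch).

    Stage 1: keep only examples whose quality score passes the threshold.
    Stage 2: greedily pick k examples that maximize coverage, where
    coverage is approximated here by a toy set of per-example tags.
    """
    # Stage 1: quality filtering.
    pool = [ex for ex in dataset if quality(ex) >= threshold]

    # Stage 2: greedy coverage-maximizing selection of the seed set.
    seed, covered = [], set()
    for _ in range(min(k, len(pool))):
        best = max(pool, key=lambda ex: len(set(ex["tags"]) - covered))
        pool.remove(best)
        seed.append(best)
        covered |= set(best["tags"])
    return seed
```

In practice the quality model would be learned and coverage would be measured in an embedding space; the greedy set-cover loop above only illustrates the shape of the pipeline.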
no code implementations • 14 Nov 2023 • Chong Li, Shaonan Wang, Jiajun Zhang, Chengqing Zong
It aligns the internal sentence representations across different languages via multilingual contrastive learning and aligns model outputs by answering prompts in different languages.
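Aligning sentence representations across languages via contrastive learning, as described above, is commonly done with an InfoNCE-style objective. A minimal sketch follows; the cosine similarity, the temperature value, and the pure-Python formulation are illustrative assumptions, not the paper's implementation.

```python
import math

def info_nce(src, tgt, temperature=0.1):
    """Contrastive alignment loss over paired sentence vectors (sketch).

    src[i] and tgt[i] are representations of the same sentence in two
    languages; each source is pulled toward its paired target and pushed
    away from the other targets in the batch.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def cos(a, b):
        return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

    loss = 0.0
    for i, s in enumerate(src):
        # Similarities of this source sentence to every target in the batch.
        logits = [cos(s, t) / temperature for t in tgt]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the i-th target as the positive pair.
        loss += -(logits[i] - log_z)
    return loss / len(src)
```

Minimizing this loss drives translation pairs to nearby points in the shared representation space, which is the alignment effect the entry refers to.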
1 code implementation • 2 Nov 2023 • Jianghao Chen, Pu Jian, Tengxiao Xi, Dongyi Yi, Qianlong Du, Chenglin Ding, Guibo Zhu, Chengqing Zong, Jinqiao Wang, Jiajun Zhang
Using our proposed approach, we release ChineseWebText, the largest and most recent large-scale, high-quality Chinese web text dataset, which comprises 1.42 TB of text; each text is associated with a quality score, enabling LLM researchers to select data according to their desired quality thresholds.
1 code implementation • 16 Oct 2023 • Chong Li, Shaonan Wang, Yunhao Zhang, Jiajun Zhang, Chengqing Zong
We further propose a simple multi-task training method to increase functional specialization and mitigate negative information transfer in multi-task learning.
1 code implementation • 2 Sep 2023 • Chen Wang, Minpeng Liao, Zhongqiang Huang, Jinliang Lu, Junhong Wu, Yuchen Liu, Chengqing Zong, Jiajun Zhang
One is a cascaded approach where outputs (tokens or states) of a separately trained speech recognition system are used as inputs for LLMs, which limits their potential in modeling alignment between speech and text.
1 code implementation • 6 Jul 2023 • Min Xiao, Junnan Zhu, Haitao Lin, Yu Zhou, Chengqing Zong
Therefore, we propose a novel Coarse-to-Fine contribution network for multimodal Summarization (CFSum) to consider different contributions of images for summarization.
2 code implementations • 29 May 2023 • Wen Yang, Chong Li, Jiajun Zhang, Chengqing Zong
Second, we continue training the model with a large-scale parallel dataset that covers 102 natural languages.
1 code implementation • 9 May 2023 • Cong Ma, Yaping Zhang, Mei Tu, Yang Zhao, Yu Zhou, Chengqing Zong
Furthermore, the ablation studies verify the generalization of our method, where the proposed modal adapter is effective to bridge various OCR and MT models.
no code implementations • 9 May 2023 • Cong Ma, Yaping Zhang, Mei Tu, Yang Zhao, Yu Zhou, Chengqing Zong
Text image machine translation (TIMT), which translates source-language text embedded in images into sentences in a target language, has been widely used in various real-world applications.
no code implementations • 12 Jan 2023 • Shaonan Wang, Nai Ding, Nan Lin, Jiajun Zhang, Chengqing Zong
Language understanding is a key scientific issue in the fields of cognitive and computer science.
no code implementations • 6 Dec 2022 • Yang Zhao, Junnan Zhu, Lu Xiang, Jiajun Zhang, Yu Zhou, FeiFei Zhai, Chengqing Zong
To alleviate the CF, we investigate knowledge distillation based life-long learning methods.
1 code implementation • 18 Oct 2022 • Chen Wang, Yuchen Liu, Boxing Chen, Jiajun Zhang, Wei Luo, Zhongqiang Huang, Chengqing Zong
Existing zero-shot methods fail to align the two modalities of speech and text into a shared semantic space, resulting in much worse performance compared to the supervised ST methods.
2 code implementations • ACL 2022 • Haitao Lin, Junnan Zhu, Lu Xiang, Yu Zhou, Jiajun Zhang, Chengqing Zong
Therefore, we propose a novel role interaction enhanced method for role-oriented dialogue summarization.
1 code implementation • 18 Jan 2022 • Feihu Jin, Jinliang Lu, Jiajun Zhang, Chengqing Zong
Specifically, we suppose that each learnable prompt token has a different contribution to different instances, and we learn the contribution by calculating the relevance score between an instance and each prompt token.
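The instance-dependent prompt weighting described above can be sketched as follows. The dot-product relevance score and softmax weighting are illustrative assumptions about one plausible realization, not the paper's actual formulation.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def weight_prompt_tokens(instance_vec, prompt_tokens):
    """Reweight learnable prompt tokens per instance (sketch).

    instance_vec: pooled representation of one input instance.
    prompt_tokens: list of learnable prompt token embeddings.
    Each token's relevance to the instance is scored, and the scores
    determine how strongly that token contributes for this instance.
    """
    # Relevance score: dot product between the instance and each token.
    scores = [sum(i * p for i, p in zip(instance_vec, tok))
              for tok in prompt_tokens]
    weights = softmax(scores)
    # Scale each prompt token by its instance-specific weight.
    return [[w * x for x in tok] for tok, w in zip(prompt_tokens, weights)]
```

In a real model the weighted tokens would be prepended to the input sequence and the prompt embeddings trained end-to-end; the sketch only shows the per-instance weighting step.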
no code implementations • 8 Dec 2021 • Jian Sun, Yu Zhou, Chengqing Zong
To address this problem, we propose a novel model, DyMen, which uses reinforcement learning to dynamically adjust the subsequent linking target based on previously linked entities, enabling the model to select a link target that makes full use of previously linked information.
no code implementations • NeurIPS Workshop AI4Scien 2021 • Shuxian Zou, Shaonan Wang, Jiajun Zhang, Chengqing Zong
However, most existing studies have focused on discriminating which of two stimuli corresponds to a given brain image, which is far from directly generating text from neural activities.
2 code implementations • EMNLP 2021 • Haitao Lin, Liqun Ma, Junnan Zhu, Lu Xiang, Yu Zhou, Jiajun Zhang, Chengqing Zong
Therefore, in this paper, we introduce a novel Chinese dataset for Customer Service Dialogue Summarization (CSDS).
1 code implementation • 19 Aug 2021 • Haitao Lin, Lu Xiang, Yu Zhou, Jiajun Zhang, Chengqing Zong
We propose two strategies for the fine-tuning process: value-based and context-based augmentation.
no code implementations • ACL 2021 • Xiangyu Wang, Chengqing Zong
Emotion categories are usually defined by humans, but it is difficult to clearly distinguish and define the boundaries between different emotion categories.
no code implementations • AACL 2020 • Qian Wang, Jiajun Zhang, Lemao Liu, Guoping Huang, Chengqing Zong
We propose a touch-based editing method for translation, which is more flexible than traditional keyboard-and-mouse translation post-editing.
no code implementations • COLING 2020 • Haoran Li, Junnan Zhu, Jiajun Zhang, Xiaodong He, Chengqing Zong
Thus, we propose a multimodal selective gate network that considers reciprocal relationships between textual and multi-level visual features, including global image descriptor, activation grids, and object proposals, to select highlights of the event when encoding the source sentence.
no code implementations • COLING 2020 • Yang Zhao, Lu Xiang, Junnan Zhu, Jiajun Zhang, Yu Zhou, Chengqing Zong
Previous studies combining knowledge graphs (KGs) with neural machine translation (NMT) have two problems: i) Knowledge under-utilization: they focus only on entities that appear in both the KG and the training sentence pairs, leaving much of the knowledge in the KG underutilized.
no code implementations • COLING 2020 • Jingyuan Sun, Shaonan Wang, Jiajun Zhang, Chengqing Zong
The framework is based on language models and can be smoothly built with different language model architectures.
no code implementations • COLING 2020 • Jian Sun, Yu Zhou, Chengqing Zong
The hierarchical attention adaptively aggregates the low-hierarchy and the high-hierarchy information, which is beneficial to balance the neighborhood information of counterpart entities and distinguish non-counterpart entities with similar structures.
no code implementations • 28 Oct 2020 • Yuchen Liu, Junnan Zhu, Jiajun Zhang, Chengqing Zong
End-to-end speech translation aims to translate speech in one language into text in another language in an end-to-end manner.
no code implementations • EMNLP 2020 • Xiaomian Kang, Yang Zhao, Jiajun Zhang, Chengqing Zong
Specifically, we introduce a selection module that is independent of the translation module to score each candidate context sentence.