Search Results for author: Zhengtao Yu

Found 38 papers, 12 papers with code

R$^3$Net: Relation-embedded Representation Reconstruction Network for Change Captioning

1 code implementation EMNLP 2021 Yunbin Tu, Liang Li, Chenggang Yan, Shengxiang Gao, Zhengtao Yu

In this paper, we propose a Relation-embedded Representation Reconstruction Network (R$^3$Net) to explicitly distinguish the real change from the large amount of clutter and irrelevant changes.

Caption Generation Relation +1

基于阅读理解的汉越跨语言新闻事件要素抽取方法(News Events Element Extraction of Chinese-Vietnamese Cross-language Using Reading Comprehension)

no code implementations CCL 2021 Enchang Zhu, Zhengtao Yu, Chengxiang Gao, Yuxin Huang, Junjun Guo

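News event element extraction aims to extract the elements that describe the topic event in a news text, such as time, location, people, and organization names. Traditional event element extraction methods perform poorly on low-resource languages and struggle to model the semantics of long texts. To address this, this paper proposes a reading-comprehension-based method for Chinese-Vietnamese cross-lingual news event element extraction. The method first uses a key-sentence retrieval module for long news texts to filter out noisy sentences. It then uses a cross-lingual reading comprehension model to transfer knowledge from the resource-rich language to Vietnamese, improving the performance of Vietnamese news event element extraction. Experiments on a self-built Chinese-Vietnamese bilingual news event element extraction dataset demonstrate the effectiveness of the method.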

Reading Comprehension

基于图文细粒度对齐语义引导的多模态神经机器翻译方法(Based on Semantic Guidance of Fine-grained Alignment of Image-Text for Multi-modal Neural Machine Translation)

no code implementations CCL 2022 Junjie Ye, Junjun Guo, Kaiwen Tan, Yan Xiang, Zhengtao Yu

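Multimodal neural machine translation aims to use visual information to improve text translation quality. Traditional multimodal machine translation incorporates the global semantic information of the image into the translation model but ignores the effect of fine-grained image information on translation quality. To address this, this paper proposes a multimodal neural machine translation method guided by fine-grained image-text alignment semantics. The method first lets image and text information interact across modalities to extract fine-grained image-text alignment semantics; then, using these semantics as a pivot, it applies a gating mechanism to align the multimodal fine-grained information to the text information, achieving image-text multimodal feature fusion. Experimental results on the English→German, English→French, and English→Czech translation tasks of the Multi30K multimodal machine translation benchmark show that the proposed method is effective and outperforms most state-of-the-art multimodal machine translation methods.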

Machine Translation

多特征融合的越英端到端语音翻译方法(A Vietnamese-English end-to-end speech translation method based on multi-feature fusion)

no code implementations CCL 2022 Houli Ma, Ling Dong, Wenjun Wang, Jian Wang, Shengxiang Gao, Zhengtao Yu

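The encoder in speech translation must encode both the acoustic and the semantic information in speech, and a single Fbank or Wav2vec2 speech feature has limited representational capacity. By analyzing the differences between hand-crafted Fbank features and self-supervised Wav2vec2 features, this paper proposes an acoustic feature fusion method based on a cross-attention mechanism and explores different self-supervised features and fusion strategies, strengthening the model's learning of acoustic and semantic information in speech. Taking the characteristics of Vietnamese speech into account, the Fbank representation is encoded with Fbank features as the primary input and Pitch features as an auxiliary input, yielding a multi-feature-fusion Vietnamese-English speech translation model. Experiments show that the speech translation model using multiple features translates better than single-feature models and is more effective than simple feature concatenation; the proposed multi-feature fusion method improves BLEU by 1.97 on the Vietnamese-English speech translation task.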

融合双重注意力机制的缅甸语图像文本识别方法(Burmese image text recognition method with dual attention mechanism)

no code implementations CCL 2022 Fengxiao Wang, Cunli Mao, Zhengtao Yu, Shengxiang Gao, Huang Yuxin, Fuhao Liu

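Because Burmese characters have a distinctive encoding structure and character composition rules, existing image text recognition methods cannot attend sufficiently to features at the edges of the text in Burmese image recognition, which causes the superscript and subscript components of Burmese characters to be lost. This paper therefore improves on Transformer-based image text recognition and proposes a visual attention module that fuses channel and spatial attention mechanisms, aiming to capture pixel-level pairwise relations and channel dependencies and to reduce noise interference in Burmese images, thereby obtaining semantically more complete feature maps. In addition, during decoding, decoding units based on multi-head attention are combined into a decoder that converts the feature sequence into Burmese text. Experimental results show that, on a self-constructed Burmese image text recognition dataset, the method improves recognition accuracy by 0.5% over the Transformer, reaching 95.3%.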

基于跨语言双语预训练及Bi-LSTM的汉-越平行句对抽取方法(Chinese-Vietnamese Parallel Sentence Pair Extraction Method Based on Cross-lingual Bilingual Pre-training and Bi-LSTM)

no code implementations CCL 2020 Chang Liu, Shengxiang Gao, Zhengtao Yu, Yuxin Huang, Congcong You

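Chinese-Vietnamese parallel sentence pair extraction is an important way to alleviate the scarcity of Chinese-Vietnamese parallel corpora. Parallel sentence pair extraction can be cast as a sentence similarity classification task in a shared semantic space, and its core is the alignment of the bilingual semantic spaces. Traditional semantic space alignment methods rely on large-scale bilingual parallel corpora, which are relatively hard to obtain for Vietnamese, a low-resource language. To address this problem, this paper proposes a Chinese-Vietnamese parallel sentence pair extraction method based on cross-lingual bilingual pre-training with a seed dictionary and Bi-LSTM (Bi-directional Long Short-Term Memory). Pre-training requires only a large amount of Chinese and Vietnamese monolingual data and a Chinese-Vietnamese seed dictionary, which is used to map the two languages into a common semantic space for word alignment. Bi-LSTM and CNN (Convolutional Neural Networks) are then used to extract the global and local features of sentences, respectively, so as to represent the semantic relatedness between Chinese-Vietnamese sentence pairs as fully as possible. Experimental results show that the proposed model improves the F1 score by 7.1% and outperforms the baseline model.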

Sentence

基于模型不确定性约束的半监督汉缅神经机器翻译(Semi-Supervised Chinese-Myanmar Neural Machine Translation based Model-Uncertainty)

no code implementations CCL 2021 Linqin Wang, Zhengtao Yu, Cunli Mao, Chengxiang Gao, Zhibo Man, Zhenhan Wang

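Semi-supervised neural machine translation based on back-translation has achieved clear gains in low-resource neural machine translation. However, because Chinese-Burmese bilingual resources are scarce and the two languages differ considerably in structure, the encoder-side self-attention mechanism in traditional Transformer-based back-translation cannot effectively discriminate the influence that noise in the pseudo-parallel data produced by back-translation has on sentence encoding, leading to under-translation, over-translation, and mistranslation. To address this, this paper proposes a semi-supervised Chinese-Burmese neural machine translation method constrained by model uncertainty. Within the Transformer network, Monte Carlo Dropout based on variational inference is used to build a model-uncertainty attention mechanism that yields sentence vector representations able to distinguish noisy data; these are then fused with the sentence encoding vectors produced by self-attention to obtain an effective sentence encoding. Experiments show that, compared with traditional Transformer-based back-translation, the method improves BLEU by 4.01 and 1.88 points in the Chinese-to-Burmese and Burmese-to-Chinese directions, respectively, confirming its effectiveness for Chinese-Burmese neural machine translation.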

Machine Translation

基于中文信息与越南语句法指导的越南语事件检测(Vietnamese event detection based on Chinese information and Vietnamese syntax guidance)

no code implementations CCL 2021 Long Chen, Junjun Guo, Yafei Zhang, Chengxiang Gao, Zhengtao Yu

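Current deep-learning-based event detection models all depend on a sufficient amount of annotated data, and the scarcity of annotated data together with event type ambiguity poses great challenges for Vietnamese event detection. Based on the multilingual consistency property that "sentences expressing the same idea in different languages usually share the same or similar semantic components," this paper proposes a Vietnamese event detection framework guided by Chinese information and Vietnamese syntax. It first integrates Chinese information into Vietnamese through a shared-encoder strategy and a cross-attention network, then incorporates Vietnamese dependency syntax information using a graph convolutional network, and finally performs Vietnamese event detection under the guidance of Chinese event types. Experimental results show that Vietnamese event detection achieves good results under the guidance of Chinese information and Vietnamese syntax.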

Event Detection

基于多语言联合训练的汉-英-缅神经机器翻译方法(Chinese-English-Burmese Neural Machine Translation Method Based on Multilingual Joint Training)

no code implementations CCL 2020 Zhibo Man, Cunli Mao, Zhengtao Yu, Xunyu Li, Shengxiang Gao, Junguo Zhu

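Multilingual neural machine translation is an effective approach to low-resource neural machine translation. Existing methods usually rely on a shared vocabulary to handle multilingual translation among similar languages such as English, French, and German. Burmese is a typical low-resource language, and Chinese, English, and Burmese differ considerably in linguistic structure. To alleviate the limit that these differences place on the size of the shared vocabulary, this paper proposes a Chinese-English-Burmese neural machine translation method based on multilingual joint training. Under the Transformer framework, the abundant Chinese-English parallel corpus is trained jointly with the Chinese-Burmese and English-Burmese corpora. During training, Chinese, English, and Burmese are mapped into the same semantic space on both the encoder and decoder sides, reducing the effect of their structural differences on the shared vocabulary, and the parameters trained on the Chinese-English corpus are shared to compensate for the lack of Chinese-Burmese data. Experiments show that, in one-to-many and many-to-many translation settings, the proposed method clearly improves BLEU over the baseline model for Chinese-English, English-Burmese, and Chinese-Burmese.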

Machine Translation

StreamingDialogue: Prolonged Dialogue Learning via Long Context Compression with Minimal Losses

no code implementations 13 Mar 2024 Jia-Nan Li, Quan Tu, Cunli Mao, Zhengtao Yu, Ji-Rong Wen, Rui Yan

Accordingly, we introduce StreamingDialogue, which compresses long dialogue history into conv-attn sinks with minimal losses, and thus reduces computational complexity quadratically with the number of sinks (i.e., the number of utterances).

Single-Image HDR Reconstruction Assisted Ghost Suppression and Detail Preservation Network for Multi-Exposure HDR Imaging

1 code implementation 7 Mar 2024 Huafeng Li, Zhenmei Yang, Yafei Zhang, Dapeng Tao, Zhengtao Yu

This network, comprising single-frame HDR reconstruction with enhanced stop image (SHDR-ESI) and SHDR-ESI-assisted multi-exposure HDR reconstruction (SHDRA-MHDR), effectively leverages the ghost-free characteristic of single-frame HDR reconstruction and the detail-enhancing capability of ESI in oversaturated areas.

HDR Reconstruction Image Reconstruction

"In Dialogues We Learn": Towards Personalized Dialogue Without Pre-defined Profiles through In-Dialogue Learning

no code implementations 5 Mar 2024 Chuanqi Cheng, Quan Tu, Wei Wu, Shuo Shang, Cunli Mao, Zhengtao Yu, Rui Yan

Personalized dialogue systems have gained significant attention in recent years for their ability to generate responses in alignment with different personas.

Dialogue Generation

DeepRicci: Self-supervised Graph Structure-Feature Co-Refinement for Alleviating Over-squashing

no code implementations 23 Jan 2024 Li Sun, Zhenhao Huang, Hua Wu, Junda Ye, Hao Peng, Zhengtao Yu, Philip S. Yu

Graph Neural Networks (GNNs) have shown great power for learning and mining on graphs, and Graph Structure Learning (GSL) plays an important role in boosting GNNs with a refined graph.

Contrastive Learning Graph structure learning

Hierarchical and Incremental Structural Entropy Minimization for Unsupervised Social Event Detection

1 code implementation 19 Dec 2023 Yuwei Cao, Hao Peng, Zhengtao Yu, Philip S. Yu

As a trending approach for social event detection, graph neural network (GNN)-based methods enable a fusion of natural language semantics and the complex social network structural information, thus showing SOTA performance.

Event Detection Graph Neural Network

Prompt Based Tri-Channel Graph Convolution Neural Network for Aspect Sentiment Triplet Extraction

1 code implementation 18 Dec 2023 Kun Peng, Lei Jiang, Hao Peng, Rui Liu, Zhengtao Yu, Jiaqian Ren, Zhifeng Hao, Philip S. Yu

Aspect Sentiment Triplet Extraction (ASTE) is an emerging task to extract a given sentence's triplets, which consist of aspects, opinions, and sentiments.

Aspect Sentiment Triplet Extraction

Uncertainty-guided Boundary Learning for Imbalanced Social Event Detection

1 code implementation 30 Oct 2023 Jiaqian Ren, Hao Peng, Lei Jiang, Zhiwei Liu, Jia Wu, Zhengtao Yu, Philip S. Yu

While in our observation, compared to the rarity of classes, the calibrated uncertainty estimated from well-trained evidential deep learning networks better reflects model performance.

Contrastive Learning Event Detection

CHITNet: A Complementary to Harmonious Information Transfer Network for Infrared and Visible Image Fusion

no code implementations 12 Sep 2023 Yafei Zhang, Keying Du, Huafeng Li, Zhengtao Yu, Yu Liu

Specifically, to skillfully sidestep aggregating complementary information in IVIF, we design a mutual information transfer (MIT) module to mutually represent features from two modalities, roughly transferring complementary information into harmonious one.

Infrared And Visible Image Fusion

Generation and Recombination for Multifocus Image Fusion with Free Number of Inputs

no code implementations 9 Sep 2023 Huafeng Li, Dan Wang, Yuxin Huang, Yafei Zhang, Zhengtao Yu

To distinguish the hard pixels from the source images, we achieve the determination of hard pixels by considering the inconsistency among the detection results of focus areas in source images.

Progressive Feature Mining and External Knowledge-Assisted Text-Pedestrian Image Retrieval

no code implementations 23 Aug 2023 Huafeng Li, Shedan Yang, Yafei Zhang, Dapeng Tao, Zhengtao Yu

In addition, to further reduce the negative impact of modal discrepancy and text diversity on cross-modal matching, we propose to use other sample knowledge of the same modality, i.e., external knowledge, to enhance identity-consistent features and weaken identity-inconsistent features.

Image Retrieval Retrieval

Domain-adaptive Person Re-identification without Cross-camera Paired Samples

no code implementations 13 Jul 2023 Huafeng Li, Yanmei Mao, Yafei Zhang, Guanqiu Qi, Zhengtao Yu

Therefore, the supervised model training is achieved under the style supervision of the target domain by exchanging styles between source-domain samples and target-domain samples, and the challenges caused by the lack of cross-camera paired samples are solved by utilizing cross-camera similar samples.

Domain Adaptive Person Re-Identification Person Re-Identification

Adversarial Self-Attack Defense and Spatial-Temporal Relation Mining for Visible-Infrared Video Person Re-Identification

no code implementations 8 Jul 2023 Huafeng Li, Le Xu, Yafei Zhang, Dapeng Tao, Zhengtao Yu

In this work, the changes of views, posture, background and modal discrepancy are considered as the main factors that cause the perturbations of person identity features.

Adversarial Attack Video-Based Person Re-Identification

Modeling the Relative Visual Tempo for Self-supervised Skeleton-based Action Recognition

1 code implementation ICCV 2023 Yisheng Zhu, Hu Han, Zhengtao Yu, Guangcan Liu

Specifically, we design a Relative Visual Tempo Learning (RVTL) task to explore the motion information in intra-video clips, and an Appearance-Consistency (AC) task to learn appearance information simultaneously, resulting in more representative spatiotemporal features.

Action Recognition Contrastive Learning +2

CSMPQ: Class Separability Based Mixed-Precision Quantization

no code implementations 20 Dec 2022 Mingkai Wang, Taisong Jin, Miaohui Zhang, Zhengtao Yu

Mixed-precision quantization has received increasing attention for its capability of reducing the computational burden and speeding up the inference time.

Quantization

Person Text-Image Matching via Text-Feature Interpretability Embedding and External Attack Node Implantation

1 code implementation 16 Nov 2022 Fan Li, Hang Zhou, Huafeng Li, Yafei Zhang, Zhengtao Yu

Specifically, we improve the interpretability of text features by providing them with consistent semantic information with image features to achieve the alignment of text and describe image region features. To address the challenges posed by the diversity of text and the corresponding person images, we treat the variation caused by diversity to features as caused by perturbation information and propose a novel adversarial attack and defense method to solve it.

Adversarial Attack Person Search +1

R$^3$Net: Relation-embedded Representation Reconstruction Network for Change Captioning

1 code implementation 20 Oct 2021 Yunbin Tu, Liang Li, Chenggang Yan, Shengxiang Gao, Zhengtao Yu

In this paper, we propose a Relation-embedded Representation Reconstruction Network (R$^3$Net) to explicitly distinguish the real change from the large amount of clutter and irrelevant changes.

Caption Generation Relation +1

Dual-Stream Reciprocal Disentanglement Learning for Domain Adaptation Person Re-Identification

1 code implementation 26 Jun 2021 Huafeng Li, Kaixiong Xu, Jinxing Li, Guangming Lu, Yong Xu, Zhengtao Yu, David Zhang

Since human-labeled samples are free for the target set, unsupervised person re-identification (Re-ID) has attracted much attention in recent years, by additionally exploiting the source set.

Disentanglement Domain Adaptation +2

Hazy Re-ID: An Interference Suppression Model For Domain Adaptation Person Re-identification Under Inclement Weather Condition

1 code implementation 22 Apr 2021 Jian Pang, Dacheng Zhang, Huafeng Li, Weifeng Liu, Zhengtao Yu

This paper proposes a novel Interference Suppression Model (ISM) to deal with the interference caused by the hazy weather in domain adaptation person Re-ID.

Domain Adaptation Person Re-Identification
