Search Results for author: Hongfei Lin

Found 40 papers, 14 papers with code

RealMedDial: A Real Telemedical Dialogue Dataset Collected from Online Chinese Short-Video Clips

no code implementations COLING 2022 Bo Xu, Hongtong Zhang, Jian Wang, Xiaokun Zhang, Dezhi Hao, Linlin Zong, Hongfei Lin, Fenglong Ma

We collected and annotated a wide range of meta-data with respect to medical dialogue including doctor profiles, hospital departments, diseases and symptoms for fine-grained analysis on language usage pattern and clinical diagnosis.

Response Generation

基于HowNet的无监督汉语动词隐喻识别方法(Unsupervised Chinese Verb Metaphor Recognition Method Based on HowNet)

no code implementations CCL 2021 Minghao Zhang, Dongyu Zhang, Hongfei Lin

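Metaphor is a core issue in human thought and language understanding. With the growth of the Internet and the emergence of massive amounts of text, automatically identifying metaphorical text with natural language processing techniques has become a pressing need. In current research on Chinese metaphor recognition, however, limited semantic resources make models prone to overfitting. In addition, mainstream metaphor recognition methods suffer from poor interpretability. To address these problems, this paper constructs a relatively large dataset of Chinese verb metaphors and proposes an unsupervised Chinese verb metaphor recognition model based on HowNet. Experimental results show that the proposed model can be applied effectively to verb metaphor recognition and outperforms the unsupervised models it is compared against; moreover, compared with other deep learning models for metaphor recognition, it has a simple structure, few parameters, and strong interpretability.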

软件标识符的自然语言规范性研究(Research on the Natural Language Normalness of Software Identifiers)

no code implementations CCL 2021 Dongzhen Wen, Fan Zhang, Xiao Zhang, Liang Yang, Yuan Lin, Bo Xu, Hongfei Lin

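Understanding software source code is central to collaborative software development and maintenance, and understanding identifiers, which make up more than half of source code, plays an important role in software comprehension. Traditional software engineering has mainly studied how naming conventions can constrain identifier naming so as to produce identifiers that are easier to understand and communicate. Building on a review and analysis of the naming conventions of common programming languages, this paper proposes a new evaluation criterion for identifier comprehensibility. Specifically, it first summarizes the naming conventions of mainstream programming languages and, by analogy with the concept of the morpheme in natural language, proposes an identifier construction process based on software morphemes: an identifier is formed by generating, ordering, and concatenating software morphemes. On this basis, it proposes a method for evaluating the normalness of software identifiers that draws on natural language corpora, used to measure whether an identifier is easy to understand. Finally, validation experiments on a source code comprehension dataset and on open-source projects from GitHub show that the proposed normalness score measures the comprehensibility of software projects well.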

基于风格化嵌入的中文文本风格迁移(Chinese text style transfer based on stylized embedding)

no code implementations CCL 2021 Chenguang Wang, Hongfei Lin, Liang Yang

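Dialogue style can reflect attributes of the speaker, such as emotion, gender, and educational background. In a dialogue system, understanding a user's dialogue style makes it possible to model the user better. Likewise, a chatbot should use different language styles when talking with users from different backgrounds. Style of expression is an intrinsic property of text, yet most existing research on text style transfer focuses on English, with comparatively little work on Chinese. This paper constructs three datasets for research on Chinese text style transfer and applies several existing text style transfer methods to them. It also proposes a style transfer model based on the DeepStyle algorithm and the Transformer, in which pre-training yields hidden-layer vector representations of different styles. The generator is built on the Transformer: during decoding, it preserves the content of the generated text by reconstructing the source text, and it introduces embeddings of the opposite style so that the model can generate text in different styles. Experimental results show that the proposed model outperforms existing models on all of the constructed Chinese datasets.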

Style Transfer, Text Style Transfer

结合标签转移关系的多任务笑点识别方法(Multi-task punchlines recognition method combined with label transfer relationship)

no code implementations CCL 2021 Tongyue Zhang, Shaowu Zhang, Bo Xu, Liang Yang, Hongfei Lin

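Humor plays an important role in human communication and is abundant in sitcoms. The punchline is one of the forms through which sitcoms achieve humorous effect. In the sitcom punchline recognition task, each sentence's label indicates whether that sentence is a punchline, but previous work has usually recognized punchlines only by modeling contextual semantic relations and has made insufficient use of the labels. To make full use of the information in the label sequence, this paper proposes a new recognition method: a word-level and sentence-level multi-task learning model combined with a conditional random field (CRF). The model makes two improvements. First, it treats the transition relation between adjacent labels in the label sequence as a manifestation of incongruity in humor theory and uses a CRF to learn these transitions. Second, because learning the transitions between adjacent labels and learning contextual semantic relations both capture the incongruity between setup and punchline, the two are correlated; to exploit this correlation and improve punchline recognition, the model adopts multi-task learning to learn simultaneously the meaning of each sentence, the word meanings of all the tokens that make up each sentence, the word-level label transitions, and the sentence-level label transitions. Experiments on the English dataset of the CCL2020 "Xiaoniu Cup" (小牛杯) humor computation evaluation task on sitcom punchline recognition show that the proposed method improves on the best existing method by 3.2% and achieves the best results on sitcom punchline recognition; ablation experiments confirm the effectiveness of both improvements.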

Label-Enhanced Hierarchical Contextualized Representation for Sequential Metaphor Identification

no code implementations EMNLP 2021 Shuqun Li, Liang Yang, Weidong He, Shiqi Zhang, Jingjie Zeng, Hongfei Lin

At the sentence level, we leverage the metaphor information of words other than the target word in the sentence to strengthen the reasoning ability of our model via a novel label-enhanced contextualized representation.

Language Modelling, Sentence

基于预训练语言模型的案件要素识别方法(A Method for Case Factor Recognition Based on Pre-trained Language Models)

no code implementations CCL 2020 Haishun Liu, Lei Wang, Yanguang Chen, Shuchen Zhang, Yuanyuan Sun, Hongfei Lin

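Case factor recognition refers to automatically extracting the important factual statements from a case description and classifying them according to a factor scheme designed by domain experts; it is an important research topic in smart justice. Text encoders based on traditional neural networks struggle to extract deep features, and threshold-based multi-label classification struggles to capture dependencies between labels, so this paper proposes a multi-label text classification model based on pre-trained language models. The model uses as its encoder a language model whose features are fused with a layer-attentive strategy, and as its decoder an LSTM-based sequence generation model. In experiments on the CAIL2019 dataset, the method improves F1 by up to 7.6% over algorithms based on recurrent neural networks and by about 3.2% over the base language model (BERT) under the same hyperparameter settings.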

基于多粒度语义交互理解网络的幽默等级识别(A Multi-Granularity Semantic Interaction Understanding Network for Humor Level Recognition)

no code implementations CCL 2020 Jinhui Zhang, Shaowu Zhang, Xiaochao Fan, Liang Yang, Hongfei Lin

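Humor plays an important role in everyday communication. With the rapid development of artificial intelligence, humor level recognition has become one of the active research problems in natural language processing. Existing work on humor level recognition tends to treat a humorous text as a single unit and ignores the semantic relations within it. This paper treats humor level recognition as a natural language inference task: it divides a humorous text into a "setup" and a "punchline", models the semantics of each part and the semantic relation between them, and proposes a multi-granularity semantic interaction understanding network that captures semantic associations and interactions in humorous text at two granularities, word and clause. In experiments on a public Reddit humor dataset, the model improves accuracy by 1.3% over the previous best result. The experiments show that introducing information about the semantic relations within humor improves humor recognition performance, and that the proposed model captures these relations well.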

Enhancing Textual Personality Detection toward Social Media: Integrating Long-term and Short-term Perspectives

no code implementations 23 Apr 2024 Haohao Zhu, Xiaokun Zhang, Junyu Lu, Youlin Wu, Zewen Bai, Changrong Min, Liang Yang, Bo Xu, Dongyu Zhang, Hongfei Lin

This limitation hinders a comprehensive understanding of individuals' personalities, as both stable traits and dynamic states are vital.

Disentangling ID and Modality Effects for Session-based Recommendation

1 code implementation 19 Apr 2024 Xiaokun Zhang, Bo Xu, Zhaochun Ren, Xiaochen Wang, Hongfei Lin, Fenglong Ma

At the item level, we introduce a co-occurrence representation schema to explicitly incorporate co-occurrence patterns into ID representations.

counterfactual, Counterfactual Inference, +2

FineRec: Exploring Fine-grained Sequential Recommendation

1 code implementation 19 Apr 2024 Xiaokun Zhang, Bo Xu, Youlin Wu, Yuan Zhong, Hongfei Lin, Fenglong Ma

Sequential recommendation is dedicated to offering items of interest to users based on their historical behaviors.

Attribute, Language Modelling, +3

Side Information-Driven Session-based Recommendation: A Survey

no code implementations 27 Feb 2024 Xiaokun Zhang, Bo Xu, Chenliang Li, Yao Zhou, Liangyue Li, Hongfei Lin

Emerging efforts incorporate various kinds of side information into their methods for enhancing task performance.

Session-Based Recommendations

Bi-Preference Learning Heterogeneous Hypergraph Networks for Session-based Recommendation

1 code implementation 2 Nov 2023 Xiaokun Zhang, Bo Xu, Fenglong Ma, Chenliang Li, Yuan Lin, Hongfei Lin

Secondly, price preference and interest preference are interdependent and collectively determine user choice, necessitating that we jointly consider both price and interest preference for intent modeling.

Multi-Task Learning, Session-Based Recommendations

A Transformer-Based Model With Self-Distillation for Multimodal Emotion Recognition in Conversations

1 code implementation 31 Oct 2023 Hui Ma, Jian Wang, Hongfei Lin, Bo Zhang, Yijia Zhang, Bo Xu

Emotion recognition in conversations (ERC), the task of recognizing the emotion of each utterance in a conversation, is crucial for building empathetic machines.

Emotion Recognition in Conversation, Multimodal Emotion Recognition

Beyond Co-occurrence: Multi-modal Session-based Recommendation

1 code implementation 29 Sep 2023 Xiaokun Zhang, Bo Xu, Fenglong Ma, Chenliang Li, Liang Yang, Hongfei Lin

(2) How to fuse this heterogeneous descriptive information to comprehensively infer user interests?

Contrastive Learning, Descriptive, +2

ZRIGF: An Innovative Multimodal Framework for Zero-Resource Image-Grounded Dialogue Generation

1 code implementation 1 Aug 2023 Bo Zhang, Jian Wang, Hui Ma, Bo Xu, Hongfei Lin

To overcome this challenge, we propose an innovative multimodal framework, called ZRIGF, which assimilates image-grounded information for dialogue generation in zero-resource situations.

Dialogue Generation, Response Generation

Hate Speech Detection via Dual Contrastive Learning

no code implementations 10 Jul 2023 Junyu Lu, Hongfei Lin, Xiaokun Zhang, Zhaoqing Li, Tongyue Zhang, Linlin Zong, Fenglong Ma, Bo Xu

Our framework jointly optimizes the self-supervised and the supervised contrastive learning loss for capturing span-level information beyond the token-level emotional semantics used in existing models, particularly detecting speech containing abusive and insulting words.

Contrastive Learning, Hate Speech Detection

Facilitating Fine-grained Detection of Chinese Toxic Language: Hierarchical Taxonomy, Resources, and Benchmarks

1 code implementation 8 May 2023 Junyu Lu, Bo Xu, Xiaokun Zhang, Changrong Min, Liang Yang, Hongfei Lin

In addition, it is crucial to introduce lexical knowledge to detect the toxicity of posts, which has been a challenge for researchers.

Hate Speech Detection

Exploiting Pairwise Mutual Information for Knowledge-Grounded Dialogue

1 code implementation IEEE/ACM Transactions on Audio, Speech, and Language Processing 2022 Bo Zhang, Jian Wang, Hongfei Lin, Hui Ma, Bo Xu

Correlation integration is designed to fully exploit the pairwise mutual information among dialogue context, knowledge, and responses, while overall integration adopts an integration gate to capture global information.

Dialogue Generation

MultiMET: A Multimodal Dataset for Metaphor Understanding

no code implementations ACL 2021 Dongyu Zhang, Minghao Zhang, Heting Zhang, Liang Yang, Hongfei Lin

Metaphor involves not only a linguistic phenomenon, but also a cognitive phenomenon structuring human thought, which makes understanding it challenging.

Chemical-protein Interaction Extraction via Gaussian Probability Distribution and External Biomedical Knowledge

1 code implementation 21 Nov 2019 Cong Sun, Zhihao Yang, Leilei Su, Lei Wang, Yin Zhang, Hongfei Lin, Jian Wang

Furthermore, the Gaussian probability distribution can effectively improve the extraction performance of sentences with overlapping relations in biomedical relation extraction tasks.

Chemical-Protein Interaction Extraction, Drug Discovery, +2

Telling the Whole Story: A Manually Annotated Chinese Dataset for the Analysis of Humor in Jokes

no code implementations IJCNLP 2019 Dongyu Zhang, Heting Zhang, Xikai Liu, Hongfei Lin, Feng Xia

To the best of our knowledge, we are the first to approach humor annotation for exploring the underlying mechanism of the use of humor, which may contribute to a significantly deeper analysis of humor.

An attention-based BiLSTM-CRF approach to document-level chemical named entity recognition

1 code implementation Bioinformatics 2019 Ling Luo, Zhihao Yang, Pei Yang, Yin Zhang, Lei Wang, Hongfei Lin, Jian Wang

Motivation: In biomedical research, chemicals are an important class of entities, and chemical named entity recognition (NER) is an important task in the field of biomedical information extraction.

Feature Engineering, named-entity-recognition, +3

WECA: A WordNet-Encoded Collocation-Attention Network for Homographic Pun Recognition

no code implementations EMNLP 2018 Yufeng Diao, Hongfei Lin, Di Wu, Liang Yang, Kan Xu, Zhihao Yang, Jian Wang, Shaowu Zhang, Bo Xu, Dongyu Zhang

In this work, we first use WordNet to understand and expand word embeddings for resolving the polysemy of homographic puns, and then propose a WordNet-Encoded Collocation-Attention network model (WECA), which is combined with context weights for recognizing the puns.

Construction of a Chinese Corpus for the Analysis of the Emotionality of Metaphorical Expressions

no code implementations ACL 2018 Dongyu Zhang, Hongfei Lin, Liang Yang, Shaowu Zhang, Bo Xu

However, there is little research on the construction of metaphor corpora annotated with emotion for the analysis of emotionality of metaphorical expressions.

Emotion Recognition
