no code implementations • LREC 2022 • Thomas Green, Diana Maynard, Chenghua Lin
We present the development of a benchmark suite consisting of an annotation schema, training corpus and baseline model for Entity Recognition (ER) in job descriptions, published under a Creative Commons license.
1 code implementation • COLING 2022 • Yucheng Li, Chenghua Lin, Frank Guerin
The metaphor identification module is able to perform a self-training procedure, which discovers novel metaphors from a large-scale unlabeled corpus for NM generation.
no code implementations • 25 May 2025 • Zirui Li, Siwei Wu, Xingyu Wang, Yi Zhou, Yizhi Li, Chenghua Lin
The rapid advancement of unsupervised representation learning and large-scale pre-trained vision-language models has significantly improved cross-modal retrieval tasks.
no code implementations • 29 Apr 2025 • Hanhua Hong, Chenghao Xiao, Yang Wang, Yiqi Liu, Wenge Rong, Chenghua Lin
Evaluating natural language generation (NLG) systems is challenging due to the diversity of valid outputs.
no code implementations • 2 Apr 2025 • Xiao Wang, Daniil Larionov, Siwei Wu, Yiqi Liu, Steffen Eger, Nafise Sadat Moosavi, Chenghua Lin
In this work, we introduce ContrastScore, a contrastive evaluation metric designed to enable higher-quality, less biased, and more efficient assessment of generated text.
no code implementations • 20 Mar 2025 • Emiel van Miltenburg, Chenghua Lin
The term Natural Language Generation (NLG), in its broadest definition, refers to the study of systems that verbalize some form of information through natural language.
1 code implementation • 11 Mar 2025 • Ruibin Yuan, Hanfeng Lin, Shuyue Guo, Ge Zhang, Jiahao Pan, Yongyi Zang, Haohe Liu, Yiming Liang, Wenye Ma, Xingjian Du, Xinrun Du, Zhen Ye, Tianyu Zheng, Yinghao Ma, Minghao Liu, Zeyue Tian, Ziya Zhou, Liumeng Xue, Xingwei Qu, Yizhi Li, Shangda Wu, Tianhao Shen, Ziyang Ma, Jun Zhan, Chunhui Wang, Yatian Wang, Xiaowei Chi, Xinyue Zhang, Zhenzhu Yang, Xiangzhou Wang, Shansong Liu, Lingrui Mei, Peng Li, Junjie Wang, Jianwei Yu, Guojian Pang, Xu Li, ZiHao Wang, Xiaohuan Zhou, Lijun Yu, Emmanouil Benetos, Yong Chen, Chenghua Lin, Xie Chen, Gus Xia, Zhaoxiang Zhang, Chao Zhang, Wenhu Chen, Xinyu Zhou, Xipeng Qiu, Roger Dannenberg, Jiaheng Liu, Jian Yang, Wenhao Huang, Wei Xue, Xu Tan, Yike Guo
We tackle the task of long-form music generation--particularly the challenging \textbf{lyrics-to-song} problem--by introducing YuE, a family of open foundation models based on the LLaMA2 architecture.
no code implementations • 9 Mar 2025 • Yizheng Sun, Hao Li, Chang Xu, Hongpeng Zhou, Chenghua Lin, Riza Batista-Navarro, Jingyuan Sun
Our findings are stark: despite minimal aggregate performance drops, accelerated models changed original answers up to 20% of the time.
1 code implementation • 7 Feb 2025 • Steffen Eger, Yong Cao, Jennifer D'Souza, Andreas Geiger, Christian Greisinger, Stephanie Gross, Yufang Hou, Brigitte Krenn, Anne Lauscher, Yizhi Li, Chenghua Lin, Nafise Sadat Moosavi, Wei Zhao, Tristan Miller
With the advent of large multimodal language models, science is now at a threshold of an AI-based technological transformation.
no code implementations • 23 Jan 2025 • Yizheng Sun, Yanze Xin, Hao Li, Jingyuan Sun, Chenghua Lin, Riza Batista-Navarro
Multi-modal Large Language Models (MLLMs) have achieved remarkable success by integrating visual and textual modalities.
1 code implementation • 22 Jan 2025 • Bohao Yang, Yingji Zhang, Dong Liu, André Freitas, Chenghua Lin
While multimodal large language models (MLLMs) enable direct visual processing, they face limitations in handling scientific tables due to fixed input image resolutions and insufficient numerical reasoning capabilities.
no code implementations • 9 Jan 2025 • Tomas Goldsack, Carolina Scarton, Chenghua Lin
In this work, we explore the application of Large Language Models to zero-shot Lay Summarisation.
1 code implementation • 5 Jan 2025 • Yang Wang, Chenghua Lin
Recent advancements in natural language processing have highlighted the vulnerability of deep learning models to adversarial attacks.
1 code implementation • 30 Dec 2024 • Jianfei Zhang, Jun Bai, Bei Li, Yanmeng Wang, Rumei Li, Chenghua Lin, Wenge Rong
Aligning Large Language Models (LLMs) with general human preferences has been proved crucial in improving the interaction quality between LLMs and human.
no code implementations • 10 Dec 2024 • Yuyang Cheng, Xingwei Qu, Tomas Goldsack, Chenghua Lin, Chung-Chi Chen
Thomas C. Schelling, awarded the 2005 Nobel Memorial Prize in Economic Sciences, pointed out that ``individuals decisions (micromotives), while often personal and localized, can lead to societal outcomes (macrobehavior) that are far more complex and different from what the individuals intended.''
1 code implementation • 1 Dec 2024 • Qianren Mao, Weifeng Jiang, Junnan Liu, Chenghua Lin, Qian Li, Xianqing Wen, JianXin Li, Jinhu Lu
The semi-supervised learning (SSL) strategy in lightweight models requires reducing annotated samples and facilitating cost-effective inference.
no code implementations • 19 Oct 2024 • Yanan Ma, Chenghao Xiao, Chenhan Yuan, Sabine N van der Veer, Lamiece Hassan, Chenghua Lin, Goran Nenadic
Experiments on news benchmark datasets and one Twitter dataset demonstrate the method's superiority in generating coherent, diverse topics, and handling noisy data, outperforming strong baselines.
1 code implementation • 17 Oct 2024 • Chenhao Zhang, Xi Feng, Yuelin Bai, Xinrun Du, Jinchang Hou, Kaixin Deng, Guangzeng Han, Qinrui Li, Bingli Wang, Jiaheng Liu, Xingwei Qu, Yifei Zhang, Qixuan Zhao, Yiming Liang, Ziqiang Liu, Feiteng Fang, Min Yang, Wenhao Huang, Chenghua Lin, Ge Zhang, Shiwen Ni
To fill the gap, we introduce the **C**hinese **I**mage **I**mplication understanding **Bench**mark, **CII-Bench**, which aims to assess the higher-order perception and understanding capabilities of MLLMs for Chinese images.
1 code implementation • 17 Oct 2024 • Siwei Wu, Zhongyuan Peng, Xinrun Du, Tuney Zheng, Minghao Liu, Jialong Wu, Jiachen Ma, Yizhi Li, Jian Yang, Wangchunshu Zhou, Qunshu Lin, Junbo Zhao, Zhaoxiang Zhang, Wenhao Huang, Ge Zhang, Chenghua Lin, J. H. Liu
In our work, to investigate the reasoning patterns of o1, we compare o1 with existing Test-time Compute methods (BoN, Step-wise BoN, Agent Workflow, and Self-Refine) by using OpenAI's GPT-4o as a backbone on general reasoning benchmarks in three domains (i. e., math, coding, commonsense reasoning).
no code implementations • 10 Oct 2024 • William Thorne, Ambrose Robinson, Bohua Peng, Chenghua Lin, Diana Maynard
This research contributes: (1) A methodology for increasing question difficulty using PPO and synthetic data; (2) Empirical evidence of the method's effectiveness, including human evaluation; (3) An in-depth error analysis and study of emergent phenomena; and (4) An open-source codebase and set of three llama-2-chat adapters for reproducibility and adaptation.
no code implementations • 7 Oct 2024 • Joseph James, Chenghao Xiao, Yucheng Li, Chenghua Lin
Rigour is crucial for scientific research as it ensures the reproducibility and validity of results and findings.
no code implementations • 1 Oct 2024 • Tomas Goldsack, Yang Wang, Chenghua Lin, Chung-Chi Chen
This paper explores the use of Large Language Models (LLMs) in the generation and evaluation of analytical reports derived from Earnings Calls (ECs).
1 code implementation • 24 Sep 2024 • Jun Bai, Zhuofan Chen, Zhenzi Li, Hanhua Hong, Jianfei Zhang, Chen Li, Chenghua Lin, Wenge Rong
As a promising alternative to human intuition and brute-force fine-tuning, Transferability Estimation (TE) has emerged as an effective approach to model selection.
1 code implementation • 23 Sep 2024 • Tyler Loakman, Yucheng Li, Chenghua Lin
To investigate this, we analyse the ability of VLMs and LLMs to demonstrate sound symbolism (i. e., to recognise a non-arbitrary link between sounds and concepts) as well as their ability to "hear" via the interplay of the language and vision modules of open and closed-source multimodal models.
1 code implementation • 23 Sep 2024 • Yizhi Li, Ge Zhang, Yinghao Ma, Ruibin Yuan, Kang Zhu, Hangyu Guo, Yiming Liang, Jiaheng Liu, Zekun Wang, Jian Yang, Siwei Wu, Xingwei Qu, Jinjie Shi, Xinyue Zhang, Zhenzhu Yang, Xiangzhou Wang, Zhaoxiang Zhang, Zachary Liu, Emmanouil Benetos, Wenhao Huang, Chenghua Lin
Recent advancements in multimodal large language models (MLLMs) have focused on integrating multiple modalities, yet their ability to simultaneously process and reason across different inputs remains underexplored.
2 code implementations • 10 Sep 2024 • King Zhu, Qianbo Zang, Shian Jia, Siwei Wu, Feiteng Fang, Yizhi Li, Shawn Gavin, Tuney Zheng, Jiawei Guo, Bo Li, HaoNing Wu, Xingwei Qu, Jian Yang, Zachary Liu, Xiang Yue, J. H. Liu, Chenghua Lin, Min Yang, Shiwen Ni, Wenhao Huang, Ge Zhang
However, many of these benchmarks include overly simple or uninformative samples, complicating the effective distinction of different MLLMs' performance.
1 code implementation • 26 Aug 2024 • Yinghao Ma, Anders Øland, Anton Ragni, Bleiz MacSen Del Sette, Charalampos Saitis, Chris Donahue, Chenghua Lin, Christos Plachouras, Emmanouil Benetos, Elona Shatri, Fabio Morreale, Ge Zhang, György Fazekas, Gus Xia, huan zhang, Ilaria Manco, Jiawen Huang, Julien Guinot, Liwei Lin, Luca Marinelli, Max W. Y. Lam, Megha Sharma, Qiuqiang Kong, Roger B. Dannenberg, Ruibin Yuan, Shangda Wu, Shih-Lun Wu, Shuqi Dai, Shun Lei, Shiyin Kang, Simon Dixon, Wenhu Chen, Wenhao Huang, Xingjian Du, Xingwei Qu, Xu Tan, Yizhi Li, Zeyue Tian, Zhiyong Wu, Zhizheng Wu, Ziyang Ma, Ziyu Wang
In recent years, foundation models (FMs) such as large language models (LLMs) and latent diffusion models (LDMs) have profoundly impacted diverse sectors, including music.
no code implementations • 16 Aug 2024 • Tomas Goldsack, Carolina Scarton, Matthew Shardlow, Chenghua Lin
This paper presents the setup and results of the second edition of the BioLaySumm shared task on the Lay Summarisation of Biomedical Research Articles, hosted at the BioNLP Workshop at ACL 2024.
1 code implementation • 15 Aug 2024 • Yiming Liang, Ge Zhang, Xingwei Qu, Tianyu Zheng, Jiawei Guo, Xinrun Du, Zhenzhu Yang, Jiaheng Liu, Chenghua Lin, Lei Ma, Wenhao Huang, Jiajun Zhang
Large Language Models (LLMs) have achieved significant advancements, however, the common learning paradigm treats LLMs as passive information repositories, neglecting their potential for active learning and alignment.
no code implementations • 8 Aug 2024 • Xingwei Qu, Ge Zhang, Siwei Wu, Yizhi Li, Chenghua Lin
The goal of this shared task is to generate Chinese metaphors using machine learning techniques and effectively identifying basic components of metaphorical sentences.
1 code implementation • 24 Jul 2024 • Siwei Wu, Kang Zhu, Yu Bai, Yiming Liang, Yizhi Li, HaoNing Wu, J. H. Liu, Ruibo Liu, Xingwei Qu, Xuxin Cheng, Ge Zhang, Wenhao Huang, Chenghua Lin
Moreover, we explored the ability of LVLMs to perceive image sequences within the context of our multi-image association task.
no code implementations • 29 Jun 2024 • Kunquan Deng, Zeyu Huang, Chen Li, Chenghua Lin, Min Gao, Wenge Rong
In editing tasks, PFME further enhances the FActScore of FActScore-Alpaca13B and FActScore-ChatGPT datasets, increasing by 16. 2pp and 4. 6pp, respectively.
no code implementations • 28 Jun 2024 • Chen Tang, Bohao Yang, Kun Zhao, Bo Lv, Chenghao Xiao, Frank Guerin, Chenghua Lin
Named entity recognition (NER) stands as a fundamental and pivotal task within the realm of Natural Language Processing.
1 code implementation • 25 Jun 2024 • Kun Zhao, Chenghao Xiao, Sixing Yan, Haoteng Tang, William K. Cheung, Noura Al Moubayed, Liang Zhan, Chenghua Lin
We show that training on the layman's terms dataset encourages models to focus on the semantics of the reports, as opposed to overfitting to learning the report templates.
1 code implementation • 25 Jun 2024 • Bohao Yang, Dong Liu, Chenghao Xiao, Kun Zhao, Chen Tang, Chao Li, Lin Yuan, Guang Yang, Lanxiao Huang, Chenghua Lin
Large Language Models (LLMs) demonstrate remarkable ability to comprehend instructions and generate human-like text, enabling sophisticated agent simulation beyond basic behavior replication.
no code implementations • 19 Jun 2024 • Shun Wang, Ge Zhang, Han Wu, Tyler Loakman, Wenhao Huang, Chenghua Lin
Machine Translation (MT) has developed rapidly since the release of Large Language Models and current MT evaluation is performed through comparison with reference human translations or by predicting quality scores from human-labeled data.
no code implementations • 9 Jun 2024 • Zhihao Zhang, Tomas Goldsack, Carolina Scarton, Chenghua Lin
Lay summarisation aims to produce summaries of scientific articles that are comprehensible to non-expert audiences.
1 code implementation • 29 May 2024 • Ge Zhang, Scott Qu, Jiaheng Liu, Chenchen Zhang, Chenghua Lin, Chou Leuang Yu, Danny Pan, Esther Cheng, Jie Liu, Qunshu Lin, Raven Yuan, Tuney Zheng, Wei Pang, Xinrun Du, Yiming Liang, Yinghao Ma, Yizhi Li, Ziyang Ma, Bill Lin, Emmanouil Benetos, Huan Yang, Junting Zhou, Kaijing Ma, Minghao Liu, Morry Niu, Noah Wang, Quehry Que, Ruibo Liu, Sine Liu, Shawn Guo, Soren Gao, Wangchunshu Zhou, Xinyue Zhang, Yizhi Zhou, YuBo Wang, Yuelin Bai, Yuhan Zhang, Yuxiang Zhang, Zenith Wang, Zhenzhu Yang, Zijian Zhao, Jiajun Zhang, Wanli Ouyang, Wenhao Huang, Wenhu Chen
To improve the transparency of LLMs, the research community has formed to open-source truly open LLMs (e. g., Pythia, Amber, OLMo), where more details (e. g., pre-training corpus and training code) are being provided.
1 code implementation • 24 May 2024 • Kun Zhao, Bohao Yang, Chen Tang, Chenghua Lin, Liang Zhan
Our approach introduces several techniques: (1) Contrastive learning to differentiate between robust and non-robust response embeddings; (2) A novel metric for semantic sensitivity that combines embedding cosine distances with similarity learned through neural networks, and (3) a strategy for incorporating the evaluation results from both the SLM and LLMs.
1 code implementation • 28 Apr 2024 • Qixin Deng, Qikai Yang, Ruibin Yuan, Yipeng Huang, Yi Wang, Xubo Liu, Zeyue Tian, Jiahao Pan, Ge Zhang, Hanfeng Lin, Yizhi Li, Yinghao Ma, Jie Fu, Chenghua Lin, Emmanouil Benetos, Wenwu Wang, Guangyu Xia, Wei Xue, Yike Guo
Music composition represents the creative side of humanity, and itself is a complex task that requires abilities to understand and generate information with long dependency and harmony constraints.
no code implementations • 26 Apr 2024 • Tyler Loakman, Chenghua Lin
This paper presents a partial reproduction of Generating Fact Checking Explanations by Anatanasova et al (2020) as part of the ReproHum element of the ReproNLP shared task to reproduce the findings of NLP research regarding human evaluation.
no code implementations • 9 Apr 2024 • Xingwei Qu, Yuelin Bai, Yinghao Ma, Ziya Zhou, Ka Man Lo, Jiaheng Liu, Ruibin Yuan, Lejun Min, Xueling Liu, Tianyu Zhang, Xinrun Du, Shuyue Guo, Yiming Liang, Yizhi Li, Shangda Wu, Junting Zhou, Tianyu Zheng, Ziyang Ma, Fengze Han, Wei Xue, Gus Xia, Emmanouil Benetos, Xiang Yue, Chenghua Lin, Xu Tan, Stephen W. Huang, Jie Fu, Ge Zhang
In this paper, we explore the application of Large Language Models (LLMs) to the pre-training of music.
1 code implementation • 1 Apr 2024 • Bohao Yang, Kun Zhao, Chen Tang, Dong Liu, Liang Zhan, Chenghua Lin
Trainable evaluation metrics, typically trained with true positive and randomly selected negative responses, tend to assign higher scores to responses that share greater content similarity with a given context.
no code implementations • 26 Mar 2024 • Yuelin Bai, Xinrun Du, Yiming Liang, Yonggang Jin, Junting Zhou, Ziqiang Liu, Feiteng Fang, Mingshan Chang, Tianyu Zheng, Xincheng Zhang, Nuo Ma, Zekun Wang, Ruibin Yuan, Haihong Wu, Hongquan Lin, Wenhao Huang, Jiajun Zhang, Chenghua Lin, Jie Fu, Min Yang, Shiwen Ni, Ge Zhang
Remarkable progress on English instruction tuning has facilitated the efficacy and reliability of large language models (LLMs).
no code implementations • 20 Mar 2024 • Tyler Loakman, Chen Tang, Chenghua Lin
Previous work in phonologically and phonetically grounded language generation has mainly focused on domains such as puns and poetry.
no code implementations • 7 Mar 2024 • Xingwei Qu, Yiming Liang, Yucheng Wang, Tianyu Zheng, Tommy Yue, Xingyuan Bu, Lei Ma, Stephen W. Huang, Jiajun Zhang, Yinan Shi, Chenghua Lin, Jie Fu, Ge Zhang
Our framework employs a dual 3B model approach, with each model assigned a distinct role: one focuses on task definition extraction, while the other handles learning from demonstrations.
1 code implementation • 25 Feb 2024 • Ruibin Yuan, Hanfeng Lin, Yi Wang, Zeyue Tian, Shangda Wu, Tianhao Shen, Ge Zhang, Yuhang Wu, Cong Liu, Ziya Zhou, Ziyang Ma, Liumeng Xue, Ziyu Wang, Qin Liu, Tianyu Zheng, Yizhi Li, Yinghao Ma, Yiming Liang, Xiaowei Chi, Ruibo Liu, Zili Wang, Pengfei Li, Jingcheng Wu, Chenghua Lin, Qifeng Liu, Tao Jiang, Wenhao Huang, Wenhu Chen, Emmanouil Benetos, Jie Fu, Gus Xia, Roger Dannenberg, Wei Xue, Shiyin Kang, Yike Guo
It is based on continual pre-training and finetuning LLaMA2 on a text-compatible music representation, ABC notation, and the music is treated as a second language.
no code implementations • 20 Feb 2024 • Yizhi Li, Ge Zhang, Xingwei Qu, Jiali Li, Zhaoqun Li, Zekun Wang, Hao Li, Ruibin Yuan, Yinghao Ma, Kai Zhang, Wangchunshu Zhou, Yiming Liang, Lei Zhang, Lei Ma, Jiajun Zhang, Zuowen Li, Stephen W. Huang, Chenghua Lin, Jie Fu
The advancement of large language models (LLMs) has enhanced the ability to generalize across a wide range of unseen natural language processing (NLP) tasks through instruction-following.
2 code implementations • 20 Feb 2024 • Yujie Shao, Xinrong Yao, Xingwei Qu, Chenghua Lin, Shi Wang, Stephen W. Huang, Ge Zhang, Jie Fu
These models are able to generate creative and fluent metaphor sentences more frequently induced by selected samples from our dataset, demonstrating the value of our corpus for Chinese metaphor research.
2 code implementations • 13 Feb 2024 • Chenghao Xiao, Zhuoxu Huang, Danlu Chen, G Thomas Hudson, Yizhi Li, Haoran Duan, Chenghua Lin, Jie Fu, Jungong Han, Noura Al Moubayed
To our knowledge, this is the first representation learning method devoid of traditional language models for understanding sentence and document semantics, marking a stride closer to human-like textual comprehension.
1 code implementation • 1 Feb 2024 • Yucheng Li, Yunhao Guo, Frank Guerin, Chenghua Lin
We measure: 1) the compression performance on the testing period as a measure of generalization on unseen data; and 2) the performance gap between the training and testing period as a measure of robustness.
no code implementations • 29 Jan 2024 • Yucheng Li, Frank Guerin, Chenghua Lin
In this paper, we test various NLP models on the VUA metaphor dataset and quantify to what extent metaphors affect models' performance on various downstream tasks.
1 code implementation • 24 Jan 2024 • Siwei Wu, Yizhi Li, Kang Zhu, Ge Zhang, Yiming Liang, Kaijing Ma, Chenghao Xiao, Haoran Zhang, Bohao Yang, Wenhu Chen, Wenhao Huang, Noura Al Moubayed, Jie Fu, Chenghua Lin
We further annotate the image-text pairs with two-level subset-subcategory hierarchy annotations to facilitate a more comprehensive evaluation of the baselines.
1 code implementation • 22 Jan 2024 • Ge Zhang, Xinrun Du, Bei Chen, Yiming Liang, Tongxu Luo, Tianyu Zheng, Kang Zhu, Yuyang Cheng, Chunpu Xu, Shuyue Guo, Haoran Zhang, Xingwei Qu, Junjie Wang, Ruibin Yuan, Yizhi Li, Zekun Wang, Yudong Liu, Yu-Hsuan Tsai, Fengji Zhang, Chenghua Lin, Wenhao Huang, Jie Fu
We introduce CMMMU, a new Chinese Massive Multi-discipline Multimodal Understanding benchmark designed to evaluate LMMs on tasks demanding college-level subject knowledge and deliberate reasoning in a Chinese context.
1 code implementation • 12 Jan 2024 • Tianyu Zheng, Shuyue Guo, Xingwei Qu, Jiawei Guo, Xinrun Du, Qi Jia, Chenghua Lin, Wenhao Huang, Jie Fu, Ge Zhang
In this paper, we introduce Kun, a novel approach for creating high-quality instruction-tuning datasets for large language models (LLMs) without relying on manual annotations.
no code implementations • 28 Dec 2023 • Zhihao Zhang, Yuan Zuo, Chenghua Lin, Junjie Wu
Finally, we merge the quality phrases from both the Annotator and Generator as the final predictions, considering their complementary nature and distinct characteristics.
1 code implementation • 19 Dec 2023 • Yucheng Li, Frank Guerin, Chenghua Lin
LatestEval avoids data contamination by only using texts published within a recent time window, ensuring no overlap with the training corpora of pre-trained language models.
1 code implementation • 8 Dec 2023 • Jun Bai, Xiaofeng Zhang, Chen Li, Hanhua Hong, Xi Xu, Chenghua Lin, Wenge Rong
However, there is a lack of a comprehensive comparison between these estimation methods yet.
1 code implementation • 19 Nov 2023 • Chen Tang, Tyler Loakman, Chenghua Lin
These results underscore the effectiveness of our model in leveraging context and event features to improve the quality of generated narratives.
no code implementations • 16 Nov 2023 • Yiqi Liu, Nafise Sadat Moosavi, Chenghua Lin
Automatic evaluation of generated textual content presents an ongoing challenge within the field of NLP.
no code implementations • 9 Nov 2023 • Tyler Loakman, Aaron Maladry, Chenghua Lin
Human evaluation is often considered to be the gold standard method of evaluating a Natural Language Generation system.
no code implementations • 31 Oct 2023 • Chen Tang, Frank Guerin, Chenghua Lin
This paper presents a tool called ``ACL Anthology Helper''.
1 code implementation • 26 Oct 2023 • Yucheng Li, Frank Guerin, Chenghua Lin
We also introduce an open-source pipeline that enables the community to perform contamination analysis on customised data and models.
1 code implementation • 24 Oct 2023 • Tomas Goldsack, Zhihao Zhang, Chen Tang, Carolina Scarton, Chenghua Lin
Previous approaches for automatic lay summarisation are exclusively reliant on the source article that, given it is written for a technical audience (e. g., researchers), is unlikely to explicitly define all technical concepts or state all of the background information that is relevant for a lay audience.
1 code implementation • 24 Oct 2023 • Chenghao Xiao, Yizhi Li, G Thomas Hudson, Chenghua Lin, Noura Al Moubayed
In recent years, contrastive learning (CL) has been extensively utilized to recover sentence and document-level encoding capability from pre-trained language models.
1 code implementation • 24 Oct 2023 • Chen Tang, Shun Wang, Tomas Goldsack, Chenghua Lin
Abstracts derived from biomedical literature possess distinct domain-specific characteristics, including specialised writing styles and biomedical terminologies, which necessitate a deep understanding of the related literature.
2 code implementations • 9 Oct 2023 • Yucheng Li, Bo Dong, Chenghua Lin, Frank Guerin
This paper proposes a method called Selective Context that enhances the inference efficiency of LLMs by identifying and pruning redundancy in the input context to make the input more compact.
no code implementations • 29 Sep 2023 • Tomas Goldsack, Zheheng Luo, Qianqian Xie, Carolina Scarton, Matthew Shardlow, Sophia Ananiadou, Chenghua Lin
This paper presents the results of the shared task on Lay Summarisation of Biomedical Research Articles (BioLaySumm), hosted at the BioNLP Workshop at ACL 2023.
1 code implementation • 22 Sep 2023 • Bohao Yang, Chen Tang, Kun Zhao, Chenghao Xiao, Chenghua Lin
Large Language Models (LLMs) have demonstrated remarkable performance across a wide range of natural language processing tasks.
no code implementations • 21 Sep 2023 • Yang Wang, Qibin Liang, Chenghao Xiao, Yizhi Li, Noura Al Moubayed, Chenghua Lin
Audio classification plays a crucial role in speech and sound processing tasks with a wide range of applications.
1 code implementation • 19 Sep 2023 • Bohao Yang, Chen Tang, Chenghua Lin
In this paper, We propose a novel framework that models dialogues between patients and healthcare professionals using AMR graphs, where the neural networks incorporate textual and graphical knowledge with a dual attention mechanism.
no code implementations • 11 Jul 2023 • Yinghao Ma, Ruibin Yuan, Yizhi Li, Ge Zhang, Xingran Chen, Hanzhi Yin, Chenghua Lin, Emmanouil Benetos, Anton Ragni, Norbert Gyenge, Ruibo Liu, Gus Xia, Roger Dannenberg, Yike Guo, Jie Fu
Our findings suggest that training with music data can generally improve performance on MIR tasks, even when models are trained using paradigms designed for speech.
1 code implementation • 29 Jun 2023 • Le Zhuo, Ruibin Yuan, Jiahao Pan, Yinghao Ma, Yizhi Li, Ge Zhang, Si Liu, Roger Dannenberg, Jie Fu, Chenghua Lin, Emmanouil Benetos, Wei Xue, Yike Guo
We introduce LyricWhiz, a robust, multilingual, and zero-shot automatic lyrics transcription method achieving state-of-the-art performance on various lyrics transcription datasets, even in challenging genres such as rock and metal.
1 code implementation • 28 Jun 2023 • Chen Tang, Hongbo Zhang, Tyler Loakman, Chenghua Lin, Frank Guerin
Further analysis also shows that our representation learning framework can fill the semantic gap by coagulating representations of both text and graph knowledge.
1 code implementation • NeurIPS 2023 • Ruibin Yuan, Yinghao Ma, Yizhi Li, Ge Zhang, Xingran Chen, Hanzhi Yin, Le Zhuo, Yiqi Liu, Jiawen Huang, Zeyue Tian, Binyue Deng, Ningzhi Wang, Chenghua Lin, Emmanouil Benetos, Anton Ragni, Norbert Gyenge, Roger Dannenberg, Wenhu Chen, Gus Xia, Wei Xue, Si Liu, Shi Wang, Ruibo Liu, Yike Guo, Jie Fu
This is evident in the limited work on deep music representations, the scarcity of large-scale datasets, and the absence of a universal and community-driven benchmark.
1 code implementation • 6 Jun 2023 • Tyler Loakman, Chen Tang, Chenghua Lin
Previous work in phonetically-grounded language generation has mainly focused on domains such as lyrics and poetry.
2 code implementations • 31 May 2023 • Yizhi Li, Ruibin Yuan, Ge Zhang, Yinghao Ma, Xingran Chen, Hanzhi Yin, Chenghao Xiao, Chenghua Lin, Anton Ragni, Emmanouil Benetos, Norbert Gyenge, Roger Dannenberg, Ruibo Liu, Wenhu Chen, Gus Xia, Yemin Shi, Wenhao Huang, Zili Wang, Yike Guo, Jie Fu
Although SSL has been proven effective in speech and audio, its application to music audio has yet to be thoroughly explored.
1 code implementation • 26 May 2023 • Kun Zhao, Bohao Yang, Chenghua Lin, Wenge Rong, Aline Villavicencio, Xiaohui Cui
The long-standing one-to-many issue of the open-domain dialogues poses significant challenges for automatic evaluation methods, i. e., there may be multiple suitable responses which differ in semantics for a given conversational context.
1 code implementation • 26 May 2023 • Yucheng Li, Shun Wang, Chenghua Lin, Guerin Frank
One noticeable trend in metaphor detection is the embrace of linguistic theories such as the metaphor identification procedure (MIP) for model architecture design.
no code implementations • 22 May 2023 • Zekun Wang, Ge Zhang, Kexin Yang, Ning Shi, Wangchunshu Zhou, Shaochun Hao, Guangzheng Xiong, Yizhi Li, Mong Yuan Sim, Xiuying Chen, Qingqing Zhu, Zhenzhu Yang, Adam Nik, Qi Liu, Chenghua Lin, Shi Wang, Ruibo Liu, Wenhu Chen, Ke Xu, Dayiheng Liu, Yike Guo, Jie Fu
Interactive Natural Language Processing (iNLP) has emerged as a novel paradigm within the field of NLP, aimed at addressing limitations in existing frameworks while aligning with the ultimate goals of artificial intelligence.
2 code implementations • 20 May 2023 • Weifeng Jiang, Qianren Mao, Chenghua Lin, JianXin Li, Ting Deng, Weiyi Yang, Zheng Wang
Many text mining models are constructed by fine-tuning a large deep pre-trained language model (PLM) in downstream tasks.
1 code implementation • 10 May 2023 • Hongbo Zhang, Chen Tang, Tyler Loakman, Bohao Yang, Stefan Goetze, Chenghua Lin
Commonsense knowledge is crucial to many natural language processing tasks.
2 code implementations • 17 Apr 2023 • Ge Zhang, Yemin Shi, Ruibo Liu, Ruibin Yuan, Yizhi Li, Siwei Dong, Yu Shu, Zhaoqun Li, Zekun Wang, Chenghua Lin, Wenhao Huang, Jie Fu
Instruction tuning is widely recognized as a key technique for building generalist language models, which has attracted the attention of researchers and the public with the release of InstructGPT~\citep{ouyang2022training} and ChatGPT\footnote{\url{https://chat. openai. com/}}.
no code implementations • 18 Mar 2023 • Shekoufeh Kolahdouz-Rahimi, Kevin Lano, Chenghua Lin
We found that heuristic NLP approaches are the most common NLP techniques used for automatic RF, primary operating on structured and semi-structured data.
1 code implementation • 11 Feb 2023 • Shun Wang, Yucheng Li, Chenghua Lin, Loïc Barrault, Frank Guerin
We propose a novel RoBERTa-based model, RoPPT, which introduces a target-oriented parse tree structure in metaphor detection.
1 code implementation • 9 Feb 2023 • Yucheng Li, Shun Wang, Chenghua Lin, Frank Guerin, Loïc Barrault
In this paper, we propose FrameBERT, a RoBERTa-based model that can explicitly learn and incorporate FrameNet Embeddings for concept-level metaphor detection.
1 code implementation • 30 Jan 2023 • Yucheng Li, Frank Guerin, Chenghua Lin
Metaphors are proven to have stronger emotional impact than literal expressions.
1 code implementation • 1 Jan 2023 • Ge Zhang, Yizhi Li, Yaoyao Wu, Linyuan Zhang, Chenghua Lin, Jiayi Geng, Shi Wang, Jie Fu
As natural language processing (NLP) for gender bias becomes a significant interdisciplinary topic, the prevalent data-driven techniques such as large-scale language models suffer from data inadequacy and biased corpus, especially for languages with insufficient resources such as Chinese.
no code implementations • 8 Dec 2022 • Noor Fazilla Abd Yusof, Chenghua Lin
While outcome monitoring tends to improve the therapy outcomes, however, there are many challenges in the current method, e. g. time and financial burden for administering questionnaires, scoring and analysing the results.
no code implementations • 5 Dec 2022 • Yizhi Li, Ruibin Yuan, Ge Zhang, Yinghao Ma, Chenghua Lin, Xingran Chen, Anton Ragni, Hanzhi Yin, Zhijie Hu, Haoyu He, Emmanouil Benetos, Norbert Gyenge, Ruibo Liu, Jie Fu
The deep learning community has witnessed an exponentially growing interest in self-supervised learning (SSL).
1 code implementation • 5 Nov 2022 • Yizhi Li, Ge Zhang, Bohao Yang, Chenghua Lin, Shi Wang, Anton Ragni, Jie Fu
In addition to verifying the existence of regional bias in LMs, we find that the biases on regional groups can be strongly influenced by the geographical clustering of the groups.
1 code implementation • 1 Nov 2022 • Jianfei Zhang, Jun Bai, Chenghua Lin, Yanmeng Wang, Wenge Rong
There are effective ways proposed to prevent posterior collapse in VAEs, but we observe that they in essence make trade-offs between posterior collapse and hole problem, i. e., mismatch between the aggregated posterior distribution and the prior distribution.
1 code implementation • 27 Oct 2022 • Chen Tang, Hongbo Zhang, Tyler Loakman, Chenghua Lin, Frank Guerin
In this paper, we propose a novel framework to improve medical dialogue generation by considering features centered on domain-specific terminology.
1 code implementation • 22 Oct 2022 • Chen Tang, Chenghua Lin, Henglin Huang, Frank Guerin, Zhihao Zhang
One of the key challenges of automatic story generation is how to generate a long narrative that can maintain fluency, relevance, and coherence.
1 code implementation • 19 Oct 2022 • Henglin Huang, Chen Tang, Tyler Loakman, Frank Guerin, Chenghua Lin
In spite of the success of prior works with the application of pre-trained models, current neural models for Chinese stories still struggle to generate high-quality long text narratives.
1 code implementation • 19 Oct 2022 • Chen Tang, Zhihao Zhang, Tyler Loakman, Chenghua Lin, Frank Guerin
To improve the performance of long text generation, recent studies have leveraged automatically planned event structures (i. e. storylines) to guide story generation.
1 code implementation • 18 Oct 2022 • Tomas Goldsack, Zhihao Zhang, Chenghua Lin, Carolina Scarton
Lay summarisation aims to jointly summarise and simplify a given text, thus making its content more comprehensible to non-experts.
Ranked #1 on
Lay Summarization
on PLOS
no code implementations • 11 Jul 2022 • Owen Millwood, Jack Miskelly, Bohao Yang, Prosanta Gope, Elif Kavun, Chenghua Lin
As the demand for highly secure and dependable lightweight systems increases in the modern world, Physically Unclonable Functions (PUFs) continue to promise a lightweight alternative to high-cost encryption techniques and secure key storage.
1 code implementation • 10 Jun 2022 • Yucheng Li, Chenghua Lin, Frank Geurin
Metaphor generation is a challenging task which can impact many downstream tasks such as improving user satisfaction with dialogue systems and story generation.
1 code implementation • 27 Apr 2022 • Yizhi Li, Wei Fan, Chao Liu, Chenghua Lin, Jiang Qian
However, such a method strictly restricts entities on the hyper-ellipsoid surfaces which limits the optimization of entity distribution, leading to suboptimal performance of knowledge graph completion.
no code implementations • 6 Mar 2022 • Chen Tang, Frank Guerin, Chenghua Lin
In recent years, considerable research has been dedicated to the application of neural models in the field of natural language generation (NLG).
1 code implementation • 12 Oct 2021 • Jiayuan Ding, Tong Xiang, Zijing Ou, Wangyang Zuo, Ruihui Zhao, Chenghua Lin, Yefeng Zheng, Bang Liu
In this paper, we introduce a new task named Reading Path Generation (RPG) which aims at automatically producing a path of papers to read for a given query.
no code implementations • 7 Oct 2021 • Ruizhe Li, Xutan Peng, Chenghua Lin
In this paper, we provide the first focused study on the discontinuities (aka.
no code implementations • 29 Sep 2021 • Ruizhe Li, Xutan Peng, Chenghua Lin
In this paper, we provide the first focused study on the discontinuities (aka.
no code implementations • 29 Aug 2021 • Mohamad Hardyman Barawi, Chenghua Lin, Advaith Siddharthan, Yinbin Liu
Our experimental results on three real-world datasets show that both the extractive and abstractive approaches outperform four strong baselines in terms of facilitating topic understanding and interpretation.
1 code implementation • INLG (ACL) 2021 • Chengkun Zeng, Guanyi Chen, Chenghua Lin, Ruizhe Li, Zhigang Chen
Understanding speaker's feelings and producing appropriate responses with emotion connection is a key communicative skill for empathetic dialogue systems.
1 code implementation • NAACL 2021 • Xutan Peng, Chenghua Lin, Mark Stevenson
It is therefore recommended that this strategy be adopted as a standard for CLWE methods.
no code implementations • ACL 2021 • Yi Cheng, SiYao Li, Bang Liu, Ruihui Zhao, Sujian Li, Chenghua Lin, Yefeng Zheng
This paper explores the task of Difficulty-Controllable Question Generation (DCQG), which aims at generating questions with required difficulty levels.
1 code implementation • 11 Apr 2021 • Xutan Peng, Chenghua Lin, Mark Stevenson
It is therefore recommended that this strategy be adopted as a standard for CLWE methods.
1 code implementation • NAACL 2021 • Xutan Peng, Guanyi Chen, Chenghua Lin, Mark Stevenson
Knowledge Graph Embeddings (KGEs) have been intensively explored in recent years due to their promise for a wide range of applications.
no code implementations • 7 Apr 2021 • Rui Mao, Chenghua Lin, Frank Guerin
The pre-trained word embeddings GloVe, ELMo and BERT have individually shown good performance on sequential metaphor identification.
no code implementations • 7 Apr 2021 • Rui Mao, Chenghua Lin, Frank Guerin
Metaphorical expressions are difficult linguistic phenomena, challenging diverse Natural Language Processing tasks.
1 code implementation • EACL 2021 • Xutan Peng, Yi Zheng, Chenghua Lin, Advaith Siddharthan
We introduce the task of historical text summarisation, where documents in historical forms of a language are summarised in the corresponding modern language.
no code implementations • 2 Dec 2020 • Jing Su, Chenghua Lin, Mian Zhou, Qingyun Dai, Haoyu Lv
In this paper, we propose an end-to-end CNN-LSTM model for generating descriptions for sequential images with a local-object attention mechanism.
1 code implementation • COLING 2020 • Ruizhe Li, Xiao Li, Guanyi Chen, Chenghua Lin
The Variational Autoencoder (VAE) is a popular and powerful model applied to text modelling to generate diverse sentences.
no code implementations • EMNLP 2020 • Xiao Li, Guanyi Chen, Chenghua Lin, Ruizhe Li
We propose DGST, a novel and simple Dual-Generator network architecture for text Style Transfer.
no code implementations • 16 May 2020 • Xiao Li, Kees Van Deemter, Chenghua Lin
Recent years have seen a number of proposals for performing Natural Language Generation (NLG) based in large part on statistical techniques.
no code implementations • NAACL 2021 • Dingmin Wang, Chenghua Lin, Qi Liu, Kam-Fai Wong
We present a fast and scalable architecture called Explicit Modular Decomposition (EMD), in which we incorporate both classification-based and extraction-based methods and design four modules (for classification and sequence labelling) to jointly extract dialogue states.
1 code implementation • 2 Apr 2020 • Xutan Peng, Mark Stevenson, Chenghua Lin, Chen Li
The technique of Cross-Lingual Word Embedding (CLWE) plays a fundamental role in tackling Natural Language Processing challenges for low-resource languages.
1 code implementation • WS 2019 • Ruizhe Li, Xiao Li, Chenghua Lin, Matthew Collinson, Rui Mao
Variational Autoencoder (VAE) is a powerful method for learning representations of high-dimensional data.
no code implementations • WS 2019 • Guanyi Chen, Kees Van Deemter, Chenghua Lin
Quantified expressions have always taken up a central position in formal theories of meaning and language use.
1 code implementation • WS 2019 • Guanyi Chen, Kees Van Deemter, Silvia Pagliaro, Louk Smalbil, Chenghua Lin
To inform these algorithms, we conducted on a series of elicitation experiments in which human speakers were asked to perform a linguistic task that invites the use of quantified expressions.
no code implementations • 13 Sep 2019 • Wenjun Liao, Chenghua Lin
The second approach is based on word embedding, where word2vec model is introduced and two document similarities calculation algorithms are implemented: wor2vec cosine similarity and WMD distance.
2 code implementations • ICML 2020 • Xiao Li, Chenghua Lin, Ruizhe Li, Chaozheng Wang, Frank Guerin
We demonstrate the utility of our method for attribute manipulation in autoencoders trained across varied domains, using both human evaluation and automated methods.
Ranked #7 on
Image Generation
on CelebA 256x256
(FID metric)
1 code implementation • ACL 2019 • Rui Mao, Chenghua Lin, Frank Guerin
End-to-end training with Deep Neural Networks (DNN) is a currently popular method for metaphor identification.
1 code implementation • WS 2018 • Guanyi Chen, Kees Van Deemter, Chenghua Lin
We introduce SimpleNLG-ZH, a realisation engine for Mandarin that follows the software design paradigm of SimpleNLG (Gatt and Reiter, 2009).
no code implementations • WS 2018 • Xiao Li, Kees Van Deemter, Chenghua Lin
This paper argues that a new generic approach to statistical NLG can be made to perform Referring Expression Generation (REG) successfully.
no code implementations • WS 2018 • Guanyi Chen, Kees Van Deemter, Chenghua Lin
We extend the classic Referring Expressions Generation task by considering zero pronouns in {``}pro-drop{''} languages such as Chinese, modelling their use by means of the Bayesian Rational Speech Acts model (Frank and Goodman, 2012).
no code implementations • CONLL 2019 • Ruizhe Li, Chenghua Lin, Matthew Collinson, Xiao Li, Guanyi Chen
Recognising dialogue acts (DA) is important for many natural language processing tasks such as dialogue generation and intention recognition.
Ranked #4 on
Dialogue Act Classification
on Switchboard corpus
no code implementations • ACL 2018 • Rui Mao, Chenghua Lin, Frank Guerin
Metaphoric expressions are widespread in natural language, posing a significant challenge for various natural language processing tasks such as Machine Translation.
no code implementations • SEMEVAL 2018 • Rui Mao, Guanyi Chen, Ruizhe Li, Chenghua Lin
This paper describes the system that we submitted for SemEval-2018 task 10: capturing discriminative attributes.
no code implementations • WS 2017 • Noor Fazilla Abd Yusof, Chenghua Lin, Frank Guerin
We develop a computational model to discover the potential causes of depression by analysing the topics in a usergenerated text.
no code implementations • IJCNLP 2017 • Ebuka Ibeke, Chenghua Lin, Adam Wyner, Mohamad Hardyman Barawi
Contrastive opinion mining is essential in identifying, extracting and organising opinions from user generated texts.