no code implementations • WMT (EMNLP) 2021 • Meng Zhang, Minghao Wu, Pengfei Li, Liangyou Li, Qun Liu
This paper describes the NoahNMT system submitted to the WMT 2021 shared task of Very Low Resource Supervised Machine Translation.
1 code implementation • 16 Oct 2024 • Weixuan Wang, Minghao Wu, Barry Haddow, Alexandra Birch
Large Language Models (LLMs) have shown remarkable capabilities in natural language processing but exhibit significant performance gaps among different languages.
no code implementations • 16 Oct 2024 • Minghao Wu, Thuy-Trang Vu, Lizhen Qu, Gholamreza Haffari
In this paper, we introduce GraphFilter, a novel method that represents the dataset as a bipartite graph, linking sentences to their constituent n-grams.
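A minimal sketch of the data structure this entry describes: a bipartite graph linking each sentence to its constituent n-grams. This is an illustrative reconstruction under assumed tokenization (whitespace) and a hypothetical `build_bipartite_graph` helper, not the paper's implementation.

```python
# Illustrative sketch (not the paper's code): represent a dataset as a
# bipartite graph whose two node sets are sentences and n-grams.
from collections import defaultdict

def ngrams(tokens, n):
    """Return the n-grams of a token sequence as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def build_bipartite_graph(sentences, max_n=3):
    """Map each sentence index to its n-grams, and each n-gram back to
    the sentences containing it (the two sides of the bipartite graph)."""
    sent_to_ngrams = {}
    ngram_to_sents = defaultdict(set)
    for idx, sentence in enumerate(sentences):
        tokens = sentence.split()  # assumed whitespace tokenization
        grams = {g for n in range(1, max_n + 1) for g in ngrams(tokens, n)}
        sent_to_ngrams[idx] = grams
        for g in grams:
            ngram_to_sents[g].add(idx)
    return sent_to_ngrams, ngram_to_sents

sents = ["the cat sat on the mat", "a dog sat on the rug"]
s2n, n2s = build_bipartite_graph(sents)
print(n2s[("sat", "on", "the")])  # {0, 1}: a shared trigram links both sentences
```

Selection methods can then score sentences by, for example, how many not-yet-covered n-grams they contribute.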
no code implementations • 20 Jun 2024 • Huifang Du, Shuqin Li, Minghao Wu, Xuejing Feng, Yuan-Fang Li, Haofen Wang
Reinforcement learning (RL) is a powerful approach to enhance task-oriented dialogue (TOD) systems.
no code implementations • 13 Jun 2024 • Minghao Wu, Thuy-Trang Vu, Lizhen Qu, Gholamreza Haffari
In this work, we propose a general, model-agnostic reinforcement learning framework, Mixture-of-Skills (MoS), that learns to optimize data usage automatically during the fine-tuning process.
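A hedged, bandit-style sketch of the general idea: treat each dataset as an arm, sample training batches in proportion to learned weights, and update the weights from a reward signal (e.g., validation improvement). The `DataUsageSampler` class, the REINFORCE-style update, and the reward definition are illustrative assumptions, not the paper's exact formulation.

```python
import math, random

class DataUsageSampler:
    """Softmax policy over datasets; assumed components, not MoS itself."""
    def __init__(self, num_datasets, lr=0.1):
        self.logits = [0.0] * num_datasets
        self.lr = lr

    def probs(self):
        z = max(self.logits)
        exps = [math.exp(l - z) for l in self.logits]
        s = sum(exps)
        return [e / s for e in exps]

    def sample(self):
        """Pick the dataset to draw the next fine-tuning batch from."""
        return random.choices(range(len(self.logits)), weights=self.probs())[0]

    def update(self, dataset_idx, reward):
        """REINFORCE-style update: raise the weight of datasets whose
        batches improved the model, lower the others."""
        p = self.probs()
        for i in range(len(self.logits)):
            grad = reward * ((1.0 if i == dataset_idx else 0.0) - p[i])
            self.logits[i] += self.lr * grad

sampler = DataUsageSampler(num_datasets=3)
for step in range(100):
    d = sampler.sample()
    # ... fine-tune on one batch from dataset d, then measure a reward ...
    reward = random.random()  # placeholder for e.g. validation improvement
    sampler.update(d, reward)
print(sampler.probs())  # learned data-usage proportions
```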
1 code implementation • 13 Jun 2024 • Weixuan Wang, Barry Haddow, Minghao Wu, Wei Peng, Alexandra Birch
In this study, we aim to fill the research gap by examining how neuron activation is shared across tasks and languages.
no code implementations • 20 May 2024 • Minghao Wu, Yulin Yuan, Gholamreza Haffari, Longyue Wang
Recent advancements in machine translation (MT) have significantly enhanced translation quality across various domains.
no code implementations • 19 May 2024 • Chiyu Zhang, Yifei Sun, Minghao Wu, Jun Chen, Jie Lei, Muhammad Abdul-Mageed, Rong Jin, Angli Liu, Ji Zhu, Sem Park, Ning Yao, Bo Long
Content-based recommendation systems play a crucial role in delivering personalized content to users in the digital world.
no code implementations • 21 Feb 2024 • Chenyang Lyu, Minghao Wu, Alham Fikri Aji
Large Language Models (LLMs) have demonstrated remarkable capabilities across various applications, fundamentally reshaping the landscape of natural language processing (NLP) research.
no code implementations • 27 Jan 2024 • Minghao Wu, YuFei Wang, George Foster, Lizhen Qu, Gholamreza Haffari
Document-level neural machine translation (DocNMT) aims to generate translations that are both coherent and cohesive, in contrast to its sentence-level counterpart.
no code implementations • 12 Jan 2024 • Minghao Wu, Thuy-Trang Vu, Lizhen Qu, George Foster, Gholamreza Haffari
We provide an in-depth analysis of these LLMs tailored for DocMT, examining translation errors, discourse phenomena, strategies for training and inference, the data efficiency of parallel documents, recent test set evaluations, and zero-shot cross-lingual transfer.
1 code implementation • 17 Dec 2023 • Renxi Wang, Haonan Li, Minghao Wu, Yuxia Wang, Xudong Han, Chiyu Zhang, Timothy Baldwin
Instruction tuning significantly enhances the performance of large language models (LLMs) across various tasks.
no code implementations • 25 Nov 2023 • Zhanyu Wang, Longyue Wang, Zhen Zhao, Minghao Wu, Chenyang Lyu, Huayang Li, Deng Cai, Luping Zhou, Shuming Shi, Zhaopeng Tu
While the recent advances in Multimodal Large Language Models (MLLMs) constitute a significant leap forward in the field, these models are predominantly confined to the realm of input-side multimodal comprehension, lacking the capacity for multimodal content generation.
no code implementations • 6 Jul 2023 • Minghao Wu, Alham Fikri Aji
This study investigates the behavior of crowd-sourced and expert annotators, as well as LLMs, when comparing outputs from different models.
1 code implementation • 15 Jun 2023 • Chenyang Lyu, Minghao Wu, Longyue Wang, Xinting Huang, Bingshuai Liu, Zefeng Du, Shuming Shi, Zhaopeng Tu
Although instruction-tuned large language models (LLMs) have exhibited remarkable capabilities across various NLP tasks, their effectiveness on other data modalities beyond text has not been fully studied.
1 code implementation • 24 May 2023 • Haonan Li, Fajri Koto, Minghao Wu, Alham Fikri Aji, Timothy Baldwin
However, research on multilingual instruction tuning has been limited due to the scarcity of high-quality instruction-response datasets across different languages.
no code implementations • 2 May 2023 • Chenyang Lyu, Zefeng Du, Jitao Xu, Yitao Duan, Minghao Wu, Teresa Lynn, Alham Fikri Aji, Derek F. Wong, Siyou Liu, Longyue Wang
We conclude by emphasizing the critical role of LLMs in guiding the future evolution of MT and offer a roadmap for future exploration in this field.
1 code implementation • 27 Apr 2023 • Minghao Wu, Abdul Waheed, Chiyu Zhang, Muhammad Abdul-Mageed, Alham Fikri Aji
The results demonstrate that our proposed LaMini-LM models are comparable to competitive baselines, while being much smaller in size.
Ranked #15 on Word Sense Disambiguation on Words in Context
no code implementations • 16 Feb 2023 • Minghao Wu, George Foster, Lizhen Qu, Gholamreza Haffari
Existing work in document-level neural machine translation commonly concatenates several consecutive sentences into a pseudo-document and then learns inter-sentential dependencies from it.
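A minimal sketch of the common preprocessing step this entry refers to: grouping consecutive sentences into fixed-size pseudo-documents so the model can see inter-sentential context. The window size and separator token are illustrative choices, not values from the paper.

```python
# Illustrative sketch of pseudo-document construction for DocNMT.
def make_pseudo_documents(sentences, window=4, sep=" </s> "):
    """Group consecutive sentences into fixed-size pseudo-documents."""
    return [
        sep.join(sentences[i:i + window])
        for i in range(0, len(sentences), window)
    ]

doc = ["Sentence one.", "Sentence two.", "Sentence three.",
       "Sentence four.", "Sentence five."]
for pseudo in make_pseudo_documents(doc, window=2):
    print(pseudo)
```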
1 code implementation • ACL 2022 • Pengfei Li, Liangyou Li, Meng Zhang, Minghao Wu, Qun Liu
To the best of our knowledge, this is the first work to pre-train a unified model for fine-tuning on both NMT tasks.
no code implementations • EMNLP 2021 • Minghao Wu, Yitong Li, Meng Zhang, Liangyou Li, Gholamreza Haffari, Qun Liu
In this work, we propose MultiUAT, an approach to multi-corpus machine translation that dynamically adjusts training data usage based on the model's uncertainty on a small set of trusted, clean data.
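A hedged sketch of the high-level idea: treat each corpus as a bandit arm, reward an arm by how much a batch from it reduces the model's loss on the trusted clean set, and keep exponentially weighted sampling probabilities. The EXP3-style update, the loss-drop reward, and all constants are simplified assumptions, not the paper's exact uncertainty estimators.

```python
import math, random

def sampling_probs(weights, gamma=0.1):
    """Mix normalized weights with uniform exploration (EXP3-style)."""
    k, total = len(weights), sum(weights)
    return [(1 - gamma) * w / total + gamma / k for w in weights]

weights = [1.0, 1.0, 1.0]        # one weight per training corpus
trusted_loss = 2.0               # model loss on the trusted clean set
gamma = 0.1
for step in range(50):
    probs = sampling_probs(weights, gamma)
    arm = random.choices(range(len(weights)), weights=probs)[0]
    # ... train on one batch from corpus `arm`, re-evaluate trusted loss ...
    new_loss = trusted_loss - random.uniform(-0.01, 0.02)  # placeholder
    reward = max(0.0, trusted_loss - new_loss)             # uncertainty drop
    weights[arm] *= math.exp(gamma * reward / (probs[arm] * len(weights)))
    trusted_loss = new_loss
print(sampling_probs(weights, gamma))  # learned per-corpus usage
```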
1 code implementation • EMNLP 2018 • Minghao Wu, Fei Liu, Trevor Cohn
Conventional wisdom holds that hand-crafted features are redundant for deep learning models, as such models already learn adequate representations of text automatically from corpora.
Ranked #43 on Named Entity Recognition (NER) on CoNLL 2003 (English)
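A minimal sketch of the premise the entry above challenges: hand-crafted token features can complement learned representations, for example by concatenating simple orthographic features with word embeddings before a tagger. The feature set, embedding lookup, and dimensions here are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

def handcrafted_features(token):
    """A few classic NER features: capitalization, digits, punctuation."""
    return np.array([
        float(token[0].isupper()),              # initial capital
        float(token.isupper()),                 # all caps
        float(any(c.isdigit() for c in token)), # contains a digit
        float(not token.isalnum()),             # contains punctuation
        float(len(token) > 3),                  # longer than 3 characters
    ])

def embed(token, dim=50, _cache={}):
    """Placeholder embedding lookup (random vector, cached per token)."""
    if token not in _cache:
        _cache[token] = np.random.randn(dim)
    return _cache[token]

tokens = ["John", "lives", "in", "New", "York", "."]
inputs = np.stack([np.concatenate([embed(t), handcrafted_features(t)])
                   for t in tokens])
print(inputs.shape)  # (6, 55): 50-dim embedding + 5 hand-crafted features
```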