no code implementations • EMNLP 2020 • Kaiyu Huang, Degen Huang, Zhuang Liu, Fengran Mo
Chinese word segmentation (CWS) is an essential step for downstream Chinese NLP tasks.
no code implementations • EMNLP 2021 • Mieradilijiang Maimaiti, Yang Liu, Yuanhang Zheng, Gang Chen, Kaiyu Huang, Ji Zhang, Huanbo Luan, Maosong Sun
In addition, the robustness of previous neural methods is limited by their dependence on large-scale annotated data.
no code implementations • WMT (EMNLP) 2021 • Huan Liu, Junpeng Liu, Kaiyu Huang, Degen Huang
This paper describes DUT-NLP Lab’s submission to the WMT-21 triangular machine translation shared task.
no code implementations • AACL (iwdp) 2020 • Kaiyu Huang, Junpeng Liu, Jingxiang Cao, Degen Huang
This paper proposes a three-step strategy to improve the performance of discourse CWS.
no code implementations • Findings (EMNLP) 2021 • Kaiyu Huang, Hao Yu, Junpeng Liu, Wei Liu, Jingxiang Cao, Degen Huang
Experimental results on five benchmarks and four cross-domain datasets show that the lexicon-based graph convolutional network successfully captures the information of candidate words and helps to improve performance on the benchmarks (Bakeoff-2005 and CTB6) and the cross-domain datasets (SIGHAN-2010).
no code implementations • 10 Jan 2025 • You Li, Heyu Huang, Chi Chen, Kaiyu Huang, Chao Huang, Zonghao Guo, Zhiyuan Liu, Jinan Xu, Yuhua Li, Ruixuan Li, Maosong Sun
The recent advancement of Multimodal Large Language Models (MLLMs) has significantly improved their fine-grained perception of single images and general comprehension across multiple images.
1 code implementation • 29 Jul 2024 • Fengran Mo, Chen Qu, Kelong Mao, Yihong Wu, Zhan Su, Kaiyu Huang, Jian-Yun Nie
In this paper, we leverage both rewritten queries and relevance judgments in the conversational search data to train a better query representation model.
1 code implementation • 23 Jul 2024 • Fengran Mo, Longxiang Zhao, Kaiyu Huang, Yue Dong, Degen Huang, Jian-Yun Nie
Personalized conversational information retrieval (CIR) combines conversational and personalizable elements to satisfy various users' complex information needs through multi-turn interaction based on their backgrounds.
2 code implementations • 27 May 2024 • Yulong Mao, Kaiyu Huang, Changhao Guan, Ganglin Bao, Fengran Mo, Jinan Xu
Fine-tuning large-scale pre-trained models is inherently a resource-intensive task.
1 code implementation • 17 May 2024 • Kaiyu Huang, Fengran Mo, Xinyu Zhang, Hongliang Li, You Li, Yuanchi Zhang, Weijian Yi, Yulong Mao, Jinchen Liu, Yuzhuang Xu, Jinan Xu, Jian-Yun Nie, Yang Liu
The survey aims to help the research community address multilingual problems and provide a comprehensive understanding of the core concepts, key techniques, and latest developments in multilingual natural language processing based on LLMs.
1 code implementation • 17 Mar 2024 • Fengran Mo, Bole Yi, Kelong Mao, Chen Qu, Kaiyu Huang, Jian-Yun Nie
Conversational search provides a more convenient interface for users to search by allowing multi-turn interaction with the search engine.
no code implementations • 3 Mar 2024 • Mieradilijiang Maimaiti, Yuanhang Zheng, Ji Zhang, Fei Huang, Yue Zhang, Wenpei Luo, Kaiyu Huang
Semantic Retrieval (SR) has become an indispensable part of the FAQ system in the task-oriented question-answering (QA) dialogue scenario.
1 code implementation • 30 Jan 2024 • Fengran Mo, Chen Qu, Kelong Mao, Tianyu Zhu, Zhan Su, Kaiyu Huang, Jian-Yun Nie
To address the aforementioned issues, we propose a History-Aware Conversational Dense Retrieval (HAConvDR) system, which incorporates two ideas: context-denoised query reformulation and automatic mining of supervision signals based on the actual impact of historical turns.
1 code implementation • 5 Jun 2023 • Fengran Mo, Jian-Yun Nie, Kaiyu Huang, Kelong Mao, Yutao Zhu, Peng Li, Yang Liu
An effective way to improve retrieval effectiveness is to expand the current query with historical queries.
1 code implementation • 25 May 2023 • Fengran Mo, Kelong Mao, Yutao Zhu, Yihong Wu, Kaiyu Huang, Jian-Yun Nie
In this paper, we propose ConvGQR, a new framework to reformulate conversational queries based on generative pre-trained language models (PLMs), one for query rewriting and another for generating potential answers.