no code implementations • 20 Aug 2024 • Viraat Aryabumi, Yixuan Su, Raymond Ma, Adrien Morisot, Ivan Zhang, Acyr Locatelli, Marzieh Fadaee, Ahmet Üstün, Sara Hooker
In this work, we systematically investigate the impact of code data on general performance.
1 code implementation • 6 Aug 2024 • Zongqian Li, Yixuan Su, Nigel Collier
Prompt compression is crucial for enhancing inference speed, reducing costs, and improving user experience.
no code implementations • 29 Apr 2024 • Pat Verga, Sebastian Hofstätter, Sophia Althammer, Yixuan Su, Aleksandra Piktus, Arkady Arkhangorodsky, Minjie Xu, Naomi White, Patrick Lewis
As Large Language Models (LLMs) have become more advanced, they have outpaced our abilities to accurately evaluate their quality.
2 code implementations • 29 Feb 2024 • Anton Lozhkov, Raymond Li, Loubna Ben allal, Federico Cassano, Joel Lamy-Poirier, Nouamane Tazi, Ao Tang, Dmytro Pykhtar, Jiawei Liu, Yuxiang Wei, Tianyang Liu, Max Tian, Denis Kocetkov, Arthur Zucker, Younes Belkada, Zijian Wang, Qian Liu, Dmitry Abulkhanov, Indraneil Paul, Zhuang Li, Wen-Ding Li, Megan Risdal, Jia Li, Jian Zhu, Terry Yue Zhuo, Evgenii Zheltonozhskii, Nii Osae Osae Dade, Wenhao Yu, Lucas Krauß, Naman jain, Yixuan Su, Xuanli He, Manan Dey, Edoardo Abati, Yekun Chai, Niklas Muennighoff, Xiangru Tang, Muhtasham Oblokulov, Christopher Akiki, Marc Marone, Chenghao Mou, Mayank Mishra, Alex Gu, Binyuan Hui, Tri Dao, Armel Zebaze, Olivier Dehaene, Nicolas Patry, Canwen Xu, Julian McAuley, Han Hu, Torsten Scholak, Sebastien Paquet, Jennifer Robinson, Carolyn Jane Anderson, Nicolas Chapados, Mostofa Patwary, Nima Tajbakhsh, Yacine Jernite, Carlos Muñoz Ferrandis, Lingming Zhang, Sean Hughes, Thomas Wolf, Arjun Guha, Leandro von Werra, Harm de Vries
Our large model, StarCoder2-15B, significantly outperforms other models of comparable size.
Ranked #27 on Code Generation on MBPP
1 code implementation • 15 Feb 2024 • Yinhong Liu, Yixuan Su, Ehsan Shareghi, Nigel Collier
Recent large language models (LLMs) have shown remarkable performance in aligning generated text with user intentions across various tasks.
no code implementations • 19 Dec 2023 • Yinhong Liu, Yixuan Su, Ehsan Shareghi, Nigel Collier
Instruction-tuned large language models have shown remarkable performance in aligning generated text with user intentions across various tasks.
no code implementations • 23 Oct 2023 • Chufan Shi, Yixuan Su, Cheng Yang, Yujiu Yang, Deng Cai
Although instruction tuning has proven to be a data-efficient method for transforming LLMs into generalist models, their performance still lags behind that of specialist models trained exclusively for specific tasks.
1 code implementation • 31 Aug 2023 • Yupan Huang, Zaiqiao Meng, Fangyu Liu, Yixuan Su, Nigel Collier, Yutong Lu
Furthermore, we construct SparklesEval, a GPT-assisted benchmark for quantitatively assessing a model's conversational competence across multiple images and dialogue turns.
1 code implementation • 25 May 2023 • Yixuan Su, Tian Lan, Huayang Li, Jialu Xu, Yan Wang, Deng Cai
To do so, PandaGPT combines the multimodal encoders from ImageBind and the large language models from Vicuna.
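A minimal sketch of the bridging idea, assuming a frozen multimodal encoder (such as ImageBind) and an LLM embedding width; the module and dimensions below are illustrative, not the released PandaGPT code:

```python
# Sketch: project a frozen multimodal embedding into the LLM's
# token-embedding space with one trainable linear layer, then prepend it
# to the text embeddings as a soft prefix.
import torch
import torch.nn as nn

class MultimodalPrefix(nn.Module):
    def __init__(self, encoder_dim: int = 1024, llm_dim: int = 4096):
        super().__init__()
        # The only trainable component: maps encoder features to LLM width.
        self.proj = nn.Linear(encoder_dim, llm_dim)

    def forward(self, modality_features: torch.Tensor,
                text_embeddings: torch.Tensor) -> torch.Tensor:
        # modality_features: (batch, encoder_dim) from the frozen encoder;
        # text_embeddings: (batch, seq, llm_dim) from the LLM's embedding table.
        prefix = self.proj(modality_features).unsqueeze(1)  # (batch, 1, llm_dim)
        return torch.cat([prefix, text_embeddings], dim=1)

# Usage with random tensors standing in for real encoder/LLM outputs:
bridge = MultimodalPrefix()
fused = bridge(torch.randn(2, 1024), torch.randn(2, 16, 4096))
print(fused.shape)  # torch.Size([2, 17, 4096])
```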
1 code implementation • 22 May 2023 • Zihao Fu, Yixuan Su, Zaiqiao Meng, Nigel Collier
To alleviate the need for human effort, dictionary-based approaches have been proposed to extract named entities simply based on a given dictionary.
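For reference, a toy sketch of the dictionary-based extraction baseline the paper starts from; the dictionary entries and span-matching scheme are illustrative assumptions:

```python
# Toy dictionary-based entity extraction: label any token span that
# exactly matches a dictionary entry. Entries here are made up.
from typing import List, Tuple

DICTIONARY = {
    "aspirin": "Drug",
    "new york": "Location",
}

def dictionary_ner(text: str, max_span: int = 3) -> List[Tuple[str, str]]:
    tokens = text.lower().split()
    matches = []
    for start in range(len(tokens)):
        for end in range(start + 1, min(start + 1 + max_span, len(tokens) + 1)):
            span = " ".join(tokens[start:end])
            if span in DICTIONARY:
                matches.append((span, DICTIONARY[span]))
    return matches

print(dictionary_ner("She bought aspirin in New York"))
# [('aspirin', 'Drug'), ('new york', 'Location')]
```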
1 code implementation • 25 Mar 2023 • Meiru Zhang, Yixuan Su, Zaiqiao Meng, Zihao Fu, Nigel Collier
In this study, we consider a more realistic setting of this task, namely the Oracle-Free Event Extraction (OFEE) task, where only the input context is given without any oracle information, including event type, event ontology and trigger word.
no code implementations • 9 Dec 2022 • Yinhong Liu, Yixuan Su, Ehsan Shareghi, Nigel Collier
Specifically, it optimizes the joint distribution of the natural language sequence and the global content plan in a plug-and-play manner.
1 code implementation • 5 Dec 2022 • Tian Lan, Yixuan Su, Shuhang Liu, Heyan Huang, Xian-Ling Mao
In this study, we formulate open-ended text generation from a new perspective, i.e., we view it as an exploration process within a directed graph.
3 code implementations • 19 Nov 2022 • Yixuan Su, Jialu Xu
In this study, we empirically compare two recently proposed decoding methods, i.e., Contrastive Search (CS) and Contrastive Decoding (CD), for open-ended text generation.
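For orientation, a sketch of the contrastive search selection rule: among the top-k candidates, the next token maximizes model confidence minus a degeneration penalty (maximum cosine similarity to the context's hidden states). The tensor shapes and alpha value below are illustrative:

```python
# Sketch of one contrastive search (CS) decoding step.
import torch
import torch.nn.functional as F

def contrastive_search_step(cand_probs: torch.Tensor,    # (k,) model confidence
                            cand_hiddens: torch.Tensor,  # (k, d) candidate states
                            ctx_hiddens: torch.Tensor,   # (t, d) context states
                            alpha: float = 0.6) -> int:
    cand_norm = F.normalize(cand_hiddens, dim=-1)
    ctx_norm = F.normalize(ctx_hiddens, dim=-1)
    # Degeneration penalty: highest similarity to any previous token state.
    penalty = (cand_norm @ ctx_norm.T).max(dim=-1).values   # (k,)
    scores = (1 - alpha) * cand_probs - alpha * penalty
    return int(scores.argmax())

# Example with random states for k=4 candidates over a 10-token context:
idx = contrastive_search_step(torch.rand(4), torch.randn(4, 64), torch.randn(10, 64))
print(idx)
```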
3 code implementations • 25 Oct 2022 • Yixuan Su, Nigel Collier
Based on our findings, we further assess the contrastive search decoding method using off-the-shelf LMs on four generation tasks across 16 languages.
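Off-the-shelf LMs can be decoded with contrastive search through Hugging Face transformers, where generate() switches to contrastive search when penalty_alpha and top_k are both set; the model and hyperparameters below are just an example, assuming a recent transformers release:

```python
# Contrastive search with an off-the-shelf LM (alpha=0.6, k=4 are the
# commonly reported settings, not necessarily the paper's exact values).
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("DeepMind Company is", return_tensors="pt")
outputs = model.generate(**inputs, penalty_alpha=0.6, top_k=4, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```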
1 code implementation • 22 Aug 2022 • Yutao Zhu, Jian-Yun Nie, Yixuan Su, Haonan Chen, Xinyu Zhang, Zhicheng Dou
In this work, we propose a curriculum learning framework for context-aware document ranking, in which the ranking model learns matching signals between the search context and the candidate document in an easy-to-hard manner.
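A generic easy-to-hard scheduling sketch in the spirit of that framework; the difficulty proxy and pool-growth schedule below are assumptions, not the paper's exact curriculum:

```python
# Easy-to-hard curriculum sketch: sort training examples by a difficulty
# proxy and widen the visible pool each epoch, so early epochs only see
# the easiest matching examples.
import random

def curriculum_batches(examples, difficulty, num_epochs=5, batch_size=32):
    ordered = sorted(examples, key=difficulty)           # easy -> hard
    for epoch in range(1, num_epochs + 1):
        cutoff = int(len(ordered) * epoch / num_epochs)  # widen pool per epoch
        pool = ordered[:max(cutoff, batch_size)]
        random.shuffle(pool)
        for i in range(0, len(pool), batch_size):
            yield epoch, pool[i:i + batch_size]

# Toy usage: treat query length as the difficulty proxy.
data = [{"query": "q" * n, "doc": f"d{n}"} for n in range(1, 200)]
for epoch, batch in curriculum_batches(data, difficulty=lambda ex: len(ex["query"])):
    pass  # feed batch to the ranking model here
```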
1 code implementation • 5 May 2022 • Yixuan Su, Tian Lan, Yahui Liu, Fangyu Liu, Dani Yogatama, Yan Wang, Lingpeng Kong, Nigel Collier
MAGIC is a flexible framework and is theoretically compatible with any text generation task that incorporates image grounding.
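A coarse, sequence-level illustration of image-grounded scoring: rerank candidate continuations by CLIP image-text similarity. Note this is only a simplified stand-in; MAGIC itself injects the image score token-by-token during decoding, and the model checkpoint and dummy image below are assumptions:

```python
# Simplified sketch: pick the candidate continuation most similar to the
# image under CLIP (not the paper's token-level decoding scheme).
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.new("RGB", (224, 224), color="white")  # stand-in for a real image
candidates = ["A dog running on the beach.", "Quarterly earnings rose by 4%."]

inputs = processor(text=candidates, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    sims = model(**inputs).logits_per_image[0]  # image-text similarity per candidate
print(candidates[int(sims.argmax())])
```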
2 code implementations • 13 Feb 2022 • Yixuan Su, Tian Lan, Yan Wang, Dani Yogatama, Lingpeng Kong, Nigel Collier
Text generation is of great importance to many natural language processing applications.
no code implementations • 7 Feb 2022 • Deng Cai, Elman Mansimov, Yi-An Lai, Yixuan Su, Lei Shu, Yi Zhang
First, we measure and analyze model update regression in different model update settings.
no code implementations • 2 Feb 2022 • Huayang Li, Yixuan Su, Deng Cai, Yan Wang, Lemao Liu
Recently, retrieval-augmented text generation has attracted increasing attention from the computational linguistics community.
2 code implementations • Findings (NAACL) 2022 • Yixuan Su, Fangyu Liu, Zaiqiao Meng, Tian Lan, Lei Shu, Ehsan Shareghi, Nigel Collier
Masked language models (MLMs) such as BERT and RoBERTa have revolutionized the field of Natural Language Understanding in the past few years.
1 code implementation • ACL 2022 • Zaiqiao Meng, Fangyu Liu, Ehsan Shareghi, Yixuan Su, Charlotte Collins, Nigel Collier
To catalyse the research in this direction, we release a well-curated biomedical knowledge probing benchmark, MedLAMA, which is constructed based on the Unified Medical Language System (UMLS) Metathesaurus.
1 code implementation • 13 Oct 2021 • Tian Lan, Deng Cai, Yan Wang, Yixuan Su, Heyan Huang, Xian-Ling Mao
In this study, we present a solution to directly select proper responses from a large corpus or even a nonparallel corpus that only consists of unpaired sentences, using a dense retrieval model.
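A minimal dense-retrieval sketch of that selection step, using a stand-in sentence encoder (not the paper's trained retriever) and illustrative candidate responses:

```python
# Sketch: embed the dialogue context and all candidate responses, then
# select the nearest response by dot product over normalized embeddings.
from sentence_transformers import SentenceTransformer
import numpy as np

encoder = SentenceTransformer("all-MiniLM-L6-v2")

responses = [
    "Sure, the meeting is at 3pm tomorrow.",
    "I prefer tea over coffee.",
    "The train leaves from platform 2.",
]
context = "When does the meeting start?"

resp_vecs = encoder.encode(responses, normalize_embeddings=True)
ctx_vec = encoder.encode([context], normalize_embeddings=True)[0]
scores = resp_vecs @ ctx_vec
print(responses[int(np.argmax(scores))])
```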
2 code implementations • ACL 2022 • Yixuan Su, Lei Shu, Elman Mansimov, Arshit Gupta, Deng Cai, Yi-An Lai, Yi Zhang
Pre-trained language models have been recently shown to benefit task-oriented dialogue (TOD) systems.
2 code implementations • Findings (EMNLP) 2021 • Yixuan Su, David Vandyke, Sihui Wang, Yimai Fang, Nigel Collier
However, the lack of ability of neural models to control the structure of generated output can be limiting in certain real-world applications.
1 code implementation • Findings (EMNLP) 2021 • Yixuan Su, Zaiqiao Meng, Simon Baker, Nigel Collier
Neural table-to-text generation models have achieved remarkable progress on an array of tasks.
1 code implementation • EACL 2021 • Yixuan Su, Deng Cai, Yan Wang, David Vandyke, Simon Baker, Piji Li, Nigel Collier
In this work, we show that BERT can be employed as the backbone of a non-autoregressive generation (NAG) model to greatly improve performance.
1 code implementation • ACL 2021 • Yixuan Su, Deng Cai, Qingyu Zhou, Zibo Lin, Simon Baker, Yunbo Cao, Shuming Shi, Nigel Collier, Yan Wang
As for IC, it progressively strengthens the model's ability to identify mismatching information between the dialogue context and a response candidate.
Ranked #3 on Conversational Response Selection on RRS
no code implementations • 5 Apr 2020 • Yixuan Su, Yan Wang, Simon Baker, Deng Cai, Xiaojiang Liu, Anna Korhonen, Nigel Collier
A stylistic response generator then takes the prototype and the desired language style as model input to obtain a high-quality and stylistic response.
no code implementations • 5 Apr 2020 • Yixuan Su, Deng Cai, Yan Wang, Simon Baker, Anna Korhonen, Nigel Collier, Xiaojiang Liu
To enable a better balance between content quality and style, we introduce a new training strategy, known as Information-Guided Reinforcement Learning (IG-RL).