Search Results for author: Yixuan Su

Found 28 papers, 18 papers with code

StarCoder 2 and The Stack v2: The Next Generation

no code implementations • 29 Feb 2024 • Anton Lozhkov, Raymond Li, Loubna Ben allal, Federico Cassano, Joel Lamy-Poirier, Nouamane Tazi, Ao Tang, Dmytro Pykhtar, Jiawei Liu, Yuxiang Wei, Tianyang Liu, Max Tian, Denis Kocetkov, Arthur Zucker, Younes Belkada, Zijian Wang, Qian Liu, Dmitry Abulkhanov, Indraneil Paul, Zhuang Li, Wen-Ding Li, Megan Risdal, Jia Li, Jian Zhu, Terry Yue Zhuo, Evgenii Zheltonozhskii, Nii Osae Osae Dade, Wenhao Yu, Lucas Krauß, Naman jain, Yixuan Su, Xuanli He, Manan Dey, Edoardo Abati, Yekun Chai, Niklas Muennighoff, Xiangru Tang, Muhtasham Oblokulov, Christopher Akiki, Marc Marone, Chenghao Mou, Mayank Mishra, Alex Gu, Binyuan Hui, Tri Dao, Armel Zebaze, Olivier Dehaene, Nicolas Patry, Canwen Xu, Julian McAuley, Han Hu, Torsten Scholak, Sebastien Paquet, Jennifer Robinson, Carolyn Jane Anderson, Nicolas Chapados, Mostofa Patwary, Nima Tajbakhsh, Yacine Jernite, Carlos Muñoz Ferrandis, Lingming Zhang, Sean Hughes, Thomas Wolf, Arjun Guha, Leandro von Werra, Harm de Vries

Our large model, StarCoder2-15B, significantly outperforms other models of comparable size.

Ranked #24 on Code Generation on MBPP

Code Completion Code Generation +1

Unlocking Structure Measuring: Introducing PDD, an Automatic Metric for Positional Discourse Coherence

1 code implementation • 15 Feb 2024 • Yinhong Liu, Yixuan Su, Ehsan Shareghi, Nigel Collier

Recent large language models (LLMs) have shown remarkable performance in aligning generated text with user intentions across various tasks.

Coherence Evaluation Text Generation

Instruct-SCTG: Guiding Sequential Controlled Text Generation through Instructions

no code implementations • 19 Dec 2023 • Yinhong Liu, Yixuan Su, Ehsan Shareghi, Nigel Collier

Instruction-tuned large language models have shown remarkable performance in aligning generated text with user intentions across various tasks.

Text Generation

Specialist or Generalist? Instruction Tuning for Specific NLP Tasks

no code implementations • 23 Oct 2023 • Chufan Shi, Yixuan Su, Cheng Yang, Yujiu Yang, Deng Cai

Although instruction tuning has proven to be a data-efficient method for transforming LLMs into such generalist models, their performance still lags behind specialist models trained exclusively for specific tasks.

Specificity

Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models

1 code implementation • 31 Aug 2023 • Yupan Huang, Zaiqiao Meng, Fangyu Liu, Yixuan Su, Nigel Collier, Yutong Lu

Our experiments validate the effectiveness of SparklesChat in understanding and reasoning across multiple images and dialogue turns.

Instruction Following Visual Reasoning

PandaGPT: One Model To Instruction-Follow Them All

1 code implementation • 25 May 2023 • Yixuan Su, Tian Lan, Huayang Li, Jialu Xu, Yan Wang, Deng Cai

To do so, PandaGPT combines the multimodal encoders from ImageBind and the large language models from Vicuna.

Instruction Following

726

Biomedical Named Entity Recognition via Dictionary-based Synonym Generalization

1 code implementation • 22 May 2023 • Zihao Fu, Yixuan Su, Zaiqiao Meng, Nigel Collier

To alleviate the need for human effort, dictionary-based approaches have been proposed to extract named entities simply based on a given dictionary.

Named Entity Recognition

COFFEE: A Contrastive Oracle-Free Framework for Event Extraction

no code implementations • 25 Mar 2023 • Meiru Zhang, Yixuan Su, Zaiqiao Meng, Zihao Fu, Nigel Collier

In this study, we consider a more realistic setting of this task, namely the Oracle-Free Event Extraction (OFEE) task, where only the input context is given without any oracle information, including event type, event ontology and trigger word.

Event Extraction

Plug-and-Play Recipe Generation with Content Planning

no code implementations • 9 Dec 2022 • Yinhong Liu, Yixuan Su, Ehsan Shareghi, Nigel Collier

Specifically, it optimizes the joint distribution of the natural language sequence and the global content plan in a plug-and-play manner.

Recipe Generation Sentence +1

Momentum Decoding: Open-ended Text Generation As Graph Exploration

1 code implementation • 5 Dec 2022 • Tian Lan, Yixuan Su, Shuhang Liu, Heyan Huang, Xian-Ling Mao

In this study, we formulate open-ended text generation from a new perspective, i.e., we view it as an exploration process within a directed graph.

Text Generation

An Empirical Study On Contrastive Search And Contrastive Decoding For Open-ended Text Generation

2 code implementations • 19 Nov 2022 • Yixuan Su, Jialu Xu

In the study, we empirically compare the two recently proposed decoding methods, i.e., Contrastive Search (CS) and Contrastive Decoding (CD), for open-ended text generation.

Text Generation

114

Contrastive Search Is What You Need For Neural Text Generation

2 code implementations • 25 Oct 2022 • Yixuan Su, Nigel Collier

Based on our findings, we further assess the contrastive search decoding method using off-the-shelf LMs on four generation tasks across 16 languages.

Contrastive Learning Language Modelling +1

444

From Easy to Hard: A Dual Curriculum Learning Framework for Context-Aware Document Ranking

1 code implementation • 22 Aug 2022 • Yutao Zhu, Jian-Yun Nie, Yixuan Su, Haonan Chen, Xinyu Zhang, Zhicheng Dou

In this work, we propose a curriculum learning framework for context-aware document ranking, in which the ranking model learns matching signals between the search context and the candidate document in an easy-to-hard manner.

Document Ranking

Language Models Can See: Plugging Visual Controls in Text Generation

1 code implementation • 5 May 2022 • Yixuan Su, Tian Lan, Yahui Liu, Fangyu Liu, Dani Yogatama, Yan Wang, Lingpeng Kong, Nigel Collier

MAGIC is a flexible framework and is theoretically compatible with any text generation tasks that incorporate image grounding.

Image Captioning Image-text matching +3

250

A Contrastive Framework for Neural Text Generation

2 code implementations • 13 Feb 2022 • Yixuan Su, Tian Lan, Yan Wang, Dani Yogatama, Lingpeng Kong, Nigel Collier

Text generation is of great importance to many natural language processing applications.

Text Generation

444

Measuring and Reducing Model Update Regression in Structured Prediction for NLP

no code implementations • 7 Feb 2022 • Deng Cai, Elman Mansimov, Yi-An Lai, Yixuan Su, Lei Shu, Yi Zhang

First, we measure and analyze model update regression in different model update settings.

Dependency Parsing Knowledge Distillation +4

A Survey on Retrieval-Augmented Text Generation

no code implementations • 2 Feb 2022 • Huayang Li, Yixuan Su, Deng Cai, Yan Wang, Lemao Liu

Recently, retrieval-augmented text generation attracted increasing attention of the computational linguistics community.

Machine Translation Response Generation +3

TaCL: Improving BERT Pre-training with Token-aware Contrastive Learning

2 code implementations • Findings (NAACL) 2022 • Yixuan Su, Fangyu Liu, Zaiqiao Meng, Tian Lan, Lei Shu, Ehsan Shareghi, Nigel Collier

Masked language models (MLMs) such as BERT and RoBERTa have revolutionized the field of Natural Language Understanding in the past few years.

Contrastive Learning Natural Language Understanding

Rewire-then-Probe: A Contrastive Recipe for Probing Biomedical Knowledge of Pre-trained Language Models

1 code implementation • ACL 2022 • Zaiqiao Meng, Fangyu Liu, Ehsan Shareghi, Yixuan Su, Charlotte Collins, Nigel Collier

To catalyse the research in this direction, we release a well-curated biomedical knowledge probing benchmark, MedLAMA, which is constructed based on the Unified Medical Language System (UMLS) Metathesaurus.

Knowledge Probing Transfer Learning

Exploring Dense Retrieval for Dialogue Response Selection

1 code implementation • 13 Oct 2021 • Tian Lan, Deng Cai, Yan Wang, Yixuan Su, Heyan Huang, Xian-Ling Mao

In this study, we present a solution to directly select proper responses from a large corpus or even a nonparallel corpus that only consists of unpaired sentences, using a dense retrieval model.

Conversational Response Selection Retrieval

Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System

2 code implementations • ACL 2022 • Yixuan Su, Lei Shu, Elman Mansimov, Arshit Gupta, Deng Cai, Yi-An Lai, Yi Zhang

Pre-trained language models have been recently shown to benefit task-oriented dialogue (TOD) systems.

Dialogue State Tracking End-To-End Dialogue Modelling +2

150

Plan-then-Generate: Controlled Data-to-Text Generation via Planning

2 code implementations • Findings (EMNLP) 2021 • Yixuan Su, David Vandyke, Sihui Wang, Yimai Fang, Nigel Collier

However, the lack of ability of neural models to control the structure of generated output can be limiting in certain real-world applications.

Data-to-Text Generation Sentence

419

Few-Shot Table-to-Text Generation with Prototype Memory

1 code implementation • Findings (EMNLP) 2021 • Yixuan Su, Zaiqiao Meng, Simon Baker, Nigel Collier

Neural table-to-text generation models have achieved remarkable progress on an array of tasks.

Table-to-Text Generation

Non-Autoregressive Text Generation with Pre-trained Language Models

1 code implementation • EACL 2021 • Yixuan Su, Deng Cai, Yan Wang, David Vandyke, Simon Baker, Piji Li, Nigel Collier

In this work, we show that BERT can be employed as the backbone of a NAG model to greatly improve performance.

Machine Translation Sentence +3

Dialogue Response Selection with Hierarchical Curriculum Learning

1 code implementation • ACL 2021 • Yixuan Su, Deng Cai, Qingyu Zhou, Zibo Lin, Simon Baker, Yunbo Cao, Shuming Shi, Nigel Collier, Yan Wang

As for IC, it progressively strengthens the model's ability in identifying the mismatching information between the dialogue context and a response candidate.

Ranked #3 on Conversational Response Selection on RRS

Conversational Response Selection

Prototype-to-Style: Dialogue Generation with Style-Aware Editing on Retrieval Memory

no code implementations • 5 Apr 2020 • Yixuan Su, Yan Wang, Simon Baker, Deng Cai, Xiaojiang Liu, Anna Korhonen, Nigel Collier

A stylistic response generator then takes the prototype and the desired language style as model input to obtain a high-quality and stylistic response.

Dialogue Generation Information Retrieval +1

Stylistic Dialogue Generation via Information-Guided Reinforcement Learning Strategy

no code implementations • 5 Apr 2020 • Yixuan Su, Deng Cai, Yan Wang, Simon Baker, Anna Korhonen, Nigel Collier, Xiaojiang Liu

To enable better balance between the content quality and the style, we introduce a new training strategy, known as Information-Guided Reinforcement Learning (IG-RL).

Dialogue Generation Reinforcement Learning +2
