Search Results for author: Terry Yue Zhuo

Found 20 papers, 12 papers with code

XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts

1 code implementation • 23 Apr 2024 • Yifeng Ding, Jiawei Liu, Yuxiang Wei, Terry Yue Zhuo, Lingming Zhang

We introduce XFT, a simple yet powerful training scheme, by simply merging upcycled Mixture-of-Experts (MoE) to unleash the performance limit of instruction-tuned code Large Language Models (LLMs).

Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order

no code implementations • 30 Mar 2024 • Taishi Nakamura, Mayank Mishra, Simone Tedeschi, Yekun Chai, Jason T Stillerman, Felix Friedrich, Prateek Yadav, Tanmay Laud, Vu Minh Chien, Terry Yue Zhuo, Diganta Misra, Ben Bogin, Xuan-Son Vu, Marzena Karpinska, Arnav Varma Dantuluri, Wojciech Kusa, Tommaso Furlanello, Rio Yokota, Niklas Muennighoff, Suhas Pai, Tosin Adewumi, Veronika Laippala, Xiaozhe Yao, Adalberto Junior, Alpay Ariyak, Aleksandr Drozd, Jordan Clive, Kshitij Gupta, Liangyu Chen, Qi Sun, Ken Tsui, Noah Persaud, Nour Fahmy, Tianlong Chen, Mohit Bansal, Nicolo Monti, Tai Dang, Ziyang Luo, Tien-Tung Bui, Roberto Navigli, Virendra Mehta, Matthew Blumberg, Victor May, Huu Nguyen, Sampo Pyysalo

Pretrained language models underpin several AI applications, but their high computational cost for training limits accessibility.

Continual Pretraining, Language Modelling

StarCoder 2 and The Stack v2: The Next Generation

no code implementations • 29 Feb 2024 • Anton Lozhkov, Raymond Li, Loubna Ben allal, Federico Cassano, Joel Lamy-Poirier, Nouamane Tazi, Ao Tang, Dmytro Pykhtar, Jiawei Liu, Yuxiang Wei, Tianyang Liu, Max Tian, Denis Kocetkov, Arthur Zucker, Younes Belkada, Zijian Wang, Qian Liu, Dmitry Abulkhanov, Indraneil Paul, Zhuang Li, Wen-Ding Li, Megan Risdal, Jia Li, Jian Zhu, Terry Yue Zhuo, Evgenii Zheltonozhskii, Nii Osae Osae Dade, Wenhao Yu, Lucas Krauß, Naman jain, Yixuan Su, Xuanli He, Manan Dey, Edoardo Abati, Yekun Chai, Niklas Muennighoff, Xiangru Tang, Muhtasham Oblokulov, Christopher Akiki, Marc Marone, Chenghao Mou, Mayank Mishra, Alex Gu, Binyuan Hui, Tri Dao, Armel Zebaze, Olivier Dehaene, Nicolas Patry, Canwen Xu, Julian McAuley, Han Hu, Torsten Scholak, Sebastien Paquet, Jennifer Robinson, Carolyn Jane Anderson, Nicolas Chapados, Mostofa Patwary, Nima Tajbakhsh, Yacine Jernite, Carlos Muñoz Ferrandis, Lingming Zhang, Sean Hughes, Thomas Wolf, Arjun Guha, Leandro von Werra, Harm de Vries

Our large model, StarCoder2-15B, significantly outperforms other models of comparable size.

Ranked #25 on Code Generation on MBPP

Code Completion, Code Generation, +1

Astraios: Parameter-Efficient Instruction Tuning Code Large Language Models

2 code implementations • 1 Jan 2024 • Terry Yue Zhuo, Armel Zebaze, Nitchakarn Suppattarachai, Leandro von Werra, Harm de Vries, Qian Liu, Niklas Muennighoff

Through investigations across 5 tasks and 8 different datasets encompassing both code comprehension and code generation tasks, we find that FFT generally leads to the best downstream performance across all scales, and PEFT methods differ significantly in their efficacy based on the model scale.

Code Generation

Can ChatGPT Perform Reasoning Using the IRAC Method in Analyzing Legal Scenarios Like a Lawyer?

1 code implementation • 23 Oct 2023 • Xiaoxi Kang, Lizhen Qu, Lay-Ki Soon, Adnan Trakic, Terry Yue Zhuo, Patrick Charles Emerton, Genevieve Grant

Each scenario in the corpus is annotated with a complete IRAC analysis in a semi-structured format so that both machines and legal professionals are able to interpret and understand the annotations.

Legal Reasoning

Fake News Detectors are Biased against Texts Generated by Large Language Models

no code implementations • 15 Sep 2023 • Jinyan Su, Terry Yue Zhuo, Jonibek Mansurov, Di Wang, Preslav Nakov

The spread of fake news has emerged as a critical challenge, undermining trust and posing threats to society.

Misinformation

Pop Quiz! Do Pre-trained Code Models Possess Knowledge of Correct API Names?

no code implementations • 14 Sep 2023 • Terry Yue Zhuo, Xiaoning Du, Zhenchang Xing, Jiamou Sun, Haowei Quan, Li Li, Liming Zhu

The correctness and unambiguity of API usage among these code models are crucial for achieving desirable program functionalities, requiring them to learn various API fully qualified names structurally and semantically.

Code Generation, Knowledge Probing

OctoPack: Instruction Tuning Code Large Language Models

2 code implementations • 14 Aug 2023 • Niklas Muennighoff, Qian Liu, Armel Zebaze, Qinkai Zheng, Binyuan Hui, Terry Yue Zhuo, Swayam Singh, Xiangru Tang, Leandro von Werra, Shayne Longpre

We benchmark CommitPack against other natural and synthetic code instructions (xP3x, Self-Instruct, OASST) on the 16B parameter StarCoder model, and achieve state-of-the-art performance among models not trained on OpenAI outputs, on the HumanEval Python benchmark (46.2% pass@1).

Ranked #5 on Code Generation on HumanEval

Code Generation, Code Repair

Source Code Data Augmentation for Deep Learning: A Survey

1 code implementation • 31 May 2023 • Terry Yue Zhuo, Zhou Yang, Zhensu Sun, YuFei Wang, Li Li, Xiaoning Du, Zhenchang Xing, David Lo

This paper fills this gap by conducting a comprehensive and integrative survey of data augmentation for source code, wherein we systematically compile and encapsulate existing literature to provide a comprehensive overview of the field.

Data Augmentation

FACTUAL: A Benchmark for Faithful and Consistent Textual Scene Graph Parsing

1 code implementation • 27 May 2023 • Zhuang Li, Yuyang Chai, Terry Yue Zhuo, Lizhen Qu, Gholamreza Haffari, Fei Li, Donghong Ji, Quan Hung Tran

Textual scene graph parsing has become increasingly important in various vision-language applications, including image caption evaluation and image retrieval.

Ranked #2 on Human Judgment Correlation on Flickr8k-Expert

DetectLLM: Leveraging Log Rank Information for Zero-Shot Detection of Machine-Generated Text

1 code implementation • 23 May 2023 • Jinyan Su, Terry Yue Zhuo, Di Wang, Preslav Nakov

One is called DetectLLM-LRR, which is fast and efficient, and the other is called DetectLLM-NPR, which is more accurate, but slower due to the need for perturbations.

Misinformation

StarCoder: may the source be with you!

4 code implementations • 9 May 2023 • Raymond Li, Loubna Ben allal, Yangtian Zi, Niklas Muennighoff, Denis Kocetkov, Chenghao Mou, Marc Marone, Christopher Akiki, Jia Li, Jenny Chim, Qian Liu, Evgenii Zheltonozhskii, Terry Yue Zhuo, Thomas Wang, Olivier Dehaene, Mishig Davaadorj, Joel Lamy-Poirier, João Monteiro, Oleh Shliazhko, Nicolas Gontier, Nicholas Meade, Armel Zebaze, Ming-Ho Yee, Logesh Kumar Umapathi, Jian Zhu, Benjamin Lipkin, Muhtasham Oblokulov, Zhiruo Wang, Rudra Murthy, Jason Stillerman, Siva Sankalp Patel, Dmitry Abulkhanov, Marco Zocca, Manan Dey, Zhihan Zhang, Nour Fahmy, Urvashi Bhattacharyya, Wenhao Yu, Swayam Singh, Sasha Luccioni, Paulo Villegas, Maxim Kunakov, Fedor Zhdanov, Manuel Romero, Tony Lee, Nadav Timor, Jennifer Ding, Claire Schlesinger, Hailey Schoelkopf, Jan Ebert, Tri Dao, Mayank Mishra, Alex Gu, Jennifer Robinson, Carolyn Jane Anderson, Brendan Dolan-Gavitt, Danish Contractor, Siva Reddy, Daniel Fried, Dzmitry Bahdanau, Yacine Jernite, Carlos Muñoz Ferrandis, Sean Hughes, Thomas Wolf, Arjun Guha, Leandro von Werra, Harm de Vries

The BigCode community, an open-scientific collaboration working on the responsible development of Large Language Models for Code (Code LLMs), introduces StarCoder and StarCoderBase: 15.5B parameter models with 8K context length, infilling capabilities and fast large-batch inference enabled by multi-query attention.

Ranked #43 on Code Generation on MBPP

8k, Code Generation

ICE-Score: Instructing Large Language Models to Evaluate Code

2 code implementations • 27 Apr 2023 • Terry Yue Zhuo

Token-matching-based metrics, such as BLEU, have demonstrated weak correlations with human practitioners in code intelligence tasks.

Code Generation, Machine Translation, +1

Training-free Lexical Backdoor Attacks on Language Models

1 code implementation • 8 Feb 2023 • Yujin Huang, Terry Yue Zhuo, Qiongkai Xu, Han Hu, Xingliang Yuan, Chunyang Chen

In this work, we propose Training-Free Lexical Backdoor Attack (TFLexAttack) as the first training-free backdoor attack on language models.

Backdoor Attack, Data Poisoning, +1

Red teaming ChatGPT via Jailbreaking: Bias, Robustness, Reliability and Toxicity

no code implementations • 30 Jan 2023 • Terry Yue Zhuo, Yujin Huang, Chunyang Chen, Zhenchang Xing

We believe that our findings may shed light on future efforts to determine and mitigate the ethical hazards posed by machines in LLM applications.

Ethics, Language Modelling

On Robustness of Prompt-based Semantic Parsing with Large Pre-trained Language Model: An Empirical Study on Codex

no code implementations • 30 Jan 2023 • Terry Yue Zhuo, Zhuang Li, Yujin Huang, Fatemeh Shiri, Weiqing Wang, Gholamreza Haffari, Yuan-Fang Li

Semantic parsing is a technique aimed at constructing a structured representation of the meaning of a natural-language question.

Adversarial Robustness, Language Modelling, +1

SantaCoder: don't reach for the stars!

5 code implementations • 9 Jan 2023 • Loubna Ben allal, Raymond Li, Denis Kocetkov, Chenghao Mou, Christopher Akiki, Carlos Munoz Ferrandis, Niklas Muennighoff, Mayank Mishra, Alex Gu, Manan Dey, Logesh Kumar Umapathi, Carolyn Jane Anderson, Yangtian Zi, Joel Lamy Poirier, Hailey Schoelkopf, Sergey Troshin, Dmitry Abulkhanov, Manuel Romero, Michael Lappert, Francesco De Toni, Bernardo García del Río, Qian Liu, Shamik Bose, Urvashi Bhattacharyya, Terry Yue Zhuo, Ian Yu, Paulo Villegas, Marco Zocca, Sourab Mangrulkar, David Lansky, Huu Nguyen, Danish Contractor, Luis Villa, Jia Li, Dzmitry Bahdanau, Yacine Jernite, Sean Hughes, Daniel Fried, Arjun Guha, Harm de Vries, Leandro von Werra

The BigCode project is an open-scientific collaboration working on the responsible development of large language models for code.

Code Generation

ViLPAct: A Benchmark for Compositional Generalization on Multimodal Human Activities

no code implementations • 11 Oct 2022 • Terry Yue Zhuo, Yaqing Liao, Yuecheng Lei, Lizhen Qu, Gerard de Melo, Xiaojun Chang, Yazhou Ren, Zenglin Xu

We introduce ViLPAct, a novel vision-language benchmark for human activity planning.

Rethinking Round-Trip Translation for Machine Translation Evaluation

1 code implementation • 15 Sep 2022 • Terry Yue Zhuo, Qiongkai Xu, Xuanli He, Trevor Cohn

Round-trip translation could serve as a clever and straightforward technique to alleviate the requirement of the parallel evaluation corpus.

Machine Translation, Translation

Paraphrasing Techniques for Maritime QA system

no code implementations • 21 Mar 2022 • Fatemeh Shiri, Terry Yue Zhuo, Zhuang Li, Van Nguyen, Shirui Pan, Weiqing Wang, Reza Haffari, Yuan-Fang Li

In this paper, we investigate how to exploit paraphrasing methods for the automated generation of large-scale training datasets (in the form of paraphrased utterances and their corresponding logical forms in SQL format) and present our experimental results using real-world data in the maritime domain.

