Search Results for author: Raymond Li

Found 20 papers, 11 papers with code

T3-Vis: visual analytic for Training and fine-Tuning Transformers in NLP

1 code implementation • EMNLP (ACL) 2021 • Raymond Li, Wen Xiao, Lanjun Wang, Hyeju Jang, Giuseppe Carenini

Transformers are the dominant architecture in NLP, but their training and fine-tuning is still very challenging.

SQL-Encoder: Improving NL2SQL In-Context Learning Through a Context-Aware Encoder

no code implementations • 24 Mar 2024 • Mohammadreza Pourreza, Davood Rafiei, Yuxi Feng, Raymond Li, Zhenan Fan, Weiwei Zhang

Furthermore, compared to these competitive models, our proposed encoder enhances the downstream performance of NL2SQL models in 1-shot in-context learning scenarios by 1-2% for GPT-3.5-turbo, 4-8% for CodeLlama-7B, and 2-3% for CodeLlama-13B.

In-Context Learning

StarCoder 2 and The Stack v2: The Next Generation

no code implementations • 29 Feb 2024 • Anton Lozhkov, Raymond Li, Loubna Ben allal, Federico Cassano, Joel Lamy-Poirier, Nouamane Tazi, Ao Tang, Dmytro Pykhtar, Jiawei Liu, Yuxiang Wei, Tianyang Liu, Max Tian, Denis Kocetkov, Arthur Zucker, Younes Belkada, Zijian Wang, Qian Liu, Dmitry Abulkhanov, Indraneil Paul, Zhuang Li, Wen-Ding Li, Megan Risdal, Jia Li, Jian Zhu, Terry Yue Zhuo, Evgenii Zheltonozhskii, Nii Osae Osae Dade, Wenhao Yu, Lucas Krauß, Naman jain, Yixuan Su, Xuanli He, Manan Dey, Edoardo Abati, Yekun Chai, Niklas Muennighoff, Xiangru Tang, Muhtasham Oblokulov, Christopher Akiki, Marc Marone, Chenghao Mou, Mayank Mishra, Alex Gu, Binyuan Hui, Tri Dao, Armel Zebaze, Olivier Dehaene, Nicolas Patry, Canwen Xu, Julian McAuley, Han Hu, Torsten Scholak, Sebastien Paquet, Jennifer Robinson, Carolyn Jane Anderson, Nicolas Chapados, Mostofa Patwary, Nima Tajbakhsh, Yacine Jernite, Carlos Muñoz Ferrandis, Lingming Zhang, Sean Hughes, Thomas Wolf, Arjun Guha, Leandro von Werra, Harm de Vries

Our large model, StarCoder2-15B, significantly outperforms other models of comparable size.

Ranked #25 on Code Generation on MBPP

Code Completion Code Generation +1

Visual Analytics for Generative Transformer Models

no code implementations • 21 Nov 2023 • Raymond Li, Ruixin Yang, Wen Xiao, Ahmed Aburaed, Gabriel Murray, Giuseppe Carenini

While transformer-based models have achieved state-of-the-art results in a variety of classification and generation tasks, their black-box nature makes them challenging for interpretability.

Mixture-of-Linguistic-Experts Adapters for Improving and Interpreting Pre-trained Language Models

no code implementations • 24 Oct 2023 • Raymond Li, Gabriel Murray, Giuseppe Carenini

In this work, we propose a method that combines two popular research areas by injecting linguistic structures into pre-trained language models in the parameter-efficient fine-tuning (PEFT) setting.

Diversity-Aware Coherence Loss for Improving Neural Topic Models

1 code implementation • 25 May 2023 • Raymond Li, Felipe González-Pizarro, Linzi Xing, Gabriel Murray, Giuseppe Carenini

The standard approach for neural topic modeling uses a variational autoencoder (VAE) framework that jointly minimizes the KL divergence between the estimated posterior and prior, in addition to the reconstruction loss.

Topic Models

StarCoder: may the source be with you!

4 code implementations • 9 May 2023 • Raymond Li, Loubna Ben allal, Yangtian Zi, Niklas Muennighoff, Denis Kocetkov, Chenghao Mou, Marc Marone, Christopher Akiki, Jia Li, Jenny Chim, Qian Liu, Evgenii Zheltonozhskii, Terry Yue Zhuo, Thomas Wang, Olivier Dehaene, Mishig Davaadorj, Joel Lamy-Poirier, João Monteiro, Oleh Shliazhko, Nicolas Gontier, Nicholas Meade, Armel Zebaze, Ming-Ho Yee, Logesh Kumar Umapathi, Jian Zhu, Benjamin Lipkin, Muhtasham Oblokulov, Zhiruo Wang, Rudra Murthy, Jason Stillerman, Siva Sankalp Patel, Dmitry Abulkhanov, Marco Zocca, Manan Dey, Zhihan Zhang, Nour Fahmy, Urvashi Bhattacharyya, Wenhao Yu, Swayam Singh, Sasha Luccioni, Paulo Villegas, Maxim Kunakov, Fedor Zhdanov, Manuel Romero, Tony Lee, Nadav Timor, Jennifer Ding, Claire Schlesinger, Hailey Schoelkopf, Jan Ebert, Tri Dao, Mayank Mishra, Alex Gu, Jennifer Robinson, Carolyn Jane Anderson, Brendan Dolan-Gavitt, Danish Contractor, Siva Reddy, Daniel Fried, Dzmitry Bahdanau, Yacine Jernite, Carlos Muñoz Ferrandis, Sean Hughes, Thomas Wolf, Arjun Guha, Leandro von Werra, Harm de Vries

The BigCode community, an open-scientific collaboration working on the responsible development of Large Language Models for Code (Code LLMs), introduces StarCoder and StarCoderBase: 15.5B parameter models with 8K context length, infilling capabilities and fast large-batch inference enabled by multi-query attention.

Ranked #43 on Code Generation on MBPP

8k Code Generation

NL4Opt Competition: Formulating Optimization Problems Based on Their Natural Language Descriptions

1 code implementation • 14 Mar 2023 • Rindranirina Ramamonjison, Timothy T. Yu, Raymond Li, Haley Li, Giuseppe Carenini, Bissan Ghaddar, Shiqi He, Mahdi Mostajabdaveh, Amin Banitalebi-Dehkordi, Zirui Zhou, Yong Zhang

The Natural Language for Optimization (NL4Opt) Competition was created to investigate methods of extracting the meaning and formulation of an optimization problem based on its text description.

Language Modelling Large Language Model

SantaCoder: don't reach for the stars!

5 code implementations • 9 Jan 2023 • Loubna Ben allal, Raymond Li, Denis Kocetkov, Chenghao Mou, Christopher Akiki, Carlos Munoz Ferrandis, Niklas Muennighoff, Mayank Mishra, Alex Gu, Manan Dey, Logesh Kumar Umapathi, Carolyn Jane Anderson, Yangtian Zi, Joel Lamy Poirier, Hailey Schoelkopf, Sergey Troshin, Dmitry Abulkhanov, Manuel Romero, Michael Lappert, Francesco De Toni, Bernardo García del Río, Qian Liu, Shamik Bose, Urvashi Bhattacharyya, Terry Yue Zhuo, Ian Yu, Paulo Villegas, Marco Zocca, Sourab Mangrulkar, David Lansky, Huu Nguyen, Danish Contractor, Luis Villa, Jia Li, Dzmitry Bahdanau, Yacine Jernite, Sean Hughes, Daniel Fried, Arjun Guha, Harm de Vries, Leandro von Werra

The BigCode project is an open-scientific collaboration working on the responsible development of large language models for code.

Code Generation

The Stack: 3 TB of permissively licensed source code

no code implementations • 20 Nov 2022 • Denis Kocetkov, Raymond Li, Loubna Ben allal, Jia Li, Chenghao Mou, Carlos Muñoz Ferrandis, Yacine Jernite, Margaret Mitchell, Sean Hughes, Thomas Wolf, Dzmitry Bahdanau, Leandro von Werra, Harm de Vries

Large Language Models (LLMs) play an ever-increasing role in the field of Artificial Intelligence (AI)--not only for natural language processing but also for code understanding and generation.

OSLAT: Open Set Label Attention Transformer for Medical Entity Retrieval and Span Extraction

1 code implementation • 12 Jul 2022 • Raymond Li, Ilya Valmianski, Li Deng, Xavier Amatriain, Anitha Kannan

In this paper, we propose a method for linking an open set of entities that does not require any span annotations.

Entity Linking Entity Retrieval +1

Evaluating the Text-to-SQL Capabilities of Large Language Models

no code implementations • 15 Mar 2022 • Nitarshan Rajkumar, Raymond Li, Dzmitry Bahdanau

We perform an empirical evaluation of Text-to-SQL capabilities of the Codex language model.

Language Modelling Text-To-SQL

Human Guided Exploitation of Interpretable Attention Patterns in Summarization and Topic Segmentation

1 code implementation • 10 Dec 2021 • Raymond Li, Wen Xiao, Linzi Xing, Lanjun Wang, Gabriel Murray, Giuseppe Carenini

The multi-head self-attention mechanism of the transformer model has been thoroughly investigated recently.

Extractive Summarization Knowledge Distillation

T3-Vis: a visual analytic framework for Training and fine-Tuning Transformers in NLP

1 code implementation • 31 Aug 2021 • Raymond Li, Wen Xiao, Lanjun Wang, Hyeju Jang, Giuseppe Carenini

Transformers are the dominant architecture in NLP, but their training and fine-tuning is still very challenging.

ConVIScope: Visual Analytics for Exploring Patient Conversations

no code implementations • 30 Aug 2021 • Raymond Li, Enamul Hoque, Giuseppe Carenini, Richard Lester, Raymond Chau

The proliferation of text messaging for mobile health is generating a large amount of patient-doctor conversations that can be extremely valuable to health care professionals.

byteSteady: Fast Classification Using Byte-Level n-Gram Embeddings

no code implementations • 24 Jun 2021 • Xiang Zhang, Alexandre Drouin, Raymond Li

This article introduces byteSteady -- a fast model for classification using byte-level n-gram embeddings.

Text Classification

DuoRAT: Towards Simpler Text-to-SQL Models

1 code implementation • NAACL 2021 • Torsten Scholak, Raymond Li, Dzmitry Bahdanau, Harm de Vries, Chris Pal

Recent neural text-to-SQL models can effectively translate natural language questions to corresponding SQL queries on unseen databases.

Text-To-SQL

Video Monitoring Queries

no code implementations • 24 Feb 2020 • Nick Koudas, Raymond Li, Ioannis Xarchakos

We demonstrate that the application of the techniques proposed in conjunction with declarative queries on video streams can dramatically increase the frame processing rate and speed up query processing by at least two orders of magnitude.

Image Classification object-detection +1