Search Results for author: Alvin Cheung

Found 18 papers, 11 papers with code

Mélange: Cost Efficient Large Language Model Serving by Exploiting GPU Heterogeneity

1 code implementation • 22 Apr 2024 • Tyler Griggs, Xiaoxuan Liu, Jiaxiang Yu, Doyoung Kim, Wei-Lin Chiang, Alvin Cheung, Ion Stoica

Within this space, we show that GPU cost and performance are not linearly related, and identify three key LLM service characteristics that significantly affect which GPU type is the most cost-effective: model request size, request rate, and latency service-level objective (SLO).
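
The trade-off can be made concrete with a toy cost model. The sketch below is not Mélange's allocation algorithm; all GPU names, prices, throughputs, and latencies are hypothetical placeholders. It simply picks the cheapest GPU configuration that still meets a latency SLO for a given request size and request rate, which is enough to show how the cost-optimal GPU type flips as the SLO tightens.

```python
# Toy sketch of cost-aware GPU selection; NOT Mélange's algorithm.
# All GPU profiles below are hypothetical placeholder numbers.
import math
from dataclasses import dataclass

@dataclass
class GpuProfile:
    name: str
    hourly_cost: float      # $/GPU/hour (hypothetical)
    tokens_per_sec: float   # sustained throughput at this request size (hypothetical)
    latency_sec: float      # per-request latency at this request size (hypothetical)

def cheapest_config(profiles, request_tokens, requests_per_sec, latency_slo_sec):
    """Return (profile, num_gpus, cost_per_hour) for the cheapest config meeting the SLO."""
    best = None
    demand = request_tokens * requests_per_sec  # offered load in tokens/sec
    for p in profiles:
        if p.latency_sec > latency_slo_sec:
            continue  # this GPU type cannot meet the latency SLO at all
        num_gpus = max(1, math.ceil(demand / p.tokens_per_sec))
        cost = num_gpus * p.hourly_cost
        if best is None or cost < best[2]:
            best = (p, num_gpus, cost)
    return best

profiles = [
    GpuProfile("small-gpu", hourly_cost=1.0, tokens_per_sec=900, latency_sec=0.8),
    GpuProfile("large-gpu", hourly_cost=4.0, tokens_per_sec=5000, latency_sec=0.3),
]
# A loose SLO favors the cheap GPU; tightening the SLO flips the choice.
print(cheapest_config(profiles, request_tokens=512, requests_per_sec=3.0, latency_slo_sec=1.0))
print(cheapest_config(profiles, request_tokens=512, requests_per_sec=3.0, latency_slo_sec=0.5))
```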

Language Modelling

Evaluation of LLMs on Syntax-Aware Code Fill-in-the-Middle Tasks

1 code implementation • 7 Mar 2024 • Linyuan Gong, Sida Wang, Mostafa Elhoushi, Alvin Cheung

We introduce Syntax-Aware Fill-In-the-Middle (SAFIM), a new benchmark for evaluating Large Language Models (LLMs) on the code Fill-in-the-Middle (FIM) task.

Code Completion

AST-T5: Structure-Aware Pretraining for Code Generation and Understanding

1 code implementation • 5 Jan 2024 • Linyuan Gong, Mostafa Elhoushi, Alvin Cheung

Large language models (LLMs) have made significant advancements in code-related tasks, yet many LLMs treat code as simple sequences, neglecting its structured nature.
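
To make the "structured nature" point concrete, here is a minimal illustration using Python's standard `ast` module (not AST-T5 itself): the same snippet viewed as a flat token sequence versus as a syntax tree.

```python
# Flat-sequence view vs. tree-structured view of the same code snippet.
import ast

source = "def area(r):\n    return 3.14159 * r * r\n"

# Flat view: just a sequence of tokens, with no explicit structure.
print(source.split())

# Structured view: a tree of nodes (FunctionDef -> Return -> BinOp ...),
# the kind of structure that structure-aware pretraining can exploit.
tree = ast.parse(source)
print(ast.dump(tree, indent=2))
```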

Code Generation

Online Speculative Decoding

no code implementations • 11 Oct 2023 • Xiaoxuan Liu, Lanxiang Hu, Peter Bailis, Ion Stoica, Zhijie Deng, Alvin Cheung, Hao Zhang

We develop a prototype of online speculative decoding based on online knowledge distillation and evaluate it using both synthetic and real query data on several popular LLMs.
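
For context, speculative decoding lets a small draft model propose several tokens that a larger target model then verifies; the online variant additionally keeps distilling the draft model on live query traffic. The sketch below shows only the basic propose-and-verify loop with toy stand-in models (greedy verification), not the paper's implementation.

```python
# Schematic propose-and-verify loop of speculative decoding.
# The "models" are toy stand-ins, not real LLMs; greedy verification only.
import random

def draft_model(prefix, k):
    """Cheap draft model: proposes k candidate next tokens."""
    return [random.choice("0123456789") for _ in range(k)]

def target_model(prefix):
    """Expensive target model: returns the 'correct' next token for a prefix."""
    return str(len(prefix) % 10)

def speculative_decode(prompt, steps=5, k=4):
    out = list(prompt)
    for _ in range(steps):
        proposal = draft_model(out, k)
        # In a real system the target model scores all k drafted tokens in one
        # batched forward pass; here we just check them one by one.
        for tok in proposal:
            if tok == target_model(out):
                out.append(tok)          # accepted draft token: "free" progress
            else:
                break                    # first mismatch ends the accepted run
        out.append(target_model(out))    # the target model always adds one token
    return "".join(out)

print(speculative_decode("42"))
```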

Knowledge Distillation

Spatialyze: A Geospatial Video Analytics System with Spatial-Aware Optimizations

1 code implementation • 7 Aug 2023 • Chanwut Kittivorawong, Yongming Ge, Yousef Helal, Alvin Cheung

In this paper, we describe Spatialyze, a new framework for end-to-end querying of geospatial videos.

Management

SlimFit: Memory-Efficient Fine-Tuning of Transformer-based Models Using Training Dynamics

no code implementations • 29 May 2023 • Arash Ardakani, Altan Haan, Shangyin Tan, Doru Thom Popovici, Alvin Cheung, Costin Iancu, Koushik Sen

This allows SlimFit to freeze up to 95% of layers and reduce the overall on-device GPU memory usage of transformer-based models such as ViT and BERT by an average of 2.2x, across different NLP and CV benchmarks/datasets such as GLUE, SQuAD 2.0, CIFAR-10, CIFAR-100 and ImageNet, with an average degradation of 0.2% in accuracy.
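
As a rough illustration of the memory-saving mechanism (selective layer freezing), the PyTorch sketch below freezes a fixed fraction of layers using a placeholder score. SlimFit's actual freezing criterion is derived from training dynamics; the model, scores, and fractions here are arbitrary.

```python
# Illustrative layer freezing for memory-efficient fine-tuning.
# The freezing criterion is a placeholder, not SlimFit's training-dynamics heuristic.
import torch
import torch.nn as nn

model = nn.Sequential(*[nn.Linear(128, 128) for _ in range(10)])

# Placeholder per-layer "activity" scores (random here; SlimFit uses training dynamics).
scores = {i: torch.rand(1).item() for i, _ in enumerate(model)}

freeze_fraction = 0.8  # freeze the 80% least "active" layers
num_frozen = int(freeze_fraction * len(model))
frozen_ids = set(sorted(scores, key=scores.get)[:num_frozen])

for i, layer in enumerate(model):
    for p in layer.parameters():
        p.requires_grad = i not in frozen_ids  # frozen layers get no gradients

# Only trainable parameters go to the optimizer, so optimizer state
# (and the gradients kept for those layers) shrinks accordingly.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-4)
print(f"frozen {num_frozen}/{len(model)} layers, {len(trainable)} trainable tensors")
```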

Quantization · Scheduling

Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers

1 code implementation • 21 May 2023 • Linyuan Gong, Chenyan Xiong, Xiaodong Liu, Payal Bajaj, Yiqing Xie, Alvin Cheung, Jianfeng Gao, Xia Song

This paper explores the effectiveness of model-generated signals in improving zero-shot generalization of text-to-text Transformers such as T5.

Zero-shot Generalization

An Evaluation of Memory Optimization Methods for Training Neural Networks

no code implementations • 26 Mar 2023 • Xiaoxuan Liu, Siddharth Jha, Alvin Cheung

To address this challenge, this paper summarizes the scenarios in which memory optimization methods (MOMs) prove advantageous for model training.

Quantization

ADELT: Transpilation Between Deep Learning Frameworks

no code implementations • 7 Mar 2023 • Linyuan Gong, Jiayi Wang, Alvin Cheung

We propose the Adversarial DEep Learning Transpiler (ADELT), a novel approach to source-to-source transpilation between deep learning frameworks.

NumS: Scalable Array Programming for the Cloud

1 code implementation • 28 Jun 2022 • Melih Elibol, Vinamra Benara, Samyu Yagati, Lianmin Zheng, Alvin Cheung, Michael I. Jordan, Ion Stoica

LSHS is a local search method which optimizes operator placement by minimizing maximum memory and network load on any given node within a distributed system.
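
The placement objective can be illustrated with a small greedy stand-in for that idea (this is not NumS's LSHS implementation; operator costs, units, and cluster size below are made up): each operator is assigned to the node that keeps the maximum per-node memory and network load smallest.

```python
# Greedy stand-in for load-balancing operator placement; NOT NumS's LSHS code.
def place_operators(op_costs, num_nodes):
    """op_costs: list of (memory_cost, network_cost) per operator (hypothetical units)."""
    mem_load = [0.0] * num_nodes
    net_load = [0.0] * num_nodes
    placement = []
    for mem, net in op_costs:
        best_node, best_objective = None, None
        for node in range(num_nodes):
            # Objective: worst (max) memory or network load across all nodes
            # if this operator were placed on `node`.
            objective = max(
                max(mem_load[i] + (mem if i == node else 0.0) for i in range(num_nodes)),
                max(net_load[i] + (net if i == node else 0.0) for i in range(num_nodes)),
            )
            if best_objective is None or objective < best_objective:
                best_node, best_objective = node, objective
        mem_load[best_node] += mem
        net_load[best_node] += net
        placement.append(best_node)
    return placement

# Four operators spread over two nodes so that no node's load dominates.
print(place_operators([(4, 1), (2, 3), (3, 2), (5, 1)], num_nodes=2))
```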

Regression · Scheduling

GACT: Activation Compressed Training for Generic Network Architectures

1 code implementation • 22 Jun 2022 • Xiaoxuan Liu, Lianmin Zheng, Dequan Wang, Yukuo Cen, Weize Chen, Xu Han, Jianfei Chen, Zhiyuan Liu, Jie Tang, Joey Gonzalez, Michael Mahoney, Alvin Cheung

Training large neural network (NN) models requires extensive memory resources, and Activation Compressed Training (ACT) is a promising approach to reduce training memory footprint.

Falx: Synthesis-Powered Visualization Authoring

no code implementations • 1 Feb 2021 • Chenglong Wang, Yu Feng, Rastislav Bodik, Isil Dillig, Alvin Cheung, Amy J. Ko

Modern visualization tools aim to allow data analysts to easily create exploratory visualizations.

Human-Computer Interaction · Programming Languages

New Directions in Cloud Programming

1 code implementation • 4 Jan 2021 • Alvin Cheung, Natacha Crooks, Joseph M. Hellerstein, Matthew Milano

Nearly twenty years after the launch of AWS, it remains difficult for most developers to harness the enormous potential of the cloud.

Program Synthesis · Distributed, Parallel, and Cluster Computing · Databases · Operating Systems · Programming Languages

Learning Programmatic Idioms for Scalable Semantic Parsing

no code implementations • IJCNLP 2019 • Srinivasan Iyer, Alvin Cheung, Luke Zettlemoyer

Programmers typically organize executable source code using high-level coding patterns or idiomatic structures such as nested loops, exception handlers and recursive blocks, rather than as individual code tokens.

Code Generation · Semantic Parsing

Mapping Language to Code in Programmatic Context

1 code implementation • EMNLP 2018 • Srinivasan Iyer, Ioannis Konstas, Alvin Cheung, Luke Zettlemoyer

To study this phenomenon, we introduce the task of generating class member functions given English documentation and the programmatic context provided by the rest of the class.

Learning a Neural Semantic Parser from User Feedback

no code implementations • ACL 2017 • Srinivasan Iyer, Ioannis Konstas, Alvin Cheung, Jayant Krishnamurthy, Luke Zettlemoyer

We present an approach for rapidly and easily building natural language interfaces to databases for new domains; their performance improves over time based on user feedback, requiring minimal intervention.

SQL Parsing
