no code implementations • 11 Jan 2025 • Yiming Lin, Mawil Hasan, Rohan Kosalge, Alvin Cheung, Aditya G. Parameswaran
Comprehensive evaluations on 34 diverse real-world datasets show that uncovering the template is crucial for data extraction from templatized documents.
no code implementations • 6 Aug 2024 • Charles Hong, Sahil Bhatia, Altan Haan, Shengjun Kris Dong, Dima Nikiforov, Alvin Cheung, Yakun Sophia Shao
Hardware accelerators, in particular accelerators for tensor processing, have many potential application domains.
no code implementations • 20 Jun 2024 • Xiaoxuan Liu, Cade Daniel, Langxiang Hu, Woosuk Kwon, Zhuohan Li, Xiangxi Mo, Alvin Cheung, Zhijie Deng, Ion Stoica, Hao Zhang
SmartSpec dynamically determines the best speculation length for each request (from 0, i.e., no speculation, to many tokens) -- and hence the associated speculative execution cost -- based on a new metric called goodput, which characterizes the current observed load of the entire system and the speculation accuracy.
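The idea of picking a speculation length by maximizing goodput can be sketched as follows. This is a toy model, not SmartSpec's actual cost model: the acceptance rate, per-token draft cost, verification cost, and load factor are all assumed inputs, and the independence assumption on token acceptance is a simplification.

```python
# Toy sketch: choose a speculation length k that maximizes an assumed
# goodput model (expected accepted tokens per step / step cost).

def expected_accepted(k: int, acceptance_rate: float) -> float:
    """Expected accepted tokens when proposing k draft tokens, assuming each
    draft token is accepted independently with the same rate; verification
    always contributes one extra token."""
    return sum(acceptance_rate ** i for i in range(1, k + 1)) + 1.0

def step_cost(k: int, draft_cost: float, verify_cost: float, load: float) -> float:
    """Assumed cost model: k draft forward passes plus one verification pass,
    with verification scaled up under higher system load."""
    return k * draft_cost + verify_cost * (1.0 + load)

def best_speculation_length(acceptance_rate: float, draft_cost: float,
                            verify_cost: float, load: float, max_k: int = 8) -> int:
    """Pick k in [0, max_k] maximizing goodput = accepted tokens / cost."""
    return max(range(0, max_k + 1),
               key=lambda k: expected_accepted(k, acceptance_rate)
                             / step_cost(k, draft_cost, verify_cost, load))
```

Under this model, a low acceptance rate drives the chosen length toward 0 (no speculation), which matches the paper's observation that the best length varies per request.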
1 code implementation • 22 Apr 2024 • Tyler Griggs, Xiaoxuan Liu, Jiaxiang Yu, Doyoung Kim, Wei-Lin Chiang, Alvin Cheung, Ion Stoica
Based on this analysis, we introduce Mélange, a GPU allocation framework that navigates these diverse LLM service characteristics and heterogeneous GPU option space to automatically and efficiently derive the minimal-cost GPU allocation for a given LLM service.
1 code implementation • 7 Mar 2024 • Linyuan Gong, Sida Wang, Mostafa Elhoushi, Alvin Cheung
We introduce Syntax-Aware Fill-In-the-Middle (SAFIM), a new benchmark for evaluating Large Language Models (LLMs) on the code Fill-in-the-Middle (FIM) task.
Ranked #1 on Code Completion on SAFIM
1 code implementation • 5 Jan 2024 • Linyuan Gong, Mostafa Elhoushi, Alvin Cheung
Large language models (LLMs) have made significant advancements in code-related tasks, yet many LLMs treat code as simple sequences, neglecting its structured nature.
1 code implementation • 11 Oct 2023 • Xiaoxuan Liu, Lanxiang Hu, Peter Bailis, Alvin Cheung, Zhijie Deng, Ion Stoica, Hao Zhang
Adapting to the query distribution mitigates the shift between the draft model's training distribution and the query distribution, enabling the draft model to more accurately predict the target model's outputs.
1 code implementation • 7 Aug 2023 • Chanwut Kittivorawong, Yongming Ge, Yousef Helal, Alvin Cheung
In this paper, we describe Spatialyze, a new framework for end-to-end querying of geospatial videos.
no code implementations • 29 May 2023 • Arash Ardakani, Altan Haan, Shangyin Tan, Doru Thom Popovici, Alvin Cheung, Costin Iancu, Koushik Sen
This allows SlimFit to freeze up to 95% of layers and reduce the overall on-device GPU memory usage of transformer-based models such as ViT and BERT by an average of 2.2x across different NLP and CV benchmarks/datasets such as GLUE, SQuAD 2.0, CIFAR-10, CIFAR-100, and ImageNet, with an average accuracy degradation of 0.2%.
1 code implementation • 21 May 2023 • Linyuan Gong, Chenyan Xiong, Xiaodong Liu, Payal Bajaj, Yiqing Xie, Alvin Cheung, Jianfeng Gao, Xia Song
This paper explores the effectiveness of model-generated signals in improving zero-shot generalization of text-to-text Transformers such as T5.
no code implementations • 26 Mar 2023 • Xiaoxuan Liu, Siddharth Jha, Alvin Cheung
To address this challenge, this paper summarizes the scenarios in which MOMs prove advantageous for model training.
no code implementations • 7 Mar 2023 • Linyuan Gong, Jiayi Wang, Alvin Cheung
We propose the Adversarial DEep Learning Transpiler (ADELT), a novel approach to source-to-source transpilation between deep learning frameworks.
1 code implementation • 28 Jun 2022 • Melih Elibol, Vinamra Benara, Samyu Yagati, Lianmin Zheng, Alvin Cheung, Michael I. Jordan, Ion Stoica
LSHS is a local search method that optimizes operator placement by minimizing the maximum memory and network load on any given node within a distributed system.
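A minimal sketch of this kind of local search, assuming a much-simplified setting (memory load only, no network cost, and a simple move heuristic -- not the NumS implementation):

```python
# Toy local search: repeatedly relocate an operator from the most-loaded
# node to the least-loaded node while that strictly reduces the hot
# node's memory load.

def balance(placement: dict, mem: dict, n_nodes: int, max_iters: int = 100) -> dict:
    """placement: op -> node index; mem: op -> memory cost."""
    def loads():
        l = [0.0] * n_nodes
        for op, node in placement.items():
            l[node] += mem[op]
        return l

    for _ in range(max_iters):
        l = loads()
        hot = max(range(n_nodes), key=lambda n: l[n])
        cold = min(range(n_nodes), key=lambda n: l[n])
        moved = False
        # Try the smallest operator on the hot node first.
        for op in sorted((o for o, n in placement.items() if n == hot),
                         key=lambda o: mem[o]):
            if l[cold] + mem[op] < l[hot]:  # strict improvement on the hot node
                placement[op] = cold
                moved = True
                break
        if not moved:
            return placement  # local optimum reached
    return placement
```

The real method additionally accounts for network transfer costs between dependent operators, which this sketch omits.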
1 code implementation • 22 Jun 2022 • Xiaoxuan Liu, Lianmin Zheng, Dequan Wang, Yukuo Cen, Weize Chen, Xu Han, Jianfei Chen, Zhiyuan Liu, Jie Tang, Joey Gonzalez, Michael Mahoney, Alvin Cheung
Training large neural network (NN) models requires extensive memory resources, and Activation Compressed Training (ACT) is a promising approach to reduce training memory footprint.
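The core idea behind activation compression can be illustrated with a simplified sketch. This is a hypothetical per-tensor 8-bit affine quantizer, not the paper's framework: real ACT systems use more sophisticated, often per-group or adaptive, quantization.

```python
# Toy activation compression: quantize saved float activations to one
# byte each on the forward pass (~4x smaller than float32), then
# dequantize an approximation when the backward pass needs them.

def compress(xs: list):
    """Per-tensor affine quantization of float activations to bytes."""
    lo, hi = min(xs), max(xs)
    scale = (hi - lo) / 255.0 or 1.0  # avoid zero scale for constant tensors
    q = bytes(min(255, round((x - lo) / scale)) for x in xs)
    return q, lo, scale

def decompress(q: bytes, lo: float, scale: float) -> list:
    """Recover an approximation of the original activations."""
    return [b * scale + lo for b in q]
```

The reconstruction error is bounded by half the quantization step, which is the kind of perturbation whose effect on gradients such work has to analyze.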
1 code implementation • ACL 2021 • Xinyun Chen, Linyuan Gong, Alvin Cheung, Dawn Song
Creating effective visualization is an important part of data analytics.
no code implementations • 1 Feb 2021 • Chenglong Wang, Yu Feng, Rastislav Bodik, Isil Dillig, Alvin Cheung, Amy J. Ko
Modern visualization tools aim to allow data analysts to easily create exploratory visualizations.
Human-Computer Interaction • Programming Languages
1 code implementation • 4 Jan 2021 • Alvin Cheung, Natacha Crooks, Joseph M. Hellerstein, Matthew Milano
Nearly twenty years after the launch of AWS, it remains difficult for most developers to harness the enormous potential of the cloud.
Program Synthesis • Distributed, Parallel, and Cluster Computing • Databases • Operating Systems • Programming Languages
no code implementations • IJCNLP 2019 • Srinivasan Iyer, Alvin Cheung, Luke Zettlemoyer
Programmers typically organize executable source code using high-level coding patterns or idiomatic structures such as nested loops, exception handlers and recursive blocks, rather than as individual code tokens.
1 code implementation • EMNLP 2018 • Srinivasan Iyer, Ioannis Konstas, Alvin Cheung, Luke Zettlemoyer
To study this phenomenon, we introduce the task of generating class member functions given English documentation and the programmatic context provided by the rest of the class.
no code implementations • ACL 2017 • Srinivasan Iyer, Ioannis Konstas, Alvin Cheung, Jayant Krishnamurthy, Luke Zettlemoyer
We present an approach for rapidly and easily building natural language interfaces to databases for new domains; its performance improves over time based on user feedback, and it requires minimal intervention.
Ranked #1 on SQL Parsing on Restaurants