Search Results for author: Lingjiao Chen

Found 20 papers, 4 papers with code

Are More LLM Calls All You Need? Towards Scaling Laws of Compound Inference Systems

no code implementations4 Mar 2024 Lingjiao Chen, Jared Quincy Davis, Boris Hanin, Peter Bailis, Ion Stoica, Matei Zaharia, James Zou

We find empirically that across multiple language tasks, surprisingly, Voting Inference Systems' performance first increases but then decreases as a function of the number of LLM calls.

Language Modelling Large Language Model

Data Acquisition: A New Frontier in Data-centric AI

no code implementations22 Nov 2023 Lingjiao Chen, Bilge Acun, Newsha Ardalani, Yifan Sun, Feiyang Kang, Hanrui Lyu, Yongchan Kwon, Ruoxi Jia, Carole-Jean Wu, Matei Zaharia, James Zou

As Machine Learning (ML) systems continue to grow, the demand for relevant and comprehensive datasets becomes imperative.

How is ChatGPT's behavior changing over time?

4 code implementations18 Jul 2023 Lingjiao Chen, Matei Zaharia, James Zou

We find that the performance and behavior of both GPT-3. 5 and GPT-4 can vary greatly over time.

Code Generation Language Modelling +3

FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance

no code implementations9 May 2023 Lingjiao Chen, Matei Zaharia, James Zou

There is a rapidly growing number of large language models (LLMs) that users can query for a fee.

SEAL : Interactive Tool for Systematic Error Analysis and Labeling

no code implementations11 Oct 2022 Nazneen Rajani, Weixin Liang, Lingjiao Chen, Meg Mitchell, James Zou

With the advent of Transformers, large language models (LLMs) have saturated well-known NLP benchmarks and leaderboards with high aggregate performance.

HAPI: A Large-scale Longitudinal Dataset of Commercial ML API Predictions

1 code implementation18 Sep 2022 Lingjiao Chen, Zhihua Jin, Sabri Eyuboglu, Christopher Ré, Matei Zaharia, James Zou

HAPI is the first large-scale dataset of ML API usages and is a unique resource for studying ML-as-a-service (MLaaS).

object-detection Object Detection +4

Estimating and Explaining Model Performance When Both Covariates and Labels Shift

no code implementations18 Sep 2022 Lingjiao Chen, Matei Zaharia, James Zou

We further propose SEES, an algorithmic framework to characterize the distribution shift under SJS and to estimate a model's performance on new data without any labels.

Did the Model Change? Efficiently Assessing Machine Learning API Shifts

no code implementations29 Jul 2021 Lingjiao Chen, Tracy Cai, Matei Zaharia, James Zou

This motivated us to formulate the API shift assessment problem at a more fine-grained level as estimating how the API model's confusion matrix changes over time when the data distribution is constant.

BIG-bench Machine Learning

Efficient Online ML API Selection for Multi-Label Classification Tasks

no code implementations18 Feb 2021 Lingjiao Chen, Matei Zaharia, James Zou

In this work, we propose FrugalMCT, a principled framework that adaptively selects the APIs to use for different data in an online fashion while respecting user's budget.

General Classification Multi-Label Classification +7

The Effect of Network Width on the Performance of Large-batch Training

no code implementations NeurIPS 2018 Lingjiao Chen, Hongyi Wang, Jinman Zhao, Dimitris Papailiopoulos, Paraschos Koutris

Distributed implementations of mini-batch stochastic gradient descent (SGD) suffer from communication overheads, attributed to the high frequency of gradient updates inherent in small-batch training.

Model-based Pricing for Machine Learning in a Data Marketplace

no code implementations26 May 2018 Lingjiao Chen, Paraschos Koutris, Arun Kumar

Finally, we conduct extensive experiments, which validate that the MBP framework can provide high revenue to the seller, high affordability to the buyer, and also operate on low runtime cost.

BIG-bench Machine Learning

DRACO: Byzantine-resilient Distributed Training via Redundant Gradients

1 code implementation ICML 2018 Lingjiao Chen, Hongyi Wang, Zachary Charles, Dimitris Papailiopoulos

Distributed model training is vulnerable to byzantine system failures and adversarial compute nodes, i. e., nodes that use malicious updates to corrupt the global model stored at a parameter server (PS).

The Manifold Assumption and Defenses Against Adversarial Perturbations

no code implementations ICLR 2018 Xi Wu, Uyeong Jang, Lingjiao Chen, Somesh Jha

Interestingly, we find that a recent objective by Madry et al. encourages training a model that satisfies well our formal version of the goodness property, but has a weak control of points that are wrong but with low confidence.

Reinforcing Adversarial Robustness using Model Confidence Induced by Adversarial Training

no code implementations ICML 2018 Xi Wu, Uyeong Jang, Jiefeng Chen, Lingjiao Chen, Somesh Jha

In this paper we study leveraging confidence information induced by adversarial training to reinforce adversarial robustness of a given adversarially trained model.

Adversarial Robustness

Tuple-oriented Compression for Large-scale Mini-batch Stochastic Gradient Descent

no code implementations22 Feb 2017 Fengan Li, Lingjiao Chen, Yijing Zeng, Arun Kumar, Jeffrey F. Naughton, Jignesh M. Patel, Xi Wu

We fill this crucial research gap by proposing a new lossless compression scheme we call tuple-oriented compression (TOC) that is inspired by an unlikely source, the string/text compression scheme Lempel-Ziv-Welch, but tailored to MGD in a way that preserves tuple boundaries within mini-batches.

Data Compression Open-Ended Question Answering +1

Cannot find the paper you are looking for? You can Submit a new open access paper.