Search Results for author: Wentao Wu

Found 21 papers, 12 papers with code

Revisiting Differentially Private Regression: Lessons From Learning Theory and their Consequences

no code implementations • 20 Dec 2015 • Xi Wu, Matthew Fredrikson, Wentao Wu, Somesh Jha, Jeffrey F. Naughton

Perhaps more importantly, our theory reveals that the most basic mechanism in differential privacy, output perturbation, can be used to obtain a better tradeoff for all convex-Lipschitz-bounded learning tasks.

Learning Theory regression

MLBench: How Good Are Machine Learning Clouds for Binary Classification Tasks on Structured Data?

no code implementations • 29 Jul 2017 • Yu Liu, Hantian Zhang, Luyuan Zeng, Wentao Wu, Ce Zhang

We then compare the performance of the top winning code available from Kaggle with that of running machine learning clouds from both Azure and Amazon on mlbench.

BIG-bench Machine Learning Binary Classification +1

Ease.ml: Towards Multi-tenant Resource Sharing for Machine Learning Workloads

no code implementations • 24 Aug 2017 • Tian Li, Jie Zhong, Ji Liu, Wentao Wu, Ce Zhang

We ask, as a "service provider" that manages a shared cluster of machines among all our users running machine learning workloads, what is the resource allocation strategy that maximizes the global satisfaction of all our users?

Bayesian Optimization BIG-bench Machine Learning +4

Continuous Integration of Machine Learning Models with ease.ml/ci: Towards a Rigorous Yet Practical Treatment

no code implementations • 1 Mar 2019 • Cedric Renggli, Bojan Karlaš, Bolin Ding, Feng Liu, Kevin Schawinski, Wentao Wu, Ce Zhang

Continuous integration is an indispensable step of modern software engineering practices to systematically manage the life cycles of system development.

2k BIG-bench Machine Learning

Data Science through the looking glass and what we found there

no code implementations • 19 Dec 2019 • Fotis Psallidas, Yiwen Zhu, Bojan Karlas, Matteo Interlandi, Avrilia Floratou, Konstantinos Karanasos, Wentao Wu, Ce Zhang, Subru Krishnan, Carlo Curino, Markus Weimer

The recent success of machine learning (ML) has led to an explosive growth both in terms of new systems and algorithms built in industry and academia, and new applications built by an ever-growing community of data science (DS) practitioners.

Nearest Neighbor Classifiers over Incomplete Information: From Certain Answers to Certain Predictions

1 code implementation • 11 May 2020 • Bojan Karlaš, Peng Li, Renzhi Wu, Nezihe Merve Gürel, Xu Chu, Wentao Wu, Ce Zhang

Machine learning (ML) applications have been thriving recently, largely attributed to the increasing availability of data.

BIG-bench Machine Learning

Automatic Feasibility Study via Data Quality Analysis for ML: A Case-Study on Label Noise

2 code implementations • 16 Oct 2020 • Cedric Renggli, Luka Rimanic, Luka Kolar, Wentao Wu, Ce Zhang

In our experience of working with domain experts who are using today's AutoML systems, a common problem we encountered is what we call "unrealistic expectations" -- when users are facing a very challenging task with a noisy data acquisition process, while being expected to achieve startlingly high accuracy with machine learning (ML).

AutoML BIG-bench Machine Learning

A Data Quality-Driven View of MLOps

no code implementations • 15 Feb 2021 • Cedric Renggli, Luka Rimanic, Nezihe Merve Gürel, Bojan Karlaš, Wentao Wu, Ce Zhang

Developing machine learning models can be seen as a process similar to the one established for traditional software development.

BIG-bench Machine Learning

Towards Demystifying Serverless Machine Learning Training

1 code implementation • 17 May 2021 • Jiawei Jiang, Shaoduo Gan, Yue Liu, Fanlin Wang, Gustavo Alonso, Ana Klimovic, Ankit Singla, Wentao Wu, Ce Zhang

The appeal of serverless (FaaS) has triggered a growing interest on how to use it in data-intensive applications such as ETL, query processing, or machine learning (ML).

BIG-bench Machine Learning

OpenBox: A Generalized Black-box Optimization Service

6 code implementations • 1 Jun 2021 • Yang Li, Yu Shen, Wentao Zhang, Yuanwei Chen, Huaijun Jiang, Mingchao Liu, Jiawei Jiang, Jinyang Gao, Wentao Wu, Zhi Yang, Ce Zhang, Bin Cui

Black-box optimization (BBO) has a broad range of applications, including automatic machine learning, engineering, physics, and experimental design.

Experimental Design Transfer Learning

VolcanoML: Speeding up End-to-End AutoML via Scalable Search Space Decomposition

3 code implementations • 19 Jul 2021 • Yang Li, Yu Shen, Wentao Zhang, Jiawei Jiang, Bolin Ding, Yaliang Li, Jingren Zhou, Zhi Yang, Wentao Wu, Ce Zhang, Bin Cui

End-to-end AutoML has attracted intensive interests from both academia and industry, which automatically searches for ML pipelines in a space induced by feature engineering, algorithm/model selection, and hyper-parameter tuning.

AutoML Feature Engineering +1

Data Debugging with Shapley Importance over End-to-End Machine Learning Pipelines

1 code implementation • 23 Apr 2022 • Bojan Karlaš, David Dao, Matteo Interlandi, Bo Li, Sebastian Schelter, Wentao Wu, Ce Zhang

We present DataScope (ease.ml/datascope), the first system that efficiently computes Shapley values of training examples over an end-to-end ML pipeline, and illustrate its applications in data debugging for ML training.

BIG-bench Machine Learning Fairness

Stochastic Gradient Descent without Full Data Shuffle

1 code implementation • 12 Jun 2022 • Lijie Xu, Shuang Qiu, Binhang Yuan, Jiawei Jiang, Cedric Renggli, Shaoduo Gan, Kaan Kara, Guoliang Li, Ji Liu, Wentao Wu, Jieping Ye, Ce Zhang

In this paper, we first conduct a systematic empirical study on existing data shuffling strategies, which reveals that all existing strategies have room for improvement -- they all suffer in terms of I/O performance or convergence rate.

Computational Efficiency

MOFI: Learning Image Representations from Noisy Entity Annotated Images

1 code implementation • 13 Jun 2023 • Wentao Wu, Aleksei Timofeev, Chen Chen, BoWen Zhang, Kun Duan, Shuangning Liu, Yantao Zheng, Jonathon Shlens, Xianzhi Du, Zhe Gan, Yinfei Yang

Our approach involves employing a named entity recognition model to extract entities from the alt-text, and then using a CLIP model to select the correct entities as labels of the paired image.

Image Classification Image Retrieval +3

ML-Powered Index Tuning: An Overview of Recent Progress and Open Challenges

no code implementations • 25 Aug 2023 • Tarique Siddiqui, Wentao Wu

The scale and complexity of workloads in modern cloud services have brought into sharper focus a critical challenge in automated index tuning -- the need to recommend high-quality indexes while maintaining index tuning scalability.

Structural Information Guided Multimodal Pre-training for Vehicle-centric Perception

1 code implementation • 15 Dec 2023 • Xiao Wang, Wentao Wu, Chenglong Li, Zhicheng Zhao, Zhe Chen, Yukai Shi, Jin Tang

To address this issue, we propose a novel vehicle-centric pre-training framework called VehicleMAE, which incorporates the structural information including the spatial structure from vehicle profile information and the semantic structure from informative high-level natural language descriptions for effective masked vehicle appearance reconstruction.

Budget-aware Query Tuning: An AutoML Perspective

no code implementations • 29 Mar 2024 • Wentao Wu, Chi Wang

We further extend our study from tuning a single query to tuning a workload with multiple queries, and we call this generalized problem budget-aware workload tuning (WT), which aims for minimizing the execution time of the entire workload.

AutoML

State Space Model for New-Generation Network Alternative to Transformers: A Survey

1 code implementation • 15 Apr 2024 • Xiao Wang, Shiao Wang, Yuhe Ding, Yuehang Li, Wentao Wu, Yao Rong, Weizhe Kong, Ju Huang, Shihao Li, Haoxiang Yang, Ziwen Wang, Bo Jiang, Chenglong Li, YaoWei Wang, Yonghong Tian, Jin Tang

In this paper, we give the first comprehensive review of these works and also provide experimental comparisons and analysis to better demonstrate the features and advantages of SSM.
