no code implementations • 7 Feb 2020 • Rui Liu, Sanjay Krishnan, Aaron J. Elmore, Michael J. Franklin
As neural networks are increasingly employed in machine learning practice, how to efficiently share limited training resources among a diverse set of model training tasks becomes a crucial issue.
no code implementations • 27 Nov 2018 • Ryan Chard, Zhuozhao Li, Kyle Chard, Logan Ward, Yadu Babuji, Anna Woodard, Steve Tuecke, Ben Blaiszik, Michael J. Franklin, Ian Foster
Here we present the Data and Learning Hub for science (DLHub), a multi-tenant system that provides both model repository and serving capabilities with a focus on science applications.
no code implementations • 9 Dec 2016 • Daniel Crankshaw, Xin Wang, Giulio Zhou, Michael J. Franklin, Joseph E. Gonzalez, Ion Stoica
In this paper, we introduce Clipper, a general-purpose low-latency prediction serving system.
no code implementations • 29 Oct 2016 • Evan R. Sparks, Shivaram Venkataraman, Tomer Kaftan, Michael J. Franklin, Benjamin Recht
Modern advanced analytics applications make use of machine learning techniques and contain multiple steps of domain-specific and general-purpose processing with high resource requirements.
no code implementations • 10 Mar 2016 • Francois W. Belletti, Evan R. Sparks, Michael J. Franklin, Alexandre M. Bayen, Joseph E. Gonzalez
Linear causal analysis is central to a wide range of important applications spanning finance, the physical sciences, and engineering.
no code implementations • 15 Jan 2016 • Sanjay Krishnan, Jiannan Wang, Eugene Wu, Michael J. Franklin, Ken Goldberg
Data cleaning is often an important step to ensure that predictive models, such as regression and classification, are not affected by systematic errors such as inconsistent, out-of-date, or outlier data.
no code implementations • 26 May 2015 • Xiangrui Meng, Joseph Bradley, Burak Yavuz, Evan Sparks, Shivaram Venkataraman, Davies Liu, Jeremy Freeman, DB Tsai, Manish Amde, Sean Owen, Doris Xin, Reynold Xin, Michael J. Franklin, Reza Zadeh, Matei Zaharia, Ameet Talwalkar
Apache Spark is a popular open-source platform for large-scale data processing that is well-suited for iterative machine learning tasks.
no code implementations • 31 Jan 2015 • Evan R. Sparks, Ameet Talwalkar, Michael J. Franklin, Michael I. Jordan, Tim Kraska
The proliferation of massive datasets combined with the development of sophisticated analytical techniques has enabled a wide variety of novel applications such as improved product recommendations, automatic image tagging, and improved speech-driven interfaces.
2 code implementations • 12 Sep 2014 • Daniel Crankshaw, Peter Bailis, Joseph E. Gonzalez, Haoyuan Li, Zhao Zhang, Michael J. Franklin, Ali Ghodsi, Michael I. Jordan
In this work, we present Velox, a new component of the Berkeley Data Analytics Stack.
Databases
no code implementations • 21 Oct 2013 • Evan R. Sparks, Ameet Talwalkar, Virginia Smith, Jey Kottalam, Xinghao Pan, Joseph Gonzalez, Michael J. Franklin, Michael I. Jordan, Tim Kraska
MLI is an Application Programming Interface designed to address the challenges of building Machine Learning algorithms in a distributed setting based on data-centric computing.
no code implementations • 17 Sep 2012 • Barzan Mozafari, Purnamrita Sarkar, Michael J. Franklin, Michael I. Jordan, Samuel Madden
Based on this observation, we present two new active learning algorithms to combine humans and algorithms together in a crowd-sourced database.