1 code implementation • 16 Oct 2023 • Manley Roberts, Himanshu Thakur, Christine Herlihy, Colin White, Samuel Dooley
Recent claims about the impressive abilities of large language models (LLMs) are often supported by evaluating on publicly available benchmarks.
1 code implementation • 27 Jul 2023 • Colin White, Renbo Tu, Jean Kossaifi, Gennady Pekhimenko, Kamyar Azizzadenesheli, Anima Anandkumar
In this work, we (i) profile memory and runtime for FNO with full and mixed-precision training, (ii) conduct a study on the numerical stability of mixed-precision training of FNO, and (iii) devise a training routine which substantially decreases training time and memory usage (up to 34%), with little or no reduction in accuracy, on the Navier-Stokes and Darcy flow equations.
no code implementations • 20 Jan 2023 • Colin White, Mahmoud Safari, Rhea Sukthanker, Binxin Ru, Thomas Elsken, Arber Zela, Debadeepta Dey, Frank Hutter
Specialized, high-performing neural architectures are crucial to the success of deep learning in these areas.
no code implementations • 2 Nov 2022 • Vishak Prasad C, Colin White, Paarth Jain, Sibasis Nayak, Ganesh Ramakrishnan
A majority of recent developments in neural architecture search (NAS) have been aimed at decreasing the computational cost of various techniques without affecting their final performance.
1 code implementation • NeurIPS 2023 • Rhea Sanjay Sukthanker, Samuel Dooley, John P. Dickerson, Colin White, Frank Hutter, Micah Goldblum
Our search outputs a suite of models which Pareto-dominate all other high-performance architectures and existing bias mitigation methods in terms of accuracy and fairness, often by large margins, on the two most widely used datasets for face identification, CelebA and VGGFace2.
1 code implementation • 7 Oct 2022 • Renbo Tu, Nicholas Roberts, Vishak Prasad, Sibasis Nayak, Paarth Jain, Frederic Sala, Ganesh Ramakrishnan, Ameet Talwalkar, Willie Neiswanger, Colin White
The challenge that climate change poses to humanity has spurred a rapidly developing field of artificial intelligence research focused on climate change applications.
1 code implementation • 6 Oct 2022 • Arjun Krishnakumar, Colin White, Arber Zela, Renbo Tu, Mahmoud Safari, Frank Hutter
Zero-cost proxies (ZC proxies) are a recent architecture performance prediction technique aiming to significantly speed up algorithms for neural architecture search (NAS).
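To illustrate the idea (not this paper's specific proxies), a zero-cost proxy scores an untrained architecture so that candidates can be ranked without any training. A minimal sketch, using the simple parameter-count baseline that ZC proxies are commonly compared against, on a hypothetical toy search space of dense-layer widths:

```python
# Minimal sketch of a zero-cost proxy (hypothetical setup): score candidate
# architectures at initialization, without any training, and rank them.
# Each "architecture" here is just a list of layer widths; the proxy is the
# total parameter count -- one of the simplest baselines ZC proxies are
# compared against.

def param_count_proxy(widths):
    """Total weight count of a dense network with the given layer widths."""
    return sum(a * b for a, b in zip(widths, widths[1:]))

candidates = {
    "arch_a": [32, 64, 10],
    "arch_b": [32, 128, 64, 10],
    "arch_c": [32, 16, 10],
}

scores = {name: param_count_proxy(w) for name, w in candidates.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)  # architectures ordered by proxy score, highest first
```

A real ZC proxy replaces the score with a quantity computed from one forward/backward pass at initialization; the ranking-based usage stays the same.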
1 code implementation • 23 Jun 2022 • Duncan McElfresh, Sujay Khandagale, Jonathan Valverde, John P. Dickerson, Colin White
By using far more meta-training data than prior work, RecZilla is able to substantially reduce the level of human involvement when faced with a new recommender system application.
1 code implementation • ICLR 2022 • Yash Mehta, Colin White, Arber Zela, Arjun Krishnakumar, Guri Zabergja, Shakiba Moradian, Mahmoud Safari, Kaicheng Yu, Frank Hutter
The release of tabular benchmarks, such as NAS-Bench-101 and NAS-Bench-201, has significantly lowered the computational overhead for conducting scientific research in neural architecture search (NAS).
1 code implementation • NeurIPS 2021 • Shen Yan, Colin White, Yash Savani, Frank Hutter
While early research in neural architecture search (NAS) required extreme computational resources, the recent releases of tabular and surrogate benchmarks have greatly increased the speed and reproducibility of NAS research.
1 code implementation • 23 Jun 2021 • Yang Liu, Sujay Khandagale, Colin White, Willie Neiswanger
In this work, we address this issue by releasing XAI-Bench: a suite of synthetic datasets along with a library for benchmarking feature attribution algorithms.
1 code implementation • NeurIPS 2021 • Colin White, Arber Zela, Binxin Ru, Yang Liu, Frank Hutter
Early methods in the rapidly developing field of neural architecture search (NAS) required fully training thousands of neural networks.
2 code implementations • NeurIPS 2020 • Colin White, Willie Neiswanger, Sam Nolen, Yash Savani
First, we formally define architecture encodings and give a theoretical characterization of the scalability of the encodings we study. Then, we identify the main encoding-dependent subroutines that NAS algorithms employ, running experiments to show which encodings work best with each subroutine for many popular algorithms.
3 code implementations • NeurIPS 2020 • Yash Savani, Colin White, Naveen Sundar Govindarajulu
Intra-processing methods are designed specifically to debias large models which have been trained on a generic dataset and fine-tuned on a more specific task.
2 code implementations • 6 May 2020 • Colin White, Sam Nolen, Yash Savani
In this work, we show that (1) the simplest hill-climbing algorithm is a powerful baseline for NAS, and (2) when the noise in popular NAS benchmark datasets is reduced to a minimum, hill-climbing outperforms many popular state-of-the-art algorithms.
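The hill-climbing baseline can be sketched in a few lines. This is a toy illustration, not the paper's exact setup: architectures are bit strings, a neighbor flips one bit, and `evaluate` is a hypothetical stand-in for a noise-free validation-accuracy lookup in a benchmark table.

```python
import random

# Minimal hill-climbing sketch over a toy search space (hypothetical
# stand-ins for a NAS benchmark): architectures are bit strings, a neighbor
# flips one bit, and `evaluate` plays the role of a noise-free validation
# accuracy lookup.

random.seed(0)

def evaluate(arch):
    # Toy objective rewarding set bits; stands in for a benchmark lookup.
    return sum(arch)

def neighbors(arch):
    for i in range(len(arch)):
        flipped = list(arch)
        flipped[i] ^= 1
        yield tuple(flipped)

def hill_climb(n_bits=8, max_steps=100):
    current = tuple(random.randint(0, 1) for _ in range(n_bits))
    for _ in range(max_steps):
        best = max(neighbors(current), key=evaluate)
        if evaluate(best) <= evaluate(current):
            break  # local optimum reached
        current = best
    return current

print(hill_climb())  # converges to the all-ones architecture on this toy objective
```

With a noisy objective, the `<=` comparison would frequently accept spurious moves or stop early, which is why denoising the benchmark matters for this baseline.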
3 code implementations • 25 Oct 2019 • Colin White, Willie Neiswanger, Yash Savani
Bayesian optimization (BO), which has long had success in hyperparameter optimization, has recently emerged as a very promising strategy for NAS when it is coupled with a neural predictor.
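The predictor-guided loop can be illustrated with a toy sketch. Everything here is hypothetical: a least-squares model stands in for the neural predictor, greedy selection stands in for a proper acquisition function, and `true_accuracy` stands in for an expensive training run.

```python
import numpy as np

# Toy sketch of predictor-guided search in the spirit of BO for NAS
# (hypothetical setup): fit a simple least-squares predictor on the
# architectures evaluated so far, then evaluate the candidate the predictor
# ranks highest. A real system would use a neural predictor and an
# acquisition function that accounts for uncertainty.

rng = np.random.default_rng(0)

def true_accuracy(x):
    # Toy ground truth standing in for an expensive training run.
    return float(x @ np.array([0.5, -0.2, 0.3, 0.1]))

pool = rng.integers(0, 2, size=(50, 4)).astype(float)  # candidate encodings
X, y = list(pool[:5]), [true_accuracy(x) for x in pool[:5]]  # seed evaluations

for _ in range(10):
    w, *_ = np.linalg.lstsq(np.array(X), np.array(y), rcond=None)
    preds = pool @ w                       # predictor scores all candidates
    best = pool[int(np.argmax(preds))]     # greedy "acquisition"
    X.append(best)
    y.append(true_accuracy(best))

print(max(y))  # best accuracy found by the predictor-guided loop
```

The key design point carried over from BO is that the cheap predictor, refit after every evaluation, decides which expensive evaluation to run next.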
no code implementations • 25 Sep 2019 • Colin White, Willie Neiswanger, Yash Savani
We develop a path-based encoding scheme to featurize the neural architectures that are used to train the neural network model.
no code implementations • NeurIPS 2018 • Maria-Florina Balcan, Travis Dick, Colin White
The design of algorithms for clustering points in metric spaces is a long-studied area of research.
no code implementations • 19 May 2017 • Maria-Florina Balcan, Colin White
The typical idea is to design a clustering algorithm that outputs a near-optimal solution, provided the data satisfy a natural stability notion.
no code implementations • 2 Mar 2017 • Pranjal Awasthi, Ainesh Bakshi, Maria-Florina Balcan, Colin White, David Woodruff
In this work, we study the $k$-median and $k$-means clustering problems when the data is distributed across many servers and can contain outliers.
no code implementations • 14 Nov 2016 • Maria-Florina Balcan, Vaishnavh Nagarajan, Ellen Vitercik, Colin White
We address this problem for clustering, max-cut, and other partitioning problems, such as integer quadratic programming, by designing computationally efficient and sample efficient learning algorithms which receive samples from an application-specific distribution over problem instances and learn a partitioning algorithm with high expected performance.
no code implementations • 30 May 2016 • Maria-Florina Balcan, Ellen Vitercik, Colin White
However, for real-valued functions, cardinal labels might not be accessible, or it may be difficult for an expert to consistently assign real-valued labels over the entire set of examples.
no code implementations • 15 Dec 2015 • Travis Dick, Mu Li, Venkata Krishna Pillutla, Colin White, Maria-Florina Balcan, Alex Smola
In distributed machine learning, data is dispatched to multiple machines for processing.
no code implementations • 14 May 2015 • Maria-Florina Balcan, Nika Haghtalab, Colin White
In this work, we take this approach and provide strong positive results for both the asymmetric and symmetric $k$-center problems under a natural input stability (promise) condition called $\alpha$-perturbation resilience [Bilu and Linial, 2012], which states that the optimal solution does not change under any $\alpha$-factor perturbation to the input distances.