Search Results for author: Newsha Ardalani

Found 17 papers, 3 papers with code

Deep Learning Scaling is Predictable, Empirically

no code implementations • 1 Dec 2017 • Joel Hestness, Sharan Narang, Newsha Ardalani, Gregory Diamos, Heewoo Jun, Hassan Kianinejad, Md. Mostofa Ali Patwary, Yang Yang, Yanqi Zhou

As DL application domains grow, we would like a deeper understanding of the relationships between training set size, computational scale, and model accuracy improvements to advance the state-of-the-art.

Language Modelling • Machine Translation • +3
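
The paper's headline finding is that generalization error falls as a power law in training-set size. As a rough illustration of the fitting procedure (the numbers below are invented stand-ins, not the paper's measurements), the exponent can be recovered with a log-log linear fit:

```python
# Fit a power-law learning curve eps(m) = alpha * m**(-beta) to
# validation errors measured at several training-set sizes.
# All numbers are illustrative stand-ins.
import numpy as np

train_sizes = np.array([1e4, 1e5, 1e6, 1e7])      # training examples
val_errors  = np.array([0.42, 0.27, 0.17, 0.11])  # hypothetical errors

# A power law is a straight line in log-log space:
# log(eps) = log(alpha) - beta * log(m)
slope, log_alpha = np.polyfit(np.log(train_sizes), np.log(val_errors), 1)
alpha, beta = np.exp(log_alpha), -slope

print(f"fitted: eps(m) = {alpha:.3f} * m^(-{beta:.3f})")
# Extrapolate to a 10x larger dataset
print(f"predicted error at 1e8 examples: {alpha * 1e8 ** (-beta):.4f}")
```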

A Proposed Hierarchy of Deep Learning Tasks

no code implementations • 27 Sep 2018 • Joel Hestness, Sharan Narang, Newsha Ardalani, Heewoo Jun, Hassan Kianinejad, Md. Mostofa Ali Patwary, Yang Yang, Yanqi Zhou, Gregory Diamos, Kenneth Church

As the pace of deep learning innovation accelerates, it becomes increasingly important to organize the space of problems by relative difficulty.

Empirically Characterizing Overparameterization Impact on Convergence

no code implementations • ICLR 2019 • Newsha Ardalani, Joel Hestness, Gregory Diamos

Long-held conventional wisdom holds that larger models train more slowly when using gradient descent.

A Static Analysis-based Cross-Architecture Performance Prediction Using Machine Learning

no code implementations • 18 Jun 2019 • Newsha Ardalani, Urmish Thakker, Aws Albarghouthi, Karu Sankaralingam

Porting code from CPU to GPU is costly and time-consuming; unless much time is invested in development and optimization, it is not obvious, a priori, how much speed-up is achievable or how much room is left for improvement.

BIG-bench Machine Learning • Binary Classification
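
The framing here is to learn, from features of the CPU code alone, whether a GPU port is likely to pay off. A minimal sketch under assumptions: the features, data, and speedup threshold below are invented for illustration and are not the paper's actual feature set or model.

```python
# Train a classifier that predicts, from pre-port program features,
# whether a kernel will see a large GPU speedup. All features and
# labels are hypothetical.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Per-kernel features (illustrative):
# [arithmetic_intensity, branch_density, memory_coalescing_score, loop_trip_count]
X = np.array([
    [8.0, 0.02, 0.9, 1e6],   # compute-bound, regular access
    [0.5, 0.30, 0.2, 1e3],   # branchy, irregular access
    [4.0, 0.05, 0.7, 1e5],
    [0.8, 0.25, 0.3, 1e2],
])
y = np.array([1, 0, 1, 0])   # 1 = at least 3x speedup (illustrative threshold)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
new_kernel = np.array([[6.0, 0.03, 0.8, 5e5]])
print("predicted high-speedup:", bool(clf.predict(new_kernel)[0]))
```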

Beyond Human-Level Accuracy: Computational Challenges in Deep Learning

1 code implementation • 3 Sep 2019 • Joel Hestness, Newsha Ardalani, Greg Diamos

Recent work shows that as dataset sizes grow, DL model accuracy and model size grow predictably.

Model Architecture Controls Gradient Descent Dynamics: A Combinatorial Path-Based Formula

no code implementations • 25 Sep 2019 • Xin Zhou, Newsha Ardalani

However, our theoretical understanding of how model architecture affects performance or accuracy is limited.

Time and the Value of Data

no code implementations • 17 Mar 2022 • Ehsan Valavi, Joel Hestness, Newsha Ardalani, Marco Iansiti

In addition, we argue that increasing the stock of data by including older datasets may, in fact, damage the model's accuracy.

BIG-bench Machine Learning
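
A toy simulation, not taken from the paper, makes the intuition concrete: when the data-generating process drifts over time, a model fit on old plus new data is pulled toward the stale regime and can test worse than a model fit on fresh data alone.

```python
# Compare a model trained on fresh data only against one trained on
# fresh + stale data, evaluated on today's (drifted) distribution.
import numpy as np

rng = np.random.default_rng(0)
# Old regime: y = 1.0 * x; new regime: y = 2.0 * x
x_old = rng.uniform(0, 1, 500); y_old = 1.0 * x_old + rng.normal(0, 0.05, 500)
x_new = rng.uniform(0, 1, 500); y_new = 2.0 * x_new + rng.normal(0, 0.05, 500)

def fit_and_test(x, y):
    slope, intercept = np.polyfit(x, y, 1)
    x_test = rng.uniform(0, 1, 1000)
    y_test = 2.0 * x_test                      # today's regime
    return np.mean((slope * x_test + intercept - y_test) ** 2)

print("MSE, new data only:", fit_and_test(x_new, y_new))
print("MSE, old + new data:", fit_and_test(np.r_[x_old, x_new], np.r_[y_old, y_new]))
```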

Time Dependency, Data Flow, and Competitive Advantage

no code implementations • 17 Mar 2022 • Ehsan Valavi, Joel Hestness, Marco Iansiti, Newsha Ardalani, Feng Zhu, Karim R. Lakhani

Relating the text topics to various business areas of interest, we argue that competing in a business area where data value decays rapidly alters the strategies for acquiring competitive advantage.

Understanding Scaling Laws for Recommendation Models

no code implementations • 17 Aug 2022 • Newsha Ardalani, Carole-Jean Wu, Zeliang Chen, Bhargav Bhushanam, Adnan Aziz

We show that parameter scaling has run out of steam for the model architecture under study; until a higher-performing model architecture emerges, data scaling is the path forward.
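
One way to arrive at such a diagnosis is to fit a separate power law along each scaling axis and compare the exponents; an exponent near zero means that axis has saturated. The sketch below uses invented numbers purely to illustrate the comparison, not the paper's measurements.

```python
# Compare power-law exponents for parameter scaling vs. data scaling.
# A near-zero exponent indicates that scaling axis has saturated.
# All values are illustrative.
import numpy as np

def power_law_exponent(sizes, errors):
    slope, _ = np.polyfit(np.log(sizes), np.log(errors), 1)
    return -slope

params = np.array([1e8, 1e9, 1e10, 1e11])
err_p  = np.array([0.210, 0.206, 0.205, 0.205])  # barely improves
data   = np.array([1e9, 1e10, 1e11, 1e12])
err_d  = np.array([0.240, 0.214, 0.191, 0.171])  # steady gains

print(f"parameter-scaling exponent: {power_law_exponent(params, err_p):.3f}")
print(f"data-scaling exponent:      {power_law_exponent(data, err_d):.3f}")
```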

MP-Rec: Hardware-Software Co-Design to Enable Multi-Path Recommendation

no code implementations • 21 Feb 2023 • Samuel Hsia, Udit Gupta, Bilge Acun, Newsha Ardalani, Pan Zhong, Gu-Yeon Wei, David Brooks, Carole-Jean Wu

Based on our characterization of various embedding representations, we propose a hybrid embedding representation that achieves higher quality embeddings at the cost of increased memory and compute requirements.

Recommendation Systems
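
As a rough sketch of the general idea (not MP-Rec's actual design), a hybrid representation can concatenate a conventional embedding-table lookup with a compact hash-based lookup, spending extra memory and compute to improve embedding quality:

```python
# A minimal hybrid embedding: each sparse ID maps to the concatenation
# of a table-lookup embedding and a hashed (compact) embedding.
import torch
import torch.nn as nn

class HybridEmbedding(nn.Module):
    def __init__(self, num_ids: int, dim: int, num_hash_buckets: int):
        super().__init__()
        self.table = nn.Embedding(num_ids, dim)            # memory-heavy path
        self.hashed = nn.Embedding(num_hash_buckets, dim)  # compact hashed path
        self.num_hash_buckets = num_hash_buckets

    def forward(self, ids: torch.Tensor) -> torch.Tensor:
        # Downstream layers consume a 2*dim vector per ID.
        return torch.cat(
            [self.table(ids), self.hashed(ids % self.num_hash_buckets)], dim=-1
        )

emb = HybridEmbedding(num_ids=100_000, dim=16, num_hash_buckets=1_024)
print(emb(torch.tensor([3, 42, 99_999])).shape)  # torch.Size([3, 32])
```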

Towards MoE Deployment: Mitigating Inefficiencies in Mixture-of-Expert (MoE) Inference

no code implementations • 10 Mar 2023 • Haiyang Huang, Newsha Ardalani, Anna Sun, Liu Ke, Hsien-Hsin S. Lee, Anjali Sridhar, Shruti Bhosale, Carole-Jean Wu, Benjamin Lee

We propose three optimization techniques to mitigate sources of inefficiency: (1) dynamic gating, (2) expert buffering, and (3) expert load balancing.

Language Modelling • Machine Translation
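
For context, the gating step these optimizations target is the standard top-k token-to-expert routing. The sketch below is a textbook top-2 formulation, not the paper's implementation:

```python
# Top-2 gating: each token is routed to its two highest-scoring experts,
# with the two gate probabilities renormalized to sum to 1.
import torch
import torch.nn.functional as F

def top2_gate(tokens: torch.Tensor, gate_weights: torch.Tensor):
    """tokens: (num_tokens, d_model); gate_weights: (d_model, num_experts)."""
    logits = tokens @ gate_weights                   # (num_tokens, num_experts)
    probs = F.softmax(logits, dim=-1)
    top_p, top_e = probs.topk(2, dim=-1)             # two experts per token
    top_p = top_p / top_p.sum(dim=-1, keepdim=True)  # renormalize
    return top_e, top_p                              # expert ids, mixing weights

tokens = torch.randn(8, 64)   # 8 tokens, d_model = 64
gate_w = torch.randn(64, 4)   # 4 experts
experts, weights = top2_gate(tokens, gate_w)
print(experts.shape, weights.shape)  # torch.Size([8, 2]) torch.Size([8, 2])
```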

Sieve: Multimodal Dataset Pruning Using Image Captioning Models

1 code implementation • 3 Oct 2023 • Anas Mahmoud, Mostafa Elhoushi, Amro Abbas, Yu Yang, Newsha Ardalani, Hugh Leather, Ari Morcos

We propose a pruning signal, Sieve, that employs synthetic captions generated by image-captioning models pretrained on small, diverse, and well-aligned image-text pairs to evaluate the alignment of noisy image-text pairs.

Image Captioning • Language Modelling • +1
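
A minimal sketch of such a pruning signal, assuming BLIP as the pretrained captioner and a sentence-transformer for text similarity; the paper's exact checkpoints and scoring details may differ:

```python
# Score an image-text pair by how well a synthetic caption (from a
# pretrained captioning model) agrees with the pair's noisy alt-text.
# Model choices here are assumptions, not the paper's exact setup.
import torch
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration
from sentence_transformers import SentenceTransformer, util

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
captioner = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")
encoder = SentenceTransformer("all-MiniLM-L6-v2")

def sieve_score(image: Image.Image, alt_text: str) -> float:
    """Higher score = synthetic caption aligns with the pair's alt-text."""
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        ids = captioner.generate(**inputs, max_new_tokens=30)[0]
    synthetic = processor.decode(ids, skip_special_tokens=True)
    emb = encoder.encode([synthetic, alt_text], convert_to_tensor=True)
    return util.cos_sim(emb[0], emb[1]).item()

# Rank all pairs by sieve_score and drop the lowest-scoring fraction.
```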

Data Acquisition: A New Frontier in Data-centric AI

no code implementations • 22 Nov 2023 • Lingjiao Chen, Bilge Acun, Newsha Ardalani, Yifan Sun, Feiyang Kang, Hanrui Lyu, Yongchan Kwon, Ruoxi Jia, Carole-Jean Wu, Matei Zaharia, James Zou

As Machine Learning (ML) systems continue to grow, acquiring relevant and comprehensive datasets becomes imperative.

Decoding Data Quality via Synthetic Corruptions: Embedding-guided Pruning of Code Data

no code implementations • 5 Dec 2023 • Yu Yang, Aaditya K. Singh, Mostafa Elhoushi, Anas Mahmoud, Kushal Tirumala, Fabian Gloeckle, Baptiste Rozière, Carole-Jean Wu, Ari S. Morcos, Newsha Ardalani

Armed with this knowledge, we devise novel pruning metrics that operate in embedding space to identify and remove low-quality entries in the Stack dataset.

Code Generation
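
A hedged sketch of the embedding-space idea: score each sample by cosine similarity to a centroid built from deliberately corrupted examples, then drop the most corruption-like entries. The embeddings below are random stand-ins; the paper works with real code embeddings and its own metrics.

```python
# Prune samples whose embeddings sit closest to a "corruption" centroid.
import numpy as np

rng = np.random.default_rng(0)
corpus = rng.normal(size=(10_000, 256))            # code-sample embeddings (stand-ins)
corrupted = rng.normal(loc=0.5, size=(200, 256))   # synthetic-corruption embeddings

def unit(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

centroid = unit(corrupted.mean(axis=0))
scores = unit(corpus) @ centroid                   # cosine similarity to "bad" centroid

keep = scores < np.quantile(scores, 0.80)          # prune top 20% most corruption-like
print(f"kept {keep.sum()} of {len(corpus)} samples")
```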
