Search Results for author: Newsha Ardalani

Found 18 papers, 3 papers with code

Text Quality-Based Pruning for Efficient Training of Language Models

no code implementations • 26 Apr 2024 • Vasu Sharma, Karthik Padthe, Newsha Ardalani, Kushal Tirumala, Russell Howes, Hu Xu, Po-Yao Huang, Shang-Wen Li, Armen Aghajanyan, Gargi Ghosh, Luke Zettlemoyer

In recent times, training Language Models (LMs) has relied on computationally heavy training over massive datasets, which makes this training process extremely laborious.

Decoding Data Quality via Synthetic Corruptions: Embedding-guided Pruning of Code Data

no code implementations • 5 Dec 2023 • Yu Yang, Aaditya K. Singh, Mostafa Elhoushi, Anas Mahmoud, Kushal Tirumala, Fabian Gloeckle, Baptiste Rozière, Carole-Jean Wu, Ari S. Morcos, Newsha Ardalani

Armed with this knowledge, we devise novel pruning metrics that operate in embedding space to identify and remove low-quality entries in the Stack dataset.

Code Generation

Data Acquisition: A New Frontier in Data-centric AI

no code implementations • 22 Nov 2023 • Lingjiao Chen, Bilge Acun, Newsha Ardalani, Yifan Sun, Feiyang Kang, Hanrui Lyu, Yongchan Kwon, Ruoxi Jia, Carole-Jean Wu, Matei Zaharia, James Zou

As Machine Learning (ML) systems continue to grow, the demand for relevant and comprehensive datasets becomes imperative.

MAD Max Beyond Single-Node: Enabling Large Machine Learning Model Acceleration on Distributed Systems

no code implementations • 4 Oct 2023 • Samuel Hsia, Alicia Golden, Bilge Acun, Newsha Ardalani, Zachary DeVito, Gu-Yeon Wei, David Brooks, Carole-Jean Wu

Training and deploying large-scale machine learning models is time-consuming, requires significant distributed computing infrastructures, and incurs high operational costs.

Distributed Computing

Sieve: Multimodal Dataset Pruning Using Image Captioning Models

1 code implementation • CVPR 2024 • Anas Mahmoud, Mostafa Elhoushi, Amro Abbas, Yu Yang, Newsha Ardalani, Hugh Leather, Ari Morcos

We propose a pruning signal, Sieve, that employs synthetic captions generated by image-captioning models pretrained on small, diverse, and well-aligned image-text pairs to evaluate the alignment of noisy image-text pairs.

Diversity, Image Captioning, +2

Towards MoE Deployment: Mitigating Inefficiencies in Mixture-of-Expert (MoE) Inference

no code implementations • 10 Mar 2023 • Haiyang Huang, Newsha Ardalani, Anna Sun, Liu Ke, Hsien-Hsin S. Lee, Anjali Sridhar, Shruti Bhosale, Carole-Jean Wu, Benjamin Lee

We propose three optimization techniques to mitigate sources of inefficiencies, namely (1) Dynamic gating, (2) Expert Buffering, and (3) Expert load balancing.

Decoder, Language Modelling, +1

MP-Rec: Hardware-Software Co-Design to Enable Multi-Path Recommendation

no code implementations • 21 Feb 2023 • Samuel Hsia, Udit Gupta, Bilge Acun, Newsha Ardalani, Pan Zhong, Gu-Yeon Wei, David Brooks, Carole-Jean Wu

Based on our characterization of various embedding representations, we propose a hybrid embedding representation that achieves higher quality embeddings at the cost of increased memory and compute requirements.

Recommendation Systems

Understanding Scaling Laws for Recommendation Models

no code implementations • 17 Aug 2022 • Newsha Ardalani, Carole-Jean Wu, Zeliang Chen, Bhargav Bhushanam, Adnan Aziz

We show that parameter scaling is out of steam for the model architecture under study, and until a higher-performing model architecture emerges, data scaling is the path forward.

Time Dependency, Data Flow, and Competitive Advantage

no code implementations • 17 Mar 2022 • Ehsan Valavi, Joel Hestness, Marco Iansiti, Newsha Ardalani, Feng Zhu, Karim R. Lakhani

Relating the text topics to various business areas of interest, we argue that competing in a business area in which data value decays rapidly alters strategies to acquire competitive advantage.

Time and the Value of Data

no code implementations • 17 Mar 2022 • Ehsan Valavi, Joel Hestness, Newsha Ardalani, Marco Iansiti

In addition, we argue that increasing the stock of data by including older datasets may, in fact, damage the model's accuracy.

BIG-bench Machine Learning

Model Architecture Controls Gradient Descent Dynamics: A Combinatorial Path-Based Formula

no code implementations • 25 Sep 2019 • Xin Zhou, Newsha Ardalani

However, our theoretical understanding of how model architecture affects performance or accuracy is limited.

Beyond Human-Level Accuracy: Computational Challenges in Deep Learning

1 code implementation • 3 Sep 2019 • Joel Hestness, Newsha Ardalani, Greg Diamos

However, recent prior work shows that as dataset sizes grow, DL model accuracy and model size grow predictably.

A Static Analysis-based Cross-Architecture Performance Prediction Using Machine Learning

no code implementations • 18 Jun 2019 • Newsha Ardalani, Urmish Thakker, Aws Albarghouthi, Karu Sankaralingam

Porting code from CPU to GPU is costly and time-consuming; unless much time is invested in development and optimization, it is not obvious, a priori, how much speed-up is achievable or how much room is left for improvement.

BIG-bench Machine Learning, Binary Classification

Empirically Characterizing Overparameterization Impact on Convergence

no code implementations • ICLR 2019 • Newsha Ardalani, Joel Hestness, Gregory Diamos

A long-held conventional wisdom states that larger models train more slowly when using gradient descent.

A Proposed Hierarchy of Deep Learning Tasks

no code implementations • 27 Sep 2018 • Joel Hestness, Sharan Narang, Newsha Ardalani, Heewoo Jun, Hassan Kianinejad, Md. Mostofa Ali Patwary, Yang Yang, Yanqi Zhou, Gregory Diamos, Kenneth Church

As the pace of deep learning innovation accelerates, it becomes increasingly important to organize the space of problems by relative difficulty.

Deep Learning Scaling is Predictable, Empirically

no code implementations • 1 Dec 2017 • Joel Hestness, Sharan Narang, Newsha Ardalani, Gregory Diamos, Heewoo Jun, Hassan Kianinejad, Md. Mostofa Ali Patwary, Yang Yang, Yanqi Zhou

As DL application domains grow, we would like a deeper understanding of the relationships between training set size, computational scale, and model accuracy improvements to advance the state-of-the-art.

Language Modelling, Machine Translation, +3
