Search Results for author: Carole-Jean Wu

Found 45 papers, 12 papers with code

Exploiting Parallelism Opportunities with Deep Learning Frameworks

1 code implementation • 13 Aug 2019 • Yu Emma Wang, Carole-Jean Wu, Xiaodong Wang, Kim Hazelwood, David Brooks

State-of-the-art machine learning frameworks support a wide variety of design features to enable a flexible machine learning programming interface and to ease the programmability burden on machine learning developers.

BIG-bench Machine Learning

DeepRecSys: A System for Optimizing End-To-End At-scale Neural Recommendation Inference

no code implementations • 8 Jan 2020 • Udit Gupta, Samuel Hsia, Vikram Saraph, Xiaodong Wang, Brandon Reagen, Gu-Yeon Wei, Hsien-Hsin S. Lee, David Brooks, Carole-Jean Wu

Neural personalized recommendation is the cornerstone of a wide collection of cloud services and products, constituting a significant share of the compute demand on the cloud infrastructure.

Distributed, Parallel, and Cluster Computing

Developing a Recommendation Benchmark for MLPerf Training and Inference

no code implementations • 16 Mar 2020 • Carole-Jean Wu, Robin Burke, Ed H. Chi, Joseph Konstan, Julian McAuley, Yves Raimond, Hao Zhang

Deep learning-based recommendation models are used pervasively and broadly, for example, to recommend movies, products, or other information most relevant to users, in order to enhance the user experience.

Image Classification • Object Detection • +3

GEVO: GPU Code Optimization using Evolutionary Computation

1 code implementation • 17 Apr 2020 • Jhe-Yu Liou, Xiaodong Wang, Stephanie Forrest, Carole-Jean Wu

If kernel output accuracy is relaxed to tolerate up to 1% error, GEVO can find kernel variants that outperform the baseline version by an average of 51.08%.

BIG-bench Machine Learning • Handwriting Recognition • +1
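As a rough illustration of the evolutionary search GEVO performs, the sketch below evolves program variants under a relaxed-accuracy budget. The variant encoding and the `mutate`, `runtime`, and `error` callbacks are hypothetical stand-ins for real kernel mutation and profiling, not GEVO's actual operators.

```python
import random

def evolve(baseline, mutate, runtime, error, err_budget=0.01,
           pop_size=8, generations=20, seed=0):
    """Toy evolutionary search in the spirit of GEVO: keep the fastest
    mutated variant whose output error stays within the budget."""
    rng = random.Random(seed)
    population = [baseline] * pop_size
    best = baseline
    for _ in range(generations):
        # Mutate each individual; discard variants exceeding the error budget.
        candidates = [mutate(ind, rng) for ind in population]
        valid = [c for c in candidates if error(c) <= err_budget]
        if not valid:
            continue
        # Prefer the fastest valid variants for the next generation.
        valid.sort(key=runtime)
        if runtime(valid[0]) < runtime(best):
            best = valid[0]
        population = (valid * pop_size)[:pop_size]
    return best
```

With a toy encoding where higher "optimization levels" run faster but add a little error, the loop converges toward the fastest level that still satisfies the 1% budget.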

AutoScale: Optimizing Energy Efficiency of End-to-End Edge Inference under Stochastic Variance

no code implementations • 6 May 2020 • Young Geun Kim, Carole-Jean Wu

Such execution scaling decisions become more complicated with the stochastic nature of mobile-cloud execution, where signal strength variations of the wireless networks and resource interference can significantly affect real-time inference performance and system energy efficiency.

Understanding Capacity-Driven Scale-Out Neural Recommendation Inference

no code implementations • 4 Nov 2020 • Michael Lui, Yavuz Yetim, Özgür Özkan, Zhuoran Zhao, Shin-Yeh Tsai, Carole-Jean Wu, Mark Hempstead

One approach to support this scale is with distributed serving, or distributed inference, which divides the memory requirements of a single large model across multiple servers.

Recommendation Systems
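A minimal sketch of the distributed-serving idea described above, assuming tables are characterized only by their memory footprint (real deployments also balance load and latency): first-fit-decreasing assignment of embedding tables to servers.

```python
def shard_tables(table_sizes_gb, server_capacity_gb):
    """Greedily assign embedding tables to servers so each server's
    memory budget is respected (first-fit decreasing bin packing)."""
    servers = []  # each entry: [remaining_capacity, [table indices]]
    order = sorted(range(len(table_sizes_gb)),
                   key=lambda i: table_sizes_gb[i], reverse=True)
    for i in order:
        size = table_sizes_gb[i]
        if size > server_capacity_gb:
            raise ValueError(f"table {i} exceeds a single server's capacity")
        for srv in servers:
            if srv[0] >= size:          # fits in an existing server
                srv[0] -= size
                srv[1].append(i)
                break
        else:                           # open a new server
            servers.append([server_capacity_gb - size, [i]])
    return [sorted(srv[1]) for srv in servers]
```

For example, tables of 40, 30, 20, and 10 GB pack onto two 50 GB servers.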

CPR: Understanding and Improving Failure Tolerant Training for Deep Learning Recommendation with Partial Recovery

no code implementations • 5 Nov 2020 • Kiwan Maeng, Shivam Bharuka, Isabel Gao, Mark C. Jeffrey, Vikram Saraph, Bor-Yiing Su, Caroline Trippel, Jiyan Yang, Mike Rabbat, Brandon Lucia, Carole-Jean Wu

To the extent of our knowledge, this paper is the first to perform a data-driven, in-depth analysis of applying partial recovery to recommendation models, identifying a trade-off between accuracy and performance.

Understanding Training Efficiency of Deep Learning Recommendation Models at Scale

no code implementations • 11 Nov 2020 • Bilge Acun, Matthew Murphy, Xiaodong Wang, Jade Nie, Carole-Jean Wu, Kim Hazelwood

The use of GPUs has proliferated for machine learning workflows and is now considered mainstream for many deep learning models.

TT-Rec: Tensor Train Compression for Deep Learning Recommendation Models

1 code implementation • 25 Jan 2021 • Chunxing Yin, Bilge Acun, Xing Liu, Carole-Jean Wu

TT-Rec achieves 117 times and 112 times model size compression, for Kaggle and Terabyte, respectively.
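The compression arithmetic behind a tensor-train (TT) embedding can be sketched as follows. The factorizations and rank below are illustrative choices, not the paper's Kaggle/Terabyte configurations.

```python
def tt_params(vocab_factors, dim_factors, rank):
    """Parameter count of a 3-core tensor-train embedding: a (V, D)
    table is reshaped into a 3-D tensor with modes (v_k * d_k) and
    decomposed with TT ranks (1, r, r, 1)."""
    ranks = [1, rank, rank, 1]
    return sum(ranks[k] * v * d * ranks[k + 1]
               for k, (v, d) in enumerate(zip(vocab_factors, dim_factors)))

def compression_ratio(vocab_factors, dim_factors, rank):
    """Ratio of uncompressed (V * D) parameters to TT parameters."""
    full = 1
    for f in list(vocab_factors) + list(dim_factors):
        full *= f
    return full / tt_params(vocab_factors, dim_factors, rank)
```

For instance, an 11M-row, 64-dim table factored as 200 x 220 x 250 and 4 x 4 x 4 with rank 32 compresses by several hundred times, which is why two-orders-of-magnitude ratios like those reported are attainable.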

RecSSD: Near Data Processing for Solid State Drive Based Recommendation Inference

no code implementations • 29 Jan 2021 • Mark Wilkening, Udit Gupta, Samuel Hsia, Caroline Trippel, Carole-Jean Wu, David Brooks, Gu-Yeon Wei

Neural personalized recommendation models are used across a wide variety of datacenter applications including search, social media, and entertainment.

Low-Precision Hardware Architectures Meet Recommendation Model Inference at Scale

no code implementations • 26 May 2021 • Zhaoxia Deng, Jongsoo Park, Ping Tak Peter Tang, Haixin Liu, Jie Yang, Hector Yuen, Jianyu Huang, Daya Khudia, Xiaohan Wei, Ellie Wen, Dhruv Choudhary, Raghuraman Krishnamoorthi, Carole-Jean Wu, Satish Nadathur, Changkyu Kim, Maxim Naumov, Sam Naghshineh, Mikhail Smelyanskiy

We share in this paper our search strategies to adapt reference recommendation models to low-precision hardware, our optimization of low-precision compute kernels, and the design and development of a tool chain to maintain our models' accuracy throughout their lifespan, during which topic trends and users' interests inevitably evolve.

Recommendation Systems

SVP-CF: Selection via Proxy for Collaborative Filtering Data

no code implementations • 11 Jul 2021 • Noveen Sachdeva, Carole-Jean Wu, Julian McAuley

As we demonstrate, commonly-used data sampling schemes can have significant consequences on algorithm performance -- masking performance deficiencies in algorithms or altering the relative performance of algorithms, as compared to models trained on the complete dataset.

Collaborative Filtering • Recommendation Systems
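The two sampling schemes most commonly contrasted in this setting, uniform interaction sampling versus user-based sampling, can be sketched as below; the key difference is that user sampling preserves each retained user's full history.

```python
import random

def sample_interactions(interactions, frac, seed=0):
    """Uniformly sample a fraction of (user, item) interactions."""
    rng = random.Random(seed)
    return rng.sample(interactions, int(len(interactions) * frac))

def sample_users(interactions, frac, seed=0):
    """Keep every interaction of a sampled fraction of users, so
    retained users' histories stay intact."""
    rng = random.Random(seed)
    users = sorted({u for u, _ in interactions})
    kept = set(rng.sample(users, int(len(users) * frac)))
    return [(u, i) for u, i in interactions if u in kept]
```

Both subsample the same dataset, yet they can rank recommendation algorithms differently, which is the phenomenon the paper studies.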

AutoFL: Enabling Heterogeneity-Aware Energy Efficient Federated Learning

no code implementations • 16 Jul 2021 • Young Geun Kim, Carole-Jean Wu

Federated learning enables a cluster of decentralized mobile devices at the edge to collaboratively train a shared machine learning model, while keeping all the raw training samples on device.

Federated Learning
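The aggregation step at the heart of such collaborative training is typically FedAvg-style weighted averaging; a minimal sketch with flat weight vectors (real systems aggregate full model state and add privacy machinery):

```python
def fedavg(client_weights, client_sizes):
    """FedAvg-style aggregation: average client model weights, weighted
    by each client's number of local training samples."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    agg = [0.0] * dim
    for w, n in zip(client_weights, client_sizes):
        for j in range(dim):
            agg[j] += (n / total) * w[j]
    return agg
```

A client with three times the data pulls the global model three times as hard, which is precisely the heterogeneity that energy-aware schedulers like AutoFL must account for.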

Papaya: Practical, Private, and Scalable Federated Learning

no code implementations • 8 Nov 2021 • Dzmitry Huba, John Nguyen, Kshitiz Malik, Ruiyu Zhu, Mike Rabbat, Ashkan Yousefpour, Carole-Jean Wu, Hongyuan Zhan, Pavel Ustinov, Harish Srinivas, Kaikai Wang, Anthony Shoumikhin, Jesik Min, Mani Malek

Our work tackles the aforementioned issues, sketches some of the system design challenges and their solutions, and touches upon principles that emerged from building a production FL system for millions of clients.

Federated Learning

On Sampling Collaborative Filtering Datasets

1 code implementation • 13 Jan 2022 • Noveen Sachdeva, Carole-Jean Wu, Julian McAuley

We study the practical consequences of dataset sampling strategies on the ranking performance of recommendation algorithms.

Collaborative Filtering • Recommendation Systems

RecShard: Statistical Feature-Based Memory Optimization for Industry-Scale Neural Recommendation

no code implementations • 25 Jan 2022 • Geet Sethi, Bilge Acun, Niket Agarwal, Christos Kozyrakis, Caroline Trippel, Carole-Jean Wu

EMBs exhibit distinct memory characteristics, providing performance optimization opportunities for intelligent EMB partitioning and placement across a tiered memory hierarchy.
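A toy reduction of statistics-driven placement: put the most frequently accessed embedding rows in the fast tier (e.g. HBM) and the rest in the slower tier. RecShard's actual formulation is richer — it uses training-data statistics and solves a placement optimization — so this is only an illustrative stand-in.

```python
def place_rows(access_counts, fast_tier_rows):
    """Split embedding rows into a hot set (fast tier) and a cold set
    (slow tier) by per-row access count."""
    order = sorted(range(len(access_counts)),
                   key=lambda r: access_counts[r], reverse=True)
    hot = set(order[:fast_tier_rows])
    cold = set(order[fast_tier_rows:])
    return hot, cold
```

Because EMB accesses are typically heavy-tailed, a small hot set can capture most lookups, which is what makes tiered placement attractive.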

Towards Fair Federated Recommendation Learning: Characterizing the Inter-Dependence of System and Data Heterogeneity

no code implementations • 30 May 2022 • Kiwan Maeng, Haiyu Lu, Luca Melis, John Nguyen, Mike Rabbat, Carole-Jean Wu

Federated learning (FL) is an effective mechanism for preserving data privacy in recommender systems, as it runs machine learning model training on-device.

Fairness • Federated Learning • +2

Infinite Recommendation Networks: A Data-Centric Approach

5 code implementations • 3 Jun 2022 • Noveen Sachdeva, Mehak Preet Dhaliwal, Carole-Jean Wu, Julian McAuley

We leverage the Neural Tangent Kernel and its equivalence to training infinitely-wide neural networks to devise $\infty$-AE: an autoencoder with infinitely-wide bottleneck layers.

Ranked #1 on Recommendation Systems on Douban (AUC metric)

Information Retrieval • Recommendation Systems
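The closed-form flavor of an infinitely-wide autoencoder can be illustrated with ordinary kernel ridge regression: predictions come from a single linear solve rather than gradient training. The NTK itself is replaced here by an arbitrary user-user kernel `K`, an illustrative simplification of the paper's construction.

```python
import numpy as np

def kernel_ridge_reconstruct(K, X, reg=1e-3):
    """Closed-form kernel ridge regression as a stand-in for an
    infinitely-wide autoencoder: given a user-user kernel K and a
    user-item interaction matrix X, predict scores K (K + reg*I)^-1 X."""
    n = K.shape[0]
    alpha = np.linalg.solve(K + reg * np.eye(n), X)
    return K @ alpha
```

With the identity kernel and no regularization the model reproduces its input exactly, showing the autoencoding behavior in its simplest case.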

Understanding Scaling Laws for Recommendation Models

no code implementations • 17 Aug 2022 • Newsha Ardalani, Carole-Jean Wu, Zeliang Chen, Bhargav Bhushanam, Adnan Aziz

We show that parameter scaling is out of steam for the model architecture under study, and until a higher-performing model architecture emerges, data scaling is the path forward.
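Scaling-law exponents of the kind studied here are commonly estimated with a linear fit in log-log space; a minimal sketch assuming a pure power law `loss = a * size^(-b)`:

```python
import numpy as np

def fit_power_law(sizes, losses):
    """Fit loss ~ a * size^(-b) via linear regression on log-log data;
    returns (a, b)."""
    logx, logy = np.log(sizes), np.log(losses)
    slope, intercept = np.polyfit(logx, logy, 1)
    return np.exp(intercept), -slope
```

A small exponent b signals that scaling (parameters or data) yields diminishing returns, which is how "out of steam" conclusions are drawn.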

RecD: Deduplication for End-to-End Deep Learning Recommendation Model Training Infrastructure

no code implementations • 9 Nov 2022 • Mark Zhao, Dhruv Choudhary, Devashish Tyagi, Ajay Somani, Max Kaplan, Sung-Han Lin, Sarunya Pumma, Jongsoo Park, Aarti Basant, Niket Agarwal, Carole-Jean Wu, Christos Kozyrakis

RecD addresses immense storage, preprocessing, and training overheads caused by feature duplication inherent in industry-scale DLRM training datasets.
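The core deduplication idea — store each distinct feature value once and have samples reference it by index — can be sketched as below; RecD's production pipeline operates at a very different scale, so this is only the shape of the technique.

```python
def dedup_features(samples):
    """Hash-based deduplication: build a table of distinct feature
    values and encode each sample as indices into that table."""
    table, index, encoded = [], {}, []
    for features in samples:
        row = []
        for value in features:
            if value not in index:       # first occurrence: store it
                index[value] = len(table)
                table.append(value)
            row.append(index[value])     # later occurrences: reference
        encoded.append(row)
    return table, encoded
```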

FedGPO: Heterogeneity-Aware Global Parameter Optimization for Efficient Federated Learning

no code implementations • 30 Nov 2022 • Young Geun Kim, Carole-Jean Wu

Federated learning (FL) has emerged as a solution to deal with the risk of privacy leaks in machine learning training.

Federated Learning

FlexShard: Flexible Sharding for Industry-Scale Sequence Recommendation Models

no code implementations • 8 Jan 2023 • Geet Sethi, Pallab Bhattacharya, Dhruv Choudhary, Carole-Jean Wu, Christos Kozyrakis

Sequence-based deep learning recommendation models (DLRMs) are an emerging class of DLRMs showing great improvements over their prior sum-pooling based counterparts at capturing users' long-term interests.

MP-Rec: Hardware-Software Co-Design to Enable Multi-Path Recommendation

no code implementations • 21 Feb 2023 • Samuel Hsia, Udit Gupta, Bilge Acun, Newsha Ardalani, Pan Zhong, Gu-Yeon Wei, David Brooks, Carole-Jean Wu

Based on our characterization of various embedding representations, we propose a hybrid embedding representation that achieves higher quality embeddings at the cost of increased memory and compute requirements.

Recommendation Systems

Towards MoE Deployment: Mitigating Inefficiencies in Mixture-of-Expert (MoE) Inference

no code implementations • 10 Mar 2023 • Haiyang Huang, Newsha Ardalani, Anna Sun, Liu Ke, Hsien-Hsin S. Lee, Anjali Sridhar, Shruti Bhosale, Carole-Jean Wu, Benjamin Lee

We propose three optimization techniques to mitigate sources of inefficiencies, namely (1) Dynamic gating, (2) Expert Buffering, and (3) Expert load balancing.

Language Modelling • Machine Translation
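The routing step that the listed gating optimizations build on is standard top-k expert gating; a NumPy sketch for a batch of token logits over experts (the paper's dynamic gating refines this baseline, which is shown here only as the starting point):

```python
import numpy as np

def topk_gate(logits, k=2):
    """Top-k MoE gating: route each token to its k highest-scoring
    experts and softmax-renormalize over just those experts."""
    idx = np.argsort(logits, axis=-1)[..., ::-1][..., :k]
    picked = np.take_along_axis(logits, idx, axis=-1)
    gates = np.exp(picked - picked.max(axis=-1, keepdims=True))
    gates = gates / gates.sum(axis=-1, keepdims=True)
    return idx, gates
```

Because only k experts run per token, compute stays roughly constant as the expert count grows — the efficiency property MoE inference systems try to preserve.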

Green Federated Learning

no code implementations • 26 Mar 2023 • Ashkan Yousefpour, Shen Guo, Ashish Shenoy, Sayan Ghosh, Pierre Stock, Kiwan Maeng, Schalk-Willem Krüger, Michael Rabbat, Carole-Jean Wu, Ilya Mironov

The rapid progress of AI is fueled by increasingly large and computationally intensive machine learning models and datasets.

Federated Learning

READ: Recurrent Adaptation of Large Transformers

no code implementations • 24 May 2023 • Sid Wang, John Nguyen, Ke Li, Carole-Jean Wu

However, fine-tuning all pre-trained model parameters becomes impractical as the model size and number of tasks increase.

Transfer Learning

GEVO-ML: Optimizing Machine Learning Code with Evolutionary Computation

no code implementations • 16 Oct 2023 • Jhe-Yu Liou, Stephanie Forrest, Carole-Jean Wu

For the training workloads, GEVO-ML finds a 4.88% improvement in model accuracy, from 91% to 96%, without sacrificing training or testing speed.

Data Acquisition: A New Frontier in Data-centric AI

no code implementations • 22 Nov 2023 • Lingjiao Chen, Bilge Acun, Newsha Ardalani, Yifan Sun, Feiyang Kang, Hanrui Lyu, Yongchan Kwon, Ruoxi Jia, Carole-Jean Wu, Matei Zaharia, James Zou

As Machine Learning (ML) systems continue to grow, the demand for relevant and comprehensive datasets becomes ever more pressing.

Decoding Data Quality via Synthetic Corruptions: Embedding-guided Pruning of Code Data

no code implementations • 5 Dec 2023 • Yu Yang, Aaditya K. Singh, Mostafa Elhoushi, Anas Mahmoud, Kushal Tirumala, Fabian Gloeckle, Baptiste Rozière, Carole-Jean Wu, Ari S. Morcos, Newsha Ardalani

Armed with this knowledge, we devise novel pruning metrics that operate in embedding space to identify and remove low-quality entries in the Stack dataset.

Code Generation
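A toy embedding-space pruning metric — distance to the dataset centroid — illustrates the mechanics of removing low-quality entries. The paper's actual metrics are derived from synthetic corruptions, so the centroid heuristic below is only a stand-in.

```python
import numpy as np

def prune_by_centroid(embeddings, keep_frac=0.8):
    """Drop the entries farthest from the embedding centroid, keeping
    the given fraction of the dataset; returns sorted kept indices."""
    centroid = embeddings.mean(axis=0)
    dist = np.linalg.norm(embeddings - centroid, axis=1)
    n_keep = int(len(embeddings) * keep_frac)
    return np.sort(np.argsort(dist)[:n_keep])
```

Outliers in embedding space, which such metrics treat as likely low-quality data, are pruned first.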

Generative AI Beyond LLMs: System Implications of Multi-Modal Generation

no code implementations • 22 Dec 2023 • Alicia Golden, Samuel Hsia, Fei Sun, Bilge Acun, Basil Hosmer, Yejin Lee, Zachary DeVito, Jeff Johnson, Gu-Yeon Wei, David Brooks, Carole-Jean Wu

As the development of large-scale Generative AI models evolves beyond text (1D) generation to include image (2D) and video (3D) generation, processing spatial and temporal information presents unique challenges to quality, performance, and efficiency.

3D Generation

HeteroSwitch: Characterizing and Taming System-Induced Data Heterogeneity in Federated Learning

no code implementations • 7 Mar 2024 • Gyudong Kim, Mehdi Ghasemi, Soroush Heidari, Seungryong Kim, Young Geun Kim, Sarma Vrudhula, Carole-Jean Wu

Such fragmentation introduces a new type of data heterogeneity in FL, namely \textit{system-induced data heterogeneity}, as each device generates distinct data depending on its hardware and software configurations.

Domain Generalization • Fairness • +1

Introducing v0.5 of the AI Safety Benchmark from MLCommons

no code implementations • 18 Apr 2024 • Bertie Vidgen, Adarsh Agrawal, Ahmed M. Ahmed, Victor Akinwande, Namir Al-Nuaimi, Najla Alfaraj, Elie Alhajjar, Lora Aroyo, Trupti Bavalatti, Borhane Blili-Hamelin, Kurt Bollacker, Rishi Bomassani, Marisa Ferrara Boston, Siméon Campos, Kal Chakra, Canyu Chen, Cody Coleman, Zacharie Delpierre Coudert, Leon Derczynski, Debojyoti Dutta, Ian Eisenberg, James Ezick, Heather Frase, Brian Fuller, Ram Gandikota, Agasthya Gangavarapu, Ananya Gangavarapu, James Gealy, Rajat Ghosh, James Goel, Usman Gohar, Sujata Goswami, Scott A. Hale, Wiebke Hutiri, Joseph Marvin Imperial, Surgan Jandial, Nick Judd, Felix Juefei-Xu, Foutse Khomh, Bhavya Kailkhura, Hannah Rose Kirk, Kevin Klyman, Chris Knotz, Michael Kuchnik, Shachi H. Kumar, Chris Lengerich, Bo Li, Zeyi Liao, Eileen Peters Long, Victor Lu, Yifan Mai, Priyanka Mary Mammen, Kelvin Manyeki, Sean McGregor, Virendra Mehta, Shafee Mohammed, Emanuel Moss, Lama Nachman, Dinesh Jinenhally Naganna, Amin Nikanjam, Besmira Nushi, Luis Oala, Iftach Orr, Alicia Parrish, Cigdem Patlak, William Pietri, Forough Poursabzi-Sangdeh, Eleonora Presani, Fabrizio Puletti, Paul Röttger, Saurav Sahay, Tim Santos, Nino Scherrer, Alice Schoenauer Sebag, Patrick Schramowski, Abolfazl Shahbazi, Vin Sharma, Xudong Shen, Vamsi Sistla, Leonard Tang, Davide Testuggine, Vithursan Thangarasa, Elizabeth Anne Watkins, Rebecca Weiss, Chris Welty, Tyler Wilbers, Adina Williams, Carole-Jean Wu, Poonam Yadav, Xianjun Yang, Yi Zeng, Wenhui Zhang, Fedor Zhdanov, Jiacheng Zhu, Percy Liang, Peter Mattson, Joaquin Vanschoren

We created a new taxonomy of 13 hazard categories, of which 7 have tests in the v0.5 benchmark.
