Search Results for author: Iman Mirzadeh

Found 9 papers, 4 with code

Computational Bottlenecks of Training Small-scale Large Language Models

no code implementations • 25 Oct 2024 • Saleh Ashkboos, Iman Mirzadeh, Keivan Alizadeh, Mohammad Hossein Sekhavat, Moin Nabi, Mehrdad Farajtabar, Fartash Faghri

While large language models (LLMs) dominate the AI landscape, small-scale large language models (SLMs) are gaining attention due to cost and efficiency demands from consumers.

Language Modeling

GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models

2 code implementations • 7 Oct 2024 • Iman Mirzadeh, Keivan Alizadeh, Hooman Shahrokhi, Oncel Tuzel, Samy Bengio, Mehrdad Farajtabar

While the performance of LLMs on GSM8K has significantly improved in recent years, it remains unclear whether their mathematical reasoning capabilities have genuinely advanced, raising questions about the reliability of the reported metrics.

GSM8K, Logical Reasoning, +1
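
The benchmark's key device is regenerating the "same" question from symbolic templates with resampled names and numbers, so accuracy can be measured across many variants rather than on one fixed instance. Below is a minimal sketch of that templating idea; the template, name pool, and number ranges are invented for illustration and are not the paper's actual benchmark code.

```python
import random

# Hypothetical illustration of the symbolic-template idea behind
# GSM-Symbolic: a GSM8K-style question becomes a template whose names and
# numbers are resampled, giving many variants of the "same" problem.
TEMPLATE = ("{name} has {a} apples and buys {b} more. "
            "How many apples does {name} have now?")

def sample_variant(rng: random.Random) -> tuple[str, int]:
    name = rng.choice(["Ava", "Liam", "Noah", "Sophia"])  # invented name pool
    a, b = rng.randint(2, 99), rng.randint(2, 99)         # invented ranges
    question = TEMPLATE.format(name=name, a=a, b=b)
    answer = a + b  # ground truth follows directly from the template
    return question, answer

rng = random.Random(0)
for _ in range(3):
    q, ans = sample_variant(rng)
    print(q, "->", ans)
```

Measuring a model's accuracy across many such variants is what lets the paper ask whether reported GSM8K gains reflect genuine reasoning or sensitivity to surface-level changes.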

Duo-LLM: A Framework for Studying Adaptive Computation in Large Language Models

no code implementations • 1 Oct 2024 • Keivan Alizadeh, Iman Mirzadeh, Hooman Shahrokhi, Dmitry Belenko, Frank Sun, Minsik Cho, Mohammad Hossein Sekhavat, Moin Nabi, Mehrdad Farajtabar

Large Language Models (LLMs) typically generate outputs token by token using a fixed compute budget, leading to inefficient resource utilization.
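
A common remedy, and the kind of mechanism a framework like this studies, is to route each token to a cheap or an expensive module based on a learned difficulty score. The PyTorch sketch below shows that general routing pattern only; the module sizes, the sigmoid router, and the 0.5 threshold are assumptions, not the Duo-LLM architecture.

```python
import torch
import torch.nn as nn

class AdaptiveFFN(nn.Module):
    """Toy adaptive-computation layer: easy tokens take the small path."""

    def __init__(self, d_model: int = 64):
        super().__init__()
        self.small = nn.Sequential(nn.Linear(d_model, d_model), nn.GELU(),
                                   nn.Linear(d_model, d_model))
        self.big = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                 nn.Linear(4 * d_model, d_model))
        self.router = nn.Linear(d_model, 1)  # per-token difficulty score

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Boolean mask of "hard" tokens, shape (batch, seq).
        use_big = torch.sigmoid(self.router(x)).squeeze(-1) > 0.5
        out = self.small(x)
        if use_big.any():  # pay the large-FFN cost only for routed tokens
            out[use_big] = self.big(x[use_big])
        return out

x = torch.randn(2, 8, 64)
print(AdaptiveFFN()(x).shape)  # torch.Size([2, 8, 64])
```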

Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization

1 code implementation • 19 Sep 2024 • Mohammad Samragh, Iman Mirzadeh, Keivan Alizadeh Vahid, Fartash Faghri, Minsik Cho, Moin Nabi, Devang Naik, Mehrdad Farajtabar

In this paper, we introduce HyperCloning, a method that can expand the parameters of a pre-trained language model to those of a larger model with increased hidden dimensions.

Language Modeling, +2
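
For the expansion to accelerate pre-training, it has to be function-preserving, so the larger model starts from the smaller model's knowledge rather than from scratch. Here is a numpy sketch of one such expansion for a single linear layer: weights are tiled into a 2x-wider block matrix and halved so the wide layer reproduces the small layer's output on duplicated features. The tile-and-halve convention is an illustrative assumption; the paper applies the idea to full transformer layers, not one matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out = 4, 3
W = rng.normal(size=(d_out, d_in))  # pre-trained small layer
x = rng.normal(size=d_in)

# Tile the weight into a (2*d_out, 2*d_in) block matrix, halving each
# block so the duplicated input halves cancel the duplication.
W_big = np.block([[W, W], [W, W]]) / 2.0
x_big = np.concatenate([x, x])  # the wider model sees duplicated features

y = W @ x
y_big = W_big @ x_big
assert np.allclose(y_big, np.concatenate([y, y]))  # same function, wider layer
print("small:", y)
print("large:", y_big)
```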

LLM in a flash: Efficient Large Language Model Inference with Limited Memory

no code implementations • 12 Dec 2023 • Keivan Alizadeh, Iman Mirzadeh, Dmitry Belenko, Karen Khatamifard, Minsik Cho, Carlo C Del Mundo, Mohammad Rastegari, Mehrdad Farajtabar

These methods collectively enable running models up to twice the size of the available DRAM, with a 4-5x increase in inference speed on CPU and a 20-25x increase on GPU compared to naive loading approaches.

Language Modeling, +2
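
The speedups come from treating flash as the home of the weights and pulling only the parameters a given forward pass needs into DRAM. A toy sketch of that demand-loading pattern with a memory-mapped file appears below; the file layout, dtype, and the "active rows" selection are invented stand-ins, and the paper's windowing and row-column bundling strategies are not reproduced.

```python
import mmap
import numpy as np

ROWS, COLS = 1024, 256
weights = np.random.rand(ROWS, COLS).astype(np.float32)
weights.tofile("weights.bin")  # stand-in for a model stored in flash

with open("weights.bin", "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # View the whole file without loading it; pages fault in on demand.
    flash = np.frombuffer(mm, dtype=np.float32).reshape(ROWS, COLS)

    def load_rows(idx: np.ndarray) -> np.ndarray:
        # Copy only the needed rows into DRAM (e.g., rows a sparsity
        # predictor marked as producing nonzero activations).
        return flash[idx].copy()

    active = np.array([3, 17, 512])  # hypothetical "active neuron" rows
    dram_chunk = load_rows(active)
    print(dram_chunk.shape)  # (3, 256)
```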

ActiLabel: A Combinatorial Transfer Learning Framework for Activity Recognition

no code implementations • 16 Mar 2020 • Parastoo Alinia, Iman Mirzadeh, Hassan Ghasemzadeh

Sensor-based human activity recognition has become a critical component of many emerging applications ranging from behavioral medicine to gaming.

Diversity, Human Activity Recognition, +1
