Search Results for author: Mehrdad Farajtabar

Found 43 papers, 14 papers with code

Weight subcloning: direct initialization of transformers using larger pretrained ones

no code implementations 14 Dec 2023 Mohammad Samragh, Mehrdad Farajtabar, Sachin Mehta, Raviteja Vemulapalli, Fartash Faghri, Devang Naik, Oncel Tuzel, Mohammad Rastegari

The usual practice of transfer learning overcomes this challenge by initializing the model with the weights of a pretrained model of the same size and specification to improve convergence and training speed.

Image Classification Transfer Learning
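
The subcloning idea in the title can be sketched loosely: initialize a smaller transformer from an evenly spaced subset of the larger model's blocks and truncate each weight matrix to the smaller dimensions. The snippet below is only illustrative; the paper selects neurons by importance ranking rather than plain slicing, and all names here are hypothetical.

```python
import numpy as np

def subclone_matrix(w_big: np.ndarray, out_dim: int, in_dim: int) -> np.ndarray:
    """Initialize a smaller weight matrix from a larger pretrained one.
    Plain truncation is shown here; the paper ranks neurons by importance
    before removing them, so this is a simplification."""
    return w_big[:out_dim, :in_dim].copy()

def subclone_layers(big_blocks: list, num_small: int) -> list:
    """Pick an evenly spaced subset of the larger model's transformer
    blocks to initialize the smaller model's blocks."""
    idx = np.linspace(0, len(big_blocks) - 1, num_small).round().astype(int)
    return [big_blocks[i] for i in idx]
```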

LLM in a flash: Efficient Large Language Model Inference with Limited Memory

no code implementations 12 Dec 2023 Keivan Alizadeh, Iman Mirzadeh, Dmitry Belenko, Karen Khatamifard, Minsik Cho, Carlo C Del Mundo, Mohammad Rastegari, Mehrdad Farajtabar

These methods collectively enable running models up to twice the size of the available DRAM, with a 4-5x and 20-25x increase in inference speed compared to naive loading approaches in CPU and GPU, respectively.

Language Modelling Large Language Model +1

Knowledge Transfer from Vision Foundation Models for Efficient Training of Small Task-specific Models

no code implementations 30 Nov 2023 Raviteja Vemulapalli, Hadi Pouransari, Fartash Faghri, Sachin Mehta, Mehrdad Farajtabar, Mohammad Rastegari, Oncel Tuzel

Motivated by this, we ask the following important question: "How can we leverage the knowledge from a large VFM to train a small task-specific model for a new target task with limited labeled training data?"

Image Retrieval Retrieval +1

TiC-CLIP: Continual Training of CLIP Models

1 code implementation 24 Oct 2023 Saurabh Garg, Mehrdad Farajtabar, Hadi Pouransari, Raviteja Vemulapalli, Sachin Mehta, Oncel Tuzel, Vaishaal Shankar, Fartash Faghri

We introduce the first set of web-scale Time-Continual (TiC) benchmarks for training vision-language models: TiC-DataComp, TiC-YFCC, and TiC-Redcaps.

Continual Learning Retrieval

An Empirical Study of Implicit Regularization in Deep Offline RL

no code implementations 5 Jul 2022 Caglar Gulcehre, Srivatsan Srinivasan, Jakub Sygnowski, Georg Ostrovski, Mehrdad Farajtabar, Matt Hoffman, Razvan Pascanu, Arnaud Doucet

Also, we empirically identify three phases of learning that explain the impact of implicit regularization on the learning dynamics, and we find that bootstrapping alone is insufficient to explain the collapse of the effective rank.

Offline RL
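
The effective rank mentioned in the snippet is commonly measured as the exponential of the entropy of the normalized singular-value spectrum (Roy & Vetterli, 2007); a minimal sketch, which may differ in detail from the estimator the paper uses:

```python
import numpy as np

def effective_rank(features: np.ndarray, eps: float = 1e-12) -> float:
    """Effective rank via the entropy of the normalized singular values."""
    s = np.linalg.svd(features, compute_uv=False)
    p = s / (s.sum() + eps)                # spectrum as a distribution
    entropy = -(p * np.log(p + eps)).sum()
    return float(np.exp(entropy))          # in [1, min(features.shape)]

# Random 256 x 64 feature matrix: effective rank comes out close to 64;
# a "collapsed" representation would score far lower.
phi = np.random.randn(256, 64)
print(effective_rank(phi))
```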

Continual Learning Beyond a Single Model

no code implementations 20 Feb 2022 Thang Doan, Seyed Iman Mirzadeh, Mehrdad Farajtabar

A growing body of research in continual learning focuses on the catastrophic forgetting problem.

Continual Learning

Architecture Matters in Continual Learning

no code implementations 1 Feb 2022 Seyed Iman Mirzadeh, Arslan Chaudhry, Dong Yin, Timothy Nguyen, Razvan Pascanu, Dilan Gorur, Mehrdad Farajtabar

However, in this work, we show that the choice of architecture can significantly impact continual learning performance, and that different architectures lead to different trade-offs between the ability to remember previous tasks and the ability to learn new ones.

Continual Learning

Wide Neural Networks Forget Less Catastrophically

no code implementations 21 Oct 2021 Seyed Iman Mirzadeh, Arslan Chaudhry, Dong Yin, Huiyi Hu, Razvan Pascanu, Dilan Gorur, Mehrdad Farajtabar

A primary focus area in continual learning research is alleviating the "catastrophic forgetting" problem in neural networks by designing new algorithms that are more robust to distribution shifts.

Continual Learning

Task-agnostic Continual Learning with Hybrid Probabilistic Models

no code implementations ICML Workshop INNF 2021 Polina Kirichenko, Mehrdad Farajtabar, Dushyant Rao, Balaji Lakshminarayanan, Nir Levine, Ang Li, Huiyi Hu, Andrew Gordon Wilson, Razvan Pascanu

Learning new tasks continuously without forgetting on a constantly changing data distribution is essential for real-world problems but extremely challenging for modern deep learning.

Anomaly Detection Continual Learning +1

Balance Regularized Neural Network Models for Causal Effect Estimation

no code implementations 23 Nov 2020 Mehrdad Farajtabar, Andrew Lee, Yuanjian Feng, Vishal Gupta, Peter Dolan, Harish Chandran, Martin Szummer

Estimating individual and average treatment effects from observational data is an important problem in many domains such as healthcare and e-commerce.

Representation Learning

Linear Mode Connectivity in Multitask and Continual Learning

1 code implementation ICLR 2021 Seyed Iman Mirzadeh, Mehrdad Farajtabar, Dilan Gorur, Razvan Pascanu, Hassan Ghasemzadeh

Continual (sequential) training and multitask (simultaneous) training often attempt to solve the same overall objective: to find a solution that performs well on all considered tasks.

Continual Learning Linear Mode Connectivity
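
Linear mode connectivity can be probed by evaluating the loss along the straight line between two solutions in weight space; a minimal PyTorch sketch, with illustrative names not taken from the paper's code:

```python
import copy
import torch

@torch.no_grad()
def loss_on_line(model_a, model_b, loss_fn, x, y, alphas):
    """Loss along w(alpha) = (1 - alpha) * w_A + alpha * w_B.
    A flat, low-loss path between the two solutions is the
    linear mode connectivity studied in the paper."""
    model = copy.deepcopy(model_a)
    params_a = dict(model_a.named_parameters())
    params_b = dict(model_b.named_parameters())
    losses = []
    for alpha in alphas:
        for name, p in model.named_parameters():
            p.copy_((1 - alpha) * params_a[name] + alpha * params_b[name])
        losses.append(loss_fn(model(x), y).item())
    return losses
```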

Optimization and Generalization of Regularization-Based Continual Learning: a Loss Approximation Viewpoint

no code implementations 19 Jun 2020 Dong Yin, Mehrdad Farajtabar, Ang Li, Nir Levine, Alex Mott

This problem is often referred to as catastrophic forgetting, a key challenge in continual learning of neural networks.

Continual Learning

A maximum-entropy approach to off-policy evaluation in average-reward MDPs

no code implementations NeurIPS 2020 Nevena Lazic, Dong Yin, Mehrdad Farajtabar, Nir Levine, Dilan Gorur, Chris Harris, Dale Schuurmans

This work focuses on off-policy evaluation (OPE) with function approximation in infinite-horizon undiscounted Markov decision processes (MDPs).

Off-policy evaluation

Understanding the Role of Training Regimes in Continual Learning

4 code implementations NeurIPS 2020 Seyed Iman Mirzadeh, Mehrdad Farajtabar, Razvan Pascanu, Hassan Ghasemzadeh

However, there has been limited prior work extensively analyzing the impact that different training regimes -- learning rate, batch size, regularization method -- can have on forgetting.

Continual Learning

Learning to Incentivize Other Learning Agents

2 code implementations NeurIPS 2020 Jiachen Yang, Ang Li, Mehrdad Farajtabar, Peter Sunehag, Edward Hughes, Hongyuan Zha

The challenge of developing powerful and general Reinforcement Learning (RL) agents has received increasing attention in recent years.

General Reinforcement Learning Reinforcement Learning (RL)

Dropout as an Implicit Gating Mechanism For Continual Learning

2 code implementations 24 Apr 2020 Seyed-Iman Mirzadeh, Mehrdad Farajtabar, Hassan Ghasemzadeh

However, it is more reliable to preserve the knowledge a network has learned from previous tasks.

Continual Learning

Self-Distillation Amplifies Regularization in Hilbert Space

no code implementations NeurIPS 2020 Hossein Mobahi, Mehrdad Farajtabar, Peter L. Bartlett

Knowledge distillation, introduced in the deep learning context, is a method to transfer knowledge from one architecture to another.

Knowledge Distillation L2 Regularization

Orthogonal Gradient Descent for Continual Learning

no code implementations 15 Oct 2019 Mehrdad Farajtabar, Navid Azizan, Alex Mott, Ang Li

In this paper, we propose to address this issue from a parameter space perspective and study an approach to restrict the direction of the gradient updates to avoid forgetting previously-learned data.

Continual Learning
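
The core projection step can be sketched as Gram-Schmidt against a memory of stored gradient directions; a simplified version of the idea, not the paper's exact implementation:

```python
import torch

def project_orthogonal(grad: torch.Tensor, memory: list) -> torch.Tensor:
    """Remove from `grad` (a flattened gradient) its components along
    stored orthonormal gradient directions of previous tasks, so the
    update minimally disturbs what was already learned."""
    g = grad.clone()
    for v in memory:
        g -= (g @ v) * v
    return g

# After finishing a task, flatten and orthonormalize (Gram-Schmidt)
# gradients of that task's examples and append them to `memory`.
```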

Cross-View Policy Learning for Street Navigation

1 code implementation ICCV 2019 Ang Li, Huiyi Hu, Piotr Mirowski, Mehrdad Farajtabar

The ability to navigate from visual observations in unfamiliar environments is a core component of intelligent agents and an ongoing challenge for Deep Reinforcement Learning (RL).

Navigate Reinforcement Learning (RL) +1

DyRep: Learning Representations over Dynamic Graphs

2 code implementations ICLR 2019 Rakshit Trivedi, Mehrdad Farajtabar, Prasenjeet Biswal, Hongyuan Zha

We present DyRep, a novel modeling framework for dynamic graphs that posits representation learning as a latent mediation process bridging two observed processes: dynamics of the network (realized as topological evolution) and dynamics on the network (realized as activities between nodes).

Dynamic Link Prediction Representation Learning

Improved Knowledge Distillation via Teacher Assistant

3 code implementations 9 Feb 2019 Seyed-Iman Mirzadeh, Mehrdad Farajtabar, Ang Li, Nir Levine, Akihiro Matsukawa, Hassan Ghasemzadeh

To alleviate this shortcoming, we introduce multi-step knowledge distillation, which employs an intermediate-sized network (teacher assistant) to bridge the gap between the student and the teacher.

Knowledge Distillation
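
The distillation loss used at each step of such a chain (teacher to assistant, then assistant to student) is typically the standard temperature-softened KL term mixed with cross-entropy; a minimal sketch under that assumption:

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, lam=0.5):
    """Standard distillation loss: soften both distributions with
    temperature T and mix the KL term with the usual cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)                        # rescale gradients by T^2
    hard = F.cross_entropy(student_logits, labels)
    return lam * soft + (1 - lam) * hard
```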

Adapting Auxiliary Losses Using Gradient Similarity

1 code implementation 5 Dec 2018 Yunshu Du, Wojciech M. Czarnecki, Siddhant M. Jayakumar, Mehrdad Farajtabar, Razvan Pascanu, Balaji Lakshminarayanan

One approach to deal with the statistical inefficiency of neural networks is to rely on auxiliary losses that help to build useful representations.

Atari Games reinforcement-learning +1
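
The gradient-similarity idea can be sketched as follows: keep the auxiliary gradient only when it does not conflict with the main-task gradient (positive cosine similarity). This is a simplification of the paper's proposal, with illustrative names:

```python
import torch
import torch.nn.functional as F

def combine_with_aux(main_loss, aux_loss, params):
    """Keep the auxiliary gradient only when it is aligned with the
    main-task gradient (positive cosine similarity)."""
    g_main = torch.autograd.grad(main_loss, params, retain_graph=True)
    g_aux = torch.autograd.grad(aux_loss, params, retain_graph=True)
    flat_main = torch.cat([g.reshape(-1) for g in g_main])
    flat_aux = torch.cat([g.reshape(-1) for g in g_aux])
    cos = F.cosine_similarity(flat_main, flat_aux, dim=0)
    keep = (cos > 0).float()           # drop aux signal if it conflicts
    return [gm + keep * ga for gm, ga in zip(g_main, g_aux)]

# Usage: grads = combine_with_aux(main_loss, aux_loss, list(model.parameters()))
#        for p, g in zip(model.parameters(), grads): p.grad = g
```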

Representation Learning over Dynamic Graphs

no code implementations 11 Mar 2018 Rakshit Trivedi, Mehrdad Farajtabar, Prasenjeet Biswal, Hongyuan Zha

How can we effectively encode evolving information over dynamic graphs into low-dimensional representations?

Dynamic Link Prediction Representation Learning

More Robust Doubly Robust Off-policy Evaluation

no code implementations ICML 2018 Mehrdad Farajtabar, Yin-Lam Chow, Mohammad Ghavamzadeh

In particular, we focus on the doubly robust (DR) estimators that consist of an importance sampling (IS) component and a performance model, and utilize the low (or zero) bias of IS and low variance of the model at the same time.

Multi-Armed Bandits Off-policy evaluation
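
For the single-step (contextual bandit) case, the doubly robust estimator described in the snippet combines the model's value estimate with an importance-weighted correction; a minimal sketch:

```python
import numpy as np

def doubly_robust_value(rewards, behavior_probs, target_probs, q_hat, v_hat):
    """Doubly robust estimate of a target policy's value from logged
    bandit data (simplified single-step version).
    rewards[i]        : observed reward for logged action a_i
    behavior_probs[i] : mu(a_i | x_i), probability under the logging policy
    target_probs[i]   : pi(a_i | x_i), probability under the target policy
    q_hat[i]          : model's estimate of Q(x_i, a_i)
    v_hat[i]          : model's estimate of E_{a~pi}[Q(x_i, a)]
    """
    rho = target_probs / behavior_probs       # importance weights
    return np.mean(v_hat + rho * (rewards - q_hat))
```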

Hawkes Processes for Invasive Species Modeling and Management

no code implementations 12 Dec 2017 Amrita Gupta, Mehrdad Farajtabar, Bistra Dilkina, Hongyuan Zha

The spread of invasive species to new areas threatens the stability of ecosystems and causes major economic losses in agriculture and forestry.

Management

Wasserstein Learning of Deep Generative Point Process Models

1 code implementation NeurIPS 2017 Shuai Xiao, Mehrdad Farajtabar, Xiaojing Ye, Junchi Yan, Le Song, Hongyuan Zha

Point processes are becoming very popular in modeling asynchronous sequential data due to their sound mathematical foundation and strength in modeling a variety of real-world phenomena.

Point Processes

Fake News Mitigation via Point Process Based Intervention

no code implementations ICML 2017 Mehrdad Farajtabar, Jiachen Yang, Xiaojing Ye, Huan Xu, Rakshit Trivedi, Elias Khalil, Shuang Li, Le Song, Hongyuan Zha

We propose the first multistage intervention framework that tackles fake news in social networks by combining reinforcement learning with a point process network activity model.

reinforcement-learning Reinforcement Learning (RL)

Recurrent Poisson Factorization for Temporal Recommendation

1 code implementation 4 Mar 2017 Seyed Abbas Hosseini, Keivan Alizadeh, Ali Khodadadi, Ali Arabzadeh, Mehrdad Farajtabar, Hongyuan Zha, Hamid R. Rabiee

Poisson factorization is a probabilistic model of users and items for recommendation systems, where the so-called implicit consumer data is modeled by a factorized Poisson distribution.

Recommendation Systems
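
The factorized Poisson model in the snippet can be written generatively: each user-item count is Poisson with rate given by the inner product of nonnegative user and item factors (the Gamma priors of the full model are only suggested here):

```python
import numpy as np

# Generative sketch of Poisson factorization for implicit feedback.
rng = np.random.default_rng(0)
num_users, num_items, K = 100, 50, 8

theta = rng.gamma(shape=0.3, scale=1.0, size=(num_users, K))  # user preferences
beta = rng.gamma(shape=0.3, scale=1.0, size=(num_items, K))   # item attributes
counts = rng.poisson(theta @ beta.T)   # implicit counts, shape (users, items)
```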

Distilling Information Reliability and Source Trustworthiness from Digital Traces

no code implementations 24 Oct 2016 Behzad Tabibian, Isabel Valera, Mehrdad Farajtabar, Le Song, Bernhard Schölkopf, Manuel Gomez-Rodriguez

Then, we propose a temporal point process modeling framework that links these temporal traces to robust, unbiased and interpretable notions of information reliability and source trustworthiness.

Smart broadcasting: Do you want to be seen?

no code implementations 22 May 2016 Mohammad Reza Karimi, Erfan Tavakoli, Mehrdad Farajtabar, Le Song, Manuel Gomez-Rodriguez

Many users in online social networks are constantly trying to gain attention from their followers by broadcasting posts to them.

Point Processes

Detecting weak changes in dynamic events over networks

no code implementations 29 Mar 2016 Shuang Li, Yao Xie, Mehrdad Farajtabar, Apurv Verma, Le Song

Large volumes of networked streaming event data are becoming increasingly available in a wide variety of applications, such as social network analysis, Internet traffic monitoring, and healthcare analytics.

Change Point Detection Point Processes

Learning Granger Causality for Hawkes Processes

no code implementations 14 Feb 2016 Hongteng Xu, Mehrdad Farajtabar, Hongyuan Zha

In this paper, we propose an effective method for learning Granger causality for a special but significant type of point process: the Hawkes process.

Clustering Point Processes
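
In a multivariate Hawkes process with exponential kernels, dimension j Granger-causes dimension i exactly when the infectivity entry alpha[i, j] is nonzero, which is the structure such methods learn; a minimal sketch of the conditional intensity:

```python
import numpy as np

def hawkes_intensity(t, events, mu, alpha, beta):
    """Conditional intensity of a multivariate Hawkes process with
    exponential kernels.
    events : list of (time, dim) pairs with time < t
    mu     : (D,) baseline rates
    alpha  : (D, D) infectivity matrix; alpha[i, j] > 0 means
             events in dimension j excite dimension i
    beta   : exponential decay rate
    """
    lam = mu.copy()
    for s, j in events:
        lam += alpha[:, j] * np.exp(-beta * (t - s))
    return lam

mu = np.array([0.2, 0.1])
alpha = np.array([[0.5, 0.0], [0.3, 0.4]])  # dim 0 excites dim 1, not vice versa
print(hawkes_intensity(2.0, [(0.5, 0), (1.2, 1)], mu, alpha, beta=1.0))
```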

A Continuous-time Mutually-Exciting Point Process Framework for Prioritizing Events in Social Media

no code implementations 13 Nov 2015 Mehrdad Farajtabar, Safoora Yousefi, Long Q. Tran, Le Song, Hongyuan Zha

In our experiments, we demonstrate that our algorithm achieves state-of-the-art performance in terms of analyzing, predicting, and prioritizing events.

COEVOLVE: A Joint Point Process Model for Information Diffusion and Network Co-evolution

1 code implementation NeurIPS 2015 Mehrdad Farajtabar, Yichen Wang, Manuel Gomez Rodriguez, Shuang Li, Hongyuan Zha, Le Song

Information diffusion in online social networks is affected by the underlying network topology, but it also has the power to change it.

Shaping Social Activity by Incentivizing Users

no code implementations NeurIPS 2014 Mehrdad Farajtabar, Nan Du, Manuel Gomez Rodriguez, Isabel Valera, Hongyuan Zha, Le Song

Events in an online social network can be categorized roughly into endogenous events, where users just respond to the actions of their neighbors within the network, or exogenous events, where users take actions due to drives external to the network.

From Local Similarity to Global Coding: An Application to Image Classification

no code implementations CVPR 2013 Amirreza Shaban, Hamid R. Rabiee, Mehrdad Farajtabar, Marjan Ghazvininejad

By exploiting the local similarity between a descriptor and its nearby bases, a global measure of the descriptor's association with all the bases is computed.

General Classification Image Classification
