Search Results for author: Ali Behrouz

Found 11 papers, 5 papers with code

It's All Connected: A Journey Through Test-Time Memorization, Attentional Bias, Retention, and Online Optimization

no code implementations · 17 Apr 2025 · Ali Behrouz, Meisam Razaviyayn, Peilin Zhong, Vahab Mirrokni

Going beyond these objectives, we present a set of alternative attentional bias configurations, along with effective approximations that stabilize their training.

Language Modeling +2

Titans: Learning to Memorize at Test Time

1 code implementation · 31 Dec 2024 · Ali Behrouz, Peilin Zhong, Vahab Mirrokni

Over more than a decade, there has been an extensive research effort on how to effectively utilize recurrent models and attention.

Common Sense Reasoning, Language Modeling +1

Best of Both Worlds: Advantages of Hybrid Graph Sequence Models

no code implementations · 23 Nov 2024 · Ali Behrouz, Ali Parviz, Mahdi Karami, Clayton Sanford, Bryan Perozzi, Vahab Mirrokni

Our theoretical evaluation of the representation power of Transformers and modern recurrent models, through the lens of global and local graph tasks, shows that both types of models have strengths and weaknesses.

Chimera: Effectively Modeling Multivariate Time Series with 2-Dimensional State Space Models

no code implementations · 6 Jun 2024 · Ali Behrouz, Michele Santacatterina, Ramin Zabih

Despite recent attempts to improve the expressive power of SSMs by using deep structured SSMs, existing methods are limited to univariate time series, fail to model complex (e.g., seasonal) patterns, fail to dynamically model dependencies across the variate and time dimensions, and/or are input-independent.

Anomaly Detection, Mamba +5

MambaMixer: Efficient Selective State Space Models with Dual Token and Channel Selection

no code implementations · 29 Mar 2024 · Ali Behrouz, Michele Santacatterina, Ramin Zabih

Motivated by the success of SSMs, we present MambaMixer, a new architecture with data-dependent weights that uses a dual selection mechanism across tokens and channels, called the Selective Token and Channel Mixer.

channel selection, Image Classification +6
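The dual selection idea in the excerpt above can be sketched as two gated mixing steps, one across the token (sequence) axis and one across the channel (feature) axis, each modulated by a gate computed from the input itself. The sketch below is a simplified, hypothetical illustration of data-dependent token/channel mixing; it omits the selective SSM recurrence that MambaMixer actually uses, and all weight names are invented for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def selective_token_channel_mixer(x, W_tok, W_ch, Wg_tok, Wg_ch):
    """One simplified mixer block: mix across tokens, then across channels,
    each step modulated by a data-dependent (input-computed) gate."""
    # Token mixing: each output position is a weighted sum of all positions.
    tok_mixed = W_tok @ x                      # (L, D)
    g_tok = sigmoid(x @ Wg_tok)                # gate depends on the input
    x = g_tok * tok_mixed + (1 - g_tok) * x    # selectively update or keep

    # Channel mixing: each output feature is a weighted sum of all features.
    ch_mixed = x @ W_ch                        # (L, D)
    g_ch = sigmoid(x @ Wg_ch)
    return g_ch * ch_mixed + (1 - g_ch) * x

rng = np.random.default_rng(0)
L, D = 6, 4                                    # sequence length, channels
x = rng.standard_normal((L, D))
y = selective_token_channel_mixer(
    x,
    W_tok=rng.standard_normal((L, L)),
    W_ch=rng.standard_normal((D, D)),
    Wg_tok=rng.standard_normal((D, D)),
    Wg_ch=rng.standard_normal((D, D)),
)
print(y.shape)  # (6, 4)
```

The gates make the mixing weights a function of the data, which is the "selective" ingredient; a static mixer would apply `W_tok` and `W_ch` unconditionally.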

Graph Mamba: Towards Learning on Graphs with State Space Models

1 code implementation · 13 Feb 2024 · Ali Behrouz, Farnoosh Hashemi

Motivated by the recent success of State Space Models (SSMs), such as Mamba, we present Graph Mamba Networks (GMNs), a general framework for a new class of GNNs based on selective SSMs.

Graph Representation Learning, Mamba +1

CAT-Walk: Inductive Hypergraph Learning via Set Walks

1 code implementation · NeurIPS 2023 · Ali Behrouz, Farnoosh Hashemi, Sadaf Sadeghian, Margo Seltzer

Our evaluation on 10 hypergraph benchmark datasets shows that CAT-Walk attains outstanding performance on temporal hyperedge prediction benchmarks in both inductive and transductive settings.

Hyperedge Prediction, Node Classification +1

CS-TGN: Community Search via Temporal Graph Neural Networks

1 code implementation · 15 Mar 2023 · Farnoosh Hashemi, Ali Behrouz, Milad Rezaei Hajidehi

The evolution of these networks over time has motivated several recent studies to identify local communities in temporal networks.

Community Search, Graph Embedding

Anomaly Detection in Multiplex Dynamic Networks: from Blockchain Security to Brain Disease Prediction

1 code implementation · 15 Nov 2022 · Ali Behrouz, Margo Seltzer

The problem of identifying anomalies in dynamic networks is a fundamental task with a wide range of applications.

Anomaly Detection, Disease Prediction

CS-MLGCN : Multiplex Graph Convolutional Networks for Community Search in Multiplex Networks

no code implementations · 17 Oct 2022 · Ali Behrouz, Farnoosh Hashemi

Existing CS approaches in multiplex networks model communities with pre-defined subgraph patterns, and therefore cannot find real-world communities that do not match those patterns.

Community Search, Graph Embedding

Fast Optimization of Weighted Sparse Decision Trees for use in Optimal Treatment Regimes and Optimal Policy Design

no code implementations · 13 Oct 2022 · Ali Behrouz, Mathias Lecuyer, Cynthia Rudin, Margo Seltzer

Specifically, they rely on the discreteness of the loss function, which means that real-valued weights cannot be directly used.
