Search Results for author: Sridhar Mahadevan

Found 32 papers, 4 papers with code

GAIA: Categorical Foundations of Generative AI

no code implementations 28 Feb 2024 Sridhar Mahadevan

In this paper, we propose GAIA, a generative AI architecture based on category theory.

Zero-th Order Algorithm for Softmax Attention Optimization

no code implementations 17 Jul 2023 Yichuan Deng, Zhihang Li, Sridhar Mahadevan, Zhao Song

We demonstrate the convergence of our algorithm, highlighting its effectiveness in efficiently computing gradients for large-scale LLMs.

Randomized and Deterministic Attention Sparsification Algorithms for Over-parameterized Feature Dimension

no code implementations 10 Apr 2023 Yichuan Deng, Sridhar Mahadevan, Zhao Song

It runs in $\widetilde{O}(\mathrm{nnz}(X) + n^{\omega})$ time, succeeds with probability $1-\delta$, and chooses $m = O(n \log(n/\delta))$.

Sentence

An Over-parameterized Exponential Regression

no code implementations 29 Mar 2023 Yeqi Gao, Sridhar Mahadevan, Zhao Song

Mathematically, we define the neural function $F: \mathbb{R}^{d \times m} \times \mathbb{R}^d \rightarrow \mathbb{R}$ using an exponential activation function.

regression

A Layered Architecture for Universal Causality

no code implementations 18 Dec 2022 Sridhar Mahadevan

At the second layer, causal models are defined by a graph-type category.

Causal Inference LEMMA

Privacy Aware Experiments without Cookies

no code implementations 3 Nov 2022 Shiv Shankar, Ritwik Sinha, Saayan Mitra, Viswanathan Swaminathan, Sridhar Mahadevan, Moumita Sinha

We propose a two-stage experimental design, where the two brands only need to agree on high-level aggregate parameters of the experiment to test the alternate experiences.

Experimental Design valid

Unifying Causal Inference and Reinforcement Learning using Higher-Order Category Theory

no code implementations 13 Sep 2022 Sridhar Mahadevan

We present a unified formalism for structure discovery of causal models and predictive state representation (PSR) models in reinforcement learning (RL) using higher-order category theory.

Causal Inference reinforcement-learning +1

Categoroids: Universal Conditional Independence

no code implementations 23 Aug 2022 Sridhar Mahadevan

Categoroids are defined as a hybrid of two categories: one encoding a preordered lattice structure defined by objects and arrows between them; the second dual parameterization involves trigonoidal objects and morphisms defining a conditional independence structure, with bridge morphisms providing the interface between the binary and ternary structures.

Causal Inference

On The Universality of Diagrams for Causal Inference and The Causal Reproducing Property

no code implementations 6 Jul 2022 Sridhar Mahadevan

The second result, the Causal Reproducing Property (CRP), states that any causal influence of an object X on another object Y is representable as a natural transformation between two abstract causal diagrams.

Causal Inference LEMMA

Smoothed Online Combinatorial Optimization Using Imperfect Predictions

no code implementations 23 Apr 2022 Kai Wang, Zhao Song, Georgios Theocharous, Sridhar Mahadevan

Smoothed online combinatorial optimization considers a learner who repeatedly chooses a combinatorial decision to minimize an unknown changing cost function with a penalty on switching decisions in consecutive rounds.

Combinatorial Optimization

Universal Decision Models

no code implementations 28 Oct 2021 Sridhar Mahadevan

Decision objects in a UDM correspond to instances of decision tasks, ranging from causal models and dynamical systems such as Markov decision processes and predictive state representations, to network multiplayer games and Witsenhausen's intrinsic models, which generalize all these previous formalisms.

Causal Inference

Asymptotic Causal Inference

no code implementations 20 Sep 2021 Sridhar Mahadevan

Semantic entropy quantifies the reduction in entropy when edges are removed by causal intervention.

Causal Inference Experimental Design

Causal Inference in Network Economics

no code implementations 20 Sep 2021 Sridhar Mahadevan

Network economics is the study of a rich class of equilibrium problems that occur in the real world, from traffic management to supply chains and two-sided online marketplaces.

Causal Inference Management

Causal Homotopy

no code implementations 20 Sep 2021 Sridhar Mahadevan

Second, a diverse range of graphical models used to represent causal structures can be represented in a unified way in terms of a topological representation of the induced poset structure.

Causal Discovery

Multiscale Manifold Warping

no code implementations 19 Sep 2021 Sridhar Mahadevan, Anup Rao, Georgios Theocharous, Jennifer Healey

Many real-world applications require aligning two temporal sequences, including bioinformatics, handwriting recognition, activity recognition, and human-robot coordination.

Activity Recognition Dynamic Time Warping +2

Regularized Off-Policy TD-Learning

no code implementations NeurIPS 2012 Bo Liu, Sridhar Mahadevan, Ji Liu

We present a novel $l_1$ regularized off-policy convergent TD-learning method (termed RO-TD), which is able to learn sparse representations of value functions with low computational complexity.

feature selection

Proximal Gradient Temporal Difference Learning: Stable Reinforcement Learning with Polynomial Sample Complexity

1 code implementation 6 Jun 2020 Bo Liu, Ian Gemp, Mohammad Ghavamzadeh, Ji Liu, Sridhar Mahadevan, Marek Petrik

In this paper, we introduce proximal gradient temporal difference learning, which provides a principled way of designing and analyzing true stochastic gradient temporal difference learning algorithms.

reinforcement-learning Reinforcement Learning (RL)

Finite-Sample Analysis of Proximal Gradient TD Algorithms

no code implementations 6 Jun 2020 Bo Liu, Ji Liu, Mohammad Ghavamzadeh, Sridhar Mahadevan, Marek Petrik

In this paper, we analyze the convergence rate of the gradient temporal difference learning (GTD) family of algorithms.

Optimizing for the Future in Non-Stationary MDPs

1 code implementation ICML 2020 Yash Chandak, Georgios Theocharous, Shiv Shankar, Martha White, Sridhar Mahadevan, Philip S. Thomas

Most reinforcement learning methods are based upon the key assumption that the transition dynamics and reward functions are fixed, that is, the underlying Markov decision process is stationary.

Global Convergence to the Equilibrium of GANs using Variational Inequalities

no code implementations 4 Aug 2018 Ian Gemp, Sridhar Mahadevan

In optimization, the negative gradient of a function denotes the direction of steepest descent.

Generative Adversarial Network

A Unified Framework for Domain Adaptation using Metric Learning on Manifolds

1 code implementation 28 Apr 2018 Sridhar Mahadevan, Bamdev Mishra, Shalini Ghosh

We present a novel framework for domain adaptation, whereby both geometric and statistical differences between a labeled source domain and unlabeled target domain can be integrated by exploiting the curved Riemannian geometry of statistical manifolds.

Domain Adaptation Metric Learning +1

Online Monotone Games

no code implementations 19 Oct 2017 Ian Gemp, Sridhar Mahadevan

Algorithmic game theory (AGT) focuses on the design and analysis of algorithms for interacting agents, with interactions rigorously formalized within the framework of games.

Reinforcement Learning (RL)

A Manifold Approach to Learning Mutually Orthogonal Subspaces

no code implementations 8 Mar 2017 Stephen Giguere, Francisco Garcia, Sridhar Mahadevan

Although many machine learning algorithms involve learning subspaces with particular characteristics, optimizing a parameter matrix that is constrained to represent a subspace can be challenging.

Domain Adaptation Riemannian optimization

Generative Multi-Adversarial Networks

1 code implementation 5 Nov 2016 Ishan Durugkar, Ian Gemp, Sridhar Mahadevan

Generative adversarial networks (GANs) are a framework for producing a generative model by way of a two-player minimax game.

Ranked #70 on Image Generation on CIFAR-10 (Inception score metric)

Image Generation

Online Monotone Optimization

no code implementations 29 Aug 2016 Ian Gemp, Sridhar Mahadevan

This paper presents a new framework for analyzing and designing no-regret algorithms for dynamic (possibly adversarial) systems.

Inverting Variational Autoencoders for Improved Generative Accuracy

no code implementations 21 Aug 2016 Ian Gemp, Ishan Durugkar, Mario Parente, M. Darby Dyar, Sridhar Mahadevan

Recent advances in semi-supervised learning with deep generative models have shown promise in generalizing from small labeled datasets ($\mathbf{x},\mathbf{y}$) to large unlabeled ones ($\mathbf{x}$).

Deep Reinforcement Learning With Macro-Actions

no code implementations 15 Jun 2016 Ishan P. Durugkar, Clemens Rosenbaum, Stefan Dernbach, Sridhar Mahadevan

Deep reinforcement learning has been shown to be a powerful framework for learning policies from complex high-dimensional sensory inputs to actions in complex tasks, such as the Atari domain.

Atari Games reinforcement-learning +1

Reasoning about Linguistic Regularities in Word Embeddings using Matrix Manifolds

no code implementations 28 Jul 2015 Sridhar Mahadevan, Sarath Chandar

In this paper, we introduce a new approach to capture analogies in continuous word representations, based on modeling not just individual word vectors, but rather the subspaces spanned by groups of words.

Word Embeddings

Proximal Reinforcement Learning: A New Theory of Sequential Decision Making in Primal-Dual Spaces

no code implementations 26 May 2014 Sridhar Mahadevan, Bo Liu, Philip Thomas, Will Dabney, Steve Giguere, Nicholas Jacek, Ian Gemp, Ji Liu

In this paper, we set forth a new vision of reinforcement learning developed by us over the past few years, one that yields mathematically rigorous solutions to longstanding important questions that have remained unresolved: (i) how to design reliable, convergent, and robust reinforcement learning algorithms; (ii) how to guarantee that reinforcement learning satisfies pre-specified "safety" guarantees and remains in a stable region of the parameter space; (iii) how to design "off-policy" temporal difference learning algorithms in a reliable and stable manner; and finally (iv) how to integrate the study of reinforcement learning into the rich theory of stochastic optimization.

Decision Making reinforcement-learning +2

Projected Natural Actor-Critic

no code implementations NeurIPS 2013 Philip S. Thomas, William C. Dabney, Stephen Giguere, Sridhar Mahadevan

Natural actor-critics are a popular class of policy search algorithms for finding locally optimal policies for Markov decision processes.

reinforcement-learning Reinforcement Learning (RL)

Basis Construction from Power Series Expansions of Value Functions

no code implementations NeurIPS 2010 Sridhar Mahadevan, Bo Liu

This paper explores links between basis construction methods in Markov decision processes and power series expansions of value functions.

Unity
