Search Results for author: Anima Anandkumar

Found 209 papers, 89 papers with code

SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers

23 code implementations NeurIPS 2021 Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, Ping Luo

We present SegFormer, a simple, efficient yet powerful semantic segmentation framework which unifies Transformers with lightweight multilayer perceptron (MLP) decoders.

Semantic Segmentation +1

Voyager: An Open-Ended Embodied Agent with Large Language Models

1 code implementation25 May 2023 Guanzhi Wang, Yuqi Xie, Yunfan Jiang, Ajay Mandlekar, Chaowei Xiao, Yuke Zhu, Linxi Fan, Anima Anandkumar

We introduce Voyager, the first LLM-powered embodied lifelong learning agent in Minecraft that continuously explores the world, acquires diverse skills, and makes novel discoveries without human intervention.

Eureka: Human-Level Reward Design via Coding Large Language Models

1 code implementation19 Oct 2023 Yecheng Jason Ma, William Liang, Guanzhi Wang, De-An Huang, Osbert Bastani, Dinesh Jayaraman, Yuke Zhu, Linxi Fan, Anima Anandkumar

The generality of Eureka also enables a new gradient-free in-context learning approach to reinforcement learning from human feedback (RLHF), readily incorporating human inputs to improve the quality and the safety of the generated rewards without model updating.

Decision Making In-Context Learning +1

Speeding up Fourier Neural Operators via Mixed Precision

1 code implementation27 Jul 2023 Colin White, Renbo Tu, Jean Kossaifi, Gennady Pekhimenko, Kamyar Azizzadenesheli, Anima Anandkumar

In this work, we (i) profile memory and runtime for FNO with full and mixed-precision training, (ii) conduct a study on the numerical stability of mixed-precision training of FNO, and (iii) devise a training routine which substantially decreases training time and memory usage (up to 34%), with little or no reduction in accuracy, on the Navier-Stokes and Darcy flow equations.
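
For orientation, a training routine of this kind typically wraps the forward pass in an autocast region and scales the loss to protect half-precision gradients. Below is a minimal sketch of that standard PyTorch pattern on a stand-in model; it is not the paper's FNO-specific routine or its numerical-stability analysis.

    # Minimal mixed-precision training loop (standard PyTorch autocast/GradScaler
    # pattern on a stand-in model, not the paper's FNO-specific routine).
    import torch

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.GELU(),
                                torch.nn.Linear(64, 64)).to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    # GradScaler guards against fp16 gradient underflow; disabled on CPU.
    scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

    for step in range(100):
        x = torch.randn(32, 64, device=device)
        y = torch.randn(32, 64, device=device)
        optimizer.zero_grad(set_to_none=True)
        # The forward pass runs in reduced precision inside the autocast region.
        amp_dtype = torch.float16 if device == "cuda" else torch.bfloat16
        with torch.autocast(device_type=device, dtype=amp_dtype):
            loss = torch.nn.functional.mse_loss(model(x), y)
        scaler.scale(loss).backward()   # backward on the scaled loss
        scaler.step(optimizer)          # unscale gradients, update in fp32
        scaler.update()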

Neural Operator: Learning Maps Between Function Spaces

1 code implementation19 Aug 2021 Nikola Kovachki, Zongyi Li, Burigede Liu, Kamyar Azizzadenesheli, Kaushik Bhattacharya, Andrew Stuart, Anima Anandkumar

The classical development of neural networks has primarily focused on learning mappings between finite dimensional Euclidean spaces or finite sets.

Operator learning
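
As a pointer to the formulation behind this line of work, a neural operator layer composes a pointwise linear map with a learned kernel integral over the domain (simplified here; lifting/projection maps and bias terms are omitted):

    v_{t+1}(x) = \sigma\Big( W\, v_t(x) + \int_{D} \kappa_\theta(x, y)\, v_t(y)\, \mathrm{d}y \Big), \qquad x \in D,

where v_t is the current function, W is a local linear transform, \kappa_\theta is a learned kernel, and \sigma is a pointwise nonlinearity.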

TensorLy: Tensor Learning in Python

1 code implementation29 Oct 2016 Jean Kossaifi, Yannis Panagakis, Anima Anandkumar, Maja Pantic

In addition, using the deep-learning frameworks as backend allows users to easily design and train deep tensorized neural networks.
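
As a quick illustration of the backend mechanism, the sketch below switches TensorLy to the PyTorch backend and fits a small CP decomposition. The API names (set_backend, parafac, cp_to_tensor) follow recent TensorLy releases and may differ slightly in older versions.

    # TensorLy backend switching plus a small CP decomposition (API names per
    # recent TensorLy releases; older versions may differ).
    import numpy as np
    import tensorly as tl
    from tensorly.decomposition import parafac

    tl.set_backend("pytorch")                    # subsequent tensors are torch tensors
    X = tl.tensor(np.random.rand(8, 8, 8))
    weights, factors = parafac(X, rank=3)        # rank-3 CP decomposition
    X_hat = tl.cp_to_tensor((weights, factors))  # reconstruct to check the fit
    print(tl.norm(X - X_hat) / tl.norm(X))       # relative reconstruction error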

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

1 code implementation6 Mar 2024 Jiawei Zhao, Zhenyu Zhang, Beidi Chen, Zhangyang Wang, Anima Anandkumar, Yuandong Tian

Our approach reduces memory usage by up to 65.5% in optimizer states while maintaining both efficiency and performance for pre-training on LLaMA 1B and 7B architectures with the C4 dataset with up to 19.7B tokens, and on fine-tuning RoBERTa on GLUE tasks.
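
The underlying idea admits a compact sketch: project each weight-matrix gradient onto a low-rank subspace obtained from its SVD, keep optimizer statistics in that subspace, and project the update back to the full parameter space. The toy code below illustrates that idea only; it is not the released GaLore optimizer, and the rank, refresh interval, and momentum rule are illustrative choices.

    # Rough sketch of low-rank gradient projection (the idea behind GaLore), not
    # the released implementation: project gradients into a rank-r subspace, keep
    # momentum there, and project the update back to the full weight matrix.
    import torch

    def low_rank_step(weight, grad, state, rank=4, lr=1e-3, beta=0.9, refresh=200):
        step = state.get("step", 0)
        if step % refresh == 0 or "P" not in state:       # periodically refresh subspace
            U, _, _ = torch.linalg.svd(grad, full_matrices=False)
            state["P"] = U[:, :rank]                       # projection matrix (m x r)
            state["m"] = torch.zeros(rank, grad.shape[1])  # momentum lives in the subspace
        P = state["P"]
        g_low = P.T @ grad                                 # projected gradient (r x n)
        state["m"] = beta * state["m"] + (1 - beta) * g_low
        weight -= lr * (P @ state["m"])                    # project back and apply update
        state["step"] = step + 1

    W, state = torch.randn(64, 32), {}
    for _ in range(5):
        g = torch.randn(64, 32)          # stand-in for a real gradient
        low_rank_step(W, g, state)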

VIMA: General Robot Manipulation with Multimodal Prompts

2 code implementations6 Oct 2022 Yunfan Jiang, Agrim Gupta, Zichen Zhang, Guanzhi Wang, Yongqiang Dou, Yanjun Chen, Li Fei-Fei, Anima Anandkumar, Yuke Zhu, Linxi Fan

We show that a wide spectrum of robot manipulation tasks can be expressed with multimodal prompts, interleaving textual and visual tokens.

Imitation Learning Language Modelling +3

Understanding The Robustness in Vision Transformers

2 code implementations26 Apr 2022 Daquan Zhou, Zhiding Yu, Enze Xie, Chaowei Xiao, Anima Anandkumar, Jiashi Feng, Jose M. Alvarez

Our study is motivated by the intriguing properties of the emerging visual grouping in Vision Transformers, which indicates that self-attention may promote robustness through improved mid-level representations.

Ranked #4 on Domain Generalization on ImageNet-R (using extra training data)

Domain Generalization Image Classification +3

LeanDojo: Theorem Proving with Retrieval-Augmented Language Models

3 code implementations NeurIPS 2023 Kaiyu Yang, Aidan M. Swope, Alex Gu, Rahul Chalamala, Peiyang Song, Shixing Yu, Saad Godil, Ryan Prenger, Anima Anandkumar

Using this data, we develop ReProver (Retrieval-Augmented Prover): an LLM-based prover augmented with retrieval for selecting premises from a vast math library.

Automated Theorem Proving Math +1
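
As a rough picture of premise retrieval, the sketch below embeds a goal and candidate premises, ranks premises by cosine similarity, and keeps the top-k. TF-IDF stands in for ReProver's trained retrieval encoder, and the premise strings are made up; this is not LeanDojo's API.

    # Generic premise-retrieval sketch: rank library premises against the current
    # proof goal by cosine similarity. TF-IDF is a stand-in for a trained encoder.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    premises = [
        "add_comm : addition of natural numbers is commutative",
        "mul_comm : multiplication of natural numbers is commutative",
        "length_append : length of the concatenation of two lists",
    ]
    goal = "show that addition of two natural numbers is commutative"

    vectorizer = TfidfVectorizer()
    premise_vecs = vectorizer.fit_transform(premises)   # embed the premise library
    goal_vec = vectorizer.transform([goal])             # embed the current proof goal

    scores = cosine_similarity(goal_vec, premise_vecs)[0]
    top_k = scores.argsort()[::-1][:2]                  # best-matching premises first
    print([premises[i] for i in top_k])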

Pre-Trained Language Models for Interactive Decision-Making

1 code implementation3 Feb 2022 Shuang Li, Xavier Puig, Chris Paxton, Yilun Du, Clinton Wang, Linxi Fan, Tao Chen, De-An Huang, Ekin Akyürek, Anima Anandkumar, Jacob Andreas, Igor Mordatch, Antonio Torralba, Yuke Zhu

Together, these results suggest that language modeling induces representations that are useful for modeling not just language, but also goals and plans; these representations can aid learning and generalization even outside of language processing.

Imitation Learning Language Modelling

Long-Short Transformer: Efficient Transformers for Language and Vision

3 code implementations NeurIPS 2021 Chen Zhu, Wei Ping, Chaowei Xiao, Mohammad Shoeybi, Tom Goldstein, Anima Anandkumar, Bryan Catanzaro

For instance, Transformer-LS achieves 0.97 test BPC on enwik8 using half the number of parameters of the previous method, while being faster and able to handle sequences 3x as long as its full-attention version on the same hardware.

Language Modelling

FreeSOLO: Learning to Segment Objects without Annotations

1 code implementation CVPR 2022 Xinlong Wang, Zhiding Yu, Shalini De Mello, Jan Kautz, Anima Anandkumar, Chunhua Shen, Jose M. Alvarez

FreeSOLO further demonstrates superiority as a strong pre-training method, outperforming state-of-the-art self-supervised pre-training methods by +9.8% AP when fine-tuning instance segmentation with only 5% COCO masks.

Instance Segmentation object-detection +4

Neural Operator: Graph Kernel Network for Partial Differential Equations

6 code implementations ICLR Workshop DeepDiffEq 2019 Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, Anima Anandkumar

The classical development of neural networks has been primarily for mappings between a finite-dimensional Euclidean space and a set of classes, or between two finite-dimensional Euclidean spaces.

Multipole Graph Neural Operator for Parametric Partial Differential Equations

4 code implementations NeurIPS 2020 Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, Anima Anandkumar

One of the main challenges in using deep learning-based methods for simulating physical systems and solving partial differential equations (PDEs) is formulating physics-based data in the desired structure for neural networks.

Fast Training of Diffusion Models with Masked Transformers

1 code implementation15 Jun 2023 Hongkai Zheng, Weili Nie, Arash Vahdat, Anima Anandkumar

For masked training, we introduce an asymmetric encoder-decoder architecture consisting of a transformer encoder that operates only on unmasked patches and a lightweight transformer decoder on full patches.

Denoising Representation Learning

MinVIS: A Minimal Video Instance Segmentation Framework without Video-based Training

2 code implementations3 Aug 2022 De-An Huang, Zhiding Yu, Anima Anandkumar

By only training a query-based image instance segmentation model, MinVIS outperforms the previous best result on the challenging Occluded VIS dataset by over 10% AP.

Instance Segmentation Segmentation +2

Spherical Fourier Neural Operators: Learning Stable Dynamics on the Sphere

2 code implementations6 Jun 2023 Boris Bonev, Thorsten Kurth, Christian Hundt, Jaideep Pathak, Maximilian Baust, Karthik Kashinath, Anima Anandkumar

Fourier Neural Operators (FNOs) have proven to be an efficient and effective method for resolution-independent operator learning in a broad variety of application areas across scientific machine learning.

Operator learning
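
For readers new to FNOs, the sketch below shows the kind of spectral convolution such operators stack: FFT the input, apply learned complex weights to the lowest modes, and transform back. It is a simplified 1-D planar version, not the spherical harmonic transform this paper introduces, and far smaller than the released neuraloperator code.

    # Minimal 1-D spectral convolution in the spirit of FNO (simplified sketch):
    # FFT, learned complex weights on the lowest modes, zeros elsewhere, inverse FFT.
    import torch

    class SpectralConv1d(torch.nn.Module):
        def __init__(self, channels, modes):
            super().__init__()
            self.modes = modes
            scale = 1.0 / channels
            self.weight = torch.nn.Parameter(
                scale * torch.randn(channels, channels, modes, dtype=torch.cfloat))

        def forward(self, x):                      # x: (batch, channels, grid)
            x_ft = torch.fft.rfft(x)               # complex spectrum
            out_ft = torch.zeros_like(x_ft)
            out_ft[..., :self.modes] = torch.einsum(
                "bix,iox->box", x_ft[..., :self.modes], self.weight)
            return torch.fft.irfft(out_ft, n=x.shape[-1])

    layer = SpectralConv1d(channels=8, modes=12)
    u = torch.randn(4, 8, 64)                      # functions sampled on a 64-point grid
    print(layer(u).shape)                          # (4, 8, 64); weights are grid-independent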

Learning Dissipative Dynamics in Chaotic Systems

2 code implementations13 Jun 2021 Zongyi Li, Miguel Liu-Schiaffini, Nikola Kovachki, Burigede Liu, Kamyar Azizzadenesheli, Kaushik Bhattacharya, Andrew Stuart, Anima Anandkumar

Chaotic systems are notoriously challenging to predict because of their sensitivity to perturbations and errors due to time stepping.

Diffusion Models for Adversarial Purification

2 code implementations16 May 2022 Weili Nie, Brandon Guo, Yujia Huang, Chaowei Xiao, Arash Vahdat, Anima Anandkumar

Adversarial purification refers to a class of defense methods that remove adversarial perturbations using a generative model.

InRank: Incremental Low-Rank Learning

1 code implementation20 Jun 2023 Jiawei Zhao, Yifei Zhang, Beidi Chen, Florian Schäfer, Anima Anandkumar

To remedy this, we design a new training algorithm Incremental Low-Rank Learning (InRank), which explicitly expresses cumulative weight updates as low-rank matrices while incrementally augmenting their ranks during training.

Computational Efficiency

I$^2$SB: Image-to-Image Schrödinger Bridge

1 code implementation12 Feb 2023 Guan-Horng Liu, Arash Vahdat, De-An Huang, Evangelos A. Theodorou, Weili Nie, Anima Anandkumar

We propose Image-to-Image Schrödinger Bridge (I$^2$SB), a new class of conditional diffusion models that directly learn the nonlinear diffusion processes between two given distributions.

Deblurring Image Restoration +1

Adaptive Fourier Neural Operators: Efficient Token Mixers for Transformers

2 code implementations24 Nov 2021 John Guibas, Morteza Mardani, Zongyi Li, Andrew Tao, Anima Anandkumar, Bryan Catanzaro

AFNO is based on a principled foundation of operator learning which allows us to frame token mixing as a continuous global convolution without any dependence on the input resolution.

Computational Efficiency Operator learning +1

Multi-modal Molecule Structure-text Model for Text-based Retrieval and Editing

1 code implementation21 Dec 2022 Shengchao Liu, Weili Nie, Chengpeng Wang, Jiarui Lu, Zhuoran Qiao, Ling Liu, Jian Tang, Chaowei Xiao, Anima Anandkumar

Here we present a multi-modal molecule structure-text model, MoleculeSTM, by jointly learning molecules' chemical structures and textual descriptions via a contrastive learning strategy.

Contrastive Learning Drug Discovery +2
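
The contrastive objective in this family of models can be sketched as a symmetric InfoNCE loss between paired embeddings. In the code below the random tensors stand in for molecule-structure and text-encoder outputs; it is a generic sketch, not MoleculeSTM's actual encoders or training setup.

    # Generic symmetric InfoNCE loss between paired embeddings; random tensors
    # stand in for molecule-structure and text-encoder outputs.
    import torch
    import torch.nn.functional as F

    def contrastive_loss(z_mol, z_txt, temperature=0.07):
        z_mol = F.normalize(z_mol, dim=-1)
        z_txt = F.normalize(z_txt, dim=-1)
        logits = z_mol @ z_txt.T / temperature      # (batch, batch) similarity matrix
        targets = torch.arange(z_mol.shape[0])      # matching pairs lie on the diagonal
        return 0.5 * (F.cross_entropy(logits, targets) +
                      F.cross_entropy(logits.T, targets))

    z_mol = torch.randn(16, 128)   # stand-in for molecule-structure embeddings
    z_txt = torch.randn(16, 128)   # stand-in for text-description embeddings
    print(contrastive_loss(z_mol, z_txt))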

Open Vocabulary Learning on Source Code with a Graph-Structured Cache

3 code implementations ICLR 2019 Milan Cvitkovic, Badal Singh, Anima Anandkumar

Machine learning models that take computer program source code as input typically use Natural Language Processing (NLP) techniques.

Code Completion

Probabilistic FastText for Multi-Sense Word Embeddings

1 code implementation ACL 2018 Ben Athiwaratkun, Andrew Gordon Wilson, Anima Anandkumar

We introduce Probabilistic FastText, a new model for word embeddings that can capture multiple word senses, sub-word structure, and uncertainty information.

Word Embeddings Word Similarity

Fourier Neural Operator with Learned Deformations for PDEs on General Geometries

6 code implementations11 Jul 2022 Zongyi Li, Daniel Zhengyu Huang, Burigede Liu, Anima Anandkumar

The resulting geo-FNO model has both the computational efficiency of FFT and the flexibility of handling arbitrary geometries.

Neural-Fly Enables Rapid Learning for Agile Flight in Strong Winds

1 code implementation13 May 2022 Michael O'Connell, Guanya Shi, Xichen Shi, Kamyar Azizzadenesheli, Anima Anandkumar, Yisong Yue, Soon-Jo Chung

Last, our control design extrapolates to unseen wind conditions, is shown to be effective for outdoor flights with only onboard sensors, and can transfer across drones with minimal performance degradation.

Meta-Learning

FocalFormer3D: Focusing on Hard Instance for 3D Object Detection

1 code implementation ICCV 2023 Yilun Chen, Zhiding Yu, Yukang Chen, Shiyi Lan, Anima Anandkumar, Jiaya Jia, Jose M. Alvarez

For 3D object detection, we instantiate this method as FocalFormer3D, a simple yet effective detector that excels at excavating difficult objects and improving prediction recall.

3D Object Detection Autonomous Driving +2

Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models

2 code implementations15 Sep 2022 Manli Shu, Weili Nie, De-An Huang, Zhiding Yu, Tom Goldstein, Anima Anandkumar, Chaowei Xiao

In evaluating cross-dataset generalization with unseen categories, TPT performs on par with the state-of-the-art approaches that use additional training data.

Image Classification Zero-shot Generalization
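
The flavor of test-time tuning here can be sketched as entropy minimization over augmented views with only a prompt parameter trainable. The toy code below uses a random frozen linear head and Gaussian-perturbed features as stand-ins for the CLIP-based pipeline in the paper.

    # Toy test-time entropy minimization over augmented views; the frozen linear
    # head and noisy features are placeholders for the paper's CLIP pipeline.
    import torch
    import torch.nn.functional as F

    torch.manual_seed(0)
    frozen = torch.nn.Linear(64, 10)                 # stand-in for a frozen VLM head
    for p in frozen.parameters():
        p.requires_grad_(False)

    prompt = torch.zeros(64, requires_grad=True)     # the only tunable parameter
    opt = torch.optim.AdamW([prompt], lr=5e-3)

    image_feat = torch.randn(64)
    views = [image_feat + 0.1 * torch.randn(64) for _ in range(8)]   # "augmented" views

    for _ in range(10):
        probs = torch.stack([F.softmax(frozen(v + prompt), dim=-1) for v in views])
        marginal = probs.mean(dim=0)                 # average prediction over views
        entropy = -(marginal * marginal.clamp_min(1e-9).log()).sum()
        opt.zero_grad()
        entropy.backward()
        opt.step()

    print(F.softmax(frozen(image_feat + prompt), dim=-1).argmax())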

Competitive Gradient Descent

8 code implementations NeurIPS 2019 Florian Schäfer, Anima Anandkumar

We introduce a new algorithm for the numerical computation of Nash equilibria of competitive two-player games.

Implicit competitive regularization in GANs

3 code implementations ICML 2020 Florian Schäfer, Hongkai Zheng, Anima Anandkumar

We show that opponent-aware modelling of generator and discriminator, as present in competitive gradient descent (CGD), can significantly strengthen ICR and thus stabilize GAN training without explicit regularization.

Image Generation

OSCAR: Data-Driven Operational Space Control for Adaptive and Robust Robot Manipulation

1 code implementation2 Oct 2021 Josiah Wong, Viktor Makoviychuk, Anima Anandkumar, Yuke Zhu

Operational Space Control (OSC) has been used as an effective task-space controller for manipulation.

Robot Manipulation

MeshfreeFlowNet: A Physics-Constrained Deep Continuous Space-Time Super-Resolution Framework

1 code implementation1 May 2020 Chiyu Max Jiang, Soheil Esmaeilzadeh, Kamyar Azizzadenesheli, Karthik Kashinath, Mustafa Mustafa, Hamdi A. Tchelepi, Philip Marcus, Prabhat, Anima Anandkumar

We propose MeshfreeFlowNet, a novel deep learning-based super-resolution framework to generate continuous (grid-free) spatio-temporal solutions from the low-resolution inputs.

Super-Resolution

Symmetry-Informed Geometric Representation for Molecules, Proteins, and Crystalline Materials

1 code implementation NeurIPS 2023 Shengchao Liu, Weitao Du, Yanjing Li, Zhuoxinran Li, Zhiling Zheng, Chenru Duan, ZhiMing Ma, Omar Yaghi, Anima Anandkumar, Christian Borgs, Jennifer Chayes, Hongyu Guo, Jian Tang

Artificial intelligence for scientific discovery has recently generated significant interest within the machine learning and scientific communities, particularly in the domains of chemistry, biology, and material discovery.

Benchmarking

T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching

1 code implementation21 Feb 2024 Zizheng Pan, Bohan Zhuang, De-An Huang, Weili Nie, Zhiding Yu, Chaowei Xiao, Jianfei Cai, Anima Anandkumar

Sampling from diffusion probabilistic models (DPMs) is often expensive for high-quality image generation and typically requires many steps with a large model.

Image Generation

signSGD: Compressed Optimisation for Non-Convex Problems

3 code implementations ICML 2018 Jeremy Bernstein, Yu-Xiang Wang, Kamyar Azizzadenesheli, Anima Anandkumar

Using a theorem by Gauss we prove that majority vote can achieve the same reduction in variance as full precision distributed SGD.

signSGD with Majority Vote is Communication Efficient And Fault Tolerant

3 code implementations ICLR 2019 Jeremy Bernstein, Jia-Wei Zhao, Kamyar Azizzadenesheli, Anima Anandkumar

Workers transmit only the sign of their gradient vector to a server, and the overall update is decided by a majority vote.

Benchmarking
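
The communication scheme is simple enough to sketch end to end: each worker sends only the sign of its gradient, and the server returns the coordinate-wise majority vote. The toy NumPy version below illustrates this; real implementations operate per layer on GPU tensors.

    # Toy sign compression with a majority-vote server (per-coordinate votes).
    import numpy as np

    def worker_message(grad):
        return np.sign(grad).astype(np.int8)           # one sign bit per coordinate

    def server_aggregate(messages):
        votes = np.sum(messages, axis=0)               # sum of +/-1 votes per coordinate
        return np.sign(votes)                          # majority vote picks the direction

    rng = np.random.default_rng(0)
    true_grad = rng.standard_normal(10)
    workers = [true_grad + rng.standard_normal(10) for _ in range(7)]  # noisy local gradients

    direction = server_aggregate([worker_message(g) for g in workers])
    theta = np.zeros(10)
    theta -= 0.01 * direction                          # workers apply the voted sign update
    print(direction)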

U-FNO -- An enhanced Fourier neural operator-based deep-learning model for multiphase flow

1 code implementation3 Sep 2021 Gege Wen, Zongyi Li, Kamyar Azizzadenesheli, Anima Anandkumar, Sally M. Benson

Here we present U-FNO, a novel neural network architecture for solving multiphase flow problems with superior accuracy, speed, and data efficiency.

Decision Making

ZerO Initialization: Initializing Neural Networks with only Zeros and Ones

1 code implementation25 Oct 2021 Jiawei Zhao, Florian Schäfer, Anima Anandkumar

Deep neural networks are usually initialized with random weights, with adequately selected initial variance to ensure stable signal propagation during training.

Image Classification

Long-term Forecasting using Higher Order Tensor RNNs

1 code implementation ICLR 2018 Rose Yu, Stephan Zheng, Anima Anandkumar, Yisong Yue

We present Higher-Order Tensor RNN (HOT-RNN), a novel family of neural sequence architectures for multivariate forecasting in environments with nonlinear dynamics.

Time Series Time Series Analysis

RelViT: Concept-guided Vision Transformer for Visual Relational Reasoning

1 code implementation ICLR 2022 Xiaojian Ma, Weili Nie, Zhiding Yu, Huaizu Jiang, Chaowei Xiao, Yuke Zhu, Song-Chun Zhu, Anima Anandkumar

This task remains challenging for current deep learning algorithms since it requires addressing three key technical problems jointly: 1) identifying object entities and their properties, 2) inferring semantic relations between pairs of entities, and 3) generalizing to novel object-relation combinations, i.e., systematic generalization.

Human-Object Interaction Detection Object +5

Bongard-HOI: Benchmarking Few-Shot Visual Reasoning for Human-Object Interactions

1 code implementation CVPR 2022 Huaizu Jiang, Xiaojian Ma, Weili Nie, Zhiding Yu, Yuke Zhu, Song-Chun Zhu, Anima Anandkumar

A significant gap remains between today's visual pattern recognition models and human-level visual cognition especially when it comes to few-shot learning and compositional reasoning of novel concepts.

Benchmarking Few-Shot Image Classification +5

Born Again Neural Networks

2 code implementations ICML 2018 Tommaso Furlanello, Zachary C. Lipton, Michael Tschannen, Laurent Itti, Anima Anandkumar

Knowledge distillation (KD) consists of transferring knowledge from one machine learning model (the teacher) to another (the student).

Image Classification Knowledge Distillation

Learning compositional functions via multiplicative weight updates

1 code implementation NeurIPS 2020 Jeremy Bernstein, Jia-Wei Zhao, Markus Meister, Ming-Yu Liu, Anima Anandkumar, Yisong Yue

This paper proves that multiplicative weight updates satisfy a descent lemma tailored to compositional functions.

LEMMA

StrassenNets: Deep Learning with a Multiplication Budget

1 code implementation ICML 2018 Michael Tschannen, Aran Khanna, Anima Anandkumar

A large fraction of the arithmetic operations required to evaluate deep neural networks (DNNs) consists of matrix multiplications, in both convolution and fully connected layers.

Image Classification Knowledge Distillation +2

SECANT: Self-Expert Cloning for Zero-Shot Generalization of Visual Policies

1 code implementation17 Jun 2021 Linxi Fan, Guanzhi Wang, De-An Huang, Zhiding Yu, Li Fei-Fei, Yuke Zhu, Anima Anandkumar

A student network then learns to mimic the expert policy by supervised learning with strong augmentations, making its representation more robust against visual variations compared to the expert.

Autonomous Driving Image Augmentation +3

Retrieval-based Controllable Molecule Generation

1 code implementation23 Aug 2022 Zichao Wang, Weili Nie, Zhuoran Qiao, Chaowei Xiao, Richard Baraniuk, Anima Anandkumar

On various tasks ranging from simple design criteria to a challenging real-world scenario for designing lead compounds that bind to the SARS-CoV-2 main protease, we demonstrate our approach extrapolates well beyond the retrieval database, and achieves better performance and wider applicability than previous methods.

Drug Discovery Retrieval

Multi-Step Stochastic ADMM in High Dimensions: Applications to Sparse Optimization and Noisy Matrix Decomposition

2 code implementations NeurIPS 2014 Hanie Sedghi, Anima Anandkumar, Edmond Jonckheere

For sparse optimization, we establish that the modified ADMM method has an optimal convergence rate of $\mathcal{O}(s\log d/T)$, where $s$ is the sparsity level, $d$ is the data dimension and $T$ is the number of steps.

Fully Attentional Networks with Self-emerging Token Labeling

1 code implementation ICCV 2023 Bingyin Zhao, Zhiding Yu, Shiyi Lan, Yutao Cheng, Anima Anandkumar, Yingjie Lao, Jose M. Alvarez

With the proposed STL framework, our best model based on FAN-L-Hybrid (77.3M parameters) achieves 84.8% Top-1 accuracy and 42.1% mCE on ImageNet-1K and ImageNet-C, and sets a new state-of-the-art for ImageNet-A (46.1%) and ImageNet-R (56.6%) without using extra data, outperforming the original FAN counterpart by significant margins.

Semantic Segmentation

1st Place Solution of The Robust Vision Challenge 2022 Semantic Segmentation Track

1 code implementation23 Oct 2022 Junfei Xiao, Zhichao Xu, Shiyi Lan, Zhiding Yu, Alan Yuille, Anima Anandkumar

The model is trained on a composite dataset consisting of images from 9 datasets (ADE20K, Cityscapes, Mapillary Vistas, ScanNet, VIPER, WildDash 2, IDD, BDD, and COCO) with a simple dataset balancing strategy.

Segmentation Semantic Segmentation

Generative Adversarial Neural Operators

2 code implementations6 May 2022 Md Ashiqur Rahman, Manuel A. Florez, Anima Anandkumar, Zachary E. Ross, Kamyar Azizzadenesheli

The inputs to the generator are samples of functions from a user-specified probability measure, e.g., Gaussian random field (GRF), and the generator outputs are synthetic data functions.

Hyperparameter Optimization

Competitive Policy Optimization

4 code implementations18 Jun 2020 Manish Prajapat, Kamyar Azizzadenesheli, Alexander Liniger, Yisong Yue, Anima Anandkumar

A core challenge in policy optimization in competitive Markov decision processes is the design of efficient optimization methods with desirable convergence and stability properties.

Policy Gradient Methods

Training Certifiably Robust Neural Networks with Efficient Local Lipschitz Bounds

1 code implementation NeurIPS 2021 Yujia Huang, Huan Zhang, Yuanyuan Shi, J. Zico Kolter, Anima Anandkumar

Certified robustness is a desirable property for deep neural networks in safety-critical applications, and popular training algorithms can certify robustness of a neural network by computing a global bound on its Lipschitz constant.

Stability Constrained Reinforcement Learning for Decentralized Real-Time Voltage Control

1 code implementation16 Sep 2022 Jie Feng, Yuanyuan Shi, Guannan Qu, Steven H. Low, Anima Anandkumar, Adam Wierman

In this paper, we propose a stability-constrained reinforcement learning (RL) method for real-time voltage control, that guarantees system stability both during policy learning and deployment of the learned policy.

reinforcement-learning Reinforcement Learning (RL)

Learning From Noisy Singly-labeled Data

1 code implementation ICLR 2018 Ashish Khetan, Zachary C. Lipton, Anima Anandkumar

We propose a new algorithm for jointly modeling labels and worker quality from noisy crowd-sourced data.

Neural Networks with Recurrent Generative Feedback

1 code implementation NeurIPS 2020 Yujia Huang, James Gornet, Sihui Dai, Zhiding Yu, Tan Nguyen, Doris Y. Tsao, Anima Anandkumar

This mechanism can be interpreted as a form of self-consistency between the maximum a posteriori (MAP) estimation of an internal generative model and the external environment.

Adversarial Robustness

Image-Level or Object-Level? A Tale of Two Resampling Strategies for Long-Tailed Detection

1 code implementation12 Apr 2021 Nadine Chang, Zhiding Yu, Yu-Xiong Wang, Anima Anandkumar, Sanja Fidler, Jose M. Alvarez

As a result, image resampling alone is not enough to yield a sufficiently balanced distribution at the object level.

Object

Deep Bayesian Quadrature Policy Optimization

1 code implementation28 Jun 2020 Akella Ravi Tej, Kamyar Azizzadenesheli, Mohammad Ghavamzadeh, Anima Anandkumar, Yisong Yue

On the other hand, more sample efficient alternatives like Bayesian quadrature methods have received little attention due to their high computational complexity.

Continuous Control Policy Gradient Methods

DPOT: Auto-Regressive Denoising Operator Transformer for Large-Scale PDE Pre-Training

1 code implementation6 Mar 2024 Zhongkai Hao, Chang Su, Songming Liu, Julius Berner, Chengyang Ying, Hang Su, Anima Anandkumar, Jian Song, Jun Zhu

Pre-training has been investigated to improve the efficiency and performance of training neural operators in data-scarce settings.

Denoising

Active Learning with Partial Feedback

1 code implementation ICLR 2019 Peiyun Hu, Zachary C. Lipton, Anima Anandkumar, Deva Ramanan

While many active learning papers assume that the learner can simply ask for a label and receive it, real annotation often presents a mismatch between the form of a label (say, one among many classes), and the form of an annotation (typically yes/no binary feedback).

Active Learning

Pretraining Codomain Attention Neural Operators for Solving Multiphysics PDEs

1 code implementation19 Mar 2024 Md Ashiqur Rahman, Robert Joseph George, Mogab Elleithy, Daniel Leibovici, Zongyi Li, Boris Bonev, Colin White, Julius Berner, Raymond A. Yeh, Jean Kossaifi, Kamyar Azizzadenesheli, Anima Anandkumar

On complex downstream tasks with limited data, such as fluid flow simulations and fluid-structure interactions, we found CoDA-NO to outperform existing methods on the few-shot learning task by over 36%.

Few-Shot Learning Self-Supervised Learning

Langevin Monte Carlo for Contextual Bandits

1 code implementation22 Jun 2022 Pan Xu, Hongkai Zheng, Eric Mazumdar, Kamyar Azizzadenesheli, Anima Anandkumar

Existing Thompson sampling-based algorithms need to construct a Laplace approximation (i.e., a Gaussian distribution) of the posterior distribution, which is inefficient to sample in high dimensional applications for general covariance matrices.

Multi-Armed Bandits Thompson Sampling
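
The sampling primitive these methods build on is (unadjusted) Langevin dynamics: a gradient step on the negative log-posterior plus Gaussian noise scaled by the square root of twice the step size. The toy sketch below targets a simple Gaussian posterior rather than a bandit model.

    # Toy unadjusted Langevin sampler for a Gaussian target N(mu, sigma^2); the
    # same update (gradient step + sqrt(2*step) noise) underlies the paper's method.
    import numpy as np

    rng = np.random.default_rng(0)
    mu, sigma = 2.0, 0.5

    def grad_neg_log_post(theta):
        return (theta - mu) / sigma**2

    theta, step = 0.0, 1e-3
    samples = []
    for t in range(20000):
        theta = theta - step * grad_neg_log_post(theta) \
                + np.sqrt(2 * step) * rng.standard_normal()
        if t > 5000:                       # discard burn-in
            samples.append(theta)

    print(np.mean(samples), np.std(samples))   # should approach (2.0, 0.5)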

CEM-GD: Cross-Entropy Method with Gradient Descent Planner for Model-Based Reinforcement Learning

1 code implementation14 Dec 2021 Kevin Huang, Sahin Lale, Ugo Rosolia, Yuanyuan Shi, Anima Anandkumar

It then uses the top trajectories as initialization for gradient descent and applies gradient updates to each of these trajectories to find the optimal action sequence.

Continuous Control Model-based Reinforcement Learning +1
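
A toy version of this planner is easy to sketch: run cross-entropy-method sampling, then refine the elite action sequences with a few gradient steps on a differentiable cost. The quadratic cost below is a stand-in for a learned dynamics/cost model, and the hyperparameters are illustrative.

    # Toy CEM + gradient-refinement planner; the quadratic cost stands in for a
    # differentiable learned model, and the loop sizes are arbitrary.
    import torch

    horizon, n_samples, n_elite = 5, 64, 8
    target = torch.tensor([1.0, -2.0, 0.5, 0.0, 1.5])

    def cost(actions):                              # differentiable surrogate objective
        return ((actions - target) ** 2).sum(dim=-1)

    mean, std = torch.zeros(horizon), torch.ones(horizon)
    for it in range(10):
        samples = mean + std * torch.randn(n_samples, horizon)
        elites = samples[cost(samples).argsort()[:n_elite]].clone().requires_grad_(True)
        opt = torch.optim.SGD([elites], lr=0.1)
        for _ in range(5):                          # gradient refinement of the elites
            opt.zero_grad()
            cost(elites).sum().backward()
            opt.step()
        elites = elites.detach()
        mean, std = elites.mean(dim=0), elites.std(dim=0) + 1e-3
    print(mean)                                     # approaches the target action sequence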

OCEAN: Online Task Inference for Compositional Tasks with Context Adaptation

1 code implementation17 Aug 2020 Hongyu Ren, Yuke Zhu, Jure Leskovec, Anima Anandkumar, Animesh Garg

We propose a variational inference framework OCEAN to perform online task inference for compositional tasks.

Variational Inference

Competitive Mirror Descent

3 code implementations17 Jun 2020 Florian Schäfer, Anima Anandkumar, Houman Owhadi

Finally, we obtain the next iterate by following this direction according to the dual geometry induced by the Bregman potential.

Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo

1 code implementation29 May 2023 Haque Ishfaq, Qingfeng Lan, Pan Xu, A. Rupam Mahmood, Doina Precup, Anima Anandkumar, Kamyar Azizzadenesheli

One of the key shortcomings of existing Thompson sampling algorithms is the need to perform a Gaussian approximation of the posterior distribution, which is not a good surrogate in most practical settings.

Efficient Exploration reinforcement-learning +2

Tensor Regression Networks

no code implementations26 Jul 2017 Jean Kossaifi, Zachary C. Lipton, Arinbjorn Kolbeinsson, Aran Khanna, Tommaso Furlanello, Anima Anandkumar

First, we introduce Tensor Contraction Layers (TCLs) that reduce the dimensionality of their input while preserving their multilinear structure using tensor contraction.

regression

Compact Tensor Pooling for Visual Question Answering

no code implementations20 Jun 2017 Yang Shi, Tommaso Furlanello, Anima Anandkumar

Performing high level cognitive tasks requires the integration of feature maps with drastically different structure.

Question Answering Visual Question Answering

Homotopy Analysis for Tensor PCA

no code implementations28 Oct 2016 Anima Anandkumar, Yuan Deng, Rong Ge, Hossein Mobahi

For the challenging problem of tensor PCA, we prove global convergence of the homotopy method in the "high noise" regime.

Tensor Contraction Layers for Parsimonious Deep Nets

no code implementations1 Jun 2017 Jean Kossaifi, Aran Khanna, Zachary C. Lipton, Tommaso Furlanello, Anima Anandkumar

Specifically, we propose the Tensor Contraction Layer (TCL), the first attempt to incorporate tensor contractions as end-to-end trainable neural network layers.

Model Compression
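
The contraction itself can be illustrated with TensorLy: multiply each non-batch mode of an activation tensor by a projection matrix, shrinking the activations while preserving their multilinear structure. In the paper these factor matrices are learned end to end; below they are random, so this is only a shape-level sketch.

    # Shape-level sketch of a tensor contraction in the TCL style: contract each
    # non-batch mode with a (here random, in the paper learned) projection matrix.
    import numpy as np
    import tensorly as tl
    from tensorly import tenalg

    tl.set_backend("numpy")
    activations = tl.tensor(np.random.rand(32, 64, 8, 8))   # (batch, channels, H, W)
    factors = [np.random.rand(16, 64),                      # channels: 64 -> 16
               np.random.rand(4, 8),                        # height:   8 -> 4
               np.random.rand(4, 8)]                        # width:    8 -> 4

    compressed = tenalg.multi_mode_dot(activations, factors, modes=[1, 2, 3])
    print(compressed.shape)                                 # (32, 16, 4, 4)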

Training Input-Output Recurrent Neural Networks through Spectral Methods

no code implementations3 Mar 2016 Hanie Sedghi, Anima Anandkumar

We consider the problem of training input-output recurrent neural networks (RNN) for sequence labeling tasks.

POS POS Tagging

Efficient approaches for escaping higher order saddle points in non-convex optimization

no code implementations18 Feb 2016 Anima Anandkumar, Rong Ge

Local search heuristics for non-convex optimizations are popular in applied machine learning.

Provable Tensor Methods for Learning Mixtures of Generalized Linear Models

no code implementations9 Dec 2014 Hanie Sedghi, Majid Janzamin, Anima Anandkumar

In contrast, we present a tensor decomposition method which is guaranteed to correctly recover the parameters.

General Classification Tensor Decomposition

Beating the Perils of Non-Convexity: Guaranteed Training of Neural Networks using Tensor Methods

no code implementations28 Jun 2015 Majid Janzamin, Hanie Sedghi, Anima Anandkumar

We propose a novel algorithm based on tensor decomposition for guaranteed training of two-layer neural networks.

Tensor Decomposition

Analyzing Tensor Power Method Dynamics in Overcomplete Regime

no code implementations6 Nov 2014 Anima Anandkumar, Rong Ge, Majid Janzamin

We present a novel analysis of the dynamics of tensor power iterations in the overcomplete regime where the tensor CP rank is larger than the input dimension.

A Scale Mixture Perspective of Multiplicative Noise in Neural Networks

no code implementations10 Jun 2015 Eric Nalisnick, Anima Anandkumar, Padhraic Smyth

Corrupting the input and hidden layers of deep neural networks (DNNs) with multiplicative noise, often drawn from the Bernoulli distribution (or 'dropout'), provides regularization that has significantly contributed to deep learning's success.

Model Compression

Multi-Object Classification and Unsupervised Scene Understanding Using Deep Learning Features and Latent Tree Probabilistic Models

no code implementations2 May 2015 Tejaswi Nimmagadda, Anima Anandkumar

We incorporate contextual information in natural images through a conditional latent tree probabilistic model (CLTM), where the object co-occurrences are conditioned on the extracted fc7 features from a pre-trained ImageNet CNN as input.

Classification Clustering +4

Provable Methods for Training Neural Networks with Sparse Connectivity

no code implementations8 Dec 2014 Hanie Sedghi, Anima Anandkumar

We provide novel guaranteed approaches for training feedforward neural networks with sparse connectivity.

Score Function Features for Discriminative Learning

no code implementations19 Dec 2014 Majid Janzamin, Hanie Sedghi, Anima Anandkumar

In this paper, we consider a novel class of matrix and tensor-valued features, which can be pre-trained using unlabeled samples.

Guaranteed Scalable Learning of Latent Tree Models

no code implementations18 Jun 2014 Furong Huang, Niranjan U. N., Ioakeim Perros, Robert Chen, Jimeng Sun, Anima Anandkumar

We present an integrated approach for structure and parameter estimation in latent tree graphical models.

Score Function Features for Discriminative Learning: Matrix and Tensor Framework

no code implementations9 Dec 2014 Majid Janzamin, Hanie Sedghi, Anima Anandkumar

In this paper, we consider a novel class of matrix and tensor-valued features, which can be pre-trained using unlabeled samples.

Tensor decompositions for learning latent variable models

no code implementations29 Oct 2012 Anima Anandkumar, Rong Ge, Daniel Hsu, Sham M. Kakade, Matus Telgarsky

This work considers a computationally and statistically efficient parameter estimation method for a wide class of latent variable models---including Gaussian mixture models, hidden Markov models, and latent Dirichlet allocation---which exploits a certain tensor structure in their low-order observable moments (typically, of second- and third-order).
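
For a concrete instance of the tensor structure being exploited, in an exchangeable single-topic model (and similarly for spherical Gaussian mixtures) the low-order observable moments take the form

    M_2 = \mathbb{E}[x_1 \otimes x_2] = \sum_{i=1}^{k} w_i\, \mu_i \otimes \mu_i,
    \qquad
    M_3 = \mathbb{E}[x_1 \otimes x_2 \otimes x_3] = \sum_{i=1}^{k} w_i\, \mu_i \otimes \mu_i \otimes \mu_i,

and the parameters (w_i, \mu_i) are recovered by decomposing M_3 after whitening with M_2.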

A Tensor Approach to Learning Mixed Membership Community Models

no code implementations12 Feb 2013 Anima Anandkumar, Rong Ge, Daniel Hsu, Sham M. Kakade

We provide guaranteed recovery of community memberships and model parameters and present a careful finite sample analysis of our learning method.

Community Detection Stochastic Block Model

Multi-Step Stochastic ADMM in High Dimensions: Applications to Sparse Optimization and Matrix Decomposition

no code implementations NeurIPS 2014 Hanie Sedghi, Anima Anandkumar, Edmond Jonckheere

We first analyze the simple setting, where the optimization problem consists of a loss function and a single regularizer (e.g., sparse optimization), and then extend to the multi-block setting with multiple regularizers and multiple variables (e.g., matrix decomposition into sparse and low rank components).

A Spectral Algorithm for Latent Dirichlet Allocation

no code implementations NeurIPS 2012 Anima Anandkumar, Dean P. Foster, Daniel J. Hsu, Sham M. Kakade, Yi-Kai Liu

This work provides a simple and efficient learning procedure that is guaranteed to recover the parameters for a wide class of topic models, including Latent Dirichlet Allocation (LDA).

Clustering Topic Models

Learning Mixtures of Tree Graphical Models

no code implementations NeurIPS 2012 Anima Anandkumar, Daniel J. Hsu, Furong Huang, Sham M. Kakade

We consider unsupervised estimation of mixtures of discrete graphical models, where the class variable is hidden and each mixture component can have a potentially different Markov graph structure and parameters over the observed variables.

Neural Rendering Model: Joint Generation and Prediction for Semi-Supervised Learning

no code implementations ICLR 2019 Nhat Ho, Tan Nguyen, Ankit B. Patel, Anima Anandkumar, Michael. I. Jordan, Richard G. Baraniuk

The conjugate prior yields a new regularizer for learning based on the paths rendered in the generative model for training CNNs: the Rendering Path Normalization (RPN).

Neural Rendering

Tensor Contraction & Regression Networks

no code implementations ICLR 2018 Jean Kossaifi, Zack Chase Lipton, Aran Khanna, Tommaso Furlanello, Anima Anandkumar

Second, we introduce tensor regression layers, which express the output of a neural network as a low-rank multi-linear mapping from a high-order activation tensor to the softmax layer.

regression

Long-term Forecasting using Tensor-Train RNNs

no code implementations ICLR 2018 Rose Yu, Stephan Zheng, Anima Anandkumar, Yisong Yue

We present Tensor-Train RNN (TT-RNN), a novel family of neural sequence architectures for multivariate forecasting in environments with nonlinear dynamics.

Stochastic Linear Bandits with Hidden Low Rank Structure

no code implementations28 Jan 2019 Sahin Lale, Kamyar Azizzadenesheli, Anima Anandkumar, Babak Hassibi

We modify the image classification task into the SLB setting and empirically show that, when a pre-trained DNN provides the high dimensional feature representations, deploying PSLB results in significant reduction of regret and faster convergence to an accurate model compared to the state-of-the-art algorithm.

Decision Making Dimensionality Reduction +2

Tensor Dropout for Robust Learning

no code implementations27 Feb 2019 Arinbjörn Kolbeinsson, Jean Kossaifi, Yannis Panagakis, Adrian Bulat, Anima Anandkumar, Ioanna Tzoulaki, Paul Matthews

CNNs achieve remarkable performance by leveraging deep, over-parametrized architectures, trained on large datasets.

Image Classification Inductive Bias

Robust Regression for Safe Exploration in Control

no code implementations L4DC 2020 Anqi Liu, Guanya Shi, Soon-Jo Chung, Anima Anandkumar, Yisong Yue

To address this challenge, we present a deep robust regression model that is trained to directly predict the uncertainty bounds for safe exploration.

Generalization Bounds regression +1

Learning Causal State Representations of Partially Observable Environments

no code implementations25 Jun 2019 Amy Zhang, Zachary C. Lipton, Luis Pineda, Kamyar Azizzadenesheli, Anima Anandkumar, Laurent Itti, Joelle Pineau, Tommaso Furlanello

In this paper, we propose an algorithm to approximate causal states, which are the coarsest partition of the joint history of actions and observations in partially-observable Markov decision processes (POMDP).

Causal Inference

Directivity Modes of Earthquake Populations with Unsupervised Learning

no code implementations30 Jun 2019 Zachary E. Ross, Daniel T. Trugman, Kamyar Azizzadenesheli, Anima Anandkumar

A seismic spectral decomposition technique is used to first produce relative measurements of radiated energy for earthquakes in a spatially-compact cluster.

Multi Sense Embeddings from Topic Models

no code implementations WS 2019 Shobhit Jain, Sravan Babu Bodapati, Ramesh Nallapati, Anima Anandkumar

Distributed word embeddings have yielded state-of-the-art performance in many NLP tasks, mainly due to their success in capturing useful semantic information.

Topic Models Word Embeddings +1

Triply Robust Off-Policy Evaluation

no code implementations 13 Nov 2019 Anqi Liu, Hao Liu, Anima Anandkumar, Yisong Yue

Ours is a general approach that can be used to augment any existing OPE method that utilizes the direct method.

Multi-Armed Bandits Off-policy evaluation +1

InfoCNF: An Efficient Conditional Continuous Normalizing Flow with Adaptive Solvers

no code implementations9 Dec 2019 Tan M. Nguyen, Animesh Garg, Richard G. Baraniuk, Anima Anandkumar

Continuous Normalizing Flows (CNFs) have emerged as promising deep generative models for a wide range of tasks thanks to their invertibility and exact likelihood estimation.

Conditional Image Generation Time Series +1

A Bayesian Perspective of Convolutional Neural Networks through a Deconvolutional Generative Model

no code implementations1 Nov 2018 Tan Nguyen, Nhat Ho, Ankit Patel, Anima Anandkumar, Michael. I. Jordan, Richard G. Baraniuk

This conjugate prior yields a new regularizer based on paths rendered in the generative model for training CNNs: the Rendering Path Normalization (RPN).

Regret Minimization in Partially Observable Linear Quadratic Control

no code implementations31 Jan 2020 Sahin Lale, Kamyar Azizzadenesheli, Babak Hassibi, Anima Anandkumar

We propose a novel way to decompose the regret and provide an end-to-end sublinear regret upper bound for partially observable linear quadratic control.

Adaptive Control and Regret Minimization in Linear Quadratic Gaussian (LQG) Setting

no code implementations12 Mar 2020 Sahin Lale, Kamyar Azizzadenesheli, Babak Hassibi, Anima Anandkumar

We study the problem of adaptive control in partially observable linear quadratic Gaussian control systems, where the model dynamics are unknown a priori.

Chance-Constrained Trajectory Optimization for Safe Exploration and Learning of Nonlinear Systems

no code implementations9 May 2020 Yashwanth Kumar Nakka, Anqi Liu, Guanya Shi, Anima Anandkumar, Yisong Yue, Soon-Jo Chung

The Info-SNOC algorithm is used to compute a sub-optimal pool of safe motion plans that aid in exploration for learning unknown residual dynamics under safety constraints.

Motion Planning Optimal Motion Planning +1

Unsupervised Controllable Generation with Self-Training

no code implementations17 Jul 2020 Grigorios G. Chrysos, Jean Kossaifi, Zhiding Yu, Anima Anandkumar

Instead, we propose an unsupervised framework to learn a distribution of latent codes that control the generator through self-training.

Disentanglement

A Coach-Player Framework for Dynamic Team Composition

no code implementations1 Jan 2021 Bo Liu, Qiang Liu, Peter Stone, Animesh Garg, Yuke Zhu, Anima Anandkumar

The performance of our method is comparable or even better than the setting where all players have a full view of the environment, but no coach.

Zero-shot Generalization

Transferable Unsupervised Robust Representation Learning

no code implementations1 Jan 2021 De-An Huang, Zhiding Yu, Anima Anandkumar

We upend this view and show that URRL improves both the natural accuracy of unsupervised representation learning and its robustness to corruptions and adversarial noise.

Data Augmentation Representation Learning +1

Learning Calibrated Uncertainties for Domain Shift: A Distributionally Robust Learning Approach

no code implementations8 Oct 2020 Haoxuan Wang, Zhiding Yu, Yisong Yue, Anima Anandkumar, Anqi Liu, Junchi Yan

We propose a framework for learning calibrated uncertainties under domain shifts, where the source (training) distribution differs from the target (test) distribution.

Density Ratio Estimation Unsupervised Domain Adaptation

Stability and Identification of Random Asynchronous Linear Time-Invariant Systems

no code implementations8 Dec 2020 Sahin Lale, Oguzhan Teke, Babak Hassibi, Anima Anandkumar

In this model, each state variable is updated randomly and asynchronously with some probability according to the underlying system dynamics.

Dynamic Social Media Monitoring for Fast-Evolving Online Discussions

no code implementations24 Feb 2021 Maya Srikanth, Anqi Liu, Nicholas Adams-Cohen, Jian Cao, R. Michael Alvarez, Anima Anandkumar

However, collecting social media data using a static set of keywords fails to satisfy the growing need to monitor dynamic conversations and to study fast-changing topics.

Decision Making Time Series +1

Generating and Characterizing Scenarios for Safety Testing of Autonomous Vehicles

no code implementations12 Mar 2021 Zahra Ghodsi, Siva Kumar Sastry Hari, Iuri Frosio, Timothy Tsai, Alejandro Troccoli, Stephen W. Keckler, Siddharth Garg, Anima Anandkumar

Extracting interesting scenarios from real-world data as well as generating failure cases is important for the development and testing of autonomous systems.

Autonomous Vehicles

Stable Online Control of Linear Time-Varying Systems

no code implementations29 Apr 2021 Guannan Qu, Yuanyuan Shi, Sahin Lale, Anima Anandkumar, Adam Wierman

In this work, we propose an efficient online control algorithm, COvariance Constrained Online Linear Quadratic (COCO-LQ) control, that guarantees input-to-state stability for a large class of LTV systems while also minimizing the control cost.

Informing Geometric Deep Learning with Electronic Interactions to Accelerate Quantum Chemistry

no code implementations31 May 2021 Zhuoran Qiao, Anders S. Christensen, Matthew Welborn, Frederick R. Manby, Anima Anandkumar, Thomas F. Miller III

Predicting electronic energies, densities, and related chemical properties can facilitate the discovery of novel catalysts, medicines, and battery materials.

Tensor Methods in Computer Vision and Deep Learning

no code implementations7 Jul 2021 Yannis Panagakis, Jean Kossaifi, Grigorios G. Chrysos, James Oldfield, Mihalis A. Nicolaou, Anima Anandkumar, Stefanos Zafeiriou

Tensors, or multidimensional arrays, are data structures that can naturally represent visual data of multiple dimensions.

Representation Learning

Finite-time System Identification and Adaptive Control in Autoregressive Exogenous Systems

no code implementations26 Aug 2021 Sahin Lale, Kamyar Azizzadenesheli, Babak Hassibi, Anima Anandkumar

Using these guarantees, we design adaptive control algorithms for unknown ARX systems with arbitrary strongly convex or convex quadratic regulating costs.

Auditing AI models for Verified Deployment under Semantic Specifications

no code implementations25 Sep 2021 Homanga Bharadhwaj, De-An Huang, Chaowei Xiao, Anima Anandkumar, Animesh Garg

We enable such unit tests through variations in a semantically-interpretable latent space of a generative model.

Face Recognition

Stability Constrained Reinforcement Learning for Real-Time Voltage Control

no code implementations30 Sep 2021 Yuanyuan Shi, Guannan Qu, Steven Low, Anima Anandkumar, Adam Wierman

Deep reinforcement learning (RL) has been recognized as a promising tool to address the challenges in real-time control of power systems.

reinforcement-learning Reinforcement Learning (RL)

Efficient Token Mixing for Transformers via Adaptive Fourier Neural Operators

no code implementations ICLR 2022 John Guibas, Morteza Mardani, Zongyi Li, Andrew Tao, Anima Anandkumar, Bryan Catanzaro

AFNO is based on a principled foundation of operator learning which allows us to frame token mixing as a continuous global convolution without any dependence on the input resolution.

Computational Efficiency Operator learning +1

Scaling Fair Learning to Hundreds of Intersectional Groups

no code implementations 29 Sep 2021 Eric Zhao, De-An Huang, Hao Liu, Zhiding Yu, Anqi Liu, Olga Russakovsky, Anima Anandkumar

In real-world applications, however, there are multiple protected attributes yielding a large number of intersectional protected groups.

Attribute Fairness +1

Adversarial Skill Chaining for Long-Horizon Robot Manipulation via Terminal State Regularization

no code implementations15 Nov 2021 Youngwoon Lee, Joseph J. Lim, Anima Anandkumar, Yuke Zhu

However, these approaches require larger state distributions to be covered as more policies are sequenced, and thus are limited to short skill sequences.

Reinforcement Learning (RL) Robot Manipulation

Polymatrix Competitive Gradient Descent

no code implementations16 Nov 2021 Jeffrey Ma, Alistair Letcher, Florian Schäfer, Yuanyuan Shi, Anima Anandkumar

In this work we propose polymatrix competitive gradient descent (PCGD) as a method for solving general sum competitive optimization involving arbitrary numbers of agents.

Multi-agent Reinforcement Learning

Adversarially Robust 3D Point Cloud Recognition Using Self-Supervisions

no code implementations NeurIPS 2021 Jiachen Sun, Yulong Cao, Christopher B. Choy, Zhiding Yu, Anima Anandkumar, Zhuoqing Morley Mao, Chaowei Xiao

In this paper, we systematically study the impact of various self-supervised learning proxy tasks on different architectures and threat models for 3D point clouds with adversarial training.

Adversarial Robustness Autonomous Driving +1

InfoCNF: Efficient Conditional Continuous Normalizing Flow Using Adaptive Solvers

no code implementations25 Sep 2019 Tan M. Nguyen, Animesh Garg, Richard G. Baraniuk, Anima Anandkumar

Continuous Normalizing Flows (CNFs) have emerged as promising deep generative models for a wide range of tasks thanks to their invertibility and exact likelihood estimation.

Conditional Image Generation Time Series +1

Distributionally Robust Learning for Unsupervised Domain Adaptation

no code implementations28 Sep 2020 Haoxuan Wang, Anqi Liu, Zhiding Yu, Yisong Yue, Anima Anandkumar

This formulation motivates the use of two neural networks that are jointly trained --- a discriminative network between the source and target domains for density-ratio estimation, in addition to the standard classification network.

Density Ratio Estimation Unsupervised Domain Adaptation

Few-shot Instruction Prompts for Pretrained Language Models to Detect Social Biases

no code implementations15 Dec 2021 Shrimai Prabhumoye, Rafal Kocielnik, Mohammad Shoeybi, Anima Anandkumar, Bryan Catanzaro

We then provide the LM with instruction that consists of this subset of labeled exemplars, the query text to be classified, a definition of bias, and prompt it to make a decision.

ACID: Action-Conditional Implicit Visual Dynamics for Deformable Object Manipulation

no code implementations14 Mar 2022 Bokui Shen, Zhenyu Jiang, Christopher Choy, Leonidas J. Guibas, Silvio Savarese, Anima Anandkumar, Yuke Zhu

Manipulating volumetric deformable objects in the real world, like plush toys and pizza dough, brings substantial challenges due to infinite shape variations, non-rigid motions, and partial observability.

Contrastive Learning Deformable Object Manipulation

M$^2$BEV: Multi-Camera Joint 3D Detection and Segmentation with Unified Birds-Eye View Representation

no code implementations11 Apr 2022 Enze Xie, Zhiding Yu, Daquan Zhou, Jonah Philion, Anima Anandkumar, Sanja Fidler, Ping Luo, Jose M. Alvarez

In this paper, we propose M$^2$BEV, a unified framework that jointly performs 3D object detection and map segmentation in the Birds Eye View~(BEV) space with multi-camera image inputs.

3D Object Detection object-detection +1

KCRL: Krasovskii-Constrained Reinforcement Learning with Guaranteed Stability in Nonlinear Dynamical Systems

no code implementations3 Jun 2022 Sahin Lale, Yuanyuan Shi, Guannan Qu, Kamyar Azizzadenesheli, Adam Wierman, Anima Anandkumar

However, current reinforcement learning (RL) methods lack stabilization guarantees, which limits their applicability for the control of safety-critical systems.

reinforcement-learning Reinforcement Learning (RL)

Finite-Time Regret of Thompson Sampling Algorithms for Exponential Family Multi-Armed Bandits

no code implementations7 Jun 2022 Tianyuan Jin, Pan Xu, Xiaokui Xiao, Anima Anandkumar

We study the regret of Thompson sampling (TS) algorithms for exponential family bandits, where the reward distribution is from a one-dimensional exponential family, which covers many common reward distributions including Bernoulli, Gaussian, Gamma, Exponential, etc.

Multi-Armed Bandits Thompson Sampling
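
The classical special case covered by this setting is the Beta-Bernoulli bandit, where Thompson sampling has a one-line conjugate update. The sketch below is that textbook algorithm, not code from the paper; the arm means and horizon are arbitrary.

    # Textbook Beta-Bernoulli Thompson sampling (the classical special case of the
    # exponential-family setting analyzed here).
    import numpy as np

    rng = np.random.default_rng(0)
    true_means = [0.3, 0.5, 0.7]                   # unknown arm reward probabilities
    alpha = np.ones(3)                             # Beta(1, 1) priors
    beta = np.ones(3)

    for t in range(5000):
        theta = rng.beta(alpha, beta)              # sample a mean estimate per arm
        arm = int(np.argmax(theta))                # play the arm that looks best
        reward = rng.random() < true_means[arm]    # Bernoulli reward
        alpha[arm] += reward                       # conjugate posterior update
        beta[arm] += 1 - reward

    print(alpha / (alpha + beta))                  # posterior means near 0.3 / 0.5 / 0.7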

Thompson Sampling Achieves $\tilde O(\sqrt{T})$ Regret in Linear Quadratic Control

no code implementations17 Jun 2022 Taylan Kargin, Sahin Lale, Kamyar Azizzadenesheli, Anima Anandkumar, Babak Hassibi

By carefully prescribing an early exploration strategy and a policy update rule, we show that TS achieves order-optimal regret in adaptive control of multidimensional stabilizable LQRs.

Decision Making Decision Making Under Uncertainty +1

Large Scale Mask Optimization Via Convolutional Fourier Neural Operator and Litho-Guided Self Training

no code implementations8 Jul 2022 HaoYu Yang, Zongyi Li, Kumara Sastry, Saumyadip Mukhopadhyay, Anima Anandkumar, Brucek Khailany, Vivek Singh, Haoxing Ren

Machine learning techniques have been extensively studied for mask optimization problems, aiming at better mask printability, shorter turnaround time, better mask manufacturability, and so on.

BIG-bench Machine Learning

Robust Trajectory Prediction against Adversarial Attacks

no code implementations29 Jul 2022 Yulong Cao, Danfei Xu, Xinshuo Weng, Zhuoqing Mao, Anima Anandkumar, Chaowei Xiao, Marco Pavone

We demonstrate that our method is able to improve the performance by 46% on adversarial data and at the cost of only 3% performance degradation on clean data, compared to the model trained with clean data.

Autonomous Driving Data Augmentation +1

DensePure: Understanding Diffusion Models towards Adversarial Robustness

no code implementations1 Nov 2022 Chaowei Xiao, Zhongzhu Chen, Kun Jin, Jiongxiao Wang, Weili Nie, Mingyan Liu, Anima Anandkumar, Bo Li, Dawn Song

By using the highest density point in the conditional distribution as the reversed sample, we identify the robust region of a given instance under the diffusion model's reverse process.

Adversarial Robustness Denoising

Can You Label Less by Using Out-of-Domain Data? Active & Transfer Learning with Few-shot Instructions

no code implementations21 Nov 2022 Rafal Kocielnik, Sara Kangaslahti, Shrimai Prabhumoye, Meena Hari, R. Michael Alvarez, Anima Anandkumar

Finally, we find that not all transfer scenarios yield a positive gain, which seems related to the PLMs initial performance on the target-domain task.

Active Learning Transfer Learning

Incremental Spatial and Spectral Learning of Neural Operators for Solving Large-Scale PDEs

no code implementations28 Nov 2022 Robert Joseph George, Jiawei Zhao, Jean Kossaifi, Zongyi Li, Anima Anandkumar

Fourier Neural Operators (FNO) offer a principled approach to solving challenging partial differential equations (PDE) such as turbulent flows.

Machine Learning Accelerated PDE Backstepping Observers

no code implementations28 Nov 2022 Yuanyuan Shi, Zongyi Li, Huan Yu, Drew Steeves, Anima Anandkumar, Miroslav Krstic

State estimation is important for a variety of tasks, from forecasting to substituting for unmeasured states in feedback controllers.

Computational Efficiency

Fourier Continuation for Exact Derivative Computation in Physics-Informed Neural Operators

no code implementations29 Nov 2022 Haydn Maust, Zongyi Li, YiXuan Wang, Daniel Leibovici, Oscar Bruno, Thomas Hou, Anima Anandkumar

The physics-informed neural operator (PINO) is a machine learning architecture that has shown promising empirical results for learning partial differential equations.

Towards Neural Variational Monte Carlo That Scales Linearly with System Size

no code implementations21 Dec 2022 Or Sharir, Garnet Kin-Lic Chan, Anima Anandkumar

Quantum many-body problems are some of the most challenging problems in science and are central to demystifying some exotic quantum phenomena, e.g., high-temperature superconductors.

Quantization Variational Monte Carlo

Vision Transformers Are Good Mask Auto-Labelers

no code implementations CVPR 2023 Shiyi Lan, Xitong Yang, Zhiding Yu, Zuxuan Wu, Jose M. Alvarez, Anima Anandkumar

We propose Mask Auto-Labeler (MAL), a high-quality Transformer-based mask auto-labeling framework for instance segmentation using only box annotations.

Instance Segmentation Segmentation +1

Forecasting subcritical cylinder wakes with Fourier Neural Operators

no code implementations19 Jan 2023 Peter I Renn, Cong Wang, Sahin Lale, Zongyi Li, Anima Anandkumar, Morteza Gharib

The learned FNO solution operator can be evaluated in milliseconds, potentially enabling faster-than-real-time modeling for predictive flow control in physical systems.

Operator learning

PerAda: Parameter-Efficient Federated Learning Personalization with Generalization Guarantees

no code implementations13 Feb 2023 Chulin Xie, De-An Huang, Wenda Chu, Daguang Xu, Chaowei Xiao, Bo Li, Anima Anandkumar

In this paper, we propose PerAda, a parameter-efficient pFL framework that reduces communication and computational costs and exhibits superior generalization performance, especially under test-time distribution shifts.

Generalization Bounds Knowledge Distillation +2

BiasTestGPT: Using ChatGPT for Social Bias Testing of Language Models

no code implementations14 Feb 2023 Rafal Kocielnik, Shrimai Prabhumoye, Vivian Zhang, Roy Jiang, R. Michael Alvarez, Anima Anandkumar

We thus enable seamless open-ended social bias testing of PLMs by domain experts through an automatic large-scale generation of diverse test sentences for any combination of social categories and attributes.

Sentence Text Generation

Score-based Diffusion Models in Function Space

no code implementations14 Feb 2023 Jae Hyun Lim, Nikola B. Kovachki, Ricardo Baptista, Christopher Beckham, Kamyar Azizzadenesheli, Jean Kossaifi, Vikram Voleti, Jiaming Song, Karsten Kreis, Jan Kautz, Christopher Pal, Arash Vahdat, Anima Anandkumar

They consist of a forward process that perturbs input data with Gaussian white noise and a reverse process that learns a score function to generate samples by denoising.

Denoising

Fast Monocular Scene Reconstruction with Global-Sparse Local-Dense Grids

no code implementations CVPR 2023 Wei Dong, Chris Choy, Charles Loop, Or Litany, Yuke Zhu, Anima Anandkumar

To apply this representation to monocular scene reconstruction, we develop a scale calibration algorithm for fast geometric initialization from monocular depth priors.

Indoor Scene Reconstruction

Incrementally-Computable Neural Networks: Efficient Inference for Dynamic Inputs

no code implementations27 Jul 2023 Or Sharir, Anima Anandkumar

Deep learning often faces the challenge of efficiently processing dynamic inputs, such as sensor data or user inputs.

Document Classification Knowledge Distillation +2

Tipping Point Forecasting in Non-Stationary Dynamics on Function Spaces

no code implementations17 Aug 2023 Miguel Liu-Schiaffini, Clare E. Singer, Nikola Kovachki, Tapio Schneider, Kamyar Azizzadenesheli, Anima Anandkumar

Tipping points are abrupt, drastic, and often irreversible changes in the evolution of non-stationary and chaotic dynamical systems.

Conformal Prediction

Spacetime Surface Regularization for Neural Dynamic Scene Reconstruction

no code implementations ICCV 2023 Jaesung Choe, Christopher Choy, Jaesik Park, In So Kweon, Anima Anandkumar

We propose an algorithm, 4DRegSDF, for the spacetime surface regularization to improve the fidelity of neural rendering and reconstruction in dynamic scenes.

Neural Rendering

Neural Operators for Accelerating Scientific Simulations and Design

no code implementations27 Sep 2023 Kamyar Azizzadenesheli, Nikola Kovachki, Zongyi Li, Miguel Liu-Schiaffini, Jean Kossaifi, Anima Anandkumar

Scientific discovery and engineering design are currently limited by the time and cost of physical experiments, selected mostly through trial-and-error and intuition that require deep domain expertise.

Super-Resolution Weather Forecasting

Multi-Grid Tensorized Fourier Neural Operator for High-Resolution PDEs

no code implementations29 Sep 2023 Jean Kossaifi, Nikola Kovachki, Kamyar Azizzadenesheli, Anima Anandkumar

Our contributions are threefold: i) we enable parallelization over input samples with a novel multi-grid-based domain decomposition, ii) we represent the parameters of the model in a high-order latent subspace of the Fourier domain, through a global tensor factorization, resulting in an extreme reduction in the number of parameters and improved generalization, and iii) we propose architectural improvements to the backbone FNO.

Operator learning

DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies

no code implementations6 Oct 2023 Shuaiwen Leon Song, Bonnie Kruft, Minjia Zhang, Conglong Li, Shiyang Chen, Chengming Zhang, Masahiro Tanaka, Xiaoxia Wu, Jeff Rasley, Ammar Ahmad Awan, Connor Holmes, Martin Cai, Adam Ghanem, Zhongzhu Zhou, Yuxiong He, Pete Luferenko, Divya Kumar, Jonathan Weyn, Ruixiong Zhang, Sylwester Klocek, Volodymyr Vragov, Mohammed AlQuraishi, Gustaf Ahdritz, Christina Floristean, Cristina Negri, Rao Kotamarthi, Venkatram Vishwanath, Arvind Ramanathan, Sam Foreman, Kyle Hippe, Troy Arcomano, Romit Maulik, Maxim Zvyagin, Alexander Brace, Bin Zhang, Cindy Orozco Bohorquez, Austin Clyde, Bharat Kale, Danilo Perez-Rivera, Heng Ma, Carla M. Mann, Michael Irvin, J. Gregory Pauloski, Logan Ward, Valerie Hayot, Murali Emani, Zhen Xie, Diangen Lin, Maulik Shukla, Ian Foster, James J. Davis, Michael E. Papka, Thomas Brettin, Prasanna Balaprakash, Gina Tourassi, John Gounley, Heidi Hanson, Thomas E Potok, Massimiliano Lupo Pasini, Kate Evans, Dan Lu, Dalton Lunga, Junqi Yin, Sajal Dash, Feiyi Wang, Mallikarjun Shankar, Isaac Lyngaas, Xiao Wang, Guojing Cong, Pei Zhang, Ming Fan, Siyan Liu, Adolfy Hoisie, Shinjae Yoo, Yihui Ren, William Tang, Kyle Felker, Alexey Svyatkovskiy, Hang Liu, Ashwin Aji, Angela Dalton, Michael Schulte, Karl Schulz, Yuntian Deng, Weili Nie, Josh Romero, Christian Dallago, Arash Vahdat, Chaowei Xiao, Thomas Gibbs, Anima Anandkumar, Rick Stevens

In the upcoming decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences.

EKGNet: A 10.96μW Fully Analog Neural Network for Intra-Patient Arrhythmia Classification

no code implementations24 Oct 2023 Benyamin Haghi, Lin Ma, Sahin Lale, Anima Anandkumar, Azita Emami

We present an integrated approach by combining analog computing and deep learning for electrocardiogram (ECG) arrhythmia classification.

Classification

Deep Multimodal Fusion for Surgical Feedback Classification

no code implementations6 Dec 2023 Rafal Kocielnik, Elyssa Y. Wong, Timothy N. Chu, Lydia Lin, De-An Huang, Jiayun Wang, Anima Anandkumar, Andrew J. Hung

This work offers an important first look at the feasibility of automated classification of real-world live surgical feedback based on text, audio, and video modalities.

Classification

Perspectives on the State and Future of Deep Learning -- 2023

no code implementations7 Dec 2023 Micah Goldblum, Anima Anandkumar, Richard Baraniuk, Tom Goldstein, Kyunghyun Cho, Zachary C Lipton, Melanie Mitchell, Preetum Nakkiran, Max Welling, Andrew Gordon Wilson

The goal of this series is to chronicle opinions and issues in the field of machine learning as they stand today and as they change over time.

Benchmarking
