Search Results for author: Anima Anandkumar

Found 209 papers, 89 papers with code

SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers

23 code implementations NeurIPS 2021 Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, Ping Luo

We present SegFormer, a simple, efficient yet powerful semantic segmentation framework which unifies Transformers with lightweight multilayer perceptron (MLP) decoders.

Semantic Segmentation +1

Voyager: An Open-Ended Embodied Agent with Large Language Models

1 code implementation25 May 2023 Guanzhi Wang, Yuqi Xie, Yunfan Jiang, Ajay Mandlekar, Chaowei Xiao, Yuke Zhu, Linxi Fan, Anima Anandkumar

We introduce Voyager, the first LLM-powered embodied lifelong learning agent in Minecraft that continuously explores the world, acquires diverse skills, and makes novel discoveries without human intervention.

Eureka: Human-Level Reward Design via Coding Large Language Models

1 code implementation19 Oct 2023 Yecheng Jason Ma, William Liang, Guanzhi Wang, De-An Huang, Osbert Bastani, Dinesh Jayaraman, Yuke Zhu, Linxi Fan, Anima Anandkumar

The generality of Eureka also enables a new gradient-free in-context learning approach to reinforcement learning from human feedback (RLHF), readily incorporating human inputs to improve the quality and the safety of the generated rewards without model updating.

Decision Making In-Context Learning +1

Speeding up Fourier Neural Operators via Mixed Precision

1 code implementation27 Jul 2023 Colin White, Renbo Tu, Jean Kossaifi, Gennady Pekhimenko, Kamyar Azizzadenesheli, Anima Anandkumar

In this work, we (i) profile memory and runtime for FNO with full and mixed-precision training, (ii) conduct a study on the numerical stability of mixed-precision training of FNO, and (iii) devise a training routine which substantially decreases training time and memory usage (up to 34%), with little or no reduction in accuracy, on the Navier-Stokes and Darcy flow equations.
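
For orientation, a training routine of this kind typically wraps the forward pass in an autocast region and scales the loss to protect half-precision gradients. Below is a minimal sketch of that standard PyTorch pattern on a stand-in model; it is not the paper's FNO-specific routine or its numerical-stability analysis.

    # Minimal mixed-precision training loop (standard PyTorch autocast/GradScaler
    # pattern on a stand-in model, not the paper's FNO-specific routine).
    import torch

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.GELU(),
                                torch.nn.Linear(64, 64)).to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    # GradScaler guards against fp16 gradient underflow; disabled on CPU.
    scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

    for step in range(100):
        x = torch.randn(32, 64, device=device)
        y = torch.randn(32, 64, device=device)
        optimizer.zero_grad(set_to_none=True)
        # The forward pass runs in reduced precision inside the autocast region.
        amp_dtype = torch.float16 if device == "cuda" else torch.bfloat16
        with torch.autocast(device_type=device, dtype=amp_dtype):
            loss = torch.nn.functional.mse_loss(model(x), y)
        scaler.scale(loss).backward()   # backward on the scaled loss
        scaler.step(optimizer)          # unscale gradients, update in fp32
        scaler.update()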

Neural Operator: Learning Maps Between Function Spaces

1 code implementation19 Aug 2021 Nikola Kovachki, Zongyi Li, Burigede Liu, Kamyar Azizzadenesheli, Kaushik Bhattacharya, Andrew Stuart, Anima Anandkumar

The classical development of neural networks has primarily focused on learning mappings between finite dimensional Euclidean spaces or finite sets.

Operator learning
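
As a pointer to the formulation behind this line of work, a neural operator layer composes a pointwise linear map with a learned kernel integral over the domain (simplified here; lifting/projection maps and bias terms are omitted):

    v_{t+1}(x) = \sigma\Big( W\, v_t(x) + \int_{D} \kappa_\theta(x, y)\, v_t(y)\, \mathrm{d}y \Big), \qquad x \in D,

where v_t is the current function, W is a local linear transform, \kappa_\theta is a learned kernel, and \sigma is a pointwise nonlinearity.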

TensorLy: Tensor Learning in Python

1 code implementation29 Oct 2016 Jean Kossaifi, Yannis Panagakis, Anima Anandkumar, Maja Pantic

In addition, using the deep-learning frameworks as backend allows users to easily design and train deep tensorized neural networks.
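
As a quick illustration of the backend mechanism, the sketch below switches TensorLy to the PyTorch backend and fits a small CP decomposition. The API names (set_backend, parafac, cp_to_tensor) follow recent TensorLy releases and may differ slightly in older versions.

    # TensorLy backend switching plus a small CP decomposition (API names per
    # recent TensorLy releases; older versions may differ).
    import numpy as np
    import tensorly as tl
    from tensorly.decomposition import parafac

    tl.set_backend("pytorch")                    # subsequent tensors are torch tensors
    X = tl.tensor(np.random.rand(8, 8, 8))
    weights, factors = parafac(X, rank=3)        # rank-3 CP decomposition
    X_hat = tl.cp_to_tensor((weights, factors))  # reconstruct to check the fit
    print(tl.norm(X - X_hat) / tl.norm(X))       # relative reconstruction error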

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

1 code implementation6 Mar 2024 Jiawei Zhao, Zhenyu Zhang, Beidi Chen, Zhangyang Wang, Anima Anandkumar, Yuandong Tian

Our approach reduces memory usage by up to 65.5% in optimizer states while maintaining both efficiency and performance for pre-training on LLaMA 1B and 7B architectures with the C4 dataset with up to 19.7B tokens, and on fine-tuning RoBERTa on GLUE tasks.
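
The underlying idea admits a compact sketch: project each weight-matrix gradient onto a low-rank subspace obtained from its SVD, keep optimizer statistics in that subspace, and project the update back to the full parameter space. The toy code below illustrates that idea only; it is not the released GaLore optimizer, and the rank, refresh interval, and momentum rule are illustrative choices.

    # Rough sketch of low-rank gradient projection (the idea behind GaLore), not
    # the released implementation: project gradients into a rank-r subspace, keep
    # momentum there, and project the update back to the full weight matrix.
    import torch

    def low_rank_step(weight, grad, state, rank=4, lr=1e-3, beta=0.9, refresh=200):
        step = state.get("step", 0)
        if step % refresh == 0 or "P" not in state:       # periodically refresh subspace
            U, _, _ = torch.linalg.svd(grad, full_matrices=False)
            state["P"] = U[:, :rank]                       # projection matrix (m x r)
            state["m"] = torch.zeros(rank, grad.shape[1])  # momentum lives in the subspace
        P = state["P"]
        g_low = P.T @ grad                                 # projected gradient (r x n)
        state["m"] = beta * state["m"] + (1 - beta) * g_low
        weight -= lr * (P @ state["m"])                    # project back and apply update
        state["step"] = step + 1

    W, state = torch.randn(64, 32), {}
    for _ in range(5):
        g = torch.randn(64, 32)          # stand-in for a real gradient
        low_rank_step(W, g, state)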

VIMA: General Robot Manipulation with Multimodal Prompts

2 code implementations6 Oct 2022 Yunfan Jiang, Agrim Gupta, Zichen Zhang, Guanzhi Wang, Yongqiang Dou, Yanjun Chen, Li Fei-Fei, Anima Anandkumar, Yuke Zhu, Linxi Fan

We show that a wide spectrum of robot manipulation tasks can be expressed with multimodal prompts, interleaving textual and visual tokens.

Imitation Learning Language Modelling +3

Understanding The Robustness in Vision Transformers

2 code implementations26 Apr 2022 Daquan Zhou, Zhiding Yu, Enze Xie, Chaowei Xiao, Anima Anandkumar, Jiashi Feng, Jose M. Alvarez

Our study is motivated by the intriguing properties of the emerging visual grouping in Vision Transformers, which indicates that self-attention may promote robustness through improved mid-level representations.

Ranked #4 on Domain Generalization on ImageNet-R (using extra training data)

Domain Generalization Image Classification +3

LeanDojo: Theorem Proving with Retrieval-Augmented Language Models

3 code implementations NeurIPS 2023 Kaiyu Yang, Aidan M. Swope, Alex Gu, Rahul Chalamala, Peiyang Song, Shixing Yu, Saad Godil, Ryan Prenger, Anima Anandkumar

Using this data, we develop ReProver (Retrieval-Augmented Prover): an LLM-based prover augmented with retrieval for selecting premises from a vast math library.

Automated Theorem Proving Math +1
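
As a rough picture of premise retrieval, the sketch below embeds a goal and candidate premises, ranks premises by cosine similarity, and keeps the top-k. TF-IDF stands in for ReProver's trained retrieval encoder, and the premise strings are made up; this is not LeanDojo's API.

    # Generic premise-retrieval sketch: rank library premises against the current
    # proof goal by cosine similarity. TF-IDF is a stand-in for a trained encoder.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    premises = [
        "add_comm : addition of natural numbers is commutative",
        "mul_comm : multiplication of natural numbers is commutative",
        "length_append : length of the concatenation of two lists",
    ]
    goal = "show that addition of two natural numbers is commutative"

    vectorizer = TfidfVectorizer()
    premise_vecs = vectorizer.fit_transform(premises)   # embed the premise library
    goal_vec = vectorizer.transform([goal])             # embed the current proof goal

    scores = cosine_similarity(goal_vec, premise_vecs)[0]
    top_k = scores.argsort()[::-1][:2]                  # best-matching premises first
    print([premises[i] for i in top_k])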

Pre-Trained Language Models for Interactive Decision-Making

1 code implementation3 Feb 2022 Shuang Li, Xavier Puig, Chris Paxton, Yilun Du, Clinton Wang, Linxi Fan, Tao Chen, De-An Huang, Ekin Akyürek, Anima Anandkumar, Jacob Andreas, Igor Mordatch, Antonio Torralba, Yuke Zhu

Together, these results suggest that language modeling induces representations that are useful for modeling not just language, but also goals and plans; these representations can aid learning and generalization even outside of language processing.

Imitation Learning Language Modelling

Long-Short Transformer: Efficient Transformers for Language and Vision

3 code implementations NeurIPS 2021 Chen Zhu, Wei Ping, Chaowei Xiao, Mohammad Shoeybi, Tom Goldstein, Anima Anandkumar, Bryan Catanzaro

For instance, Transformer-LS achieves 0.97 test BPC on enwik8 using half the number of parameters of the previous method, while being faster and able to handle sequences 3x as long as its full-attention version on the same hardware.

Language Modelling

FreeSOLO: Learning to Segment Objects without Annotations

1 code implementation CVPR 2022 Xinlong Wang, Zhiding Yu, Shalini De Mello, Jan Kautz, Anima Anandkumar, Chunhua Shen, Jose M. Alvarez

FreeSOLO further demonstrates superiority as a strong pre-training method, outperforming state-of-the-art self-supervised pre-training methods by +9.8% AP when fine-tuning instance segmentation with only 5% COCO masks.

Instance Segmentation object-detection +4

Neural Operator: Graph Kernel Network for Partial Differential Equations

6 code implementations ICLR Workshop DeepDiffEq 2019 Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, Anima Anandkumar

The classical development of neural networks has been primarily for mappings between a finite-dimensional Euclidean space and a set of classes, or between two finite-dimensional Euclidean spaces.

Multipole Graph Neural Operator for Parametric Partial Differential Equations

4 code implementations NeurIPS 2020 Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, Anima Anandkumar

One of the main challenges in using deep learning-based methods for simulating physical systems and solving partial differential equations (PDEs) is formulating physics-based data in the desired structure for neural networks.

Fast Training of Diffusion Models with Masked Transformers

1 code implementation15 Jun 2023 Hongkai Zheng, Weili Nie, Arash Vahdat, Anima Anandkumar

For masked training, we introduce an asymmetric encoder-decoder architecture consisting of a transformer encoder that operates only on unmasked patches and a lightweight transformer decoder on full patches.

Denoising Representation Learning

MinVIS: A Minimal Video Instance Segmentation Framework without Video-based Training

2 code implementations3 Aug 2022 De-An Huang, Zhiding Yu, Anima Anandkumar

By only training a query-based image instance segmentation model, MinVIS outperforms the previous best result on the challenging Occluded VIS dataset by over 10% AP.

Instance Segmentation Segmentation +2

Spherical Fourier Neural Operators: Learning Stable Dynamics on the Sphere

2 code implementations6 Jun 2023 Boris Bonev, Thorsten Kurth, Christian Hundt, Jaideep Pathak, Maximilian Baust, Karthik Kashinath, Anima Anandkumar

Fourier Neural Operators (FNOs) have proven to be an efficient and effective method for resolution-independent operator learning in a broad variety of application areas across scientific machine learning.

Operator learning
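
For readers new to FNOs, the sketch below shows the kind of spectral convolution such operators stack: FFT the input, apply learned complex weights to the lowest modes, and transform back. It is a simplified 1-D planar version, not the spherical harmonic transform this paper introduces, and far smaller than the released neuraloperator code.

    # Minimal 1-D spectral convolution in the spirit of FNO (simplified sketch):
    # FFT, learned complex weights on the lowest modes, zeros elsewhere, inverse FFT.
    import torch

    class SpectralConv1d(torch.nn.Module):
        def __init__(self, channels, modes):
            super().__init__()
            self.modes = modes
            scale = 1.0 / channels
            self.weight = torch.nn.Parameter(
                scale * torch.randn(channels, channels, modes, dtype=torch.cfloat))

        def forward(self, x):                      # x: (batch, channels, grid)
            x_ft = torch.fft.rfft(x)               # complex spectrum
            out_ft = torch.zeros_like(x_ft)
            out_ft[..., :self.modes] = torch.einsum(
                "bix,iox->box", x_ft[..., :self.modes], self.weight)
            return torch.fft.irfft(out_ft, n=x.shape[-1])

    layer = SpectralConv1d(channels=8, modes=12)
    u = torch.randn(4, 8, 64)                      # functions sampled on a 64-point grid
    print(layer(u).shape)                          # (4, 8, 64); weights are grid-independent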

Learning Dissipative Dynamics in Chaotic Systems

2 code implementations13 Jun 2021 Zongyi Li, Miguel Liu-Schiaffini, Nikola Kovachki, Burigede Liu, Kamyar Azizzadenesheli, Kaushik Bhattacharya, Andrew Stuart, Anima Anandkumar

Chaotic systems are notoriously challenging to predict because of their sensitivity to perturbations and errors due to time stepping.

Diffusion Models for Adversarial Purification

2 code implementations16 May 2022 Weili Nie, Brandon Guo, Yujia Huang, Chaowei Xiao, Arash Vahdat, Anima Anandkumar

Adversarial purification refers to a class of defense methods that remove adversarial perturbations using a generative model.

InRank: Incremental Low-Rank Learning

1 code implementation20 Jun 2023 Jiawei Zhao, Yifei Zhang, Beidi Chen, Florian Schäfer, Anima Anandkumar

To remedy this, we design a new training algorithm Incremental Low-Rank Learning (InRank), which explicitly expresses cumulative weight updates as low-rank matrices while incrementally augmenting their ranks during training.

Computational Efficiency

I$^2$SB: Image-to-Image Schrödinger Bridge

1 code implementation12 Feb 2023 Guan-Horng Liu, Arash Vahdat, De-An Huang, Evangelos A. Theodorou, Weili Nie, Anima Anandkumar

We propose Image-to-Image Schrödinger Bridge (I$^2$SB), a new class of conditional diffusion models that directly learn the nonlinear diffusion processes between two given distributions.

Deblurring Image Restoration +1

Adaptive Fourier Neural Operators: Efficient Token Mixers for Transformers

2 code implementations24 Nov 2021 John Guibas, Morteza Mardani, Zongyi Li, Andrew Tao, Anima Anandkumar, Bryan Catanzaro

AFNO is based on a principled foundation of operator learning which allows us to frame token mixing as a continuous global convolution without any dependence on the input resolution.

Computational Efficiency Operator learning +1

Multi-modal Molecule Structure-text Model for Text-based Retrieval and Editing

1 code implementation21 Dec 2022 Shengchao Liu, Weili Nie, Chengpeng Wang, Jiarui Lu, Zhuoran Qiao, Ling Liu, Jian Tang, Chaowei Xiao, Anima Anandkumar

Here we present a multi-modal molecule structure-text model, MoleculeSTM, by jointly learning molecules' chemical structures and textual descriptions via a contrastive learning strategy.

Contrastive Learning Drug Discovery +2
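
The contrastive objective in this family of models can be sketched as a symmetric InfoNCE loss between paired embeddings. In the code below the random tensors stand in for molecule-structure and text-encoder outputs; it is a generic sketch, not MoleculeSTM's actual encoders or training setup.

    # Generic symmetric InfoNCE loss between paired embeddings; random tensors
    # stand in for molecule-structure and text-encoder outputs.
    import torch
    import torch.nn.functional as F

    def contrastive_loss(z_mol, z_txt, temperature=0.07):
        z_mol = F.normalize(z_mol, dim=-1)
        z_txt = F.normalize(z_txt, dim=-1)
        logits = z_mol @ z_txt.T / temperature      # (batch, batch) similarity matrix
        targets = torch.arange(z_mol.shape[0])      # matching pairs lie on the diagonal
        return 0.5 * (F.cross_entropy(logits, targets) +
                      F.cross_entropy(logits.T, targets))

    z_mol = torch.randn(16, 128)   # stand-in for molecule-structure embeddings
    z_txt = torch.randn(16, 128)   # stand-in for text-description embeddings
    print(contrastive_loss(z_mol, z_txt))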

Open Vocabulary Learning on Source Code with a Graph-Structured Cache

3 code implementations ICLR 2019 Milan Cvitkovic, Badal Singh, Anima Anandkumar

Machine learning models that take computer program source code as input typically use Natural Language Processing (NLP) techniques.

Code Completion

Probabilistic FastText for Multi-Sense Word Embeddings

1 code implementation ACL 2018 Ben Athiwaratkun, Andrew Gordon Wilson, Anima Anandkumar

We introduce Probabilistic FastText, a new model for word embeddings that can capture multiple word senses, sub-word structure, and uncertainty information.

Word Embeddings Word Similarity

Fourier Neural Operator with Learned Deformations for PDEs on General Geometries

6 code implementations11 Jul 2022 Zongyi Li, Daniel Zhengyu Huang, Burigede Liu, Anima Anandkumar

The resulting geo-FNO model has both the computational efficiency of FFT and the flexibility of handling arbitrary geometries.

Neural-Fly Enables Rapid Learning for Agile Flight in Strong Winds

1 code implementation13 May 2022 Michael O'Connell, Guanya Shi, Xichen Shi, Kamyar Azizzadenesheli, Anima Anandkumar, Yisong Yue, Soon-Jo Chung

Last, our control design extrapolates to unseen wind conditions, is shown to be effective for outdoor flights with only onboard sensors, and can transfer across drones with minimal performance degradation.

Meta-Learning

FocalFormer3D: Focusing on Hard Instance for 3D Object Detection

1 code implementation ICCV 2023 Yilun Chen, Zhiding Yu, Yukang Chen, Shiyi Lan, Anima Anandkumar, Jiaya Jia, Jose M. Alvarez

For 3D object detection, we instantiate this method as FocalFormer3D, a simple yet effective detector that excels at excavating difficult objects and improving prediction recall.

3D Object Detection Autonomous Driving +2

Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models

2 code implementations15 Sep 2022 Manli Shu, Weili Nie, De-An Huang, Zhiding Yu, Tom Goldstein, Anima Anandkumar, Chaowei Xiao

In evaluating cross-dataset generalization with unseen categories, TPT performs on par with the state-of-the-art approaches that use additional training data.

Image Classification Zero-shot Generalization
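
The flavor of test-time tuning here can be sketched as entropy minimization over augmented views with only a prompt parameter trainable. The toy code below uses a random frozen linear head and Gaussian-perturbed features as stand-ins for the CLIP-based pipeline in the paper.

    # Toy test-time entropy minimization over augmented views; the frozen linear
    # head and noisy features are placeholders for the paper's CLIP pipeline.
    import torch
    import torch.nn.functional as F

    torch.manual_seed(0)
    frozen = torch.nn.Linear(64, 10)                 # stand-in for a frozen VLM head
    for p in frozen.parameters():
        p.requires_grad_(False)

    prompt = torch.zeros(64, requires_grad=True)     # the only tunable parameter
    opt = torch.optim.AdamW([prompt], lr=5e-3)

    image_feat = torch.randn(64)
    views = [image_feat + 0.1 * torch.randn(64) for _ in range(8)]   # "augmented" views

    for _ in range(10):
        probs = torch.stack([F.softmax(frozen(v + prompt), dim=-1) for v in views])
        marginal = probs.mean(dim=0)                 # average prediction over views
        entropy = -(marginal * marginal.clamp_min(1e-9).log()).sum()
        opt.zero_grad()
        entropy.backward()
        opt.step()

    print(F.softmax(frozen(image_feat + prompt), dim=-1).argmax())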

Competitive Gradient Descent

8 code implementations NeurIPS 2019 Florian Schäfer, Anima Anandkumar

We introduce a new algorithm for the numerical computation of Nash equilibria of competitive two-player games.

Implicit competitive regularization in GANs

3 code implementations ICML 2020 Florian Schäfer, Hongkai Zheng, Anima Anandkumar

We show that opponent-aware modelling of generator and discriminator, as present in competitive gradient descent (CGD), can significantly strengthen ICR and thus stabilize GAN training without explicit regularization.

Image Generation

OSCAR: Data-Driven Operational Space Control for Adaptive and Robust Robot Manipulation

1 code implementation2 Oct 2021 Josiah Wong, Viktor Makoviychuk, Anima Anandkumar, Yuke Zhu

Operational Space Control (OSC) has been used as an effective task-space controller for manipulation.

Robot Manipulation

MeshfreeFlowNet: A Physics-Constrained Deep Continuous Space-Time Super-Resolution Framework

1 code implementation1 May 2020 Chiyu Max Jiang, Soheil Esmaeilzadeh, Kamyar Azizzadenesheli, Karthik Kashinath, Mustafa Mustafa, Hamdi A. Tchelepi, Philip Marcus, Prabhat, Anima Anandkumar

We propose MeshfreeFlowNet, a novel deep learning-based super-resolution framework to generate continuous (grid-free) spatio-temporal solutions from the low-resolution inputs.

Super-Resolution

Symmetry-Informed Geometric Representation for Molecules, Proteins, and Crystalline Materials

1 code implementation NeurIPS 2023 Shengchao Liu, Weitao Du, Yanjing Li, Zhuoxinran Li, Zhiling Zheng, Chenru Duan, ZhiMing Ma, Omar Yaghi, Anima Anandkumar, Christian Borgs, Jennifer Chayes, Hongyu Guo, Jian Tang

Artificial intelligence for scientific discovery has recently generated significant interest within the machine learning and scientific communities, particularly in the domains of chemistry, biology, and material discovery.

Benchmarking

T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching

1 code implementation21 Feb 2024 Zizheng Pan, Bohan Zhuang, De-An Huang, Weili Nie, Zhiding Yu, Chaowei Xiao, Jianfei Cai, Anima Anandkumar

Sampling from diffusion probabilistic models (DPMs) is often expensive for high-quality image generation and typically requires many steps with a large model.

Image Generation

signSGD: Compressed Optimisation for Non-Convex Problems

3 code implementations ICML 2018 Jeremy Bernstein, Yu-Xiang Wang, Kamyar Azizzadenesheli, Anima Anandkumar

Using a theorem by Gauss we prove that majority vote can achieve the same reduction in variance as full precision distributed SGD.

signSGD with Majority Vote is Communication Efficient And Fault Tolerant

3 code implementations ICLR 2019 Jeremy Bernstein, Jia-Wei Zhao, Kamyar Azizzadenesheli, Anima Anandkumar

Workers transmit only the sign of their gradient vector to a server, and the overall update is decided by a majority vote.

Benchmarking
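
The communication scheme is simple enough to sketch end to end: each worker sends only the sign of its gradient, and the server returns the coordinate-wise majority vote. The toy NumPy version below illustrates this; real implementations operate per layer on GPU tensors.

    # Toy sign compression with a majority-vote server (per-coordinate votes).
    import numpy as np

    def worker_message(grad):
        return np.sign(grad).astype(np.int8)           # one sign bit per coordinate

    def server_aggregate(messages):
        votes = np.sum(messages, axis=0)               # sum of +/-1 votes per coordinate
        return np.sign(votes)                          # majority vote picks the direction

    rng = np.random.default_rng(0)
    true_grad = rng.standard_normal(10)
    workers = [true_grad + rng.standard_normal(10) for _ in range(7)]  # noisy local gradients

    direction = server_aggregate([worker_message(g) for g in workers])
    theta = np.zeros(10)
    theta -= 0.01 * direction                          # workers apply the voted sign update
    print(direction)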

U-FNO -- An enhanced Fourier neural operator-based deep-learning model for multiphase flow

1 code implementation3 Sep 2021 Gege Wen, Zongyi Li, Kamyar Azizzadenesheli, Anima Anandkumar, Sally M. Benson

Here we present U-FNO, a novel neural network architecture for solving multiphase flow problems with superior accuracy, speed, and data efficiency.

Decision Making

ZerO Initialization: Initializing Neural Networks with only Zeros and Ones

1 code implementation25 Oct 2021 Jiawei Zhao, Florian Schäfer, Anima Anandkumar

Deep neural networks are usually initialized with random weights, with adequately selected initial variance to ensure stable signal propagation during training.

Image Classification

Long-term Forecasting using Higher Order Tensor RNNs

1 code implementation ICLR 2018 Rose Yu, Stephan Zheng, Anima Anandkumar, Yisong Yue

We present Higher-Order Tensor RNN (HOT-RNN), a novel family of neural sequence architectures for multivariate forecasting in environments with nonlinear dynamics.

Time Series Time Series Analysis

RelViT: Concept-guided Vision Transformer for Visual Relational Reasoning

1 code implementation ICLR 2022 Xiaojian Ma, Weili Nie, Zhiding Yu, Huaizu Jiang, Chaowei Xiao, Yuke Zhu, Song-Chun Zhu, Anima Anandkumar

This task remains challenging for current deep learning algorithms since it requires addressing three key technical problems jointly: 1) identifying object entities and their properties, 2) inferring semantic relations between pairs of entities, and 3) generalizing to novel object-relation combinations, i.e., systematic generalization.

Human-Object Interaction Detection Object +5

Bongard-HOI: Benchmarking Few-Shot Visual Reasoning for Human-Object Interactions

1 code implementation CVPR 2022 Huaizu Jiang, Xiaojian Ma, Weili Nie, Zhiding Yu, Yuke Zhu, Song-Chun Zhu, Anima Anandkumar

A significant gap remains between today's visual pattern recognition models and human-level visual cognition especially when it comes to few-shot learning and compositional reasoning of novel concepts.

Benchmarking Few-Shot Image Classification +5

Born Again Neural Networks

2 code implementations ICML 2018 Tommaso Furlanello, Zachary C. Lipton, Michael Tschannen, Laurent Itti, Anima Anandkumar

Knowledge distillation (KD) consists of transferring knowledge from one machine learning model (the teacher) to another (the student).

Image Classification Knowledge Distillation

Learning compositional functions via multiplicative weight updates

1 code implementation NeurIPS 2020 Jeremy Bernstein, Jia-Wei Zhao, Markus Meister, Ming-Yu Liu, Anima Anandkumar, Yisong Yue

This paper proves that multiplicative weight updates satisfy a descent lemma tailored to compositional functions.

LEMMA

StrassenNets: Deep Learning with a Multiplication Budget

1 code implementation ICML 2018 Michael Tschannen, Aran Khanna, Anima Anandkumar

A large fraction of the arithmetic operations required to evaluate deep neural networks (DNNs) consists of matrix multiplications, in both convolution and fully connected layers.

Image Classification Knowledge Distillation +2

SECANT: Self-Expert Cloning for Zero-Shot Generalization of Visual Policies

1 code implementation17 Jun 2021 Linxi Fan, Guanzhi Wang, De-An Huang, Zhiding Yu, Li Fei-Fei, Yuke Zhu, Anima Anandkumar

A student network then learns to mimic the expert policy by supervised learning with strong augmentations, making its representation more robust against visual variations compared to the expert.

Autonomous Driving Image Augmentation +3

Retrieval-based Controllable Molecule Generation

1 code implementation23 Aug 2022 Zichao Wang, Weili Nie, Zhuoran Qiao, Chaowei Xiao, Richard Baraniuk, Anima Anandkumar

On various tasks ranging from simple design criteria to a challenging real-world scenario for designing lead compounds that bind to the SARS-CoV-2 main protease, we demonstrate our approach extrapolates well beyond the retrieval database, and achieves better performance and wider applicability than previous methods.

Drug Discovery Retrieval

Multi-Step Stochastic ADMM in High Dimensions: Applications to Sparse Optimization and Noisy Matrix Decomposition

2 code implementations NeurIPS 2014 Hanie Sedghi, Anima Anandkumar, Edmond Jonckheere

For sparse optimization, we establish that the modified ADMM method has an optimal convergence rate of $\mathcal{O}(s\log d/T)$, where $s$ is the sparsity level, $d$ is the data dimension and $T$ is the number of steps.

Fully Attentional Networks with Self-emerging Token Labeling

1 code implementation ICCV 2023 Bingyin Zhao, Zhiding Yu, Shiyi Lan, Yutao Cheng, Anima Anandkumar, Yingjie Lao, Jose M. Alvarez

With the proposed STL framework, our best model based on FAN-L-Hybrid (77.3M parameters) achieves 84.8% Top-1 accuracy and 42.1% mCE on ImageNet-1K and ImageNet-C, and sets a new state-of-the-art for ImageNet-A (46.1%) and ImageNet-R (56.6%) without using extra data, outperforming the original FAN counterpart by significant margins.

Semantic Segmentation

1st Place Solution of The Robust Vision Challenge 2022 Semantic Segmentation Track

1 code implementation23 Oct 2022 Junfei Xiao, Zhichao Xu, Shiyi Lan, Zhiding Yu, Alan Yuille, Anima Anandkumar

The model is trained on a composite dataset consisting of images from 9 datasets (ADE20K, Cityscapes, Mapillary Vistas, ScanNet, VIPER, WildDash 2, IDD, BDD, and COCO) with a simple dataset balancing strategy.

Segmentation Semantic Segmentation

Generative Adversarial Neural Operators

2 code implementations6 May 2022 Md Ashiqur Rahman, Manuel A. Florez, Anima Anandkumar, Zachary E. Ross, Kamyar Azizzadenesheli

The inputs to the generator are samples of functions from a user-specified probability measure, e.g., Gaussian random field (GRF), and the generator outputs are synthetic data functions.

Hyperparameter Optimization

Competitive Policy Optimization

4 code implementations18 Jun 2020 Manish Prajapat, Kamyar Azizzadenesheli, Alexander Liniger, Yisong Yue, Anima Anandkumar

A core challenge in policy optimization in competitive Markov decision processes is the design of efficient optimization methods with desirable convergence and stability properties.

Policy Gradient Methods

Training Certifiably Robust Neural Networks with Efficient Local Lipschitz Bounds

1 code implementation NeurIPS 2021 Yujia Huang, Huan Zhang, Yuanyuan Shi, J. Zico Kolter, Anima Anandkumar

Certified robustness is a desirable property for deep neural networks in safety-critical applications, and popular training algorithms can certify robustness of a neural network by computing a global bound on its Lipschitz constant.

Stability Constrained Reinforcement Learning for Decentralized Real-Time Voltage Control

1 code implementation16 Sep 2022 Jie Feng, Yuanyuan Shi, Guannan Qu, Steven H. Low, Anima Anandkumar, Adam Wierman

In this paper, we propose a stability-constrained reinforcement learning (RL) method for real-time voltage control, that guarantees system stability both during policy learning and deployment of the learned policy.

reinforcement-learning Reinforcement Learning (RL)

Learning From Noisy Singly-labeled Data

1 code implementation ICLR 2018 Ashish Khetan, Zachary C. Lipton, Anima Anandkumar

We propose a new algorithm for jointly modeling labels and worker quality from noisy crowd-sourced data.

Neural Networks with Recurrent Generative Feedback

1 code implementation NeurIPS 2020 Yujia Huang, James Gornet, Sihui Dai, Zhiding Yu, Tan Nguyen, Doris Y. Tsao, Anima Anandkumar

This mechanism can be interpreted as a form of self-consistency between the maximum a posteriori (MAP) estimation of an internal generative model and the external environment.

Adversarial Robustness

Image-Level or Object-Level? A Tale of Two Resampling Strategies for Long-Tailed Detection

1 code implementation12 Apr 2021 Nadine Chang, Zhiding Yu, Yu-Xiong Wang, Anima Anandkumar, Sanja Fidler, Jose M. Alvarez

As a result, image resampling alone is not enough to yield a sufficiently balanced distribution at the object level.

Object

Deep Bayesian Quadrature Policy Optimization

1 code implementation28 Jun 2020 Akella Ravi Tej, Kamyar Azizzadenesheli, Mohammad Ghavamzadeh, Anima Anandkumar, Yisong Yue

On the other hand, more sample efficient alternatives like Bayesian quadrature methods have received little attention due to their high computational complexity.

Continuous Control Policy Gradient Methods

DPOT: Auto-Regressive Denoising Operator Transformer for Large-Scale PDE Pre-Training

1 code implementation6 Mar 2024 Zhongkai Hao, Chang Su, Songming Liu, Julius Berner, Chengyang Ying, Hang Su, Anima Anandkumar, Jian Song, Jun Zhu

Pre-training has been investigated to improve the efficiency and performance of training neural operators in data-scarce settings.

Denoising

Active Learning with Partial Feedback

1 code implementation ICLR 2019 Peiyun Hu, Zachary C. Lipton, Anima Anandkumar, Deva Ramanan

While many active learning papers assume that the learner can simply ask for a label and receive it, real annotation often presents a mismatch between the form of a label (say, one among many classes), and the form of an annotation (typically yes/no binary feedback).

Active Learning

Pretraining Codomain Attention Neural Operators for Solving Multiphysics PDEs

1 code implementation19 Mar 2024 Md Ashiqur Rahman, Robert Joseph George, Mogab Elleithy, Daniel Leibovici, Zongyi Li, Boris Bonev, Colin White, Julius Berner, Raymond A. Yeh, Jean Kossaifi, Kamyar Azizzadenesheli, Anima Anandkumar

On complex downstream tasks with limited data, such as fluid flow simulations and fluid-structure interactions, we found CoDA-NO to outperform existing methods on the few-shot learning task by over 36%.

Few-Shot Learning Self-Supervised Learning

Langevin Monte Carlo for Contextual Bandits

1 code implementation22 Jun 2022 Pan Xu, Hongkai Zheng, Eric Mazumdar, Kamyar Azizzadenesheli, Anima Anandkumar

Existing Thompson sampling-based algorithms need to construct a Laplace approximation (i.e., a Gaussian distribution) of the posterior distribution, which is inefficient to sample in high dimensional applications for general covariance matrices.

Multi-Armed Bandits Thompson Sampling
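
The sampling primitive these methods build on is (unadjusted) Langevin dynamics: a gradient step on the negative log-posterior plus Gaussian noise scaled by the square root of twice the step size. The toy sketch below targets a simple Gaussian posterior rather than a bandit model.

    # Toy unadjusted Langevin sampler for a Gaussian target N(mu, sigma^2); the
    # same update (gradient step + sqrt(2*step) noise) underlies the paper's method.
    import numpy as np

    rng = np.random.default_rng(0)
    mu, sigma = 2.0, 0.5

    def grad_neg_log_post(theta):
        return (theta - mu) / sigma**2

    theta, step = 0.0, 1e-3
    samples = []
    for t in range(20000):
        theta = theta - step * grad_neg_log_post(theta) \
                + np.sqrt(2 * step) * rng.standard_normal()
        if t > 5000:                       # discard burn-in
            samples.append(theta)

    print(np.mean(samples), np.std(samples))   # should approach (2.0, 0.5)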

CEM-GD: Cross-Entropy Method with Gradient Descent Planner for Model-Based Reinforcement Learning

1 code implementation14 Dec 2021 Kevin Huang, Sahin Lale, Ugo Rosolia, Yuanyuan Shi, Anima Anandkumar

It then uses the top trajectories as initialization for gradient descent and applies gradient updates to each of these trajectories to find the optimal action sequence.

Continuous Control Model-based Reinforcement Learning +1
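
A toy version of this planner is easy to sketch: run cross-entropy-method sampling, then refine the elite action sequences with a few gradient steps on a differentiable cost. The quadratic cost below is a stand-in for a learned dynamics/cost model, and the hyperparameters are illustrative.

    # Toy CEM + gradient-refinement planner; the quadratic cost stands in for a
    # differentiable learned model, and the loop sizes are arbitrary.
    import torch

    horizon, n_samples, n_elite = 5, 64, 8
    target = torch.tensor([1.0, -2.0, 0.5, 0.0, 1.5])

    def cost(actions):                              # differentiable surrogate objective
        return ((actions - target) ** 2).sum(dim=-1)

    mean, std = torch.zeros(horizon), torch.ones(horizon)
    for it in range(10):
        samples = mean + std * torch.randn(n_samples, horizon)
        elites = samples[cost(samples).argsort()[:n_elite]].clone().requires_grad_(True)
        opt = torch.optim.SGD([elites], lr=0.1)
        for _ in range(5):                          # gradient refinement of the elites
            opt.zero_grad()
            cost(elites).sum().backward()
            opt.step()
        elites = elites.detach()
        mean, std = elites.mean(dim=0), elites.std(dim=0) + 1e-3
    print(mean)                                     # approaches the target action sequence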

OCEAN: Online Task Inference for Compositional Tasks with Context Adaptation

1 code implementation17 Aug 2020 Hongyu Ren, Yuke Zhu, Jure Leskovec, Anima Anandkumar, Animesh Garg

We propose a variational inference framework OCEAN to perform online task inference for compositional tasks.

Variational Inference

Competitive Mirror Descent

3 code implementations17 Jun 2020 Florian Schäfer, Anima Anandkumar, Houman Owhadi

Finally, we obtain the next iterate by following this direction according to the dual geometry induced by the Bregman potential.

Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo

1 code implementation29 May 2023 Haque Ishfaq, Qingfeng Lan, Pan Xu, A. Rupam Mahmood, Doina Precup, Anima Anandkumar, Kamyar Azizzadenesheli

One of the key shortcomings of existing Thompson sampling algorithms is the need to perform a Gaussian approximation of the posterior distribution, which is not a good surrogate in most practical settings.

Efficient Exploration reinforcement-learning +2

Tensor Regression Networks

no code implementations26 Jul 2017 Jean Kossaifi, Zachary C. Lipton, Arinbjorn Kolbeinsson, Aran Khanna, Tommaso Furlanello, Anima Anandkumar

First, we introduce Tensor Contraction Layers (TCLs) that reduce the dimensionality of their input while preserving their multilinear structure using tensor contraction.

regression

Compact Tensor Pooling for Visual Question Answering

no code implementations20 Jun 2017 Yang Shi, Tommaso Furlanello, Anima Anandkumar

Performing high level cognitive tasks requires the integration of feature maps with drastically different structure.

Question Answering Visual Question Answering

Homotopy Analysis for Tensor PCA

no code implementations28 Oct 2016 Anima Anandkumar, Yuan Deng, Rong Ge, Hossein Mobahi

For the challenging problem of tensor PCA, we prove global convergence of the homotopy method in the "high noise" regime.

Tensor Contraction Layers for Parsimonious Deep Nets

no code implementations1 Jun 2017 Jean Kossaifi, Aran Khanna, Zachary C. Lipton, Tommaso Furlanello, Anima Anandkumar

Specifically, we propose the Tensor Contraction Layer (TCL), the first attempt to incorporate tensor contractions as end-to-end trainable neural network layers.

Model Compression
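
The contraction itself can be illustrated with TensorLy: multiply each non-batch mode of an activation tensor by a projection matrix, shrinking the activations while preserving their multilinear structure. In the paper these factor matrices are learned end to end; below they are random, so this is only a shape-level sketch.

    # Shape-level sketch of a tensor contraction in the TCL style: contract each
    # non-batch mode with a (here random, in the paper learned) projection matrix.
    import numpy as np
    import tensorly as tl
    from tensorly import tenalg

    tl.set_backend("numpy")
    activations = tl.tensor(np.random.rand(32, 64, 8, 8))   # (batch, channels, H, W)
    factors = [np.random.rand(16, 64),                      # channels: 64 -> 16
               np.random.rand(4, 8),                        # height:   8 -> 4
               np.random.rand(4, 8)]                        # width:    8 -> 4

    compressed = tenalg.multi_mode_dot(activations, factors, modes=[1, 2, 3])
    print(compressed.shape)                                 # (32, 16, 4, 4)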

Training Input-Output Recurrent Neural Networks through Spectral Methods

no code implementations3 Mar 2016 Hanie Sedghi, Anima Anandkumar

We consider the problem of training input-output recurrent neural networks (RNN) for sequence labeling tasks.

POS POS Tagging

Efficient approaches for escaping higher order saddle points in non-convex optimization

no code implementations18 Feb 2016 Anima Anandkumar, Rong Ge

Local search heuristics for non-convex optimizations are popular in applied machine learning.

Provable Tensor Methods for Learning Mixtures of Generalized Linear Models

no code implementations9 Dec 2014 Hanie Sedghi, Majid Janzamin, Anima Anandkumar

In contrast, we present a tensor decomposition method which is guaranteed to correctly recover the parameters.

General Classification Tensor Decomposition

Beating the Perils of Non-Convexity: Guaranteed Training of Neural Networks using Tensor Methods

no code implementations28 Jun 2015 Majid Janzamin, Hanie Sedghi, Anima Anandkumar

We propose a novel algorithm based on tensor decomposition for guaranteed training of two-layer neural networks.

Tensor Decomposition

Analyzing Tensor Power Method Dynamics in Overcomplete Regime

no code implementations6 Nov 2014 Anima Anandkumar, Rong Ge, Majid Janzamin

We present a novel analysis of the dynamics of tensor power iterations in the overcomplete regime where the tensor CP rank is larger than the input dimension.

A Scale Mixture Perspective of Multiplicative Noise in Neural Networks

no code implementations10 Jun 2015 Eric Nalisnick, Anima Anandkumar, Padhraic Smyth

Corrupting the input and hidden layers of deep neural networks (DNNs) with multiplicative noise, often drawn from the Bernoulli distribution (or 'dropout'), provides regularization that has significantly contributed to deep learning's success.

Model Compression

Multi-Object Classification and Unsupervised Scene Understanding Using Deep Learning Features and Latent Tree Probabilistic Models

no code implementations2 May 2015 Tejaswi Nimmagadda, Anima Anandkumar

We incorporate contextual information in natural images through a conditional latent tree probabilistic model (CLTM), where the object co-occurrences are conditioned on the extracted fc7 features from a pre-trained ImageNet CNN as input.

Classification Clustering +4

Provable Methods for Training Neural Networks with Sparse Connectivity

no code implementations8 Dec 2014 Hanie Sedghi, Anima Anandkumar

We provide novel guaranteed approaches for training feedforward neural networks with sparse connectivity.

Score Function Features for Discriminative Learning

no code implementations19 Dec 2014 Majid Janzamin, Hanie Sedghi, Anima Anandkumar

In this paper, we consider a novel class of matrix and tensor-valued features, which can be pre-trained using unlabeled samples.

Guaranteed Scalable Learning of Latent Tree Models

no code implementations18 Jun 2014 Furong Huang, Niranjan U. N., Ioakeim Perros, Robert Chen, Jimeng Sun, Anima Anandkumar

We present an integrated approach for structure and parameter estimation in latent tree graphical models.

Score Function Features for Discriminative Learning: Matrix and Tensor Framework

no code implementations9 Dec 2014 Majid Janzamin, Hanie Sedghi, Anima Anandkumar

In this paper, we consider a novel class of matrix and tensor-valued features, which can be pre-trained using unlabeled samples.

Tensor decompositions for learning latent variable models

no code implementations29 Oct 2012 Anima Anandkumar, Rong Ge, Daniel Hsu, Sham M. Kakade, Matus Telgarsky

This work considers a computationally and statistically efficient parameter estimation method for a wide class of latent variable models---including Gaussian mixture models, hidden Markov models, and latent Dirichlet allocation---which exploits a certain tensor structure in their low-order observable moments (typically, of second- and third-order).
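
For a concrete instance of the tensor structure being exploited, in an exchangeable single-topic model (and similarly for spherical Gaussian mixtures) the low-order observable moments take the form

    M_2 = \mathbb{E}[x_1 \otimes x_2] = \sum_{i=1}^{k} w_i\, \mu_i \otimes \mu_i,
    \qquad
    M_3 = \mathbb{E}[x_1 \otimes x_2 \otimes x_3] = \sum_{i=1}^{k} w_i\, \mu_i \otimes \mu_i \otimes \mu_i,

and the parameters (w_i, \mu_i) are recovered by decomposing M_3 after whitening with M_2.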

A Tensor Approach to Learning Mixed Membership Community Models

no code implementations12 Feb 2013 Anima Anandkumar, Rong Ge, Daniel Hsu, Sham M. Kakade

We provide guaranteed recovery of community memberships and model parameters and present a careful finite sample analysis of our learning method.

Community Detection Stochastic Block Model

Multi-Step Stochastic ADMM in High Dimensions: Applications to Sparse Optimization and Matrix Decomposition

no code implementations NeurIPS 2014 Hanie Sedghi, Anima Anandkumar, Edmond Jonckheere

We first analyze the simple setting, where the optimization problem consists of a loss function and a single regularizer (e.g., sparse optimization), and then extend to the multi-block setting with multiple regularizers and multiple variables (e.g., matrix decomposition into sparse and low rank components).

A Spectral Algorithm for Latent Dirichlet Allocation

no code implementations NeurIPS 2012 Anima Anandkumar, Dean P. Foster, Daniel J. Hsu, Sham M. Kakade, Yi-Kai Liu

This work provides a simple and efficient learning procedure that is guaranteed to recover the parameters for a wide class of topic models, including Latent Dirichlet Allocation (LDA).

Clustering Topic Models

Learning Mixtures of Tree Graphical Models

no code implementations NeurIPS 2012 Anima Anandkumar, Daniel J. Hsu, Furong Huang, Sham M. Kakade

We consider unsupervised estimation of mixtures of discrete graphical models, where the class variable is hidden and each mixture component can have a potentially different Markov graph structure and parameters over the observed variables.

Neural Rendering Model: Joint Generation and Prediction for Semi-Supervised Learning

no code implementations ICLR 2019 Nhat Ho, Tan Nguyen, Ankit B. Patel, Anima Anandkumar, Michael. I. Jordan, Richard G. Baraniuk

The conjugate prior yields a new regularizer for learning based on the paths rendered in the generative model for training CNNs: the Rendering Path Normalization (RPN).

Neural Rendering

Tensor Contraction & Regression Networks

no code implementations ICLR 2018 Jean Kossaifi, Zack Chase Lipton, Aran Khanna, Tommaso Furlanello, Anima Anandkumar

Second, we introduce tensor regression layers, which express the output of a neural network as a low-rank multi-linear mapping from a high-order activation tensor to the softmax layer.

regression

Long-term Forecasting using Tensor-Train RNNs

no code implementations ICLR 2018 Rose Yu, Stephan Zheng, Anima Anandkumar, Yisong Yue

We present Tensor-Train RNN (TT-RNN), a novel family of neural sequence architectures for multivariate forecasting in environments with nonlinear dynamics.

Stochastic Linear Bandits with Hidden Low Rank Structure

no code implementations28 Jan 2019 Sahin Lale, Kamyar Azizzadenesheli, Anima Anandkumar, Babak Hassibi

We modify the image classification task into the SLB setting and empirically show that, when a pre-trained DNN provides the high dimensional feature representations, deploying PSLB results in significant reduction of regret and faster convergence to an accurate model compared to the state-of-the-art algorithm.

Decision Making Dimensionality Reduction +2

Tensor Dropout for Robust Learning

no code implementations27 Feb 2019 Arinbjörn Kolbeinsson, Jean Kossaifi, Yannis Panagakis, Adrian Bulat, Anima Anandkumar, Ioanna Tzoulaki, Paul Matthews

CNNs achieve remarkable performance by leveraging deep, over-parametrized architectures, trained on large datasets.

Image Classification Inductive Bias

Robust Regression for Safe Exploration in Control

no code implementations L4DC 2020 Anqi Liu, Guanya Shi, Soon-Jo Chung, Anima Anandkumar, Yisong Yue

To address this challenge, we present a deep robust regression model that is trained to directly predict the uncertainty bounds for safe exploration.

Generalization Bounds regression +1

Learning Causal State Representations of Partially Observable Environments

no code implementations25 Jun 2019 Amy Zhang, Zachary C. Lipton, Luis Pineda, Kamyar Azizzadenesheli, Anima Anandkumar, Laurent Itti, Joelle Pineau, Tommaso Furlanello

In this paper, we propose an algorithm to approximate causal states, which are the coarsest partition of the joint history of actions and observations in partially-observable Markov decision processes (POMDP).

Causal Inference

Directivity Modes of Earthquake Populations with Unsupervised Learning

no code implementations30 Jun 2019 Zachary E. Ross, Daniel T. Trugman, Kamyar Azizzadenesheli, Anima Anandkumar

A seismic spectral decomposition technique is used to first produce relative measurements of radiated energy for earthquakes in a spatially-compact cluster.

Multi Sense Embeddings from Topic Models

no code implementations WS 2019 Shobhit Jain, Sravan Babu Bodapati, Ramesh Nallapati, Anima Anandkumar

Distributed word embeddings have yielded state-of-the-art performance in many NLP tasks, mainly due to their success in capturing useful semantic information.

Topic Models Word Embeddings +1

Triply Robust Off-Policy Evaluation

no code implementations 13 Nov 2019 Anqi Liu, Hao Liu, Anima Anandkumar, Yisong Yue

Ours is a general approach that can be used to augment any existing OPE method that utilizes the direct method.

Multi-Armed Bandits Off-policy evaluation +1

InfoCNF: An Efficient Conditional Continuous Normalizing Flow with Adaptive Solvers

no code implementations9 Dec 2019 Tan M. Nguyen, Animesh Garg, Richard G. Baraniuk, Anima Anandkumar

Continuous Normalizing Flows (CNFs) have emerged as promising deep generative models for a wide range of tasks thanks to their invertibility and exact likelihood estimation.

Conditional Image Generation Time Series +1

A Bayesian Perspective of Convolutional Neural Networks through a Deconvolutional Generative Model

no code implementations1 Nov 2018 Tan Nguyen, Nhat Ho, Ankit Patel, Anima Anandkumar, Michael. I. Jordan, Richard G. Baraniuk

This conjugate prior yields a new regularizer based on paths rendered in the generative model for training CNNs: the Rendering Path Normalization (RPN).

Regret Minimization in Partially Observable Linear Quadratic Control

no code implementations31 Jan 2020 Sahin Lale, Kamyar Azizzadenesheli, Babak Hassibi, Anima Anandkumar

We propose a novel way to decompose the regret and provide an end-to-end sublinear regret upper bound for partially observable linear quadratic control.

Adaptive Control and Regret Minimization in Linear Quadratic Gaussian (LQG) Setting

no code implementations12 Mar 2020 Sahin Lale, Kamyar Azizzadenesheli, Babak Hassibi, Anima Anandkumar

We study the problem of adaptive control in partially observable linear quadratic Gaussian control systems, where the model dynamics are unknown a priori.

Chance-Constrained Trajectory Optimization for Safe Exploration and Learning of Nonlinear Systems

no code implementations9 May 2020 Yashwanth Kumar Nakka, Anqi Liu, Guanya Shi, Anima Anandkumar, Yisong Yue, Soon-Jo Chung

The Info-SNOC algorithm is used to compute a sub-optimal pool of safe motion plans that aid in exploration for learning unknown residual dynamics under safety constraints.

Motion Planning Optimal Motion Planning +1

Unsupervised Controllable Generation with Self-Training

no code implementations17 Jul 2020 Grigorios G. Chrysos, Jean Kossaifi, Zhiding Yu, Anima Anandkumar

Instead, we propose an unsupervised framework to learn a distribution of latent codes that control the generator through self-training.

Disentanglement

A Coach-Player Framework for Dynamic Team Composition

no code implementations1 Jan 2021 Bo Liu, Qiang Liu, Peter Stone, Animesh Garg, Yuke Zhu, Anima Anandkumar

The performance of our method is comparable or even better than the setting where all players have a full view of the environment, but no coach.

Zero-shot Generalization

Transferable Unsupervised Robust Representation Learning

no code implementations1 Jan 2021 De-An Huang, Zhiding Yu, Anima Anandkumar

We upend this view and show that URRL improves both the natural accuracy of unsupervised representation learning and its robustness to corruptions and adversarial noise.

Data Augmentation Representation Learning +1

Learning Calibrated Uncertainties for Domain Shift: A Distributionally Robust Learning Approach

no code implementations8 Oct 2020 Haoxuan Wang, Zhiding Yu, Yisong Yue, Anima Anandkumar, Anqi Liu, Junchi Yan

We propose a framework for learning calibrated uncertainties under domain shifts, where the source (training) distribution differs from the target (test) distribution.

Density Ratio Estimation Unsupervised Domain Adaptation

Stability and Identification of Random Asynchronous Linear Time-Invariant Systems

no code implementations8 Dec 2020 Sahin Lale, Oguzhan Teke, Babak Hassibi, Anima Anandkumar

In this model, each state variable is updated randomly and asynchronously with some probability according to the underlying system dynamics.

Dynamic Social Media Monitoring for Fast-Evolving Online Discussions

no code implementations24 Feb 2021 Maya Srikanth, Anqi Liu, Nicholas Adams-Cohen, Jian Cao, R. Michael Alvarez, Anima Anandkumar

However, collecting social media data using a static set of keywords fails to satisfy the growing need to monitor dynamic conversations and to study fast-changing topics.

Decision Making Time Series +1

Generating and Characterizing Scenarios for Safety Testing of Autonomous Vehicles

no code implementations12 Mar 2021 Zahra Ghodsi, Siva Kumar Sastry Hari, Iuri Frosio, Timothy Tsai, Alejandro Troccoli, Stephen W. Keckler, Siddharth Garg, Anima Anandkumar

Extracting interesting scenarios from real-world data as well as generating failure cases is important for the development and testing of autonomous systems.

Autonomous Vehicles

Stable Online Control of Linear Time-Varying Systems

no code implementations29 Apr 2021 Guannan Qu, Yuanyuan Shi, Sahin Lale, Anima Anandkumar, Adam Wierman

In this work, we propose an efficient online control algorithm, COvariance Constrained Online Linear Quadratic (COCO-LQ) control, that guarantees input-to-state stability for a large class of LTV systems while also minimizing the control cost.

Informing Geometric Deep Learning with Electronic Interactions to Accelerate Quantum Chemistry

no code implementations31 May 2021 Zhuoran Qiao, Anders S. Christensen, Matthew Welborn, Frederick R. Manby, Anima Anandkumar, Thomas F. Miller III

Predicting electronic energies, densities, and related chemical properties can facilitate the discovery of novel catalysts, medicines, and battery materials.

Tensor Methods in Computer Vision and Deep Learning

no code implementations7 Jul 2021 Yannis Panagakis, Jean Kossaifi, Grigorios G. Chrysos, James Oldfield, Mihalis A. Nicolaou, Anima Anandkumar, Stefanos Zafeiriou

Tensors, or multidimensional arrays, are data structures that can naturally represent visual data of multiple dimensions.

Representation Learning

Finite-time System Identification and Adaptive Control in Autoregressive Exogenous Systems

no code implementations26 Aug 2021 Sahin Lale, Kamyar Azizzadenesheli, Babak Hassibi, Anima Anandkumar

Using these guarantees, we design adaptive control algorithms for unknown ARX systems with arbitrary strongly convex or convex quadratic regulating costs.

Auditing AI models for Verified Deployment under Semantic Specifications

no code implementations25 Sep 2021 Homanga Bharadhwaj, De-An Huang, Chaowei Xiao, Anima Anandkumar, Animesh Garg

We enable such unit tests through variations in a semantically-interpretable latent space of a generative model.

Face Recognition

Stability Constrained Reinforcement Learning for Real-Time Voltage Control

no code implementations30 Sep 2021 Yuanyuan Shi, Guannan Qu, Steven Low, Anima Anandkumar, Adam Wierman

Deep reinforcement learning (RL) has been recognized as a promising tool to address the challenges in real-time control of power systems.

reinforcement-learning Reinforcement Learning (RL)

Efficient Token Mixing for Transformers via Adaptive Fourier Neural Operators

no code implementations ICLR 2022 John Guibas, Morteza Mardani, Zongyi Li, Andrew Tao, Anima Anandkumar, Bryan Catanzaro

AFNO is based on a principled foundation of operator learning which allows us to frame token mixing as a continuous global convolution without any dependence on the input resolution.

Computational Efficiency Operator learning +1

Scaling Fair Learning to Hundreds of Intersectional Groups

no code implementations 29 Sep 2021 Eric Zhao, De-An Huang, Hao Liu, Zhiding Yu, Anqi Liu, Olga Russakovsky, Anima Anandkumar

In real-world applications, however, there are multiple protected attributes yielding a large number of intersectional protected groups.

Attribute Fairness +1

Adversarial Skill Chaining for Long-Horizon Robot Manipulation via Terminal State Regularization

no code implementations15 Nov 2021 Youngwoon Lee, Joseph J. Lim, Anima Anandkumar, Yuke Zhu

However, these approaches require larger state distributions to be covered as more policies are sequenced, and thus are limited to short skill sequences.

Reinforcement Learning (RL) Robot Manipulation

Polymatrix Competitive Gradient Descent

no code implementations16 Nov 2021 Jeffrey Ma, Alistair Letcher, Florian Schäfer, Yuanyuan Shi, Anima Anandkumar

In this work we propose polymatrix competitive gradient descent (PCGD) as a method for solving general sum competitive optimization involving arbitrary numbers of agents.

Multi-agent Reinforcement Learning

Adversarially Robust 3D Point Cloud Recognition Using Self-Supervisions

no code implementations NeurIPS 2021 Jiachen Sun, Yulong Cao, Christopher B. Choy, Zhiding Yu, Anima Anandkumar, Zhuoqing Morley Mao, Chaowei Xiao

In this paper, we systematically study the impact of various self-supervised learning proxy tasks on different architectures and threat models for 3D point clouds with adversarial training.

Adversarial Robustness Autonomous Driving +1

InfoCNF: Efficient Conditional Continuous Normalizing Flow Using Adaptive Solvers

no code implementations25 Sep 2019 Tan M. Nguyen, Animesh Garg, Richard G. Baraniuk, Anima Anandkumar

Continuous Normalizing Flows (CNFs) have emerged as promising deep generative models for a wide range of tasks thanks to their invertibility and exact likelihood estimation.

Conditional Image Generation Time Series +1

Distributionally Robust Learning for Unsupervised Domain Adaptation

no code implementations28 Sep 2020 Haoxuan Wang, Anqi Liu, Zhiding Yu, Yisong Yue, Anima Anandkumar

This formulation motivates the use of two neural networks that are jointly trained --- a discriminative network between the source and target domains for density-ratio estimation, in addition to the standard classification network.

Density Ratio Estimation Unsupervised Domain Adaptation

Few-shot Instruction Prompts for Pretrained Language Models to Detect Social Biases

no code implementations15 Dec 2021 Shrimai Prabhumoye, Rafal Kocielnik, Mohammad Shoeybi, Anima Anandkumar, Bryan Catanzaro

We then provide the LM with instruction that consists of this subset of labeled exemplars, the query text to be classified, a definition of bias, and prompt it to make a decision.

ACID: Action-Conditional Implicit Visual Dynamics for Deformable Object Manipulation

no code implementations14 Mar 2022 Bokui Shen, Zhenyu Jiang, Christopher Choy, Leonidas J. Guibas, Silvio Savarese, Anima Anandkumar, Yuke Zhu

Manipulating volumetric deformable objects in the real world, like plush toys and pizza dough, brings substantial challenges due to infinite shape variations, non-rigid motions, and partial observability.

Contrastive Learning Deformable Object Manipulation

M$^2$BEV: Multi-Camera Joint 3D Detection and Segmentation with Unified Birds-Eye View Representation

no code implementations11 Apr 2022 Enze Xie, Zhiding Yu, Daquan Zhou, Jonah Philion, Anima Anandkumar, Sanja Fidler, Ping Luo, Jose M. Alvarez

In this paper, we propose M$^2$BEV, a unified framework that jointly performs 3D object detection and map segmentation in the Birds Eye View~(BEV) space with multi-camera image inputs.

3D Object Detection object-detection +1

KCRL: Krasovskii-Constrained Reinforcement Learning with Guaranteed Stability in Nonlinear Dynamical Systems

no code implementations3 Jun 2022 Sahin Lale, Yuanyuan Shi, Guannan Qu, Kamyar Azizzadenesheli, Adam Wierman, Anima Anandkumar

However, current reinforcement learning (RL) methods lack stabilization guarantees, which limits their applicability for the control of safety-critical systems.

reinforcement-learning Reinforcement Learning (RL)

Finite-Time Regret of Thompson Sampling Algorithms for Exponential Family Multi-Armed Bandits

no code implementations7 Jun 2022 Tianyuan Jin, Pan Xu, Xiaokui Xiao, Anima Anandkumar

We study the regret of Thompson sampling (TS) algorithms for exponential family bandits, where the reward distribution is from a one-dimensional exponential family, which covers many common reward distributions including Bernoulli, Gaussian, Gamma, Exponential, etc.

Multi-Armed Bandits Thompson Sampling
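
The classical special case covered by this setting is the Beta-Bernoulli bandit, where Thompson sampling has a one-line conjugate update. The sketch below is that textbook algorithm, not code from the paper; the arm means and horizon are arbitrary.

    # Textbook Beta-Bernoulli Thompson sampling (the classical special case of the
    # exponential-family setting analyzed here).
    import numpy as np

    rng = np.random.default_rng(0)
    true_means = [0.3, 0.5, 0.7]                   # unknown arm reward probabilities
    alpha = np.ones(3)                             # Beta(1, 1) priors
    beta = np.ones(3)

    for t in range(5000):
        theta = rng.beta(alpha, beta)              # sample a mean estimate per arm
        arm = int(np.argmax(theta))                # play the arm that looks best
        reward = rng.random() < true_means[arm]    # Bernoulli reward
        alpha[arm] += reward                       # conjugate posterior update
        beta[arm] += 1 - reward

    print(alpha / (alpha + beta))                  # posterior means near 0.3 / 0.5 / 0.7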

Thompson Sampling Achieves $\tilde O(\sqrt{T})$ Regret in Linear Quadratic Control

no code implementations17 Jun 2022 Taylan Kargin, Sahin Lale, Kamyar Azizzadenesheli, Anima Anandkumar, Babak Hassibi

By carefully prescribing an early exploration strategy and a policy update rule, we show that TS achieves order-optimal regret in adaptive control of multidimensional stabilizable LQRs.

Decision Making Decision Making Under Uncertainty +1

Large Scale Mask Optimization Via Convolutional Fourier Neural Operator and Litho-Guided Self Training

no code implementations8 Jul 2022 HaoYu Yang, Zongyi Li, Kumara Sastry, Saumyadip Mukhopadhyay, Anima Anandkumar, Brucek Khailany, Vivek Singh, Haoxing Ren

Machine learning techniques have been extensively studied for mask optimization problems, aiming at better mask printability, shorter turnaround time, better mask manufacturability, and so on.

BIG-bench Machine Learning

Robust Trajectory Prediction against Adversarial Attacks

no code implementations29 Jul 2022 Yulong Cao, Danfei Xu, Xinshuo Weng, Zhuoqing Mao, Anima Anandkumar, Chaowei Xiao, Marco Pavone

We demonstrate that our method is able to improve the performance by 46% on adversarial data and at the cost of only 3% performance degradation on clean data, compared to the model trained with clean data.

Autonomous Driving Data Augmentation +1

DensePure: Understanding Diffusion Models towards Adversarial Robustness

no code implementations1 Nov 2022 Chaowei Xiao, Zhongzhu Chen, Kun Jin, Jiongxiao Wang, Weili Nie, Mingyan Liu, Anima Anandkumar, Bo Li, Dawn Song

By using the highest density point in the conditional distribution as the reversed sample, we identify the robust region of a given instance under the diffusion model's reverse process.

Adversarial Robustness Denoising

Can You Label Less by Using Out-of-Domain Data? Active & Transfer Learning with Few-shot Instructions

no code implementations21 Nov 2022 Rafal Kocielnik, Sara Kangaslahti, Shrimai Prabhumoye, Meena Hari, R. Michael Alvarez, Anima Anandkumar

Finally, we find that not all transfer scenarios yield a positive gain, which seems related to the PLMs initial performance on the target-domain task.

Active Learning Transfer Learning

Incremental Spatial and Spectral Learning of Neural Operators for Solving Large-Scale PDEs

no code implementations28 Nov 2022 Robert Joseph George, Jiawei Zhao, Jean Kossaifi, Zongyi Li, Anima Anandkumar

Fourier Neural Operators (FNO) offer a principled approach to solving challenging partial differential equations (PDE) such as turbulent flows.

Machine Learning Accelerated PDE Backstepping Observers

no code implementations28 Nov 2022 Yuanyuan Shi, Zongyi Li, Huan Yu, Drew Steeves, Anima Anandkumar, Miroslav Krstic

State estimation is important for a variety of tasks, from forecasting to substituting for unmeasured states in feedback controllers.

Computational Efficiency

Fourier Continuation for Exact Derivative Computation in Physics-Informed Neural Operators

no code implementations29 Nov 2022 Haydn Maust, Zongyi Li, YiXuan Wang, Daniel Leibovici, Oscar Bruno, Thomas Hou, Anima Anandkumar

The physics-informed neural operator (PINO) is a machine learning architecture that has shown promising empirical results for learning partial differential equations.

Towards Neural Variational Monte Carlo That Scales Linearly with System Size

no code implementations21 Dec 2022 Or Sharir, Garnet Kin-Lic Chan, Anima Anandkumar

Quantum many-body problems are some of the most challenging problems in science and are central to demystifying some exotic quantum phenomena, e.g., high-temperature superconductors.

Quantization Variational Monte Carlo

Vision Transformers Are Good Mask Auto-Labelers

no code implementations CVPR 2023 Shiyi Lan, Xitong Yang, Zhiding Yu, Zuxuan Wu, Jose M. Alvarez, Anima Anandkumar

We propose Mask Auto-Labeler (MAL), a high-quality Transformer-based mask auto-labeling framework for instance segmentation using only box annotations.

Instance Segmentation Segmentation +1

Forecasting subcritical cylinder wakes with Fourier Neural Operators

no code implementations19 Jan 2023 Peter I Renn, Cong Wang, Sahin Lale, Zongyi Li, Anima Anandkumar, Morteza Gharib

The learned FNO solution operator can be evaluated in milliseconds, potentially enabling faster-than-real-time modeling for predictive flow control in physical systems.

Operator learning

PerAda: Parameter-Efficient Federated Learning Personalization with Generalization Guarantees

no code implementations13 Feb 2023 Chulin Xie, De-An Huang, Wenda Chu, Daguang Xu, Chaowei Xiao, Bo Li, Anima Anandkumar

In this paper, we propose PerAda, a parameter-efficient pFL framework that reduces communication and computational costs and exhibits superior generalization performance, especially under test-time distribution shifts.

Generalization Bounds Knowledge Distillation +2

BiasTestGPT: Using ChatGPT for Social Bias Testing of Language Models

no code implementations14 Feb 2023 Rafal Kocielnik, Shrimai Prabhumoye, Vivian Zhang, Roy Jiang, R. Michael Alvarez, Anima Anandkumar

We thus enable seamless open-ended social bias testing of PLMs by domain experts through an automatic large-scale generation of diverse test sentences for any combination of social categories and attributes.

Sentence Text Generation

Score-based Diffusion Models in Function Space

no code implementations14 Feb 2023 Jae Hyun Lim, Nikola B. Kovachki, Ricardo Baptista, Christopher Beckham, Kamyar Azizzadenesheli, Jean Kossaifi, Vikram Voleti, Jiaming Song, Karsten Kreis, Jan Kautz, Christopher Pal, Arash Vahdat, Anima Anandkumar

They consist of a forward process that perturbs input data with Gaussian white noise and a reverse process that learns a score function to generate samples by denoising.

Denoising

Fast Monocular Scene Reconstruction with Global-Sparse Local-Dense Grids

no code implementations CVPR 2023 Wei Dong, Chris Choy, Charles Loop, Or Litany, Yuke Zhu, Anima Anandkumar

To apply this representation to monocular scene reconstruction, we develop a scale calibration algorithm for fast geometric initialization from monocular depth priors.

Indoor Scene Reconstruction

Incrementally-Computable Neural Networks: Efficient Inference for Dynamic Inputs

no code implementations27 Jul 2023 Or Sharir, Anima Anandkumar

Deep learning often faces the challenge of efficiently processing dynamic inputs, such as sensor data or user inputs.

Document Classification Knowledge Distillation +2

Tipping Point Forecasting in Non-Stationary Dynamics on Function Spaces

no code implementations17 Aug 2023 Miguel Liu-Schiaffini, Clare E. Singer, Nikola Kovachki, Tapio Schneider, Kamyar Azizzadenesheli, Anima Anandkumar

Tipping points are abrupt, drastic, and often irreversible changes in the evolution of non-stationary and chaotic dynamical systems.

Conformal Prediction

Spacetime Surface Regularization for Neural Dynamic Scene Reconstruction

no code implementations ICCV 2023 Jaesung Choe, Christopher Choy, Jaesik Park, In So Kweon, Anima Anandkumar

We propose an algorithm, 4DRegSDF, for the spacetime surface regularization to improve the fidelity of neural rendering and reconstruction in dynamic scenes.

Neural Rendering

Neural Operators for Accelerating Scientific Simulations and Design

no code implementations27 Sep 2023 Kamyar Azizzadenesheli, Nikola Kovachki, Zongyi Li, Miguel Liu-Schiaffini, Jean Kossaifi, Anima Anandkumar

Scientific discovery and engineering design are currently limited by the time and cost of physical experiments, selected mostly through trial-and-error and intuition that require deep domain expertise.

Super-Resolution Weather Forecasting

Multi-Grid Tensorized Fourier Neural Operator for High-Resolution PDEs

no code implementations29 Sep 2023 Jean Kossaifi, Nikola Kovachki, Kamyar Azizzadenesheli, Anima Anandkumar

Our contributions are threefold: i) we enable parallelization over input samples with a novel multi-grid-based domain decomposition, ii) we represent the parameters of the model in a high-order latent subspace of the Fourier domain, through a global tensor factorization, resulting in an extreme reduction in the number of parameters and improved generalization, and iii) we propose architectural improvements to the backbone FNO.

Operator learning

DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies

no code implementations6 Oct 2023 Shuaiwen Leon Song, Bonnie Kruft, Minjia Zhang, Conglong Li, Shiyang Chen, Chengming Zhang, Masahiro Tanaka, Xiaoxia Wu, Jeff Rasley, Ammar Ahmad Awan, Connor Holmes, Martin Cai, Adam Ghanem, Zhongzhu Zhou, Yuxiong He, Pete Luferenko, Divya Kumar, Jonathan Weyn, Ruixiong Zhang, Sylwester Klocek, Volodymyr Vragov, Mohammed AlQuraishi, Gustaf Ahdritz, Christina Floristean, Cristina Negri, Rao Kotamarthi, Venkatram Vishwanath, Arvind Ramanathan, Sam Foreman, Kyle Hippe, Troy Arcomano, Romit Maulik, Maxim Zvyagin, Alexander Brace, Bin Zhang, Cindy Orozco Bohorquez, Austin Clyde, Bharat Kale, Danilo Perez-Rivera, Heng Ma, Carla M. Mann, Michael Irvin, J. Gregory Pauloski, Logan Ward, Valerie Hayot, Murali Emani, Zhen Xie, Diangen Lin, Maulik Shukla, Ian Foster, James J. Davis, Michael E. Papka, Thomas Brettin, Prasanna Balaprakash, Gina Tourassi, John Gounley, Heidi Hanson, Thomas E Potok, Massimiliano Lupo Pasini, Kate Evans, Dan Lu, Dalton Lunga, Junqi Yin, Sajal Dash, Feiyi Wang, Mallikarjun Shankar, Isaac Lyngaas, Xiao Wang, Guojing Cong, Pei Zhang, Ming Fan, Siyan Liu, Adolfy Hoisie, Shinjae Yoo, Yihui Ren, William Tang, Kyle Felker, Alexey Svyatkovskiy, Hang Liu, Ashwin Aji, Angela Dalton, Michael Schulte, Karl Schulz, Yuntian Deng, Weili Nie, Josh Romero, Christian Dallago, Arash Vahdat, Chaowei Xiao, Thomas Gibbs, Anima Anandkumar, Rick Stevens

In the upcoming decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences.

EKGNet: A 10.96μW Fully Analog Neural Network for Intra-Patient Arrhythmia Classification

no code implementations24 Oct 2023 Benyamin Haghi, Lin Ma, Sahin Lale, Anima Anandkumar, Azita Emami

We present an integrated approach by combining analog computing and deep learning for electrocardiogram (ECG) arrhythmia classification.

Classification

Deep Multimodal Fusion for Surgical Feedback Classification

no code implementations6 Dec 2023 Rafal Kocielnik, Elyssa Y. Wong, Timothy N. Chu, Lydia Lin, De-An Huang, Jiayun Wang, Anima Anandkumar, Andrew J. Hung

This work offers an important first look at the feasibility of automated classification of real-world live surgical feedback based on text, audio, and video modalities.

Classification

Perspectives on the State and Future of Deep Learning -- 2023

no code implementations7 Dec 2023 Micah Goldblum, Anima Anandkumar, Richard Baraniuk, Tom Goldstein, Kyunghyun Cho, Zachary C Lipton, Melanie Mitchell, Preetum Nakkiran, Max Welling, Andrew Gordon Wilson

The goal of this series is to chronicle opinions and issues in the field of machine learning as they stand today and as they change over time.

Benchmarking
