Search Results for author: Manzil Zaheer

Found 68 papers, 26 papers with code

Compositional Generalization and Decomposition in Neural Program Synthesis

no code implementations7 Apr 2022 Kensen Shi, Joey Hong, Manzil Zaheer, Pengcheng Yin, Charles Sutton

We first characterize several different axes along which program synthesis methods are expected to generalize, e.g., length generalization, or the ability to combine known subroutines in new ways that do not occur in the training data.

Program Synthesis

Knowledge Base Question Answering by Case-based Reasoning over Subgraphs

1 code implementation22 Feb 2022 Rajarshi Das, Ameya Godbole, Ankita Naik, Elliot Tower, Robin Jia, Manzil Zaheer, Hannaneh Hajishirzi, Andrew McCallum

Question answering (QA) over real-world knowledge bases (KBs) is challenging because of the diverse (essentially unbounded) types of reasoning patterns needed.

Knowledge Base Question Answering

Private Adaptive Optimization with Side Information

1 code implementation12 Feb 2022 Tian Li, Manzil Zaheer, Sashank J. Reddi, Virginia Smith

Adaptive optimization methods have become the default solvers for many machine learning tasks.

Deep Hierarchy in Bandits

no code implementations3 Feb 2022 Joey Hong, Branislav Kveton, Sumeet Katariya, Manzil Zaheer, Mohammad Ghavamzadeh

We use this exact posterior to analyze the Bayes regret of HierTS in Gaussian bandits.
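
A minimal sketch of hierarchical Thompson sampling in a Gaussian bandit, assuming known noise and prior variances; here the arm means share a single Gaussian hyper-parameter, a simplification of the multi-task (and deeper) hierarchies these papers study:

    import numpy as np

    rng = np.random.default_rng(0)
    K, T = 10, 2000
    sigma, sigma0, q = 1.0, 0.5, 1.0   # reward noise, arm spread around mu, prior std of mu

    # Hierarchical Gaussian model: mu ~ N(0, q^2), theta_k ~ N(mu, sigma0^2), r ~ N(theta_k, sigma^2).
    mu_true = rng.normal(0.0, q)
    theta_true = rng.normal(mu_true, sigma0, size=K)
    n, s = np.zeros(K), np.zeros(K)          # pull counts and reward sums per arm

    for t in range(T):
        ybar = s / np.maximum(n, 1)
        obs = n > 0
        # Marginally, ybar_k | mu ~ N(mu, sigma0^2 + sigma^2 / n_k), so the posterior of mu is Gaussian.
        var_k = sigma0**2 + sigma**2 / np.maximum(n, 1)
        prec_mu = 1.0 / q**2 + np.sum(1.0 / var_k[obs])
        mean_mu = np.sum(ybar[obs] / var_k[obs]) / prec_mu
        mu_sample = rng.normal(mean_mu, np.sqrt(1.0 / prec_mu))
        # Sample each arm mean from its posterior given mu_sample, then act greedily.
        prec_k = 1.0 / sigma0**2 + n / sigma**2
        mean_k = (mu_sample / sigma0**2 + s / sigma**2) / prec_k
        arm = int(rng.normal(mean_k, np.sqrt(1.0 / prec_k)).argmax())
        r = rng.normal(theta_true[arm], sigma)
        n[arm] += 1; s[arm] += r

    print("best arm:", theta_true.argmax(), "most pulled:", n.argmax())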

Robust Training of Neural Networks using Scale Invariant Architectures

no code implementations2 Feb 2022 Zhiyuan Li, Srinadh Bhojanapalli, Manzil Zaheer, Sashank J. Reddi, Sanjiv Kumar

In contrast to SGD, adaptive gradient methods like Adam allow robust training of modern deep networks, especially large language models.

A Context-Integrated Transformer-Based Neural Network for Auction Design

no code implementations29 Jan 2022 Zhijian Duan, Jingwu Tang, Yutong Yin, Zhe Feng, Xiang Yan, Manzil Zaheer, Xiaotie Deng

One of the central problems in auction design is developing an incentive-compatible mechanism that maximizes the auctioneer's expected revenue.

Hierarchical Bayesian Bandits

no code implementations12 Nov 2021 Joey Hong, Branislav Kveton, Manzil Zaheer, Mohammad Ghavamzadeh

We provide a unified view of all these problems, as learning to act in a hierarchical Bayesian bandit.

Federated Learning

When in Doubt, Summon the Titans: Efficient Inference with Large Models

no code implementations19 Oct 2021 Ankit Singh Rawat, Manzil Zaheer, Aditya Krishna Menon, Amr Ahmed, Sanjiv Kumar

In a nutshell, we use the large teacher models to guide the lightweight student models to make correct predictions only on a subset of "easy" examples; for the "hard" examples, we fall back to the teacher.

Image Classification
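
A rough sketch of the fall-back scheme described above, using a hypothetical confidence threshold to decide which examples the student handles; the paper's actual contribution is the training procedure that makes the student reliable on the easy subset, which is not reproduced here:

    import numpy as np

    def cascade_predict(x, student, teacher, threshold=0.9):
        """Answer with the student when it is confident, otherwise defer to the teacher."""
        p = student(x)
        if p.max() >= threshold:                     # treated as an "easy" example
            return int(p.argmax()), "student"
        return int(teacher(x).argmax()), "teacher"   # fall back to the large model

    # Toy stand-ins for the two models (any callables returning class probabilities work).
    student = lambda x: np.array([0.95, 0.05]) if x[0] > 0 else np.array([0.55, 0.45])
    teacher = lambda x: np.array([0.10, 0.90])

    print(cascade_predict(np.array([+1.0]), student, teacher))   # handled by the student
    print(cascade_predict(np.array([-1.0]), student, teacher))   # deferred to the teacher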

When in Doubt, Summon the Titans: A Framework for Efficient Inference with Large Models

no code implementations29 Sep 2021 Ankit Singh Rawat, Manzil Zaheer, Aditya Krishna Menon, Amr Ahmed, Sanjiv Kumar

In a nutshell, we use the large teacher models to guide the lightweight student models to make correct predictions only on a subset of "easy" examples; for the "hard" examples, we fall back to the teacher.

Image Classification

No Regrets for Learning the Prior in Bandits

no code implementations NeurIPS 2021 Soumya Basu, Branislav Kveton, Manzil Zaheer, Csaba Szepesvári

We propose AdaTS, a Thompson sampling algorithm that adapts sequentially to bandit tasks that it interacts with.

Thompson Sampling with a Mixture Prior

no code implementations10 Jun 2021 Joey Hong, Branislav Kveton, Manzil Zaheer, Mohammad Ghavamzadeh, Craig Boutilier

We study Thompson sampling (TS) in online decision making, where the uncertain environment is sampled from a mixture distribution.

Decision Making Multi-Task Learning +1
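
A minimal Beta-Bernoulli sketch of Thompson sampling under a mixture prior, with hypothetical component parameters: the posterior stays a mixture, so the sampler first picks a component by its posterior weight and then samples arm means from that component.

    import numpy as np

    rng = np.random.default_rng(0)
    K, M, T = 5, 3, 2000                       # arms, mixture components, rounds

    prior_w = np.full(M, 1.0 / M)              # mixing weights
    a0 = rng.uniform(0.5, 3.0, size=(M, K))    # per-component Beta priors on arm means
    b0 = rng.uniform(0.5, 3.0, size=(M, K))

    m_true = rng.choice(M, p=prior_w)          # environment drawn from the mixture
    mu = rng.beta(a0[m_true], b0[m_true])

    succ, fail = np.zeros(K), np.zeros(K)
    log_w = np.log(prior_w)                    # log posterior over components

    for t in range(T):
        w = np.exp(log_w - log_w.max()); w /= w.sum()
        m = rng.choice(M, p=w)                             # sample a component...
        theta = rng.beta(a0[m] + succ, b0[m] + fail)       # ...then arm means from it
        arm = int(theta.argmax())
        r = float(rng.random() < mu[arm])
        # Update component weights with the posterior-predictive probability of r.
        a_post, b_post = a0[:, arm] + succ[arm], b0[:, arm] + fail[arm]
        p1 = a_post / (a_post + b_post)
        log_w += np.log(p1 if r == 1.0 else 1.0 - p1)
        succ[arm] += r; fail[arm] += 1.0 - r

    print("true component:", m_true, "best arm:", mu.argmax(), "pulls:", (succ + fail).astype(int))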

Model-Agnostic Graph Regularization for Few-Shot Learning

no code implementations14 Feb 2021 Ethan Shen, Maria Brbic, Nicholas Monath, Jiaqi Zhai, Manzil Zaheer, Jure Leskovec

In this paper, we present a comprehensive empirical study on graph embedded few-shot learning.

Few-Shot Learning

PLLay: Efficient Topological Layer based on Persistent Landscapes

1 code implementation NeurIPS 2020 Kwangho Kim, Jisu Kim, Manzil Zaheer, Joon Kim, Frederic Chazal, Larry Wasserman

We propose PLLay, a novel topological layer for general deep learning models based on persistence landscapes, in which we can efficiently exploit the underlying topological features of the input data structure.
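
For reference, a small sketch of the persistence-landscape construction the layer builds on: each (birth, death) point contributes a tent function, and the k-th landscape at time t is the k-th largest tent value. PLLay adds weighting and a differentiable map on top, which this omits.

    import numpy as np

    def persistence_landscape(diagram, ts, k_max=3):
        """Evaluate the first k_max landscape functions of a persistence diagram on grid ts."""
        diagram = np.asarray(diagram, dtype=float)
        b, d = diagram[:, 0], diagram[:, 1]
        # tents[i, j]: tent function of point i evaluated at ts[j]
        tents = np.maximum(0.0, np.minimum(ts[None, :] - b[:, None], d[:, None] - ts[None, :]))
        tents = np.sort(tents, axis=0)[::-1]        # descending per time point
        out = np.zeros((k_max, ts.size))
        out[:min(k_max, tents.shape[0])] = tents[:k_max]
        return out

    ts = np.linspace(0.0, 2.0, 9)
    print(persistence_landscape([(0.0, 1.0), (0.5, 2.0)], ts))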

Modifying Memories in Transformer Models

no code implementations1 Dec 2020 Chen Zhu, Ankit Singh Rawat, Manzil Zaheer, Srinadh Bhojanapalli, Daliang Li, Felix Yu, Sanjiv Kumar

In this paper, we propose a new task of explicitly modifying specific factual knowledge in Transformer models while ensuring the model performance does not degrade on the unmodified facts.

Latent Programmer: Discrete Latent Codes for Program Synthesis

no code implementations1 Dec 2020 Joey Hong, David Dohan, Rishabh Singh, Charles Sutton, Manzil Zaheer

The latent codes are learned using a self-supervised learning principle, in which first a discrete autoencoder is trained on the output sequences, and then the resulting latent codes are used as intermediate targets for the end-to-end sequence prediction task.

Document Summarization Program Synthesis +1

Non-Stationary Latent Bandits

no code implementations1 Dec 2020 Joey Hong, Branislav Kveton, Manzil Zaheer, Yinlam Chow, Amr Ahmed, Mohammad Ghavamzadeh, Craig Boutilier

The key idea is to frame this problem as a latent bandit, where the prototypical models of user behavior are learned offline and the latent state of the user is inferred online from its interactions with the models.

Frame online learning +1

Differentiable Meta-Learning of Bandit Policies

no code implementations NeurIPS 2020 Craig Boutilier, Chih-Wei Hsu, Branislav Kveton, Martin Mladenov, Csaba Szepesvari, Manzil Zaheer

Exploration policies in Bayesian bandits maximize the average reward over problem instances drawn from some distribution P. In this work, we learn such policies for an unknown distribution P using samples from P. Our approach is a form of meta-learning and exploits properties of P without making strong assumptions about its form.

Meta-Learning
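
As a rough illustration of learning an exploration policy from sampled instances, the sketch below tunes a single softmax inverse-temperature by REINFORCE over Bernoulli bandits drawn from a prior; the paper instead differentiates through a relaxed policy, so treat this only as the simplest policy-gradient analogue.

    import numpy as np

    rng = np.random.default_rng(0)
    K, T, EPISODES = 5, 100, 500
    beta, lr = 1.0, 0.05          # learnable inverse temperature of the exploration policy

    def run_episode(beta, mu):
        """Return episode reward and d(log-likelihood of actions)/d(beta) for REINFORCE."""
        n, s = np.ones(K), np.full(K, 0.5)          # smoothed counts and reward sums
        ret, grad = 0.0, 0.0
        for _ in range(T):
            mhat = s / n
            logits = beta * mhat
            p = np.exp(logits - logits.max()); p /= p.sum()
            a = rng.choice(K, p=p)
            grad += mhat[a] - p @ mhat              # d log softmax(beta * mhat)[a] / d beta
            r = float(rng.random() < mu[a])
            n[a] += 1; s[a] += r; ret += r
        return ret, grad

    baseline = 0.0
    for ep in range(EPISODES):
        mu = rng.beta(2.0, 2.0, size=K)             # bandit instance sampled from the prior P
        ret, grad = run_episode(beta, mu)
        baseline = 0.99 * baseline + 0.01 * ret     # running-average baseline
        beta += lr * (ret - baseline) * grad / T    # gradient ascent on expected reward
    print("learned inverse temperature:", round(beta, 2))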

Federated Composite Optimization

1 code implementation17 Nov 2020 Honglin Yuan, Manzil Zaheer, Sashank Reddi

We first show that straightforward extensions of primal algorithms such as FedAvg are not well-suited for FCO since they suffer from the "curse of primal averaging," resulting in poor convergence.

Federated Learning

Differentiable Open-Ended Commonsense Reasoning

no code implementations NAACL 2021 Bill Yuchen Lin, Haitian Sun, Bhuwan Dhingra, Manzil Zaheer, Xiang Ren, William W. Cohen

As a step towards making commonsense reasoning research more realistic, we propose to study open-ended commonsense reasoning (OpenCSR) -- the task of answering a commonsense question without any pre-defined choices -- using as a resource only a corpus of commonsense facts written in natural language.

Multiple-choice

Scalable Hierarchical Agglomerative Clustering

2 code implementations22 Oct 2020 Nicholas Monath, Avinava Dubey, Guru Guruganesh, Manzil Zaheer, Amr Ahmed, Andrew McCallum, Gokhan Mergen, Marc Najork, Mert Terzihan, Bryon Tjanaka, YuAn Wang, Yuchen Wu

The applicability of agglomerative clustering for inferring both hierarchical and flat clusterings is limited by its scalability.

2D Human Pose Estimation

Unsupervised Abstractive Dialogue Summarization for Tete-a-Tetes

no code implementations15 Sep 2020 Xinyuan Zhang, Ruiyi Zhang, Manzil Zaheer, Amr Ahmed

High-quality dialogue-summary paired data is expensive to produce and domain-sensitive, making abstractive dialogue summarization a challenging task.

Abstractive Dialogue Summarization dialogue summary +1

A Simple Approach to Case-Based Reasoning in Knowledge Bases

1 code implementation AKBC 2020 Rajarshi Das, Ameya Godbole, Shehzaad Dhuliawala, Manzil Zaheer, Andrew McCallum

We present a surprisingly simple yet accurate approach to reasoning in knowledge graphs (KGs) that requires no training, and is reminiscent of case-based reasoning in classical artificial intelligence (AI).

Knowledge Graphs Meta-Learning

Latent Bandits Revisited

no code implementations NeurIPS 2020 Joey Hong, Branislav Kveton, Manzil Zaheer, Yin-Lam Chow, Amr Ahmed, Craig Boutilier

A latent bandit problem is one in which the learning agent knows the arm reward distributions conditioned on an unknown discrete latent state.

Recommendation Systems

Non-Stationary Off-Policy Optimization

no code implementations15 Jun 2020 Joey Hong, Branislav Kveton, Manzil Zaheer, Yin-Lam Chow, Amr Ahmed

This approach is practical and analyzable, and we provide guarantees on both the quality of off-policy optimization and the regret during online deployment.

Multi-Armed Bandits

Meta-Learning Bandit Policies by Gradient Ascent

no code implementations9 Jun 2020 Branislav Kveton, Martin Mladenov, Chih-Wei Hsu, Manzil Zaheer, Csaba Szepesvari, Craig Boutilier

Most bandit policies are designed either to minimize regret in any problem instance, making very few assumptions about the underlying environment, or to do so in a Bayesian sense, assuming a prior distribution over environment parameters.

Meta-Learning Multi-Armed Bandits

Robust Large-Margin Learning in Hyperbolic Space

no code implementations NeurIPS 2020 Melanie Weber, Manzil Zaheer, Ankit Singh Rawat, Aditya Menon, Sanjiv Kumar

In this paper, we present, to our knowledge, the first theoretical guarantees for learning a classifier in hyperbolic rather than Euclidean space.

Representation Learning

Anchor & Transform: Learning Sparse Embeddings for Large Vocabularies

no code implementations ICLR 2021 Paul Pu Liang, Manzil Zaheer, Yu-An Wang, Amr Ahmed

In this paper, we design a simple and efficient embedding algorithm that learns a small set of anchor embeddings and a sparse transformation matrix.

Language Modelling Text Classification
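
A small numerical sketch of the factorization described above, with hypothetical sizes: a dense anchor table A and a sparse nonnegative transformation T give per-token embeddings E = T A without ever storing the full vocab-by-dimension table (the paper learns both end to end with a sparsity penalty).

    import numpy as np

    rng = np.random.default_rng(0)
    vocab, n_anchors, dim = 10000, 64, 128

    anchors = rng.normal(0.0, 0.1, size=(n_anchors, dim))    # A: small dense anchor embeddings
    T = np.zeros((vocab, n_anchors))                          # sparse, nonnegative transformation
    for i in range(vocab):                                    # roughly 4 nonzeros per row
        idx = rng.choice(n_anchors, size=4, replace=False)
        T[i, idx] = rng.random(4)

    def embed(token_ids):
        return T[token_ids] @ anchors                         # rows of E = T A, built on demand

    print(embed(np.array([3, 17, 9321])).shape)               # (3, 128)
    # Storage: ~4 weights per token plus the anchor table, versus vocab * dim for a dense table.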

Adaptive Federated Optimization

3 code implementations ICLR 2021 Sashank Reddi, Zachary Charles, Manzil Zaheer, Zachary Garrett, Keith Rush, Jakub Konečný, Sanjiv Kumar, H. Brendan McMahan

Federated learning is a distributed machine learning paradigm in which a large number of clients coordinate with a central server to learn a model without sharing their own training data.

Federated Learning
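
A minimal sketch of the adaptive server update this line of work proposes (FedAdam-style): clients return model deltas, and the server treats their average as a pseudo-gradient inside an Adam-like rule; the hyper-parameters here are illustrative only.

    import numpy as np

    def server_adam_round(w, client_deltas, m, v, lr=0.1, beta1=0.9, beta2=0.99, tau=1e-3):
        """Apply one Adam-style server step to the averaged client delta (pseudo-gradient)."""
        delta = np.mean(client_deltas, axis=0)
        m = beta1 * m + (1 - beta1) * delta
        v = beta2 * v + (1 - beta2) * delta**2
        w = w + lr * m / (np.sqrt(v) + tau)
        return w, m, v

    w = np.zeros(4); m = np.zeros(4); v = np.zeros(4)
    deltas = [np.array([0.1, -0.2, 0.0, 0.3]), np.array([0.2, -0.1, 0.1, 0.2])]
    w, m, v = server_adam_round(w, deltas, m, v)
    print(w)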

Towards Modular Algorithm Induction

no code implementations27 Feb 2020 Daniel A. Abolafia, Rishabh Singh, Manzil Zaheer, Charles Sutton

Our model, Main, consists of a neural controller that interacts with a variable-length input tape and learns to compose modules together with their corresponding argument choices.

Differentiable Reasoning over a Virtual Knowledge Base

1 code implementation ICLR 2020 Bhuwan Dhingra, Manzil Zaheer, Vidhisha Balachandran, Graham Neubig, Ruslan Salakhutdinov, William W. Cohen

In particular, we describe a neural module, DrKIT, that traverses textual data like a KB, softly following paths of relations between mentions of entities in the corpus.

Re-Ranking
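
A toy illustration (hypothetical entities and relations) of what "softly following a path of relations" means: a soft distribution over entities is pushed through a sparse relation matrix at each hop. DrKIT itself does this over entity mentions in a corpus with learned retrieval scores rather than a fixed adjacency matrix.

    import numpy as np
    from scipy.sparse import csr_matrix

    entities = ["Q_Obama", "Q_Hawaii", "Q_USA", "Q_Chicago"]
    # born_in: Obama -> Hawaii; located_in: Hawaii -> USA, Chicago -> USA
    born_in = csr_matrix((np.ones(1), ([0], [1])), shape=(4, 4))
    located_in = csr_matrix((np.ones(2), ([1, 3], [2, 2])), shape=(4, 4))

    v = np.array([1.0, 0.0, 0.0, 0.0])   # start with all mass on Q_Obama
    v = born_in.T @ v                     # soft hop along born_in
    v = located_in.T @ v                  # soft hop along located_in
    print(dict(zip(entities, v)))         # mass ends up on Q_USA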

Differentiable Bandit Exploration

no code implementations NeurIPS 2020 Craig Boutilier, Chih-Wei Hsu, Branislav Kveton, Martin Mladenov, Csaba Szepesvari, Manzil Zaheer

In this work, we learn such policies for an unknown distribution $\mathcal{P}$ using samples from $\mathcal{P}$.

Meta-Learning

PLLay: Efficient Topological Layer based on Persistence Landscapes

2 code implementations NeurIPS 2020 Kwangho Kim, Jisu Kim, Manzil Zaheer, Joon Sik Kim, Frederic Chazal, Larry Wasserman

We propose PLLay, a novel topological layer for general deep learning models based on persistence landscapes, in which we can efficiently exploit the underlying topological features of the input data structure.

FedDANE: A Federated Newton-Type Method

1 code implementation7 Jan 2020 Tian Li, Anit Kumar Sahu, Manzil Zaheer, Maziar Sanjabi, Ameet Talwalkar, Virginia Smith

Federated learning aims to jointly learn statistical models over massively distributed remote devices.

Distributed Optimization Federated Learning

Chains-of-Reasoning at TextGraphs 2019 Shared Task: Reasoning over Chains of Facts for Explainable Multi-hop Inference

no code implementations WS 2019 Rajarshi Das, Ameya Godbole, Manzil Zaheer, Shehzaad Dhuliawala, Andrew McCallum

This paper describes our submission to the shared task on "Multi-hop Inference Explanation Regeneration" in the TextGraphs workshop at EMNLP 2019 (Jansen and Ustalov, 2019).

Anchor & Transform: Learning Sparse Representations of Discrete Objects

no code implementations25 Sep 2019 Paul Pu Liang, Manzil Zaheer, YuAn Wang, Amr Ahmed

Learning continuous representations of discrete objects such as text, users, and items lies at the heart of many applications including text and user modeling.

Language Modelling Text Classification

Multi-step Entity-centric Information Retrieval for Multi-Hop Question Answering

no code implementations WS 2019 Ameya Godbole, Dilip Kavarthapu, Rajarshi Das, Zhiyu Gong, Abhishek Singhal, Hamed Zamani, Mo Yu, Tian Gao, Xiaoxiao Guo, Manzil Zaheer, Andrew McCallum

Multi-hop question answering (QA) requires an information retrieval (IR) system that can find the multiple pieces of supporting evidence needed to answer the question, making the retrieval process very challenging.

Information Retrieval Multi-hop Question Answering +1

Developing Creative AI to Generate Sculptural Objects

no code implementations20 Aug 2019 Songwei Ge, Austin Dill, Eunsu Kang, Chun-Liang Li, Lingyao Zhang, Manzil Zaheer, Barnabas Poczos

We explore the intersection of human and machine creativity by generating sculptural objects through machine learning.

Generating 3D Point Clouds

The Myths of Our Time: Fake News

1 code implementation5 Aug 2019 Vít Růžička, Eunsu Kang, David Gordon, Ankita Patel, Jacqui Fashimpaur, Manzil Zaheer

While the purpose of most fake news is misinformation and political propaganda, our team sees it as a new type of myth that is created by people in the age of internet identities and artificial intelligence.

Misinformation News Generation

Randomized Exploration in Generalized Linear Bandits

no code implementations21 Jun 2019 Branislav Kveton, Manzil Zaheer, Csaba Szepesvari, Lihong Li, Mohammad Ghavamzadeh, Craig Boutilier

GLM-TSL samples a generalized linear model (GLM) from the Laplace approximation to the posterior distribution.
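
A small logistic-bandit sketch of the sampling step the excerpt describes: fit the MAP generalized linear model with a few Newton steps, form the Laplace approximation N(MAP, Hessian^{-1}), and play the arm that looks best under a posterior sample. Refitting from scratch each round is naive, and the paper's posterior-variance scaling is omitted.

    import numpy as np

    rng = np.random.default_rng(0)
    d, K, T, lam = 5, 20, 500, 1.0
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

    theta_star = rng.normal(size=d) / np.sqrt(d)
    X = rng.normal(size=(K, d)) / np.sqrt(d)          # arm features
    feats, ys = [], []

    for t in range(T):
        if feats:
            A, y = np.array(feats), np.array(ys)
            w = np.zeros(d)
            for _ in range(8):                        # Newton steps for the regularized MAP
                p = sigmoid(A @ w)
                g = A.T @ (p - y) + lam * w
                H = A.T @ (A * (p * (1 - p))[:, None]) + lam * np.eye(d)
                w -= np.linalg.solve(H, g)
            w_sample = rng.multivariate_normal(w, np.linalg.inv(H))   # Laplace posterior sample
        else:
            w_sample = rng.normal(size=d)             # cold start before any data
        arm = int((X @ w_sample).argmax())
        r = float(rng.random() < sigmoid(X[arm] @ theta_star))
        feats.append(X[arm]); ys.append(r)

    print("best arm:", (X @ theta_star).argmax())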

Multi-step Retriever-Reader Interaction for Scalable Open-domain Question Answering

1 code implementation ICLR 2019 Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Andrew McCallum

This paper introduces a new framework for open-domain question answering in which the retriever and the reader iteratively interact with each other.

Open-Domain Question Answering Reading Comprehension

Exchangeable Generative Models with Flow Scans

1 code implementation5 Feb 2019 Christopher Bender, Kevin O'Connor, Yang Li, Juan Jose Garcia, Manzil Zaheer, Junier Oliva

In this work, we develop a new approach to generative density estimation for exchangeable, non-i.i.d. data.

Density Estimation

Federated Optimization in Heterogeneous Networks

9 code implementations14 Dec 2018 Tian Li, Anit Kumar Sahu, Manzil Zaheer, Maziar Sanjabi, Ameet Talwalkar, Virginia Smith

Theoretically, we provide convergence guarantees for our framework when learning over data from non-identical distributions (statistical heterogeneity), and while adhering to device-level systems constraints by allowing each participating device to perform a variable amount of work (systems heterogeneity).

Distributed Optimization Federated Learning
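
A minimal sketch of the proximal local update behind this framework (FedProx-style): each device approximately minimizes its local loss plus (mu/2) * ||w - w_global||^2, which keeps heterogeneous local solutions near the server model. A least-squares loss stands in for the device objective.

    import numpy as np

    def fedprox_local_update(w_global, X, y, mu=0.1, lr=0.01, epochs=5):
        """Approximate local solve of local_loss(w) + (mu/2) * ||w - w_global||^2."""
        w = w_global.copy()
        for _ in range(epochs):
            grad = X.T @ (X @ w - y) / len(y) + mu * (w - w_global)
            w -= lr * grad
        return w

    rng = np.random.default_rng(0)
    w_global = np.zeros(3)
    X = rng.normal(size=(32, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=32)
    local_models = [fedprox_local_update(w_global, X, y) for _ in range(4)]
    w_global = np.mean(local_models, axis=0)          # server averages the proximal local models
    print(w_global)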

Adaptive Methods for Nonconvex Optimization

1 code implementation NeurIPS 2018 Manzil Zaheer, Sashank Reddi, Devendra Sachan, Satyen Kale, Sanjiv Kumar

In this work, we provide a new analysis of such methods applied to nonconvex stochastic optimization problems, characterizing the effect of increasing minibatch size.

Stochastic Optimization

Point Cloud GAN

1 code implementation13 Oct 2018 Chun-Liang Li, Manzil Zaheer, Yang Zhang, Barnabas Poczos, Ruslan Salakhutdinov

In this paper, we first show that a straightforward extension of existing GAN algorithms is not applicable to point clouds, because the constraint required for discriminators is undefined for set data.

Object Recognition

Towards Gradient Free and Projection Free Stochastic Optimization

no code implementations8 Oct 2018 Anit Kumar Sahu, Manzil Zaheer, Soummya Kar

This paper focuses on the problem of constrained stochastic optimization.

Stochastic Optimization

Open Domain Question Answering Using Early Fusion of Knowledge Bases and Text

1 code implementation EMNLP 2018 Haitian Sun, Bhuwan Dhingra, Manzil Zaheer, Kathryn Mazaitis, Ruslan Salakhutdinov, William W. Cohen

In this paper we look at a more practical setting, namely QA over the combination of a KB and entity-linked text, which is appropriate when an incomplete KB is available with a large text corpus.

Graph Representation Learning Open-Domain Question Answering

Nonparametric Density Estimation under Adversarial Losses

no code implementations NeurIPS 2018 Shashank Singh, Ananya Uppal, Boyue Li, Chun-Liang Li, Manzil Zaheer, Barnabás Póczos

We study minimax convergence rates of nonparametric density estimation under a large class of loss functions called "adversarial losses", which, besides classical $\mathcal{L}^p$ losses, includes maximum mean discrepancy (MMD), Wasserstein distance, and total variation distance.

Density Estimation
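
For reference, this family of losses can be written as integral probability metrics over a discriminator class $\mathcal{F}$ (a standard formulation consistent with the excerpt; the paper's precise smoothness assumptions on $\mathcal{F}$ are not restated here):

    d_{\mathcal{F}}(p, q) = \sup_{f \in \mathcal{F}} \left| \mathbb{E}_{X \sim p} f(X) - \mathbb{E}_{X \sim q} f(X) \right|

Taking $\mathcal{F}$ to be the unit ball of an RKHS recovers MMD, the 1-Lipschitz functions give the Wasserstein-1 distance, and functions bounded by 1 give total variation up to a constant factor.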

Transformation Autoregressive Networks

no code implementations ICML 2018 Junier B. Oliva, Avinava Dubey, Manzil Zaheer, Barnabás Póczos, Ruslan Salakhutdinov, Eric P. Xing, Jeff Schneider

Further, through a comprehensive study over both real-world and synthetic data, we show that jointly leveraging transformations of variables and autoregressive conditional models results in a considerable improvement in performance.

Density Estimation Outlier Detection

Go for a Walk and Arrive at the Answer: Reasoning Over Paths in Knowledge Bases using Reinforcement Learning

6 code implementations ICLR 2018 Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Luke Vilnis, Ishan Durugkar, Akshay Krishnamurthy, Alex Smola, Andrew McCallum

Knowledge bases (KBs), both automatically and manually constructed, are often incomplete: many valid facts can be inferred from the KB by synthesizing existing information.

A Generic Approach for Escaping Saddle points

no code implementations5 Sep 2017 Sashank J. Reddi, Manzil Zaheer, Suvrit Sra, Barnabas Poczos, Francis Bach, Ruslan Salakhutdinov, Alexander J. Smola

A central challenge to using first-order methods for optimizing nonconvex problems is the presence of saddle points.

Latent LSTM Allocation: Joint clustering and non-linear dynamic modeling of sequence data

no code implementations ICML 2017 Manzil Zaheer, Amr Ahmed, Alexander J. Smola

Recurrent neural networks, such as long-short term memory (LSTM) networks, are powerful tools for modeling sequential data like user browsing history (Tan et al., 2016; Korpusik et al., 2016) or natural language text (Mikolov et al., 2010).

Canopy: Fast Sampling with Cover Trees

no code implementations ICML 2017 Manzil Zaheer, Satwik Kottur, Amr Ahmed, José Moura, Alex Smola

In this work, we propose Canopy, a sampler based on Cover Trees that is exact, has guaranteed runtime logarithmic in the number of atoms, and is provably polynomial in the inherent dimensionality of the underlying parameter space.

Spectral Methods for Nonparametric Models

no code implementations31 Mar 2017 Hsiao-Yu Fish Tung, Chao-yuan Wu, Manzil Zaheer, Alexander J. Smola

Nonparametric models are versatile, albeit computationally expensive, tools for modeling mixture models.

Deep Sets

2 code implementations NeurIPS 2017 Manzil Zaheer, Satwik Kottur, Siamak Ravanbakhsh, Barnabas Poczos, Ruslan Salakhutdinov, Alexander Smola

Our main theorem characterizes the permutation invariant functions and provides a family of functions to which any permutation invariant objective function must belong.

Anomaly Detection Outlier Detection +1
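
A tiny sketch of the sum-decomposition that theorem points to, f(X) = rho(sum over x of phi(x)), with fixed random weights standing in for learned networks; the output is unchanged under any reordering of the set.

    import numpy as np

    def deep_set(X, phi_w, rho_w):
        """Permutation-invariant set function: rho applied to the sum of phi over elements."""
        h = np.maximum(0.0, X @ phi_w)     # phi on each element (one ReLU layer here)
        return h.sum(axis=0) @ rho_w       # pool by summation, then rho

    rng = np.random.default_rng(0)
    phi_w, rho_w = rng.normal(size=(3, 8)), rng.normal(size=(8, 1))
    X = rng.normal(size=(5, 3))            # a set of 5 elements with 3 features each

    print(np.allclose(deep_set(X, phi_w, rho_w), deep_set(X[::-1], phi_w, rho_w)))  # True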
