no code implementations • 10 Oct 2024 • Su Hyeong Lee, Sidharth Sharma, Manzil Zaheer, Tian Li
Adaptive optimization is critical in federated learning, where enabling adaptivity on both the server and client sides has proven essential for achieving optimal performance.
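The server-side half of this adaptivity can be illustrated with a FedAdam-style sketch, in which the averaged client delta is treated as a pseudo-gradient for an Adam-like server step. The one-parameter quadratic objective, plain-SGD clients, and all hyperparameters below are illustrative assumptions, not the paper's method.

```python
# Minimal sketch of server-side adaptive federated optimization
# (FedAdam-style). Clients run a few local SGD steps; the server uses
# the mean client delta as a pseudo-gradient in an Adam-like update.
import numpy as np

def client_update(w, data, lr=0.1, steps=5):
    x, y = data
    for _ in range(steps):
        grad = 2 * x * (x * w - y)   # d/dw of (x*w - y)^2
        w = w - lr * grad
    return w

def fedadam_round(w, clients, m, v, slr=0.1, b1=0.9, b2=0.99, eps=1e-3):
    # Average of client deltas acts as a pseudo-gradient for the server.
    delta = np.mean([client_update(w, c) - w for c in clients])
    m = b1 * m + (1 - b1) * delta
    v = b2 * v + (1 - b2) * delta**2
    w = w + slr * m / (np.sqrt(v) + eps)   # adaptive server step
    return w, m, v

w, m, v = 0.0, 0.0, 0.0
clients = [(1.0, 2.0), (2.0, 4.0), (0.5, 1.0)]  # all consistent with w* = 2
for _ in range(50):
    w, m, v = fedadam_round(w, clients, m, v)
```

Because every toy client agrees that the optimum is w = 2, the adaptive server update drives the global model toward it despite heterogeneous per-client step sizes.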
no code implementations • 3 Sep 2024 • Nicholas Monath, Will Grathwohl, Michael Boratko, Rob Fergus, Andrew McCallum, Manzil Zaheer
In dense retrieval, deep encoders provide embeddings for both inputs and targets, and the softmax function is used to parameterize a distribution over a large number of candidate targets (e.g., textual passages for information retrieval).

no code implementations • 27 Aug 2024 • Soumya Basu, Ankit Singh Rawat, Manzil Zaheer
We propose a statistical framework to study such models with two components: 1) a {\em retriever} to identify the relevant information out of a large corpus via a data-dependent metric; and 2) a {\em predictor} that consumes the input instances along with the retrieved information to make the final predictions.
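The two-component abstraction can be sketched with toy stand-ins: a retriever that scores corpus items against the input under a simple metric, and a predictor that consumes the retrieved items. Both components below are illustrative, not the framework's actual instantiations.

```python
# Toy sketch of the retriever/predictor decomposition.
import numpy as np

def retriever(query, corpus_keys, k=2):
    # Data-dependent metric: negative squared Euclidean distance.
    scores = -np.sum((corpus_keys - query) ** 2, axis=1)
    return np.argsort(-scores)[:k]          # indices of top-k items

def predictor(query, retrieved_values):
    # Toy predictor: ignores the query, averages retrieved labels.
    return float(np.mean(retrieved_values))

keys = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
values = np.array([0.0, 1.0, 10.0])
query = np.array([0.9, 1.1])
idx = retriever(query, keys)                # nearest two items
pred = predictor(query, values[idx])
```

Real predictors condition jointly on the input and the retrieved information; the averaging above only shows where each component sits in the pipeline.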
no code implementations • 20 Aug 2024 • Ameya Godbole, Nicholas Monath, Seungyeon Kim, Ankit Singh Rawat, Andrew McCallum, Manzil Zaheer
In text generation, hallucinations refer to the generation of seemingly coherent text that contradicts established knowledge.
no code implementations • 9 Aug 2024 • Kensen Shi, Deniz Altınbüken, Saswat Anand, Mihai Christodorescu, Katja Grünwedel, Alexa Koenings, Sai Naidu, Anurag Pathak, Marc Rasi, Fredde Ribeiro, Brandon Ruffin, Siddhant Sanyam, Maxim Tabachnyk, Sara Toth, Roy Tu, Tobias Welp, Pengcheng Yin, Manzil Zaheer, Satish Chandra, Charles Sutton
We propose using natural language outlines as a novel modality and interaction surface for providing AI assistance to developers throughout the software development process.
1 code implementation • 10 Jul 2024 • Sai Srivatsa Ravindranath, Zhe Feng, Di Wang, Manzil Zaheer, Aranyak Mehta, David C. Parkes
Revenue-optimal auction design is a challenging problem with significant theoretical and practical implications.
no code implementations • 6 May 2024 • Nishant Yadav, Nicholas Monath, Manzil Zaheer, Rob Fergus, Andrew McCallum
Our method produces a high-quality approximation while requiring only a fraction of CE calls as compared to CUR-based methods, and allows for leveraging DE to initialize the embedding space while avoiding compute- and resource-intensive finetuning of DE via distillation.
1 code implementation • 16 Jan 2024 • Somnath Basu Roy Chowdhury, Nicholas Monath, Avinava Dubey, Manzil Zaheer, Andrew McCallum, Amr Ahmed, Snigdha Chaturvedi
In this work, we study the task of extractive opinion summarization in an incremental setting, where the underlying review set evolves over time.
no code implementations • 15 Dec 2023 • Renat Aksitov, Sobhan Miryoosefi, Zonglin Li, Daliang Li, Sheila Babayan, Kavya Kopparapu, Zachary Fisher, Ruiqi Guo, Sushant Prakash, Pranesh Srinivasan, Manzil Zaheer, Felix Yu, Sanjiv Kumar
Answering complex natural language questions often necessitates multi-step reasoning and integrating external information.
Ranked #1 on Question Answering on Bamboogle
no code implementations • 6 Oct 2023 • Shanda Li, Chong You, Guru Guruganesh, Joshua Ainslie, Santiago Ontanon, Manzil Zaheer, Sumit Sanghai, Yiming Yang, Sanjiv Kumar, Srinadh Bhojanapalli
Preventing the performance decay of Transformers on inputs longer than those used for training has been an important challenge in extending the context length of these models.
no code implementations • 26 Jul 2023 • Kensen Shi, Joey Hong, Yinlin Deng, Pengcheng Yin, Manzil Zaheer, Charles Sutton
When writing programs, people have the ability to tackle a new complex task by decomposing it into smaller and more familiar subtasks.
no code implementations • 24 May 2023 • Dung Thai, Dhruv Agarwal, Mudit Chaudhary, Wenlong Zhao, Rajarshi Das, Manzil Zaheer, Jay-Yoon Lee, Hannaneh Hajishirzi, Andrew McCallum
Given a test question, CBR-MRC first retrieves a set of similar cases from a nonparametric memory and then predicts an answer by selecting the span in the test context that is most similar to the contextualized representations of answers in the retrieved cases.
no code implementations • 20 May 2023 • Boxin Wang, Yibo Jacky Zhang, Yuan Cao, Bo Li, H. Brendan McMahan, Sewoong Oh, Zheng Xu, Manzil Zaheer
We study (differentially) private federated learning (FL) of language models.
1 code implementation • 4 May 2023 • Nishant Yadav, Nicholas Monath, Manzil Zaheer, Andrew McCallum
While ANNCUR's one-time selection of anchors tends to approximate the cross-encoder distances on average, doing so forfeits the capacity to accurately estimate distances to items near the query, leading to regret in the crucial end-task: recall of top-k items.
no code implementations • 27 Mar 2023 • Nicholas Monath, Manzil Zaheer, Kelsey Allen, Andrew McCallum
First, we introduce an algorithm that uses a tree structure to approximate the softmax with provable bounds and that dynamically maintains the tree.
no code implementations • 27 Jan 2023 • Seungyeon Kim, Ankit Singh Rawat, Manzil Zaheer, Sadeep Jayasumana, Veeranjaneyulu Sadhanala, Wittawat Jitkrittum, Aditya Krishna Menon, Rob Fergus, Sanjiv Kumar
Large neural models (such as Transformers) achieve state-of-the-art performance for information retrieval (IR).
no code implementations • 9 Dec 2022 • Joey Hong, Branislav Kveton, Sumeet Katariya, Manzil Zaheer, Mohammad Ghavamzadeh
We prove per-task bounds on the suboptimality of the learned policies, which show a clear improvement over not using the hierarchical model.
1 code implementation • 1 Dec 2022 • Tian Li, Manzil Zaheer, Ken Ziyu Liu, Sashank J. Reddi, H. Brendan McMahan, Virginia Smith
Privacy noise may negate the benefits of using adaptive optimizers in differentially private model training.
no code implementations • 9 Nov 2022 • Daliang Li, Ankit Singh Rawat, Manzil Zaheer, Xin Wang, Michal Lukasik, Andreas Veit, Felix Yu, Sanjiv Kumar
By contrast, when the context is irrelevant to the task, the model should ignore it and fall back on its internal knowledge.
no code implementations • 31 Oct 2022 • Manzil Zaheer, Kenneth Marino, Will Grathwohl, John Schultz, Wendy Shang, Sheila Babayan, Arun Ahuja, Ishita Dasgupta, Christine Kaeser-Chen, Rob Fergus
A fundamental ability of an intelligent web-based agent is seeking out and acquiring new information.
1 code implementation • 23 Oct 2022 • Nishant Yadav, Nicholas Monath, Rico Angell, Manzil Zaheer, Andrew McCallum
When the similarity is measured by dot-product between dual-encoder vectors or $\ell_2$-distance, there already exist many scalable and efficient search methods.
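With dot-product similarity, retrieval reduces to a maximum-inner-product search over the item vectors; the brute-force linear scan below shows the operation that scalable approximate-nearest-neighbor indexes accelerate. The vectors are illustrative.

```python
# Brute-force maximum-inner-product search over dual-encoder vectors.
import numpy as np

def top_k_dot(query, items, k):
    scores = items @ query          # inner-product similarity per item
    return np.argsort(-scores)[:k]  # indices of the k largest scores

items = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]])
query = np.array([1.0, 0.2])
top2 = top_k_dot(query, items, k=2)
```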
1 code implementation • 7 Oct 2022 • Kumar Shridhar, Nicholas Monath, Raghuveer Thirukovalluru, Alessandro Stolfo, Manzil Zaheer, Andrew McCallum, Mrinmaya Sachan
Ontonotes has served as the most important benchmark for coreference resolution.
no code implementations • 6 Oct 2022 • Soumya Basu, Ankit Singh Rawat, Manzil Zaheer
The second class of retrieval-based approaches we explore learns a global model using kernel methods to directly map an input instance and retrieved examples to a prediction, without explicitly solving a local learning task.
no code implementations • 5 Oct 2022 • Mingda Qiao, Guru Guruganesh, Ankit Singh Rawat, Avinava Dubey, Manzil Zaheer
Regev and Vijayaraghavan (2017) showed that with $\Delta = \Omega(\sqrt{\log k})$ separation, the means can be learned using $\mathrm{poly}(k, d)$ samples, whereas super-polynomially many samples are required if $\Delta = o(\sqrt{\log k})$ and $d = \Omega(\log k)$.
no code implementations • 14 Aug 2022 • Manzil Zaheer, Ankit Singh Rawat, Seungyeon Kim, Chong You, Himanshu Jain, Andreas Veit, Rob Fergus, Sanjiv Kumar
In this paper, we propose the teacher-guided training (TGT) framework for training a high-quality compact model that leverages the knowledge acquired by pretrained generative models, while obviating the need to go through a large volume of data.
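The standard ingredient that teacher-guided training builds on is a distillation loss matching the student to softened teacher outputs; the sketch below shows only that generic loss (TGT's use of pretrained generative models for data is not depicted), with an illustrative temperature and logits.

```python
# Generic knowledge-distillation loss: cross-entropy from the student's
# softened distribution to the teacher's softened distribution.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def distill_loss(student_logits, teacher_logits, T=2.0):
    p_teacher = softmax(teacher_logits / T)   # softened teacher targets
    p_student = softmax(student_logits / T)
    return -np.sum(p_teacher * np.log(p_student))

close = distill_loss(np.array([2.0, 0.0]), np.array([2.1, 0.0]))
far   = distill_loss(np.array([0.0, 2.0]), np.array([2.1, 0.0]))
```

A student whose logits agree with the teacher incurs a lower loss than one that inverts the teacher's preference.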
1 code implementation • 21 Jun 2022 • Devendra Singh Sachan, Mike Lewis, Dani Yogatama, Luke Zettlemoyer, Joelle Pineau, Manzil Zaheer
We introduce ART, a new corpus-level autoencoding approach for training dense retrieval models that does not require any labeled training data.
2 code implementations • 23 May 2022 • Adam Liška, Tomáš Kočiský, Elena Gribovskaya, Tayfun Terzi, Eren Sezener, Devang Agrawal, Cyprien de Masson d'Autume, Tim Scholtes, Manzil Zaheer, Susannah Young, Ellen Gilsenan-McMahon, Sophia Austin, Phil Blunsom, Angeliki Lazaridou
Knowledge and language understanding of models evaluated through question answering (QA) has usually been studied on static snapshots of knowledge, like Wikipedia.
no code implementations • 7 Apr 2022 • Kensen Shi, Joey Hong, Manzil Zaheer, Pengcheng Yin, Charles Sutton
We first characterize several different axes along which program synthesis methods would be desired to generalize, e.g., length generalization, or the ability to combine known subroutines in new ways that do not occur in the training data.
1 code implementation • 22 Feb 2022 • Rajarshi Das, Ameya Godbole, Ankita Naik, Elliot Tower, Robin Jia, Manzil Zaheer, Hannaneh Hajishirzi, Andrew McCallum
Question answering (QA) over knowledge bases (KBs) is challenging because of the diverse, essentially unbounded, types of reasoning patterns needed.
1 code implementation • 12 Feb 2022 • Tian Li, Manzil Zaheer, Sashank J. Reddi, Virginia Smith
Adaptive optimization methods have become the default solvers for many machine learning tasks.
no code implementations • 3 Feb 2022 • Joey Hong, Branislav Kveton, Sumeet Katariya, Manzil Zaheer, Mohammad Ghavamzadeh
We use this exact posterior to analyze the Bayes regret of HierTS in Gaussian bandits.
no code implementations • 2 Feb 2022 • Zhiyuan Li, Srinadh Bhojanapalli, Manzil Zaheer, Sashank J. Reddi, Sanjiv Kumar
In contrast to SGD, adaptive gradient methods like Adam allow robust training of modern deep networks, especially large language models.
1 code implementation • 29 Jan 2022 • Zhijian Duan, Jingwu Tang, Yutong Yin, Zhe Feng, Xiang Yan, Manzil Zaheer, Xiaotie Deng
One of the central problems in auction design is developing an incentive-compatible mechanism that maximizes the auctioneer's expected revenue.
no code implementations • 12 Nov 2021 • Joey Hong, Branislav Kveton, Manzil Zaheer, Mohammad Ghavamzadeh
We provide a unified view of all these problems, as learning to act in a hierarchical Bayesian bandit.
no code implementations • 19 Oct 2021 • Ankit Singh Rawat, Manzil Zaheer, Aditya Krishna Menon, Amr Ahmed, Sanjiv Kumar
In a nutshell, we use the large teacher models to guide the lightweight student models to only make correct predictions on a subset of "easy" examples; for the "hard" examples, we fall back to the teacher.
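The described inference rule is a simple cascade: route an example to the student when it is confident ("easy"), otherwise fall back to the teacher. The confidence threshold and toy models below are illustrative assumptions.

```python
# Sketch of student-with-teacher-fallback inference.
def cascade_predict(x, student, teacher, threshold=0.8):
    label, confidence = student(x)
    if confidence >= threshold:
        return label, "student"        # easy example: student answers
    return teacher(x), "teacher"       # hard example: defer to teacher

# Toy models: the student is confident only on short inputs.
student = lambda x: ("A", 0.95) if len(x) < 5 else ("B", 0.40)
teacher = lambda x: "C"

easy = cascade_predict("ab", student, teacher)
hard = cascade_predict("abcdef", student, teacher)
```

At deployment time only the routed fraction of traffic pays the teacher's inference cost.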
2 code implementations • 14 Jul 2021 • Jianyu Wang, Zachary Charles, Zheng Xu, Gauri Joshi, H. Brendan McMahan, Blaise Aguera y Arcas, Maruan Al-Shedivat, Galen Andrew, Salman Avestimehr, Katharine Daly, Deepesh Data, Suhas Diggavi, Hubert Eichner, Advait Gadhikar, Zachary Garrett, Antonious M. Girgis, Filip Hanzely, Andrew Hard, Chaoyang He, Samuel Horvath, Zhouyuan Huo, Alex Ingerman, Martin Jaggi, Tara Javidi, Peter Kairouz, Satyen Kale, Sai Praneeth Karimireddy, Jakub Konecny, Sanmi Koyejo, Tian Li, Luyang Liu, Mehryar Mohri, Hang Qi, Sashank J. Reddi, Peter Richtarik, Karan Singhal, Virginia Smith, Mahdi Soltanolkotabi, Weikang Song, Ananda Theertha Suresh, Sebastian U. Stich, Ameet Talwalkar, Hongyi Wang, Blake Woodworth, Shanshan Wu, Felix X. Yu, Honglin Yuan, Manzil Zaheer, Mi Zhang, Tong Zhang, Chunxiang Zheng, Chen Zhu, Wennan Zhu
Federated learning and analytics are a distributed approach for collaboratively learning models (or statistics) from decentralized data, motivated by and designed for privacy protection.
no code implementations • NeurIPS 2021 • Soumya Basu, Branislav Kveton, Manzil Zaheer, Csaba Szepesvári
We propose ${\tt AdaTS}$, a Thompson sampling algorithm that adapts sequentially to bandit tasks that it interacts with.
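The base procedure that AdaTS adapts across tasks is ordinary Thompson sampling; a minimal Bernoulli-bandit version with fixed Beta(1, 1) priors is sketched below (AdaTS's sequential adaptation of the prior is not shown, and the arm means and horizon are illustrative).

```python
# Thompson sampling for a 2-armed Bernoulli bandit with Beta(1,1) priors.
import random

def thompson_sampling(true_means, horizon, rng):
    wins, losses = [1, 1], [1, 1]    # Beta posterior parameters per arm
    pulls = [0, 0]
    for _ in range(horizon):
        samples = [rng.betavariate(wins[a], losses[a]) for a in (0, 1)]
        a = samples.index(max(samples))   # act greedily on the sample
        reward = 1 if rng.random() < true_means[a] else 0
        wins[a] += reward
        losses[a] += 1 - reward
        pulls[a] += 1
    return pulls

rng = random.Random(0)
pulls = thompson_sampling([0.2, 0.8], 500, rng)
```

With a large gap between the arm means, the posterior concentrates quickly and almost all pulls go to the better arm.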
no code implementations • 10 Jun 2021 • Joey Hong, Branislav Kveton, Manzil Zaheer, Mohammad Ghavamzadeh, Craig Boutilier
We study Thompson sampling (TS) in online decision making, where the uncertain environment is sampled from a mixture distribution.
no code implementations • EMNLP 2021 • Rajarshi Das, Manzil Zaheer, Dung Thai, Ameya Godbole, Ethan Perez, Jay-Yoon Lee, Lizhen Tan, Lazaros Polymenakos, Andrew McCallum
It is often challenging to solve a complex problem from scratch, but much easier if we can access other similar problems with their solutions -- a paradigm known as case-based reasoning (CBR).
Knowledge Base Question Answering • Natural Language Queries
no code implementations • 14 Apr 2021 • Craig S. Greenberg, Sebastian Macaluso, Nicholas Monath, Avinava Dubey, Patrick Flaherty, Manzil Zaheer, Amr Ahmed, Kyle Cranmer, Andrew McCallum
In those cases, hierarchical clustering can be seen as a combinatorial optimization problem.
no code implementations • 14 Feb 2021 • Ethan Shen, Maria Brbic, Nicholas Monath, Jiaqi Zhai, Manzil Zaheer, Jure Leskovec
In this paper, we present a comprehensive empirical study on graph embedded few-shot learning.
no code implementations • 11 Feb 2021 • Branislav Kveton, Mikhail Konobeev, Manzil Zaheer, Chih-Wei Hsu, Martin Mladenov, Craig Boutilier, Csaba Szepesvari
Efficient exploration in bandits is a fundamental online learning problem.
1 code implementation • NeurIPS 2020 • Kwangho Kim, Jisu Kim, Manzil Zaheer, Joon Kim, Frederic Chazal, Larry Wasserman
We propose PLLay, a novel topological layer for general deep learning models based on persistence landscapes, in which we can efficiently exploit the underlying topological features of the input data structure.
no code implementations • NeurIPS 2020 • Craig Boutilier, Chih-Wei Hsu, Branislav Kveton, Martin Mladenov, Csaba Szepesvari, Manzil Zaheer
Exploration policies in Bayesian bandits maximize the average reward over problem instances drawn from some distribution P. In this work, we learn such policies for an unknown distribution P using samples from P. Our approach is a form of meta-learning and exploits properties of P without making strong assumptions about its form.
no code implementations • 1 Dec 2020 • Chen Zhu, Ankit Singh Rawat, Manzil Zaheer, Srinadh Bhojanapalli, Daliang Li, Felix Yu, Sanjiv Kumar
In this paper, we propose a new task of \emph{explicitly modifying specific factual knowledge in Transformer models while ensuring the model performance does not degrade on the unmodified facts}.
no code implementations • 1 Dec 2020 • Joey Hong, David Dohan, Rishabh Singh, Charles Sutton, Manzil Zaheer
The latent codes are learned using a self-supervised learning principle, in which first a discrete autoencoder is trained on the output sequences, and then the resulting latent codes are used as intermediate targets for the end-to-end sequence prediction task.
no code implementations • 1 Dec 2020 • Joey Hong, Branislav Kveton, Manzil Zaheer, Yinlam Chow, Amr Ahmed, Mohammad Ghavamzadeh, Craig Boutilier
The key idea is to frame this problem as a latent bandit, where the prototypical models of user behavior are learned offline and the latent state of the user is inferred online from its interactions with the models.
1 code implementation • 17 Nov 2020 • Honglin Yuan, Manzil Zaheer, Sashank Reddi
We first show that straightforward extensions of primal algorithms such as FedAvg are not well-suited for FCO since they suffer from the "curse of primal averaging," resulting in poor convergence.
no code implementations • NAACL 2021 • Bill Yuchen Lin, Haitian Sun, Bhuwan Dhingra, Manzil Zaheer, Xiang Ren, William W. Cohen
As a step towards making commonsense reasoning research more realistic, we propose to study open-ended commonsense reasoning (OpenCSR) -- the task of answering a commonsense question without any pre-defined choices -- using as a resource only a corpus of commonsense facts written in natural language.
2 code implementations • 22 Oct 2020 • Nicholas Monath, Avinava Dubey, Guru Guruganesh, Manzil Zaheer, Amr Ahmed, Andrew McCallum, Gokhan Mergen, Marc Najork, Mert Terzihan, Bryon Tjanaka, YuAn Wang, Yuchen Wu
The applicability of agglomerative clustering, for inferring both hierarchical and flat clustering, is limited by its scalability.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Rajarshi Das, Ameya Godbole, Nicholas Monath, Manzil Zaheer, Andrew McCallum
A case-based reasoning (CBR) system solves a new problem by retrieving `cases' that are similar to the given problem.
Ranked #1 on Link Prediction on NELL-995
no code implementations • 15 Sep 2020 • Xinyuan Zhang, Ruiyi Zhang, Manzil Zaheer, Amr Ahmed
High-quality dialogue-summary paired data is expensive to produce and domain-sensitive, making abstractive dialogue summarization a challenging task.
1 code implementation • AAAI 2019 2019 • Devendra Singh Sachan, Manzil Zaheer, Ruslan Salakhutdinov
In this paper, we study bidirectional LSTM network for the task of text classification using both supervised and semi-supervised approaches.
Ranked #3 on Text Classification on AG News
14 code implementations • NeurIPS 2020 • Manzil Zaheer, Guru Guruganesh, Avinava Dubey, Joshua Ainslie, Chris Alberti, Santiago Ontanon, Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed
To remedy this, we propose, BigBird, a sparse attention mechanism that reduces this quadratic dependency to linear.
Ranked #1 on Text Summarization on arXiv
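The sparse pattern BigBird describes combines a sliding local window, a few global tokens, and random connections, so the number of attended positions grows linearly rather than quadratically in sequence length. The construction below is a simplified illustration; the paper's block-sparse structure and exact counts differ.

```python
# Illustrative BigBird-style sparse attention mask: window + global + random.
import numpy as np

def sparse_mask(n, window=1, n_global=1, n_random=1, seed=0):
    rng = np.random.default_rng(seed)
    mask = np.zeros((n, n), dtype=bool)
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        mask[i, lo:hi] = True                     # local sliding window
        mask[i, rng.choice(n, n_random)] = True   # random connections
    mask[:n_global, :] = True   # global tokens attend to everything
    mask[:, :n_global] = True   # and everything attends to them
    return mask

m = sparse_mask(8)
```

Each row has O(window + n_global + n_random) nonzeros, so the mask's total density is O(n) versus n² for full attention.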
1 code implementation • AKBC 2020 • Rajarshi Das, Ameya Godbole, Shehzaad Dhuliawala, Manzil Zaheer, Andrew McCallum
We present a surprisingly simple yet accurate approach to reasoning in knowledge graphs (KGs) that requires \emph{no training}, and is reminiscent of case-based reasoning in classical artificial intelligence (AI).
no code implementations • 15 Jun 2020 • Joey Hong, Branislav Kveton, Manzil Zaheer, Yin-Lam Chow, Amr Ahmed
This approach is practical and analyzable, and we provide guarantees on both the quality of off-policy optimization and the regret during online deployment.
no code implementations • NeurIPS 2020 • Joey Hong, Branislav Kveton, Manzil Zaheer, Yin-Lam Chow, Amr Ahmed, Craig Boutilier
A latent bandit problem is one in which the learning agent knows the arm reward distributions conditioned on an unknown discrete latent state.
no code implementations • 9 Jun 2020 • Branislav Kveton, Martin Mladenov, Chih-Wei Hsu, Manzil Zaheer, Csaba Szepesvari, Craig Boutilier
Most bandit policies are designed to either minimize regret in any problem instance, making very few assumptions about the underlying environment, or in a Bayesian sense, assuming a prior distribution over environment parameters.
no code implementations • NeurIPS 2020 • Melanie Weber, Manzil Zaheer, Ankit Singh Rawat, Aditya Menon, Sanjiv Kumar
In this paper, we present, to our knowledge, the first theoretical guarantees for learning a classifier in hyperbolic rather than Euclidean space.
no code implementations • ICLR 2021 • Paul Pu Liang, Manzil Zaheer, Yu-An Wang, Amr Ahmed
In this paper, we design a simple and efficient embedding algorithm that learns a small set of anchor embeddings and a sparse transformation matrix.
8 code implementations • ICLR 2021 • Sashank Reddi, Zachary Charles, Manzil Zaheer, Zachary Garrett, Keith Rush, Jakub Konečný, Sanjiv Kumar, H. Brendan McMahan
Federated learning is a distributed machine learning paradigm in which a large number of clients coordinate with a central server to learn a model without sharing their own training data.
no code implementations • 27 Feb 2020 • Daniel A. Abolafia, Rishabh Singh, Manzil Zaheer, Charles Sutton
Main consists of a neural controller that interacts with a variable-length input tape and learns to compose modules together with their corresponding argument choices.
1 code implementation • ICLR 2020 • Bhuwan Dhingra, Manzil Zaheer, Vidhisha Balachandran, Graham Neubig, Ruslan Salakhutdinov, William W. Cohen
In particular, we describe a neural module, DrKIT, that traverses textual data like a KB, softly following paths of relations between mentions of entities in the corpus.
no code implementations • NeurIPS 2020 • Craig Boutilier, Chih-Wei Hsu, Branislav Kveton, Martin Mladenov, Csaba Szepesvari, Manzil Zaheer
In this work, we learn such policies for an unknown distribution $\mathcal{P}$ using samples from $\mathcal{P}$.
2 code implementations • 7 Jan 2020 • Tian Li, Anit Kumar Sahu, Manzil Zaheer, Maziar Sanjabi, Ameet Talwalkar, Virginia Smith
Federated learning aims to jointly learn statistical models over massively distributed remote devices.
no code implementations • WS 2019 • Rajarshi Das, Ameya Godbole, Manzil Zaheer, Shehzaad Dhuliawala, Andrew McCallum
This paper describes our submission to the shared task on "Multi-hop Inference Explanation Regeneration" in the TextGraphs workshop at EMNLP 2019 (Jansen and Ustalov, 2019).
no code implementations • 25 Sep 2019 • Paul Pu Liang, Manzil Zaheer, YuAn Wang, Amr Ahmed
Learning continuous representations of discrete objects such as text, users, and items lies at the heart of many applications including text and user modeling.
no code implementations • WS 2019 • Ameya Godbole, Dilip Kavarthapu, Rajarshi Das, Zhiyu Gong, Abhishek Singhal, Hamed Zamani, Mo Yu, Tian Gao, Xiaoxiao Guo, Manzil Zaheer, Andrew McCallum
Multi-hop question answering (QA) requires an information retrieval (IR) system that can find \emph{multiple} supporting evidence needed to answer the question, making the retrieval process very challenging.
no code implementations • 20 Aug 2019 • Songwei Ge, Austin Dill, Eunsu Kang, Chun-Liang Li, Lingyao Zhang, Manzil Zaheer, Barnabas Poczos
We explore the intersection of human and machine creativity by generating sculptural objects through machine learning.
1 code implementation • 5 Aug 2019 • Vít Růžička, Eunsu Kang, David Gordon, Ankita Patel, Jacqui Fashimpaur, Manzil Zaheer
While the purpose of most fake news is misinformation and political propaganda, our team sees it as a new type of myth that is created by people in the age of internet identities and artificial intelligence.
no code implementations • 21 Jun 2019 • Branislav Kveton, Manzil Zaheer, Csaba Szepesvari, Lihong Li, Mohammad Ghavamzadeh, Craig Boutilier
The first, GLM-TSL, samples a generalized linear model (GLM) from the Laplace approximation to the posterior distribution.
1 code implementation • ICLR 2019 • Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Andrew McCallum
This paper introduces a new framework for open-domain question answering in which the retriever and the reader iteratively interact with each other.
1 code implementation • 5 Feb 2019 • Christopher Bender, Kevin O'Connor, Yang Li, Juan Jose Garcia, Manzil Zaheer, Junier Oliva
In this work, we develop a new approach to generative density estimation for exchangeable, non-i.i.d.
22 code implementations • 14 Dec 2018 • Tian Li, Anit Kumar Sahu, Manzil Zaheer, Maziar Sanjabi, Ameet Talwalkar, Virginia Smith
Theoretically, we provide convergence guarantees for our framework when learning over data from non-identical distributions (statistical heterogeneity), and while adhering to device-level systems constraints by allowing each participating device to perform a variable amount of work (systems heterogeneity).
1 code implementation • NeurIPS 2018 • Manzil Zaheer, Sashank Reddi, Devendra Sachan, Satyen Kale, Sanjiv Kumar
In this work, we provide a new analysis of such methods applied to nonconvex stochastic optimization problems, characterizing the effect of increasing minibatch size.
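This line of work is associated with the Yogi optimizer, whose second-moment estimate changes additively, v_t = v_{t-1} - (1 - b2) · sign(v_{t-1} - g²) · g², rather than multiplicatively as in Adam. The sketch below assumes that update; the toy objective and hyperparameters are illustrative.

```python
# Sketch of a Yogi-style adaptive update with additive second-moment control.
import numpy as np

def yogi_step(w, g, m, v, lr=0.1, b1=0.9, b2=0.999, eps=1e-3):
    m = b1 * m + (1 - b1) * g
    v = v - (1 - b2) * np.sign(v - g**2) * g**2   # additive v update
    return w - lr * m / (np.sqrt(v) + eps), m, v

w, m, v = 5.0, 0.0, 0.0
for _ in range(200):
    g = 2 * w              # gradient of the toy objective w^2
    w, m, v = yogi_step(w, g, m, v)
```

When v is far below g², v grows by at most (1 - b2)·g² per step, which keeps the effective learning rate from collapsing as abruptly as Adam's multiplicative update can.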
no code implementations • 13 Nov 2018 • Chun-Liang Li, Eunsu Kang, Songwei Ge, Lingyao Zhang, Austin Dill, Manzil Zaheer, Barnabas Poczos
Our approach extends DeepDream from images to 3D point clouds.
1 code implementation • 13 Oct 2018 • Chun-Liang Li, Manzil Zaheer, Yang Zhang, Barnabas Poczos, Ruslan Salakhutdinov
In this paper, we first show a straightforward extension of existing GAN algorithm is not applicable to point clouds, because the constraint required for discriminators is undefined for set data.
no code implementations • 8 Oct 2018 • Anit Kumar Sahu, Manzil Zaheer, Soummya Kar
This paper focuses on the problem of \emph{constrained} \emph{stochastic} optimization.
2 code implementations • EMNLP 2018 • Haitian Sun, Bhuwan Dhingra, Manzil Zaheer, Kathryn Mazaitis, Ruslan Salakhutdinov, William W. Cohen
In this paper we look at a more practical setting, namely QA over the combination of a KB and entity-linked text, which is appropriate when an incomplete KB is available with a large text corpus.
Graph Representation Learning • Open-Domain Question Answering
no code implementations • NeurIPS 2018 • Shashank Singh, Ananya Uppal, Boyue Li, Chun-Liang Li, Manzil Zaheer, Barnabás Póczos
We study minimax convergence rates of nonparametric density estimation under a large class of loss functions called "adversarial losses", which, besides classical $\mathcal{L}^p$ losses, includes maximum mean discrepancy (MMD), Wasserstein distance, and total variation distance.
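One member of this class of adversarial losses, the maximum mean discrepancy, can be estimated directly from two samples; the sketch below uses the biased V-statistic under an RBF kernel, with an illustrative bandwidth.

```python
# Biased (V-statistic) estimate of squared MMD under an RBF kernel.
import numpy as np

def mmd2(x, y, gamma=1.0):
    def k(a, b):
        d = a[:, None] - b[None, :]          # pairwise differences
        return np.exp(-gamma * d**2)         # RBF kernel matrix
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

rng = np.random.default_rng(0)
same = mmd2(rng.normal(size=200), rng.normal(size=200))        # same law
far  = mmd2(rng.normal(size=200), rng.normal(size=200) + 3.0)  # shifted law
```

Samples from the same distribution give an estimate near zero, while a mean shift of three standard deviations produces a clearly larger value.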
no code implementations • ICML 2018 • Junier B. Oliva, Avinava Dubey, Manzil Zaheer, Barnabás Póczos, Ruslan Salakhutdinov, Eric P. Xing, Jeff Schneider
Further, through a comprehensive study over both real-world and synthetic data, we show that jointly leveraging transformations of variables and autoregressive conditional models results in a considerable improvement in performance.
Ranked #1 on Density Estimation on BSDS300
1 code implementation • COLING 2018 • Devendra Singh Sachan, Manzil Zaheer, Ruslan Salakhutdinov
Text classification is one of the most widely studied tasks in natural language processing.
1 code implementation • CVPR 2018 • Chao-yuan Wu, Manzil Zaheer, Hexiang Hu, R. Manmatha, Alexander J. Smola, Philipp Krähenbühl
We propose to train a deep network directly on the compressed video.
Ranked #46 on Action Classification on Charades (using extra training data)
no code implementations • ICLR 2018 • Xun Zheng, Manzil Zaheer, Amr Ahmed, Yu-An Wang, Eric P. Xing, Alexander J. Smola
Long Short-Term Memory (LSTM) is one of the most powerful sequence models.
8 code implementations • ICLR 2018 • Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Luke Vilnis, Ishan Durugkar, Akshay Krishnamurthy, Alex Smola, Andrew McCallum
Knowledge bases (KB), both automatically and manually constructed, are often incomplete --- many valid facts can be inferred from the KB by synthesizing existing information.
no code implementations • 5 Sep 2017 • Sashank J. Reddi, Manzil Zaheer, Suvrit Sra, Barnabas Poczos, Francis Bach, Ruslan Salakhutdinov, Alexander J. Smola
A central challenge to using first-order methods for optimizing nonconvex problems is the presence of saddle points.
no code implementations • ICML 2017 • Manzil Zaheer, Amr Ahmed, Alexander J. Smola
Recurrent neural networks, such as long-short term memory (LSTM) networks, are powerful tools for modeling sequential data like user browsing history (Tan et al., 2016; Korpusik et al., 2016) or natural language text (Mikolov et al., 2010).
no code implementations • ICML 2017 • Manzil Zaheer, Satwik Kottur, Amr Ahmed, José Moura, Alex Smola
In this work, we propose Canopy, a sampler based on Cover Trees that is exact, has guaranteed runtime logarithmic in the number of atoms, and is provably polynomial in the inherent dimensionality of the underlying parameter space.
no code implementations • ACL 2017 • Rajarshi Das, Manzil Zaheer, Siva Reddy, Andrew McCallum
Existing question answering methods infer answers either from a knowledge base or from raw text.
no code implementations • 31 Mar 2017 • Hsiao-Yu Fish Tung, Chao-yuan Wu, Manzil Zaheer, Alexander J. Smola
Nonparametric models are a versatile, albeit computationally expensive, tool for modeling mixture models.
7 code implementations • NeurIPS 2017 • Manzil Zaheer, Satwik Kottur, Siamak Ravanbakhsh, Barnabas Poczos, Ruslan Salakhutdinov, Alexander Smola
Our main theorem characterizes the permutation invariant functions and provides a family of functions to which any permutation invariant objective function must belong.
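The sum-decomposition this theorem characterizes, f(X) = ρ(Σ_x φ(x)), can be sketched with tiny random-weight networks for φ and ρ; the weights and input set below are illustrative.

```python
# Sketch of a permutation-invariant sum-decomposition: rho(sum phi(x)).
import numpy as np

rng = np.random.default_rng(0)
W_phi = rng.normal(size=(1, 4))   # per-element encoder phi
W_rho = rng.normal(size=(4, 1))   # set-level decoder rho

def deep_set(xs):
    phi = np.tanh(xs[:, None] @ W_phi)       # apply phi to each element
    pooled = phi.sum(axis=0)                 # order-independent pooling
    return np.tanh(pooled @ W_rho).item()    # rho on the pooled code

x = np.array([0.3, -1.2, 2.0])
```

Because the pooling is a sum, permuting the input set leaves the output unchanged, which is exactly the invariance the characterization requires.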