no code implementations • 4 Oct 2022 • Rasool Fakoor, Jonas Mueller, Zachary C. Lipton, Pratik Chaudhari, Alexander J. Smola

Real-world deployment of machine learning models is challenging because data evolves over time.

1 code implementation • 10 Dec 2021 • Kavosh Asadi, Rasool Fakoor, Omer Gottesman, Taesup Kim, Michael L. Littman, Alexander J. Smola

In this paper we endow two popular deep reinforcement learning algorithms, namely DQN and Rainbow, with updates that incentivize the online network to remain in the proximity of the target network.
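The proximity idea can be illustrated as a regularized TD objective. A minimal sketch (not the paper's exact update; the function name, penalty coefficient `c`, and toy parameters are hypothetical):

```python
import numpy as np

def proximal_td_loss(td_errors, online_params, target_params, c=0.1):
    """Squared TD loss plus a proximal term penalizing the online
    network's parameters for drifting far from the target network's."""
    td_loss = np.mean(np.square(td_errors))
    prox = sum(np.sum((w_o - w_t) ** 2)
               for w_o, w_t in zip(online_params, target_params))
    return td_loss + c * prox

# Toy check: identical online/target parameters incur no proximal penalty.
w = [np.ones((2, 2)), np.zeros(3)]
loss_same = proximal_td_loss(np.array([1.0, -1.0]), w, w, c=0.5)
```

When the online and target parameters coincide, only the TD term remains; a larger `c` keeps updates closer to the target network.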

2 code implementations • 4 Nov 2021 • Xingjian Shi, Jonas Mueller, Nick Erickson, Mu Li, Alexander J. Smola

We consider the use of automated supervised learning systems for data tables that not only contain numeric/categorical columns, but one or more text fields as well.

Ranked #2 on Binary Classification on kickstarter

1 code implementation • NeurIPS 2021 • Abdul Fatir Ansari, Konstantinos Benidis, Richard Kurle, Ali Caner Turkmen, Harold Soh, Alexander J. Smola, Yuyang Wang, Tim Januschowski

We propose the Recurrent Explicit Duration Switching Dynamical System (RED-SDS), a flexible model that is capable of identifying both state- and time-dependent switching dynamics.

1 code implementation • 21 Jun 2021 • Aston Zhang, Zachary C. Lipton, Mu Li, Alexander J. Smola

This open-source book represents our attempt to make deep learning approachable, teaching readers the concepts, the context, and the code.

1 code implementation • 26 Feb 2021 • Rasool Fakoor, Taesup Kim, Jonas Mueller, Alexander J. Smola, Ryan J. Tibshirani

Quantile regression is a fundamental problem in statistical learning motivated by a need to quantify uncertainty in predictions, or to model a diverse population without being overly reductive.
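The standard loss behind quantile regression is the pinball (quantile) loss; minimizing it over a constant predictor recovers the sample quantile. A small self-contained sketch:

```python
import numpy as np

def pinball_loss(y_true, y_pred, tau):
    """Quantile (pinball) loss: under-predictions weighted by tau,
    over-predictions by (1 - tau)."""
    diff = y_true - y_pred
    return np.mean(np.maximum(tau * diff, (tau - 1) * diff))

# Minimizing over a constant predictor recovers the tau-quantile:
# with tau = 0.5 this is the median.
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
grid = np.linspace(0, 6, 601)
best = grid[np.argmin([pinball_loss(y, g, tau=0.5) for g in grid])]
```

Choosing other values of `tau` yields other conditional quantiles, which is what makes the loss a tool for uncertainty quantification.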

1 code implementation • NeurIPS 2021 • Rasool Fakoor, Jonas Mueller, Kavosh Asadi, Pratik Chaudhari, Alexander J. Smola

Reliant on too many experiments to learn good actions, current Reinforcement Learning (RL) algorithms have limited applicability in real-world settings, which can be too expensive to allow exploration.

1 code implementation • 25 Nov 2020 • Jiarui Jin, Kounianhua Du, Weinan Zhang, Jiarui Qin, Yuchen Fang, Yong Yu, Zheng Zhang, Alexander J. Smola

Heterogeneous information network (HIN) has been widely used to characterize entities of various types and their complex relations.

1 code implementation • 1 Jul 2020 • Jiarui Jin, Jiarui Qin, Yuchen Fang, Kounianhua Du, Wei-Nan Zhang, Yong Yu, Zheng Zhang, Alexander J. Smola

To the best of our knowledge, this is the first work providing an efficient neighborhood-based interaction model in the HIN-based recommendations.

no code implementations • 26 Jun 2020 • Rasool Fakoor, Pratik Chaudhari, Alexander J. Smola

This paper prescribes a suite of techniques for off-policy Reinforcement Learning (RL) that simplify the training process and reduce the sample complexity.

1 code implementation • NeurIPS 2020 • Rasool Fakoor, Jonas Mueller, Nick Erickson, Pratik Chaudhari, Alexander J. Smola

Automated machine learning (AutoML) can produce complex model ensembles by stacking, bagging, and boosting many individual models like trees, deep networks, and nearest neighbor estimators.
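The stacking idea can be sketched in a few lines: base-model predictions become features for a meta-learner. All model names and weights below are hypothetical placeholders, not the paper's ensemble:

```python
import numpy as np

def stack_predict(base_preds, meta_weights):
    """Stacking sketch: column-stack base-model predictions and blend
    them with a fixed linear meta-learner."""
    X_meta = np.column_stack(base_preds)
    return X_meta @ meta_weights

# Hypothetical out-of-fold predictions from three base models.
tree_p = np.array([0.2, 0.8, 0.6])
net_p = np.array([0.3, 0.7, 0.5])
knn_p = np.array([0.1, 0.9, 0.4])
blend = stack_predict([tree_p, net_p, knn_p],
                      meta_weights=np.array([0.5, 0.3, 0.2]))
```

In practice the meta-weights are themselves learned on out-of-fold predictions to avoid leaking training labels into the meta-learner.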

no code implementations • 6 Apr 2020 • Rasool Fakoor, Pratik Chaudhari, Jonas Mueller, Alexander J. Smola

We present TraDE, a self-attention-based architecture for auto-regressive density estimation with continuous and discrete valued data.

1 code implementation • 14 Feb 2020 • Chenguang Wang, Zihao Ye, Aston Zhang, Zheng Zhang, Alexander J. Smola

The Transformer has been widely used thanks to its ability to capture sequence information efficiently.

2 code implementations • ICLR 2020 • Rasool Fakoor, Pratik Chaudhari, Stefano Soatto, Alexander J. Smola

This paper introduces Meta-Q-Learning (MQL), a new off-policy algorithm for meta-Reinforcement Learning (meta-RL).

1 code implementation • 5 May 2019 • Rasool Fakoor, Pratik Chaudhari, Alexander J. Smola

Extensive experiments on the Atari-2600 and MuJoCo benchmark suites show that this simple technique is effective in reducing the sample complexity of state-of-the-art algorithms.

1 code implementation • arXiv 2019 • Chenguang Wang, Mu Li, Alexander J. Smola

In this paper, we explore effective Transformer architectures for language modeling, including adding LSTM layers to better capture the sequential context while keeping the computation efficient.

Ranked #2 on Language Modelling on Penn Treebank (Word Level) (using extra training data)

1 code implementation • CVPR 2018 • Chao-yuan Wu, Manzil Zaheer, Hexiang Hu, R. Manmatha, Alexander J. Smola, Philipp Krähenbühl

We propose to train a deep network directly on the compressed video.


Ranked #46 on Action Classification on Charades (using extra training data)

no code implementations • ICLR 2018 • Xun Zheng, Manzil Zaheer, Amr Ahmed, Yu-An Wang, Eric P. Xing, Alexander J. Smola

Long Short-Term Memory (LSTM) is one of the most powerful sequence models.

1 code implementation • 12 Sep 2017 • Yuyu Zhang, Hanjun Dai, Zornitsa Kozareva, Alexander J. Smola, Le Song

Knowledge graph (KG) is known to be helpful for the task of question answering (QA), since it provides well-structured relational information between entities, and allows one to further infer indirect facts.

no code implementations • 5 Sep 2017 • Sashank J. Reddi, Manzil Zaheer, Suvrit Sra, Barnabas Poczos, Francis Bach, Ruslan Salakhutdinov, Alexander J. Smola

A central challenge to using first-order methods for optimizing nonconvex problems is the presence of saddle points.
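The saddle-point issue is easy to see on f(x, y) = x² − y²: plain gradient descent started exactly at the origin stalls, while a small random perturbation lets it escape along the negative-curvature direction. A toy sketch (not any specific algorithm from the paper):

```python
import numpy as np

def grad(p):
    # Gradient of f(x, y) = x**2 - y**2, which has a saddle at the origin.
    x, y = p
    return np.array([2 * x, -2 * y])

def perturbed_gd(p0, lr=0.1, steps=200, noise=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    p = np.array(p0, dtype=float)
    for _ in range(steps):
        g = grad(p)
        if np.linalg.norm(g) < 1e-8:
            # Stuck at a critical point: inject a small perturbation.
            p = p + rng.normal(scale=noise, size=2)
        else:
            p = p - lr * g
    return p

stuck = perturbed_gd([0.0, 0.0], noise=0.0)   # no noise: stays at the saddle
free = perturbed_gd([0.0, 0.0], noise=1e-3)   # noise: escapes along y
```

The perturbed run moves away from the saddle because the −y² direction amplifies any nonzero y-component at each step.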

no code implementations • ICML 2017 • Manzil Zaheer, Amr Ahmed, Alexander J. Smola

Recurrent neural networks, such as long short-term memory (LSTM) networks, are powerful tools for modeling sequential data like user browsing history (Tan et al., 2016; Korpusik et al., 2016) or natural language text (Mikolov et al., 2010).

6 code implementations • ICCV 2017 • Chao-yuan Wu, R. Manmatha, Alexander J. Smola, Philipp Krähenbühl

In addition, we show that a simple margin based loss is sufficient to outperform all other loss functions.
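A margin-based loss of this kind hinges pairwise distances on both sides of a boundary β: positive pairs are pushed below β − α, negative pairs above β + α. A minimal sketch assuming that form (the distances and hyperparameters below are illustrative):

```python
import numpy as np

def margin_loss(dist, is_positive, beta=1.2, alpha=0.2):
    """Margin-based pairwise loss: hinge of alpha + y * (dist - beta),
    with y = +1 for positive pairs and y = -1 for negative pairs."""
    sign = np.where(is_positive, 1.0, -1.0)
    return np.mean(np.maximum(0.0, alpha + sign * (dist - beta)))

d = np.array([0.9, 1.6, 1.5, 0.8])
pos = np.array([True, True, False, False])
loss = margin_loss(d, pos)
```

Only the two pairs on the wrong side of their margin (the distant positive and the close negative) contribute to the loss.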

Ranked #5 on Image Retrieval on CARS196

no code implementations • 31 Mar 2017 • Hsiao-Yu Fish Tung, Chao-yuan Wu, Manzil Zaheer, Alexander J. Smola

Nonparametric models are versatile, albeit computationally expensive, tools for modeling mixture models.

1 code implementation • 27 Feb 2017 • Joachim D. Curtó, Irene C. Zarza, Feng Yang, Alexander J. Smola, Fernando de la Torre, Chong-Wah Ngo, Luc van Gool

The algorithm requires computing the product of Walsh-Hadamard Transform (WHT) matrices.
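Multiplying by a Walsh-Hadamard matrix never needs to be done explicitly: the fast WHT applies it in O(n log n) with butterfly updates. A self-contained sketch:

```python
import numpy as np

def fwht(x):
    """Fast Walsh-Hadamard transform (natural ordering);
    len(x) must be a power of two."""
    x = np.array(x, dtype=float)
    n = len(x)
    h = 1
    while h < n:
        for i in range(0, n, h * 2):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b  # butterfly update
        h *= 2
    return x

y = fwht([1.0, 0.0, 1.0, 0.0])
```

This matches the dense product H₄ @ [1, 0, 1, 0] = [2, 2, 0, 0] without ever materializing H₄.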

no code implementations • WSDM 2017 • Chao-yuan Wu, Amr Ahmed, Alex Beutel, Alexander J. Smola, How Jing

Recommender systems traditionally assume that user profiles and movie attributes are static.

no code implementations • NeurIPS 2016 • Kumar Avinava Dubey, Sashank J. Reddi, Sinead A. Williamson, Barnabas Poczos, Alexander J. Smola, Eric P. Xing

In this paper, we present techniques for reducing variance in stochastic gradient Langevin dynamics, yielding novel stochastic Monte Carlo methods that improve performance by reducing the variance in the stochastic gradient.

no code implementations • NeurIPS 2016 • Sashank J. Reddi, Suvrit Sra, Barnabas Poczos, Alexander J. Smola

We analyze stochastic algorithms for optimizing nonconvex, nonsmooth finite-sum problems, where the nonsmooth part is convex.

1 code implementation • 7 Nov 2016 • Ziqi Liu, Alexander J. Smola, Kyle Soska, Yu-Xiang Wang, Qinghua Zheng, Jun Zhou

That is, given properties of sites and the temporal occurrence of attacks, we are able to attribute individual attacks to joint causes and vulnerabilities, as well as to estimate the evolution of these vulnerabilities over time.

no code implementations • 6 Dec 2015 • Chao-yuan Wu, Alex Beutel, Amr Ahmed, Alexander J. Smola

With this novel technique we propose a new Bayesian model for joint collaborative filtering of ratings and text reviews through a sum of simple co-clusterings.

no code implementations • 20 Aug 2015 • Suvrit Sra, Adams Wei Yu, Mu Li, Alexander J. Smola

We study distributed stochastic convex optimization under the delayed gradient model where the server nodes perform parameter updates, while the worker nodes compute stochastic gradients.
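The delayed gradient model can be simulated in one process: the server applies gradients a worker computed at a τ-step-old iterate. A toy sketch on a simple quadratic (the delay, step size, and objective are illustrative, not the paper's setting):

```python
import numpy as np

def delayed_sgd(delay=3, lr=0.05, steps=300):
    """Minimize f(w) = 0.5 * ||w||^2 with gradients evaluated at a
    delay-step-old iterate, mimicking the delayed gradient model."""
    w = np.array([5.0, -3.0])
    history = [w.copy()]
    for t in range(steps):
        stale = history[max(0, t - delay)]  # gradient of f at an old iterate
        w = w - lr * stale                  # grad f(w) = w for this objective
        history.append(w.copy())
    return w

w_final = delayed_sgd()
```

For small enough step sizes the iteration still converges despite the staleness, which is the regime such analyses characterize.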

no code implementations • 18 May 2015 • Mu Li, Dave G. Andersen, Alexander J. Smola

Distributed computing excels at processing large scale data, but the communication cost for synchronizing the shared parameters may slow down the overall performance.

no code implementations • 6 May 2015 • Ziqi Liu, Yu-Xiang Wang, Alexander J. Smola

Differentially private collaborative filtering is a challenging task, both in terms of accuracy and speed.
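The basic privacy building block in this setting is noise calibrated to sensitivity. A minimal sketch of the scalar Laplace mechanism applied to an average rating (the data and parameters are illustrative, not the paper's method):

```python
import numpy as np

def laplace_mechanism(value, sensitivity, epsilon, rng):
    """Release value + Laplace noise of scale sensitivity / epsilon,
    the standard mechanism for epsilon-differential privacy."""
    return value + rng.laplace(scale=sensitivity / epsilon)

rng = np.random.default_rng(0)
# Mean rating over n users; each rating lies in [1, 5], so changing one
# user moves the mean by at most (5 - 1) / n.
ratings = np.array([4.0, 5.0, 3.0, 4.0, 2.0])
true_mean = ratings.mean()
private_mean = laplace_mechanism(true_mean,
                                 sensitivity=4.0 / len(ratings),
                                 epsilon=1.0, rng=rng)
```

The tension the abstract mentions is visible here: smaller ε means stronger privacy but larger noise, hurting accuracy.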

no code implementations • 31 Dec 2014 • Alex Beutel, Amr Ahmed, Alexander J. Smola

Matrix completion and approximation are popular tools to capture a user's preferences for recommendation and to approximate missing data.

no code implementations • 19 Dec 2014 • Zichao Yang, Alexander J. Smola, Le Song, Andrew Gordon Wilson

Kernel methods have great promise for learning rich statistical representations of large modern datasets.
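One standard way to scale kernel methods to large datasets is to approximate the kernel with random Fourier features. A sketch in the Rahimi-Recht style for the RBF kernel (the sizes and `gamma` are illustrative):

```python
import numpy as np

def rff_features(X, num_features, gamma, rng):
    """Random Fourier features approximating the RBF kernel
    k(x, y) = exp(-gamma * ||x - y||^2)."""
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, num_features))
    b = rng.uniform(0, 2 * np.pi, size=num_features)
    return np.sqrt(2.0 / num_features) * np.cos(X @ W + b)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
Phi = rff_features(X, num_features=5000, gamma=0.5, rng=rng)
K_approx = Phi @ Phi.T
K_exact = np.exp(-0.5 * ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
err = np.abs(K_approx - K_exact).max()
```

The inner product of the feature maps concentrates around the exact kernel value, turning an n×n kernel computation into linear-in-n feature maps.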

no code implementations • NeurIPS 2014 • Mu Li, David G. Andersen, Alexander J. Smola, Kai Yu

This paper describes a third-generation parameter server framework for distributed machine learning.
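The parameter server abstraction boils down to a push/pull API: workers push gradients, the server applies them, and workers pull fresh weights. A single-process mock, assuming a plain SGD server update (class and method names are illustrative, not the framework's API):

```python
import numpy as np

class ParameterServer:
    """Minimal single-process mock of a parameter server's push/pull API."""
    def __init__(self, dim, lr=0.1):
        self.weights = np.zeros(dim)
        self.lr = lr

    def push(self, grad):
        # Server applies a worker's gradient to the shared weights.
        self.weights -= self.lr * grad

    def pull(self):
        # Worker fetches a snapshot of the current weights.
        return self.weights.copy()

server = ParameterServer(dim=3)
for worker_grad in [np.array([1.0, 0.0, -1.0]), np.array([0.0, 2.0, 0.0])]:
    server.push(worker_grad)
w = server.pull()
```

A real deployment shards the weights across many server nodes and lets workers push/pull asynchronously; the interface stays the same.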

no code implementations • NeurIPS 2014 • Hsiao-Yu Tung, Alexander J. Smola

The Indian Buffet Process is a versatile statistical tool for modeling distributions over binary matrices.
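The IBP's generative metaphor translates directly into code: customer i samples each existing dish k with probability m_k / i (m_k = prior takers), then tries a Poisson(α / i) number of new dishes. A sketch of that standard process:

```python
import numpy as np

def sample_ibp(num_customers, alpha, rng):
    """Draw a binary matrix Z from the Indian Buffet Process."""
    dishes = []  # per-dish counts m_k
    rows = []
    for i in range(1, num_customers + 1):
        # Sample existing dishes proportionally to their popularity.
        row = [rng.random() < m / i for m in dishes]
        for k, taken in enumerate(row):
            if taken:
                dishes[k] += 1
        # Try a Poisson(alpha / i) number of brand-new dishes.
        new = rng.poisson(alpha / i)
        dishes.extend([1] * new)
        row.extend([True] * new)
        rows.append(row)
    Z = np.zeros((num_customers, len(dishes)), dtype=int)
    for i, row in enumerate(rows):
        Z[i, :len(row)] = row
    return Z

Z = sample_ibp(10, alpha=2.0, rng=np.random.default_rng(0))
```

The number of columns is unbounded a priori, which is what makes the IBP a prior over binary matrices with a potentially infinite number of latent features.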

no code implementations • NeurIPS 2013 • Chong Wang, Xi Chen, Alexander J. Smola, Eric P. Xing

We demonstrate how to construct the control variate for two practical problems using stochastic gradient optimization.
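The control-variate trick is easy to demonstrate on a plain Monte Carlo estimate: subtract a correlated quantity with known mean to cancel noise. A toy sketch (estimating E[exp(x/10)] for x ~ N(2, 1), using x itself as the control variate since E[x] = 2 is known):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(loc=2.0, scale=1.0, size=10_000)

f = np.exp(x / 10.0)
c = np.cov(f, x)[0, 1] / np.var(x)  # near-optimal coefficient from samples

plain = f.mean()                    # plain Monte Carlo estimate
cv = (f - c * (x - 2.0)).mean()     # control-variate estimate, same mean

var_plain = f.var()
var_cv = (f - c * (x - 2.0)).var()
```

Both estimators are unbiased for the same expectation, but the control-variate version has much smaller variance because f and x are strongly correlated; the paper applies the same principle to stochastic gradients.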

Papers With Code is a free resource with all data licensed under CC-BY-SA.