Search Results for author: Alex Smola

Found 42 papers, 16 papers with code

Automatic Chain of Thought Prompting in Large Language Models

5 code implementations • 7 Oct 2022 • Zhuosheng Zhang, Aston Zhang, Mu Li, Alex Smola

Providing these steps for prompting demonstrations is called chain-of-thought (CoT) prompting.

17,278

Multimodal Chain-of-Thought Reasoning in Language Models

3 code implementations • 2 Feb 2023 • Zhuosheng Zhang, Aston Zhang, Mu Li, Hai Zhao, George Karypis, Alex Smola

Large language models (LLMs) have shown impressive performance on complex reasoning by leveraging chain-of-thought (CoT) prompting to generate intermediate reasoning chains as the rationale to infer the answer.

Ranked #3 on Science Question Answering on ScienceQA

Language Modelling, Science Question Answering

3,671

Hierarchical Attention Networks for Document Classification

1 code implementation • NAACL 2016 • Zichao Yang, Diyi Yang, Chris Dyer, Xiaodong He, Alex Smola, Eduard Hovy

Ranked #4 on Text Classification on arXiv-10

Citation Intent Classification, Document Classification, +2

457

Go for a Walk and Arrive at the Answer: Reasoning Over Paths in Knowledge Bases using Reinforcement Learning

7 code implementations • ICLR 2018 • Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Luke Vilnis, Ishan Durugkar, Akshay Krishnamurthy, Alex Smola, Andrew McCallum

Knowledge bases (KB), both automatically and manually constructed, are often incomplete --- many valid facts can be inferred from the KB by synthesizing existing information.

Navigate, Relation, +1

309

Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition

1 code implementation • NeurIPS 2023 • Shuhuai Ren, Aston Zhang, Yi Zhu, Shuai Zhang, Shuai Zheng, Mu Li, Alex Smola, Xu Sun

This work proposes POMP, a prompt pre-training method for vision-language models.

Ranked #1 on Open Vocabulary Semantic Segmentation on COCO-Stuff-171

Image Classification, object-detection, +3

245

Generative Models and Model Criticism via Optimized Maximum Mean Discrepancy

1 code implementation • 14 Nov 2016 • Danica J. Sutherland, Hsiao-Yu Tung, Heiko Strathmann, Soumyajit De, Aaditya Ramdas, Alex Smola, Arthur Gretton

In this context, the MMD may be used in two roles: first, as a discriminator, either directly on the samples, or on features of the samples.

205

Stacked Attention Networks for Image Question Answering

16 code implementations • CVPR 2016 • Zichao Yang, Xiaodong He, Jianfeng Gao, Li Deng, Alex Smola

Thus, we develop a multiple-layer SAN in which we query an image multiple times to infer the answer progressively.

Ranked #5 on Visual Question Answering (VQA) on VQA v1 test-std

Visual Question Answering (VQA)

104

Multimodal AutoML on Structured Tables with Text Fields

1 code implementation • ICML Workshop AutoML 2021 • Xingjian Shi, Jonas Mueller, Nick Erickson, Mu Li, Alex Smola

We design automated supervised learning systems for data tables that not only contain numeric/categorical columns, but text fields as well.

AutoML

A Cheaper and Better Diffusion Language Model with Soft-Masked Noise

1 code implementation • 10 Apr 2023 • Jiaao Chen, Aston Zhang, Mu Li, Alex Smola, Diyi Yang

Diffusion models that are based on iterative denoising have been recently proposed and leveraged in various generation tasks like image generation.

Denoising, Image Generation, +1

Mixture Proportion Estimation and PU Learning: A Modern Approach

2 code implementations • NeurIPS 2021 • Saurabh Garg, Yifan Wu, Alex Smola, Sivaraman Balakrishnan, Zachary C. Lipton

Formally, this task is broken down into two subtasks: (i) Mixture Proportion Estimation (MPE) -- determining the fraction of positive examples in the unlabeled data; and (ii) PU-learning -- given such an estimate, learning the desired positive-versus-negative classifier.

Mixture Proportion Estimation and PU Learning: A Modern Approach

1 code implementation • NeurIPS 2021 • Saurabh Garg, Yifan Wu, Alex Smola, Sivaraman Balakrishnan, Zachary Chase Lipton

Formally, this task is broken down into two subtasks: (i) Mixture Proportion Estimation (MPE)---determining the fraction of positive examples in the unlabeled data; and (ii) PU-learning---given such an estimate, learning the desired positive-versus-negative classifier.

Super-Samples from Kernel Herding

1 code implementation • 15 Mar 2012 • Yutian Chen, Max Welling, Alex Smola

We extend the herding algorithm to continuous spaces by using the kernel trick.

Partial and Asymmetric Contrastive Learning for Out-of-Distribution Detection in Long-Tailed Recognition

1 code implementation • 4 Jul 2022 • Haotao Wang, Aston Zhang, Yi Zhu, Shuai Zheng, Mu Li, Alex Smola, Zhangyang Wang

However, in real-world applications, it is common for the training sets to have long-tailed distributions.

Anomaly Detection, Contrastive Learning, +2

RLSbench: Domain Adaptation Under Relaxed Label Shift

1 code implementation • 6 Feb 2023 • Saurabh Garg, Nick Erickson, James Sharpnack, Alex Smola, Sivaraman Balakrishnan, Zachary C. Lipton

Despite the emergence of principled methods for domain adaptation under label shift, their sensitivity to shifts in class conditional distributions is precariously underexplored.

Domain Adaptation

Deep Fried Convnets

1 code implementation • ICCV 2015 • Zichao Yang, Marcin Moczulski, Misha Denil, Nando de Freitas, Alex Smola, Le Song, Ziyu Wang

The fully connected layers of a deep convolutional neural network typically contain over 90% of the network parameters, and consume the majority of the memory required to store the network parameters.

Ranked #54 on Image Classification on MNIST

Image Classification

Detecting and Correcting for Label Shift with Black Box Predictors

1 code implementation • ICML 2018 • Zachary C. Lipton, Yu-Xiang Wang, Alex Smola

Faced with distribution shift between training and test set, we wish to detect and quantify the shift, and to correct our classifiers without test set labels.

Medical Diagnosis

Deep Graphs

no code implementations • 4 Jun 2018 • Emmanouil Antonios Platanios, Alex Smola

We propose an algorithm for deep learning on networks and graphs.

regression

Efficient Multitask Feature and Relationship Learning

no code implementations • 14 Feb 2017 • Han Zhao, Otilia Stretcu, Alex Smola, Geoff Gordon

In this paper, we consider a formulation of multitask learning that learns the relationships both between tasks and between features, represented through a task covariance and a feature covariance matrix, respectively.

Data Driven Resource Allocation for Distributed Learning

no code implementations • 15 Dec 2015 • Travis Dick, Mu Li, Venkata Krishna Pillutla, Colin White, Maria Florina Balcan, Alex Smola

In distributed machine learning, data is dispatched to multiple machines for processing.

AIDE: Fast and Communication Efficient Distributed Optimization

no code implementations • 24 Aug 2016 • Sashank J. Reddi, Jakub Konečný, Peter Richtárik, Barnabás Póczós, Alex Smola

It is well known that the DANE algorithm does not match the communication complexity lower bounds.

Distributed Optimization

Stochastic Frank-Wolfe Methods for Nonconvex Optimization

no code implementations • 27 Jul 2016 • Sashank J. Reddi, Suvrit Sra, Barnabas Poczos, Alex Smola

Finally, we show that the faster convergence rates of our variance reduced methods also translate into improved convergence rates for the stochastic setting.

Neural Machine Translation with Recurrent Attention Modeling

no code implementations • EACL 2017 • Zichao Yang, Zhiting Hu, Yuntian Deng, Chris Dyer, Alex Smola

Knowing which words have been attended to in previous time steps while generating a translation is a rich source of information for predicting what words will be attended to in the future.

Machine Translation, Translation

Trend Filtering on Graphs

no code implementations • 28 Oct 2014 • Yu-Xiang Wang, James Sharpnack, Alex Smola, Ryan J. Tibshirani

We introduce a family of adaptive estimators on graphs, based on penalizing the $\ell_1$ norm of discrete graph differences.

regression

Fast Stochastic Methods for Nonsmooth Nonconvex Optimization

no code implementations • 23 May 2016 • Sashank J. Reddi, Suvrit Sra, Barnabas Poczos, Alex Smola

This paper builds upon our recent series of papers on fast stochastic methods for smooth nonconvex optimization [22, 23], with a novel analysis for nonconvex and nonsmooth functions.

Stochastic Variance Reduction for Nonconvex Optimization

no code implementations • 19 Mar 2016 • Sashank J. Reddi, Ahmed Hefny, Suvrit Sra, Barnabas Poczos, Alex Smola

We study nonconvex finite-sum problems and analyze stochastic variance reduced gradient (SVRG) methods for them.

Fast Incremental Method for Nonconvex Optimization

no code implementations • 19 Mar 2016 • Sashank J. Reddi, Suvrit Sra, Barnabas Poczos, Alex Smola

We analyze a fast incremental aggregated gradient method for optimizing nonconvex problems of the form $\min_x \sum_i f_i(x)$.

On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants

no code implementations • NeurIPS 2015 • Sashank J. Reddi, Ahmed Hefny, Suvrit Sra, Barnabás Póczos, Alex Smola

We demonstrate the empirical performance of our method through a concrete realization of asynchronous SVRG.

Privacy for Free: Posterior Sampling and Stochastic Gradient Monte Carlo

no code implementations • 26 Feb 2015 • Yu-Xiang Wang, Stephen E. Fienberg, Alex Smola

We consider the problem of Bayesian learning on sensitive datasets and present two simple but somewhat surprising results that connect Bayesian learning to "differential privacy", a cryptographic approach to protect individual-level privacy while permitting database-level utility.

The Falling Factorial Basis and Its Statistical Applications

no code implementations • 3 May 2014 • Yu-Xiang Wang, Alex Smola, Ryan J. Tibshirani

We study a novel spline-like basis, which we name the "falling factorial basis", bearing many similarities to the classic truncated power basis.

Randomized Nonlinear Component Analysis

no code implementations • 1 Feb 2014 • David Lopez-Paz, Suvrit Sra, Alex Smola, Zoubin Ghahramani, Bernhard Schölkopf

Although nonlinear variants of PCA and CCA have been proposed, these are computationally prohibitive in the large scale.

Clustering

Deep Factors with Gaussian Processes for Forecasting

no code implementations • 30 Nov 2018 • Danielle C. Maddix, Yuyang Wang, Alex Smola

A large collection of time series poses significant challenges for classical and neural forecasting approaches.

Gaussian Processes, Time Series, +1

Canopy --- Fast Sampling with Cover Trees

no code implementations • ICML 2017 • Manzil Zaheer, Satwik Kottur, Amr Ahmed, José Moura, Alex Smola

In this work, we propose Canopy, a sampler based on Cover Trees that is exact, has guaranteed runtime logarithmic in the number of atoms, and is provably polynomial in the inherent dimensionality of the underlying parameter space.

Learning Steady-States of Iterative Algorithms over Graphs

no code implementations • ICML 2018 • Hanjun Dai, Zornitsa Kozareva, Bo Dai, Alex Smola, Le Song

Many graph analytics problems can be solved via iterative algorithms where the solutions are often characterized by a set of steady-state conditions.

MLSys: The New Frontier of Machine Learning Systems

no code implementations • 29 Mar 2019 • Alexander Ratner, Dan Alistarh, Gustavo Alonso, David G. Andersen, Peter Bailis, Sarah Bird, Nicholas Carlini, Bryan Catanzaro, Jennifer Chayes, Eric Chung, Bill Dally, Jeff Dean, Inderjit S. Dhillon, Alexandros Dimakis, Pradeep Dubey, Charles Elkan, Grigori Fursin, Gregory R. Ganger, Lise Getoor, Phillip B. Gibbons, Garth A. Gibson, Joseph E. Gonzalez, Justin Gottschlich, Song Han, Kim Hazelwood, Furong Huang, Martin Jaggi, Kevin Jamieson, Michael I. Jordan, Gauri Joshi, Rania Khalaf, Jason Knight, Jakub Konečný, Tim Kraska, Arun Kumar, Anastasios Kyrillidis, Aparna Lakshmiratan, Jing Li, Samuel Madden, H. Brendan McMahan, Erik Meijer, Ioannis Mitliagkas, Rajat Monga, Derek Murray, Kunle Olukotun, Dimitris Papailiopoulos, Gennady Pekhimenko, Theodoros Rekatsinas, Afshin Rostamizadeh, Christopher Ré, Christopher De Sa, Hanie Sedghi, Siddhartha Sen, Virginia Smith, Alex Smola, Dawn Song, Evan Sparks, Ion Stoica, Vivienne Sze, Madeleine Udell, Joaquin Vanschoren, Shivaram Venkataraman, Rashmi Vinayak, Markus Weimer, Andrew Gordon Wilson, Eric Xing, Matei Zaharia, Ce Zhang, Ameet Talwalkar

Machine learning (ML) techniques are enjoying rapidly increasing adoption.

BIG-bench Machine Learning

Deep Factors for Forecasting

no code implementations • 28 May 2019 • Yuyang Wang, Alex Smola, Danielle C. Maddix, Jan Gasthaus, Dean Foster, Tim Januschowski

We provide both theoretical and empirical evidence for the soundness of our approach through a necessary and sufficient decomposition of exchangeable time series into a global and a local part.

Time Series, Time Series Analysis

Recognizing Variables from their Data via Deep Embeddings of Distributions

no code implementations • 11 Sep 2019 • Jonas Mueller, Alex Smola

A key obstacle in automated analytics and meta-learning is the inability to recognize when different datasets contain measurements of the same variable.

Attribute, Meta-Learning

Tiering as a Stochastic Submodular Optimization Problem

no code implementations • 16 May 2020 • Hyokun Yun, Michael Froh, Roshan Makhijani, Brian Luc, Alex Smola, Trishul Chilimbi

Tiering is an essential technique for building large-scale information retrieval systems.

Information Retrieval, Retrieval, +1

Explore with Dynamic Map: Graph Structured Reinforcement Learning

no code implementations • 1 Jan 2021 • Jiarui Jin, Sijin Zhou, Weinan Zhang, Rasool Fakoor, David Wipf, Tong He, Yong Yu, Zheng Zhang, Alex Smola

In reinforcement learning, a map with states and transitions built based on historical trajectories is often helpful in exploration and exploitation.

reinforcement-learning, Reinforcement Learning (RL)

Regioned Episodic Reinforcement Learning

no code implementations • 1 Jan 2021 • Jiarui Jin, Cong Chen, Ming Zhou, Weinan Zhang, Rasool Fakoor, David Wipf, Yong Yu, Jun Wang, Alex Smola

Goal-oriented reinforcement learning algorithms are often good at exploration, not exploitation, while episodic algorithms excel at exploitation, not exploration.

reinforcement-learning, Reinforcement Learning (RL)

TraDE: A Simple Self-Attention-Based Density Estimator

no code implementations • 1 Jan 2021 • Rasool Fakoor, Pratik Anil Chaudhari, Jonas Mueller, Alex Smola

We present TraDE, a self-attention-based architecture for auto-regressive density estimation with continuous and discrete valued data.

Density Estimation, Out-of-Distribution Detection

Parameter-Efficient Fine-Tuning Design Spaces

no code implementations • 4 Jan 2023 • Jiaao Chen, Aston Zhang, Xingjian Shi, Mu Li, Alex Smola, Diyi Yang

We discover the following design patterns: (i) group layers in a spindle pattern; (ii) allocate the number of trainable parameters to layers uniformly; (iii) tune all the groups; (iv) assign proper tuning strategies to different groups.

Feature Hashing for Large Scale Multitask Learning

no code implementations • 12 Feb 2009 • Kilian Weinberger, Anirban Dasgupta, Josh Attenberg, John Langford, Alex Smola

Empirical evidence suggests that hashing is an effective strategy for dimensionality reduction and practical nonparametric estimation.

Dimensionality Reduction
