Search Results for author: Alex Smola

Found 41 papers, 15 papers with code

A Cheaper and Better Diffusion Language Model with Soft-Masked Noise

1 code implementation10 Apr 2023 Jiaao Chen, Aston Zhang, Mu Li, Alex Smola, Diyi Yang

Diffusion models that are based on iterative denoising have been recently proposed and leveraged in various generation tasks like image generation.

Denoising Image Generation +1

RLSbench: Domain Adaptation Under Relaxed Label Shift

1 code implementation6 Feb 2023 Saurabh Garg, Nick Erickson, James Sharpnack, Alex Smola, Sivaraman Balakrishnan, Zachary C. Lipton

Despite the emergence of principled methods for domain adaptation under label shift, their sensitivity to shifts in class conditional distributions is precariously under explored.

Domain Adaptation

Multimodal Chain-of-Thought Reasoning in Language Models

3 code implementations2 Feb 2023 Zhuosheng Zhang, Aston Zhang, Mu Li, Hai Zhao, George Karypis, Alex Smola

Large language models (LLMs) have shown impressive performance on complex reasoning by leveraging chain-of-thought (CoT) prompting to generate intermediate reasoning chains as the rationale to infer the answer.

Language Modelling Science Question Answering

Parameter-Efficient Fine-Tuning Design Spaces

no code implementations4 Jan 2023 Jiaao Chen, Aston Zhang, Xingjian Shi, Mu Li, Alex Smola, Diyi Yang

We discover the following design patterns: (i) group layers in a spindle pattern; (ii) allocate the number of trainable parameters to layers uniformly; (iii) tune all the groups; (iv) assign proper tuning strategies to different groups.

Automatic Chain of Thought Prompting in Large Language Models

5 code implementations7 Oct 2022 Zhuosheng Zhang, Aston Zhang, Mu Li, Alex Smola

Providing these steps for prompting demonstrations is called chain-of-thought (CoT) prompting.

Mixture Proportion Estimation and PU Learning: A Modern Approach

2 code implementations NeurIPS 2021 Saurabh Garg, Yifan Wu, Alex Smola, Sivaraman Balakrishnan, Zachary C. Lipton

Formally, this task is broken down into two subtasks: (i) Mixture Proportion Estimation (MPE) -- determining the fraction of positive examples in the unlabeled data; and (ii) PU-learning -- given such an estimate, learning the desired positive-versus-negative classifier.

Mixture Proportion Estimation and PU Learning:A Modern Approach

1 code implementation NeurIPS 2021 Saurabh Garg, Yifan Wu, Alex Smola, Sivaraman Balakrishnan, Zachary Chase Lipton

Formally, this task is broken down into two subtasks: (i) Mixture Proportion Estimation (MPE)---determining the fraction of positive examples in the unlabeled data; and (ii) PU-learning---given such an estimate, learning the desired positive-versus-negative classifier.

Multimodal AutoML on Structured Tables with Text Fields

2 code implementations ICML Workshop AutoML 2021 Xingjian Shi, Jonas Mueller, Nick Erickson, Mu Li, Alex Smola

We design automated supervised learning systems for data tables that not only contain numeric/categorical columns, but text fields as well.


Regioned Episodic Reinforcement Learning

no code implementations1 Jan 2021 Jiarui Jin, Cong Chen, Ming Zhou, Weinan Zhang, Rasool Fakoor, David Wipf, Yong Yu, Jun Wang, Alex Smola

Goal-oriented reinforcement learning algorithms are often good at exploration, not exploitation, while episodic algorithms excel at exploitation, not exploration.

reinforcement-learning Reinforcement Learning (RL)

Explore with Dynamic Map: Graph Structured Reinforcement Learning

no code implementations1 Jan 2021 Jiarui Jin, Sijin Zhou, Weinan Zhang, Rasool Fakoor, David Wipf, Tong He, Yong Yu, Zheng Zhang, Alex Smola

In reinforcement learning, a map with states and transitions built based on historical trajectories is often helpful in exploration and exploitation.

reinforcement-learning Reinforcement Learning (RL)

TraDE: A Simple Self-Attention-Based Density Estimator

no code implementations1 Jan 2021 Rasool Fakoor, Pratik Anil Chaudhari, Jonas Mueller, Alex Smola

We present TraDE, a self-attention-based architecture for auto-regressive density estimation with continuous and discrete valued data.

Density Estimation Out-of-Distribution Detection

Recognizing Variables from their Data via Deep Embeddings of Distributions

no code implementations11 Sep 2019 Jonas Mueller, Alex Smola

A key obstacle in automated analytics and meta-learning is the inability to recognize when different datasets contain measurements of the same variable.


Deep Factors for Forecasting

no code implementations28 May 2019 Yuyang Wang, Alex Smola, Danielle C. Maddix, Jan Gasthaus, Dean Foster, Tim Januschowski

We provide both theoretical and empirical evidence for the soundness of our approach through a necessary and sufficient decomposition of exchangeable time series into a global and a local part.

Time Series Time Series Analysis

Deep Factors with Gaussian Processes for Forecasting

no code implementations30 Nov 2018 Danielle C. Maddix, Yuyang Wang, Alex Smola

A large collection of time series poses significant challenges for classical and neural forecasting approaches.

Gaussian Processes Time Series +1

Learning Steady-States of Iterative Algorithms over Graphs

no code implementations ICML 2018 Hanjun Dai, Zornitsa Kozareva, Bo Dai, Alex Smola, Le Song

Many graph analytics problems can be solved via iterative algorithms where the solutions are often characterized by a set of steady-state conditions.

Deep Graphs

no code implementations4 Jun 2018 Emmanouil Antonios Platanios, Alex Smola

We propose an algorithm for deep learning on networks and graphs.


Detecting and Correcting for Label Shift with Black Box Predictors

1 code implementation ICML 2018 Zachary C. Lipton, Yu-Xiang Wang, Alex Smola

Faced with distribution shift between training and test set, we wish to detect and quantify the shift, and to correct our classifiers without test set labels.

Medical Diagnosis Test

Go for a Walk and Arrive at the Answer: Reasoning Over Paths in Knowledge Bases using Reinforcement Learning

7 code implementations ICLR 2018 Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Luke Vilnis, Ishan Durugkar, Akshay Krishnamurthy, Alex Smola, Andrew McCallum

Knowledge bases (KB), both automatically and manually constructed, are often incomplete --- many valid facts can be inferred from the KB by synthesizing existing information.

Navigate valid

Canopy --- Fast Sampling with Cover Trees

no code implementations ICML 2017 Manzil Zaheer, Satwik Kottur, Amr Ahmed, José Moura, Alex Smola

In this work, we propose Canopy, a sampler based on Cover Trees that is exact, has guaranteed runtime logarithmic in the number of atoms, and is provably polynomial in the inherent dimensionality of the underlying parameter space.

Efficient Multitask Feature and Relationship Learning

no code implementations14 Feb 2017 Han Zhao, Otilia Stretcu, Alex Smola, Geoff Gordon

In this paper, we consider a formulation of multitask learning that learns the relationships both between tasks and between features, represented through a task covariance and a feature covariance matrix, respectively.

Generative Models and Model Criticism via Optimized Maximum Mean Discrepancy

1 code implementation14 Nov 2016 Danica J. Sutherland, Hsiao-Yu Tung, Heiko Strathmann, Soumyajit De, Aaditya Ramdas, Alex Smola, Arthur Gretton

In this context, the MMD may be used in two roles: first, as a discriminator, either directly on the samples, or on features of the samples.


Stochastic Frank-Wolfe Methods for Nonconvex Optimization

no code implementations27 Jul 2016 Sashank J. Reddi, Suvrit Sra, Barnabas Poczos, Alex Smola

Finally, we show that the faster convergence rates of our variance reduced methods also translate into improved convergence rates for the stochastic setting.

Neural Machine Translation with Recurrent Attention Modeling

no code implementations EACL 2017 Zichao Yang, Zhiting Hu, Yuntian Deng, Chris Dyer, Alex Smola

Knowing which words have been attended to in previous time steps while generating a translation is a rich source of information for predicting what words will be attended to in the future.

Machine Translation Translation

Fast Stochastic Methods for Nonsmooth Nonconvex Optimization

no code implementations23 May 2016 Sashank J. Reddi, Suvrit Sra, Barnabas Poczos, Alex Smola

This paper builds upon our recent series of papers on fast stochastic methods for smooth nonconvex optimization [22, 23], with a novel analysis for nonconvex and nonsmooth functions.

Fast Incremental Method for Nonconvex Optimization

no code implementations19 Mar 2016 Sashank J. Reddi, Suvrit Sra, Barnabas Poczos, Alex Smola

We analyze a fast incremental aggregated gradient method for optimizing nonconvex problems of the form $\min_x \sum_i f_i(x)$.

Stochastic Variance Reduction for Nonconvex Optimization

no code implementations19 Mar 2016 Sashank J. Reddi, Ahmed Hefny, Suvrit Sra, Barnabas Poczos, Alex Smola

We study nonconvex finite-sum problems and analyze stochastic variance reduced gradient (SVRG) methods for them.

Data Driven Resource Allocation for Distributed Learning

no code implementations15 Dec 2015 Travis Dick, Mu Li, Venkata Krishna Pillutla, Colin White, Maria Florina Balcan, Alex Smola

In distributed machine learning, data is dispatched to multiple machines for processing.

Privacy for Free: Posterior Sampling and Stochastic Gradient Monte Carlo

no code implementations26 Feb 2015 Yu-Xiang Wang, Stephen E. Fienberg, Alex Smola

We consider the problem of Bayesian learning on sensitive datasets and present two simple but somewhat surprising results that connect Bayesian learning to "differential privacy:, a cryptographic approach to protect individual-level privacy while permiting database-level utility.

Deep Fried Convnets

1 code implementation ICCV 2015 Zichao Yang, Marcin Moczulski, Misha Denil, Nando de Freitas, Alex Smola, Le Song, Ziyu Wang

The fully connected layers of a deep convolutional neural network typically contain over 90% of the network parameters, and consume the majority of the memory required to store the network parameters.

Image Classification

Trend Filtering on Graphs

no code implementations28 Oct 2014 Yu-Xiang Wang, James Sharpnack, Alex Smola, Ryan J. Tibshirani

We introduce a family of adaptive estimators on graphs, based on penalizing the $\ell_1$ norm of discrete graph differences.


The Falling Factorial Basis and Its Statistical Applications

no code implementations3 May 2014 Yu-Xiang Wang, Alex Smola, Ryan J. Tibshirani

We study a novel spline-like basis, which we name the "falling factorial basis", bearing many similarities to the classic truncated power basis.


Randomized Nonlinear Component Analysis

no code implementations1 Feb 2014 David Lopez-Paz, Suvrit Sra, Alex Smola, Zoubin Ghahramani, Bernhard Schölkopf

Although nonlinear variants of PCA and CCA have been proposed, these are computationally prohibitive in the large scale.


Super-Samples from Kernel Herding

1 code implementation15 Mar 2012 Yutian Chen, Max Welling, Alex Smola

We extend the herding algorithm to continuous spaces by using the kernel trick.

Feature Hashing for Large Scale Multitask Learning

no code implementations12 Feb 2009 Kilian Weinberger, Anirban Dasgupta, Josh Attenberg, John Langford, Alex Smola

Empirical evidence suggests that hashing is an effective strategy for dimensionality reduction and practical nonparametric estimation.

Dimensionality Reduction

Cannot find the paper you are looking for? You can Submit a new open access paper.