Search Results for author: Scott Sanner

Found 33 papers, 12 papers with code

Sample-efficient Iterative Lower Bound Optimization of Deep Reactive Policies for Planning in Continuous MDPs

no code implementations • 23 Mar 2022 • Siow Meng Low, Akshat Kumar, Scott Sanner

This novel formulation of DRP learning as iterative lower bound optimization (ILBO) is particularly appealing because (i) each step is structurally easier to optimize than the overall objective, (ii) it guarantees a monotonically improving objective under certain theoretical conditions, and (iii) it reuses samples between iterations thus lowering sample complexity.
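The mechanics of iterative lower bound optimization can be illustrated with a toy sketch (this is not the paper's DRP objective, and the surrogate below is an assumed proximal-style bound chosen only for illustration): repeatedly maximize a surrogate that lower-bounds the objective and touches it at the current iterate, so each step can only improve the true objective.

```python
def f(x):
    # Toy objective to maximize (the paper's objective is a DRP value, not this).
    return -(x - 3.0) ** 2

def ilbo_step(x_t, c=2.0):
    # Surrogate g(x | x_t) = f(x) - c * (x - x_t)**2 lower-bounds f and is
    # tight at x_t; for this quadratic f its maximizer has a closed form.
    return (3.0 + c * x_t) / (1.0 + c)

x, vals = 0.0, []
for _ in range(30):
    vals.append(f(x))
    x = ilbo_step(x)
vals.append(f(x))
# Monotone improvement: each surrogate maximization can only increase f.
assert all(b >= a for a, b in zip(vals, vals[1:]))
```

The monotonicity property (iii) falls out directly: the surrogate equals f at the current iterate and never exceeds f elsewhere, so any surrogate improvement is an objective improvement.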

TransCAM: Transformer Attention-based CAM Refinement for Weakly Supervised Semantic Segmentation

1 code implementation • 14 Mar 2022 • Ruiwen Li, Zheda Mai, Chiheb Trabelsi, Zhibo Zhang, Jongseong Jang, Scott Sanner

In this paper, we propose TransCAM, a Conformer-based solution to WSSS that explicitly leverages the attention weights from the transformer branch of the Conformer to refine the CAM generated from the CNN branch.

Weakly-Supervised Semantic Segmentation
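The core refinement step can be sketched as a matrix product of attention weights over flattened CAM maps; the shapes and the uniform random inputs here are illustrative placeholders, not the Conformer's actual tensors.

```python
import numpy as np

rng = np.random.default_rng(0)
H = W = 14              # spatial grid of CAM tokens (illustrative size)
C = 21                  # number of foreground classes (illustrative)
N = H * W

cam = rng.random((N, C))       # CNN-branch CAM, flattened to N tokens x C classes
attn = rng.random((N, N))      # transformer attention, e.g. averaged over heads
attn /= attn.sum(axis=1, keepdims=True)   # row-stochastic: each row sums to 1

refined = attn @ cam                       # propagate class scores along attention affinities
refined_maps = refined.T.reshape(C, H, W)  # back to per-class spatial maps
```

Each token's refined class score becomes an attention-weighted average of all tokens' CNN scores, which is how attention affinities can spread activation beyond the most discriminative regions.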

Unintended Bias in Language Model-driven Conversational Recommendation

no code implementations • 17 Jan 2022 • Tianshu Shen, Jiaru Li, Mohamed Reda Bouadjenek, Zheda Mai, Scott Sanner

Conversational Recommendation Systems (CRSs) have recently started to leverage pretrained language models (LMs) such as BERT for their ability to semantically interpret a wide range of preference statement variations.

Language Modelling • Pretrained Language Models +1

ExCon: Explanation-driven Supervised Contrastive Learning for Image Classification

1 code implementation • 28 Nov 2021 • Zhibo Zhang, Jongseong Jang, Chiheb Trabelsi, Ruiwen Li, Scott Sanner, Yeonjeong Jeong, Dongsub Shim

Contrastive learning has led to substantial improvements in the quality of learned embedding representations for tasks such as image classification.

Adversarial Robustness • Classification +2

Multi-axis Attentive Prediction for Sparse Event Data: An Application to Crime Prediction

1 code implementation • 5 Oct 2021 • Yi Sui, Ga Wu, Scott Sanner

We additionally introduce a novel Frobenius norm-based contrastive learning objective to improve latent representational generalization. Empirically, we validate MAPSED on two publicly accessible urban crime datasets for spatiotemporal sparse event prediction, where MAPSED outperforms both classical and state-of-the-art deep learning models.

Contrastive Learning • Crime Prediction

Planning with Learned Binarized Neural Networks Benchmarks for MaxSAT Evaluation 2021

no code implementations • 2 Aug 2021 • Buser Say, Scott Sanner, Jo Devriendt, Jakob Nordström, Peter J. Stuckey

This document provides a brief introduction to the learned automated planning problem where the state transition function takes the form of a binarized neural network (BNN), presents a general MaxSAT encoding for this problem, and describes the four benchmark domains submitted to MaxSAT Evaluation 2021: Navigation, Inventory Control, System Administrator, and Cellda.

RAPTOR: End-to-end Risk-Aware MDP Planning and Policy Learning by Backpropagation

no code implementations • 14 Jun 2021 • Noah Patton, Jihwan Jeong, Michael Gimelfarb, Scott Sanner

The direct optimization of this empirical objective in an end-to-end manner is called the risk-averse straight-line plan, which commits to a sequence of actions in advance and can be sub-optimal in highly stochastic domains.

Risk-Aware Transfer in Reinforcement Learning using Successor Features

no code implementations • NeurIPS 2021 • Michael Gimelfarb, André Barreto, Scott Sanner, Chi-Guhn Lee

Sample efficiency and risk-awareness are central to the development of practical reinforcement learning (RL) for complex decision-making.

Decision Making • reinforcement-learning +1

Supervised Contrastive Replay: Revisiting the Nearest Class Mean Classifier in Online Class-Incremental Continual Learning

1 code implementation • 22 Mar 2021 • Zheda Mai, Ruiwen Li, Hyunwoo Kim, Scott Sanner

Online class-incremental continual learning (CL) studies the problem of learning new classes continually from an online non-stationary data stream, intending to adapt to new data while mitigating catastrophic forgetting.

Continual Learning • online learning
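The nearest class mean (NCM) classifier revisited here is simple enough to sketch in a few lines: classify an embedding by its closest per-class mean. The 2-D embeddings and class means below are hypothetical stand-ins for the means a continual learner would compute from replayed exemplars.

```python
import numpy as np

def ncm_predict(x, class_means):
    # Assign x to the class whose (embedding-space) mean is nearest in L2 distance.
    classes = sorted(class_means)
    dists = [np.linalg.norm(x - class_means[c]) for c in classes]
    return classes[int(np.argmin(dists))]

# Per-class means, e.g. computed from replayed exemplar embeddings (toy values).
means = {0: np.array([0.0, 0.0]), 1: np.array([5.0, 5.0])}
print(ncm_predict(np.array([4.2, 4.8]), means))  # → 1
```

Because predictions depend only on class means rather than a trained output layer, the classifier sidesteps the bias toward recently seen classes that a softmax head accumulates in class-incremental training.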

Online Continual Learning in Image Classification: An Empirical Survey

1 code implementation • 25 Jan 2021 • Zheda Mai, Ruiwen Li, Jihwan Jeong, David Quispe, Hyunwoo Kim, Scott Sanner

To better understand the relative advantages of various approaches and the settings where they work best, this survey aims to (1) compare state-of-the-art methods such as MIR, iCARL, and GDumb and determine which works best at different experimental settings; (2) determine if the best class-incremental methods are also competitive in the domain-incremental setting; (3) evaluate the performance of seven simple but effective tricks, such as the "review" trick and the nearest class mean (NCM) classifier, to assess their relative impact.

Classification • Continual Learning +2

Attentive Autoencoders for Multifaceted Preference Learning in One-class Collaborative Filtering

no code implementations • 24 Oct 2020 • Zheda Mai, Ga Wu, Kai Luo, Scott Sanner

In order to capture multifaceted user preferences, existing recommender systems either increase the encoding complexity or extend the latent representation dimension.

Collaborative Filtering • Recommendation Systems

Online Class-Incremental Continual Learning with Adversarial Shapley Value

1 code implementation • 31 Aug 2020 • Dongsub Shim, Zheda Mai, Jihwan Jeong, Scott Sanner, Hyunwoo Kim, Jongseong Jang

As image-based deep learning becomes pervasive on every device, from cell phones to smart watches, there is a growing need to develop methods that continually learn from data while minimizing memory footprint and power consumption.

Continual Learning

Noise Contrastive Estimation for Autoencoding-based One-Class Collaborative Filtering

no code implementations • 3 Aug 2020 • Jin Peng Zhou, Ga Wu, Zheda Mai, Scott Sanner

One-class collaborative filtering (OC-CF) is a common class of recommendation problem where only the positive class is explicitly observed (e.g., purchases, clicks).

Collaborative Filtering

Batch-level Experience Replay with Review for Continual Learning

1 code implementation • 11 Jul 2020 • Zheda Mai, Hyunwoo Kim, Jihwan Jeong, Scott Sanner

Continual learning is a branch of deep learning that seeks to strike a balance between learning stability and plasticity.

Continual Learning

ε-BMC: A Bayesian Ensemble Approach to Epsilon-Greedy Exploration in Model-Free Reinforcement Learning

1 code implementation • 2 Jul 2020 • Michael Gimelfarb, Scott Sanner, Chi-Guhn Lee

Resolving the exploration-exploitation trade-off remains a fundamental problem in the design and implementation of reinforcement learning (RL) algorithms.
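For reference, the ε-greedy rule whose exploration rate ε-BMC adapts looks like the sketch below; the paper's contribution is choosing epsilon from a Bayesian ensemble posterior, which this minimal version does not implement.

```python
import random

def epsilon_greedy(q_values, epsilon):
    # With probability epsilon, explore uniformly at random;
    # otherwise exploit the action with the highest Q-value.
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

print(epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0))  # → 1 (pure exploitation)
```

With epsilon fixed, the schedule must be hand-tuned per domain; adapting it from data is precisely what motivates the Bayesian ensemble treatment.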

Bayesian Experience Reuse for Learning from Multiple Demonstrators

no code implementations • 10 Jun 2020 • Michael Gimelfarb, Scott Sanner, Chi-Guhn Lee

We demonstrate the effectiveness of this approach for static optimization of smooth functions, and transfer learning in a high-dimensional supply chain problem with cost uncertainty.

Transfer Learning

Contextual Policy Transfer in Reinforcement Learning Domains via Deep Mixtures-of-Experts

no code implementations • 29 Feb 2020 • Michael Gimelfarb, Scott Sanner, Chi-Guhn Lee

In this paper, we assume knowledge of estimated source task dynamics and policies, and common sub-goals but different dynamics.

OpenAI Gym • Q-Learning +1

Optimizing Search API Queries for Twitter Topic Classifiers Using a Maximum Set Coverage Approach

no code implementations • 23 Apr 2019 • Kasra Safari, Scott Sanner

Thus, it is critically important to query the Twitter API relative to the intended topical classifier in a way that minimizes the amount of negatively classified data retrieved.

Reward Potentials for Planning with Learned Neural Network Transition Models

no code implementations • 19 Apr 2019 • Buser Say, Scott Sanner, Sylvie Thiébaux

We then strengthen the linear relaxation of the underlying MILP model by introducing constraints to bound the reward function based on the precomputed reward potentials.

Scalable Planning with Deep Neural Network Learned Transition Models

no code implementations • 5 Apr 2019 • Ga Wu, Buser Say, Scott Sanner

But there remains one major problem for the task of control -- how can we plan with deep network learned transition models without resorting to Monte Carlo Tree Search and other black-box transition model techniques that ignore model structure and do not easily extend to mixed discrete and continuous domains?

Reinforcement Learning with Multiple Experts: A Bayesian Model Combination Approach

no code implementations • NeurIPS 2018 • Michael Gimelfarb, Scott Sanner, Chi-Guhn Lee

Potential based reward shaping is a powerful technique for accelerating convergence of reinforcement learning algorithms.

reinforcement-learning
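The potential-based shaping term the paper builds on has a standard closed form (Ng et al., 1999); the weighted combination of expert potentials below is a hypothetical stand-in for the paper's Bayesian model combination, with assumed weights rather than a learned posterior.

```python
def shaped_reward(r, s, s_next, potential, gamma=0.99):
    # F(s, s') = gamma * Phi(s') - Phi(s); adding F to the reward
    # preserves the optimal policy of the underlying MDP.
    return r + gamma * potential(s_next) - potential(s)

def combined_potential(experts, weights):
    # Hypothetical posterior-weighted mixture of expert potential functions.
    return lambda s: sum(w * phi(s) for w, phi in zip(weights, experts))

phi = combined_potential([lambda s: s, lambda s: 2.0 * s], [0.5, 0.5])
print(shaped_reward(0.0, 1.0, 2.0, phi, gamma=1.0))  # → 1.5
```

Because any potential function yields policy-invariant shaping, weighting multiple expert potentials is safe even when some experts are unreliable; the open question the paper addresses is how to weight them from experience.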

Compact and Efficient Encodings for Planning in Factored State and Action Spaces with Learned Binarized Neural Network Transition Models

no code implementations • 26 Nov 2018 • Buser Say, Scott Sanner

In this paper, we leverage the efficiency of Binarized Neural Networks (BNNs) to learn complex state transition models of planning domains with discretized factored state and action spaces.

Aesthetic Features for Personalized Photo Recommendation

no code implementations • 31 Aug 2018 • Yu Qing Zhou, Ga Wu, Scott Sanner, Putra Manggala

Many photography websites such as Flickr, 500px, Unsplash, and Adobe Behance are used by amateur and professional photography enthusiasts.

Collaborative Filtering • Image Retrieval +1

Conditional Inference in Pre-trained Variational Autoencoders via Cross-coding

1 code implementation • ICLR 2019 • Ga Wu, Justin Domke, Scott Sanner

Variational Autoencoders (VAEs) are a popular generative model, but one in which conditional inference can be challenging.

Scalable Planning with Tensorflow for Hybrid Nonlinear Domains

no code implementations • NeurIPS 2017 • Ga Wu, Buser Say, Scott Sanner

Given recent deep learning results that demonstrate the ability to effectively optimize high-dimensional non-convex functions with gradient descent optimization on GPUs, we ask in this paper whether symbolic gradient optimization tools such as Tensorflow can be effective for planning in hybrid (mixed discrete and continuous) nonlinear domains with high-dimensional state and action spaces.

Stochastic Planning and Lifted Inference

no code implementations • 4 Jan 2017 • Roni Khardon, Scott Sanner

Lifted probabilistic inference (Poole, 2003) and symbolic dynamic programming for lifted stochastic planning (Boutilier et al., 2001) were introduced around the same time as algorithmic efforts to use abstraction in stochastic systems.

Decision Making

Expecting to be HIP: Hawkes Intensity Processes for Social Media Popularity

1 code implementation • 19 Feb 2016 • Marian-Andrei Rizoiu, Lexing Xie, Scott Sanner, Manuel Cebrian, Honglin Yu, Pascal Van Hentenryck

Modeling and predicting the popularity of online content is a significant problem for the practice of information dissemination, advertising, and consumption.

Social and Information Networks
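A minimal discrete-time sketch of a Hawkes-style self-exciting intensity conveys the modeling idea: current popularity is exogenous input plus a power-law-weighted sum of past popularity. The parameter names and values below are illustrative assumptions, not fitted HIP parameters.

```python
def hip_series(s, mu=1.0, C=0.5, c=1.0, theta=0.5):
    # xi[t] = mu * s[t] + C * sum_{k<t} xi[k] * (t - k + c)^(-(1 + theta))
    # s: exogenous input series (e.g. promotion); xi: resulting popularity.
    xi = []
    for t in range(len(s)):
        endo = C * sum(xi[k] * (t - k + c) ** (-(1.0 + theta)) for k in range(t))
        xi.append(mu * s[t] + endo)
    return xi

series = hip_series([1.0, 0.0, 0.0, 0.0])  # single exogenous impulse at t = 0
```

A single impulse of attention produces a decaying tail of endogenously generated popularity, which is the self-exciting behavior the model uses to explain and forecast content popularity.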

Bounded Approximate Symbolic Dynamic Programming for Hybrid MDPs

no code implementations • 26 Sep 2013 • Luis Gustavo Vianna, Scott Sanner, Leliane Nunes de Barros

Recent advances in symbolic dynamic programming (SDP) combined with the extended algebraic decision diagram (XADD) data structure have provided exact solutions for mixed discrete and continuous (hybrid) MDPs with piecewise linear dynamics and continuous actions.

Symbolic Dynamic Programming for Continuous State and Observation POMDPs

no code implementations • NeurIPS 2012 • Zahra Zamani, Scott Sanner, Pascal Poupart, Kristian Kersting

In recent years, point-based value iteration methods have proven to be extremely effective techniques for finding (approximately) optimal dynamic programming solutions to POMDPs when an initial set of belief states is known.

Decision Making

Gaussian Process Preference Elicitation

no code implementations • NeurIPS 2010 • Shengbo Guo, Scott Sanner, Edwin V. Bonilla

Bayesian approaches to preference elicitation (PE) are particularly attractive due to their ability to explicitly model uncertainty in users' latent utility functions.
