Search Results for author: Yuan Zhou

Found 71 papers, 13 papers with code

Imitation Learning from Observations under Transition Model Disparity

1 code implementation ICLR 2022 Tanmay Gangwani, Yuan Zhou, Jian Peng

In this work, we propose an algorithm that trains an intermediary policy in the learner environment and uses it as a surrogate expert for the learner.

Imitation Learning

Self-normalized Classification of Parkinson's Disease DaTscan Images

no code implementations 27 Dec 2021 Yuan Zhou, Hemant D. Tagare

Classifying SPECT images requires a preprocessing step which normalizes the images using a normalization region.

Classification

A Survey on Scenario-Based Testing for Automated Driving Systems in High-Fidelity Simulation

no code implementations 2 Dec 2021 Ziyuan Zhong, Yun Tang, Yuan Zhou, Vania de Oliveira Neves, Yang Liu, Baishakhi Ray

To bridge this gap, in this work, we provide a generic formulation of scenario-based testing in high-fidelity simulation and conduct a literature review on the existing works.

Learning Long-Term Reward Redistribution via Randomized Return Decomposition

1 code implementation ICLR 2022 Zhizhou Ren, Ruihan Guo, Yuan Zhou, Jian Peng

Based on this framework, this paper proposes a novel reward redistribution algorithm, randomized return decomposition (RRD), to learn a proxy reward function for episodic reinforcement learning.

reinforcement-learning

Fairness-aware Online Price Discrimination with Nonparametric Demand Models

no code implementations 16 Nov 2021 Xi Chen, Xuan Zhang, Yuan Zhou

To handle this general class, we propose a soft fairness constraint and develop the dynamic pricing policy that achieves $\tilde{O}(T^{4/5})$ regret.

Fairness online learning

Full-attention based Neural Architecture Search using Context Auto-regression

no code implementations 13 Nov 2021 Yuan Zhou, Haiyang Wang, Shuwei Huo, Boyu Wang

Thus, it is appropriate to consider using NAS methods to discover a better self-attention architecture automatically.

Fine-Grained Image Recognition Image Classification +3

SimCVD: Simple Contrastive Voxel-Wise Representation Distillation for Semi-Supervised Medical Image Segmentation

no code implementations 13 Aug 2021 Chenyu You, Yuan Zhou, Ruihan Zhao, Lawrence Staib, James S. Duncan

However, most existing learning-based approaches usually suffer from limited manually annotated medical data, which poses a major practical problem for accurate and robust medical image segmentation.

Data Augmentation Image Generation +3

Few-shot Learning with Global Relatedness Decoupled-Distillation

no code implementations 12 Jul 2021 Yuan Zhou, Yanrong Guo, Shijie Hao, Richang Hong, ZhengJun Zha, Meng Wang

To overcome these problems, we propose a new Global Relatedness Decoupled-Distillation (GRDD) method using the global category knowledge and the Relatedness Decoupled-Distillation (RDD) strategy.

Few-Shot Learning Metric Learning

Coordinate-wise Control Variates for Deep Policy Gradients

no code implementations 11 Jul 2021 Yuanyi Zhong, Yuan Zhou, Jian Peng

The control variates (CV) method is widely used in policy gradient estimation to reduce the variance of the gradient estimators in practice.

Continuous Control

Few-shot Partial Multi-view Learning

no code implementations 5 May 2021 Yuan Zhou, Yanrong Guo, Shijie Hao, Richang Hong, Jiebo Luo

The challenges of this task are twofold: (1) under the interference of the missing views, it is difficult to overcome the negative impact brought by data scarcity; (2) the limited number of data exacerbates information scarcity, thereby making it harder to address the view-missing problem.

Few-Shot Learning MULTI-VIEW LEARNING

Adversarial Sample Enhanced Domain Adaptation: A Case Study on Predictive Modeling with Electronic Health Records

no code implementations 13 Jan 2021 Yiqin Yu, Pin-Yu Chen, Yuan Zhou, Jing Mei

With the successful adoption of machine learning on electronic health records (EHRs), numerous computational models have been deployed to address a variety of clinical problems.

Data Augmentation Domain Adaptation

Almost Optimal Model-Free Reinforcement Learning via Reference-Advantage Decomposition

no code implementations NeurIPS 2020 Zihan Zhang, Yuan Zhou, Xiangyang Ji

We study the reinforcement learning problem in the setting of finite-horizon episodic Markov Decision Processes (MDPs) with $S$ states, $A$ actions, and episode length $H$. We propose a model-free algorithm UCB-ADVANTAGE and prove that it achieves $\tilde{O}(\sqrt{H^2 SAT})$ regret where $T=KH$ and $K$ is the number of episodes to play.

reinforcement-learning

Harnessing Distribution Ratio Estimators for Learning Agents with Quality and Diversity

1 code implementation 5 Nov 2020 Tanmay Gangwani, Jian Peng, Yuan Zhou

Quality-Diversity (QD) is a concept from Neuroevolution with some intriguing applications to Reinforcement Learning.

reinforcement-learning

Learning Guidance Rewards with Trajectory-space Smoothing

2 code implementations NeurIPS 2020 Tanmay Gangwani, Yuan Zhou, Jian Peng

To make credit assignment easier, recent works have proposed algorithms to learn dense "guidance" rewards that could be used in place of the sparse or delayed environmental rewards.

Q-Learning

Unsupervised Self-training Algorithm Based on Deep Learning for Optical Aerial Images Change Detection

no code implementations 15 Oct 2020 Yuan Zhou, Xiangrui Li

Then two sets of pseudo labels are used to jointly train a student network with the same structure as the teacher.

Change Detection

Symbolic Parallel Adaptive Importance Sampling for Probabilistic Program Analysis

1 code implementation 10 Oct 2020 Yicheng Luo, Antonio Filieri, Yuan Zhou

Probabilistic software analysis aims at quantifying the probability of a target event occurring during the execution of a program processing uncertain incoming data or written itself using probabilistic programming constructs.

Probabilistic Programming

Probabilistic Programs with Stochastic Conditioning

1 code implementation 1 Oct 2020 David Tolpin, Yuan Zhou, Tom Rainforth, Hongseok Yang

We tackle the problem of conditioning probabilistic programs on distributions of observable variables.

Probabilistic Programming

Bayesian Policy Search for Stochastic Domains

no code implementations 1 Oct 2020 David Tolpin, Yuan Zhou, Hongseok Yang

In this work, we cast policy search in stochastic domains as a Bayesian inference problem and provide a scheme for encoding such problems as nested probabilistic programs.

Probabilistic Programming Variational Inference

Near-Optimal MNL Bandits Under Risk Criteria

no code implementations 26 Sep 2020 Guangyu Xi, Chao Tao, Yuan Zhou

We study MNL bandits, which is a variant of the traditional multi-armed bandit problem, under risk criteria.

Efficient Competitive Self-Play Policy Optimization

no code implementations 13 Sep 2020 Yuanyi Zhong, Yuan Zhou, Jian Peng

Reinforcement learning from self-play has recently reported many successes.

reinforcement-learning

Pooling Regularized Graph Neural Network for fMRI Biomarker Analysis

no code implementations 29 Jul 2020 Xiaoxiao Li, Yuan Zhou, Nicha C. Dvornek, Muhan Zhang, Juntang Zhuang, Pamela Ventola, James S. Duncan

We propose an interpretable GNN framework with a novel salient region selection mechanism to determine neurological brain biomarkers associated with disorders.

Graph Neural Network for Video Relocalization

no code implementations 20 Jul 2020 Yuan Zhou, Mingfei Wang, Ruolin Wang, Shuwei Huo

In this paper, we focus on the video relocalization task, which uses a query video clip as input to retrieve a semantically related video clip in another untrimmed long video.

Moment Retrieval

Multinomial Logit Bandit with Low Switching Cost

no code implementations ICML 2020 Kefan Dong, Yingkai Li, Qin Zhang, Yuan Zhou

We also present the ESUCB algorithm with item switching cost $O(N \log^2 T)$.

Linear Bandits with Limited Adaptivity and Learning Distributional Optimal Design

no code implementations 4 Jul 2020 Yufei Ruan, Jiaqi Yang, Yuan Zhou

Motivated by practical needs such as large-scale learning, we study the impact of adaptivity constraints on linear contextual bandits, a central problem in online active learning.

Active Learning Multi-Armed Bandits

Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample Complexity

no code implementations 6 Jun 2020 Zihan Zhang, Yuan Zhou, Xiangyang Ji

In this paper we consider the problem of learning an $\epsilon$-optimal policy for a discounted Markov Decision Process (MDP).

reinforcement-learning

Adaptive Double-Exploration Tradeoff for Outlier Detection

no code implementations 13 May 2020 Xiaojin Zhang, Honglei Zhuang, Shengyu Zhang, Yuan Zhou

We study a variant of the thresholding bandit problem (TBP) in the context of outlier detection, where the objective is to identify the outliers whose rewards are above a threshold.

Outlier Detection

Almost Optimal Model-Free Reinforcement Learning via Reference-Advantage Decomposition

no code implementations 21 Apr 2020 Zihan Zhang, Yuan Zhou, Xiangyang Ji

We study the reinforcement learning problem in the setting of finite-horizon episodic Markov Decision Processes (MDPs) with $S$ states, $A$ actions, and episode length $H$.

reinforcement-learning

Collaborative Top Distribution Identifications with Limited Interaction

no code implementations 20 Apr 2020 Nikolai Karpov, Qin Zhang, Yuan Zhou

We give optimal time-round tradeoffs, as well as demonstrate complexity separations between top-$1$ arm identification and top-$m$ arm identifications for general $m$ and between fixed-time and fixed-confidence variants.

Stochastically Differentiable Probabilistic Programs

no code implementations 2 Mar 2020 David Tolpin, Yuan Zhou, Hongseok Yang

Probabilistic programs with mixed support (both continuous and discrete latent random variables) commonly appear in many probabilistic programming systems (PPSs).

Probabilistic Programming

Anypath Routing Protocol Design via Q-Learning for Underwater Sensor Networks

no code implementations 22 Feb 2020 Yuan Zhou, Tao Cao, Wei Xiang

As a promising technology in the Internet of Underwater Things, underwater sensor networks have drawn widespread attention from both academia and industry.

Q-Learning

Domain Adaptive Adversarial Learning Based on Physics Model Feedback for Underwater Image Enhancement

no code implementations 20 Feb 2020 Yuan Zhou, Kangming Yan

Owing to refraction, absorption, and scattering of light by suspended particles in water, raw underwater images suffer from low contrast, blurred details, and color distortion.

Domain Adaptation Image Enhancement

Exploiting Operation Importance for Differentiable Neural Architecture Search

no code implementations 24 Nov 2019 Xukai Xie, Yuan Zhou, Sun-Yuan Kung

All the existing methods determine the importance of each operation directly by architecture weights.

Neural Architecture Search

Furnishing Your Room by What You See: An End-to-End Furniture Set Retrieval Framework with Rich Annotated Benchmark Dataset

no code implementations 21 Nov 2019 Bingyuan Liu, Jiantao Zhang, Xiaoting Zhang, Wei Zhang, Chuanhui Yu, Yuan Zhou

However, few works focus on understanding furniture within scenes, and a large-scale dataset to advance the field is also lacking.

Temporal Action Localization using Long Short-Term Dependency

no code implementations 4 Nov 2019 Yuan Zhou, Hongru Li, Sun-Yuan Kung

In the present study, we developed a novel method, referred to as Gemini Network, for effective modeling of temporal structures and achieving high-performance temporal action localization.

Temporal Action Localization

Cross-Scale Residual Network for Multiple Tasks: Image Super-resolution, Denoising, and Deblocking

no code implementations 4 Nov 2019 Yuan Zhou, Xiaoting Du, Yeda Zhang, Sun-Yuan Kung

To this end, we propose the cross-scale residual network to exploit scale-related features and the inter-task correlations among the three tasks.

Denoising Image Restoration +1

Comb Convolution for Efficient Convolutional Architecture

no code implementations 1 Nov 2019 Dandan Li, Yuan Zhou, Shuwei Huo, Sun-Yuan Kung

Convolutional neural networks (CNNs) inherently suffer from massively redundant computation (FLOPs) due to the dense connection pattern between feature maps and convolution kernels.

$\sqrt{n}$-Regret for Learning in Markov Decision Processes with Function Approximation and Low Bellman Rank

no code implementations 5 Sep 2019 Kefan Dong, Jian Peng, Yining Wang, Yuan Zhou

Our learning algorithm, Adaptive Value-function Elimination (AVE), is inspired by the policy elimination algorithm proposed in (Jiang et al., 2017), known as OLIVE.

Efficient Exploration online learning

Dual-reference Age Synthesis

no code implementations 7 Aug 2019 Yuan Zhou, Bingzhang Hu, Jun He, Yu Guan, Ling Shao

Age synthesis methods typically take a single image as input and use a specific number to control the age of the generated image.

Exploration via Hindsight Goal Generation

1 code implementation NeurIPS 2019 Zhizhou Ren, Kefan Dong, Yuan Zhou, Qiang Liu, Jian Peng

Goal-oriented reinforcement learning has recently been a practical framework for robotic manipulation tasks, in which an agent is required to reach a certain goal defined by a function on the state space.

reinforcement-learning

HGC: Hierarchical Group Convolution for Highly Efficient Neural Network

no code implementations 9 Jun 2019 Xukai Xie, Yuan Zhou, Sun-Yuan Kung

Using this operation, feature maps of different groups cannot communicate, which restricts their representation capability.

Thresholding Bandit with Optimal Aggregate Regret

no code implementations NeurIPS 2019 Chao Tao, Saúl Blanco, Jian Peng, Yuan Zhou

We consider the thresholding bandit problem, whose goal is to find arms of mean rewards above a given threshold $\theta$, with a fixed budget of $T$ trials.

Tight Regret Bounds for Infinite-armed Linear Contextual Bandits

no code implementations 4 May 2019 Yingkai Li, Yining Wang, Xi Chen, Yuan Zhou

Linear contextual bandit is an important class of sequential decision making problems with a wide range of applications to recommender systems, online advertising, healthcare, and many other machine learning related tasks.

Decision Making Multi-Armed Bandits +1

Collaborative Learning with Limited Interaction: Tight Bounds for Distributed Exploration in Multi-Armed Bandits

no code implementations 5 Apr 2019 Chao Tao, Qin Zhang, Yuan Zhou

Best arm identification (or, pure exploration) in multi-armed bandits is a fundamental problem in machine learning.

Multi-Armed Bandits

LF-PPL: A Low-Level First Order Probabilistic Programming Language for Non-Differentiable Models

1 code implementation 6 Mar 2019 Yuan Zhou, Bradley J. Gram-Hansen, Tobias Kohn, Tom Rainforth, Hongseok Yang, Frank Wood

We develop a new Low-level, First-order Probabilistic Programming Language (LF-PPL) suited for models containing a mix of continuous, discrete, and/or piecewise-continuous variables.

Probabilistic Programming

Efficient Interpretation of Deep Learning Models Using Graph Structure and Cooperative Game Theory: Application to ASD Biomarker Discovery

no code implementations 14 Dec 2018 Xiaoxiao Li, Nicha C. Dvornek, Yuan Zhou, Juntang Zhuang, Pamela Ventola, James S. Duncan

Cooperative game theory is advantageous here because it directly considers the interaction between features and can be applied to any machine learning method, making it a novel, more accurate way of determining instance-wise biomarker importance from deep learning models.

Feature Importance

Near-Optimal Policies for Dynamic Multinomial Logit Assortment Selection Models

no code implementations NeurIPS 2018 Yining Wang, Xi Chen, Yuan Zhou

In this paper we consider the dynamic assortment selection problem under an uncapacitated multinomial-logit (MNL) model.

Dynamic Assortment Optimization with Changing Contextual Information

no code implementations 31 Oct 2018 Xi Chen, Yining Wang, Yuan Zhou

To this end, we develop an upper confidence bound (UCB) based policy and establish the regret bound on the order of $\widetilde O(d\sqrt{T})$, where $d$ is the dimension of the feature and $\widetilde O$ suppresses logarithmic dependence.

Combinatorial Optimization

On Exploration, Exploitation and Learning in Adaptive Importance Sampling

no code implementations 31 Oct 2018 Xiaoyu Lu, Tom Rainforth, Yuan Zhou, Jan-Willem van de Meent, Yee Whye Teh

We study adaptive importance sampling (AIS) as an online learning problem and argue for the importance of the trade-off between exploration and exploitation in this adaptation.

online learning

Best Arm Identification in Linear Bandits with Linear Dimension Dependency

no code implementations ICML 2018 Chao Tao, Saúl Blanco, Yuan Zhou

We study the best arm identification problem in linear bandits, where the mean reward of each arm depends linearly on an unknown $d$-dimensional parameter vector $\theta$, and the goal is to identify the arm with the largest expected reward.

Dynamic Assortment Selection under the Nested Logit Models

no code implementations 27 Jun 2018 Xi Chen, Chao Shi, Yining Wang, Yuan Zhou

One key challenge is that utilities of products are unknown to the seller and need to be learned.

Inference Trees: Adaptive Inference with Exploration

no code implementations 25 Jun 2018 Tom Rainforth, Yuan Zhou, Xiaoyu Lu, Yee Whye Teh, Frank Wood, Hongseok Yang, Jan-Willem van de Meent

We introduce inference trees (ITs), a new class of inference methods that build on ideas from Monte Carlo tree search to perform adaptive sampling in a manner that balances exploration with exploitation, ensures consistency, and alleviates pathologies in existing adaptive methods.

Channel Gating Neural Networks

1 code implementation NeurIPS 2019 Weizhe Hua, Yuan Zhou, Christopher De Sa, Zhiru Zhang, G. Edward Suh

Combining our method with knowledge distillation reduces the compute cost of ResNet-18 by 2.6$\times$ without accuracy drop on ImageNet.

Knowledge Distillation Network Pruning

Tight Bounds for Collaborative PAC Learning via Multiplicative Weights

no code implementations NeurIPS 2018 Jiecao Chen, Qin Zhang, Yuan Zhou

We study the collaborative PAC learning problem recently proposed in Blum et al. [BHPQ17], in which we have $k$ players and they want to learn a target function collaboratively, such that the learned function approximates the target function well on all players' distributions simultaneously.

PAC learning

Hamiltonian Monte Carlo for Probabilistic Programs with Discontinuities

1 code implementation 7 Apr 2018 Bradley Gram-Hansen, Yuan Zhou, Tobias Kohn, Tom Rainforth, Hongseok Yang, Frank Wood

Hamiltonian Monte Carlo (HMC) is arguably the dominant statistical inference algorithm used in most popular "first-order differentiable" Probabilistic Programming Languages (PPLs).

Probabilistic Programming

Unmixing urban hyperspectral imagery with a Gaussian mixture model on endmember variability

no code implementations 25 Jan 2018 Yuan Zhou, Erin B. Wetherley, Paul D. Gader

Spectral libraries were built by manually identifying and extracting pure spectra from both resolution images, resulting in 3,287 spectra at 16 m and 15,426 spectra at 4 m. We then unmixed ROIs of each resolution using the following unmixing algorithms: the set-based algorithms MESMA and AAM, and the distribution-based algorithms GMM, NCM, and BCM.

A Gaussian mixture model representation of endmember variability in hyperspectral unmixing

1 code implementation 29 Sep 2017 Yuan Zhou, Anand Rangarajan, Paul D. Gader

We show, given the GMM starting premise, that the distribution of the mixed pixel (under the linear mixing model) is also a GMM (and this is shown from two perspectives).

Hyperspectral Unmixing

Adaptive Multiple-Arm Identification

no code implementations ICML 2017 Jiecao Chen, Xi Chen, Qin Zhang, Yuan Zhou

We study the problem of selecting $K$ arms with the highest expected rewards in a stochastic $n$-armed bandit game.

A spatial compositional model (SCM) for linear unmixing and endmember uncertainty estimation

no code implementations 30 Sep 2015 Yuan Zhou, Anand Rangarajan, Paul Gader

In this paper, we show that NCM can be used for calculating the uncertainty of the estimated endmembers with spatial priors incorporated for better unmixing.

Hyperspectral Unmixing
