Search Results for author: Yi Wan

Found 24 papers, 6 papers with code

Reward Centering

no code implementations • 16 May 2024 • Abhishek Naik, Yi Wan, Manan Tomar, Richard S. Sutton

We show that discounted methods for solving continuing reinforcement learning problems can perform significantly better if they center their rewards by subtracting out the rewards' empirical average.
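The centering step described in this snippet can be illustrated with a minimal sketch. It assumes the empirical average is a running mean updated incrementally; the class name `RewardCenterer` and the `step_size` parameter are hypothetical, chosen for illustration, and this is not presented as the authors' implementation.

```python
class RewardCenterer:
    """Tracks a running average of rewards and returns centered rewards.

    Hypothetical helper for illustration only.
    """

    def __init__(self, step_size=None):
        self.mean = 0.0             # current estimate of the average reward
        self.count = 0              # number of rewards seen so far
        self.step_size = step_size  # None means use the sample average (1/n)

    def center(self, reward):
        """Update the running average, then subtract it from the reward."""
        self.count += 1
        alpha = self.step_size if self.step_size is not None else 1.0 / self.count
        self.mean += alpha * (reward - self.mean)
        return reward - self.mean


centerer = RewardCenterer()
centered = [centerer.center(r) for r in [10.0, 12.0, 8.0]]
```

A discounted method would then learn from the centered rewards in place of the raw ones. A constant `step_size` may suit cases where the average drifts, though that choice is an assumption of this sketch.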


Distillation Matters: Empowering Sequential Recommenders to Match the Performance of Large Language Model

no code implementations • 1 May 2024 • Yu Cui, Feng Liu, Pengbo Wang, Bohao Wang, Heng Tang, Yi Wan, Jun Wang, Jiawei Chen

Owing to their powerful semantic reasoning capabilities, Large Language Models (LLMs) have been effectively utilized as recommenders, achieving impressive performance.

Knowledge Distillation, Language Modelling, +1

Light-weight Retinal Layer Segmentation with Global Reasoning

no code implementations • 25 Apr 2024 • Xiang He, Weiye Song, Yiming Wang, Fabio Poiesi, Ji Yi, Manishi Desai, Quanqing Xu, Kongzheng Yang, Yi Wan

Automatic retinal layer segmentation with medical images, such as optical coherence tomography (OCT) images, serves as an important tool for diagnosing ophthalmic diseases.

Decoder, Segmentation

ELA: Efficient Local Attention for Deep Convolutional Neural Networks

no code implementations • 2 Mar 2024 • Wei Xu, Yi Wan

The attention mechanism has gained significant recognition in the field of computer vision due to its ability to effectively enhance the performance of deep neural networks.

Dimensionality Reduction, Image Classification, +2

A Note on Stability in Asynchronous Stochastic Approximation without Communication Delays

no code implementations • 22 Dec 2023 • Huizhen Yu, Yi Wan, Richard S. Sutton

In this paper, we study asynchronous stochastic approximation algorithms without communication delays.


Survey on video anomaly detection in dynamic scenes with moving cameras

no code implementations • 14 Aug 2023 • Runyu Jiao, Yi Wan, Fabio Poiesi, Yiming Wang

The increasing popularity of compact and inexpensive cameras, e.g., dash cameras, body cameras, and cameras equipped on robots, has sparked a growing interest in detecting anomalies within dynamic scenes recorded by moving cameras.

Anomaly Detection, Video Anomaly Detection

Attentive Multimodal Fusion for Optical and Scene Flow

1 code implementation • 28 Jul 2023 • Youjie Zhou, Guofeng Mei, Yiming Wang, Fabio Poiesi, Yi Wan

This paper presents an investigation into the estimation of optical and scene flow using RGBD information in scenarios where the RGB modality is affected by noise or captured in dark environments.

Imbalance Knowledge-Driven Multi-modal Network for Land-Cover Semantic Segmentation Using Images and LiDAR Point Clouds

no code implementations • 28 Mar 2023 • Yameng Wang, Yi Wan, Yongjun Zhang, Bin Zhang, Zhi Gao

Existing multi-modal methods usually map high-dimensional features to low-dimensional spaces as a preprocessing step before feature extraction to address the non-negligible domain gap, which inevitably leads to information loss.

Semantic Segmentation

On Convergence of Average-Reward Off-Policy Control Algorithms in Weakly Communicating MDPs

no code implementations • 30 Sep 2022 • Yi Wan, Richard S. Sutton

We show that two average-reward off-policy control algorithms, Differential Q-learning (Wan, Naik, & Sutton 2021a) and RVI Q-learning (Abounadi, Bertsekas, & Borkar 2001), converge in weakly communicating MDPs.


Toward Discovering Options that Achieve Faster Planning

no code implementations • 25 May 2022 • Yi Wan, Richard S. Sutton

In a variant of the classic four-room domain, we show that 1) a higher objective value is typically associated with a smaller number of elementary planning operations used by the option-value iteration algorithm to obtain a near-optimal value function, 2) our algorithm achieves an objective value that matches that achieved by two human-designed options, 3) the amount of computation used by option-value iteration with options discovered by our algorithm matches that used with the human-designed options, and 4) the options produced by our algorithm also make intuitive sense: they seem to move to and terminate at the entrances of rooms.

Towards Evaluating Adaptivity of Model-Based Reinforcement Learning Methods

1 code implementation • 25 Apr 2022 • Yi Wan, Ali Rahimi-Kalahroudi, Janarthanan Rajendran, Ida Momennejad, Sarath Chandar, Harm van Seijen

We empirically validate these insights in the case of linear function approximation by demonstrating that a modified version of linear Dyna achieves effective adaptation to local changes.

Model-based Reinforcement Learning, reinforcement-learning, +1

LiDAR-guided Stereo Matching with a Spatial Consistency Constraint

no code implementations • 21 Feb 2022 • Yongjun Zhang, Siyuan Zou, Xinyi Liu, Xu Huang, Yi Wan, Yongxiang Yao

Next, we propose a riverbed enhancement function to optimize the cost volume of the LiDAR projection points and their homogeneous pixels to improve the matching robustness.

Stereo Matching

Loop closure detection using local 3D deep descriptors

1 code implementation • 31 Oct 2021 • Youjie Zhou, Yiming Wang, Fabio Poiesi, Qi Qin, Yi Wan

We compare our L3D-based loop closure approach with recent approaches on LiDAR data and achieve state-of-the-art loop closure detection accuracy.

Loop Closure Detection

Average-Reward Learning and Planning with Options

no code implementations • NeurIPS 2021 • Yi Wan, Abhishek Naik, Richard S. Sutton

We extend the options framework for temporal abstraction in reinforcement learning from discounted Markov decision processes (MDPs) to average-reward MDPs.

reinforcement-learning, Reinforcement Learning (RL)

Planning with Expectation Models for Control

no code implementations • 17 Apr 2021 • Katya Kudashkina, Yi Wan, Abhishek Naik, Richard S. Sutton

Our algorithms and experiments are the first to treat MBRL with expectation models in a general setting.

Model-based Reinforcement Learning

Average-Reward Off-Policy Policy Evaluation with Function Approximation

1 code implementation • 8 Jan 2021 • Shangtong Zhang, Yi Wan, Richard S. Sutton, Shimon Whiteson

We consider off-policy policy evaluation with function approximation (FA) in average-reward MDPs, where the goal is to estimate both the reward rate and the differential value function.

Detecting Log Anomalies with Multi-Head Attention (LAMA)

no code implementations • 7 Jan 2021 • Yicheng Guo, Yujin Wen, Congwei Jiang, Yixin Lian, Yi Wan

Anomaly detection is a crucial and challenging subject that has been studied within diverse research areas.

Anomaly Detection

Incremental Policy Gradients for Online Reinforcement Learning Control

no code implementations • 1 Jan 2021 • Kristopher De Asis, Alan Chan, Yi Wan, Richard S. Sutton

Our emphasis is on the first approach in this work, detailing an incremental policy gradient update which neither waits until the end of the episode, nor relies on learning estimates of the return.

Policy Gradient Methods, reinforcement-learning, +1

Learning and Planning in Average-Reward Markov Decision Processes

1 code implementation • 29 Jun 2020 • Yi Wan, Abhishek Naik, Richard S. Sutton

We introduce learning and planning algorithms for average-reward MDPs, including 1) the first general proven-convergent off-policy model-free control algorithm without reference states, 2) the first proven-convergent off-policy model-free prediction algorithm, and 3) the first off-policy learning algorithm that converges to the actual value function rather than to the value function plus an offset.

Off-policy Maximum Entropy Reinforcement Learning: Soft Actor-Critic with Advantage Weighted Mixture Policy (SAC-AWMP)

no code implementations • 7 Feb 2020 • Zhimin Hou, Kuangen Zhang, Yi Wan, Dongyu Li, Chenglong Fu, Haoyong Yu

A common way to solve this problem, known as Mixture-of-Experts, is to represent the policy as the weighted sum of multiple components, where different components perform well on different parts of the state space.
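The weighted-sum idea in this snippet can be sketched for a one-dimensional action. This is an illustration only: it assumes Gaussian components and softmax gating weights, and the function names and parameters are hypothetical. It is not the SAC-AWMP method itself.

```python
import math


def softmax(logits):
    """Turn arbitrary gating scores into weights that sum to 1."""
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_pdf(x, mean, std):
    """Density of a one-dimensional Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_policy_density(action, gate_logits, means, stds):
    """Policy density as a weighted sum of component densities.

    In practice gate_logits, means and stds would all be functions of the
    state; here they are passed in directly to keep the sketch small.
    """
    weights = softmax(gate_logits)
    return sum(
        w * gaussian_pdf(action, m, s)
        for w, m, s in zip(weights, means, stds)
    )


# Two components centered at -1 and +1, with the gate favoring the second.
density = mixture_policy_density(0.8, [0.0, 2.0], [-1.0, 1.0], [0.5, 0.5])
```

Because the weights depend on the state in a full implementation, a different component can dominate in different regions of the state space.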

Continuous Control

Planning with Expectation Models

no code implementations • 2 Apr 2019 • Yi Wan, Zaheer Abbas, Adam White, Martha White, Richard S. Sutton

In particular, we 1) show that planning with an expectation model is equivalent to planning with a distribution model if the state value function is linear in state features, 2) analyze two common parametrization choices for approximating the expectation: linear and non-linear expectation models, 3) propose a sound model-based policy evaluation algorithm and present its convergence results, and 4) empirically demonstrate the effectiveness of the proposed planning algorithm.
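The intuition behind claim 1) can be sketched with linearity of expectation. The notation is assumed here rather than taken from the listing: $\mathbf{x}(s)$ is the feature vector of state $s$, $\mathbf{w}$ is the weight vector, and $S'$ is the random next state.

```latex
% Assumption: the state value estimate is linear in the features,
% \hat{v}(s) = \mathbf{w}^\top \mathbf{x}(s).
\begin{align*}
\mathbb{E}\big[\hat{v}(S') \mid s, a\big]
  &= \mathbb{E}\big[\mathbf{w}^\top \mathbf{x}(S') \mid s, a\big] \\
  &= \mathbf{w}^\top \, \mathbb{E}\big[\mathbf{x}(S') \mid s, a\big]
\end{align*}
```

Under this assumption the expected next value depends on the next-state distribution only through the expected next feature vector, which is the quantity an expectation model would predict.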

Model-based Reinforcement Learning

Clustering Assisted Fundamental Matrix Estimation

no code implementations • 14 Apr 2015 • Hao Wu, Yi Wan

In computer vision, the estimation of the fundamental matrix is a basic problem that has been extensively studied.

3D Reconstruction, Clustering
