Search Results for author: Tian Xu

Found 20 papers, 5 papers with code

AI-driven platform for systematic nomenclature and intelligent knowledge acquisition of natural medicinal materials

3 code implementations 27 Dec 2023 Zijie Yang, Yongjing Yin, Chaojun Kong, Tiange Chi, Wufan Tao, Yue Zhang, Tian Xu

Natural Medicinal Materials (NMMs) have a long history of global clinical applications, accompanied by extensive informational records.

Machine Translation, Management

Policy Optimization in RLHF: The Impact of Out-of-preference Data

1 code implementation 17 Dec 2023 Ziniu Li, Tian Xu, Yang Yu

These methods, either explicitly or implicitly, learn a reward model from preference data and differ in the data used for policy optimization to unlock the generalization ability of the reward model.

Reward-Consistent Dynamics Models are Strongly Generalizable for Offline Reinforcement Learning

no code implementations 9 Oct 2023 Fan-Ming Luo, Tian Xu, Xingchen Cao, Yang Yu

MOREC learns a generalizable dynamics reward function from offline data, which is subsequently employed as a transition filter in any offline MBRL method: when generating transitions, the dynamics model generates a batch of transitions and selects the one with the highest dynamics reward value.

D4RL, Model-based Reinforcement Learning +1
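
The transition filter described in the MOREC snippet above can be sketched roughly as follows. This is only an illustration of the "sample a batch, keep the highest-scoring transition" idea, not the authors' implementation: the scalar state, the names `select_best`, `filtered_step`, `noisy_model`, and `toy_dynamics_reward`, and the hand-written reward are all hypothetical stand-ins (in the paper the dynamics reward is learned from offline data).

```python
import random
from typing import Callable, Sequence


def select_best(candidates: Sequence[float], score: Callable[[float], float]) -> float:
    """Return the candidate with the highest score."""
    return max(candidates, key=score)


def filtered_step(state, action, sample_next, dynamics_reward, num_candidates, rng):
    """Sample a batch of candidate next states and keep the best-scoring one."""
    candidates = [sample_next(state, action, rng) for _ in range(num_candidates)]
    return select_best(candidates, lambda s_next: dynamics_reward(state, action, s_next))


def noisy_model(state, action, rng):
    # Hypothetical learned dynamics model: true dynamics (s + a) plus Gaussian noise.
    return state + action + rng.gauss(0.0, 0.5)


def toy_dynamics_reward(state, action, next_state):
    # Hypothetical stand-in for a learned dynamics reward:
    # higher when the transition is closer to the true dynamics s' = s + a.
    return -abs(next_state - (state + action))


if __name__ == "__main__":
    rng = random.Random(0)
    s_next = filtered_step(0.0, 1.0, noisy_model, toy_dynamics_reward, 16, rng)
    print("filtered next state:", s_next)
```

With a larger `num_candidates`, the selected next state can only score at least as well as the first sample drawn, which is the sense in which the filter steers model rollouts toward transitions the dynamics reward rates as plausible.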

Provably Efficient Adversarial Imitation Learning with Unknown Transitions

1 code implementation 11 Jun 2023 Tian Xu, Ziniu Li, Yang Yu, Zhi-Quan Luo

Adversarial imitation learning (AIL), a subset of IL methods, is particularly promising, but its theoretical foundation in the presence of unknown transitions has yet to be fully developed.

Imitation Learning

Theoretical Analysis of Offline Imitation With Supplementary Dataset

1 code implementation 27 Jan 2023 Ziniu Li, Tian Xu, Yang Yu, Zhi-Quan Luo

This paper considers a situation where, besides the small amount of expert data, a supplementary dataset is available, which can be collected cheaply from sub-optimal policies.

Imitation Learning

Model Generation with Provable Coverability for Offline Reinforcement Learning

no code implementations 1 Jun 2022 Chengxing Jia, Hao Yin, Chenxiao Gao, Tian Xu, Lei Yuan, Zongzhang Zhang, Yang Yu

Model-based offline optimization with dynamics-aware policy provides a new perspective for policy learning and out-of-distribution generalization, where the learned policy could adapt to different dynamics enumerated at the training stage.

Offline RL, Out-of-Distribution Generalization +2

A Note on Target Q-learning For Solving Finite MDPs with A Generative Oracle

no code implementations 22 Mar 2022 Ziniu Li, Tian Xu, Yang Yu

In particular, we demonstrate that the sample complexity of the target Q-learning algorithm in [Lee and He, 2020] is $\widetilde{\mathcal O}(|\mathcal S|^2|\mathcal A|^2 (1-\gamma)^{-5}\varepsilon^{-2})$.

Q-Learning

Rethinking ValueDice: Does It Really Improve Performance?

no code implementations 5 Feb 2022 Ziniu Li, Tian Xu, Yang Yu, Zhi-Quan Luo

First, we show that ValueDice could reduce to BC under the offline setting.

Imitation Learning

On Generalization of Adversarial Imitation Learning and Beyond

no code implementations 19 Jun 2021 Tian Xu, Ziniu Li, Yang Yu, Zhi-Quan Luo

For some MDPs, we show that vanilla AIL has a worse sample complexity than BC.

Imitation Learning

Reinforcement Learning With Sparse-Executing Actions via Sparsity Regularization

no code implementations 18 May 2021 Jing-Cheng Pang, Tian Xu, Shengyi Jiang, Yu-Ren Liu, Yang Yu

Reinforcement learning (RL) has made remarkable progress in many decision-making tasks, such as Go, game playing, and robotics control.

Atari Games, Decision Making +3

Generating Multi-scale Maps from Remote Sensing Images via Series Generative Adversarial Networks

no code implementations 31 Mar 2021 Xu Chen, Bangguo Yin, Songqiang Chen, Haifeng Li, Tian Xu

The series strategy avoids RS-m inconsistency as inputs are high-resolution large-scale RSIs, and reduces the distribution gap in multi-scale map generation through similar pixel distributions among multi-scale maps.

Image-to-Image Translation, Translation

Machine learning spatio-temporal epidemiological model to evaluate Germany-county-level COVID-19 risk

no code implementations 30 Nov 2020 Lingxiao Wang, Tian Xu, Till Hannes Stoecker, Horst Stoecker, Yin Jiang, Kai Zhou

As the COVID-19 pandemic continues to ravage the world, it is critically important to provide timely, multi-level risk predictions for COVID-19.

BIG-bench Machine Learning

Error Bounds of Imitating Policies and Environments

no code implementations NeurIPS 2020 Tian Xu, Ziniu Li, Yang Yu

In this paper, we first analyze the value gap between the expert policy and the policies imitated by two imitation methods, behavioral cloning and generative adversarial imitation.

Imitation Learning, Model-based Reinforcement Learning +2

Investigating Bias and Fairness in Facial Expression Recognition

no code implementations 20 Jul 2020 Tian Xu, Jennifer White, Sinan Kalkan, Hatice Gunes

Recognition of expressions of emotions and affect from facial images is a well-studied research problem in the fields of affective computing and computer vision with a large number of datasets available containing facial images and corresponding expression labels.

Attribute, Data Augmentation +3

On Value Discrepancy of Imitation Learning

no code implementations 16 Nov 2019 Tian Xu, Ziniu Li, Yang Yu

We also show that the framework leads to the value discrepancy of GAIL in an order of $O((1-\gamma)^{-1})$.

Imitation Learning

Deeper Interpretability of Deep Networks

no code implementations 19 Nov 2018 Tian Xu, Jiayu Zhan, Oliver G. B. Garrod, Philip H. S. Torr, Song-Chun Zhu, Robin A. A. Ince, Philippe G. Schyns

However, understanding the information represented and processed in CNNs remains in most cases challenging.

A Robust Alternating Direction Method for Constrained Hybrid Variational Deblurring Model

no code implementations 31 Aug 2013 Ryan Wen Liu, Tian Xu

In this work, a new constrained hybrid variational deblurring model is developed by combining the non-convex first- and second-order total variation regularizers.

Deblurring
