Search Results for author: Jiechao Xiong

Found 23 papers, 10 papers with code

Self-Supervised Continuous Control without Policy Gradient

no code implementations • 1 Jan 2021 • Hao Sun, Ziping Xu, Meng Fang, Yuhang Song, Jiechao Xiong, Bo Dai, Zhengyou Zhang, Bolei Zhou

Despite the remarkable progress made by the policy gradient algorithms in reinforcement learning (RL), sub-optimal policies usually result from the local exploration property of the policy gradient update.

Continuous Control, Policy Gradient Methods, +3

TStarBot-X: An Open-Sourced and Comprehensive Study for Efficient League Training in StarCraft II Full Game

1 code implementation • 27 Nov 2020 • Lei Han, Jiechao Xiong, Peng Sun, Xinghai Sun, Meng Fang, Qingwei Guo, Qiaobo Chen, Tengfei Shi, Hongsheng Yu, Xipeng Wu, Zhengyou Zhang

We show that, with orders of magnitude less computation, a faithful reimplementation of AlphaStar's methods cannot succeed, and that the proposed techniques are necessary to ensure TStarBot-X's competitive performance.

Imitation Learning, Starcraft, +1

TLeague: A Framework for Competitive Self-Play based Distributed Multi-Agent Reinforcement Learning

1 code implementation • 25 Nov 2020 • Peng Sun, Jiechao Xiong, Lei Han, Xinghai Sun, Shuxing Li, Jiawei Xu, Meng Fang, Zhengyou Zhang

This poses non-trivial difficulties for researchers or engineers and prevents the application of MARL to a broader range of real-world problems.

Dota 2, Multi-agent Reinforcement Learning, +4

Zeroth-Order Supervised Policy Improvement

no code implementations • 11 Jun 2020 • Hao Sun, Ziping Xu, Yuhang Song, Meng Fang, Jiechao Xiong, Bo Dai, Bolei Zhou

However, PG algorithms rely on exploiting the value function being learned with the first-order update locally, which results in limited sample efficiency.

Continuous Control, Policy Gradient Methods, +2

Divergence-Augmented Policy Optimization

1 code implementation • NeurIPS 2019 • Qing Wang, Yingru Li, Jiechao Xiong, Tong Zhang

In deep reinforcement learning, policy optimization methods need to deal with issues such as function approximation and the reuse of off-policy data.

Atari Games, Policy Gradient Methods, +2

TStarBots: Defeating the Cheating Level Builtin AI in StarCraft II in the Full Game

3 code implementations • 19 Sep 2018 • Peng Sun, Xinghai Sun, Lei Han, Jiechao Xiong, Qing Wang, Bo Li, Yang Zheng, Ji Liu, Yongsheng Liu, Han Liu, Tong Zhang

Both TStarBot1 and TStarBot2 are able to defeat the built-in AI agents from level 1 to level 10 in a full game (1v1 Zerg-vs-Zerg game on the AbyssalReef map), noting that level 8, level 9, and level 10 are cheating agents with unfair advantages such as full vision on the whole map and resource harvest boosting.

Decision Making, Starcraft, +1

A Margin-based MLE for Crowdsourced Partial Ranking

no code implementations • 29 Jul 2018 • Qianqian Xu, Jiechao Xiong, Xinwei Sun, Zhiyong Yang, Xiaochun Cao, Qingming Huang, Yuan YAO

A preference order or ranking aggregated from pairwise comparison data is commonly understood as a strict total order.

From Social to Individuals: a Parsimonious Path of Multi-level Models for Crowdsourced Preference Aggregation

no code implementations • 8 Mar 2018 • Qianqian Xu, Jiechao Xiong, Xiaochun Cao, Qingming Huang, Yuan YAO

In crowdsourced preference aggregation, it is often assumed that all the annotators are subject to a common preference or social utility function which generates their comparison behaviors in experiments.

Stochastic Non-convex Ordinal Embedding with Stabilized Barzilai-Borwein Step Size

1 code implementation • 17 Nov 2017 • Ke Ma, Jinshan Zeng, Jiechao Xiong, Qianqian Xu, Xiaochun Cao, Wei Liu, Yuan YAO

Learning representations from relative similarity comparisons, often called ordinal embedding, has gained increasing attention in recent years.

HodgeRank with Information Maximization for Crowdsourced Pairwise Ranking Aggregation

no code implementations • 16 Nov 2017 • Qianqian Xu, Jiechao Xiong, Xi Chen, Qingming Huang, Yuan YAO

Recently, crowdsourcing has emerged as an effective paradigm for human-powered large scale problem solving in various domains.

Exploring Outliers in Crowdsourced Ranking for QoE

no code implementations • 18 Jul 2017 • Qianqian Xu, Ming Yan, Chendi Huang, Jiechao Xiong, Qingming Huang, Yuan YAO

Outlier detection is a crucial part of robust evaluation for crowdsourceable assessment of Quality of Experience (QoE) and has attracted much attention in recent years.

Outlier Detection

Split LBI: An Iterative Regularization Path with Structural Sparsity

no code implementations • NeurIPS 2016 • Chendi Huang, Xinwei Sun, Jiechao Xiong, Yuan YAO

An iterative regularization path with structural sparsity is proposed in this paper based on variable splitting and the Linearized Bregman Iteration, hence called Split LBI.

Image Denoising, Model Selection

Parsimonious Mixed-Effects HodgeRank for Crowdsourced Preference Aggregation

no code implementations • 12 Jul 2016 • Qianqian Xu, Jiechao Xiong, Xiaochun Cao, Yuan YAO

In crowdsourced preference aggregation, it is often assumed that all the annotators are subject to a common preference or utility function which generates their comparison behaviors in experiments.

False Discovery Rate Control and Statistical Quality Assessment of Annotators in Crowdsourced Ranking

no code implementations • 19 May 2016 • Qianqian Xu, Jiechao Xiong, Xiaochun Cao, Yuan YAO

With the rapid growth of crowdsourcing platforms it has become easy and relatively inexpensive to collect a dataset labeled by multiple annotators in a short time.


Analysis of Crowdsourced Sampling Strategies for HodgeRank with Sparse Random Graphs

no code implementations • 28 Feb 2015 • Braxton Osting, Jiechao Xiong, Qianqian Xu, Yuan YAO

In this setting, a pairwise comparison dataset is typically gathered via random sampling, either with or without replacement.


Robust Subjective Visual Property Prediction from Crowdsourced Pairwise Labels

no code implementations • 25 Jan 2015 • Yanwei Fu, Timothy M. Hospedales, Tao Xiang, Jiechao Xiong, Shaogang Gong, Yizhou Wang, Yuan YAO

In this paper, we propose a more principled way to identify annotation outliers by formulating the subjective visual property prediction task as a unified robust learning-to-rank problem, tackling outlier detection and learning to rank jointly.

Learning-To-Rank, Outlier Detection, +1

Evaluating Visual Properties via Robust HodgeRank

no code implementations • 15 Aug 2014 • Qianqian Xu, Jiechao Xiong, Xiaochun Cao, Qingming Huang, Yuan YAO

In this paper we study the problem of how to estimate such visual properties from a ranking perspective with the help of the annotators from online crowdsourcing platforms.

Graph Sampling, Outlier Detection

Sparse Recovery via Differential Inclusions

1 code implementation • 30 Jun 2014 • Stanley Osher, Feng Ruan, Jiechao Xiong, Yuan YAO, Wotao Yin

In this paper, we recover sparse signals from their noisy linear measurements by solving nonlinear differential inclusions, an approach based on the notion of inverse scale space (ISS) developed in applied mathematics.
