Search Results for author: Jiechao Xiong

Found 23 papers, 10 papers with code

Self-Supervised Continuous Control without Policy Gradient

no code implementations • 1 Jan 2021 • Hao Sun, Ziping Xu, Meng Fang, Yuhang Song, Jiechao Xiong, Bo Dai, Zhengyou Zhang, Bolei Zhou

Despite the remarkable progress made by the policy gradient algorithms in reinforcement learning (RL), sub-optimal policies usually result from the local exploration property of the policy gradient update.

Continuous Control Policy Gradient Methods +3

TStarBot-X: An Open-Sourced and Comprehensive Study for Efficient League Training in StarCraft II Full Game

1 code implementation • 27 Nov 2020 • Lei Han, Jiechao Xiong, Peng Sun, Xinghai Sun, Meng Fang, Qingwei Guo, Qiaobo Chen, Tengfei Shi, Hongsheng Yu, Xipeng Wu, Zhengyou Zhang

We show that with orders of magnitude less computation, a faithful reimplementation of AlphaStar's methods cannot succeed, and the proposed techniques are necessary to ensure TStarBot-X's competitive performance.

Imitation Learning Starcraft +1

131

TLeague: A Framework for Competitive Self-Play based Distributed Multi-Agent Reinforcement Learning

1 code implementation • 25 Nov 2020 • Peng Sun, Jiechao Xiong, Lei Han, Xinghai Sun, Shuxing Li, Jiawei Xu, Meng Fang, Zhengyou Zhang

This poses non-trivial difficulties for researchers or engineers and prevents the application of MARL to a broader range of real-world problems.

Dota 2 Multi-agent Reinforcement Learning +4

131

Zeroth-Order Supervised Policy Improvement

no code implementations • 11 Jun 2020 • Hao Sun, Ziping Xu, Yuhang Song, Meng Fang, Jiechao Xiong, Bo Dai, Bolei Zhou

However, PG algorithms rely on exploiting the value function being learned with the first-order update locally, which results in limited sample efficiency.

Continuous Control Policy Gradient Methods +2

Divergence-Augmented Policy Optimization

1 code implementation • NeurIPS 2019 • Qing Wang, Yingru Li, Jiechao Xiong, Tong Zhang

In deep reinforcement learning, policy optimization methods need to deal with issues such as function approximation and the reuse of off-policy data.

Atari Games Policy Gradient Methods +2

Arena: a toolkit for Multi-Agent Reinforcement Learning

2 code implementations • 20 Jul 2019 • Qing Wang, Jiechao Xiong, Lei Han, Meng Fang, Xinghai Sun, Zhuobin Zheng, Peng Sun, Zhengyou Zhang

We introduce Arena, a toolkit for multi-agent reinforcement learning (MARL) research.

Multi-agent Reinforcement Learning OpenAI Gym +4

Exponentially Weighted Imitation Learning for Batched Historical Data

1 code implementation • NeurIPS 2018 • Qing Wang, Jiechao Xiong, Lei Han, Peng Sun, Han Liu, Tong Zhang

We consider deep policy learning with only batched historical trajectories.

Imitation Learning reinforcement-learning +1

31,092

Parametrized Deep Q-Networks Learning: Reinforcement Learning with Discrete-Continuous Hybrid Action Space

5 code implementations • 10 Oct 2018 • Jiechao Xiong, Qing Wang, Zhuoran Yang, Peng Sun, Lei Han, Yang Zheng, Haobo Fu, Tong Zhang, Ji Liu, Han Liu

Most existing deep reinforcement learning (DRL) frameworks consider either a discrete action space or a continuous action space, but not both.

reinforcement-learning Reinforcement Learning (RL)

2,548

TStarBots: Defeating the Cheating Level Builtin AI in StarCraft II in the Full Game

3 code implementations • 19 Sep 2018 • Peng Sun, Xinghai Sun, Lei Han, Jiechao Xiong, Qing Wang, Bo Li, Yang Zheng, Ji Liu, Yongsheng Liu, Han Liu, Tong Zhang

Both TStarBot1 and TStarBot2 are able to defeat the built-in AI agents from level 1 to level 10 in a full game (1v1 Zerg-vs-Zerg game on the AbyssalReef map), noting that level 8, level 9, and level 10 are cheating agents with unfair advantages such as full vision on the whole map and resource harvest boosting.

Decision Making Starcraft +1

A Margin-based MLE for Crowdsourced Partial Ranking

no code implementations • 29 Jul 2018 • Qianqian Xu, Jiechao Xiong, Xinwei Sun, Zhiyong Yang, Xiaochun Cao, Qingming Huang, Yuan YAO

A preference order or ranking aggregated from pairwise comparison data is commonly understood as a strict total order.

From Social to Individuals: a Parsimonious Path of Multi-level Models for Crowdsourced Preference Aggregation

no code implementations • 8 Mar 2018 • Qianqian Xu, Jiechao Xiong, Xiaochun Cao, Qingming Huang, Yuan YAO

In crowdsourced preference aggregation, it is often assumed that all the annotators are subject to a common preference or social utility function which generates their comparison behaviors in experiments.

Parametrized Deep Q-Networks Learning: Playing Online Battle Arena with Discrete-Continuous Hybrid Action Space

1 code implementation • ICLR 2018 • Jiechao Xiong, Qing Wang, Zhuoran Yang, Peng Sun, Yang Zheng, Lei Han, Haobo Fu, Xiangru Lian, Carson Eisenach, Haichuan Yang, Emmanuel Ekwedike, Bei Peng, Haoyue Gao, Tong Zhang, Ji Liu, Han Liu

Most existing deep reinforcement learning (DRL) frameworks consider action spaces that are either discrete or continuous.

2,548

Stochastic Non-convex Ordinal Embedding with Stabilized Barzilai-Borwein Step Size

1 code implementation • 17 Nov 2017 • Ke Ma, Jinshan Zeng, Jiechao Xiong, Qianqian Xu, Xiaochun Cao, Wei Liu, Yuan YAO

Learning representations from relative similarity comparisons, often called ordinal embedding, has gained increasing attention in recent years.

HodgeRank with Information Maximization for Crowdsourced Pairwise Ranking Aggregation

no code implementations • 16 Nov 2017 • Qianqian Xu, Jiechao Xiong, Xi Chen, Qingming Huang, Yuan YAO

Recently, crowdsourcing has emerged as an effective paradigm for human-powered, large-scale problem solving in various domains.

Exploring Outliers in Crowdsourced Ranking for QoE

no code implementations • 18 Jul 2017 • Qianqian Xu, Ming Yan, Chendi Huang, Jiechao Xiong, Qingming Huang, Yuan YAO

Outlier detection is a crucial part of robust evaluation for crowdsourceable assessment of Quality of Experience (QoE) and has attracted much attention in recent years.

Outlier Detection

Boosting with Structural Sparsity: A Differential Inclusion Approach

no code implementations • 16 Apr 2017 • Chendi Huang, Xinwei Sun, Jiechao Xiong, Yuan YAO

Boosting, viewed as a gradient descent algorithm, is a popular method in machine learning.

Image Denoising Model Selection

Split LBI: An Iterative Regularization Path with Structural Sparsity

no code implementations • NeurIPS 2016 • Chendi Huang, Xinwei Sun, Jiechao Xiong, Yuan YAO

An iterative regularization path with structural sparsity is proposed in this paper based on variable splitting and the Linearized Bregman Iteration, hence called Split LBI.

Image Denoising Model Selection

Parsimonious Mixed-Effects HodgeRank for Crowdsourced Preference Aggregation

no code implementations • 12 Jul 2016 • Qianqian Xu, Jiechao Xiong, Xiaochun Cao, Yuan YAO

In crowdsourced preference aggregation, it is often assumed that all the annotators are subject to a common preference or utility function which generates their comparison behaviors in experiments.

False Discovery Rate Control and Statistical Quality Assessment of Annotators in Crowdsourced Ranking

no code implementations • 19 May 2016 • Qianqian Xu, Jiechao Xiong, Xiaochun Cao, Yuan YAO

With the rapid growth of crowdsourcing platforms, it has become easy and relatively inexpensive to collect a dataset labeled by multiple annotators in a short time.

Position Sociology

Analysis of Crowdsourced Sampling Strategies for HodgeRank with Sparse Random Graphs

no code implementations • 28 Feb 2015 • Braxton Osting, Jiechao Xiong, Qianqian Xu, Yuan YAO

In this setting, a pairwise comparison dataset is typically gathered via random sampling, either with or without replacement.

Informativeness

Robust Subjective Visual Property Prediction from Crowdsourced Pairwise Labels

no code implementations • 25 Jan 2015 • Yanwei Fu, Timothy M. Hospedales, Tao Xiang, Jiechao Xiong, Shaogang Gong, Yizhou Wang, Yuan YAO

In this paper, we propose a more principled way to identify annotation outliers by formulating the subjective visual property prediction task as a unified robust learning to rank problem, tackling both the outlier detection and learning to rank jointly.

Attribute Learning-To-Rank +2

Evaluating Visual Properties via Robust HodgeRank

no code implementations • 15 Aug 2014 • Qianqian Xu, Jiechao Xiong, Xiaochun Cao, Qingming Huang, Yuan YAO

In this paper we study the problem of how to estimate such visual properties from a ranking perspective with the help of the annotators from online crowdsourcing platforms.

Graph Sampling Outlier Detection

Sparse Recovery via Differential Inclusions

1 code implementation • 30 Jun 2014 • Stanley Osher, Feng Ruan, Jiechao Xiong, Yuan YAO, Wotao Yin

In this paper, we recover sparse signals from their noisy linear measurements by solving nonlinear differential inclusions, which is based on the notion of inverse scale space (ISS) developed in applied mathematics.
