Search Results for author: Nan Jiang

Found 60 papers, 15 papers with code

Adaptable Semantic Compression and Resource Allocation for Task-Oriented Communications

no code implementations 19 Apr 2022 Chuanhong Liu, Caili Guo, Yang Yang, Nan Jiang

To solve this problem, both the compression ratio and the resource allocation are optimized for the task-oriented communication system to maximize the success probability of tasks.

Offline Reinforcement Learning Under Value and Density-Ratio Realizability: the Power of Gaps

no code implementations 25 Mar 2022 Jinglin Chen, Nan Jiang

We consider a challenging theoretical problem in offline reinforcement learning (RL): obtaining sample-efficiency guarantees with a dataset lacking sufficient coverage, under only realizability-type assumptions for the function approximators.

Offline RL, reinforcement-learning

ActFormer: A GAN Transformer Framework towards General Action-Conditioned 3D Human Motion Generation

no code implementations 15 Mar 2022 Ziyang Song, Dongliang Wang, Nan Jiang, Zhicheng Fang, Chenjing Ding, Weihao Gan, Wei Wu

Such a design combines the strong spatio-temporal representation capacity of Transformer, superiority in generative modeling of GAN, and inherent temporal correlations from latent prior.

VRConvMF: Visual Recurrent Convolutional Matrix Factorization for Movie Recommendation

no code implementations 16 Feb 2022 Zhu Wang, Honglong Chen, Zhe Li, Kai Lin, Nan Jiang, Feng Xia

Fortunately, context-aware recommender systems can alleviate the sparsity problem by making use of auxiliary information, such as information about both the users and the items.

Recommendation Systems

Offline Reinforcement Learning with Realizability and Single-policy Concentrability

no code implementations 9 Feb 2022 Wenhao Zhan, Baihe Huang, Audrey Huang, Nan Jiang, Jason D. Lee

Sample-efficiency guarantees for offline reinforcement learning (RL) often rely on strong assumptions on both the function classes (e.g., Bellman-completeness) and the data coverage (e.g., all-policy concentrability).

Offline RL, reinforcement-learning

Adversarially Trained Actor Critic for Offline Reinforcement Learning

1 code implementation 5 Feb 2022 Ching-An Cheng, Tengyang Xie, Nan Jiang, Alekh Agarwal

We propose Adversarially Trained Actor Critic (ATAC), a new model-free algorithm for offline reinforcement learning under insufficient data coverage, based on a two-player Stackelberg game framing of offline RL: A policy actor competes against an adversarially trained value critic, who finds data-consistent scenarios where the actor is inferior to the data-collection behavior policy.

Continuous Control, Offline RL, +1
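To make the two-player structure concrete, here is a minimal sketch of one training step under the Stackelberg framing described above, written as PyTorch-style pseudocode. The interfaces (actor.sample, actor.rsample, critic(s, a)), the single critic, and the weight beta are illustrative assumptions, not the authors' implementation.

```python
import torch

def atac_step(actor, critic, batch, actor_opt, critic_opt, beta=1.0, gamma=0.99):
    s, a, r, s2 = batch  # offline transitions: state, action, reward, next state

    # Critic (follower): find a data-consistent critic under which the actor
    # looks worse than the data-collection behavior policy.
    with torch.no_grad():
        a_pi, a2_pi = actor.sample(s), actor.sample(s2)
    td_err = critic(s, a) - (r + gamma * critic(s2, a2_pi).detach())
    pessimism = critic(s, a_pi).mean() - critic(s, a).mean()
    critic_loss = pessimism + beta * (td_err ** 2).mean()
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    # Actor (leader): maximize its value under the adversarial critic.
    actor_loss = -critic(s, actor.rsample(s)).mean()  # rsample: reparameterized
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()
```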

A Minimax Learning Approach to Off-Policy Evaluation in Confounded Partially Observable Markov Decision Processes

no code implementations 12 Nov 2021 Chengchun Shi, Masatoshi Uehara, Jiawei Huang, Nan Jiang

In this work, we first propose novel identification methods for OPE in POMDPs with latent confounders, by introducing bridge functions that link the target policy's value and the observed data distribution.

Towards Hyperparameter-free Policy Selection for Offline Reinforcement Learning

1 code implementation NeurIPS 2021 Siyuan Zhang, Nan Jiang

How to select between policies and value functions produced by different training algorithms in offline reinforcement learning (RL) -- which is crucial for hyperparameter tuning -- is an important open question.

reinforcement-learning

A Fast Randomized Algorithm for Massive Text Normalization

no code implementations 6 Oct 2021 Nan Jiang, Chen Luo, Vihan Lakshman, Yesh Dattatreya, Yexiang Xue

In addition, FLAN does not require any annotated data or supervised learning.

A Spectral Approach to Off-Policy Evaluation for POMDPs

no code implementations 22 Sep 2021 Yash Nair, Nan Jiang

We consider off-policy evaluation (OPE) in Partially Observable Markov Decision Processes, where the evaluation policy depends only on observable variables but the behavior policy depends on latent states (Tennenholtz et al., 2020a).

Causal Identification

Bellman-consistent Pessimism for Offline Reinforcement Learning

no code implementations NeurIPS 2021 Tengyang Xie, Ching-An Cheng, Nan Jiang, Paul Mineiro, Alekh Agarwal

The use of pessimism when reasoning about datasets lacking exhaustive exploration has recently gained prominence in offline reinforcement learning.

reinforcement-learning

Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning

no code implementations NeurIPS 2021 Tengyang Xie, Nan Jiang, Huan Wang, Caiming Xiong, Yu Bai

This offline result is the first that matches the sample complexity lower bound in this setting, and resolves a recent open question in offline RL.

Offline RL, reinforcement-learning

On the Convergence Rate of Off-Policy Policy Optimization Methods with Density-Ratio Correction

no code implementations 2 Jun 2021 Jiawei Huang, Nan Jiang

In this paper, we study the convergence properties of off-policy policy improvement algorithms with state-action density ratio correction under the function approximation setting, where the objective function is formulated as a max-max-min optimization problem.

Minimax Model Learning

no code implementations 2 Mar 2021 Cameron Voloshin, Nan Jiang, Yisong Yue

We present a novel off-policy loss function for learning a transition model in model-based reinforcement learning.

Model-based Reinforcement Learning, reinforcement-learning

CURE: Code-Aware Neural Machine Translation for Automatic Program Repair

1 code implementation 26 Feb 2021 Nan Jiang, Thibaud Lutellier, Lin Tan

Finally, CURE uses a subword tokenization technique to generate a smaller search space that contains more correct fixes.

Machine Translation, Program Repair, +1

SM+: Refined Scale Match for Tiny Person Detection

no code implementations 6 Feb 2021 Nan Jiang, Xuehui Yu, Xiaoke Peng, Yuqi Gong, Zhenjun Han

Detecting tiny objects (e.g., less than 20 x 20 pixels) in large-scale images is an important yet open problem.

Human Detection

Finite Sample Analysis of Minimax Offline Reinforcement Learning: Completeness, Fast Rates and First-Order Efficiency

no code implementations 5 Feb 2021 Masatoshi Uehara, Masaaki Imaizumi, Nan Jiang, Nathan Kallus, Wen Sun, Tengyang Xie

We offer a theoretical characterization of off-policy evaluation (OPE) in reinforcement learning using function approximation for marginal importance weights and $q$-functions when these are estimated using recent minimax methods.

reinforcement-learning

On Query-efficient Planning in MDPs under Linear Realizability of the Optimal State-value Function

no code implementations 3 Feb 2021 Gellért Weisz, Philip Amortila, Barnabás Janzer, Yasin Abbasi-Yadkori, Nan Jiang, Csaba Szepesvári

We consider local planning in fixed-horizon MDPs with a generative model under the assumption that the optimal value function lies close to the span of a feature map.

Anti-UAV: A Large Multi-Modal Benchmark for UAV Tracking

1 code implementation 21 Jan 2021 Nan Jiang, Kuiran Wang, Xiaoke Peng, Xuehui Yu, Qiang Wang, Junliang Xing, Guorong Li, Jian Zhao, Guodong Guo, Zhenjun Han

The release of such a large-scale dataset could be a useful initial step in research on tracking UAVs.

Experimental demonstration of memory-enhanced scaling for entanglement connection of quantum repeater segments

no code implementations 21 Jan 2021 Yunfei Pu, Sheng Zhang, Yukai Wu, Nan Jiang, Wei Chang, Chang Li, Luming Duan

The experimental realization of entanglement connection of two quantum repeater segments with an efficient memory-enhanced scaling demonstrates a key advantage of the quantum repeater protocol, and lays a cornerstone for future large-scale quantum networks.

Quantum Physics

Quantifying Spatial Homogeneity of Urban Road Networks via Graph Neural Networks

1 code implementation 1 Jan 2021 Jiawei Xue, Nan Jiang, Senwei Liang, Qiyuan Pang, Takahiro Yabe, Satish V. Ukkusuri, Jianzhu Ma

We apply the method to 11,790 urban road networks across 30 cities worldwide to measure the spatial homogeneity of road networks within each city and across different cities.

When Counterpoint Meets Chinese Folk Melodies

1 code implementation NeurIPS 2020 Nan Jiang, Sheng Jin, Zhiyao Duan, ChangShui Zhang

An interaction reward model is trained on the duets formed from outer parts of Bach chorales to model counterpoint interaction, while a style reward model is trained on monophonic melodies of Chinese folk songs to model melodic patterns.
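A minimal sketch of how the two reward models described above could be combined into a single RL reward; the linear weighting and the score interface are assumptions for illustration, not the paper's exact design.

```python
def duet_reward(context, note, interaction_model, style_model, w=0.5):
    # interaction_model: trained on duets from outer parts of Bach chorales
    # style_model: trained on monophonic melodies of Chinese folk songs
    r_inter = interaction_model.score(context, note)  # counterpoint quality
    r_style = style_model.score(context, note)        # melodic style match
    return w * r_inter + (1.0 - w) * r_style          # combined reward signal
```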

A Variant of the Wang-Foster-Kakade Lower Bound for the Discounted Setting

no code implementations 2 Nov 2020 Philip Amortila, Nan Jiang, Tengyang Xie

Recently, Wang et al. (2020) showed a highly intriguing hardness result for batch reinforcement learning (RL) with linearly realizable value function and good feature coverage in the finite-horizon case.

reinforcement-learning

Improved Worst-Case Regret Bounds for Randomized Least-Squares Value Iteration

no code implementations 23 Oct 2020 Priyank Agrawal, Jinglin Chen, Nan Jiang

This paper studies regret minimization with randomized value functions in reinforcement learning.

reinforcement-learning

The 1st Tiny Object Detection Challenge: Methods and Results

1 code implementation 16 Sep 2020 Xuehui Yu, Zhenjun Han, Yuqi Gong, Nan Jiang, Jian Zhao, Qixiang Ye, Jie Chen, Yuan Feng, Bin Zhang, Xiaodi Wang, Ying Xin, Jingwei Liu, Mingyuan Mao, Sheng Xu, Baochang Zhang, Shumin Han, Cheng Gao, Wei Tang, Lizuo Jin, Mingbo Hong, Yuchao Yang, Shuiwang Li, Huan Luo, Qijun Zhao, Humphrey Shi

The 1st Tiny Object Detection (TOD) Challenge aims to encourage research in developing novel and accurate methods for tiny object detection in images with wide views, with a current focus on tiny person detection.

Human Detection, Object Detection

Analysis of Random Access in NB-IoT Networks with Three Coverage Enhancement Groups: A Stochastic Geometry Approach

no code implementations 14 Sep 2020 Yan Liu, Yansha Deng, Nan Jiang, Maged Elkashlan, Arumugam Nallanathan

NarrowBand-Internet of Things (NB-IoT) is a new 3GPP radio access technology designed to provide better coverage for Low Power Wide Area (LPWA) networks.

Batch Value-function Approximation with Only Realizability

1 code implementation 11 Aug 2020 Tengyang Xie, Nan Jiang

We make progress in a long-standing problem of batch reinforcement learning (RL): learning $Q^\star$ from an exploratory and polynomial-sized dataset, using a realizable and otherwise arbitrary function class.

Model Selection, reinforcement-learning

A Question Type Driven and Copy Loss Enhanced Framework for Answer-Agnostic Neural Question Generation

no code implementations WS 2020 Xiuyu Wu, Nan Jiang, Yunfang Wu

Answer-agnostic question generation is a significant and challenging task, which aims to automatically generate questions for a given sentence without being given an answer.

Question Generation, Type prediction

Q* Approximation Schemes for Batch Reinforcement Learning: A Theoretical Comparison

no code implementations 9 Mar 2020 Tengyang Xie, Nan Jiang

We prove performance guarantees of two algorithms for approximating $Q^\star$ in batch reinforcement learning.

reinforcement-learning

RL-Duet: Online Music Accompaniment Generation Using Deep Reinforcement Learning

no code implementations 8 Feb 2020 Nan Jiang, Sheng Jin, Zhiyao Duan, Chang-Shui Zhang

We cast this as a reinforcement learning problem, where the generation agent learns a policy to generate a musical note (action) based on previously generated context (state).

Music Generation, reinforcement-learning
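The state/action framing in the snippet above admits a compact generation loop; the interfaces (policy.act, reward_fn) and the step structure below are illustrative assumptions, not the paper's implementation.

```python
def generate_accompaniment(policy, human_part, reward_fn, max_steps=64):
    generated = []                                   # machine part so far
    for t in range(min(max_steps, len(human_part))):
        state = (generated, human_part[: t + 1])     # previously generated context
        note = policy.act(state)                     # action: the next note
        r = reward_fn(state, note)                   # reward for this action
        generated.append(note)
        # during training, (state, note, r) tuples drive the policy update
    return generated
```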

Minimax Value Interval for Off-Policy Evaluation and Policy Optimization

no code implementations NeurIPS 2020 Nan Jiang, Jiawei Huang

By slightly altering the derivation of previous methods (one from each style; Uehara et al., 2020), we unify them into a single value interval that comes with a special type of double robustness: when either the value-function or the importance-weight class is well specified, the interval is valid and its length quantifies the misspecification of the other class.

Efficient Exploration

Scale Match for Tiny Person Detection

1 code implementation 23 Dec 2019 Xuehui Yu, Yuqi Gong, Nan Jiang, Qixiang Ye, Zhenjun Han

In this paper, we introduce a new benchmark, referred to as TinyPerson, opening up a promising direction for tiny object detection at long distances and with massive backgrounds.

Human Detection, Object Detection
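The scale-match idea named in the title can be sketched as a data transform: rescale a source image so its person sizes follow the target dataset's empirical size distribution. Everything below (box format, size statistic, uniform sampling, cv2 for resizing) is an assumption for illustration, not the paper's procedure.

```python
import random
import numpy as np
import cv2

def scale_match(img, boxes, target_sizes):
    # boxes: (N, 4) array of [x1, y1, x2, y2]; target_sizes: object sizes
    # (sqrt of box area) collected from the target tiny-person dataset.
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    cur = float(np.sqrt(areas).mean())      # mean object size in this image
    tgt = random.choice(target_sizes)       # draw a size from the target dist.
    s = tgt / cur                           # scale factor that matches them
    resized = cv2.resize(img, None, fx=s, fy=s)
    return resized, boxes * s
```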

Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning

3 code implementations 15 Nov 2019 Cameron Voloshin, Hoang M. Le, Nan Jiang, Yisong Yue

We offer an experimental benchmark and empirical study for off-policy policy evaluation (OPE) in reinforcement learning, which is a key problem in many safety-critical applications.

Experimental Design, reinforcement-learning

Minimax Weight and Q-Function Learning for Off-Policy Evaluation

no code implementations ICML 2020 Masatoshi Uehara, Jiawei Huang, Nan Jiang

We provide theoretical investigations into off-policy evaluation in reinforcement learning using function approximators for (marginalized) importance weights and value functions.
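As a sketch of the minimax style referred to above: one common formulation learns the marginalized importance weight w by playing it against a discriminator class of value functions. The notation and normalization below are assumptions and may differ from the paper's exact statement.

```latex
L(w, q) = \mathbb{E}_{(s,a,r,s') \sim D}\!\left[ w(s,a)\big(\gamma\, q(s',\pi) - q(s,a)\big) \right]
        + (1-\gamma)\,\mathbb{E}_{s_0 \sim d_0}\!\left[ q(s_0,\pi) \right],
\qquad
\hat{w} = \arg\min_{w \in \mathcal{W}} \max_{q \in \mathcal{Q}} L(w,q)^2 .
```

Here $q(s',\pi)$ abbreviates $\mathbb{E}_{a' \sim \pi(\cdot \mid s')}[q(s',a')]$, and the value estimate is then read off as $\mathbb{E}_D[\hat{w}(s,a)\, r]$.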

Sample Complexity of Reinforcement Learning using Linearly Combined Model Ensembles

no code implementations 23 Oct 2019 Aditya Modi, Nan Jiang, Ambuj Tewari, Satinder Singh

As an extension, we also consider the more challenging problem of model selection, where the state features are unknown and can be chosen from a large candidate set.

Model Selection, reinforcement-learning

From Importance Sampling to Doubly Robust Policy Gradient

1 code implementation ICML 2020 Jiawei Huang, Nan Jiang

We show that on-policy policy gradient (PG) and its variance reduction variants can be derived by taking finite difference of function evaluations supplied by estimators from the importance sampling (IS) family for off-policy evaluation (OPE).
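The stated connection can be sketched in one standard identity (assumed notation): the gradient of the trajectory-wise IS estimator, evaluated at the behavior policy $\pi_{\theta_0}$, is exactly the on-policy policy gradient, since the product of importance ratios equals one at $\theta = \theta_0$.

```latex
\nabla_\theta V(\pi_\theta)\Big|_{\theta_0}
= \nabla_\theta\, \mathbb{E}_{\tau \sim \pi_{\theta_0}}\!\left[ \prod_{t} \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_0}(a_t \mid s_t)}\, R(\tau) \right]\Bigg|_{\theta_0}
= \mathbb{E}_{\tau \sim \pi_{\theta_0}}\!\left[ \sum_{t} \nabla_\theta \log \pi_\theta(a_t \mid s_t)\Big|_{\theta_0}\, R(\tau) \right].
```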

Quantum Communication between Multiplexed Atomic Quantum Memories

no code implementations 5 Sep 2019 Chang Li, Nan Jiang, Yukai Wu, Wei Chang, Yunfei Pu, Sheng Zhang, Lu-Ming Duan

The use of multiplexed atomic quantum memories (MAQM) can significantly enhance the efficiency of establishing entanglement in a quantum network.

Quantum Physics

On Value Functions and the Agent-Environment Boundary

no code implementations 30 May 2019 Nan Jiang

When function approximation is deployed in reinforcement learning (RL), the same problem may be formulated in different ways, often by treating a pre-processing step as a part of the environment or as part of the agent.

Imitation Learning, reinforcement-learning

Provably Efficient Q-Learning with Low Switching Cost

no code implementations NeurIPS 2019 Yu Bai, Tengyang Xie, Nan Jiang, Yu-Xiang Wang

We take initial steps in studying PAC-MDP algorithms with limited adaptivity, that is, algorithms that change their exploration policy as infrequently as possible during regret minimization.

Q-Learning

Information-Theoretic Considerations in Batch Reinforcement Learning

no code implementations 1 May 2019 Jinglin Chen, Nan Jiang

Value-function approximation methods that operate in batch mode have foundational importance to reinforcement learning (RL).

reinforcement-learning

Provably efficient RL with Rich Observations via Latent State Decoding

1 code implementation 25 Jan 2019 Simon S. Du, Akshay Krishnamurthy, Nan Jiang, Alekh Agarwal, Miroslav Dudík, John Langford

We study the exploration problem in episodic MDPs with rich observations generated from a small number of latent states.

Q-Learning

Completing State Representations using Spectral Learning

no code implementations NeurIPS 2018 Nan Jiang, Alex Kulesza, Satinder Singh

A central problem in dynamical system modeling is state discovery—that is, finding a compact summary of the past that captures the information needed to predict the future.

Model-based RL in Contextual Decision Processes: PAC bounds and Exponential Improvements over Model-free Approaches

no code implementations 21 Nov 2018 Wen Sun, Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford

We study the sample complexity of model-based reinforcement learning (henceforth RL) in general contextual decision processes that require strategic exploration to find a near-optimal policy.

Model-based Reinforcement Learning

LSDSCC: a Large Scale Domain-Specific Conversational Corpus for Response Generation with Diversity Oriented Evaluation Metrics

no code implementations NAACL 2018 Zhen Xu, Nan Jiang, Bingquan Liu, Wenge Rong, Bowen Wu, Baoxun Wang, Zhuoran Wang, Xiaolong Wang

The experimental results show that our proposed corpus can serve as a new benchmark dataset for the NRG task, and that the presented metrics are promising for guiding the optimization of NRG models by reasonably quantifying the diversity of the generated responses.

Machine Translation, Response Generation

Image Classification Based on Quantum KNN Algorithm

no code implementations 16 May 2018 Yijie Dang, Nan Jiang, Hao Hu, Zhuoxiao Ji, Wenyin Zhang

However, the commonly used classification method, the K-Nearest-Neighbor algorithm, has high complexity, because its two main processes, similarity computing and searching, are time-consuming.

Classification, General Classification, +1
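For reference, the two processes called out above are, in the classical algorithm, a full pass of distance computations followed by a search for the k smallest distances; a plain NumPy sketch:

```python
import numpy as np

def knn_classify(x, X_train, y_train, k=5):
    dists = np.linalg.norm(X_train - x, axis=1)  # similarity computing: O(Nd)
    nearest = np.argsort(dists)[:k]              # searching: O(N log N)
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]             # majority vote over k labels
```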

Markov Decision Processes with Continuous Side Information

no code implementations 15 Nov 2017 Aditya Modi, Nan Jiang, Satinder Singh, Ambuj Tewari

Because our lower bound has an exponential dependence on the dimension, we consider a tractable linear setting where the context is used to create linear combinations of a finite set of MDPs.
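One natural reading of the linear setting above (the notation here is assumed, not taken from the paper) is that the context c mixes a finite set of K base MDPs:

```latex
P_c(s' \mid s, a) = \sum_{k=1}^{K} c_k\, P_k(s' \mid s, a),
\qquad
R_c(s, a) = \sum_{k=1}^{K} c_k\, R_k(s, a),
\qquad c \in \Delta^{K-1}.
```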

Repeated Inverse Reinforcement Learning

no code implementations NeurIPS 2017 Kareem Amin, Nan Jiang, Satinder Singh

We introduce a novel repeated Inverse Reinforcement Learning problem: the agent has to act on behalf of a human in a sequence of tasks and wishes to minimize the number of tasks in which it surprises the human by acting suboptimally with respect to how the human would have acted.

Imitation Learning, reinforcement-learning

Contextual Decision Processes with Low Bellman Rank are PAC-Learnable

no code implementations ICML 2017 Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford, Robert E. Schapire

Our first contribution is a complexity measure, the Bellman rank, that we show enables tractable learning of near-optimal behavior in these processes and is naturally small for many well-studied reinforcement learning settings.

Efficient Exploration, reinforcement-learning
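A sketch of the complexity measure, with assumed finite-horizon notation (exact definitions vary across statements): collect average Bellman errors into a matrix indexed by a roll-in policy and a candidate value function, and take the rank of that matrix.

```latex
\mathcal{E}_h(\pi, f) = \mathbb{E}\!\left[ f(s_h, a_h) - r_h - f\big(s_{h+1}, \pi_f(s_{h+1})\big) \,\middle|\, s_h \sim \pi,\ a_h \sim \pi_f \right],
\qquad
\text{Bellman rank} = \max_h \operatorname{rank}\!\left( \big[ \mathcal{E}_h(\pi_g, f) \big]_{g, f \in \mathcal{F}} \right).
```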

Neural Network Architecture Optimization through Submodularity and Supermodularity

no code implementations 1 Sep 2016 Junqi Jin, Ziang Yan, Kun Fu, Nan Jiang, Chang-Shui Zhang

The architecture of a deep learning model, including its depth and width, is a key factor influencing the model's performance, such as test accuracy and computation time.

Optimizing Recurrent Neural Networks Architectures under Time Constraints

no code implementations 29 Aug 2016 Junqi Jin, Ziang Yan, Kun Fu, Nan Jiang, Chang-Shui Zhang

A greedy algorithm with bounds is suggested to solve the transformed problem.

Word Embedding based Correlation Model for Question/Answer Matching

no code implementations 15 Nov 2015 Yikang Shen, Wenge Rong, Nan Jiang, Baolin Peng, Jie Tang, Zhang Xiong

With the development of community-based question answering (Q&A) services, large-scale Q&A archives have been accumulated and have become an important information and knowledge resource on the web.

Question Answering Translation

Doubly Robust Off-policy Value Evaluation for Reinforcement Learning

2 code implementations 11 Nov 2015 Nan Jiang, Lihong Li

We study the problem of off-policy value evaluation in reinforcement learning (RL), where one aims to estimate the value of a new policy based on data collected by a different policy.

Decision Making, reinforcement-learning
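The estimator studied here admits a compact backward recursion over a trajectory; a sketch under assumed interfaces (Q_hat and V_hat are the approximate value functions, pi_e and pi_b return action probabilities under the evaluation and behavior policies):

```python
def dr_estimate(traj, Q_hat, V_hat, pi_e, pi_b, gamma=1.0):
    # traj: list of (state, action, reward) tuples, in time order.
    # Backward recursion: V_DR = V_hat(s) + rho * (r + gamma * V_DR' - Q_hat(s, a))
    v = 0.0
    for s, a, r in reversed(traj):
        rho = pi_e(a, s) / pi_b(a, s)  # per-step importance ratio
        v = V_hat(s) + rho * (r + gamma * v - Q_hat(s, a))
    return v
```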

Unifying Spatial and Attribute Selection for Distracter-Resilient Tracking

no code implementations CVPR 2014 Nan Jiang, Ying Wu

This paper presents a novel method to jointly determine the best spatial location and the optimal metric.
