Search Results for author: Qi Cai

Found 29 papers, 6 papers with code

Embed to Control Partially Observed Systems: Representation Learning with Provable Sample Efficiency

no code implementations 26 May 2022 Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang

For a class of POMDPs with a low-rank structure in the transition kernel, ETC attains an $O(1/\epsilon^2)$ sample complexity that scales polynomially with the horizon and the intrinsic dimension (that is, the rank).

reinforcement-learning Representation Learning

Sample-Efficient Reinforcement Learning for POMDPs with Linear Function Approximations

no code implementations 20 Apr 2022 Qi Cai, Zhuoran Yang, Zhaoran Wang

Specifically, we focus on a class of undercomplete POMDPs with linear function approximations, which allows the state and observation spaces to be infinite.

reinforcement-learning

BooVI: Provably Efficient Bootstrapped Value Iteration

no code implementations NeurIPS 2021 Boyi Liu, Qi Cai, Zhuoran Yang, Zhaoran Wang

Despite the tremendous success of reinforcement learning (RL) with function approximation, efficient exploration remains a significant challenge, both practically and theoretically.

Efficient Exploration reinforcement-learning

A Low Rank Promoting Prior for Unsupervised Contrastive Learning

no code implementations 5 Aug 2021 Yu Wang, Jingyang Lin, Qi Cai, Yingwei Pan, Ting Yao, Hongyang Chao, Tao Mei

In this paper, we construct a novel probabilistic graphical model that effectively incorporates the low rank promoting prior into the framework of contrastive learning, referred to as LORAC.

Contrastive Learning Image Classification +4

A Pose-only Solution to Visual Reconstruction and Navigation

no code implementations 2 Mar 2021 Qi Cai, Lilian Zhang, Yuanxin Wu, Wenxian Yu, Dewen Hu

Visual navigation and three-dimensional (3D) scene reconstruction are essential for robotics to interact with the surrounding environment.

3D Scene Reconstruction Motion Estimation +1

Optimistic Policy Optimization with General Function Approximations

no code implementations 1 Jan 2021 Qi Cai, Zhuoran Yang, Csaba Szepesvari, Zhaoran Wang

Although policy optimization with neural networks has a track record of achieving state-of-the-art results in reinforcement learning on various domains, the theoretical understanding of the computational and sample efficiency of policy optimization remains restricted to linear function approximations with finite-dimensional feature representations, which hinders the design of principled, effective, and efficient algorithms.

reinforcement-learning

Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory

no code implementations NeurIPS 2020 Yufeng Zhang, Qi Cai, Zhuoran Yang, Yongxin Chen, Zhaoran Wang

Temporal-difference and Q-learning play a key role in deep reinforcement learning, where they are empowered by expressive nonlinear function approximators such as neural networks.

Q-Learning reinforcement-learning

Segmenting Epipolar Line

no code implementations 11 Oct 2020 Shengjie Li, Qi Cai, Yuanxin Wu

Identifying feature correspondence between two images is a fundamental procedure in three-dimensional computer vision.

Joint Contrastive Learning with Infinite Possibilities

1 code implementation NeurIPS 2020 Qi Cai, Yu Wang, Yingwei Pan, Ting Yao, Tao Mei

This paper explores useful modifications of the recent development in contrastive learning via novel probabilistic modeling.

Contrastive Learning

On the Global Optimality of Model-Agnostic Meta-Learning

no code implementations ICML 2020 Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang

Model-agnostic meta-learning (MAML) formulates meta-learning as a bilevel optimization problem, where the inner level solves each subtask based on a shared prior, while the outer level searches for the optimal shared prior by optimizing its aggregated performance over all the subtasks.

Bilevel Optimization Meta-Learning

Learning a Unified Sample Weighting Network for Object Detection

1 code implementation CVPR 2020 Qi Cai, Yingwei Pan, Yu Wang, Jingen Liu, Ting Yao, Tao Mei

To this end, we devise a general loss function to cover most region-based object detectors with various sampling strategies, and then based on it we propose a unified sample weighting network to predict a sample's task weights.

General Classification Object Detection

Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory

no code implementations 8 Jun 2020 Yufeng Zhang, Qi Cai, Zhuoran Yang, Yongxin Chen, Zhaoran Wang

We aim to answer the following questions: When the function approximator is a neural network, how does the associated feature representation evolve?

Q-Learning

Generative Adversarial Imitation Learning with Neural Networks: Global Optimality and Convergence Rate

no code implementations 8 Mar 2020 Yufeng Zhang, Qi Cai, Zhuoran Yang, Zhaoran Wang

Generative adversarial imitation learning (GAIL) demonstrates tremendous success in practice, especially when combined with neural networks.

Imitation Learning reinforcement-learning

Provably Efficient Exploration in Policy Optimization

no code implementations ICML 2020 Qi Cai, Zhuoran Yang, Chi Jin, Zhaoran Wang

While policy-based reinforcement learning (RL) achieves tremendous successes in practice, it is significantly less understood in theory, especially compared with value-based RL.

Efficient Exploration reinforcement-learning

Neural Temporal-Difference Learning Converges to Global Optima

no code implementations NeurIPS 2019 Qi Cai, Zhuoran Yang, Jason D. Lee, Zhaoran Wang

Temporal-difference learning (TD), coupled with neural networks, is among the most fundamental building blocks of deep reinforcement learning.

Q-Learning reinforcement-learning

Neural Trust Region/Proximal Policy Optimization Attains Globally Optimal Policy

no code implementations NeurIPS 2019 Boyi Liu, Qi Cai, Zhuoran Yang, Zhaoran Wang

Proximal policy optimization and trust region policy optimization (PPO and TRPO) with actor and critic parametrized by neural networks achieve significant empirical success in deep reinforcement learning.

reinforcement-learning

Multi-Source Domain Adaptation and Semi-Supervised Domain Adaptation with Focus on Visual Domain Adaptation Challenge 2019

2 code implementations 8 Oct 2019 Yingwei Pan, Yehao Li, Qi Cai, Yang Chen, Ting Yao

Semi-Supervised Domain Adaptation: For this task, we adopt a standard self-learning framework to construct a classifier based on the labeled source and target data, and generate the pseudo labels for unlabeled target data.

Domain Adaptation Self-Learning

Neural Policy Gradient Methods: Global Optimality and Rates of Convergence

no code implementations ICLR 2020 Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang

In detail, we prove that neural natural policy gradient converges to a globally optimal policy at a sublinear rate.

Policy Gradient Methods

Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy

no code implementations 25 Jun 2019 Boyi Liu, Qi Cai, Zhuoran Yang, Zhaoran Wang

Proximal policy optimization and trust region policy optimization (PPO and TRPO) with actor and critic parametrized by neural networks achieve significant empirical success in deep reinforcement learning.

reinforcement-learning

vireoJD-MM at Activity Detection in Extended Videos

no code implementations 20 Jun 2019 Fuchen Long, Qi Cai, Zhaofan Qiu, Zhijian Hou, Yingwei Pan, Ting Yao, Chong-Wah Ngo

This notebook paper presents an overview and comparative analysis of our system designed for activity detection in extended videos (ActEV-PC) in ActivityNet Challenge 2019.

Action Detection Action Localization +1

Trimmed Action Recognition, Dense-Captioning Events in Videos, and Spatio-temporal Action Localization with Focus on ActivityNet Challenge 2019

no code implementations 14 Jun 2019 Zhaofan Qiu, Dong Li, Yehao Li, Qi Cai, Yingwei Pan, Ting Yao

This notebook paper presents an overview and comparative analysis of our systems designed for the following three tasks in ActivityNet Challenge 2019: trimmed action recognition, dense-captioning events in videos, and spatio-temporal action localization.

Action Recognition Spatio-Temporal Action Localization

Neural Temporal-Difference and Q-Learning Provably Converge to Global Optima

1 code implementation NeurIPS 2019 Qi Cai, Zhuoran Yang, Jason D. Lee, Zhaoran Wang

Temporal-difference learning (TD), coupled with neural networks, is among the most fundamental building blocks of deep reinforcement learning.

Q-Learning reinforcement-learning

General Method for Prime-point Cyclic Convolution over the Real Field

no code implementations 9 May 2019 Qi Cai, Tsung-Ching Lin, Yuanxin Wu, Wenxian Yu, Trieu-Kien Truong

A general and fast method is conceived for computing the cyclic convolution of n points, where n is a prime number.

Exploring Object Relation in Mean Teacher for Cross-Domain Detection

1 code implementation CVPR 2019 Qi Cai, Yingwei Pan, Chong-Wah Ngo, Xinmei Tian, Ling-Yu Duan, Ting Yao

The whole architecture is then optimized with three consistency regularizations: 1) region-level consistency to align the region-level predictions between teacher and student, 2) inter-graph consistency for matching the graph structures between teacher and student, and 3) intra-graph consistency to enhance the similarity between regions of same class within the graph of student.

Unsupervised Domain Adaptation

On the Global Convergence of Imitation Learning: A Case for Linear Quadratic Regulator

no code implementations 11 Jan 2019 Qi Cai, Mingyi Hong, Yongxin Chen, Zhaoran Wang

We study the global convergence of generative adversarial imitation learning for linear quadratic regulators, which is posed as minimax optimization.

Imitation Learning reinforcement-learning

Equivalent Constraints for Two-View Geometry: Pose Solution/Pure Rotation Identification and 3D Reconstruction

no code implementations 13 Oct 2018 Qi Cai, Yuanxin Wu, Lilian Zhang, Peike Zhang

The PPO constraints are simplified and formulated in the form of inequalities to directly identify the right pose solution with no need of 3D reconstruction and the 3D reconstruction can be analytically achieved from the identified right pose.

3D Reconstruction Pose Estimation

Memory Matching Networks for One-Shot Image Recognition

no code implementations CVPR 2018 Qi Cai, Yingwei Pan, Ting Yao, Chenggang Yan, Tao Mei

In this paper, we introduce the new ideas of augmenting Convolutional Neural Networks (CNNs) with Memory and learning to learn the network parameters for the unlabelled images on the fly in one-shot learning.

One-Shot Learning
