1 code implementation • 30 May 2023 • Rui Yang, Yong Lin, Xiaoteng Ma, Hao Hu, Chongjie Zhang, Tong Zhang
In this paper, we study out-of-distribution (OOD) generalization of offline GCRL both theoretically and empirically to identify factors that are important.
no code implementations • 29 May 2023 • Zhihan Liu, Miao Lu, Wei Xiong, Han Zhong, Hao Hu, Shenao Zhang, Sirui Zheng, Zhuoran Yang, Zhaoran Wang
To achieve this, existing sample-efficient online RL algorithms typically consist of three components: estimation, planning, and exploration.
no code implementations • 27 Feb 2023 • Hao Hu, Yiqin Yang, Qianchuan Zhao, Chongjie Zhang
Self-supervised methods have become crucial for advancing deep learning by leveraging data itself to reduce the need for expensive annotations.
no code implementations • 13 Dec 2022 • Yiqi Sun, Zhengxin Shi, Jianshen Zhang, Yongzhi Qi, Hao Hu, ZuoJun Max Shen
We first quantitatively define interpretability for data-driven forecasts and systematically review the existing forecasting algorithms from the perspective of interpretability.
no code implementations • 2 Dec 2022 • Yiqin Yang, Hao Hu, Wenzhe Li, Siyuan Li, Jun Yang, Qianchuan Zhao, Chongjie Zhang
We show that such lossless primitives can drastically improve the performance of hierarchical policies.
no code implementations • 7 Jun 2022 • Hao Hu, Yiqin Yang, Qianchuan Zhao, Chongjie Zhang
The discount factor, $\gamma$, plays a vital role in improving online RL sample efficiency and estimation accuracy, but the role of the discount factor in offline RL is not well explored.
no code implementations • 21 Mar 2022 • Hao Hu, Marie-José Huguet, Mohamed Siala
Then, we lift the encoding to a MaxSAT model to learn optimal BDDs of limited depth that maximize the number of correctly classified examples.
1 code implementation • ICLR 2022 • Xiaoteng Ma, Yiqin Yang, Hao Hu, Qihan Liu, Jun Yang, Chongjie Zhang, Qianchuan Zhao, Bin Liang
Offline reinforcement learning (RL) shows promise for applying RL to real-world problems by effectively utilizing previously collected data.
1 code implementation • NeurIPS 2021 • Zhizhou Ren, Guangxiang Zhu, Hao Hu, Beining Han, Jianglun Chen, Chongjie Zhang
Double Q-learning is a classical method for reducing overestimation bias, which is caused by taking maximum estimated values in the Bellman operation.
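The overestimation effect described in this snippet can be illustrated with a small simulation. The sketch below is not taken from the paper; it assumes a toy setting in which every action has a true value of 0 and each estimate carries independent Gaussian noise. The function name `estimate_max_bias` and its parameters are hypothetical. It compares the single estimator (maximum over one set of noisy estimates) with the double estimator (select the action with one set, evaluate it with an independent second set).

```python
import random


def estimate_max_bias(num_actions=10, noise_std=1.0, trials=2000, seed=0):
    """Return the average single-estimator and double-estimator values.

    Assumption for illustration: all true action values are 0, so any
    nonzero average is pure estimation bias.
    """
    rng = random.Random(seed)
    single_total = 0.0
    double_total = 0.0
    for _ in range(trials):
        # Two independent sets of noisy value estimates for the same actions.
        q_a = [rng.gauss(0.0, noise_std) for _ in range(num_actions)]
        q_b = [rng.gauss(0.0, noise_std) for _ in range(num_actions)]
        # Single estimator: taking the max of noisy estimates is biased upward.
        single_total += max(q_a)
        # Double estimator: choose the action with q_a, evaluate it with q_b.
        best_action = max(range(num_actions), key=q_a.__getitem__)
        double_total += q_b[best_action]
    return single_total / trials, double_total / trials


if __name__ == "__main__":
    single_bias, double_bias = estimate_max_bias()
    print(f"single estimator average: {single_bias:.3f}")
    print(f"double estimator average: {double_bias:.3f}")
```

In this toy setting the single estimator averages well above the true value of 0, while the double estimator stays close to 0 because the noise used for selection is independent of the noise used for evaluation.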
1 code implementation • 11 Mar 2021 • Hao Hu, Jianing Ye, Guangxiang Zhu, Zhizhou Ren, Chongjie Zhang
Episodic memory-based methods can rapidly latch onto past successful strategies through a non-parametric memory and improve the sample efficiency of traditional reinforcement learning.
no code implementations • 1 Jan 2021 • Jin Zhang, Jianhao Wang, Hao Hu, Tong Chen, Yingfeng Chen, Changjie Fan, Chongjie Zhang
Deep reinforcement learning algorithms generally require large amounts of data to solve a single task.
no code implementations • 17 Dec 2020 • Hao Hu, Xiao Lin, Liang Jie Wong, Qianru Yang, Baile Zhang, Yu Luo
Recent advances in engineered material technologies (e.g., photonic crystals, metamaterials, plasmonics, etc.) provide valuable tools to control Cherenkov radiation.
1 code implementation • 15 Jun 2020 • Jin Zhang, Jianhao Wang, Hao Hu, Tong Chen, Yingfeng Chen, Changjie Fan, Chongjie Zhang
Meta reinforcement learning (meta-RL) extracts knowledge from previous tasks and achieves fast adaptation to new tasks.
1 code implementation • 27 May 2020 • Benjamin M. Althouse, Edward A. Wenger, Joel C. Miller, Samuel V. Scarpino, Antoine Allard, Laurent Hébert-Dufresne, Hao Hu
SARS-CoV-2, the virus that causes COVID-19, has moved rapidly around the globe, infecting millions and killing hundreds of thousands.
no code implementations • 27 Apr 2020 • Chunhua Jia, Lei Zhang, Hui Huang, Weiwei Cai, Hao Hu, Rohan Adivarekar
Multi-label networks with branches have been shown to perform well in both accuracy and speed, but they lack the flexibility to extend dynamically to new labels because re-annotating and retraining are inefficient.
no code implementations • 4 Jun 2019 • Bo Wang, Hao Hu, Caixia Zhang
When the optical center moves on the danger cylinder, the optical centers of the two other solutions of the corresponding P3P problem form a new surface, called the Companion Surface of Danger Cylinder (CSDC), which is characterized by a polynomial equation of degree 12 in the optical center coordinates.
no code implementations • 15 Feb 2019 • Hao Hu, Liqiang Wang, Guo-Jun Qi
Recent advancements in recurrent neural network (RNN) research have demonstrated the superiority of utilizing multiscale structures in learning temporal representations of time series.
no code implementations • 30 Jan 2019 • Bo Wang, Hao Hu, Caixia Zhang
In this work, we show that when the optical center is outside of all the 6 toroids defined by the control point triangle, each positive root of Grunert's quartic equation must correspond to a true solution of the P3P problem, and the corresponding P3P problem cannot have a unique solution: it must have either 2 positive solutions or 4 positive solutions.
no code implementations • 29 Jan 2019 • Bo Wang, Hao Hu, Caixia Zhang
In this work, we provide some new geometric interpretations of the multi-solution phenomenon in the P3P problem. Our main results include: (1) The necessary and sufficient condition for the P3P problem to have a pair of side-sharing solutions is that the two optical centers of the solutions both lie on one of the 3 vertical planes to the base plane of control points; (2) The necessary and sufficient condition for the P3P problem to have a pair of point-sharing solutions is that the two optical centers of the solutions both lie on one of the 3 so-called skewed danger cylinders; (3) If the P3P problem has other solutions in addition to a pair of side-sharing (point-sharing) solutions, these remaining solutions must be a point-sharing (side-sharing) pair.
no code implementations • 16 May 2018 • Yijie Dang, Nan Jiang, Hao Hu, Zhuoxiao Ji, Wenyin Zhang
However, the commonly used classification method, the K-Nearest-Neighbor algorithm, has high complexity because its two main processes, similarity computation and search, are time-consuming.
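The two costly steps named in this snippet can be seen in a plain brute-force K-Nearest-Neighbor classifier. The sketch below is a generic illustration rather than the method of the paper; the function name `knn_predict`, the use of Euclidean distance as the similarity measure, and the majority vote are assumptions made for the example.

```python
import heapq
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Step 1, similarity computation: one distance per training point,
    # which costs O(n * d) for n points of dimension d.
    distances = [math.dist(point, query) for point in train_points]
    # Step 2, search: pick the indices of the k smallest distances,
    # which costs O(n log k) with a bounded heap.
    nearest = heapq.nsmallest(k, range(len(train_points)), key=distances.__getitem__)
    # Majority vote over the labels of the k nearest neighbors.
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
    labels = ["a", "a", "a", "b", "b", "b"]
    print(knn_predict(points, labels, (0.2, 0.2)))
    print(knn_predict(points, labels, (5.5, 5.5)))
```

Both steps scale with the size of the training set for every query, which is why brute-force K-Nearest-Neighbor becomes expensive on large datasets.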
2 code implementations • CVPR 2018 • Guo-Jun Qi, Liheng Zhang, Hao Hu, Marzieh Edraki, Jingdong Wang, Xian-Sheng Hua
In this paper, we present a novel localized Generative Adversarial Net (GAN) to learn on the manifold of real data.
2 code implementations • ICML 2017 • Hao Hu, Guo-Jun Qi
Modeling temporal sequences plays a fundamental role in various modern applications and has drawn increasing attention in the machine learning community.
no code implementations • 11 Aug 2015 • Hao Hu, Hainan Cui
Augmented reality is the art of seamlessly fusing virtual objects into real ones.
no code implementations • 6 Jun 2015 • Jun Ye, Hao Hu, Kai Li, Guo-Jun Qi, Kien A. Hua
With the prevalence of commodity depth cameras, the new paradigm of user interfaces based on 3D motion capture and recognition has dramatically changed the way humans and computers interact.