no code implementations • 6 Jun 2023 • Minting Pan, Yitao Zheng, Wendong Zhang, Yunbo Wang, Xiaokang Yang
Pretraining RL models on offline video datasets is a promising way to improve their training efficiency in online tasks, but challenging due to the inherent mismatch in tasks, dynamics, and behaviors across domains.
no code implementations • 24 May 2023 • Qi Wang, Junming Yang, Yunbo Wang, Xin Jin, Wenjun Zeng, Xiaokang Yang
Training visual reinforcement learning (RL) models on offline datasets is challenging due to overfitting in representation learning and overestimation in the value function.
no code implementations • 30 Apr 2023 • Siyu Gao, Yanpeng Zhao, Yunbo Wang, Xiaokang Yang
Understanding the compositional dynamics of multiple objects in unsupervised visual environments is challenging, and existing object-centric representation learning methods often ignore 3D consistency in scene decomposition.
1 code implementation • 27 Mar 2023 • Minting Pan, Xiangming Zhu, Yunbo Wang, Xiaokang Yang
On top of our previous work, we further consider the sparse dependencies between controllable and noncontrollable states, address the training collapse problem of state decoupling, and validate our approach in transfer learning setups.
1 code implementation • 12 Mar 2023 • Wendong Zhang, Geng Chen, Xiangming Zhu, Siyu Gao, Yunbo Wang, Xiaokang Yang
In this paper, we present a new continual learning approach for visual dynamics modeling and explore its efficacy in visual control and forecasting.
no code implementations • 12 Mar 2023 • Haijian Chen, Wendong Zhang, Yunbo Wang, Xiaokang Yang
Masked image modeling is a promising self-supervised learning method for visual data.
2 code implementations • 27 May 2022 • Minting Pan, Xiangming Zhu, Yunbo Wang, Xiaokang Yang
First, by optimizing the inverse dynamics, we encourage the world model to learn controllable and noncontrollable sources of spatiotemporal changes on isolated state transition branches.
no code implementations • CVPR 2021 • Chao Huang, Zhangjie Cao, Yunbo Wang, Jianmin Wang, Mingsheng Long
It is a challenging problem due to the substantial geometry shift from simulated to real data, such that most existing 3D models underperform due to overfitting the complete geometries in the source domain.
1 code implementation • CVPR 2022 • Geng Chen, Wendong Zhang, Han Lu, Siyu Gao, Yunbo Wang, Mingsheng Long, Xiaokang Yang
Can we develop predictive learning algorithms that can deal with more realistic, non-stationary physical environments?
no code implementations • 3 Mar 2022 • Shanyan Guan, Huayu Deng, Yunbo Wang, Xiaokang Yang
Deep learning has shown great potential for modeling the physical dynamics of complex particle systems such as fluids.
1 code implementation • 8 Dec 2021 • Wendong Zhang, Yunbo Wang, Bingbing Ni, Xiaokang Yang
We train the prior learner and the image generator as a unified model without any post-processing.
no code implementations • NeurIPS 2021 • Zeng Yihan, Chunwei Wang, Yunbo Wang, Hang Xu, Chaoqiang Ye, Zhen Yang, Chao Ma
First, 3D-CoCo is inspired by our observation that the bird-eye-view (BEV) features are more transferable than low-level geometry features.
1 code implementation • 7 Nov 2021 • Shanyan Guan, Jingwei Xu, Michelle Z. He, Yunbo Wang, Bingbing Ni, Xiaokang Yang
We consider a new problem of adapting a human mesh reconstruction model to out-of-domain streaming videos, where the performance of existing SMPL-based models is significantly affected by the distribution shift represented by different camera parameters, bone lengths, backgrounds, and occlusions.
Ranked #1 on 3D Absolute Human Pose Estimation on Surreal
1 code implementation • 8 Oct 2021 • Zhiyu Yao, Yunbo Wang, Haixu Wu, Jianmin Wang, Mingsheng Long
To this end, we propose ModeRNN, which introduces a novel method to learn structured hidden representations between recurrent states.
1 code implementation • 14 Jun 2021 • Wendong Zhang, Junwei Zhu, Ying Tai, Yunbo Wang, Wenqing Chu, Bingbing Ni, Chengjie Wang, Xiaokang Yang
Based on the semantic priors, we further propose a context-aware image inpainting model, which adaptively integrates global semantics and local features in a unified image generator.
1 code implementation • CVPR 2021 • Shanyan Guan, Jingwei Xu, Yunbo Wang, Bingbing Ni, Xiaokang Yang
This paper considers a new problem of adapting a pre-trained model of human mesh reconstruction to out-of-domain streaming videos.
Ranked #29 on 3D Human Pose Estimation on 3DPW
3 code implementations • 17 Mar 2021 • Yunbo Wang, Haixu Wu, Jianjin Zhang, Zhifeng Gao, Jianmin Wang, Philip S. Yu, Mingsheng Long
This paper models these structures by presenting PredRNN, a new recurrent network, in which a pair of memory cells are explicitly decoupled, operate in nearly independent transition manners, and finally form unified representations of the complex environment.
Ranked #1 on Video Prediction on KTH (Cond metric)
2 code implementations • 14 Dec 2020 • Jian Liang, Dapeng Hu, Yunbo Wang, Ran He, Jiashi Feng
Furthermore, we propose a new labeling transfer strategy, which separates the target data into two splits based on the confidence of predictions (labeling information), and then employs semi-supervised learning to improve the accuracy of less-confident predictions in the target domain.
1 code implementation • 4 Dec 2020 • Jingwei Xu, Jianjin Zhang, Zhiyu Yao, Yunbo Wang
This technical report presents a solution for the 2020 Traffic4Cast Challenge.
1 code implementation • ICML 2020 • Zhiyu Yao, Yunbo Wang, Mingsheng Long, Jian-Min Wang
This paper explores a new research problem of unsupervised transfer learning across multiple spatiotemporal prediction tasks.
no code implementations • CVPR 2020 • Yunbo Wang, Jiajun Wu, Mingsheng Long, Joshua B. Tenenbaum
It is also challenging because it involves two levels of uncertainty: the perceptual uncertainty from noisy observations and the dynamics uncertainty in forward modeling.
1 code implementation • ECCV 2020 • Jian Liang, Yunbo Wang, Dapeng Hu, Ran He, Jiashi Feng
On one hand, negative transfer results in misclassification of target samples to the classes only present in the source domain.
Ranked #2 on Partial Domain Adaptation on DomainNet
1 code implementation • 8 Dec 2019 • Zhiyu Yao, Yunbo Wang, Jianmin Wang, Philip S. Yu, Mingsheng Long
This paper introduces video domain generalization where most video classification networks degenerate due to the lack of exposure to the target domains of divergent distributions.
1 code implementation • 28 Sep 2019 • Yunbo Wang, Bo Liu, Jiajun Wu, Yuke Zhu, Simon S. Du, Li Fei-Fei, Joshua B. Tenenbaum
A major difficulty of solving continuous POMDPs is to infer the multi-modal distribution of the unobserved true states and to make the planning algorithm dependent on the perceived uncertainty.
2 code implementations • 2 Aug 2019 • Kun Han, Junwen Chen, HUI ZHANG, Haiyang Xu, Yiping Peng, Yun Wang, Ning Ding, Hui Deng, Yonghu Gao, Tingwei Guo, Yi Zhang, Yahao He, Baochang Ma, Yu-Long Zhou, Kangli Zhang, Chao Liu, Ying Lyu, Chenxi Wang, Cheng Gong, Yunbo Wang, Wei Zou, Hui Song, Xiangang Li
In this paper we present DELTA, a deep learning based language technology platform.
Ranked #3 on Text Classification on Yahoo! Answers
no code implementations • IEEE International Conference on Multimedia and Expo (ICME) 2019 • Jianjin Zhang, Yunbo Wang, Mingsheng Long, Wang Jianmin, Philip S Yu
First, we propose a new RNN architecture for modeling the deterministic dynamics, which updates hidden states along a z-order curve to enhance the consistency of the features of mirrored layers.
Ranked #1 on Video Prediction on KTH (Cond metric)
3 code implementations • ICLR 2019 • Yunbo Wang, Lu Jiang, Ming-Hsuan Yang, Li-Jia Li, Mingsheng Long, Li Fei-Fei
We first evaluate the E3D-LSTM network on widely used future video prediction datasets and achieve state-of-the-art performance.
Ranked #1 on Video Prediction on KTH (Cond metric)
no code implementations • CVPR 2017 • Yunbo Wang, Mingsheng Long, Jian-Min Wang, Philip S. Yu
From the technical perspective, we introduce the spatiotemporal compact bilinear operator into video analysis tasks.
no code implementations • 20 Nov 2018 • Zhiyu Yao, Yunbo Wang, Mingsheng Long, Jian-Min Wang, Philip S. Yu, Jiaguang Sun
Rev2Net is shown to be effective on the classic action recognition task.
3 code implementations • CVPR 2019 • Yunbo Wang, Jianjin Zhang, Hongyu Zhu, Mingsheng Long, Jian-Min Wang, Philip S. Yu
Natural spatiotemporal processes can be highly non-stationary in many ways, e.g., the low-level non-stationarity such as spatial correlations or temporal dependencies of local pixel values, and the high-level variations such as the accumulation, deformation, or dissipation of radar echoes in precipitation forecasting.
Ranked #5 on Video Prediction on Human3.6M
1 code implementation • 17 Oct 2018 • Lingxiao He, Zhenan Sun, Yuhao Zhu, Yunbo Wang
Biometric recognition on partially captured targets is challenging because only a few partial observations of the objects are available for matching.
1 code implementation • Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18) 2018 • Ziru Xu, Yunbo Wang, Mingsheng Long, Jian-Min Wang
Predicting future frames in videos remains an unsolved and challenging problem.
Ranked #3 on Pose Prediction on Filtered NTU RGB+D
8 code implementations • ICML 2018 • Yunbo Wang, Zhifeng Gao, Mingsheng Long, Jian-Min Wang, Philip S. Yu
We present PredRNN++, an improved recurrent network for video predictive learning.
Ranked #1 on Video Prediction on KTH (Cond metric)
no code implementations • NeurIPS 2017 • Yunbo Wang, Mingsheng Long, Jian-Min Wang, Zhifeng Gao, Philip S. Yu
The core of this network is a new Spatiotemporal LSTM (ST-LSTM) unit that extracts and memorizes spatial and temporal representations simultaneously.
Ranked #6 on Video Prediction on Human3.6M