no code implementations • 29 Mar 2024 • Runhao Zeng, Xiaoyong Chen, Jiaming Liang, Huisi Wu, Guangzhong Cao, Yong Guo
In this paper, we extensively analyze the robustness of seven leading TAD methods and obtain some interesting findings: 1) Existing methods are particularly vulnerable to temporal corruptions, and end-to-end methods are often more susceptible than those with a pre-trained feature extractor; 2) Vulnerability mainly comes from localization error rather than classification error; 3) When corruptions occur in the middle of an action instance, TAD models tend to yield the largest performance drop.
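The third finding — that mid-action corruptions hurt most — can be illustrated with a minimal sketch of a temporal corruption applied to a clip's frame features. This is an illustrative perturbation only, not the paper's actual benchmark suite; the function name and the 30% ratio are assumptions.

```python
import numpy as np

def corrupt_middle(frames: np.ndarray, start: int, end: int, ratio: float = 0.3) -> np.ndarray:
    """Zero out a contiguous block of frames centred on the middle of an
    action instance spanning [start, end). Illustrative corruption only."""
    out = frames.copy()
    length = end - start
    span = max(1, int(length * ratio))      # number of frames to corrupt
    mid = start + length // 2
    lo = max(start, mid - span // 2)
    hi = min(end, lo + span)
    out[lo:hi] = 0.0                        # simulate lost/occluded frames
    return out

# toy clip: 100 frames of 8-dim features, action occupies frames 20..80
clip = np.random.rand(100, 8)
corrupted = corrupt_middle(clip, 20, 80, ratio=0.3)
```

A TAD model's localization output on `corrupted` versus `clip` would then quantify the robustness gap the paper measures.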
no code implementations • 10 Dec 2023 • Kunyang Lin, Yufeng Wang, Peihao Chen, Runhao Zeng, Siyuan Zhou, Mingkui Tan, Chuang Gan
In this paper, we propose a new approach that enables agents to learn whether their behaviors should be consistent with those of other agents, by utilizing intrinsic rewards to learn the optimal policy for each agent.
no code implementations • 15 Aug 2023 • Peihao Chen, Xinyu Sun, Hongyan Zhi, Runhao Zeng, Thomas H. Li, Gaowen Liu, Mingkui Tan, Chuang Gan
We study the task of zero-shot vision-and-language navigation (ZS-VLN), a practical yet challenging problem in which an agent learns to navigate following a path described by language instructions without requiring any path-instruction annotation data.
1 code implementation • 14 Oct 2022 • Peihao Chen, Dongyu Ji, Kunyang Lin, Runhao Zeng, Thomas H. Li, Mingkui Tan, Chuang Gan
To achieve accurate and efficient navigation, it is critical to build a map that accurately represents both spatial location and the semantic information of the environment objects.
no code implementations • 1 Dec 2021 • Runhao Zeng, Wenbing Huang, Mingkui Tan, Yu Rong, Peilin Zhao, Junzhou Huang, Chuang Gan
To this end, we propose a general graph convolutional module (GCM) that can be easily plugged into existing action localization methods, including two-stage and one-stage paradigms.
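The plug-in idea can be sketched as a single graph-convolution layer that takes existing features plus an adjacency matrix and returns refined features of the same shape, so it can slot into a two-stage or one-stage pipeline. This is a generic GCN layer for illustration; the class name and design are assumptions, not the paper's exact GCM.

```python
import numpy as np

class GraphConvModule:
    """One-layer graph convolution, H' = ReLU(A_norm @ H @ W), shown only to
    illustrate the plug-in pattern; not the paper's actual GCM design."""

    def __init__(self, in_dim: int, out_dim: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.1, size=(in_dim, out_dim))

    def __call__(self, H: np.ndarray, A: np.ndarray) -> np.ndarray:
        # symmetric normalisation: A_norm = D^{-1/2} (A + I) D^{-1/2}
        A_hat = A + np.eye(A.shape[0])
        d = A_hat.sum(axis=1)
        D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
        A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt
        return np.maximum(A_norm @ H @ self.W, 0.0)   # ReLU

# plugging into an existing pipeline: refine N proposal features
N, F = 5, 16
feats = np.random.rand(N, F)
adj = (np.random.rand(N, N) > 0.5).astype(float)   # e.g. from temporal overlap
adj = np.maximum(adj, adj.T)                       # keep it symmetric
refined = GraphConvModule(F, F)(feats, adj)        # same shape, relation-aware
```

Because input and output shapes match, the module can be dropped between a feature extractor and a localization head without changing either.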
Ranked #2 on Temporal Action Localization on THUMOS’14 (mAP IOU@0.1 metric)
no code implementations • 22 Nov 2020 • Yihan Zheng, Zhiquan Wen, Mingkui Tan, Runhao Zeng, Qi Chen, YaoWei Wang, Qi Wu
Moreover, to capture the complex logic in a query, we construct a relational graph to represent the visual objects and their relationships, and propose a multi-step reasoning method to progressively understand the complex logic.
Ranked #2 on Referring Expression Comprehension on CLEVR-Ref+
1 code implementation • 27 Oct 2020 • Peihao Chen, Deng Huang, Dongliang He, Xiang Long, Runhao Zeng, Shilei Wen, Mingkui Tan, Chuang Gan
We study unsupervised video representation learning that seeks to learn both motion and appearance features from unlabeled video only, which can be reused for downstream tasks such as action recognition.
Ranked #11 on Self-Supervised Action Recognition on UCF101
1 code implementation • 7 Aug 2020 • Deng Huang, Peihao Chen, Runhao Zeng, Qing Du, Mingkui Tan, Chuang Gan
In this work, we propose to represent the contents in the video as a location-aware graph by incorporating the location information of an object into the graph construction.
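One plausible reading of "incorporating the location information of an object into the graph construction" is to weight edges by the spatial distance between object boxes, so the graph encodes scene layout. The sketch below is illustrative only; the Gaussian-kernel edge weighting and the `sigma` value are assumptions, not the paper's construction.

```python
import numpy as np

def location_aware_adjacency(boxes: np.ndarray, sigma: float = 0.5) -> np.ndarray:
    """Edge weights between detected objects that decay with the distance
    between box centres (hypothetical construction for illustration)."""
    centres = np.stack([(boxes[:, 0] + boxes[:, 2]) / 2,
                        (boxes[:, 1] + boxes[:, 3]) / 2], axis=1)
    diff = centres[:, None, :] - centres[None, :, :]
    dist2 = (diff ** 2).sum(-1)
    A = np.exp(-dist2 / (2 * sigma ** 2))   # Gaussian kernel on centre distance
    np.fill_diagonal(A, 0.0)                # no self-loops
    return A

# three objects in normalised [x1, y1, x2, y2] coordinates
boxes = np.array([[0.10, 0.1, 0.30, 0.3],
                  [0.15, 0.1, 0.35, 0.3],   # close to the first object
                  [0.70, 0.7, 0.90, 0.9]])  # far away
A = location_aware_adjacency(boxes)
# nearby objects get a stronger edge than distant ones: A[0, 1] > A[0, 2]
```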
1 code implementation • CVPR 2020 • Runhao Zeng, Haoming Xu, Wenbing Huang, Peihao Chen, Mingkui Tan, Chuang Gan
The key idea of this paper is to use the distance from each frame within the ground-truth segment to the starting (ending) frame as dense supervision to improve video grounding accuracy.
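The dense-supervision idea can be sketched by computing, for every frame inside the ground-truth segment, its distance to the segment's start and end frames, giving a training target at every in-segment frame rather than only at the two boundaries. This is one illustrative reading; the function and the -1 out-of-segment marker are assumptions, not the authors' exact loss formulation.

```python
import numpy as np

def dense_boundary_targets(num_frames: int, gt_start: int, gt_end: int):
    """Per-frame distances to the ground-truth start and end frames for
    frames inside [gt_start, gt_end]; -1 marks out-of-segment frames
    (hypothetical encoding for illustration)."""
    idx = np.arange(num_frames)
    inside = (idx >= gt_start) & (idx <= gt_end)
    d_start = np.where(inside, idx - gt_start, -1)   # distance to start frame
    d_end = np.where(inside, gt_end - idx, -1)       # distance to end frame
    return d_start, d_end

d_s, d_e = dense_boundary_targets(10, gt_start=3, gt_end=7)
# frames 3..7 get distances 0..4 to the start and 4..0 to the end
```

A regression head trained against `d_s` and `d_e` receives gradient from every in-segment frame, which is the sense in which the supervision is "dense".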
no code implementations • 31 Mar 2020 • Chendi Rao, JieZhang Cao, Runhao Zeng, Qi Chen, Huazhu Fu, Yanwu Xu, Mingkui Tan
In this paper, we aim to review various adversarial attack and defense methods on chest X-rays.
1 code implementation • ICCV 2019 • Runhao Zeng, Wenbing Huang, Mingkui Tan, Yu Rong, Peilin Zhao, Junzhou Huang, Chuang Gan
Then we apply the GCNs over the graph to model the relations among different proposals and learn powerful representations for the action classification and localization.
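Before the GCNs can model relations among proposals, a graph over the proposals is needed. A common way to build one, sketched below, connects proposals whose temporal overlap (tIoU) exceeds a threshold; the 0.2 threshold and function names are assumptions for illustration, not the paper's exact graph construction.

```python
import numpy as np

def temporal_iou(p: np.ndarray, q: np.ndarray) -> float:
    """tIoU between two 1-D proposals given as (start, end) in seconds."""
    inter = max(0.0, min(p[1], q[1]) - max(p[0], q[0]))
    union = (p[1] - p[0]) + (q[1] - q[0]) - inter
    return inter / union if union > 0 else 0.0

def proposal_graph(proposals: np.ndarray, thresh: float = 0.2) -> np.ndarray:
    """Connect proposals whose temporal overlap exceeds a (hypothetical)
    threshold, weighting each edge by the overlap itself."""
    n = len(proposals)
    A = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            iou = temporal_iou(proposals[i], proposals[j])
            if iou > thresh:
                A[i, j] = A[j, i] = iou
    return A

props = np.array([[0.0, 3.0], [1.0, 4.0], [10.0, 12.0]])
A = proposal_graph(props)
# overlapping proposals 0 and 1 are connected; proposal 2 stays isolated
```

A GCN applied over `A` then lets each proposal's representation aggregate context from its temporally related neighbours before classification and boundary refinement.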
Ranked #4 on Temporal Action Localization on THUMOS’14 (mAP IOU@0.1 metric)
no code implementations • 21 Jun 2019 • Fengda Zhu, Xiaojun Chang, Runhao Zeng, Mingkui Tan
We first develop an unsupervised diversity exploration method to learn task-specific skills using an unsupervised objective.