Search Results for author: Ziyuan Huang

Found 33 papers, 19 papers with code

SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery

no code implementations • 15 Dec 2023 • Xin Guo, Jiangwei Lao, Bo Dang, Yingying Zhang, Lei Yu, Lixiang Ru, Liheng Zhong, Ziyuan Huang, Kang Wu, Dingxiang Hu, Huimei He, Jian Wang, Jingdong Chen, Ming Yang, Yongjun Zhang, Yansheng Li

Prior studies on Remote Sensing Foundation Model (RSFM) reveal immense potential towards a generic model for Earth Observation.

Contrastive Learning Earth Observation +1

Paper
Add Code

Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning

1 code implementation • ICCV 2023 • Zhiwu Qing, Shiwei Zhang, Ziyuan Huang, Yingya Zhang, Changxin Gao, Deli Zhao, Nong Sang

When pre-training on the large-scale Kinetics-710, we achieve 89. 7% on Kinetics-400 with a frozen ViT-L model, which verifies the scalability of DiST.

Transfer Learning Video Recognition

Paper
Code

Towards Real-World Visual Tracking with Temporal Contexts

1 code implementation • 20 Aug 2023 • Ziang Cao, Ziyuan Huang, Liang Pan, Shiwei Zhang, Ziwei Liu, Changhong Fu

To handle those problems, we propose a two-level framework (TCTrack) that can exploit temporal contexts efficiently.

Visual Tracking

153

Paper
Code

Temporally-Adaptive Models for Efficient Video Understanding

1 code implementation • 10 Aug 2023 • Ziyuan Huang, Shiwei Zhang, Liang Pan, Zhiwu Qing, Yingya Zhang, Ziwei Liu, Marcelo H. Ang Jr

Spatial convolutions are extensively used in numerous deep video models.

Ranked #3 on Action Recognition on EPIC-KITCHENS-100 (using extra training data)

Action Classification Action Recognition +1

215

Paper
Code

Rethinking Efficient Tuning Methods from a Unified Perspective

no code implementations • 1 Mar 2023 • Zeyinzi Jiang, Chaojie Mao, Ziyuan Huang, Yiliang Lv, Deli Zhao, Jingren Zhou

The U-Tuning framework can simultaneously encompass existing methods and derive new approaches for parameter-efficient transfer learning, which prove to achieve on-par or better performances on CIFAR-100 and FGVC datasets when compared with existing PETL methods.

Transfer Learning

Paper
Add Code

Physically Plausible Animation of Human Upper Body from a Single Image

no code implementations • 9 Dec 2022 • Ziyuan Huang, Zhengping Zhou, Yung-Yu Chuang, Jiajun Wu, C. Karen Liu

We present a new method for generating controllable, dynamically responsive, and photorealistic human animations.

Paper
Add Code

Progressive Learning without Forgetting

no code implementations • 28 Nov 2022 • Tao Feng, Hangjie Yuan, Mang Wang, Ziyuan Huang, Ang Bian, Jianzhou Zhang

Learning from changing tasks and sequential experience without forgetting the obtained knowledge is a challenging problem for artificial neural networks.

Continual Learning

Paper
Add Code

PVT++: A Simple End-to-End Latency-Aware Visual Tracking Framework

1 code implementation • ICCV 2023 • Bowen Li, Ziyuan Huang, Junjie Ye, Yiming Li, Sebastian Scherer, Hang Zhao, Changhong Fu

Visual object tracking is essential to intelligent robots.

Visual Object Tracking Visual Tracking

Paper
Code

RLIP: Relational Language-Image Pre-training for Human-Object Interaction Detection

3 code implementations • 5 Sep 2022 • Hangjie Yuan, Jianwen Jiang, Samuel Albanie, Tao Feng, Ziyuan Huang, Dong Ni, Mingqian Tang

The task of Human-Object Interaction (HOI) detection targets fine-grained visual parsing of humans interacting with their environment, enabling a broad range of applications.

Ranked #16 on Human-Object Interaction Detection on HICO-DET

Human-Object Interaction Detection Relation +1

Paper
Code

MAR: Masked Autoencoders for Efficient Action Recognition

1 code implementation • 24 Jul 2022 • Zhiwu Qing, Shiwei Zhang, Ziyuan Huang, Xiang Wang, Yuehuan Wang, Yiliang Lv, Changxin Gao, Nong Sang

Inspired by this, we propose propose Masked Action Recognition (MAR), which reduces the redundant computation by discarding a proportion of patches and operating only on a part of the videos.

Ranked #12 on Action Recognition on Something-Something V2

Action Classification Action Recognition +1

Paper
Code

Learning from Untrimmed Videos: Self-Supervised Video Representation Learning with Hierarchical Consistency

no code implementations • CVPR 2022 • Zhiwu Qing, Shiwei Zhang, Ziyuan Huang, Yi Xu, Xiang Wang, Mingqian Tang, Changxin Gao, Rong Jin, Nong Sang

In this work, we aim to learn representations by leveraging more abundant information in untrimmed videos.

Contrastive Learning Representation Learning +1

Paper
Add Code

TCTrack: Temporal Contexts for Aerial Tracking

1 code implementation • CVPR 2022 • Ziang Cao, Ziyuan Huang, Liang Pan, Shiwei Zhang, Ziwei Liu, Changhong Fu

Temporal contexts among consecutive frames are far from being fully utilized in existing visual trackers.

153

Paper
Code

TAda! Temporally-Adaptive Convolutions for Video Understanding

2 code implementations • ICLR 2022 • Ziyuan Huang, Shiwei Zhang, Liang Pan, Zhiwu Qing, Mingqian Tang, Ziwei Liu, Marcelo H. Ang Jr

This work presents Temporally-Adaptive Convolutions (TAdaConv) for video understanding, which shows that adaptive weight calibration along the temporal dimension is an efficient way to facilitate modelling complex temporal dynamics in videos.

Ranked #67 on Action Recognition on Something-Something V2 (using extra training data)

Action Classification Action Recognition +2

215

Paper
Code

Support-Set Based Cross-Supervision for Video Grounding

no code implementations • ICCV 2021 • Xinpeng Ding, Nannan Wang, Shiwei Zhang, De Cheng, Xiaomeng Li, Ziyuan Huang, Mingqian Tang, Xinbo Gao

The contrastive objective aims to learn effective representations by contrastive learning, while the caption objective can train a powerful video encoder supervised by texts.

Contrastive Learning Video Grounding

Paper
Add Code

ParamCrop: Parametric Cubic Cropping for Video Contrastive Learning

1 code implementation • 24 Aug 2021 • Zhiwu Qing, Ziyuan Huang, Shiwei Zhang, Mingqian Tang, Changxin Gao, Marcelo H. Ang Jr, Rong Jin, Nong Sang

The visualizations show that ParamCrop adaptively controls the center distance and the IoU between two augmented views, and the learned change in the disparity along the training process is beneficial to learning a strong representation.

Contrastive Learning

215

Paper
Code

Exploring Stronger Feature for Temporal Action Localization

no code implementations • 24 Jun 2021 • Zhiwu Qing, Xiang Wang, Ziyuan Huang, Yutong Feng, Shiwei Zhang, Jianwen Jiang, Mingqian Tang, Changxin Gao, Nong Sang

Temporal action localization aims to localize starting and ending time with action category.

Temporal Action Localization

Paper
Add Code

Proposal Relation Network for Temporal Action Detection

1 code implementation • 20 Jun 2021 • Xiang Wang, Zhiwu Qing, Ziyuan Huang, Yutong Feng, Shiwei Zhang, Jianwen Jiang, Mingqian Tang, Changxin Gao, Nong Sang

We calculate the detection results by assigning the proposals with corresponding classification results.

Ranked #2 on Temporal Action Localization on ActivityNet-1.3 (using extra training data)

Action Classification Action Detection +3

Paper
Code

Weakly-Supervised Temporal Action Localization Through Local-Global Background Modeling

no code implementations • 20 Jun 2021 • Xiang Wang, Zhiwu Qing, Ziyuan Huang, Yutong Feng, Shiwei Zhang, Jianwen Jiang, Mingqian Tang, Yuanjie Shao, Nong Sang

Then our proposed Local-Global Background Modeling Network (LGBM-Net) is trained to localize instances by using only video-level labels based on Multi-Instance Learning (MIL).

Weakly-supervised Learning Weakly-supervised Temporal Action Localization +1

Paper
Add Code

Relation Modeling in Spatio-Temporal Action Localization

no code implementations • 15 Jun 2021 • Yutong Feng, Jianwen Jiang, Ziyuan Huang, Zhiwu Qing, Xiang Wang, Shiwei Zhang, Mingqian Tang, Yue Gao

This paper presents our solution to the AVA-Kinetics Crossover Challenge of ActivityNet workshop at CVPR 2021.

Ranked #4 on Spatio-Temporal Action Localization on AVA-Kinetics (using extra training data)

Action Detection Relation +2

Paper
Add Code

A Stronger Baseline for Ego-Centric Action Detection

1 code implementation • 13 Jun 2021 • Zhiwu Qing, Ziyuan Huang, Xiang Wang, Yutong Feng, Shiwei Zhang, Jianwen Jiang, Mingqian Tang, Changxin Gao, Marcelo H. Ang Jr, Nong Sang

This technical report analyzes an egocentric video action detection method we used in the 2021 EPIC-KITCHENS-100 competition hosted in CVPR2021 Workshop.

Action Detection

215

Paper
Code

Towards Training Stronger Video Vision Transformers for EPIC-KITCHENS-100 Action Recognition

1 code implementation • 9 Jun 2021 • Ziyuan Huang, Zhiwu Qing, Xiang Wang, Yutong Feng, Shiwei Zhang, Jianwen Jiang, Zhurong Xia, Mingqian Tang, Nong Sang, Marcelo H. Ang Jr

In this paper, we present empirical results for training a stronger video vision transformer on the EPIC-KITCHENS-100 Action Recognition dataset.

Action Recognition Point Cloud Classification +1

215

Paper
Code

Multi-Scale Feature Aggregation by Cross-Scale Pixel-to-Region Relation Operation for Semantic Segmentation

no code implementations • 3 Jun 2021 • Yechao Bai, Ziyuan Huang, Lyuyu Shen, Hongliang Guo, Marcelo H. Ang Jr, Daniela Rus

Experiment results on two challenging datasets Cityscapes and COCO demonstrate that the RSP head performs competitively on both semantic segmentation and panoptic segmentation with high efficiency.

Panoptic Segmentation Relation +1

Paper
Add Code

Self-supervised Motion Learning from Static Images

1 code implementation • CVPR 2021 • Ziyuan Huang, Shiwei Zhang, Jianwen Jiang, Mingqian Tang, Rong Jin, Marcelo Ang

We furthermore introduce a static mask in pseudo motions to create local motion patterns, which forces the model to additionally locate notable motion areas for the correct classification. We demonstrate that MoSI can discover regions with large motion even without fine-tuning on the downstream datasets.

Action Recognition Self-Supervised Learning

215

Paper
Code

Self-Supervised Video Representation Learning with Constrained Spatiotemporal Jigsaw

no code implementations • 1 Jan 2021 • Yuqi Huo, Mingyu Ding, Haoyu Lu, Zhiwu Lu, Tao Xiang, Ji-Rong Wen, Ziyuan Huang, Jianwen Jiang, Shiwei Zhang, Mingqian Tang, Songfang Huang, Ping Luo

With the constrained jigsaw puzzles, instead of solving them directly, which could still be extremely hard, we carefully design four surrogate tasks that are more solvable but meanwhile still ensure that the learned representation is sensitive to spatiotemporal continuity at both the local and global levels.

Representation Learning

Paper
Add Code

A Simple Baseline for Pose Tracking in Videos of Crowded Scenes

no code implementations • 16 Oct 2020 • Li Yuan, Shuning Chang, Ziyuan Huang, Yichen Zhou, Yunpeng Chen, Xuecheng Nie, Francis E. H. Tay, Jiashi Feng, Shuicheng Yan

This paper presents our solution to ACM MM challenge: Large-scale Human-centric Video Analysis in Complex Events\cite{lin2020human}; specifically, here we focus on Track3: Crowd Pose Tracking in Complex Events.

Multi-Object Tracking Optical Flow Estimation +1

Paper
Add Code

Towards Accurate Human Pose Estimation in Videos of Crowded Scenes

no code implementations • 16 Oct 2020 • Li Yuan, Shuning Chang, Xuecheng Nie, Ziyuan Huang, Yichen Zhou, Yunpeng Chen, Jiashi Feng, Shuicheng Yan

In this paper, we focus on improving human pose estimation in videos of crowded scenes from the perspectives of exploiting temporal context and collecting new data.

Optical Flow Estimation Pose Estimation

Paper
Add Code

Toward Accurate Person-level Action Recognition in Videos of Crowded Scenes

no code implementations • 16 Oct 2020 • Li Yuan, Yichen Zhou, Shuning Chang, Ziyuan Huang, Yunpeng Chen, Xuecheng Nie, Tao Wang, Jiashi Feng, Shuicheng Yan

Prior works always fail to deal with this problem in two aspects: (1) lacking utilizing information of the scenes; (2) lacking training data in the crowd and complex scenes.

Action Recognition In Videos Semantic Segmentation

Paper
Add Code

Toward Hierarchical Self-Supervised Monocular Absolute Depth Estimation for Autonomous Driving Applications

1 code implementation • 12 Apr 2020 • Feng Xue, Guirong Zhuo, Ziyuan Huang, Wufei Fu, Zhuoyue Wu, Marcelo H. Ang Jr

Our contributions are twofold: a) a novel dense connected prediction (DCP) layer is proposed to provide better object-level depth estimation and b) specifically for autonomous driving scenarios, dense geometrical constrains (DGC) is introduced so that precise scale factor can be recovered without additional cost for autonomous vehicles.

Ranked #53 on Monocular Depth Estimation on KITTI Eigen split

Autonomous Driving Monocular Depth Estimation +1

Paper
Code

AutoTrack: Towards High-Performance Visual Tracking for UAV with Automatic Spatio-Temporal Regularization

1 code implementation • CVPR 2020 • Yiming Li, Changhong Fu, Fangqiang Ding, Ziyuan Huang, Geng Lu

Considerable tests in the indoor practical scenarios have proven the effectiveness and versatility of our localization method.

Visual Tracking

Paper
Code

Keyfilter-Aware Real-Time UAV Object Tracking

1 code implementation • 11 Mar 2020 • Yiming Li, Changhong Fu, Ziyuan Huang, Yinqiang Zhang, Jia Pan

Correlation filter-based tracking has been widely applied in unmanned aerial vehicle (UAV) with high efficiency.

Object Object Tracking +2

Paper
Code

Augmented Memory for Correlation Filters in Real-Time UAV Tracking

1 code implementation • 24 Sep 2019 • Yiming Li, Changhong Fu, Fangqiang Ding, Ziyuan Huang, Jia Pan

The outstanding computational efficiency of discriminative correlation filter (DCF) fades away with various complicated improvements.

Computational Efficiency

Paper
Code

Boundary Effect-Aware Visual Tracking for UAV with Online Enhanced Background Learning and Multi-Frame Consensus Verification

1 code implementation • 10 Aug 2019 • Changhong Fu, Ziyuan Huang, Yiming Li, Ran Duan, Peng Lu

Meanwhile, convolutional features are extracted to provide a more comprehensive representation of the object.

Object Visual Object Tracking +1

Paper
Code

Learning Aberrance Repressed Correlation Filters for Real-Time UAV Tracking

1 code implementation • ICCV 2019 • Ziyuan Huang, Changhong Fu, Yiming Li, Fuling Lin, Peng Lu

Traditional framework of discriminative correlation filters (DCF) is often subject to undesired boundary effects.

Object Tracking

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.