Search Results for author: Yuetian Weng

Found 5 papers, 3 papers with code

LongVLM: Efficient Long Video Understanding via Large Language Models

1 code implementation • 4 Apr 2024 • Yuetian Weng, Mingfei Han, Haoyu He, Xiaojun Chang, Bohan Zhuang

In this way, we encode video representations that incorporate both local and global information, enabling the LLM to generate comprehensive responses for long-term videos.

Question Answering, Video Question Answering +1

Mask Propagation for Efficient Video Semantic Segmentation

1 code implementation • NeurIPS 2023 • Yuetian Weng, Mingfei Han, Haoyu He, Mingjie Li, Lina Yao, Xiaojun Chang, Bohan Zhuang

By reusing predictions from key frames, we circumvent the need to process a large volume of video frames individually with resource-intensive segmentors, alleviating temporal redundancy and significantly reducing computational costs.

Semantic Segmentation, Video Semantic Segmentation

A Survey on Efficient Training of Transformers

no code implementations • 2 Feb 2023 • Bohan Zhuang, Jing Liu, Zizheng Pan, Haoyu He, Yuetian Weng, Chunhua Shen

Recent advances in Transformers have come with a huge requirement on computing resources, highlighting the importance of developing efficient training techniques to make Transformer training faster, at lower cost, and to higher accuracy by the efficient use of computation and memory resources.

An Efficient Spatio-Temporal Pyramid Transformer for Action Detection

no code implementations • 21 Jul 2022 • Yuetian Weng, Zizheng Pan, Mingfei Han, Xiaojun Chang, Bohan Zhuang

The task of action detection aims at deducing both the action category and localization of the start and end moment for each action instance in a long, untrimmed video.

Action Detection, Video Understanding
