Search Results for author: Bo He

Found 20 papers, 8 papers with code

MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding

1 code implementation • 8 Apr 2024 • Bo He, Hengduo Li, Young Kyun Jang, Menglin Jia, Xuefei Cao, Ashish Shah, Abhinav Shrivastava, Ser-Nam Lim

However, existing LLM-based large multimodal models (e.g., Video-LLaMA, VideoChat) can only take in a limited number of frames for short video understanding.

Question Answering, Video Captioning (+4 more)

OmniVid: A Generative Framework for Universal Video Understanding

1 code implementation • 26 Mar 2024 • Junke Wang, Dongdong Chen, Chong Luo, Bo He, Lu Yuan, Zuxuan Wu, Yu-Gang Jiang

The core of video understanding tasks, such as recognition, captioning, and tracking, is to automatically detect objects or actions in a video and analyze their temporal evolution.

Action Recognition, Dense Video Captioning (+4 more)

To See is to Believe: Prompting GPT-4V for Better Visual Instruction Tuning

2 code implementations • 13 Nov 2023 • Junke Wang, Lingchen Meng, Zejia Weng, Bo He, Zuxuan Wu, Yu-Gang Jiang

Existing visual instruction tuning methods typically prompt large language models with textual descriptions to generate instruction-following data.

Instruction Following, Visual Question Answering

Towards Scalable Neural Representation for Diverse Videos

no code implementations • CVPR 2023 • Bo He, Xitong Yang, Hanyu Wang, Zuxuan Wu, Hao Chen, Shuaiyi Huang, Yixuan Ren, Ser-Nam Lim, Abhinav Shrivastava

Implicit neural representations (INR) have gained increasing attention in representing 3D scenes and images, and have been recently applied to encode videos (e.g., NeRV, E-NeRV).

Action Recognition, Video Compression

CNeRV: Content-adaptive Neural Representation for Visual Data

no code implementations • 18 Nov 2022 • Hao Chen, Matt Gwilliam, Bo He, Ser-Nam Lim, Abhinav Shrivastava

We match the performance of NeRV, a state-of-the-art implicit neural representation, on the reconstruction task for frames seen during training, while far surpassing it on frames that are skipped during training (unseen images).

Data Compression

Learning Semantic Correspondence with Sparse Annotations

1 code implementation • 15 Aug 2022 • Shuaiyi Huang, Luyu Yang, Bo He, Songyang Zhang, Xuming He, Abhinav Shrivastava

In this paper, we aim to address the challenge of label sparsity in semantic correspondence by enriching supervision signals from sparse keypoint annotations.

Denoising, Semantic correspondence

ColdGuess: A General and Effective Relational Graph Convolutional Network to Tackle Cold Start Cases

no code implementations • 24 May 2022 • Bo He, Xiang Song, Vincent Gao, Christos Faloutsos

It outperforms LightGBM by up to 34 pcp in ROC-AUC in a cold-start case where a new seller sells a new product.

ASM-Loc: Action-aware Segment Modeling for Weakly-Supervised Temporal Action Localization

1 code implementation • CVPR 2022 • Bo He, Xitong Yang, Le Kang, Zhiyu Cheng, Xin Zhou, Abhinav Shrivastava

Without the boundary information of action segments, existing methods mostly rely on multiple instance learning (MIL), where the predictions of unlabeled instances (i.e., video snippets) are supervised by classifying labeled bags (i.e., untrimmed videos).

Weakly Supervised Temporal Action Localization

NeRV: Neural Representations for Videos

3 code implementations • NeurIPS 2021 • Hao Chen, Bo He, Hanyu Wang, Yixuan Ren, Ser-Nam Lim, Abhinav Shrivastava

In contrast, with NeRV, we can use any neural network compression method as a proxy for video compression, and achieve comparable performance to traditional frame-based video compression approaches (H.264, HEVC, etc.).

Denoising, Neural Network Compression (+3 more)

Feature Combination Meets Attention: Baidu Soccer Embeddings and Transformer based Temporal Detection

2 code implementations • 28 Jun 2021 • Xin Zhou, Le Kang, Zhiyu Cheng, Bo He, Jingyu Xin

With rapidly evolving internet technologies and emerging tools, sports-related videos generated online are increasing at an unprecedented pace.

Action Recognition, Action Spotting (+3 more)

GTA: Global Temporal Attention for Video Action Understanding

no code implementations • 15 Dec 2020 • Bo He, Xitong Yang, Zuxuan Wu, Hao Chen, Ser-Nam Lim, Abhinav Shrivastava

To this end, we introduce Global Temporal Attention (GTA), which performs global temporal attention on top of spatial attention in a decoupled manner.

Action Recognition, Action Understanding (+1 more)

Deep Interactive Reinforcement Learning for Path Following of Autonomous Underwater Vehicle

no code implementations • 10 Jan 2020 • Qilei Zhang, Jinying Lin, Qixin Sha, Bo He, Guangliang Li

In this paper, we propose a deep interactive reinforcement learning method for path following of an AUV that combines the advantages of deep reinforcement learning and interactive RL.

reinforcement-learning, Reinforcement Learning (RL)

Improving Interactive Reinforcement Agent Planning with Human Demonstration

no code implementations • 18 Apr 2019 • Guangliang Li, Randy Gomez, Keisuke Nakamura, Jinying Lin, Qilei Zhang, Bo He

Our results show that learning from demonstration can allow a TAMER agent to learn a roughly optimal policy up to the deepest search and encourage the agent to explore along the optimal path.

reinforcement-learning, Reinforcement Learning (RL)

HSR: L1/2 Regularized Sparse Representation for Fast Face Recognition using Hierarchical Feature Selection

no code implementations • 23 Sep 2014 • Bo Han, Bo He, Tingting Sun, Mengmeng Ma, Amaury Lendasse

By employing hierarchical feature selection, we can compress the scale and dimension of the global dictionary, which directly reduces the computational cost of the sparse representation on which our approach is based.

Face Recognition, feature selection (+1 more)

Robust OS-ELM with a novel selective ensemble based on particle swarm optimization

no code implementations • 13 Aug 2014 • Yang Liu, Bo He, Diya Dong, Yue Shen, Tianhong Yan, Rui Nian, Amaury Lendasse

Second, an adaptive selective ensemble framework for online learning is designed to balance the robustness and complexity of the algorithm.

General Classification

LARSEN-ELM: Selective Ensemble of Extreme Learning Machines using LARS for Blended Data

no code implementations • 9 Aug 2014 • Bo Han, Bo He, Rui Nian, Mengmeng Ma, Shujing Zhang, Minghui Li, Amaury Lendasse

The extreme learning machine (ELM), a neural network algorithm, has shown good performance, such as fast speed and simple structure, but weak robustness is an unavoidable defect of the original ELM on blended data.

RMSE-ELM: Recursive Model based Selective Ensemble of Extreme Learning Machines for Robustness Improvement

no code implementations • 9 Aug 2014 • Bo Han, Bo He, Mengmeng Ma, Tingting Sun, Tianhong Yan, Amaury Lendasse

It is a potential framework for solving the robustness issue of ELM on high-dimensional blended data in the future.
