Search Results for author: Tao Gong

Found 15 papers, 7 papers with code

MiM-ISTD: Mamba-in-Mamba for Efficient Infrared Small Target Detection

1 code implementation 4 Mar 2024 Tianxiang Chen, Zhentao Tan, Tao Gong, Qi Chu, Yue Wu, Bin Liu, Jieping Ye, Nenghai Yu

Inspired by Mamba, a recent foundational model with linear complexity for long-range modeling, we explore the potential of this state space model for the infrared small target detection (ISTD) task in terms of both effectiveness and efficiency.

Sentence

Bootstrapping Audio-Visual Segmentation by Strengthening Audio Cues

no code implementations 4 Feb 2024 Tianxiang Chen, Zhentao Tan, Tao Gong, Qi Chu, Yue Wu, Bin Liu, Le Lu, Jieping Ye, Nenghai Yu

This bidirectional interaction narrows the modality imbalance, facilitating more effective learning of integrated audio-visual representations.

Decoder, Representation Learning

Towards More Unified In-context Visual Understanding

no code implementations CVPR 2024 Dianmo Sheng, Dongdong Chen, Zhentao Tan, Qiankun Liu, Qi Chu, Jianmin Bao, Tao Gong, Bin Liu, Shengwei Xu, Nenghai Yu

Thanks to this design, the model is capable of handling in-context vision understanding tasks with multimodal output in a unified pipeline. Experimental results demonstrate that our model achieves competitive performance compared with specialized models and previous ICL baselines.

Decoder, Image Captioning, +2

MultiModal-GPT: A Vision and Language Model for Dialogue with Humans

1 code implementation 8 May 2023 Tao Gong, Chengqi Lyu, Shilong Zhang, Yudong Wang, Miao Zheng, Qian Zhao, Kuikun Liu, Wenwei Zhang, Ping Luo, Kai Chen

To further enhance MultiModal-GPT's ability to chat with humans, we jointly train it on language-only instruction-following data.

Instruction Following, Language Modelling

StrongSORT: Make DeepSORT Great Again

14 code implementations 28 Feb 2022 Yunhao Du, Zhicheng Zhao, Yang Song, Yanyun Zhao, Fei Su, Tao Gong, Hongying Meng

As a result, the construction of a good baseline for a fair comparison is essential.

Ranked #7 on Multi-Object Tracking on MOT17 (using extra training data)

Multi-Object Tracking, object-detection, +1

Temporal RoI Align for Video Object Recognition

1 code implementation 8 Sep 2021 Tao Gong, Kai Chen, Xinjiang Wang, Qi Chu, Feng Zhu, Dahua Lin, Nenghai Yu, Huamin Feng

In this work, considering that the features of the same object instance are highly similar across frames in a video, a novel Temporal RoI Align operator is proposed to extract features from other frames' feature maps for current-frame proposals by utilizing feature similarity.

Instance Segmentation, Object, +5

Mining Contextual Information Beyond Image for Semantic Segmentation

1 code implementation ICCV 2021 Zhenchao Jin, Tao Gong, Dongdong Yu, Qi Chu, Jian Wang, Changhu Wang, Jie Shao

To address this, this paper proposes to mine the contextual information beyond individual images to further augment the pixel representations.

Image Segmentation, Segmentation, +1

Towards Generalizable and Robust Face Manipulation Detection via Bag-of-local-feature

no code implementations 14 Mar 2021 Changtao Miao, Qi Chu, Weihai Li, Tao Gong, Wanyi Zhuang, Nenghai Yu

Over the past several years, to counter the malicious abuse of facial manipulation technology, face manipulation detection has attracted considerable attention and achieved remarkable progress.

A study of resting-state EEG biomarkers for depression recognition

no code implementations 23 Feb 2020 Shuting Sun, Jianxiu Li, Huayu Chen, Tao Gong, Xiaowei Li, Bin Hu

Results: The functional connectivity feature PLI (phase lag index) is superior to both the linear and the nonlinear features.

EEG, feature selection

Side-Aware Boundary Localization for More Precise Object Detection

3 code implementations ECCV 2020 Jiaqi Wang, Wenwei Zhang, Yuhang Cao, Kai Chen, Jiangmiao Pang, Tao Gong, Jianping Shi, Chen Change Loy, Dahua Lin

To tackle the difficulty of precise localization in the presence of displacements with large variance, we further propose a two-step localization scheme, which first predicts a range of movement through bucket prediction and then pinpoints the precise position within the predicted bucket.

Object, object-detection, +2

A Large Scale Urban Surveillance Video Dataset for Multiple-Object Tracking and Behavior Analysis

no code implementations 26 Apr 2019 Guojun Yin, Bin Liu, Huihui Zhu, Tao Gong, Nenghai Yu

Multiple-object tracking and behavior analysis are essential parts of surveillance video analysis for public security and urban management.

Management, Multiple Object Tracking, +1
