Search Results for author: Weijie Kong

Found 9 papers, 6 papers with code

Global and Local Semantic Completion Learning for Vision-Language Pre-training

1 code implementation12 Jun 2023 Rong-Cheng Tu, Yatai Ji, Jie Jiang, Weijie Kong, Chengfei Cai, Wenzhe Zhao, Hongfa Wang, Yujiu Yang, Wei Liu

MGSC promotes learning more representative global features, which have a great impact on the performance of downstream tasks, while MLTC reconstructs modal-fusion local tokens, further enhancing accurate comprehension of multimodal data.

cross-modal alignment Image-text Retrieval +6

Egocentric Video-Language Pretraining @ Ego4D Challenge 2022

1 code implementation4 Jul 2022 Kevin Qinghong Lin, Alex Jinpeng Wang, Mattia Soldan, Michael Wray, Rui Yan, Eric Zhongcong Xu, Difei Gao, RongCheng Tu, Wenzhe Zhao, Weijie Kong, Chengfei Cai, Hongfa Wang, Dima Damen, Bernard Ghanem, Wei Liu, Mike Zheng Shou

In this report, we propose a video-language pretraining (VLP) based solution \cite{kevin2022egovlp} for four Ego4D challenge tasks, including Natural Language Query (NLQ), Moment Query (MQ), Object State Change Classification (OSCC), and PNR Localization (PNR).

Language Modelling Object State Change Classification

Tencent Text-Video Retrieval: Hierarchical Cross-Modal Interactions with Multi-Level Representations

no code implementations7 Apr 2022 Jie Jiang, Shaobo Min, Weijie Kong, Dihong Gong, Hongfa Wang, Zhifeng Li, Wei Liu

With multi-level representations for video and text, hierarchical contrastive learning is designed to explore fine-grained cross-modal relationships, i. e., frame-word, clip-phrase, and video-sentence, which enables HCMI to achieve a comprehensive semantic comparison between video and text modalities.

 Ranked #1 on Video Retrieval on MSR-VTT-1kA (using extra training data)

Contrastive Learning Denoising +4

BLP -- Boundary Likelihood Pinpointing Networks for Accurate Temporal Action Localization

no code implementations6 Nov 2018 Weijie Kong, Nannan Li, Shan Liu, Thomas Li, Ge Li

Despite tremendous progress achieved in temporal action detection, state-of-the-art methods still suffer from the sharp performance deterioration when localizing the starting and ending temporal action boundaries.

Action Detection regression +1

Step-by-step Erasion, One-by-one Collection: A Weakly Supervised Temporal Action Detector

no code implementations9 Jul 2018 Jia-Xing Zhong, Nannan Li, Weijie Kong, Tao Zhang, Thomas H. Li, Ge Li

Weakly supervised temporal action detection is a Herculean task in understanding untrimmed videos, since no supervisory signal except the video-level category label is available on training data.

Action Detection Temporal Localization

Cannot find the paper you are looking for? You can Submit a new open access paper.