2 code implementations • 15 Mar 2022 • Guanyu Cai, Yixiao Ge, Binjie Zhang, Alex Jinpeng Wang, Rui Yan, Xudong Lin, Ying Shan, Lianghua He, XiaoHu Qie, Jianping Wu, Mike Zheng Shou
Recent dominant methods for video-language pre-training (VLP) learn transferable representations from the raw pixels in an end-to-end manner to achieve advanced performance on downstream video-language retrieval.
1 code implementation • CVPR 2023 • Alex Jinpeng Wang, Yixiao Ge, Rui Yan, Yuying Ge, Xudong Lin, Guanyu Cai, Jianping Wu, Ying Shan, XiaoHu Qie, Mike Zheng Shou
In this work, we introduce, for the first time, an end-to-end video-language model, the all-in-one Transformer, that embeds raw video and textual signals into joint representations using a unified backbone architecture (see the sketch below).
Ranked #6 on TGIF-Transition on TGIF-QA (using extra training data)
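The following is a minimal, hypothetical sketch of the unified-backbone idea: video patches and text tokens are embedded into one sequence and processed by a single shared Transformer encoder. The class name, layer sizes, and the omission of positional embeddings are illustrative simplifications, not the paper's actual architecture.

```python
# Sketch of a unified-backbone video-language encoder (illustrative only).
# Positional embeddings are omitted for brevity.
import torch
import torch.nn as nn

class UnifiedVideoTextEncoder(nn.Module):
    def __init__(self, vocab_size=30522, dim=256, patch=16, depth=4):
        super().__init__()
        self.text_embed = nn.Embedding(vocab_size, dim)
        # Project 16x16 pixel patches of each frame to video tokens.
        self.video_embed = nn.Conv3d(3, dim, kernel_size=(1, patch, patch),
                                     stride=(1, patch, patch))
        self.type_embed = nn.Embedding(2, dim)  # 0 = video token, 1 = text token
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, video, text_ids):
        # video: (B, 3, T, H, W); text_ids: (B, L)
        v = self.video_embed(video).flatten(2).transpose(1, 2)  # (B, Nv, dim)
        t = self.text_embed(text_ids)                           # (B, L, dim)
        v = v + self.type_embed.weight[0]
        t = t + self.type_embed.weight[1]
        joint = torch.cat([v, t], dim=1)  # one sequence, one shared backbone
        return self.backbone(joint)

enc = UnifiedVideoTextEncoder()
out = enc(torch.randn(2, 3, 4, 64, 64), torch.randint(0, 30522, (2, 12)))
print(out.shape)  # (2, num_video_tokens + 12, 256)
```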
1 code implementation • 2 Dec 2021 • Rui Yan, Mike Zheng Shou, Yixiao Ge, Alex Jinpeng Wang, Xudong Lin, Guanyu Cai, Jinhui Tang
Video-Text pre-training aims at learning transferable representations from large-scale video-text pairs via aligning the semantics between visual and textual information.
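A common instantiation of such video-text alignment is a symmetric contrastive (InfoNCE) objective over paired embeddings. The sketch below shows this standard formulation; the paper's exact pre-training loss may differ.

```python
# Standard symmetric contrastive loss for aligning paired video-text embeddings.
import torch
import torch.nn.functional as F

def video_text_contrastive_loss(video_emb, text_emb, temperature=0.07):
    # video_emb, text_emb: (B, D) embeddings of B matched video-text pairs
    v = F.normalize(video_emb, dim=-1)
    t = F.normalize(text_emb, dim=-1)
    logits = v @ t.t() / temperature  # (B, B) similarity matrix
    targets = torch.arange(v.size(0), device=v.device)
    # Matched pairs lie on the diagonal; pull them together, push others apart.
    loss_v2t = F.cross_entropy(logits, targets)
    loss_t2v = F.cross_entropy(logits.t(), targets)
    return (loss_v2t + loss_t2v) / 2

loss = video_text_contrastive_loss(torch.randn(8, 256), torch.randn(8, 256))
```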
1 code implementation • CVPR 2022 • Alex Jinpeng Wang, Yixiao Ge, Guanyu Cai, Rui Yan, Xudong Lin, Ying Shan, XiaoHu Qie, Mike Zheng Shou
In this work, we present Object-aware Transformers, an object-centric approach that extends the video-language transformer to incorporate object representations (see the sketch below).
Ranked #20 on Zero-Shot Video Retrieval on DiDeMo
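One plausible way to realize this object-centric extension, sketched below with assumed shapes and names, is to project per-region detector features and append them as extra tokens to the joint video-text sequence.

```python
# Hypothetical sketch: object features from a detector are projected and
# appended as extra tokens alongside video and text tokens.
import torch
import torch.nn as nn

class ObjectAwareFusion(nn.Module):
    def __init__(self, dim=256, obj_feat_dim=1024, depth=2):
        super().__init__()
        self.obj_proj = nn.Linear(obj_feat_dim, dim)  # detector features -> model dim
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, video_tokens, text_tokens, object_feats):
        # video_tokens: (B, Nv, dim); text_tokens: (B, L, dim)
        # object_feats: (B, K, obj_feat_dim), e.g. K region features per clip
        obj_tokens = self.obj_proj(object_feats)
        joint = torch.cat([video_tokens, obj_tokens, text_tokens], dim=1)
        return self.encoder(joint)

fusion = ObjectAwareFusion()
out = fusion(torch.randn(2, 32, 256), torch.randn(2, 12, 256),
             torch.randn(2, 5, 1024))
```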
no code implementations • 27 May 2021 • Guanyu Cai, Lianghua He
In the first stage, we propose the local Lipschitzness regularization as the objective function to align different domains by exploiting intra-domain knowledge, which explores a promising direction for non-adversarial adaptive semantic segmentation.
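Local Lipschitzness is commonly encouraged by penalizing how much predictions change under small input perturbations; the sketch below illustrates that general idea, not the paper's exact regularizer for segmentation.

```python
# Sketch of a local Lipschitzness penalty: predictions should change little
# under small input perturbations (general idea only).
import torch
import torch.nn.functional as F

def local_lipschitz_penalty(model, x, epsilon=1e-2):
    # x: (B, C, H, W) unlabeled target-domain images
    noise = epsilon * torch.randn_like(x)
    p_clean = F.softmax(model(x), dim=1)
    logp_perturbed = F.log_softmax(model(x + noise), dim=1)
    # Penalize divergence between predictions on clean and perturbed inputs.
    return F.kl_div(logp_perturbed, p_clean, reduction="batchmean")

model = torch.nn.Conv2d(3, 19, kernel_size=1)  # stand-in segmentation head
reg = local_lipschitz_penalty(model, torch.randn(2, 3, 32, 32))
```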
1 code implementation • ICCV 2021 • Guanyu Cai, Jun Zhang, Xinyang Jiang, Yifei Gong, Lianghua He, Fufu Yu, Pai Peng, Xiaowei Guo, Feiyue Huang, Xing Sun
However, the performance of existing methods degrades in practice, since users often provide an incomplete description of an image, which leads to results filled with false positives that fit the incomplete description.
2 code implementations • 8 Jan 2021 • Chenyang Gao, Guanyu Cai, Xinyang Jiang, Feng Zheng, Jun Zhang, Yifei Gong, Pai Peng, Xiaowei Guo, Xing Sun
Second, a BERT with locality-constrained attention is proposed to obtain representations of descriptions at different scales (see the sketch below).
Ranked #15 on Text based Person Retrieval on CUHK-PEDES
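One plausible reading of locality-constrained attention, sketched below with an assumed window size and dimensions, restricts each token to attend only within a local window via an attention mask; the paper's exact mechanism may differ.

```python
# Sketch of locality-constrained self-attention via a band mask.
import torch
import torch.nn as nn

def local_attention_mask(seq_len, window=3):
    # True entries are *blocked* positions (PyTorch attn_mask convention).
    idx = torch.arange(seq_len)
    return (idx[None, :] - idx[:, None]).abs() > window

attn = nn.MultiheadAttention(embed_dim=256, num_heads=8, batch_first=True)
x = torch.randn(2, 20, 256)                # (B, L, D) token embeddings
mask = local_attention_mask(20, window=3)  # (L, L) boolean band mask
out, _ = attn(x, x, x, attn_mask=mask)
```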
1 code implementation • 26 May 2019 • Guanyu Cai, Lianghua He, Mengchu Zhou, Hesham Alhumade, Die Hu
When constructing a deep end-to-end model, to ensure the effectiveness and stability of unsupervised domain adaptation, three critical factors are considered in our proposed optimization strategy, i.e., the sample amount of the target domain, and the dimension and batch size of samples.
Ranked #1 on Domain Adaptation on SVHN-to-MNIST
1 code implementation • 25 Jan 2019 • Haifeng Shi, Guanyu Cai, Yuqin Wang, Shaohua Shang, Lianghua He
All generative paths share the same decoder network; in each path, the decoder is fed the concatenation of a distinct pre-computed amplified one-hot vector and the input Gaussian noise.
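A minimal sketch of the described conditioning follows; the layer sizes and the amplification factor are illustrative assumptions, not the paper's settings.

```python
# Sketch: an amplified one-hot class vector is concatenated with Gaussian
# noise and fed to a shared decoder (sizes and amp factor are assumptions).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedDecoder(nn.Module):
    def __init__(self, num_classes=10, noise_dim=64, out_dim=784, amp=10.0):
        super().__init__()
        self.num_classes = num_classes
        self.amp = amp
        self.net = nn.Sequential(
            nn.Linear(num_classes + noise_dim, 256), nn.ReLU(),
            nn.Linear(256, out_dim), nn.Tanh(),
        )

    def forward(self, labels, noise):
        # labels: (B,) class indices; noise: (B, noise_dim) Gaussian noise
        one_hot = F.one_hot(labels, self.num_classes).float() * self.amp
        return self.net(torch.cat([one_hot, noise], dim=1))

dec = SharedDecoder()
imgs = dec(torch.randint(0, 10, (4,)), torch.randn(4, 64))
```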
no code implementations • 25 Apr 2018 • Guanyu Cai, Yuqin Wang, Mengchu Zhou, Lianghua He
Domain adaptation is widely used in learning problems that lack labeled data.