Search Results for author: Hongwei Xue

Found 8 papers, 4 papers with code

Stare at What You See: Masked Image Modeling without Reconstruction

no code implementations CVPR 2023 Hongwei Xue, Peng Gao, Hongyang Li, Yu Qiao, Hao Sun, Houqiang Li, Jiebo Luo

However, unlike low-level features such as pixel values, we argue that the features extracted by powerful teacher models already encode rich semantic correlation across regions in an intact image. This raises one question: is reconstruction necessary in Masked Image Modeling (MIM) with a teacher model?

Long-Form Video-Language Pre-Training with Multimodal Temporal Contrastive Learning

1 code implementation 12 Oct 2022 Yuchong Sun, Hongwei Xue, Ruihua Song, Bei Liu, Huan Yang, Jianlong Fu

Large-scale video-language pre-training has shown significant improvement in video-language understanding tasks.

Ranked #2 on Video Retrieval on QuerYD (using extra training data)

Contrastive Learning, Question Answering, +3

Advancing High-Resolution Video-Language Representation with Large-Scale Video Transcriptions

1 code implementation CVPR 2022 Hongwei Xue, Tiankai Hang, Yanhong Zeng, Yuchong Sun, Bei Liu, Huan Yang, Jianlong Fu, Baining Guo

To enable VL pre-training, we jointly optimize the HD-VILA model by a hybrid Transformer that learns rich spatiotemporal features, and a multimodal Transformer that enforces interactions of the learned video features with diversified texts.

Retrieval, Super-Resolution, +4

Unifying Multimodal Transformer for Bi-directional Image and Text Generation

1 code implementation 19 Oct 2021 Yupan Huang, Hongwei Xue, Bei Liu, Yutong Lu

We adopt Transformer as our unified architecture for its strong performance and task-agnostic design.

Text Generation, Text-to-Image Generation

Learning Fine-Grained Motion Embedding for Landscape Animation

no code implementations 6 Sep 2021 Hongwei Xue, Bei Liu, Huan Yang, Jianlong Fu, Houqiang Li, Jiebo Luo

To tackle this problem, we propose a model named FGLA to generate high-quality and realistic videos by learning Fine-Grained motion embedding for Landscape Animation.
