Search Results for author: Ziyun Zeng

Found 11 papers, 10 papers with code

GMMFormer: Gaussian-Mixture-Model Based Transformer for Efficient Partially Relevant Video Retrieval

1 code implementation • 8 Oct 2023 • Yuting Wang, Jinpeng Wang, Bin Chen, Ziyun Zeng, Shu-Tao Xia

Current partially relevant video retrieval (PRVR) methods adopt scanning-based clip construction to achieve explicit clip modeling, which is information-redundant and requires a large storage overhead.

Partially Relevant Video Retrieval, Retrieval, +1

Making LLaMA SEE and Draw with SEED Tokenizer

1 code implementation • 2 Oct 2023 • Yuying Ge, Sijie Zhao, Ziyun Zeng, Yixiao Ge, Chen Li, Xintao Wang, Ying Shan

We identify two crucial design principles: (1) Image tokens should be independent of 2D physical patch positions and instead be produced with a 1D causal dependency, exhibiting intrinsic interdependence that aligns with the left-to-right autoregressive prediction mechanism in LLMs.

multimodal generation

VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation

1 code implementation • 28 Aug 2023 • Xudong Wang, Ishan Misra, Ziyun Zeng, Rohit Girdhar, Trevor Darrell

Existing approaches to unsupervised video instance segmentation typically rely on motion estimates and experience difficulties tracking small or divergent motions.

Instance Segmentation, Optical Flow Estimation, +5

MISSRec: Pre-training and Transferring Multi-modal Interest-aware Sequence Representation for Recommendation

1 code implementation • 22 Aug 2023 • Jinpeng Wang, Ziyun Zeng, Yunxiao Wang, Yuting Wang, Xingyu Lu, Tianxiang Li, Jun Yuan, Rui Zhang, Hai-Tao Zheng, Shu-Tao Xia

We propose MISSRec, a multi-modal pre-training and transfer learning framework for sequential recommendation (SR). On the user side, we design a Transformer-based encoder-decoder model, where the contextual encoder learns to capture the sequence-level multi-modal user interests while a novel interest-aware decoder is developed to grasp item-modality-interest relations for better sequence representation.

Contrastive Learning, Sequential Recommendation, +1

Planting a SEED of Vision in Large Language Model

1 code implementation • 16 Jul 2023 • Yuying Ge, Yixiao Ge, Ziyun Zeng, Xintao Wang, Ying Shan

Research on image tokenizers has previously reached an impasse, as frameworks employing quantized visual tokens have lost prominence due to subpar performance and convergence in multimodal comprehension (compared to BLIP-2, etc.).

Language Modelling, Large Language Model, +1

TVTSv2: Learning Out-of-the-box Spatiotemporal Visual Representations at Scale

1 code implementation • 23 May 2023 • Ziyun Zeng, Yixiao Ge, Zhan Tong, Xihui Liu, Shu-Tao Xia, Ying Shan

We argue that tuning a text encoder end-to-end, as done in previous work, is suboptimal since it may overfit in terms of styles, thereby losing its original generalization ability to capture the semantics of various language registers.

Representation Learning

Contrastive Masked Autoencoders for Self-Supervised Video Hashing

1 code implementation • 21 Nov 2022 • Yuting Wang, Jinpeng Wang, Bin Chen, Ziyun Zeng, Shutao Xia

To capture video semantic information for better hashing learning, we adopt an encoder-decoder structure to reconstruct the video from its temporal-masked frames.

Retrieval, Video Retrieval, +2

Learning Transferable Spatiotemporal Representations from Natural Script Knowledge

1 code implementation • CVPR 2023 • Ziyun Zeng, Yuying Ge, Xihui Liu, Bin Chen, Ping Luo, Shu-Tao Xia, Yixiao Ge

Pre-training on large-scale video data has become a common recipe for learning transferable spatiotemporal representations in recent years.

Descriptive, Representation Learning, +1

Hybrid Contrastive Quantization for Efficient Cross-View Video Retrieval

1 code implementation • 7 Feb 2022 • Jinpeng Wang, Bin Chen, Dongliang Liao, Ziyun Zeng, Gongfu Li, Shu-Tao Xia, Jin Xu

By performing Asymmetric-Quantized Contrastive Learning (AQ-CL) across views, HCQ aligns texts and videos at coarse-grained and multiple fine-grained levels.

Contrastive Learning, Quantization, +4

Contrastive Quantization with Code Memory for Unsupervised Image Retrieval

1 code implementation • 11 Sep 2021 • Jinpeng Wang, Ziyun Zeng, Bin Chen, Tao Dai, Shu-Tao Xia

The high efficiency in computation and storage makes hashing (including binary hashing and quantization) a common strategy in large-scale retrieval systems.

Contrastive Learning, Deep Hashing, +1
