Search Results for author: Tan Yu

Found 23 papers, 6 papers with code

Toward Faster and Simpler Matrix Normalization via Rank-1 Update

no code implementations ECCV 2020 Tan Yu, Yunfeng Cai, Ping Li

To boost the efficiency in the GPU platform, recent methods rely on Newton-Schulz (NS) iteration to approximate the matrix square-root.

Inflate and Shrink: Enriching and Reducing Interactions for Fast Text-Image Retrieval

no code implementations EMNLP 2021 Haoliang Liu, Tan Yu, Ping Li

Through an inflating operation followed by a shrinking operation, both efficiency and accuracy of a late-interaction model are boosted.

Cross-Modal Retrieval Image Retrieval +1

Reconstruct before Query: Continual Missing Modality Learning with Decomposed Prompt Collaboration

1 code implementation 17 Mar 2024 Shu Zhao, Xiaohan Zou, Tan Yu, Huijuan Xu

Meanwhile, our RebQ leverages extensive multi-modal knowledge from pre-trained LMMs to reconstruct the data of missing modality.

Continual Learning

Degenerate Swin to Win: Plain Window-based Transformer without Sophisticated Operations

no code implementations 25 Nov 2022 Tan Yu, Ping Li

To bring back the global receptive field, window-based Vision Transformers have devoted a lot of efforts to achieving cross-window communications by developing several sophisticated operations.

Object Detection +1

R2-MLP: Round-Roll MLP for Multi-View 3D Object Recognition

2 code implementations 20 Nov 2022 Shuo Chen, Tan Yu, Ping Li

Recently, vision architectures based exclusively on multi-layer perceptrons (MLPs) have gained much attention in the computer vision community.

3D Object Recognition Image Classification +1

Prompting through Prototype: A Prototype-based Prompt Learning on Pretrained Vision-Language Models

no code implementations 19 Oct 2022 Yue Zhang, Hongliang Fei, Dingcheng Li, Tan Yu, Ping Li

In particular, we focus on few-shot image recognition tasks on pretrained vision-language models (PVLMs) and develop a method of prompting through prototype (PTP), where we define $K$ image prototypes and $K$ prompt prototypes.

Few-Shot Learning

Decomposing User-APP Graph into Subgraphs for Effective APP and User Embedding Learning

no code implementations 13 Oct 2022 Tan Yu, Jun Zhi, Yufei Zhang, Jian Li, Hongliang Fei, Ping Li

In this paper, we formulate the APP-installation user embedding learning into a bipartite graph embedding problem.

Graph Embedding Graph Learning

Boost CTR Prediction for New Advertisements via Modeling Visual Content

no code implementations 23 Sep 2022 Tan Yu, Zhipeng Jin, Jie Liu, Yi Yang, Hongliang Fei, Ping Li

To overcome the limitations of behavior ID features in modeling new ads, we exploit the visual content in ads to boost the performance of CTR prediction models.

Click-Through Rate Prediction Quantization

BOAT: Bilateral Local Attention Vision Transformer

1 code implementation 31 Jan 2022 Tan Yu, Gangming Zhao, Ping Li, Yizhou Yu

To improve efficiency, recent Vision Transformers adopt local self-attention mechanisms, where self-attention is computed within local windows.

MVT: Multi-view Vision Transformer for 3D Object Recognition

2 code implementations 25 Oct 2021 Shuo Chen, Tan Yu, Ping Li

Nevertheless, multi-view CNN models cannot model the communications between patches from different views, limiting their effectiveness in 3D object recognition.

3D Object Recognition Inductive Bias +1

Constructing Orthogonal Convolutions in an Explicit Manner

no code implementations ICLR 2022 Tan Yu, Jun Li, Yunfeng Cai, Ping Li

A convolution layer with an orthogonal Jacobian matrix is 1-Lipschitz in the 2-norm, making the output robust to the perturbation in input.

S$^2$-MLPv2: Improved Spatial-Shift MLP Architecture for Vision

3 code implementations 2 Aug 2021 Tan Yu, Xu Li, Yunfeng Cai, Mingming Sun, Ping Li

More recently, using smaller patches with a pyramid structure, Vision Permutator (ViP) and Global Filter Network (GFNet) achieve better performance than S$^2$-MLP.

Inductive Bias

Rethinking Token-Mixing MLP for MLP-based Vision Backbone

no code implementations 28 Jun 2021 Tan Yu, Xu Li, Yunfeng Cai, Mingming Sun, Ping Li

By introducing the inductive bias from the image processing, convolution neural network (CNN) has achieved excellent performance in numerous computer vision tasks and has been established as de facto backbone.

Inductive Bias

S$^2$-MLP: Spatial-Shift MLP Architecture for Vision

1 code implementation 14 Jun 2021 Tan Yu, Xu Li, Yunfeng Cai, Mingming Sun, Ping Li

We discover that the token-mixing MLP is a variant of the depthwise convolution with a global reception field and spatial-specific configuration.

Cross-lingual Cross-modal Pretraining for Multimodal Retrieval

no code implementations NAACL 2021 Hongliang Fei, Tan Yu, Ping Li

Recent pretrained vision-language models have achieved impressive performance on cross-modal retrieval tasks in English.

Cross-Modal Retrieval Machine Translation +2

Cross-Probe BERT for Efficient and Effective Cross-Modal Search

no code implementations 1 Jan 2021 Tan Yu, Hongliang Fei, Ping Li

Inspired by the great success of BERT in NLP tasks, many text-vision BERT models emerged recently.

Image Retrieval Retrieval

Temporal Structure Mining for Weakly Supervised Action Detection

no code implementations ICCV 2019 Tan Yu, Zhou Ren, Yuncheng Li, Enxu Yan, Ning Xu, Junsong Yuan

In TSM, each action instance is modeled as a multi-phase process and phase evolving within an action instance, i.e., the temporal structure, is exploited.

Action Detection Weakly Supervised Action Localization

Product Quantization Network for Fast Image Retrieval

no code implementations ECCV 2018 Tan Yu, Junsong Yuan, Chen Fang, Hailin Jin

Product quantization has been widely used in fast image retrieval due to its effectiveness of coding high-dimensional visual features.

Image Retrieval Quantization +1

Compressive Quantization for Fast Object Instance Search in Videos

no code implementations ICCV 2017 Tan Yu, Zhenzhen Wang, Junsong Yuan

Most current visual search systems focus on image-to-image (point-to-point) search such as image and object retrieval.

Instance Search Object +3
