Search Results for author: Shilin Yan

Found 9 papers, 6 papers with code

A Sanity Check for AI-generated Image Detection

1 code implementation • 27 Jun 2024 • Shilin Yan, Ouxiang Li, Jiayin Cai, Yanbin Hao, XiaoLong Jiang, Yao Hu, Weidi Xie

This effectively enables the model to discern AI-generated images based on semantics or contextual information. Secondly, we select the highest-frequency and lowest-frequency patches in the image and compute low-level patchwise features, aiming to detect AI-generated images through low-level artifacts such as noise patterns and anti-aliasing.
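The frequency-based patch selection described in the snippet above could be sketched roughly as follows. This is an illustrative assumption, not the paper's actual implementation: the patch size, the number of selected patches, and the use of mean FFT magnitude as a frequency-energy proxy are all hypothetical choices.

```python
import numpy as np

def patch_frequency_energy(patch):
    # Mean magnitude of the non-DC FFT coefficients: a rough proxy
    # for how much high-frequency content a patch contains.
    spec = np.abs(np.fft.fft2(patch))
    spec[0, 0] = 0.0  # drop the DC component
    return spec.mean()

def select_extreme_patches(image, patch_size=32, k=4):
    """Split a grayscale image into non-overlapping patches and
    return the k highest- and k lowest-frequency patches."""
    h, w = image.shape
    patches = [
        image[i:i + patch_size, j:j + patch_size]
        for i in range(0, h - patch_size + 1, patch_size)
        for j in range(0, w - patch_size + 1, patch_size)
    ]
    order = np.argsort([patch_frequency_energy(p) for p in patches])
    lowest = [patches[i] for i in order[:k]]
    highest = [patches[i] for i in order[-k:]]
    return highest, lowest
```

In a detector along these lines, the selected extreme patches would then be fed to a feature extractor trained to pick up low-level generation artifacts, while the full image supplies the semantic signal.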

Multi-Modal Generative Embedding Model

no code implementations • 29 May 2024 • Feipeng Ma, Hongwei Xue, Guangting Wang, Yizhou Zhou, Fengyun Rao, Shilin Yan, Yueyi Zhang, Siying Wu, Mike Zheng Shou, Xiaoyan Sun

Existing models usually tackle these two types of problems by decoupling language modules into a text decoder for generation and a text encoder for embedding.

Caption Generation · Cross-Modal Retrieval · +7

OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning

no code implementations • CVPR 2024 • Lingyi Hong, Shilin Yan, Renrui Zhang, Wanyun Li, Xinyu Zhou, Pinxue Guo, Kaixun Jiang, Yiting Chen, Jinglun Li, Zhaoyu Chen, Wenqiang Zhang

To evaluate the effectiveness of our general framework OneTracker, which consists of Foundation Tracker and Prompt Tracker, we conduct extensive experiments on 6 popular tracking tasks across 11 benchmarks; our OneTracker outperforms other models and achieves state-of-the-art performance.

Object · RGB-T Tracking · +1

InstructSeq: Unifying Vision Tasks with Instruction-conditioned Multi-modal Sequence Generation

1 code implementation • 30 Nov 2023 • Rongyao Fang, Shilin Yan, Zhaoyang Huang, Jingqiu Zhou, Hao Tian, Jifeng Dai, Hongsheng Li

In this work, we introduce InstructSeq, an instruction-conditioned multi-modal modeling framework that unifies diverse vision tasks through flexible natural language control and handling of both visual and textual data.

Image Captioning · Referring Expression · +2

PanoVOS: Bridging Non-panoramic and Panoramic Views with Transformer for Video Segmentation

1 code implementation • 21 Sep 2023 • Shilin Yan, Xiaohao Xu, Renrui Zhang, Lingyi Hong, Wenchao Chen, Wenqiang Zhang, Wei Zhang

Our dataset poses new challenges in panoramic VOS and we hope that our PanoVOS can advance the development of panoramic segmentation/tracking.

Autonomous Driving · Segmentation · +4

Personalize Segment Anything Model with One Shot

1 code implementation • 4 May 2023 • Renrui Zhang, Zhengkai Jiang, Ziyu Guo, Shilin Yan, Junting Pan, Xianzheng Ma, Hao Dong, Peng Gao, Hongsheng Li

Driven by large-data pre-training, Segment Anything Model (SAM) has been demonstrated to be a powerful and promptable framework, revolutionizing segmentation models.

Personalized Segmentation · Segmentation · +4
