Search Results for author: Jun Lan

Found 17 papers, 12 papers with code

Efficient Transfer Learning for Video-language Foundation Models

1 code implementation 18 Nov 2024 Haoxing Chen, Zizheng Huang, Yan Hong, Yanshuo Wang, Zhongcai Lyu, Zhuoer Xu, Jun Lan, Zhangxuan Gu

Pre-trained vision-language models provide a robust foundation for efficient transfer learning across various downstream tasks.

Action Recognition Few-Shot Learning +3

DomainGallery: Few-shot Domain-driven Image Generation by Attribute-centric Finetuning

1 code implementation 7 Nov 2024 Yuxuan Duan, Yan Hong, Bo Zhang, Jun Lan, Huijia Zhu, Weiqiang Wang, Jianfu Zhang, Li Niu, Liqing Zhang

The recent progress in text-to-image models pretrained on large-scale datasets has enabled us to generate various images as long as we provide a text prompt describing what we want.

Attribute Disentanglement +1

Stochastic Layer-Wise Shuffle: A Good Practice to Improve Vision Mamba Training

1 code implementation 30 Aug 2024 Zizheng Huang, Haoxing Chen, Jiaqi Li, Jun Lan, Huijia Zhu, Weiqiang Wang, LiMin Wang

Recent Vision Mamba models not only have much lower complexity for processing higher-resolution images and longer videos but also achieve competitive performance with Vision Transformers (ViTs).

Image Classification Mamba +2

Rate Maximization for RIS-Assisted OAM Multiuser Wireless Communications

no code implementations 2 Aug 2024 Jun Lan, Liping Liang, Wenchi Cheng, Wei Zhang

Conventional multiple-input multiple-output (MIMO) technologies have encountered bottlenecks in significantly increasing the spectrum efficiency of wireless communications, due to the low degrees of freedom in practical line-of-sight scenarios and the severe path loss of high-frequency carriers.

DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark

1 code implementation 30 May 2024 Haoxing Chen, Yan Hong, Zizheng Huang, Zhuoer Xu, Zhangxuan Gu, Yaohui Li, Jun Lan, Huijia Zhu, Jianfu Zhang, Weiqiang Wang, Huaxiong Li

We believe that the GenVideo dataset and the DeMamba module will significantly advance the field of AI-generated video detection.

DeepFake Detection Mamba +4

Supervised Contrastive Learning for Snapshot Spectral Imaging Face Anti-Spoofing

no code implementations 29 May 2024 Chuanbiao Song, Yan Hong, Jun Lan, Huijia Zhu, Weiqiang Wang, Jianfu Zhang

This study presents a re-balanced contrastive learning strategy aimed at strengthening face anti-spoofing capabilities within facial recognition systems, with a focus on countering the challenges posed by printed photos and highly realistic silicone or latex masks.

Contrastive Learning Face Anti-Spoofing

Conditional Prototype Rectification Prompt Learning

1 code implementation 15 Apr 2024 Haoxing Chen, Yaohui Li, Zizheng Huang, Yan Hong, Zhuoer Xu, Zhangxuan Gu, Jun Lan, Huijia Zhu, Weiqiang Wang

Recent advancements in efficient transfer learning (ETL) have shown remarkable success in fine-tuning VLMs within the scenario of limited data, introducing only a few parameters to harness task-specific insights from VLMs.

Few-Shot Learning Transfer Learning

Segment Anything Model Meets Image Harmonization

no code implementations 20 Dec 2023 Haoxing Chen, Yaohui Li, Zhangxuan Gu, Zhuoer Xu, Jun Lan, Huaxiong Li

Image harmonization is a crucial technique in image composition that aims to seamlessly match the background by adjusting the foreground of composite images.

Image Harmonization Semantic Segmentation

Boosting Audio-visual Zero-shot Learning with Large Language Models

1 code implementation 21 Nov 2023 Haoxing Chen, Yaohui Li, Yan Hong, Zizheng Huang, Zhuoer Xu, Zhangxuan Gu, Jun Lan, Huijia Zhu, Weiqiang Wang

Recent methods mainly focus on learning multi-modal features aligned with class names to enhance the generalization ability to unseen categories.

audio-visual learning Descriptive +1

ControlCom: Controllable Image Composition using Diffusion Model

1 code implementation 19 Aug 2023 Bo Zhang, Yuxuan Duan, Jun Lan, Yan Hong, Huijia Zhu, Weiqiang Wang, Li Niu

To address these challenges, we propose a controllable image composition method that unifies four tasks in one diffusion model: image blending, image harmonization, view synthesis, and generative composition.

Image Harmonization

Realizing In-Memory Baseband Processing for Ultra-Fast and Energy-Efficient 6G

no code implementations 19 Aug 2023 Qunsong Zeng, Jiawei Liu, Mingrui Jiang, Jun Lan, Yi Gong, Zhongrui Wang, Yida Li, Can Li, Jim Ignowski, Kaibin Huang

To support emerging applications ranging from holographic communications to extended reality, next-generation mobile wireless communication systems require ultra-fast and energy-efficient baseband processors.

DiffUTE: Universal Text Editing Diffusion Model

1 code implementation NeurIPS 2023 Haoxing Chen, Zhuoer Xu, Zhangxuan Gu, Jun Lan, Xing Zheng, Yaohui Li, Changhua Meng, Huijia Zhu, Weiqiang Wang

Specifically, we build our model on a diffusion model and carefully modify the network structure to enable the model to draw multilingual characters with the help of glyph and position information.

Self-Supervised Learning

DiffusionInst: Diffusion Model for Instance Segmentation

2 code implementations 6 Dec 2022 Zhangxuan Gu, Haoxing Chen, Zhuoer Xu, Jun Lan, Changhua Meng, Weiqiang Wang

Extensive experimental results on COCO and LVIS show that DiffusionInst achieves competitive performance compared to existing instance segmentation models with various backbones, such as ResNet and Swin Transformers.

Instance Segmentation Segmentation

Hierarchical Dynamic Image Harmonization

1 code implementation 16 Nov 2022 Haoxing Chen, Zhangxuan Gu, Yaohui Li, Jun Lan, Changhua Meng, Weiqiang Wang, Huaxiong Li

The MGD effectively applies distinct convolution to the foreground and background, learning the representations of foreground and background regions as well as their correlations to the global harmonization, facilitating local visual consistency for the images much more efficiently.

Image Harmonization

Realizing Ultra-Fast and Energy-Efficient Baseband Processing Using Analogue Resistive Switching Memory

no code implementations 7 May 2022 Qunsong Zeng, Jiawei Liu, Jun Lan, Yi Gong, Zhongrui Wang, Yida Li, Kaibin Huang

To support emerging applications ranging from holographic communications to extended reality, next-generation mobile wireless communication systems require ultra-fast and energy-efficient (UFEE) baseband processors.

XYLayoutLM: Towards Layout-Aware Multimodal Networks For Visually-Rich Document Understanding

1 code implementation CVPR 2022 Zhangxuan Gu, Changhua Meng, Ke Wang, Jun Lan, Weiqiang Wang, Ming Gu, Liqing Zhang

Recently, various multimodal networks for Visually-Rich Document Understanding (VRDU) have been proposed, showing the promotion of transformers by integrating visual and layout information with the text embeddings.

document understanding Optical Character Recognition (OCR) +1
