Search Results for author: Yiming Wu

Found 24 papers, 10 papers with code

Improving vision-language alignment with graph spiking hybrid Networks

no code implementations • 31 Jan 2025 • Siyu Zhang, Heming Zheng, Yiming Wu, Yeming Chen

To bridge the semantic gap between vision and language (VL), it is necessary to develop a good alignment strategy, which includes handling semantic diversity, abstract representation of visual information, and generalization ability of models.

Contrastive Learning Diversity +2

MoTe: Learning Motion-Text Diffusion Model for Multiple Generation Tasks

no code implementations • 29 Nov 2024 • Yiming Wu, Wei Ji, Kecheng Zheng, Zicheng Wang, Dong Xu

Recently, human motion analysis has experienced great improvement due to inspiring generative models such as the denoising diffusion model and large language model.

Decoder Denoising +5

Individual Content and Motion Dynamics Preserved Pruning for Video Diffusion Models

no code implementations • 27 Nov 2024 • Yiming Wu, Huan Wang, Zhenghao Chen, Dong Xu

Additionally, we propose an Individual Content and Motion Dynamics (ICMD) Consistency Loss so that VDMini (i.e., the student) achieves generation performance comparable to the larger VDM (i.e., the teacher).

Model Compression Video Generation

Towards Small Object Editing: A Benchmark Dataset and A Training-Free Approach

1 code implementation • 3 Nov 2024 • Qihe Pan, Zhen Zhao, Zicheng Wang, Sifan Long, Yiming Wu, Wei Ji, Haoran Liang, Ronghua Liang

A plethora of text-guided image editing methods has recently been developed by leveraging the impressive capabilities of large-scale diffusion-based generative models especially Stable Diffusion.

Image Generation Object +1

PoinTramba: A Hybrid Transformer-Mamba Framework for Point Cloud Analysis

1 code implementation • 24 May 2024 • Zicheng Wang, Zhenghao Chen, Yiming Wu, Zhen Zhao, Luping Zhou, Dong Xu

In this study, we introduce PoinTramba, a pioneering hybrid framework that synergizes the analytical power of Transformer with the remarkable computational efficiency of Mamba for enhanced point cloud analysis.

Art Analysis Computational Efficiency +1

Self-distilled Dynamic Fusion Network for Language-based Fashion Retrieval

no code implementations • 24 May 2024 • Yiming Wu, Hangfei Li, Fangfang Wang, Yilong Zhang, Ronghua Liang

In response, we propose a Self-distilled Dynamic Fusion Network to compose the multi-granularity features dynamically by considering the consistency of routing path and modality-specific information simultaneously.

Image Retrieval Retrieval

SOEDiff: Efficient Distillation for Small Object Editing

no code implementations • 15 May 2024 • Yiming Wu, Qihe Pan, Zhen Zhao, Zicheng Wang, Sifan Long, Ronghua Liang

In this paper, we delve into a new task known as small object editing (SOE), which focuses on text-based image inpainting within a constrained, small-sized area.

Image Inpainting Object

Training-Free Unsupervised Prompt for Vision-Language Models

1 code implementation • 25 Apr 2024 • Sifan Long, Linbin Wang, Zhen Zhao, Zichang Tan, Yiming Wu, Shengsheng Wang, Jingdong Wang

In light of this, we propose Training-Free Unsupervised Prompts (TFUP), which maximally preserves the inherent representation capabilities and enhances them with a residual connection to similarity-based prediction probabilities in a training-free and labeling-free manner.

Prompt Learning

Progressive Classifier and Feature Extractor Adaptation for Unsupervised Domain Adaptation on Point Clouds

1 code implementation • 27 Nov 2023 • Zicheng Wang, Zhen Zhao, Yiming Wu, Luping Zhou, Dong Xu

In this work, we propose a novel framework that deeply couples the classifier and feature extractor adaption for 3D UDA, dubbed Progressive Classifier and Feature Extractor Adaptation (PCFEA).

Self-Supervised Learning Unsupervised Domain Adaptation

Panoptic Scene Graph Generation with Semantics-Prototype Learning

1 code implementation • 28 Jul 2023 • Li Li, Wei Ji, Yiming Wu, Mengze Li, You Qin, Lina Wei, Roger Zimmermann

To promise consistency and accuracy during the transfer process, we propose to measure the invariance of representations in each predicate class, and learn unbiased prototypes of predicates with different intensities.

Graph Generation Panoptic Scene Graph Generation

HeightFormer: Explicit Height Modeling without Extra Data for Camera-only 3D Object Detection in Bird's Eye View

no code implementations • 25 Jul 2023 • Yiming Wu, Ruixiang Li, Zequn Qin, Xinhai Zhao, Xi Li

In this work, we propose to explicitly model heights in the BEV space, which needs no extra data like LiDAR and can fit arbitrary camera rigs and types compared to modeling depths.

3D Object Detection Autonomous Driving +1

MRTNet: Multi-Resolution Temporal Network for Video Sentence Grounding

no code implementations • 26 Dec 2022 • Wei Ji, Long Chen, Yinwei Wei, Yiming Wu, Tat-Seng Chua

In this work, we propose a novel multi-resolution temporal video sentence grounding network: MRTNet, which consists of a multi-modal feature encoder, a Multi-Resolution Temporal (MRT) module, and a predictor module.

Decoder Descriptive +1

D3T-GAN: Data-Dependent Domain Transfer GANs for Few-shot Image Generation

no code implementations • 12 May 2022 • Xintian Wu, Huanyu Wang, Yiming Wu, Xi Li

To transfer knowledge between discriminators, we design a multi-level discriminant knowledge distillation from the source discriminator to the target discriminator on both the real and fake samples.

Image Generation Knowledge Distillation +1

F3A-GAN: Facial Flow for Face Animation with Generative Adversarial Networks

no code implementations • 12 May 2022 • Xintian Wu, Qihang Zhang, Yiming Wu, Huanyu Wang, Songyuan Li, Lingyun Sun, Xi Li

Formulated as a conditional generation problem, face animation aims at synthesizing continuous face images from a single source image driven by a set of conditional face motion.

MGH: Metadata Guided Hypergraph Modeling for Unsupervised Person Re-identification

1 code implementation • 12 Oct 2021 • Yiming Wu, Xintian Wu, Xi Li, Jian Tian

As a challenging task, unsupervised person ReID aims to match query images to the same identity without requiring any labeled information.

Unsupervised Person Re-Identification

Adaptive Graph Representation Learning for Video Person Re-identification

1 code implementation • 5 Sep 2019 • Yiming Wu, Omar El Farouk Bourahla, Xi Li, Fei Wu, Qi Tian, Xue Zhou

While correlations between parts are ignored in the previous methods, to leverage the relations of different parts, we propose an innovative adaptive graph representation learning scheme for video person Re-ID, which enables the contextual interactions between relevant regional features.

Graph Representation Learning Video-Based Person Re-Identification

An Enhanced Ad Event-Prediction Method Based on Feature Engineering

no code implementations • 3 Jul 2019 • Saeid Soheily Khah, Yiming Wu

In digital advertising, Click-Through Rate (CTR) and Conversion Rate (CVR) are very important metrics for evaluating ad performance.

Feature Engineering Marketing +1

ChamNet: Towards Efficient Network Design through Platform-Aware Model Adaptation

1 code implementation • CVPR 2019 • Xiaoliang Dai, Peizhao Zhang, Bichen Wu, Hongxu Yin, Fei Sun, Yanghan Wang, Marat Dukhan, Yunqing Hu, Yiming Wu, Yangqing Jia, Peter Vajda, Matt Uyttendaele, Niraj K. Jha

We formulate platform-aware NN architecture search in an optimization framework and propose a novel algorithm to search for optimal architectures aided by efficient accuracy and resource (latency and/or energy) predictors.

Bayesian Optimization Efficient Neural Network +2

Context-Aware Deep Spatio-Temporal Network for Hand Pose Estimation from Depth Images

no code implementations • 6 Oct 2018 • Yiming Wu, Wei Ji, Xi Li, Gang Wang, Jianwei Yin, Fei Wu

As a fundamental and challenging problem in computer vision, hand pose estimation aims to estimate the hand joint locations from depth images.

Hand Pose Estimation
