Search Results for author: Wanhua Li

Found 22 papers, 14 papers with code

$R^2$-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding

1 code implementation • 2 Apr 2024 • Ye Liu, Jixuan He, Wanhua Li, Junsik Kim, Donglai Wei, Hanspeter Pfister, Chang Wen Chen

Video temporal grounding (VTG) is a fine-grained video understanding problem that aims to ground relevant clips in untrimmed videos given natural language queries.

Highlight Detection Moment Retrieval +4

Joint-Task Regularization for Partially Labeled Multi-Task Learning

1 code implementation • 2 Apr 2024 • Kento Nishi, Junsik Kim, Wanhua Li, Hanspeter Pfister

Multi-task learning has become increasingly popular in the machine learning field, but its practicality is hindered by the need for large, labeled datasets.

Multi-Task Learning

$R^2$-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding

1 code implementation • 31 Mar 2024 • Ye Liu, Jixuan He, Wanhua Li, Junsik Kim, Donglai Wei, Hanspeter Pfister, Chang Wen Chen

Video temporal grounding (VTG) is a fine-grained video understanding problem that aims to ground relevant clips in untrimmed videos given natural language queries.

Ranked #2 on Highlight Detection on QVHighlights

Highlight Detection Moment Retrieval +4

TriSAM: Tri-Plane SAM for zero-shot cortical blood vessel segmentation in VEM images

no code implementations • 25 Jan 2024 • Jia Wan, Wanhua Li, Jason Ken Adhinarta, Atmadeep Banerjee, Evelina Sjostedt, Jingpeng Wu, Jeff Lichtman, Hanspeter Pfister, Donglai Wei

Furthermore, we developed a zero-shot cortical blood vessel segmentation method named TriSAM, which leverages the powerful segmentation model SAM for 3D segmentation.

Benchmarking Segmentation

LangSplat: 3D Language Gaussian Splatting

1 code implementation • 26 Dec 2023 • Minghan Qin, Wanhua Li, Jiawei Zhou, Haoqian Wang, Hanspeter Pfister

Humans live in a 3D world and commonly use natural language to interact with a 3D scene.

Object Localization Semantic Segmentation

394

CLIPTrans: Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation

1 code implementation • ICCV 2023 • Devaansh Gupta, Siddhant Kharbanda, Jiawei Zhou, Wanhua Li, Hanspeter Pfister, Donglai Wei

Simultaneously, there has been an influx of multilingual pre-trained models for NMT and multimodal pre-trained models for vision-language tasks, primarily in English, which have shown exceptional generalisation ability.

Image Captioning Multimodal Machine Translation +2

DiffTalk: Crafting Diffusion Models for Generalized Audio-Driven Portraits Animation

1 code implementation • CVPR 2023 • Shuai Shen, Wenliang Zhao, Zibin Meng, Wanhua Li, Zheng Zhu, Jie Zhou, Jiwen Lu

In this way, the proposed DiffTalk is capable of producing high-quality talking head videos in synchronization with the source audio, and more importantly, it can be naturally generalized across different identities without any further fine-tuning.

Denoising Talking Head Generation

401

CLIP-Cluster: CLIP-Guided Attribute Hallucination for Face Clustering

no code implementations • ICCV 2023 • Shuai Shen, Wanhua Li, Xiaobing Wang, Dafeng Zhang, Zhezhu Jin, Jie Zhou, Jiwen Lu

Furthermore, we develop a neighbor-aware proxy generator that fuses the features describing various attributes into a proxy feature to build a bridge among different sub-clusters and reduce the intra-class variance.

Attribute Clustering +2

Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis

1 code implementation • 24 Jul 2022 • Shuai Shen, Wanhua Li, Zheng Zhu, Yueqi Duan, Jie Zhou, Jiwen Lu

Thus the facial radiance field can be flexibly adjusted to the new identity with few reference images.

Talking Face Generation Talking Head Generation

329

Label2Label: A Language Modeling Framework for Multi-Attribute Learning

1 code implementation • 18 Jul 2022 • Wanhua Li, Zhexuan Cao, Jianjiang Feng, Jie Zhou, Jiwen Lu

As each sample is annotated with multiple attribute labels, these "words" will naturally form an unordered but meaningful "sentence", which depicts the semantic information of the corresponding sample.

Ranked #1 on Clothing Attribute Recognition on Clothing Attributes Dataset

Attribute Clothing Attribute Recognition +4

MetaAge: Meta-Learning Personalized Age Estimators

1 code implementation • 12 Jul 2022 • Wanhua Li, Jiwen Lu, Abudukelimu Wuerkaixi, Jianjiang Feng, Jie Zhou

Unlike most existing personalized methods that learn the parameters of a personalized estimator for each person in the training set, our method learns the mapping from identity information to age estimator parameters.

Ranked #1 on Age Estimation on ChaLearn 2015

Age Estimation Meta-Learning +1

OrdinalCLIP: Learning Rank Prompts for Language-Guided Ordinal Regression

1 code implementation • 6 Jun 2022 • Wanhua Li, Xiaoke Huang, Zheng Zhu, Yansong Tang, Xiu Li, Jie Zhou, Jiwen Lu

In this paper, we propose to learn the rank concepts from the rich semantic CLIP latent space.

Ranked #1 on Few-shot Age Estimation on MORPH Album2

Aesthetics Quality Assessment Few-shot Age Estimation +4

3rd Place Scheme on Instance Segmentation Track of ICCV 2021 VIPriors Challenges

no code implementations • 1 Oct 2021 • PengYu Chen, Wanhua Li

In the end, our team achieved the result of 0.366 for AP@0.50:0.95 on the test set, which is competitive with other top-ranking methods while only one GPU is used.

Data Augmentation Instance Segmentation +2

Reasoning Graph Networks for Kinship Verification: from Star-shaped to Hierarchical

no code implementations • 6 Sep 2021 • Wanhua Li, Jiwen Lu, Abudukelimu Wuerkaixi, Jianjiang Feng, Jie Zhou

To address this, we propose a Star-shaped Reasoning Graph Network (S-RGN).

Ranked #1 on Kinship Verification on KinFaceW-I

Kinship Verification Relational Reasoning

Structure-Aware Face Clustering on a Large-Scale Graph With $10^{7}$ Nodes

1 code implementation • CVPR 2021 • Shuai Shen, Wanhua Li, Zheng Zhu, Guan Huang, Dalong Du, Jiwen Lu, Jie Zhou

To address the dilemma of large-scale training and efficient inference, we propose the STructure-AwaRe Face Clustering (STAR-FC) method.

Clustering Face Clustering +1

Meta-Mining Discriminative Samples for Kinship Verification

no code implementations • CVPR 2021 • Wanhua Li, Shiwei Wang, Jiwen Lu, Jianjiang Feng, Jie Zhou

In the end, the samples in the unbalanced train batch are re-weighted by the learned meta-miner to optimize the kinship models.

Ranked #1 on Kinship Verification on KinFaceW-II

Kinship Verification

Learning Probabilistic Ordinal Embeddings for Uncertainty-Aware Regression

1 code implementation • CVPR 2021 • Wanhua Li, Xiaoke Huang, Jiwen Lu, Jianjiang Feng, Jie Zhou

An ordinal distribution constraint is proposed to exploit the ordinal nature of regression.

Ranked #2 on Age Estimation on Adience

Aesthetics Quality Assessment Age And Gender Classification +3

Structure-Aware Face Clustering on a Large-Scale Graph with $10^{7}$ Nodes

1 code implementation • 24 Mar 2021 • Shuai Shen, Wanhua Li, Zheng Zhu, Guan Huang, Dalong Du, Jiwen Lu, Jie Zhou

To address the dilemma of large-scale training and efficient inference, we propose the STructure-AwaRe Face Clustering (STAR-FC) method.

Clustering Face Clustering +1

Frequency-Aware Spatiotemporal Transformers for Video Inpainting Detection

no code implementations • ICCV 2021 • Bingyao Yu, Wanhua Li, Xiu Li, Jiwen Lu, Jie Zhou

In this paper, we propose a Frequency-Aware Spatiotemporal Transformer (FAST) for video inpainting detection, which aims to simultaneously mine the traces of video inpainting from spatial, temporal, and frequency domains.

Video Inpainting

Graph-Based Social Relation Reasoning

1 code implementation • ECCV 2020 • Wanhua Li, Yueqi Duan, Jiwen Lu, Jianjiang Feng, Jie Zhou

Human beings are fundamentally sociable: we generally organize our social lives in terms of relations with other people.

Ranked #1 on Visual Social Relationship Recognition on PIPA

Relation Relational Reasoning +1

Graph-based Kinship Reasoning Network

no code implementations • 22 Apr 2020 • Wanhua Li, Yingqiang Zhang, Kangchen Lv, Jiwen Lu, Jianjiang Feng, Jie Zhou

In this paper, we propose a graph-based kinship reasoning (GKR) network for kinship verification, which aims to effectively perform relational reasoning on the extracted features of an image pair.

Ranked #3 on Kinship Verification on KinFaceW-II

Kinship Verification Relational Reasoning

BridgeNet: A Continuity-Aware Probabilistic Network for Age Estimation

no code implementations • CVPR 2019 • Wanhua Li, Jiwen Lu, Jianjiang Feng, Chunjing Xu, Jie Zhou, Qi Tian

Existing methods for age estimation usually apply a divide-and-conquer strategy to deal with heterogeneous data caused by the non-stationary aging process.

Ranked #2 on Age Estimation on FGNET

Age Estimation MORPH
