Search Results for author: Wenbin Wang

Found 19 papers, 8 papers with code

Towards Robust Multimodal Sentiment Analysis with Incomplete Data

1 code implementation 30 Sep 2024 Haoyu Zhang, Wenbin Wang, Tianshu Yu

The field of Multimodal Sentiment Analysis (MSA) has recently witnessed an emerging direction seeking to tackle the issue of data incompleteness.

Multimodal Sentiment Analysis

Divide, Conquer and Combine: A Training-Free Framework for High-Resolution Image Perception in Multimodal Large Language Models

1 code implementation 28 Aug 2024 Wenbin Wang, Liang Ding, Minyan Zeng, Xiabin Zhou, Li Shen, Yong Luo, DaCheng Tao

Building upon this insight, we propose Divide, Conquer and Combine (DC$^2$), a novel training-free framework for enhancing MLLM perception of HR images.

2k 4k +1

STAR: A First-Ever Dataset and A Large-Scale Benchmark for Scene Graph Generation in Large-Size Satellite Imagery

3 code implementations 13 Jun 2024 Yansheng Li, LinLin Wang, Tingzhu Wang, Xue Yang, Junwei Luo, Qi Wang, Youming Deng, Wenbin Wang, Xian Sun, Haifeng Li, Bo Dang, Yongjun Zhang, Yi Yu, Junchi Yan

This paper constructs a large-scale dataset for SGG in large-size VHR SAI with image sizes ranging from 512 x 768 to 27,860 x 31,096 pixels, named STAR (Scene graph generaTion in lArge-size satellite imageRy), encompassing over 210K objects and over 400K triplets.

Graph Generation Object +3

USAT: A Universal Speaker-Adaptive Text-to-Speech Approach

1 code implementation 28 Apr 2024 Wenbin Wang, Yang Song, Sanjay Jha

To prevent catastrophic forgetting and reduce storage implications for few-shot speaker adaptation, we designed two adapters and a unique adaptation procedure.

Decoder Text to Speech

Principled Preferential Bayesian Optimization

1 code implementation 8 Feb 2024 Wenjie Xu, Wenbin Wang, Yuning Jiang, Bratislav Svetozarevic, Colin N. Jones

We study the problem of preferential Bayesian optimization (BO), where we aim to optimize a black-box function with only preference feedback over a pair of candidate solutions.

Bayesian Optimization Gaussian Processes

Dynamic Association Learning of Self-Attention and Convolution in Image Restoration

no code implementations 9 Nov 2023 Kui Jiang, Xuemei Jia, Wenxin Huang, Wenbin Wang, Zheng Wang, Junjun Jiang

Thus, we propose to refine background textures with the predicted degradation prior in an association learning manner.

Decoder Image Restoration +1

Generalizable Zero-Shot Speaker Adaptive Speech Synthesis with Disentangled Representations

no code implementations 24 Aug 2023 Wenbin Wang, Yang Song, Sanjay Jha

However, most current approaches suffer from the degradation of naturalness and speaker similarity when synthesizing speech for unseen speakers (i.e., speakers not in the training dataset) due to the poor generalizability of the model in out-of-distribution data.

Representation Learning Speech Synthesis +2

Pose-disentangled Contrastive Learning for Self-supervised Facial Representation

1 code implementation CVPR 2023 Yuanyuan Liu, Wenbin Wang, Yibing Zhan, Shaoze Feng, Kejun Liu, Zhe Chen

Self-supervised facial representation has recently attracted increasing attention due to its ability to perform face understanding without relying heavily on large-scale annotated datasets.

Contrastive Learning Data Augmentation +7

AutoLV: Automatic Lecture Video Generator

no code implementations 19 Sep 2022 Wenbin Wang, Yang Song, Sanjay Jha

We propose an end-to-end lecture video generation system that can generate realistic and complete lecture videos directly from annotated slides, instructor's reference voice and instructor's reference portrait video.

Speech Synthesis Talking Head Generation +1

Expression Snippet Transformer for Robust Video-based Facial Expression Recognition

no code implementations 17 Sep 2021 Yuanyuan Liu, Wenbin Wang, Chuanxu Feng, Haoyu Zhang, Zhe Chen, Yibing Zhan

To this end, we propose to decompose each video into a series of expression snippets, each of which contains a small number of facial movements, and attempt to augment the Transformer's ability for modeling intra-snippet and inter-snippet visual relations, respectively, obtaining the Expression snippet Transformer (EST).

Dynamic Facial Expression Recognition Facial Expression Recognition +1

Topic Scene Graph Generation by Attention Distillation From Caption

no code implementations ICCV 2021 Wenbin Wang, Ruiping Wang, Xilin Chen

To this end, we let the scene graph borrow the ability from the image caption so that it can be a specialist on the basis of remaining all-around, resulting in the so-called Topic Scene Graph.

Caption Generation Graph Generation +1

Magnetoelectric coupling and decoupling in multiferroic hexagonal YbFeO3 thin films

no code implementations 13 Nov 2020 Yu Yun, Xin Li, Arashdeep Singh Thind, Yuewei Yin, Hao Liu, Qiang Li, Wenbin Wang, Alpha T. N'Diaye, Corbyn Mellinger, Xuanyuan Jiang, Rohan Mishra, Xiaoshan Xu

The coupling between ferroelectric and magnetic orders in multiferroic materials and the nature of magnetoelectric (ME) effects are enduring experimental challenges.

Materials Science Other Condensed Matter

Exploring Context and Visual Pattern of Relationship for Scene Graph Generation

no code implementations CVPR 2019 Wenbin Wang, Ruiping Wang, Shiguang Shan, Xilin Chen

Therefore, inspired by the successful application of context to object-oriented tasks, we especially construct context for relationships where all of them are gathered so that the recognition could benefit from their association.

Diversity Graph Generation +3
