Search Results for author: Mingjie Wang

Found 19 papers, 4 papers with code

Prior-agnostic Multi-scale Contrastive Text-Audio Pre-training for Parallelized TTS Frontend Modeling

no code implementations • 14 Apr 2024 • Quanxiu Wang, Hui Huang, Mingjie Wang, Yong Dai, Jinzuomu Zhong, Benlai Tang

Furthermore, a parallelized TTS frontend model is delicately devised to execute TN, PD, and PBP prediction tasks, respectively, in the second stage.

Polyphone disambiguation

Enhancing Zero-shot Counting via Language-guided Exemplar Learning

no code implementations • 8 Feb 2024 • Mingjie Wang, Jun Zhou, Yong Dai, Eric Buys, Minglun Gong

Recently, the Class-Agnostic Counting (CAC) problem has garnered increasing attention owing to its intriguing generality and superior efficiency compared to Category-Specific Counting (CSC).

Object Counting, Zero-Shot Counting, +1

GazeCLIP: Towards Enhancing Gaze Estimation via Text Guidance

no code implementations • 30 Dec 2023 • Jun Wang, Hao Ruan, Mingjie Wang, Chuanghui Zhang, Huachun Li, Jun Zhou

Over the past decade, visual gaze estimation has garnered growing attention within the research community, thanks to its wide-ranging application scenarios.

Gaze Estimation, Image Generation

Fine-grained Text and Image Guided Point Cloud Completion with CLIP Model

no code implementations • 17 Aug 2023 • Wei Song, Jun Zhou, Mingjie Wang, Hongchen Tan, Nannan Li, Xiuping Liu

In this work, we propose a novel multimodal fusion network for point cloud completion, which can simultaneously fuse visual and textual information to predict the semantic and geometric characteristics of incomplete shapes effectively.

Language Modelling, Point Cloud Completion

CPNet: Exploiting CLIP-based Attention Condenser and Probability Map Guidance for High-fidelity Talking Face Generation

no code implementations • 23 May 2023 • Jingning Xu, Benlai Tang, Mingjie Wang, Minghao Li, Meirong Ma

Recently, talking face generation has drawn ever-increasing attention from the research community in computer vision due to its arduous challenges and widespread application scenarios, e.g., movie animation and virtual anchors.

Talking Face Generation

FER-former: Multi-modal Transformer for Facial Expression Recognition

no code implementations • 23 Mar 2023 • Yande Li, Mingjie Wang, Minglun Gong, Yonggang Lu, Li Liu

The ever-increasing demands for intuitive interactions in Virtual Reality have triggered a boom in the realm of Facial Expression Recognition (FER).

Facial Expression Recognition, Facial Expression Recognition (FER)

GCNet: Probing Self-Similarity Learning for Generalized Counting Network

no code implementations • 10 Feb 2023 • Mingjie Wang, Yande Li, Jun Zhou, Graham W. Taylor, Minglun Gong

The class-agnostic counting (CAC) problem has caught increasing attention recently due to its wide societal applications and arduous challenges.

FedCL: Federated Multi-Phase Curriculum Learning to Synchronously Correlate User Heterogeneity

1 code implementation • 14 Nov 2022 • Mingjie Wang, Jianxiong Guo, Weijia Jia

However, a significant challenge in FL is handling the heterogeneity of local data distribution, which often results in a drifted global model that is difficult to converge.

Federated Learning, Knowledge Distillation, +2

Look Closer to Your Enemy: Learning to Attack via Teacher-Student Mimicking

1 code implementation • 27 Jul 2022 • Mingjie Wang, Jianxiong Guo, Sirui Li, Dingwen Xiao, Zhiqing Tang

Deep neural networks have significantly advanced person re-identification (ReID) applications in the realm of the industrial internet, yet they remain vulnerable.

Adversarial Attack, Domain Adaptation, +1

CrowdMLP: Weakly-Supervised Crowd Counting via Multi-Granularity MLP

no code implementations • 15 Mar 2022 • Mingjie Wang, Jun Zhou, Hao Cai, Minglun Gong

Existing state-of-the-art crowd counting algorithms rely excessively on location-level annotations, which are burdensome to acquire.

Crowd Counting

Towards Realistic Visual Dubbing with Heterogeneous Sources

no code implementations • 17 Jan 2022 • Tianyi Xie, Liucheng Liao, Cheng Bi, Benlai Tang, Xiang Yin, Jianfei Yang, Mingjie Wang, Jiali Yao, Yang Zhang, Zejun Ma

The task of few-shot visual dubbing focuses on synchronizing the lip movements with arbitrary speech input for any talking head video.

Disentanglement, Talking Head Generation

Towards Using Clothes Style Transfer for Scenario-aware Person Video Generation

1 code implementation • 14 Oct 2021 • Jingning Xu, Benlai Tang, Mingjie Wang, Siyuan Bian, Wenyi Guo, Xiang Yin, Zejun Ma

To tackle this problem, most recent AdaIN-based architectures are proposed to extract clothes and scenario features for generation.

Style Transfer, Video Generation

Local Aggressive Adversarial Attacks on 3D Point Cloud

1 code implementation • 19 May 2021 • Yiming Sun, Feng Chen, Zhiyu Chen, Mingjie Wang

However, perturbations of global points are not effective at misleading the victim model.

Adversarial Attack, Image to 3D

Improvement of Normal Estimation for PointClouds via Simplifying Surface Fitting

no code implementations • 21 Apr 2021 • Jun Zhou, Wei Jin, Mingjie Wang, Xiuping Liu, Zhiyang Li, Zhaobin Liu

Firstly, a dynamic top-k selection strategy is introduced to better focus on the most critical points of a given patch, and the points selected by our learning method tend to fit a surface by way of a simple tangent plane, which can dramatically improve the normal estimation results of patches with sharp corners or complex patterns.

Fast and Accurate Normal Estimation for Point Cloud via Patch Stitching

no code implementations • 30 Mar 2021 • Jun Zhou, Wei Jin, Mingjie Wang, Xiuping Liu, Zhiyang Li, Zhaobin Liu

At the stitching stage, we use the learned weights of multi-branch planar experts and distance weights between points to select the best normal from the overlapping parts.

Retrieval

STNet: Scale Tree Network with Multi-level Auxiliator for Crowd Counting

no code implementations • 18 Dec 2020 • Mingjie Wang, Hao Cai, XianFeng Han, Jun Zhou, Minglun Gong

To battle the ingrained issue of accuracy degradation, we propose a novel and powerful network called Scale Tree Network (STNet) for accurate crowd counting.

Crowd Counting

Interlayer and Intralayer Scale Aggregation for Scale-invariant Crowd Counting

no code implementations • 25 May 2020 • Mingjie Wang, Hao Cai, Jun Zhou, Minglun Gong

Crowd counting is an important vision task, which faces challenges on continuous scale variation within a given scene and huge density shift both within and across images.

Crowd Counting

Multi-scale Convolution Aggregation and Stochastic Feature Reuse for DenseNets

no code implementations • 2 Oct 2018 • Mingjie Wang, Jun Zhou, Wendong Mao, Minglun Gong

To address this problem, a regularization method named Stochastic Feature Reuse is also presented.
