Search Results for author: Quan Wang

Found 90 papers, 48 papers with code

Eyeglasses 3D shape reconstruction from a single face image

no code implementations ECCV 2020 Yating Wang, Quan Wang, Feng Xu

A complete 3D face reconstruction requires explicitly modeling the eyeglasses on the face, which is less investigated in the literature.

3D Face Reconstruction 3D Reconstruction +2

EmRel: Joint Representation of Entities and Embedded Relations for Multi-triple Extraction

1 code implementation NAACL 2022 Benfeng Xu, Quan Wang, Yajuan Lyu, Yabing Shi, Yong Zhu, Jie Gao, Zhendong Mao

Multi-triple extraction is a challenging task due to the existence of informative inter-triple correlations, and consequently rich interactions across the constituent entities and relations. While existing works only explore entity representations, we propose to explicitly introduce relation representation, jointly represent it with entities, and novelly align them to identify valid triples. We perform comprehensive experiments on document-level relation extraction and joint entity and relation extraction along with ablations to demonstrate the advantage of the proposed method.

Document-level Relation Extraction Joint Entity and Relation Extraction

Learn and Review: Enhancing Continual Named Entity Recognition via Reviewing Synthetic Samples

no code implementations Findings (ACL) 2022 Yu Xia, Quan Wang, Yajuan Lyu, Yong Zhu, Wenhao Wu, Sujian Li, Dai Dai

However, the existing method depends on the relevance between tasks and is prone to inter-type confusion. In this paper, we propose a novel two-stage framework, Learn-and-Review (L&R), for continual NER under the type-incremental setting to alleviate the above issues. Specifically, for the learning stage, we distill the old knowledge from a teacher to a student on the current dataset.

Continual Named Entity Recognition named-entity-recognition +2

Towards Word-Level End-to-End Neural Speaker Diarization with Auxiliary Network

no code implementations 15 Sep 2023 Yiling Huang, Weiran Wang, Guanlong Zhao, Hank Liao, Wei Xia, Quan Wang

Whether it is the conventional modularized approach or the more recent end-to-end neural diarization (EEND), an additional automatic speech recognition (ASR) model and an orchestration algorithm are required to associate the speaker labels with recognized words.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

USM-SCD: Multilingual Speaker Change Detection Based on Large Pretrained Foundation Models

no code implementations 14 Sep 2023 Guanlong Zhao, Yongqiang Wang, Jason Pelecanos, Yu Zhang, Hank Liao, Yiling Huang, Han Lu, Quan Wang

We show that the USM-SCD model can achieve more than 75% average speaker change detection F1 score across a test set that consists of data from 96 languages.

Change Detection

DeformToon3D: Deformable 3D Toonification from Neural Radiance Fields

no code implementations 8 Sep 2023 Junzhe Zhang, Yushi Lan, Shuai Yang, Fangzhou Hong, Quan Wang, Chai Kiat Yeo, Ziwei Liu, Chen Change Loy

In this paper, we address the challenging problem of 3D toonification, which involves transferring the style of an artistic domain onto a target 3D face with stylized geometry and texture.

ExpertPrompting: Instructing Large Language Models to be Distinguished Experts

1 code implementation 24 May 2023 Benfeng Xu, An Yang, Junyang Lin, Quan Wang, Chang Zhou, Yongdong Zhang, Zhendong Mao

The answering quality of an aligned large language model (LLM) can be drastically improved if treated with proper crafting of prompts.

Instruction Following Language Modelling +1

Not All Image Regions Matter: Masked Vector Quantization for Autoregressive Image Generation

1 code implementation CVPR 2023 Mengqi Huang, Zhendong Mao, Quan Wang, Yongdong Zhang

Existing autoregressive models follow the two-stage generation paradigm that first learns a codebook in the latent space for image reconstruction and then completes the image generation autoregressively based on the learned codebook.

Image Generation Image Reconstruction +1

$k$NN Prompting: Beyond-Context Learning with Calibration-Free Nearest Neighbor Inference

1 code implementation 24 Mar 2023 Benfeng Xu, Quan Wang, Zhendong Mao, Yajuan Lyu, Qiaoqiao She, Yongdong Zhang

In-Context Learning (ICL), which formulates target tasks as prompt completion conditioned on in-context demonstrations, has become the prevailing utilization of LLMs.

CGOF++: Controllable 3D Face Synthesis with Conditional Generative Occupancy Fields

no code implementations 23 Nov 2022 Keqiang Sun, Shangzhe Wu, Ning Zhang, Zhaoyang Huang, Quan Wang, Hongsheng Li

Capitalizing on the recent advances in image generation models, existing controllable face image synthesis methods are able to generate high-fidelity images with some levels of controllability, e.g., controlling the shapes, expressions, textures, and poses of the generated face images.

Face Generation

Augmenting Transformer-Transducer Based Speaker Change Detection With Token-Level Training Loss

no code implementations 11 Nov 2022 Guanlong Zhao, Quan Wang, Han Lu, Yiling Huang, Ignacio Lopez Moreno

Due to the sparsity of the speaker changes in the training data, the conventional T-T based SCD model loss leads to sub-optimal detection accuracy.

Change Detection

Exploring Sequence-to-Sequence Transformer-Transducer Models for Keyword Spotting

no code implementations 11 Nov 2022 Beltrán Labrador, Guanlong Zhao, Ignacio López Moreno, Angelo Scorza Scarpati, Liam Fowl, Quan Wang

In this paper, we present a novel approach to adapt a sequence-to-sequence Transformer-Transducer ASR system to the keyword spotting (KWS) task.

Keyword Spotting

Highly Efficient Real-Time Streaming and Fully On-Device Speaker Diarization with Multi-Stage Clustering

1 code implementation 25 Oct 2022 Quan Wang, Yiling Huang, Han Lu, Guanlong Zhao, Ignacio Lopez Moreno

While recent research advances in speaker diarization mostly focus on improving the quality of diarization results, there is also an increasing interest in improving the efficiency of diarization systems.

Clustering speaker-diarization +1

Improving Chinese Spelling Check by Character Pronunciation Prediction: The Effects of Adaptivity and Granularity

1 code implementation 20 Oct 2022 Jiahao Li, Quan Wang, Zhendong Mao, Junbo Guo, Yanyan Yang, Yongdong Zhang

In this paper, we consider introducing an auxiliary task of Chinese pronunciation prediction (CPP) to improve CSC, and, for the first time, systematically discuss the adaptivity and granularity of this auxiliary task.

DSE-GAN: Dynamic Semantic Evolution Generative Adversarial Network for Text-to-Image Generation

no code implementations 3 Sep 2022 Mengqi Huang, Zhendong Mao, Penghui Wang, Quan Wang, Yongdong Zhang

Text-to-image generation aims at generating realistic images which are semantically consistent with the given text.

MFAN: Multi-modal Feature-enhanced Attention Networks for Rumor Detection

1 code implementation 2022 Jiaqi Zheng, Xi Zhang, Sanchuan Guo, Quan Wang, Wenyu Zang, Yongdong Zhang

Rumor spreaders are increasingly taking advantage of multimedia content to attract and mislead news consumers on social media.

Structure-aware Editable Morphable Model for 3D Facial Detail Animation and Manipulation

1 code implementation 19 Jul 2022 Jingwang Ling, Zhibo Wang, Ming Lu, Quan Wang, Chen Qian, Feng Xu

Previous works on morphable models mostly focus on large-scale facial geometry but ignore facial details.

Controllable 3D Face Synthesis with Conditional Generative Occupancy Fields

no code implementations 16 Jun 2022 Keqiang Sun, Shangzhe Wu, Zhaoyang Huang, Ning Zhang, Quan Wang, Hongsheng Li

Capitalizing on the recent advances in image generation models, existing controllable face image synthesis methods are able to generate high-fidelity images with some levels of controllability, e.g., controlling the shapes, expressions, textures, and poses of the generated face images.

Face Generation

Personal VAD 2.0: Optimizing Personal Voice Activity Detection for On-Device Speech Recognition

no code implementations 8 Apr 2022 Shaojin Ding, Rajeev Rikhye, Qiao Liang, Yanzhang He, Quan Wang, Arun Narayanan, Tom O'Malley, Ian McGraw

Personalization of on-device speech recognition (ASR) has seen explosive growth in recent years, largely due to the increasing popularity of personal assistant features on mobile devices and smart home speakers.

Action Detection Activity Detection +2

Fast fluorescence lifetime imaging analysis via extreme learning machine

no code implementations 25 Mar 2022 Zhenya Zang, Dong Xiao, Quan Wang, Zinuo Li, Wujun Xie, Yu Chen, David Day Uei Li

As there is no back-propagation process for ELM during the training phase, the training speed is much higher than existing neural network approaches.

Edge-computing Efficient Neural Network

Bailando: 3D Dance Generation by Actor-Critic GPT with Choreographic Memory

1 code implementation CVPR 2022 Li SiYao, Weijiang Yu, Tianpei Gu, Chunze Lin, Quan Wang, Chen Qian, Chen Change Loy, Ziwei Liu

With the learned choreographic memory, dance generation is realized on the quantized units that meet high choreography standards, such that the generated dancing sequences are confined within the spatial constraints.

Motion Synthesis

Parameter-Free Attentive Scoring for Speaker Verification

1 code implementation 10 Mar 2022 Jason Pelecanos, Quan Wang, Yiling Huang, Ignacio Lopez Moreno

This paper presents a novel study of parameter-free attentive scoring for speaker verification.

Speaker Verification

Closing the Gap between Single-User and Multi-User VoiceFilter-Lite

no code implementations 24 Feb 2022 Rajeev Rikhye, Quan Wang, Qiao Liang, Yanzhang He, Ian McGraw

However, one limitation of VoiceFilter-Lite, and other speaker-conditioned speech models in general, is that these models are usually limited to a single target speaker.

Speaker Verification speech-recognition +1

A Conformer-based ASR Frontend for Joint Acoustic Echo Cancellation, Speech Enhancement and Speech Separation

no code implementations 18 Nov 2021 Tom O'Malley, Arun Narayanan, Quan Wang, Alex Park, James Walker, Nathan Howard

Compared to the noisy baseline, the joint model reduces the word error rate in low signal-to-noise ratio conditions by at least 71% on our echo cancellation dataset, 10% on our noisy dataset, and 26% on our multi-speaker dataset.

Acoustic echo cancellation Automatic Speech Recognition +4

Building Chinese Biomedical Language Models via Multi-Level Text Discrimination

1 code implementation 14 Oct 2021 Quan Wang, Songtai Dai, Benfeng Xu, Yajuan Lyu, Yong Zhu, Hua Wu, Haifeng Wang

In this work we introduce eHealth, a Chinese biomedical PLM built from scratch with a new pre-training framework.

Domain Adaptation

Multi-user VoiceFilter-Lite via Attentive Speaker Embedding

no code implementations 2 Jul 2021 Rajeev Rikhye, Quan Wang, Qiao Liang, Yanzhang He, Ian McGraw

In this paper, we propose a solution to allow speaker conditioned speech models, such as VoiceFilter-Lite, to support an arbitrary number of enrolled users in a single pass.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Link Prediction on N-ary Relational Facts: A Graph-based Approach

1 code implementation Findings (ACL) 2021 Quan Wang, Haifeng Wang, Yajuan Lyu, Yong Zhu

The key to our approach is to represent the n-ary structure of a fact as a small heterogeneous graph, and model this graph with edge-biased fully-connected attention.

Knowledge Graphs Link Prediction

Inverting Generative Adversarial Renderer for Face Reconstruction

no code implementations CVPR 2021 Jingtan Piao, Keqiang Sun, KwanYee Lin, Quan Wang, Hongsheng Li

Since the GAR learns to model the complicated real-world image, instead of relying on the simplified graphics rules, it is capable of producing realistic images, which essentially inhibits the domain-shift noise in training and optimization.

Face Reconstruction

Personalized Keyphrase Detection using Speaker and Environment Information

no code implementations 28 Apr 2021 Rajeev Rikhye, Quan Wang, Qiao Liang, Yanzhang He, Ding Zhao, Yiteng Huang, Arun Narayanan, Ian McGraw

In this paper, we introduce a streaming keyphrase detection system that can be easily customized to accurately detect any phrase composed of words from a large vocabulary.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Dr-Vectors: Decision Residual Networks and an Improved Loss for Speaker Recognition

no code implementations 5 Apr 2021 Jason Pelecanos, Quan Wang, Ignacio Lopez Moreno

In this work we propose scoring these representations in a way that can capture uncertainty, enroll/test asymmetry and additional non-linear information.

Speaker Recognition

Synth2Aug: Cross-domain speaker recognition with TTS synthesized speech

no code implementations 24 Nov 2020 Yiling Huang, Yutian Chen, Jason Pelecanos, Quan Wang

In recent years, Text-To-Speech (TTS) has been used as a data augmentation technique for speech recognition to help complement inadequacies in the training data.

Data Augmentation Speaker Recognition +2

VoiceFilter-Lite: Streaming Targeted Voice Separation for On-Device Speech Recognition

1 code implementation 9 Sep 2020 Quan Wang, Ignacio Lopez Moreno, Mert Saglam, Kevin Wilson, Alan Chiao, Renjie Liu, Yanzhang He, Wei Li, Jason Pelecanos, Marily Nika, Alexander Gruenstein

We introduce VoiceFilter-Lite, a single-channel source separation model that runs on the device to preserve only the speech signals from a target user, as part of a streaming speech recognition system.

speech-recognition Speech Recognition

Textual Echo Cancellation

no code implementations 13 Aug 2020 Shaojin Ding, Ye Jia, Ke Hu, Quan Wang

In this paper, we propose Textual Echo Cancellation (TEC) - a framework for cancelling the text-to-speech (TTS) playback echo from overlapping speech recordings.

Acoustic echo cancellation speech-recognition +1

Version Control of Speaker Recognition Systems

1 code implementation 23 Jul 2020 Quan Wang, Ignacio Lopez Moreno

In this paper, we describe different version control strategies for speaker recognition systems that have been carefully studied at Google over years of engineering practice.

Speaker Recognition

A Comparative Study on Polyp Classification using Convolutional Neural Networks

no code implementations 12 Jul 2020 Krushi Patel, Kaidong Li, Ke Tao, Quan Wang, Ajay Bansal, Amit Rastogi, Guanghui Wang

In this work, we compare the performance of the state-of-the-art general object classification models for polyp classification.

Classification General Classification

Curriculum Learning for Natural Language Understanding

no code implementations ACL 2020 Benfeng Xu, Licheng Zhang, Zhendong Mao, Quan Wang, Hongtao Xie, Yongdong Zhang

With the great success of pre-trained language models, the pretrain-finetune paradigm has become the undoubtedly dominant solution for natural language understanding (NLU) tasks.

Natural Language Understanding

Interpretable and Efficient Heterogeneous Graph Convolutional Network

1 code implementation 27 May 2020 Yaming Yang, Ziyu Guan, Jian-Xin Li, Wei Zhao, Jiangtao Cui, Quan Wang

However, regarding Heterogeneous Information Network (HIN), existing HIN-oriented GCN methods still suffer from two deficiencies: (1) they cannot flexibly explore all possible meta-paths and extract the most useful ones for a target object, which hinders both effectiveness and interpretability; (2) they often need to generate intermediate meta-path based dense graphs, which leads to high computational complexity.

CoKE: Contextualized Knowledge Graph Embedding

3 code implementations 6 Nov 2019 Quan Wang, Pingping Huang, Haifeng Wang, Songtai Dai, Wenbin Jiang, Jing Liu, Yajuan Lyu, Yong Zhu, Hua Wu

This work presents Contextualized Knowledge Graph Embedding (CoKE), a novel paradigm that takes into account such contextual nature, and learns dynamic, flexible, and fully contextualized entity and relation embeddings.

Knowledge Graph Embedding Link Prediction

D-NET: A Pre-Training and Fine-Tuning Framework for Improving the Generalization of Machine Reading Comprehension

1 code implementation WS 2019 Hongyu Li, Xiyuan Zhang, Yibing Liu, Yiming Zhang, Quan Wang, Xiangyang Zhou, Jing Liu, Hua Wu, Haifeng Wang

In this paper, we introduce a simple system Baidu submitted for MRQA (Machine Reading for Question Answering) 2019 Shared Task that focused on generalization of machine reading comprehension (MRC) models.

Machine Reading Comprehension Multi-Task Learning +1

Make a Face: Towards Arbitrary High Fidelity Face Manipulation

no code implementations ICCV 2019 Shengju Qian, Kwan-Yee Lin, Wayne Wu, Yangxiaokang Liu, Quan Wang, Fumin Shen, Chen Qian, Ran He

Recent studies have shown remarkable success in face manipulation tasks with the advance of GAN and VAE paradigms, but the outputs are sometimes limited to low resolution and lack diversity.

Clustering Disentanglement +1

Personal VAD: Speaker-Conditioned Voice Activity Detection

2 code implementations 12 Aug 2019 Shaojin Ding, Quan Wang, Shuo-Yiin Chang, Li Wan, Ignacio Lopez Moreno

In this paper, we propose "personal VAD", a system to detect the voice activity of a target speaker at the frame level.

Action Detection Activity Detection +4

Adaptive Convolution for Multi-Relational Learning

no code implementations NAACL 2019 Xiaotian Jiang, Quan Wang, Bin Wang

We consider the problem of learning distributed representations for entities and relations of multi-relational data so as to predict missing links therein.

Link Prediction Relational Reasoning

Tuplemax Loss for Language Identification

1 code implementation 29 Nov 2018 Li Wan, Prashant Sridhar, Yang Yu, Quan Wang, Ignacio Lopez Moreno

In many scenarios of a language identification task, the user will specify a small set of languages which he/she can speak instead of a large set of all possible languages.

Language Identification

VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking

4 code implementations 11 Oct 2018 Quan Wang, Hannah Muckenhirn, Kevin Wilson, Prashant Sridhar, Zelin Wu, John Hershey, Rif A. Saurous, Ron J. Weiss, Ye Jia, Ignacio Lopez Moreno

In this paper, we present a novel system that separates the voice of a target speaker from multi-speaker signals, by making use of a reference signal from the target speaker.

Speaker Recognition Speaker Separation +3

Fully Supervised Speaker Diarization

1 code implementation 10 Oct 2018 Aonan Zhang, Quan Wang, Zhenyao Zhu, John Paisley, Chong Wang

In this paper, we propose a fully supervised speaker diarization approach, named unbounded interleaved-state recurrent neural networks (UIS-RNN).

Clustering speaker-diarization +1

An Efficient Approach for Polyps Detection in Endoscopic Videos Based on Faster R-CNN

no code implementations 4 Sep 2018 Xi Mo, Ke Tao, Quan Wang, Guanghui Wang

Polyps have long been considered one of the major etiologies of colorectal cancer, a fatal disease around the world; thus early detection and recognition of polyps play a crucial role in clinical routines.

Look at Boundary: A Boundary-Aware Face Alignment Algorithm

2 code implementations CVPR 2018 Wayne Wu, Chen Qian, Shuo Yang, Quan Wang, Yici Cai, Qiang Zhou

By utilising boundary information of 300-W dataset, our method achieves 3.92% mean error with 0.39% failure rate on COFW dataset, and 1.25% mean error on AFLW-Full dataset.

Ranked #3 on Face Alignment on AFLW-19 (using extra training data)

Face Alignment Facial Landmark Detection

Improving Knowledge Graph Embedding Using Simple Constraints

1 code implementation ACL 2018 Boyang Ding, Quan Wang, Bin Wang, Li Guo

We examine non-negativity constraints on entity representations and approximate entailment constraints on relation representations.

Knowledge Graph Embedding Knowledge Graphs

Links: A High-Dimensional Online Clustering Method

1 code implementation 30 Jan 2018 Philip Andrew Mansfield, Quan Wang, Carlton Downey, Li Wan, Ignacio Lopez Moreno

We present a novel algorithm, called Links, designed to perform online clustering on unit vectors in a high-dimensional Euclidean space.

Clustering Online Clustering +1

Wavenet based low rate speech coding

1 code implementation 1 Dec 2017 W. Bastiaan Kleijn, Felicia S. C. Lim, Alejandro Luebs, Jan Skoglund, Florian Stimberg, Quan Wang, Thomas C. Walters

Traditional parametric coding of speech facilitates low rate but provides poor reconstruction quality because of the inadequacy of the model used.

Bandwidth Extension

Knowledge Graph Embedding with Iterative Guidance from Soft Rules

1 code implementation 30 Nov 2017 Shu Guo, Quan Wang, Lihong Wang, Bin Wang, Li Guo

In this paper, we propose Rule-Guided Embedding (RUGE), a novel paradigm of KG embedding with iterative guidance from soft rules.

Knowledge Graph Embedding Knowledge Graphs +1

Generalized End-to-End Loss for Speaker Verification

28 code implementations 28 Oct 2017 Li Wan, Quan Wang, Alan Papir, Ignacio Lopez Moreno

In this paper, we propose a new loss function called generalized end-to-end (GE2E) loss, which makes the training of speaker verification models more efficient than our previous tuple-based end-to-end (TE2E) loss function.

Domain Adaptation Speaker Verification

Attention-Based Models for Text-Dependent Speaker Verification

2 code implementations 28 Oct 2017 F A Rezaur Rahman Chowdhury, Quan Wang, Ignacio Lopez Moreno, Li Wan

Attention-based models have recently shown great performance on a range of tasks, such as speech recognition, machine translation, and image captioning due to their ability to summarize relevant information that expands through the entire length of an input sequence.

Image Captioning Machine Translation +5

Speaker Diarization with LSTM

4 code implementations 28 Oct 2017 Quan Wang, Carlton Downey, Li Wan, Philip Andrew Mansfield, Ignacio Lopez Moreno

For many years, i-vector based audio embedding techniques were the dominant approach for speaker verification and speaker diarization applications.

Clustering speaker-diarization +2

Label Consistent Fisher Vectors for Supervised Feature Aggregation

1 code implementation 2014 22nd International Conference on Pattern Recognition 2014 Quan Wang, Xin Shen, Meng Wang, Kim L. Boyer

In this paper, we present a simple and efficient way to add supervised information into Fisher vectors, which have become a popular image representation method for image classification and retrieval purposes in recent years.

Classification General Classification +2

Semantic Context Forests for Learning-Based Knee Cartilage Segmentation in 3D MR Images

1 code implementation 11 Jul 2013 Quan Wang, Dijia Wu, Le Lu, Meizhu Liu, Kim L. Boyer, Shaohua Kevin Zhou

The automatic segmentation of human knee cartilage from 3D MR images is a useful yet challenging task due to the thin sheet structure of the cartilage with diffuse boundaries and inhomogeneous intensities.

3D Medical Imaging Segmentation

Feature Learning by Multidimensional Scaling and its Applications in Object Recognition

1 code implementation 14 Jun 2013 Quan Wang, Kim L. Boyer

The aspects of the images that are captured by the learned features, which we call MDS features, completely depend on what kind of image distance measurement is employed.

Object Recognition

GMM-Based Hidden Markov Random Field for Color Image and 3D Volume Segmentation

1 code implementation 18 Dec 2012 Quan Wang

In this project, we first study the Gaussian-based hidden Markov random field (HMRF) model and its expectation-maximization (EM) algorithm.

Image Segmentation Semantic Segmentation

Kernel Principal Component Analysis and its Applications in Face Recognition and Active Shape Models

2 code implementations 15 Jul 2012 Quan Wang

Principal component analysis (PCA) is a popular tool for linear dimensionality reduction and feature extraction.

Dimensionality Reduction Face Recognition +1

HMRF-EM-image: Implementation of the Hidden Markov Random Field Model and its Expectation-Maximization Algorithm

1 code implementation 15 Jul 2012 Quan Wang

In this project, we study the hidden Markov random field (HMRF) model and its expectation-maximization (EM) algorithm.

Image Segmentation Semantic Segmentation

Tracking Tetrahymena Pyriformis Cells using Decision Trees

1 code implementation 13 Jul 2012 Quan Wang, Yan Ou, A. Agung Julius, Kim L. Boyer, Min Jun Kim

Matching cells over time has long been the most difficult step in cell tracking.
