Search Results for author: Quan Wang

Found 102 papers, 55 papers with code

Learn and Review: Enhancing Continual Named Entity Recognition via Reviewing Synthetic Samples

no code implementations • Findings (ACL) 2022 • Yu Xia, Quan Wang, Yajuan Lyu, Yong Zhu, Wenhao Wu, Sujian Li, Dai Dai

However, the existing method depends on the relevance between tasks and is prone to inter-type confusion. In this paper, we propose a novel two-stage framework Learn-and-Review (L&R) for continual NER under the type-incremental setting to alleviate the above issues. Specifically, for the learning stage, we distill the old knowledge from teacher to a student on the current dataset.

Continual Named Entity Recognition named-entity-recognition +2

BDKG at MEDIQA 2021: System Report for the Radiology Report Summarization Task

no code implementations • NAACL (BioNLP) 2021 • Songtai Dai, Quan Wang, Yajuan Lyu, Yong Zhu

This paper presents our winning system at the Radiology Report Summarization track of the MEDIQA 2021 shared task.

Domain Adaptation

Eyeglasses 3D shape reconstruction from a single face image

no code implementations • ECCV 2020 • Yating Wang, Quan Wang, Feng Xu

A complete 3D face reconstruction requires explicitly modeling the eyeglasses on the face, which is less investigated in the literature.

3D Face Reconstruction 3D Reconstruction +2

EmRel: Joint Representation of Entities and Embedded Relations for Multi-triple Extraction

1 code implementation • NAACL 2022 • Benfeng Xu, Quan Wang, Yajuan Lyu, Yabing Shi, Yong Zhu, Jie Gao, Zhendong Mao

Multi-triple extraction is a challenging task due to the existence of informative inter-triple correlations, and consequently rich interactions across the constituent entities and relations. While existing works only explore entity representations, we propose to explicitly introduce relation representation, jointly represent it with entities, and novelly align them to identify valid triples. We perform comprehensive experiments on document-level relation extraction and joint entity and relation extraction along with ablations to demonstrate the advantage of the proposed method.

Document-level Relation Extraction Joint Entity and Relation Extraction +2

Benchmarking and Improving Compositional Generalization of Multi-aspect Controllable Text Generation

1 code implementation • 5 Apr 2024 • Tianqi Zhong, Zhaoyi Li, Quan Wang, Linqi Song, Ying Wei, Defu Lian, Zhendong Mao

Compositional generalization, representing the model's ability to generate text with new attribute combinations obtained by recombining single attributes from the training data, is a crucial property for multi-aspect controllable text generation (MCTG) methods.

Attribute Benchmarking +2

NIV-SSD: Neighbor IoU-Voting Single-Stage Object Detector From Point Cloud

1 code implementation • 23 Jan 2024 • Shuai Liu, Di Wang, Quan Wang, Kai Huang

NIV strategy can serve as a bridge between classification and regression branches by calculating two types of statistical data from the regression output to correct the classification confidence.

Classification Data Augmentation +1

DiarizationLM: Speaker Diarization Post-Processing with Large Language Models

2 code implementations • 7 Jan 2024 • Quan Wang, Yiling Huang, Guanlong Zhao, Evan Clark, Wei Xia, Hank Liao

In this paper, we introduce DiarizationLM, a framework to leverage large language models (LLM) to post-process the outputs from a speaker diarization system.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

312

Benchmarking Large Language Models on Controllable Generation under Diversified Instructions

1 code implementation • 1 Jan 2024 • Yihan Chen, Benfeng Xu, Quan Wang, Yi Liu, Zhendong Mao

While large language models (LLMs) have exhibited impressive instruction-following capabilities, it is still unclear whether and to what extent they can respond to explicit constraints that might be entailed in various instructions.

Benchmarking Instruction Following +1

Improved Long-Form Speech Recognition by Jointly Modeling the Primary and Non-primary Speakers

no code implementations • 18 Dec 2023 • Guru Prakash Arumugam, Shuo-Yiin Chang, Tara N. Sainath, Rohit Prabhavalkar, Quan Wang, Shaan Bijwadia

ASR models often suffer from a long-form deletion problem where the model predicts sequential blanks instead of words when transcribing lengthy audio (on the order of minutes or hours).

speech-recognition Speech Recognition

E-CORE: Emotion Correlation Enhanced Empathetic Dialogue Generation

no code implementations • 25 Nov 2023 • Fengyi Fu, Lei Zhang, Quan Wang, Zhendong Mao

Then we propose an emotion correlation enhanced decoder, with a novel correlation-aware aggregation and soft/hard strategy, respectively improving the emotion perception and response generation.

Dialogue Generation Response Generation

Grammatical Error Correction via Mixed-Grained Weighted Training

no code implementations • 23 Nov 2023 • Jiahao Li, Quan Wang, Chiwei Zhu, Zhendong Mao, Yongdong Zhang

In this paper, the inherent discrepancies are manifested in two aspects, namely, accuracy of data annotation and diversity of potential annotations.

Grammatical Error Correction Sentence

On the Calibration of Large Language Models and Alignment

no code implementations • 22 Nov 2023 • Chiwei Zhu, Benfeng Xu, Quan Wang, Yongdong Zhang, Zhendong Mao

As large language models attract increasing attention and find widespread application, challenges of reliability arise concurrently.

Personalizing Keyword Spotting with Speaker Information

no code implementations • 6 Nov 2023 • Beltrán Labrador, Pai Zhu, Guanlong Zhao, Angelo Scorza Scarpati, Quan Wang, Alicia Lozano-Diez, Alex Park, Ignacio López Moreno

Keyword spotting systems often struggle to generalize to a diverse population with various accents and age groups.

Keyword Spotting Speaker Recognition +1

Random Entity Quantization for Parameter-Efficient Compositional Knowledge Graph Representation

1 code implementation • 24 Oct 2023 • Jiaang Li, Quan Wang, Yi Liu, Licheng Zhang, Zhendong Mao

We analyze this phenomenon and reveal that entity codes, the quantization outcomes for expressing entities, have higher entropy at the code level and Jaccard distance at the codeword level under random entity quantization.

Knowledge Graphs Quantization +1

Air-Decoding: Attribute Distribution Reconstruction for Decoding-Time Controllable Text Generation

1 code implementation • 23 Oct 2023 • Tianqi Zhong, Quan Wang, Jingxuan Han, Yongdong Zhang, Zhendong Mao

Then we design a novel attribute distribution reconstruction method to balance the obtained distributions and use the reconstructed distributions to guide language models for generation, effectively avoiding the issue of Attribute Collapse.

Attribute Text Generation

Towards Word-Level End-to-End Neural Speaker Diarization with Auxiliary Network

no code implementations • 15 Sep 2023 • Yiling Huang, Weiran Wang, Guanlong Zhao, Hank Liao, Wei Xia, Quan Wang

Whether it is the conventional modularized approach or the more recent end-to-end neural diarization (EEND), an additional automatic speech recognition (ASR) model and an orchestration algorithm are required to associate the speaker labels with recognized words.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

USM-SCD: Multilingual Speaker Change Detection Based on Large Pretrained Foundation Models

no code implementations • 14 Sep 2023 • Guanlong Zhao, Yongqiang Wang, Jason Pelecanos, Yu Zhang, Hank Liao, Yiling Huang, Han Lu, Quan Wang

We show that the USM-SCD model can achieve more than 75% average speaker change detection F1 score across a test set that consists of data from 96 languages.

Change Detection

DeformToon3D: Deformable 3D Toonification from Neural Radiance Fields

1 code implementation • 8 Sep 2023 • Junzhe Zhang, Yushi Lan, Shuai Yang, Fangzhou Hong, Quan Wang, Chai Kiat Yeo, Ziwei Liu, Chen Change Loy

In this paper, we address the challenging problem of 3D toonification, which involves transferring the style of an artistic domain onto a target 3D face with stylized geometry and texture.

ExpertPrompting: Instructing Large Language Models to be Distinguished Experts

2 code implementations • 24 May 2023 • Benfeng Xu, An Yang, Junyang Lin, Quan Wang, Chang Zhou, Yongdong Zhang, Zhendong Mao

The answering quality of an aligned large language model (LLM) can be drastically improved if treated with proper crafting of prompts.

In-Context Learning Instruction Following +2

289

Not All Image Regions Matter: Masked Vector Quantization for Autoregressive Image Generation

1 code implementation • CVPR 2023 • Mengqi Huang, Zhendong Mao, Quan Wang, Yongdong Zhang

Existing autoregressive models follow the two-stage generation paradigm that first learns a codebook in the latent space for image reconstruction and then completes the image generation autoregressively based on the learned codebook.

Image Generation Image Reconstruction +1

Inductive Relation Prediction from Relational Paths and Context with Hierarchical Transformers

1 code implementation • 1 Apr 2023 • Jiaang Li, Quan Wang, Zhendong Mao

Relation prediction on knowledge graphs (KGs) is a key research topic.

Inductive Relation Prediction Knowledge Graphs +1

kNN Prompting: Beyond-Context Learning with Calibration-Free Nearest Neighbor Inference

1 code implementation • 24 Mar 2023 • Benfeng Xu, Quan Wang, Zhendong Mao, Yajuan Lyu, Qiaoqiao She, Yongdong Zhang

In-Context Learning (ICL), which formulates target tasks as prompt completion conditioned on in-context demonstrations, has become the prevailing utilization of LLMs.

In-Context Learning

DeformToon3D: Deformable Neural Radiance Fields for 3D Toonification

no code implementations • ICCV 2023 • Junzhe Zhang, Yushi Lan, Shuai Yang, Fangzhou Hong, Quan Wang, Chai Kiat Yeo, Ziwei Liu, Chen Change Loy

In this paper, we address the challenging problem of 3D toonification, which involves transferring the style of an artistic domain onto a target 3D face with stylized geometry and texture.

CGOF++: Controllable 3D Face Synthesis with Conditional Generative Occupancy Fields

no code implementations • 23 Nov 2022 • Keqiang Sun, Shangzhe Wu, Ning Zhang, Zhaoyang Huang, Quan Wang, Hongsheng Li

Capitalizing on the recent advances in image generation models, existing controllable face image synthesis methods are able to generate high-fidelity images with some levels of controllability, e.g., controlling the shapes, expressions, textures, and poses of the generated face images.

Face Generation

Exploring Sequence-to-Sequence Transformer-Transducer Models for Keyword Spotting

no code implementations • 11 Nov 2022 • Beltrán Labrador, Guanlong Zhao, Ignacio López Moreno, Angelo Scorza Scarpati, Liam Fowl, Quan Wang

In this paper, we present a novel approach to adapt a sequence-to-sequence Transformer-Transducer ASR system to the keyword spotting (KWS) task.

Keyword Spotting

Augmenting Transformer-Transducer Based Speaker Change Detection With Token-Level Training Loss

no code implementations • 11 Nov 2022 • Guanlong Zhao, Quan Wang, Han Lu, Yiling Huang, Ignacio Lopez Moreno

Due to the sparsity of the speaker changes in the training data, the conventional T-T based SCD model loss leads to sub-optimal detection accuracy.

Change Detection

Highly Efficient Real-Time Streaming and Fully On-Device Speaker Diarization with Multi-Stage Clustering

1 code implementation • 25 Oct 2022 • Quan Wang, Yiling Huang, Han Lu, Guanlong Zhao, Ignacio Lopez Moreno

While recent research advances in speaker diarization mostly focus on improving the quality of diarization results, there is also an increasing interest in improving the efficiency of diarization systems.

Clustering speaker-diarization +1

490

Improving Chinese Spelling Check by Character Pronunciation Prediction: The Effects of Adaptivity and Granularity

1 code implementation • 20 Oct 2022 • Jiahao Li, Quan Wang, Zhendong Mao, Junbo Guo, Yanyan Yang, Yongdong Zhang

In this paper, we consider introducing an auxiliary task of Chinese pronunciation prediction (CPP) to improve CSC, and, for the first time, systematically discuss the adaptivity and granularity of this auxiliary task.

A Universally-Deployable ASR Frontend for Joint Acoustic Echo Cancellation, Speech Enhancement, and Voice Separation

no code implementations • 14 Sep 2022 • Tom O'Malley, Arun Narayanan, Quan Wang

The joint model uses contextual information, such as a reference of the playback audio, noise context, and speaker embedding.

Acoustic echo cancellation Automatic Speech Recognition +3

Compact and Robust Deep Learning Architecture for Fluorescence Lifetime Imaging and FPGA Implementation

no code implementations • 7 Sep 2022 • Zhenya Zang, Dong Xiao, Quan Wang, Ziao Jiao, Chen Yu, David Day-Uei Li

FLAN+LS on hardware achieves the highest computing efficiency compared to 1-D CNN and FLAN.

Quantization

DSE-GAN: Dynamic Semantic Evolution Generative Adversarial Network for Text-to-Image Generation

no code implementations • 3 Sep 2022 • Mengqi Huang, Zhendong Mao, Penghui Wang, Quan Wang, Yongdong Zhang

Text-to-image generation aims at generating realistic images which are semantically consistent with the given text.

Generative Adversarial Network Text-to-Image Generation

MFAN: Multi-modal Feature-enhanced Attention Networks for Rumor Detection

1 code implementation • 2022 • Jiaqi Zheng, Xi Zhang, Sanchuan Guo, Quan Wang, Wenyu Zang, Yongdong Zhang

Rumor spreaders are increasingly taking advantage of multimedia content to attract and mislead news consumers on social media.

Structure-aware Editable Morphable Model for 3D Facial Detail Animation and Manipulation

1 code implementation • 19 Jul 2022 • Jingwang Ling, Zhibo Wang, Ming Lu, Quan Wang, Chen Qian, Feng Xu

Previous works on morphable models mostly focus on large-scale facial geometry but ignore facial details.

Controllable 3D Face Synthesis with Conditional Generative Occupancy Fields

no code implementations • 16 Jun 2022 • Keqiang Sun, Shangzhe Wu, Zhaoyang Huang, Ning Zhang, Quan Wang, Hongsheng Li

Face Generation

Personal VAD 2.0: Optimizing Personal Voice Activity Detection for On-Device Speech Recognition

no code implementations • 8 Apr 2022 • Shaojin Ding, Rajeev Rikhye, Qiao Liang, Yanzhang He, Quan Wang, Arun Narayanan, Tom O'Malley, Ian McGraw

Personalization of on-device speech recognition (ASR) has seen explosive growth in recent years, largely due to the increasing popularity of personal assistant features on mobile devices and smart home speakers.

Action Detection Activity Detection +2

Fast fluorescence lifetime imaging analysis via extreme learning machine

no code implementations • 25 Mar 2022 • Zhenya Zang, Dong Xiao, Quan Wang, Zinuo Li, Wujun Xie, Yu Chen, David Day-Uei Li

As there is no back-propagation process for ELM during the training phase, the training speed is much higher than existing neural network approaches.

Edge-computing Efficient Neural Network

Bailando: 3D Dance Generation by Actor-Critic GPT with Choreographic Memory

1 code implementation • CVPR 2022 • Li SiYao, Weijiang Yu, Tianpei Gu, Chunze Lin, Quan Wang, Chen Qian, Chen Change Loy, Ziwei Liu

With the learned choreographic memory, dance generation is realized on the quantized units that meet high choreography standards, such that the generated dancing sequences are confined within the spatial constraints.

Ranked #1 on Motion Synthesis on AIST++

Motion Synthesis

366

Parameter-Free Attentive Scoring for Speaker Verification

1 code implementation • 10 Mar 2022 • Jason Pelecanos, Quan Wang, Yiling Huang, Ignacio Lopez Moreno

This paper presents a novel study of parameter-free attentive scoring for speaker verification.

Speaker Verification

312

Closing the Gap between Single-User and Multi-User VoiceFilter-Lite

no code implementations • 24 Feb 2022 • Rajeev Rikhye, Quan Wang, Qiao Liang, Yanzhang He, Ian McGraw

However, one limitation of VoiceFilter-Lite, and other speaker-conditioned speech models in general, is that these models are usually limited to a single target speaker.

Speaker Verification speech-recognition +1

Attentive Temporal Pooling for Conformer-based Streaming Language Identification in Long-form Speech

1 code implementation • 24 Feb 2022 • Quan Wang, Yang Yu, Jason Pelecanos, Yiling Huang, Ignacio Lopez Moreno

In this paper, we introduce a novel language identification system based on conformer layers.

Domain Adaptation Language Identification

312

Microservice Deployment in Edge Computing Based on Deep Q Learning

1 code implementation • IEEE Transactions on Parallel and Distributed Systems 2022 • Wenkai Lv, Quan Wang, Pengfei Yang

Then, we propose a multi-objective microservice deployment problem (MMDP) in edge computing.

Edge-computing Q-Learning

CVSS Corpus and Massively Multilingual Speech-to-Speech Translation

1 code implementation • LREC 2022 • Ye Jia, Michelle Tadmor Ramanovich, Quan Wang, Heiga Zen

In addition, CVSS provides normalized translation text which matches the pronunciation in the translation speech.

Sentence Speech-to-Speech Translation +2

169

Negative-Aware Attention Framework for Image-Text Matching

1 code implementation • CVPR 2022 • Kun Zhang, Zhendong Mao, Quan Wang, Yongdong Zhang

Image-text matching, as a fundamental task, bridges the gap between vision and language.

Image-text matching Text Matching +1

A Conformer-based ASR Frontend for Joint Acoustic Echo Cancellation, Speech Enhancement and Speech Separation

no code implementations • 18 Nov 2021 • Tom O'Malley, Arun Narayanan, Quan Wang, Alex Park, James Walker, Nathan Howard

Compared to the noisy baseline, the joint model reduces the word error rate in low signal-to-noise ratio conditions by at least 71% on our echo cancellation dataset, 10% on our noisy dataset, and 26% on our multi-speaker dataset.

Acoustic echo cancellation Automatic Speech Recognition +4

Cross-attention conformer for context modeling in speech enhancement for ASR

no code implementations • 30 Oct 2021 • Arun Narayanan, Chung-Cheng Chiu, Tom O'Malley, Quan Wang, Yanzhang He

This work introduces cross-attention conformer, an attention-based architecture for context modeling in speech enhancement.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Building Chinese Biomedical Language Models via Multi-Level Text Discrimination

1 code implementation • 14 Oct 2021 • Quan Wang, Songtai Dai, Benfeng Xu, Yajuan Lyu, Yong Zhu, Hua Wu, Haifeng Wang

In this work we introduce eHealth, a Chinese biomedical PLM built from scratch with a new pre-training framework.

Domain Adaptation

11,418

Turn-to-Diarize: Online Speaker Diarization Constrained by Transformer Transducer Speaker Turn Detection

1 code implementation • 23 Sep 2021 • Wei Xia, Han Lu, Quan Wang, Anshuman Tripathi, Yiling Huang, Ignacio Lopez Moreno, Hasim Sak

In this paper, we present a novel speaker diarization system for streaming on-device applications.

Clustering speaker-diarization +1

490

Learning Oculomotor Behaviors from Scanpath

1 code implementation • 11 Aug 2021 • Beibin Li, Nicholas Nuechterlein, Erin Barney, Claire Foster, Minah Kim, Monique Mahony, Adham Atyabi, Li Feng, Quan Wang, Pamela Ventola, Linda Shapiro, Frederick Shic

Identifying oculomotor behaviors relevant for eye-tracking applications is a critical but often challenging task.

Contrastive Learning

Multi-user VoiceFilter-Lite via Attentive Speaker Embedding

no code implementations • 2 Jul 2021 • Rajeev Rikhye, Quan Wang, Qiao Liang, Yanzhang He, Ian McGraw

In this paper, we propose a solution to allow speaker conditioned speech models, such as VoiceFilter-Lite, to support an arbitrary number of enrolled users in a single pass.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Link Prediction on N-ary Relational Facts: A Graph-based Approach

1 code implementation • Findings (ACL) 2021 • Quan Wang, Haifeng Wang, Yajuan Lyu, Yong Zhu

The key to our approach is to represent the n-ary structure of a fact as a small heterogeneous graph, and model this graph with edge-biased fully-connected attention.

Knowledge Graphs Link Prediction

1,694

Inverting Generative Adversarial Renderer for Face Reconstruction

no code implementations • CVPR 2021 • Jingtan Piao, Keqiang Sun, KwanYee Lin, Quan Wang, Hongsheng Li

Since the GAR learns to model the complicated real-world image, instead of relying on the simplified graphics rules, it is capable of producing realistic images, which essentially inhibits the domain-shift noise in training and optimization.

Face Reconstruction

Personalized Keyphrase Detection using Speaker and Environment Information

no code implementations • 28 Apr 2021 • Rajeev Rikhye, Quan Wang, Qiao Liang, Yanzhang He, Ding Zhao, Yiteng Huang, Arun Narayanan, Ian McGraw

In this paper, we introduce a streaming keyphrase detection system that can be easily customized to accurately detect any phrase composed of words from a large vocabulary.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

SpeakerStew: Scaling to Many Languages with a Triaged Multilingual Text-Dependent and Text-Independent Speaker Verification System

no code implementations • 5 Apr 2021 • Roza Chojnacka, Jason Pelecanos, Quan Wang, Ignacio Lopez Moreno

To the best of our knowledge, this is the first study of speaker verification systems at the scale of 46 languages.

Speaker Recognition Text-Independent Speaker Verification

Dr-Vectors: Decision Residual Networks and an Improved Loss for Speaker Recognition

no code implementations • 5 Apr 2021 • Jason Pelecanos, Quan Wang, Ignacio Lopez Moreno

In this work we propose scoring these representations in a way that can capture uncertainty, enroll/test asymmetry and additional non-linear information.

Speaker Recognition

Entity Structure Within and Throughout: Modeling Mention Dependencies for Document-Level Relation Extraction

3 code implementations • 20 Feb 2021 • Benfeng Xu, Quan Wang, Yajuan Lyu, Yong Zhu, Zhendong Mao

Our experiments demonstrate the usefulness of the proposed entity structure and the effectiveness of SSAN.

Ranked #3 on Relation Extraction on DocRED

Document-level Relation Extraction Relation

1,694

Synth2Aug: Cross-domain speaker recognition with TTS synthesized speech

no code implementations • 24 Nov 2020 • Yiling Huang, Yutian Chen, Jason Pelecanos, Quan Wang

In recent years, Text-To-Speech (TTS) has been used as a data augmentation technique for speech recognition to help complement inadequacies in the training data.

Data Augmentation Speaker Recognition +2

Event Extraction as Multi-turn Question Answering

no code implementations • Findings of the Association for Computational Linguistics 2020 • Fayuan Li, Weihua Peng, Yuguang Chen, Quan Wang, Lu Pan, Yajuan Lyu, Yong Zhu

Most traditional approaches formulate this task as classification problems, with event types or argument roles taken as golden labels.

Event Extraction Question Answering +2

VoiceFilter-Lite: Streaming Targeted Voice Separation for On-Device Speech Recognition

1 code implementation • 9 Sep 2020 • Quan Wang, Ignacio Lopez Moreno, Mert Saglam, Kevin Wilson, Alan Chiao, Renjie Liu, Yanzhang He, Wei Li, Jason Pelecanos, Marily Nika, Alexander Gruenstein

We introduce VoiceFilter-Lite, a single-channel source separation model that runs on the device to preserve only the speech signals from a target user, as part of a streaming speech recognition system.

speech-recognition Speech Recognition

1,031

Textual Echo Cancellation

no code implementations • 13 Aug 2020 • Shaojin Ding, Ye Jia, Ke Hu, Quan Wang

In this paper, we propose Textual Echo Cancellation (TEC) - a framework for cancelling the text-to-speech (TTS) playback echo from overlapping speech recordings.

Acoustic echo cancellation speech-recognition +1

Version Control of Speaker Recognition Systems

1 code implementation • 23 Jul 2020 • Quan Wang, Ignacio Lopez Moreno

In this paper, we describe different version control strategies for speaker recognition systems that had been carefully studied at Google from years of engineering practice.

Speaker Recognition

A Comparative Study on Polyp Classification using Convolutional Neural Networks

no code implementations • 12 Jul 2020 • Krushi Patel, Kaidong Li, Ke Tao, Quan Wang, Ajay Bansal, Amit Rastogi, Guanghui Wang

In this work, we compare the performance of the state-of-the-art general object classification models for polyp classification.

Classification General Classification +1

Curriculum Learning for Natural Language Understanding

no code implementations • ACL 2020 • Benfeng Xu, Licheng Zhang, Zhendong Mao, Quan Wang, Hongtao Xie, Yongdong Zhang

With the great success of pre-trained language models, the pretrain-finetune paradigm has now become the undoubtedly dominant solution for natural language understanding (NLU) tasks.

Natural Language Understanding

Fast and Accurate: Structure Coherence Component for Face Alignment

no code implementations • 21 Jun 2020 • Beier Zhu, Chunze Lin, Quan Wang, Renjie Liao, Chen Qian

In this paper, we propose a fast and accurate coordinate regression method for face alignment.

Ranked #15 on Face Alignment on COFW

Face Alignment regression

Interpretable and Efficient Heterogeneous Graph Convolutional Network

1 code implementation • 27 May 2020 • Yaming Yang, Ziyu Guan, Jian-Xin Li, Wei Zhao, Jiangtao Cui, Quan Wang

However, regarding Heterogeneous Information Network (HIN), existing HIN-oriented GCN methods still suffer from two deficiencies: (1) they cannot flexibly explore all possible meta-paths and extract the most useful ones for a target object, which hinders both effectiveness and interpretability; (2) they often need to generate intermediate meta-path based dense graphs, which leads to high computational complexity.

Object

CoKE: Contextualized Knowledge Graph Embedding

3 code implementations • 6 Nov 2019 • Quan Wang, Pingping Huang, Haifeng Wang, Songtai Dai, Wenbin Jiang, Jing Liu, Yajuan Lyu, Yong Zhu, Hua Wu

This work presents Contextualized Knowledge Graph Embedding (CoKE), a novel paradigm that takes into account such contextual nature, and learns dynamic, flexible, and fully contextualized entity and relation embeddings.

Knowledge Graph Embedding Link Prediction +1

6,868

ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech

no code implementations • 5 Nov 2019 • Xin Wang, Junichi Yamagishi, Massimiliano Todisco, Hector Delgado, Andreas Nautsch, Nicholas Evans, Md Sahidullah, Ville Vestman, Tomi Kinnunen, Kong Aik Lee, Lauri Juvela, Paavo Alku, Yu-Huai Peng, Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Sebastien Le Maguer, Markus Becker, Fergus Henderson, Rob Clark, Yu Zhang, Quan Wang, Ye Jia, Kai Onuma, Koji Mushika, Takashi Kaneda, Yuan Jiang, Li-Juan Liu, Yi-Chiao Wu, Wen-Chin Huang, Tomoki Toda, Kou Tanaka, Hirokazu Kameoka, Ingmar Steiner, Driss Matrouf, Jean-Francois Bonastre, Avashna Govender, Srikanth Ronanki, Jing-Xuan Zhang, Zhen-Hua Ling

Spoofing attacks within a logical access (LA) scenario are generated with the latest speech synthesis and voice conversion technologies, including state-of-the-art neural acoustic and waveform model techniques.

Person Recognition Speaker Verification +2

D-NET: A Pre-Training and Fine-Tuning Framework for Improving the Generalization of Machine Reading Comprehension

1 code implementation • WS 2019 • Hongyu Li, Xiyuan Zhang, Yibing Liu, Yiming Zhang, Quan Wang, Xiangyang Zhou, Jing Liu, Hua Wu, Haifeng Wang

In this paper, we introduce a simple system Baidu submitted for the MRQA (Machine Reading for Question Answering) 2019 Shared Task, which focused on generalization of machine reading comprehension (MRC) models.

Machine Reading Comprehension Multi-Task Learning +1

FAB: A Robust Facial Landmark Detection Framework for Motion-Blurred Videos

1 code implementation • ICCV 2019 • Keqiang Sun, Wayne Wu, Tinghao Liu, Shuo Yang, Quan Wang, Qiang Zhou, Zuochang Ye, Chen Qian

A structure predictor is proposed to predict the missing face structural information temporally, which serves as a geometry prior.

Deblurring Facial Landmark Detection

Make a Face: Towards Arbitrary High Fidelity Face Manipulation

no code implementations • ICCV 2019 • Shengju Qian, Kwan-Yee Lin, Wayne Wu, Yangxiaokang Liu, Quan Wang, Fumin Shen, Chen Qian, Ran He

Recent studies have shown remarkable success in the face manipulation task with the advance of GAN and VAE paradigms, but the outputs are sometimes limited to low resolution and lack diversity.

Clustering Disentanglement +1

Personal VAD: Speaker-Conditioned Voice Activity Detection

2 code implementations • 12 Aug 2019 • Shaojin Ding, Quan Wang, Shuo-Yiin Chang, Li Wan, Ignacio Lopez Moreno

In this paper, we propose "personal VAD", a system to detect the voice activity of a target speaker at the frame level.

Action Detection Activity Detection +4

Enhancing Pre-Trained Language Representations with Rich Knowledge for Machine Reading Comprehension

1 code implementation • ACL 2019 • An Yang, Quan Wang, Jing Liu, Kai Liu, Yajuan Lyu, Hua Wu, Qiaoqiao She, Sujian Li

In this work, we investigate the potential of leveraging external knowledge bases (KBs) to further improve BERT for MRC.

Machine Reading Comprehension

334

Adaptive Convolution for Multi-Relational Learning

no code implementations • NAACL 2019 • Xiaotian Jiang, Quan Wang, Bin Wang

We consider the problem of learning distributed representations for entities and relations of multi-relational data so as to predict missing links therein.

Ranked #10 on Link Prediction on WN18

Link Prediction Relation +1

Tuplemax Loss for Language Identification

1 code implementation • 29 Nov 2018 • Li Wan, Prashant Sridhar, Yang Yu, Quan Wang, Ignacio Lopez Moreno

In many scenarios of a language identification task, the user will specify a small set of languages which he/she can speak instead of a large set of all possible languages.

Language Identification

VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking

4 code implementations • 11 Oct 2018 • Quan Wang, Hannah Muckenhirn, Kevin Wilson, Prashant Sridhar, Zelin Wu, John Hershey, Rif A. Saurous, Ron J. Weiss, Ye Jia, Ignacio Lopez Moreno

In this paper, we present a novel system that separates the voice of a target speaker from multi-speaker signals, by making use of a reference signal from the target speaker.

Speaker Recognition Speaker Separation +3

196

Fully Supervised Speaker Diarization

1 code implementation • 10 Oct 2018 • Aonan Zhang, Quan Wang, Zhenyao Zhu, John Paisley, Chong Wang

In this paper, we propose a fully supervised speaker diarization approach, named unbounded interleaved-state recurrent neural networks (UIS-RNN).

Ranked #1 on Speaker Diarization on Hub5'00 CallHome

Clustering speaker-diarization +1

1,530

Paper
Code

Sample Efficient Adaptive Text-to-Speech

no code implementations • ICLR 2019 • Yutian Chen, Yannis Assael, Brendan Shillingford, David Budden, Scott Reed, Heiga Zen, Quan Wang, Luis C. Cobo, Andrew Trask, Ben Laurie, Caglar Gulcehre, Aäron van den Oord, Oriol Vinyals, Nando de Freitas

Instead, the aim is to produce a network that requires few data at deployment time to rapidly adapt to new speakers.

Meta-Learning Voice Similarity

Paper
Add Code

An Efficient Approach for Polyps Detection in Endoscopic Videos Based on Faster R-CNN

no code implementations • 4 Sep 2018 • Xi Mo, Ke Tao, Quan Wang, Guanghui Wang

Polyps have long been considered one of the major etiologies of colorectal cancer, a fatal disease worldwide; early detection and recognition of polyps therefore plays a crucial role in clinical routines.

Paper
Add Code

Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis

11 code implementations • NeurIPS 2018 • Ye Jia, Yu Zhang, Ron J. Weiss, Quan Wang, Jonathan Shen, Fei Ren, Zhifeng Chen, Patrick Nguyen, Ruoming Pang, Ignacio Lopez Moreno, Yonghui Wu

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Speaker Verification Speech Synthesis +3

50,743

Paper
Code

Look at Boundary: A Boundary-Aware Face Alignment Algorithm

2 code implementations • CVPR 2018 • Wayne Wu, Chen Qian, Shuo Yang, Quan Wang, Yici Cai, Qiang Zhou

By utilising boundary information of 300-W dataset, our method achieves 3.92% mean error with 0.39% failure rate on COFW dataset, and 1.25% mean error on AFLW-Full dataset.

Ranked #4 on Face Alignment on AFLW-19 (using extra training data)

Face Alignment Facial Landmark Detection

5,006

Paper
Code

Improving Knowledge Graph Embedding Using Simple Constraints

1 code implementation • ACL 2018 • Boyang Ding, Quan Wang, Bin Wang, Li Guo

We examine non-negativity constraints on entity representations and approximate entailment constraints on relation representations.

Knowledge Graph Embedding Knowledge Graphs

Paper
Code

Links: A High-Dimensional Online Clustering Method

1 code implementation • 30 Jan 2018 • Philip Andrew Mansfield, Quan Wang, Carlton Downey, Li Wan, Ignacio Lopez Moreno

We present a novel algorithm, called Links, designed to perform online clustering on unit vectors in a high-dimensional Euclidean space.

Clustering Online Clustering +1

Paper
Code

Wavenet based low rate speech coding

1 code implementation • 1 Dec 2017 • W. Bastiaan Kleijn, Felicia S. C. Lim, Alejandro Luebs, Jan Skoglund, Florian Stimberg, Quan Wang, Thomas C. Walters

Traditional parametric coding of speech facilitates low rate but provides poor reconstruction quality because of the inadequacy of the model used.

Bandwidth Extension

3,732

Paper
Code

Knowledge Graph Embedding with Iterative Guidance from Soft Rules

1 code implementation • 30 Nov 2017 • Shu Guo, Quan Wang, Lihong Wang, Bin Wang, Li Guo

In this paper, we propose Rule-Guided Embedding (RUGE), a novel paradigm of KG embedding with iterative guidance from soft rules.

Ranked #2 on Link Prediction on YAGO37

Knowledge Graph Embedding Knowledge Graphs +1

Paper
Code

Generalized End-to-End Loss for Speaker Verification

28 code implementations • 28 Oct 2017 • Li Wan, Quan Wang, Alan Papir, Ignacio Lopez Moreno

In this paper, we propose a new loss function called generalized end-to-end (GE2E) loss, which makes the training of speaker verification models more efficient than our previous tuple-based end-to-end (TE2E) loss function.

Ranked #1 on Speaker Verification on CALLHOME

Domain Adaptation Speaker Verification

50,743

Paper
Code

Speaker Diarization with LSTM

4 code implementations • 28 Oct 2017 • Quan Wang, Carlton Downey, Li Wan, Philip Andrew Mansfield, Ignacio Lopez Moreno

For many years, i-vector based audio embedding techniques were the dominant approach for speaker verification and speaker diarization applications.

Ranked #2 on Speaker Diarization on CALLHOME-109

Clustering speaker-diarization +2

490

Paper
Code

Attention-Based Models for Text-Dependent Speaker Verification

2 code implementations • 28 Oct 2017 • F A Rezaur Rahman Chowdhury, Quan Wang, Ignacio Lopez Moreno, Li Wan

Attention-based models have recently shown great performance on a range of tasks, such as speech recognition, machine translation, and image captioning due to their ability to summarize relevant information that expands through the entire length of an input sequence.

Image Captioning Machine Translation +5

Paper
Code

Relation Extraction with Multi-instance Multi-label Convolutional Neural Networks

no code implementations • COLING 2016 • Xiaotian Jiang, Quan Wang, Peng Li, Bin Wang

In this paper, we propose a multi-instance multi-label convolutional neural network for distantly supervised RE.

Multi-Label Learning Relation +2

Paper
Add Code

Multi-Granularity Chinese Word Embedding

no code implementations • EMNLP 2016 • Rongchao Yin, Quan Wang, Peng Li, Rui Li, Bin Wang

Learning Word Embeddings

Paper
Add Code

Jointly Embedding Knowledge Graphs and Logical Rules

no code implementations • EMNLP 2016 • Shu Guo, Quan Wang, Lihong Wang, Bin Wang, Li Guo

Knowledge Graph Embedding Knowledge Graphs +2

Paper
Add Code

Knowledge Base Completion via Coupled Path Ranking

no code implementations • ACL 2016 • Quan Wang, Jing Liu, Yuanfei Luo, Bin Wang, Chin-Yew Lin

Clustering Knowledge Base Completion +1

Paper
Add Code

Context-Dependent Knowledge Graph Embedding

no code implementations • EMNLP 2015 • Yuanfei Luo, Quan Wang, Bin Wang, Li Guo

Knowledge Graph Embedding Knowledge Graphs +1

Paper
Add Code

Semantically Smooth Knowledge Graph Embedding

no code implementations • IJCNLP 2015 • Shu Guo, Quan Wang, Bin Wang, Lihong Wang, Li Guo

Entity Resolution Knowledge Graph Embedding +5

Paper
Add Code

A Regularized Competition Model for Question Difficulty Estimation in Community Question Answering Services

no code implementations • EMNLP 2014 • Quan Wang, Jing Liu, Bin Wang, Li Guo

Community Question Answering

Paper
Add Code

Label Consistent Fisher Vectors for Supervised Feature Aggregation

1 code implementation • 22nd International Conference on Pattern Recognition 2014 • Quan Wang, Xin Shen, Meng Wang, Kim L. Boyer

In this paper, we present a simple and efficient way to add supervised information into Fisher vectors, which has become a popular image representation method for image classification and retrieval purposes in recent years.

Classification General Classification +2

Paper
Code

Question Difficulty Estimation in Community Question Answering Services

no code implementations • EMNLP 2013 • Jing Liu, Quan Wang, Chin-Yew Lin, Hsiao-Wuen Hon

Community Question Answering

Paper
Add Code

Semantic Context Forests for Learning-Based Knee Cartilage Segmentation in 3D MR Images

1 code implementation • 11 Jul 2013 • Quan Wang, Dijia Wu, Le Lu, Meizhu Liu, Kim L. Boyer, Shaohua Kevin Zhou

The automatic segmentation of human knee cartilage from 3D MR images is a useful yet challenging task due to the thin sheet structure of the cartilage with diffuse boundaries and inhomogeneous intensities.

3D Medical Imaging Segmentation Segmentation

Paper
Code

Feature Learning by Multidimensional Scaling and its Applications in Object Recognition

1 code implementation • 14 Jun 2013 • Quan Wang, Kim L. Boyer

The aspects of the images that are captured by the learned features, which we call MDS features, completely depend on what kind of image distance measurement is employed.

Object Recognition

Paper
Code

GMM-Based Hidden Markov Random Field for Color Image and 3D Volume Segmentation

1 code implementation • 18 Dec 2012 • Quan Wang

In this project, we first study the Gaussian-based hidden Markov random field (HMRF) model and its expectation-maximization (EM) algorithm.

Image Segmentation Segmentation +1

Paper
Code

The active geometric shape model: A new robust deformable shape model and its applications

1 code implementation • journal 2012 • Quan Wang, Kim L. Boyer

Similar to active shape models and active contours, a force field is used in our approach.

Paper
Code

Kernel Principal Component Analysis and its Applications in Face Recognition and Active Shape Models

2 code implementations • 15 Jul 2012 • Quan Wang

Principal component analysis (PCA) is a popular tool for linear dimensionality reduction and feature extraction.

Dimensionality Reduction Face Recognition +1

Paper
Code

HMRF-EM-image: Implementation of the Hidden Markov Random Field Model and its Expectation-Maximization Algorithm

1 code implementation • 15 Jul 2012 • Quan Wang

In this project, we study the hidden Markov random field (HMRF) model and its expectation-maximization (EM) algorithm.

Image Segmentation Segmentation +1

Paper
Code

Tracking Tetrahymena Pyriformis Cells using Decision Trees

1 code implementation • 13 Jul 2012 • Quan Wang, Yan Ou, A. Agung Julius, Kim L. Boyer, Min Jun Kim

Matching cells over time has long been the most difficult step in cell tracking.

Cell Tracking

Paper
Code
