Search Results for author: Tomi Kinnunen

Found 49 papers, 14 papers with code

ASVspoof 2021: Automatic Speaker Verification Spoofing and Countermeasures Challenge Evaluation Plan

1 code implementation • 1 Sep 2021 • Héctor Delgado, Nicholas Evans, Tomi Kinnunen, Kong Aik Lee, Xuechen Liu, Andreas Nautsch, Jose Patino, Md Sahidullah, Massimiliano Todisco, Xin Wang, Junichi Yamagishi

The automatic speaker verification spoofing and countermeasures (ASVspoof) challenge series is a community-led initiative which aims to promote the consideration of spoofing and the development of countermeasures.

Face Swapping Speaker Verification

165

Paper
Code

VoxCeleb Enrichment for Age and Gender Recognition

1 code implementation • 28 Sep 2021 • Khaled Hechmi, Trung Ngo Trong, Ville Hautamaki, Tomi Kinnunen

VoxCeleb datasets are widely used in speaker recognition studies.

Age Estimation regression +1

56

Paper
Code

General Characterization of Agents by States they Visit

1 code implementation • 2 Dec 2020 • Anssi Kanervisto, Tomi Kinnunen, Ville Hautamäki

Behavioural characterizations (BCs) of decision-making agents, or their policies, are used to study outcomes of training algorithms and as part of the algorithms themselves to encourage unique policies, match expert policy or restrict changes to policy per update.

Imitation Learning

20

Paper
Code

GAN-Aimbots: Using Machine Learning for Cheating in First Person Shooters

1 code implementation • 14 May 2022 • Anssi Kanervisto, Tomi Kinnunen, Ville Hautamäki

Playing games with cheaters is not fun, and in a multi-billion-dollar video game industry with hundreds of millions of players, game developers aim to improve the security and, consequently, the user experience of their games by preventing cheating.

BIG-bench Machine Learning

20

Paper
Code

An initial investigation on optimizing tandem speaker verification and countermeasure systems using reinforcement learning

1 code implementation • 6 Feb 2020 • Anssi Kanervisto, Ville Hautamäki, Tomi Kinnunen, Junichi Yamagishi

The spoofing countermeasure (CM) systems in automatic speaker verification (ASV) are not typically used in isolation of each other.

Speaker Verification

13

Paper
Code

t-EER: Parameter-Free Tandem Evaluation of Countermeasures and Biometric Comparators

1 code implementation • 21 Sep 2023 • Tomi Kinnunen, Kong Aik Lee, Hemlata Tak, Nicholas Evans, Andreas Nautsch

The proposed approach is a strong candidate metric for the tandem evaluation of PAD systems and biometric comparators.

12

Paper
Code

Who Do I Sound Like? Showcasing Speaker Recognition Technology by YouTube Voice Search

1 code implementation • 8 Nov 2018 • Ville Vestman, Bilal Soomro, Anssi Kanervisto, Ville Hautamäki, Tomi Kinnunen

The popularization of science can often be disregarded by scientists as it may be challenging to put highly sophisticated research into words that general public can understand.

Audio and Speech Processing Sound

6

Paper
Code

Multi-Dataset Co-Training with Sharpness-Aware Optimization for Audio Anti-spoofing

1 code implementation • 31 May 2023 • Hye-jin Shim, Jee-weon Jung, Tomi Kinnunen

Audio anti-spoofing for automatic speaker verification aims to safeguard users' identities from spoofing attacks.

Speaker Verification

4

Paper
Code

Visualizing Classifier Adjacency Relations: A Case Study in Speaker Verification and Voice Anti-Spoofing

1 code implementation • 11 Jun 2021 • Tomi Kinnunen, Andreas Nautsch, Md Sahidullah, Nicholas Evans, Xin Wang, Massimiliano Todisco, Héctor Delgado, Junichi Yamagishi, Kong Aik Lee

Whether it be for results summarization, or the analysis of classifier fusion, some means to compare different classifiers can often provide illuminating insight into their behaviour, (dis)similarity or complementarity.

Speaker Verification Voice Anti-spoofing

3

Paper
Code

ChildAugment: Data Augmentation Methods for Zero-Resource Children's Speaker Verification

1 code implementation • 23 Feb 2024 • Vishwanath Pratap Singh, Md Sahidullah, Tomi Kinnunen

One promising approach is to align vocal-tract parameters between adults and children through children-specific data augmentation, referred here to as ChildAugment.

Data Augmentation Speaker Verification

3

Paper
Code

a-DCF: an architecture agnostic metric with application to spoofing-robust speaker verification

1 code implementation • 3 Mar 2024 • Hye-jin Shim, Jee-weon Jung, Tomi Kinnunen, Nicholas Evans, Jean-Francois Bonastre, Itshak Lapidot

Spoofing detection is today a mainstream research topic.

Benchmarking Speaker Verification

3

Paper
Code

Improving speaker de-identification with functional data analysis of f0 trajectories

1 code implementation • 31 Mar 2022 • Lauri Tavi, Tomi Kinnunen, Rosa González Hautamäki

Due to a constantly increasing amount of speech data that is stored in different types of databases, voice privacy has become a major concern.

De-identification

1

Paper
Code

A Regression Model of Recurrent Deep Neural Networks for Noise Robust Estimation of the Fundamental Frequency Contour of Speech

no code implementations • 8 May 2018 • Akihiro Kato, Tomi Kinnunen

The latest prior research addresses this problem first as a frame-by-frame-classification problem followed by sequence tracking using deep neural network hidden Markov model (DNN-HMM) hybrid architecture.

Language Identification Speech Synthesis +1

Paper
Add Code

Supervector Compression Strategies to Speed up I-Vector System Development

no code implementations • 3 May 2018 • Ville Vestman, Tomi Kinnunen

The results suggest that, in terms of ASV accuracy, the supervector compression approaches are on a par with FEFA.

Speaker Verification

Paper
Add Code

t-DCF: a Detection Cost Function for the Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification

no code implementations • 25 Apr 2018 • Tomi Kinnunen, Kong Aik Lee, Hector Delgado, Nicholas Evans, Massimiliano Todisco, Md Sahidullah, Junichi Yamagishi, Douglas A. Reynolds

The two challenge editions in 2015 and 2017 involved the assessment of spoofing countermeasures (CMs) in isolation from ASV using an equal error rate (EER) metric.

Speaker Verification

Paper
Add Code

A Spoofing Benchmark for the 2018 Voice Conversion Challenge: Leveraging from Spoofing Countermeasures for Speech Artifact Assessment

no code implementations • 23 Apr 2018 • Tomi Kinnunen, Jaime Lorenzo-Trueba, Junichi Yamagishi, Tomoki Toda, Daisuke Saito, Fernando Villavicencio, Zhen-Hua Ling

As a supplement to subjective results for the 2018 Voice Conversion Challenge (VCC'18) data, we configure a standard constant-Q cepstral coefficient CM to quantify the extent of processing artifacts.

Benchmarking Speaker Verification +1

Paper
Add Code

The Voice Conversion Challenge 2018: Promoting Development of Parallel and Nonparallel Methods

no code implementations • 12 Apr 2018 • Jaime Lorenzo-Trueba, Junichi Yamagishi, Tomoki Toda, Daisuke Saito, Fernando Villavicencio, Tomi Kinnunen, Zhen-Hua Ling

We present the Voice Conversion Challenge 2018, designed as a follow up to the 2016 edition with the aim of providing a common framework for evaluating and comparing different state-of-the-art voice conversion (VC) systems.

Voice Conversion

Paper
Add Code

Can we steal your vocal identity from the Internet?: Initial investigation of cloning Obama's voice using GAN, WaveNet and low-quality found data

no code implementations • 2 Mar 2018 • Jaime Lorenzo-Trueba, Fuming Fang, Xin Wang, Isao Echizen, Junichi Yamagishi, Tomi Kinnunen

Thanks to the growing availability of spoofing databases and rapid advances in using them, systems for detecting voice spoofing attacks are becoming more and more capable, and error rates close to zero are being reached for the ASVspoof2015 database.

Generative Adversarial Network Speech Enhancement +2

Paper
Add Code

Waveform to Single Sinusoid Regression to Estimate the F0 Contour from Noisy Speech Using Recurrent Deep Neural Networks

no code implementations • 2 Jul 2018 • Akihiro Kato, Tomi Kinnunen

The fundamental frequency (F0) represents pitch in speech that determines prosodic characteristics of speech and is needed in various tasks for speech analysis and synthesis.

Paper
Add Code

Can We Use Speaker Recognition Technology to Attack Itself? Enhancing Mimicry Attacks Using Automatic Target Speaker Selection

no code implementations • 9 Nov 2018 • Tomi Kinnunen, Rosa González Hautamäki, Ville Vestman, Md Sahidullah

We consider technology-assisted mimicry attacks in the context of automatic speaker verification (ASV).

Speaker Recognition Speaker Verification

Paper
Add Code

Introduction to Voice Presentation Attack Detection and Recent Advances

no code implementations • 4 Jan 2019 • Md Sahidullah, Hector Delgado, Massimiliano Todisco, Tomi Kinnunen, Nicholas Evans, Junichi Yamagishi, Kong-Aik Lee

Over the past few years significant progress has been made in the field of presentation attack detection (PAD) for automatic speaker recognition (ASV).

Benchmarking Speaker Recognition

Paper
Add Code

SWAN - Scientific Writing AssistaNt. A Tool for Helping Scholars to Write Reader-Friendly Manuscripts

no code implementations • EACL 2012 • Tomi Kinnunen, Henri Leisma, Monika Machunik, Tuomo Kakkonen, Jean-Luc LeBrun

Paper
Add Code

I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences

no code implementations • 16 Apr 2019 • Kong Aik Lee, Ville Hautamaki, Tomi Kinnunen, Hitoshi Yamamoto, Koji Okabe, Ville Vestman, Jing Huang, Guohong Ding, Hanwu Sun, Anthony Larcher, Rohan Kumar Das, Haizhou Li, Mickael Rouvier, Pierre-Michel Bousquet, Wei Rao, Qing Wang, Chunlei Zhang, Fahimeh Bahmaninezhad, Hector Delgado, Jose Patino, Qiongqiong Wang, Ling Guo, Takafumi Koshinaka, Jiacen Zhang, Koichi Shinoda, Trung Ngo Trong, Md Sahidullah, Fan Lu, Yun Tang, Ming Tu, Kah Kuan Teh, Huy Dat Tran, Kuruvachan K. George, Ivan Kukanov, Florent Desnous, Jichen Yang, Emre Yilmaz, Longting Xu, Jean-Francois Bonastre, Cheng-Lin Xu, Zhi Hao Lim, Eng Siong Chng, Shivesh Ranjan, John H. L. Hansen, Massimiliano Todisco, Nicholas Evans

The I4U consortium was established to facilitate a joint entry to NIST speaker recognition evaluations (SRE).

Domain Adaptation Speaker Recognition

Paper
Add Code

Voice Mimicry Attacks Assisted by Automatic Speaker Verification

no code implementations • 3 Jun 2019 • Ville Vestman, Tomi Kinnunen, Rosa González Hautamäki, Md Sahidullah

Our goal is to gain insights how well similarity rankings transfer from the attacker's ASV system to the attacked ASV system, whether the attackers are able to improve their attacks by mimicking, and how the properties of the voices of attackers change due to mimicking.

Speaker Verification

Paper
Add Code

Voice Biometrics Security: Extrapolating False Alarm Rate via Hierarchical Bayesian Modeling of Speaker Verification Scores

no code implementations • 4 Nov 2019 • Alexey Sholokhov, Tomi Kinnunen, Ville Vestman, Kong Aik Lee

We put forward a novel performance assessment framework to address both the inadequacy of the random-impostor evaluation model and the size limitation of evaluation corpora by addressing ASV security against closest impostors on arbitrarily large datasets.

Speaker Verification

Paper
Add Code

ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech

no code implementations • 5 Nov 2019 • Xin Wang, Junichi Yamagishi, Massimiliano Todisco, Hector Delgado, Andreas Nautsch, Nicholas Evans, Md Sahidullah, Ville Vestman, Tomi Kinnunen, Kong Aik Lee, Lauri Juvela, Paavo Alku, Yu-Huai Peng, Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Sebastien Le Maguer, Markus Becker, Fergus Henderson, Rob Clark, Yu Zhang, Quan Wang, Ye Jia, Kai Onuma, Koji Mushika, Takashi Kaneda, Yuan Jiang, Li-Juan Liu, Yi-Chiao Wu, Wen-Chin Huang, Tomoki Toda, Kou Tanaka, Hirokazu Kameoka, Ingmar Steiner, Driss Matrouf, Jean-Francois Bonastre, Avashna Govender, Srikanth Ronanki, Jing-Xuan Zhang, Zhen-Hua Ling

Spoofing attacks within a logical access (LA) scenario are generated with the latest speech synthesis and voice conversion technologies, including state-of-the-art neural acoustic and waveform model techniques.

Person Recognition Speaker Verification +2

Paper
Add Code

Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification: Fundamentals

no code implementations • 12 Jul 2020 • Tomi Kinnunen, Héctor Delgado, Nicholas Evans, Kong Aik Lee, Ville Vestman, Andreas Nautsch, Massimiliano Todisco, Xin Wang, Md Sahidullah, Junichi Yamagishi, Douglas A. Reynolds

Recent years have seen growing efforts to develop spoofing countermeasures (CMs) to protect automatic speaker verification (ASV) systems from being deceived by manipulated or artificial inputs.

Speaker Verification

Paper
Add Code

UIAI System for Short-Duration Speaker Verification Challenge 2020

no code implementations • 26 Jul 2020 • Md Sahidullah, Achintya Kumar Sarkar, Ville Vestman, Xuechen Liu, Romain Serizel, Tomi Kinnunen, Zheng-Hua Tan, Emmanuel Vincent

Our primary submission to the challenge is the fusion of seven subsystems which yields a normalized minimum detection cost function (minDCF) of 0.072 and an equal error rate (EER) of 2.14% on the evaluation set.

Text-Dependent Speaker Verification

Paper
Add Code

A Comparative Re-Assessment of Feature Extractors for Deep Speaker Embeddings

no code implementations • 30 Jul 2020 • Xuechen Liu, Md Sahidullah, Tomi Kinnunen

Modern automatic speaker verification relies largely on deep neural networks (DNNs) trained on mel-frequency cepstral coefficient (MFCC) features.

Speaker Verification

Paper
Add Code

Extrapolating false alarm rates in automatic speaker verification

no code implementations • 8 Aug 2020 • Alexey Sholokhov, Tomi Kinnunen, Ville Vestman, Kong Aik Lee

Automatic speaker verification (ASV) vendors and corpus providers would both benefit from tools to reliably extrapolate performance metrics for large speaker populations without collecting new speakers.

Speaker Verification

Paper
Add Code

ASVspoof 2019: spoofing countermeasures for the detection of synthesized, converted and replayed speech

no code implementations • 11 Feb 2021 • Andreas Nautsch, Xin Wang, Nicholas Evans, Tomi Kinnunen, Ville Vestman, Massimiliano Todisco, Héctor Delgado, Md Sahidullah, Junichi Yamagishi, Kong Aik Lee

The ASVspoof initiative was conceived to spearhead research in anti-spoofing for automatic speaker verification (ASV).

Speaker Verification Speech Synthesis +2

Paper
Add Code

Learnable MFCCs for Speaker Verification

no code implementations • 20 Feb 2021 • Xuechen Liu, Md Sahidullah, Tomi Kinnunen

We propose a learnable mel-frequency cepstral coefficient (MFCC) frontend architecture for deep neural network (DNN) based automatic speaker verification.

Speaker Verification

Paper
Add Code

Data Quality as Predictor of Voice Anti-Spoofing Generalization

no code implementations • 26 Mar 2021 • Bhusan Chettri, Rosa González Hautamäki, Md Sahidullah, Tomi Kinnunen

Voice anti-spoofing aims at classifying a given utterance either as a bonafide human sample, or a spoofing attack (e.g. synthetic or replayed sample).

Voice Anti-spoofing

Paper
Add Code

ASVspoof 2021: accelerating progress in spoofed and deepfake speech detection

no code implementations • 1 Sep 2021 • Junichi Yamagishi, Xin Wang, Massimiliano Todisco, Md Sahidullah, Jose Patino, Andreas Nautsch, Xuechen Liu, Kong Aik Lee, Tomi Kinnunen, Nicholas Evans, Héctor Delgado

In addition to a continued focus upon logical and physical access tasks in which there are a number of advances compared to previous editions, ASVspoof 2021 introduces a new task involving deepfake speech detection.

Face Swapping Speaker Verification

Paper
Add Code

Optimized Power Normalized Cepstral Coefficients towards Robust Deep Speaker Verification

no code implementations • 24 Sep 2021 • Xuechen Liu, Md Sahidullah, Tomi Kinnunen

After their introduction to robust speech recognition, power normalized cepstral coefficient (PNCC) features were successfully adopted to other tasks, including speaker verification.

Robust Speech Recognition Speaker Verification +1

Paper
Add Code

Parameterized Channel Normalization for Far-field Deep Speaker Verification

no code implementations • 24 Sep 2021 • Xuechen Liu, Md Sahidullah, Tomi Kinnunen

We address far-field speaker verification with deep neural network (DNN) based speaker embedding extractor, where mismatch between enrollment and test data often comes from convolutive effects (e.g. room reverberation) and noise.

Speaker Verification

Paper
Add Code

Optimizing Multi-Taper Features for Deep Speaker Verification

no code implementations • 21 Oct 2021 • Xuechen Liu, Md Sahidullah, Tomi Kinnunen

Multi-taper estimators provide low-variance power spectrum estimates that can be used in place of the windowed discrete Fourier transform (DFT) to extract speech features such as mel-frequency cepstral coefficients (MFCCs).

Open-Ended Question Answering Speaker Verification

Paper
Add Code

Optimizing Tandem Speaker Verification and Anti-Spoofing Systems

no code implementations • 24 Jan 2022 • Anssi Kanervisto, Ville Hautamäki, Tomi Kinnunen, Junichi Yamagishi

As automatic speaker verification (ASV) systems are vulnerable to spoofing attacks, they are typically used in conjunction with spoofing countermeasure (CM) systems to improve security.

Speaker Verification

Paper
Add Code

Learnable Nonlinear Compression for Robust Speaker Verification

no code implementations • 10 Feb 2022 • Xuechen Liu, Md Sahidullah, Tomi Kinnunen

We consider different kinds of channel-dependent (CD) nonlinear compression methods optimized in a data-driven manner.

Speaker Verification

Paper
Add Code

Spoofing-Aware Speaker Verification with Unsupervised Domain Adaptation

no code implementations • 21 Mar 2022 • Xuechen Liu, Md Sahidullah, Tomi Kinnunen

In this paper, we initiate the concern of enhancing the spoofing robustness of the automatic speaker verification (ASV) system, without the primary presence of a separate countermeasure module.

Speaker Verification Unsupervised Domain Adaptation

Paper
Add Code

SASV 2022: The First Spoofing-Aware Speaker Verification Challenge

no code implementations • 28 Mar 2022 • Jee-weon Jung, Hemlata Tak, Hye-jin Shim, Hee-Soo Heo, Bong-Jin Lee, Soo-Whan Chung, Ha-Jin Yu, Nicholas Evans, Tomi Kinnunen

Pre-trained spoofing detection and speaker verification models are provided as open source and are used in two baseline SASV solutions.

Speaker Verification

Paper
Add Code

Baselines and Protocols for Household Speaker Recognition

1 code implementation • 30 Apr 2022 • Alexey Sholokhov, Xuechen Liu, Md Sahidullah, Tomi Kinnunen

Speaker recognition on household devices, such as smart speakers, features several challenges: (i) robustness across a vast number of heterogeneous domains (households), (ii) short utterances, (iii) possibly absent speaker labels of the enrollment data (passive enrollment), and (iv) presence of unknown persons (guests).

Speaker Recognition

0

Paper
Code

I4U System Description for NIST SRE'20 CTS Challenge

no code implementations • 2 Nov 2022 • Kong Aik Lee, Tomi Kinnunen, Daniele Colibro, Claudio Vair, Andreas Nautsch, Hanwu Sun, Liang He, Tianyu Liang, Qiongqiong Wang, Mickael Rouvier, Pierre-Michel Bousquet, Rohan Kumar Das, Ignacio Viñals Bailo, Meng Liu, Héctor Delgado, Xuechen Liu, Md Sahidullah, Sandro Cumani, Boning Zhang, Koji Okabe, Hitoshi Yamamoto, Ruijie Tao, Haizhou Li, Alfonso Ortega Giménez, Longbiao Wang, Luis Buera

This manuscript describes the I4U submission to the 2020 NIST Speaker Recognition Evaluation (SRE'20) Conversational Telephone Speech (CTS) Challenge.

Speaker Recognition

Paper
Add Code

Learnable Frontends that do not Learn: Quantifying Sensitivity to Filterbank Initialisation

no code implementations • 20 Feb 2023 • Mark Anderson, Tomi Kinnunen, Naomi Harte

We show that although performance is overall improved, the filterbanks exhibit strong sensitivity to their initialisation strategy.

Action Detection Activity Detection

Paper
Add Code

Distilling Multi-Level X-vector Knowledge for Small-footprint Speaker Verification

no code implementations • 2 Mar 2023 • Xuechen Liu, Md Sahidullah, Tomi Kinnunen

Even though deep speaker models have demonstrated impressive accuracy in speaker verification tasks, this often comes at the expense of increased model size and computation time, presenting challenges for deployment in resource-constrained environments.

Knowledge Distillation Speaker Verification

Paper
Add Code

Towards single integrated spoofing-aware speaker verification embeddings

1 code implementation • 30 May 2023 • Sung Hwan Mun, Hye-jin Shim, Hemlata Tak, Xin Wang, Xuechen Liu, Md Sahidullah, Myeonghun Jeong, Min Hyun Han, Massimiliano Todisco, Kong Aik Lee, Junichi Yamagishi, Nicholas Evans, Tomi Kinnunen, Nam Soo Kim, Jee-weon Jung

Second, competitive performance should be demonstrated compared to the fusion of automatic speaker verification (ASV) and countermeasure (CM) embeddings, which outperformed single embedding solutions by a large margin in the SASV2022 challenge.

Speaker Verification

0

Paper
Code

How to Construct Perfect and Worse-than-Coin-Flip Spoofing Countermeasures: A Word of Warning on Shortcut Learning

no code implementations • 31 May 2023 • Hye-jin Shim, Rosa González Hautamäki, Md Sahidullah, Tomi Kinnunen

Shortcut learning, or 'Clever Hans effect' refers to situations where a learning agent (e.g., deep neural networks) learns spurious correlations present in data, resulting in biased models.

Paper
Add Code

Speaker Verification Across Ages: Investigating Deep Speaker Embedding Sensitivity to Age Mismatch in Enrollment and Test Speech

no code implementations • 13 Jun 2023 • Vishwanath Pratap Singh, Md Sahidullah, Tomi Kinnunen

The first dataset, used for addressing short-term ageing (up to 10 years time difference between enrollment and test) under uncontrolled conditions, is VoxCeleb.

Speaker Verification

Paper
Add Code

Generalizing Speaker Verification for Spoof Awareness in the Embedding Space

no code implementations • 20 Jan 2024 • Xuechen Liu, Md Sahidullah, Kong Aik Lee, Tomi Kinnunen

To this end, we propose to generalize the standalone ASV (G-SASV) against spoofing attacks, where we leverage limited training data from CM to enhance a simple backend in the embedding space, without the involvement of a separate CM module during the test (authentication) phase.

Domain Adaptation Speaker Verification

Paper
Add Code
