Search Results for author: Tanja Schultz

Found 17 papers, 4 papers with code

STAA-Net: A Sparse and Transferable Adversarial Attack for Speech Emotion Recognition

no code implementations 2 Feb 2024 Yi Chang, Zhao Ren, Zixing Zhang, Xin Jing, Kun Qian, Xi Shao, Bin Hu, Tanja Schultz, Björn W. Schuller

Speech contains rich information on the emotions of humans, and Speech Emotion Recognition (SER) has been an important topic in the area of human-computer interaction.

Adversarial Attack Speech Emotion Recognition

Uncovering the Full Potential of Visual Grounding Methods in VQA

1 code implementation 15 Jan 2024 Daniel Reich, Tanja Schultz

In this study, we demonstrate that current evaluation schemes for VG-methods are problematic due to the flawed assumption of availability of relevant visual information.

Question Answering Visual Grounding +1

NeuroHeed: Neuro-Steered Speaker Extraction using EEG Signals

no code implementations 26 Jul 2023 Zexu Pan, Marvin Borsdorf, Siqi Cai, Tanja Schultz, Haizhou Li

We propose both an offline and an online NeuroHeed, with the latter designed for real-time inference.

EEG

Measuring Faithful and Plausible Visual Grounding in VQA

1 code implementation 24 May 2023 Daniel Reich, Felix Putze, Tanja Schultz

Metrics for Visual Grounding (VG) in Visual Question Answering (VQA) systems primarily aim to measure a system's reliance on relevant parts of the image when inferring an answer to the given question.

Question Answering Visual Grounding +1

Visually Grounded VQA by Lattice-based Retrieval

1 code implementation 15 Nov 2022 Daniel Reich, Felix Putze, Tanja Schultz

Visual Grounding (VG) in Visual Question Answering (VQA) systems describes how well a system manages to tie a question and its answer to relevant image regions.

Information Retrieval Question Answering +4

Adventurer's Treasure Hunt: A Transparent System for Visually Grounded Compositional Visual Question Answering based on Scene Graphs

no code implementations 28 Jun 2021 Daniel Reich, Felix Putze, Tanja Schultz

With the expressed goal of improving system transparency and visual grounding in the reasoning process in VQA, we present a modular system for the task of compositional VQA based on scene graphs.

Question Answering Task 2 +2

DNN-Based Multilingual Automatic Speech Recognition for Wolaytta using Oromo Speech

no code implementations LREC 2020 Martha Yifiru Tachbelie, Solomon Teferra Abate, Tanja Schultz

Our results show the possibility of developing ASR system for a language, if we have pronunciation dictionary and language model, using an existing speech corpus of another language irrespective of their language family.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Building Language Models for Morphological Rich Low-Resource Languages using Data from Related Donor Languages: the Case of Uyghur

no code implementations LREC 2020 Ayimunishagu Abulimiti, Tanja Schultz

In this work, we show our effort to build word-based as well as morpheme-based language models for Uyghur, a language that combines both challenges, i.e. it is a low-resource and agglutinative language.

Language Modelling

Automatic Speech Recognition for Uyghur through Multilingual Acoustic Modeling

no code implementations LREC 2020 Ayimunishagu Abulimiti, Tanja Schultz

For the developing of multilingual speech recognition system for Uyghur, we used Turkish as donor language, which we selected from GlobalPhone corpus as the most similar language to Uyghur.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Syntactic and Semantic Features For Code-Switching Factored Language Models

no code implementations 4 Oct 2017 Heike Adel, Ngoc Thang Vu, Katrin Kirchhoff, Dominic Telaar, Tanja Schultz

The experimental results reveal that Brown word clusters, part-of-speech tags and open-class words are the most effective at reducing the perplexity of factored language models on the Mandarin-English Code-Switching corpus SEAME.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

GlobalPhone: Pronunciation Dictionaries in 20 Languages

no code implementations LREC 2014 Tanja Schultz, Tim Schlippe

This paper describes the advances in the multilingual text and speech database GlobalPhone, a multilingual database of high-quality read speech with corresponding transcriptions and pronunciation dictionaries in 20 languages.

Language Identification Language Modelling +4
