Search Results for author: David Thulke

Found 11 papers, 8 papers with code

ClimateGPT: Towards AI Synthesizing Interdisciplinary Research on Climate Change

1 code implementation • 17 Jan 2024 • David Thulke, Yingbo Gao, Petrus Pelser, Rein Brune, Rricha Jalota, Floris Fok, Michael Ramos, Ian van Wyk, Abdallah Nasir, Hayden Goldstein, Taylor Tragemann, Katie Nguyen, Ariana Fowler, Andrew Stanco, Jon Gabriel, Jordan Taylor, Dean Moro, Evgenii Tsymbalov, Juliette de Waal, Evgeny Matusov, Mudar Yaghi, Mohammad Shihadah, Hermann Ney, Christian Dugast, Jonathan Dotan, Daniel Erasmus

To increase the accessibility of our model to non-English speakers, we propose to make use of cascaded machine translation and show that this approach can perform comparably to natively multilingual models while being easier to scale to a large number of languages.

Machine Translation, Retrieval

Exploring Spoken Named Entity Recognition: A Cross-Lingual Perspective

1 code implementation • 3 Jul 2023 • Moncef Benaicha, David Thulke, M. A. Tuğtekin Turan

Recent advancements in Named Entity Recognition (NER) have significantly improved the identification of entities in textual data.

Cross-Lingual Transfer, named-entity-recognition, +4

Task-oriented Document-Grounded Dialog Systems by HLTPR@RWTH for DSTC9 and DSTC10

no code implementations • 14 Apr 2023 • David Thulke, Nico Daheim, Christian Dugast, Hermann Ney

This paper summarizes our contributions to the document-grounded dialog tasks at the 9th and 10th Dialog System Technology Challenges (DSTC9 and DSTC10).

Automatic Speech Recognition, Data Augmentation, +2

Mask More and Mask Later: Efficient Pre-training of Masked Language Models by Disentangling the [MASK] Token

1 code implementation • 9 Nov 2022 • Baohao Liao, David Thulke, Sanjika Hewavitharana, Hermann Ney, Christof Monz

We show: (1) [MASK]s can indeed be appended at a later layer, being disentangled from the word embedding; (2) The gathering of contextualized information from unmasked tokens can be conducted with a few layers.

Controllable Factuality in Document-Grounded Dialog Systems Using a Noisy Channel Model

1 code implementation • 31 Oct 2022 • Nico Daheim, David Thulke, Christian Dugast, Hermann Ney

In this work, we present a model for document-grounded response generation in dialog that is decomposed into two components according to Bayes' theorem.

Response Generation

Does Joint Training Really Help Cascaded Speech Translation?

1 code implementation • 24 Oct 2022 • Viet Anh Khoa Tran, David Thulke, Yingbo Gao, Christian Herold, Hermann Ney

Currently, in speech translation, the straightforward approach - cascading a recognition system with a translation system - delivers state-of-the-art results.

Automatic Speech Recognition (ASR), +2

Adapting Document-Grounded Dialog Systems to Spoken Conversations using Data Augmentation and a Noisy Channel Model

1 code implementation • 16 Dec 2021 • David Thulke, Nico Daheim, Christian Dugast, Hermann Ney

This paper summarizes our submission to Task 2 of the second track of the 10th Dialog System Technology Challenge (DSTC10) "Knowledge-grounded Task-oriented Dialogue Modeling on Spoken Conversations".

Data Augmentation, Task 2

Cascaded Span Extraction and Response Generation for Document-Grounded Dialog

1 code implementation • ACL (dialdoc) 2021 • Nico Daheim, David Thulke, Christian Dugast, Hermann Ney

For the second subtask, we use a cascaded model which grounds the response prediction on the predicted span instead of the full document.

Response Generation valid

On Sampling-Based Training Criteria for Neural Language Modeling

no code implementations • 21 Apr 2021 • Yingbo Gao, David Thulke, Alexander Gerstenberger, Khoa Viet Tran, Ralf Schlüter, Hermann Ney

As the vocabulary size of modern word-based language models becomes ever larger, many sampling-based training criteria are proposed and investigated.

Automatic Speech Recognition (ASR), +2

Efficient Retrieval Augmented Generation from Unstructured Knowledge for Task-Oriented Dialog

1 code implementation • 9 Feb 2021 • David Thulke, Nico Daheim, Christian Dugast, Hermann Ney

This paper summarizes our work on the first track of the ninth Dialog System Technology Challenge (DSTC 9), "Beyond Domain APIs: Task-oriented Conversational Modeling with Unstructured Knowledge Access".

Retrieval
