no code implementations • 14 Mar 2023 • Xufeng Zhao, Mengdi Li, Cornelius Weber, Muhammad Burhan Hafez, Stefan Wermter
Programming robot behaviour in a complex world faces challenges on multiple levels, from dextrous low-level skills to high-level planning and reasoning.
no code implementations • 7 Mar 2023 • Mostafa Kotb, Cornelius Weber, Stefan Wermter
Model-based reinforcement learning (MBRL) with real-time planning has shown great potential in locomotion and manipulation control tasks.
no code implementations • 20 Feb 2023 • Leyuan Qu, Cornelius Weber, Stefan Wermter
Furthermore, our proposed combined loss rescaling and weight consolidation methods can support continual learning of an ASR system.
Automatic Speech Recognition (ASR)
no code implementations • 1 Feb 2023 • Mengdi Li, Xufeng Zhao, Jae Hee Lee, Cornelius Weber, Stefan Wermter
We study a class of reinforcement learning problems where the reward signals for policy learning are generated by a discriminator that is dependent on and jointly optimized with the policy.
no code implementations • 9 Jan 2023 • Ozan Özdemir, Matthias Kerzel, Cornelius Weber, Jae Hee Lee, Muhammad Burhan Hafez, Patrick Bruns, Stefan Wermter
Only occasionally would a learning infant receive a matching verbal description of an action it is performing, which is similar to supervised learning.
no code implementations • 14 Dec 2022 • Leyuan Qu, Taihao Li, Cornelius Weber, Theresa Pekarek-Rosin, Fuji Ren, Stefan Wermter
Human speech can be characterized by different components, including semantic content, speaker identity and prosodic information.
1 code implementation • 23 Nov 2022 • Hugo Carneiro, Cornelius Weber, Stefan Wermter
Finally, we devise a model for emotion recognition in conversations trained on the realigned MELD-FAIR videos, which outperforms state-of-the-art models for ERC based on vision alone.
Automatic Speech Recognition (ASR)
1 code implementation • 22 Nov 2022 • Yuan YAO, Tianyu Yu, Ao Zhang, Mengdi Li, Ruobing Xie, Cornelius Weber, Zhiyuan Liu, Hai-Tao Zheng, Stefan Wermter, Tat-Seng Chua, Maosong Sun
In this work, we present CLEVER, which formulates CKE as a distantly supervised multi-instance learning problem, where models learn to summarize commonsense relations from a bag of images about an entity pair without any human annotation on image instances.
no code implementations • 16 Nov 2022 • Leyuan Qu, Wei Wang, Taihao Li, Cornelius Weber, Stefan Wermter, Fuji Ren
Once training is completed, EmoAug enriches expressions of emotional speech in different prosodic attributes, such as stress, rhythm and intensity, by feeding different styles into the paralinguistic encoder.
Automatic Speech Recognition (ASR)
1 code implementation • 4 Aug 2022 • Xufeng Zhao, Cornelius Weber, Muhammad Burhan Hafez, Stefan Wermter
Sound is one of the most informative and abundant modalities in the real world, and it can be sensed robustly and without contact by small, cheap sensors that can be placed on mobile devices.
no code implementations • 15 Jul 2022 • Ozan Özdemir, Matthias Kerzel, Cornelius Weber, Jae Hee Lee, Stefan Wermter
In this work, we propose the paired gated autoencoders (PGAE) for flexible translation between robot actions and language descriptions in a tabletop object manipulation scenario.
1 code implementation • 6 Jul 2022 • Kyra Ahrens, Matthias Kerzel, Jae Hee Lee, Cornelius Weber, Stefan Wermter
Spatial reasoning poses a particular challenge for intelligent agents and is at the same time a prerequisite for their successful interaction and communication in the physical world.
1 code implementation • 5 May 2022 • Jae Hee Lee, Matthias Kerzel, Kyra Ahrens, Cornelius Weber, Stefan Wermter
Grounding relative directions is more difficult than grounding absolute directions because it requires a model not only to detect objects in the image and to identify spatial relations based on this information, but also to recognize the orientation of objects and integrate this information into the reasoning process.
no code implementations • LREC 2022 • Gerald Schwiebert, Cornelius Weber, Leyuan Qu, Henrique Siqueira, Stefan Wermter
Large datasets as required for deep learning of lip reading do not exist in many languages.
no code implementations • 17 Jan 2022 • Ozan Özdemir, Matthias Kerzel, Cornelius Weber, Jae Hee Lee, Stefan Wermter
Human infants learn language while interacting with their environment in which their caregivers may describe the objects and actions they perform.
no code implementations • 9 Dec 2021 • Leyuan Qu, Cornelius Weber, Stefan Wermter
The aim of this work is to investigate the impact of crossmodal self-supervised pre-training for speech reconstruction (video-to-audio) by leveraging the natural co-occurrence of audio and visual streams in videos.
1 code implementation • 11 Nov 2021 • Vadym Gryshchuk, Cornelius Weber, Chu Kiong Loo, Stefan Wermter
Lifelong learning is a long-standing aim for artificial agents that act in dynamic environments, in which an agent needs to accumulate knowledge incrementally without forgetting previously learned representations.
no code implementations • 1 Sep 2021 • Hugo Carneiro, Cornelius Weber, Stefan Wermter
The strong relation between face and voice can aid active speaker detection systems when faces are visible, even in difficult settings, such as when the face of a speaker is not clear or when there are several people in the same scene.
no code implementations • 3 Aug 2021 • Aaron Eisermann, Jae Hee Lee, Cornelius Weber, Stefan Wermter
Neural networks can be powerful function approximators, which are able to model high-dimensional feature distributions from a subset of examples drawn from the target distribution.
no code implementations • 12 Apr 2021 • Victor Uc-Cetina, Nicolas Navarro-Guerrero, Anabel Martin-Gonzalez, Cornelius Weber, Stefan Wermter
In recent years some researchers have explored the use of reinforcement learning (RL) algorithms as key components in the solution of various natural language processing tasks.
1 code implementation • ICCV 2021 • Yuan YAO, Ao Zhang, Xu Han, Mengdi Li, Cornelius Weber, Zhiyuan Liu, Stefan Wermter, Maosong Sun
In this work, we propose visual distant supervision, a novel paradigm of visual relation learning, which can train scene graph models without any human-labeled data.
no code implementations • 23 Mar 2021 • Henrique Siqueira, Pablo Barros, Sven Magg, Cornelius Weber, Stefan Wermter
In domains where computational resources and labeled data are limited, such as in robotics, deep networks with millions of weights might not be the optimal solution.
1 code implementation • 10 Feb 2021 • Julien Scholz, Cornelius Weber, Muhammad Burhan Hafez, Stefan Wermter
Using a model of the environment, reinforcement learning agents can plan their future moves and achieve superhuman performance in board games like Chess, Shogi, and Go, while remaining relatively sample-efficient.
1 code implementation • 24 Jun 2020 • Stefan Heinrich, Yuan YAO, Tobias Hinz, Zhiyuan Liu, Thomas Hummel, Matthias Kerzel, Cornelius Weber, Stefan Wermter
From a neuroscientific perspective, natural language is embodied, grounded in most, if not all, sensory and sensorimotor modalities, and acquired by means of crossmodal integration.
no code implementations • 17 May 2020 • Leyuan Qu, Cornelius Weber, Stefan Wermter
Target speech separation refers to isolating target speech from a multi-speaker mixture signal by conditioning on auxiliary information about the target speaker.
Audio and Speech Processing, Sound
1 code implementation • 19 Apr 2020 • Muhammad Burhan Hafez, Cornelius Weber, Matthias Kerzel, Stefan Wermter
In this paper, we present a novel dual-system motor learning approach where a meta-controller arbitrates online between model-based and model-free decisions based on an estimate of the local reliability of the learned model.
2 code implementations • LREC 2020 • Chandrakant Bothe, Cornelius Weber, Sven Magg, Stefan Wermter
These neural models annotate the emotion corpora with dialogue act labels, and an ensemble annotator extracts the final dialogue act label.
1 code implementation • 10 Nov 2019 • Mehmet Süzen, J. J. Cerdà, Cornelius Weber
Establishing associations between the structure and the generalisation ability of deep neural networks (DNNs) is a challenging task in modern machine learning.
no code implementations • 10 Oct 2019 • Muhammad Burhan Hafez, Cornelius Weber, Matthias Kerzel, Stefan Wermter
The learned models are used to generate imagined experiences, augmenting the training set of real experiences.
no code implementations • 5 Sep 2019 • Di Fu, Cornelius Weber, Guochun Yang, Matthias Kerzel, Weizhi Nan, Pablo Barros, Haiyan Wu, Xun Liu, Stefan Wermter
Selective attention plays an essential role in information acquisition and utilization from the environment.
no code implementations • 5 May 2019 • Muhammad Burhan Hafez, Cornelius Weber, Matthias Kerzel, Stefan Wermter
Recent success in deep reinforcement learning for continuous control has been dominated by model-free approaches which, unlike model-based approaches, do not suffer from representational limitations in making assumptions about the world dynamics and model errors inevitable in complex domains.
1 code implementation • EMNLP 2018 • Egor Lakomkin, Sven Magg, Cornelius Weber, Stefan Wermter
In this paper, we describe KT-Speech-Crawler: an approach for automatic dataset construction for speech recognition by crawling YouTube videos.
no code implementations • 28 Feb 2019 • Egor Lakomkin, Mohammad Ali Zamani, Cornelius Weber, Sven Magg, Stefan Wermter
We argue that using ground-truth transcriptions during training and evaluation phases leads to a significant discrepancy in performance compared to real-world conditions, as the spoken text has to be recognized on the fly and can contain speech recognition mistakes.
Automatic Speech Recognition (ASR)
no code implementations • 26 Oct 2018 • Muhammad Burhan Hafez, Cornelius Weber, Matthias Kerzel, Stefan Wermter
In this paper, we present a new intrinsically motivated actor-critic algorithm for learning continuous motor skills directly from raw visual input.
1 code implementation • 29 Jun 2018 • Chandrakant Bothe, Sven Magg, Cornelius Weber, Stefan Wermter
Spoken language understanding is one of the key factors in a dialogue system, and the context of a conversation plays an important role in understanding the current utterance.
1 code implementation • 28 May 2018 • German I. Parisi, Jun Tani, Cornelius Weber, Stefan Wermter
Both growing networks can expand in response to novel sensory experience: the episodic memory learns fine-grained spatiotemporal representations of object instances in an unsupervised fashion while the semantic memory uses task-relevant signals to regulate structural plasticity levels and develop more compact representations from episodic experience.
1 code implementation • 16 May 2018 • Chandrakant Bothe, Sven Magg, Cornelius Weber, Stefan Wermter
Recent approaches for dialogue act recognition have shown that context from preceding utterances is important to classify the subsequent one.
Ranked #9 on Dialogue Act Classification on Switchboard corpus
1 code implementation • LREC 2018 • Chandrakant Bothe, Cornelius Weber, Sven Magg, Stefan Wermter
Dialogue act recognition is an important part of natural language understanding.
Ranked #10 on Dialogue Act Classification on Switchboard corpus
no code implementations • 6 Apr 2018 • Egor Lakomkin, Mohammad Ali Zamani, Cornelius Weber, Sven Magg, Stefan Wermter
Speech emotion recognition (SER) is an important aspect of effective human-robot collaboration and has received a lot of attention from the research community.
no code implementations • 3 Apr 2018 • Egor Lakomkin, Mohammad Ali Zamani, Cornelius Weber, Sven Magg, Stefan Wermter
Acoustically expressed emotions can make communication with a robot more efficient.
no code implementations • EACL 2017 • Egor Lakomkin, Cornelius Weber, Stefan Wermter
In this work, we tackle the problem of speech emotion classification.
no code implementations • IJCNLP 2017 • Egor Lakomkin, Cornelius Weber, Sven Magg, Stefan Wermter
Acoustic emotion recognition aims to categorize the affective state of the speaker and is still a difficult task for machine learning models.
1 code implementation • 25 Apr 2017 • Mehmet Süzen, Cornelius Weber, Joan J. Cerdà
It is observed that as the matrix size increases, the level of spectral ergodicity of the ensemble rises, i.e., the eigenvalue spectrum obtained for a single realisation drawn at random from the ensemble is closer to the spectrum obtained by averaging over the whole ensemble.