Search Results for author: Cornelius Weber

Found 43 papers, 18 papers with code

Chat with the Environment: Interactive Multimodal Perception using Large Language Models

no code implementations 14 Mar 2023 Xufeng Zhao, Mengdi Li, Cornelius Weber, Muhammad Burhan Hafez, Stefan Wermter

Programming robot behaviour in a complex world faces challenges on multiple levels, from dextrous low-level skills to high-level planning and reasoning.

Sample-efficient Real-time Planning with Curiosity Cross-Entropy Method and Contrastive Learning

no code implementations 7 Mar 2023 Mostafa Kotb, Cornelius Weber, Stefan Wermter

Model-based reinforcement learning (MBRL) with real-time planning has shown great potential in locomotion and manipulation control tasks.

Continuous Control Contrastive Learning +2

Emphasizing Unseen Words: New Vocabulary Acquisition for End-to-End Speech Recognition

no code implementations 20 Feb 2023 Leyuan Qu, Cornelius Weber, Stefan Wermter

Furthermore, our proposed combined loss rescaling and weight consolidation methods can support continual learning of an ASR system.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Internally Rewarded Reinforcement Learning

no code implementations 1 Feb 2023 Mengdi Li, Xufeng Zhao, Jae Hee Lee, Cornelius Weber, Stefan Wermter

We study a class of reinforcement learning problems where the reward signals for policy learning are generated by a discriminator that is dependent on and jointly optimized with the policy.

reinforcement-learning Reinforcement Learning (RL)

Disentangling Prosody Representations with Unsupervised Speech Reconstruction

no code implementations 14 Dec 2022 Leyuan Qu, Taihao Li, Cornelius Weber, Theresa Pekarek-Rosin, Fuji Ren, Stefan Wermter

Human speech can be characterized by different components, including semantic content, speaker identity and prosodic information.

Association Automatic Speech Recognition +6

Whose Emotion Matters? Speaking Activity Localisation without Prior Knowledge

1 code implementation 23 Nov 2022 Hugo Carneiro, Cornelius Weber, Stefan Wermter

Finally, we devise a model for emotion recognition in conversations trained on the realigned MELD-FAIR videos, which outperforms state-of-the-art models for ERC based on vision alone.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Visually Grounded Commonsense Knowledge Acquisition

1 code implementation 22 Nov 2022 Yuan YAO, Tianyu Yu, Ao Zhang, Mengdi Li, Ruobing Xie, Cornelius Weber, Zhiyuan Liu, Hai-Tao Zheng, Stefan Wermter, Tat-Seng Chua, Maosong Sun

In this work, we present CLEVER, which formulates CKE as a distantly supervised multi-instance learning problem, where models learn to summarize commonsense relations from a bag of images about an entity pair without any human annotation on image instances.

Language Modelling

Data Augmentation with Unsupervised Speaking Style Transfer for Speech Emotion Recognition

no code implementations 16 Nov 2022 Leyuan Qu, Wei Wang, Taihao Li, Cornelius Weber, Stefan Wermter, Fuji Ren

Once training is completed, EmoAug enriches expressions of emotional speech in different prosodic attributes, such as stress, rhythm and intensity, by feeding different styles into the paralinguistic encoder.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Impact Makes a Sound and Sound Makes an Impact: Sound Guides Representations and Explorations

1 code implementation 4 Aug 2022 Xufeng Zhao, Cornelius Weber, Muhammad Burhan Hafez, Stefan Wermter

Sound is one of the most informative and abundant modalities in the real world, and it can be sensed robustly and without contact by small, cheap sensors that can be placed on mobile devices.

Efficient Exploration Unsupervised Reinforcement Learning

Learning Flexible Translation between Robot Actions and Language Descriptions

no code implementations 15 Jul 2022 Ozan Özdemir, Matthias Kerzel, Cornelius Weber, Jae Hee Lee, Stefan Wermter

In this work, we propose the paired gated autoencoders (PGAE) for flexible translation between robot actions and language descriptions in a tabletop object manipulation scenario.

Language Modelling Multi-Task Learning +1

Knowing Earlier what Right Means to You: A Comprehensive VQA Dataset for Grounding Relative Directions via Multi-Task Learning

1 code implementation 6 Jul 2022 Kyra Ahrens, Matthias Kerzel, Jae Hee Lee, Cornelius Weber, Stefan Wermter

Spatial reasoning poses a particular challenge for intelligent agents and is at the same time a prerequisite for their successful interaction and communication in the physical world.

Multi-Task Learning Question Answering +2

What is Right for Me is Not Yet Right for You: A Dataset for Grounding Relative Directions via Multi-Task Learning

1 code implementation 5 May 2022 Jae Hee Lee, Matthias Kerzel, Kyra Ahrens, Cornelius Weber, Stefan Wermter

Grounding relative directions is more difficult than grounding absolute directions because it requires a model not only to detect objects in the image and identify spatial relations based on this information, but also to recognize the orientation of objects and integrate this information into the reasoning process.

Multi-Task Learning Question Answering +2

Language Model-Based Paired Variational Autoencoders for Robotic Language Learning

no code implementations 17 Jan 2022 Ozan Özdemir, Matthias Kerzel, Cornelius Weber, Jae Hee Lee, Stefan Wermter

Human infants learn language while interacting with their environment in which their caregivers may describe the objects and actions they perform.

Language Modelling

LipSound2: Self-Supervised Pre-Training for Lip-to-Speech Reconstruction and Lip Reading

no code implementations 9 Dec 2021 Leyuan Qu, Cornelius Weber, Stefan Wermter

The aim of this work is to investigate the impact of crossmodal self-supervised pre-training for speech reconstruction (video-to-audio) by leveraging the natural co-occurrence of audio and visual streams in videos.

Lip Reading speech-recognition +1

Lifelong Learning from Event-based Data

1 code implementation 11 Nov 2021 Vadym Gryshchuk, Cornelius Weber, Chu Kiong Loo, Stefan Wermter

Lifelong learning is a long-standing aim for artificial agents that act in dynamic environments, in which an agent needs to accumulate knowledge incrementally without forgetting previously learned representations.

FaVoA: Face-Voice Association Favours Ambiguous Speaker Detection

no code implementations 1 Sep 2021 Hugo Carneiro, Cornelius Weber, Stefan Wermter

The strong relation between face and voice can aid active speaker detection systems when faces are visible, even in difficult settings, when the face of a speaker is not clear or when there are several people in the same scene.


Generalization in Multimodal Language Learning from Simulation

no code implementations 3 Aug 2021 Aaron Eisermann, Jae Hee Lee, Cornelius Weber, Stefan Wermter

Neural networks can be powerful function approximators, which are able to model high-dimensional feature distributions from a subset of examples drawn from the target distribution.

Survey on reinforcement learning for language processing

no code implementations 12 Apr 2021 Victor Uc-Cetina, Nicolas Navarro-Guerrero, Anabel Martin-Gonzalez, Cornelius Weber, Stefan Wermter

In recent years some researchers have explored the use of reinforcement learning (RL) algorithms as key components in the solution of various natural language processing tasks.

reinforcement-learning Reinforcement Learning (RL)

Visual Distant Supervision for Scene Graph Generation

1 code implementation ICCV 2021 Yuan YAO, Ao Zhang, Xu Han, Mengdi Li, Cornelius Weber, Zhiyuan Liu, Stefan Wermter, Maosong Sun

In this work, we propose visual distant supervision, a novel paradigm of visual relation learning, which can train scene graph models without any human-labeled data.

Graph Generation Predicate Classification +1

A Sub-Layered Hierarchical Pyramidal Neural Architecture for Facial Expression Recognition

no code implementations 23 Mar 2021 Henrique Siqueira, Pablo Barros, Sven Magg, Cornelius Weber, Stefan Wermter

In domains where computational resources and labeled data are limited, such as in robotics, deep networks with millions of weights might not be the optimal solution.

Facial Expression Recognition (FER)

Improving Model-Based Reinforcement Learning with Internal State Representations through Self-Supervision

1 code implementation 10 Feb 2021 Julien Scholz, Cornelius Weber, Muhammad Burhan Hafez, Stefan Wermter

Using a model of the environment, reinforcement learning agents can plan their future moves and achieve superhuman performance in board games like Chess, Shogi, and Go, while remaining relatively sample-efficient.

Board Games Model-based Reinforcement Learning +3

Crossmodal Language Grounding in an Embodied Neurocognitive Model

1 code implementation 24 Jun 2020 Stefan Heinrich, Yuan YAO, Tobias Hinz, Zhiyuan Liu, Thomas Hummel, Matthias Kerzel, Cornelius Weber, Stefan Wermter

From a neuroscientific perspective, natural language is embodied, grounded in most, if not all, sensory and sensorimotor modalities, and acquired by means of crossmodal integration.

Multimodal Target Speech Separation with Voice and Face References

no code implementations 17 May 2020 Leyuan Qu, Cornelius Weber, Stefan Wermter

Target speech separation refers to isolating target speech from a multi-speaker mixture signal by conditioning on auxiliary information about the target speaker.

Audio and Speech Processing Sound

Improving Robot Dual-System Motor Learning with Intrinsically Motivated Meta-Control and Latent-Space Experience Imagination

1 code implementation 19 Apr 2020 Muhammad Burhan Hafez, Cornelius Weber, Matthias Kerzel, Stefan Wermter

In this paper, we present a novel dual-system motor learning approach where a meta-controller arbitrates online between model-based and model-free decisions based on an estimate of the local reliability of the learned model.

Robotic Grasping

Periodic Spectral Ergodicity: A Complexity Measure for Deep Neural Networks and Neural Architecture Search

1 code implementation 10 Nov 2019 Mehmet Süzen, J. J. Cerdà, Cornelius Weber

Establishing associations between the structure and the generalisation ability of deep neural networks (DNNs) is a challenging task in modern machine learning.

Neural Architecture Search

Curious Meta-Controller: Adaptive Alternation between Model-Based and Model-Free Control in Deep Reinforcement Learning

no code implementations 5 May 2019 Muhammad Burhan Hafez, Cornelius Weber, Matthias Kerzel, Stefan Wermter

Recent success in deep reinforcement learning for continuous control has been dominated by model-free approaches which, unlike model-based approaches, do not suffer from representational limitations in making assumptions about the world dynamics and model errors inevitable in complex domains.

Continuous Control

KT-Speech-Crawler: Automatic Dataset Construction for Speech Recognition from YouTube Videos

1 code implementation EMNLP 2018 Egor Lakomkin, Sven Magg, Cornelius Weber, Stefan Wermter

In this paper, we describe KT-Speech-Crawler: an approach for automatic dataset construction for speech recognition by crawling YouTube videos.

speech-recognition Speech Recognition

Incorporating End-to-End Speech Recognition Models for Sentiment Analysis

no code implementations 28 Feb 2019 Egor Lakomkin, Mohammad Ali Zamani, Cornelius Weber, Sven Magg, Stefan Wermter

We argue that using ground-truth transcriptions during training and evaluation phases leads to a significant discrepancy in performance compared to real-world conditions, as the spoken text has to be recognized on the fly and can contain speech recognition mistakes.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Deep Intrinsically Motivated Continuous Actor-Critic for Efficient Robotic Visuomotor Skill Learning

no code implementations 26 Oct 2018 Muhammad Burhan Hafez, Cornelius Weber, Matthias Kerzel, Stefan Wermter

In this paper, we present a new intrinsically motivated actor-critic algorithm for learning continuous motor skills directly from raw visual input.

Continuous Control

Discourse-Wizard: Discovering Deep Discourse Structure in your Conversation with RNNs

1 code implementation 29 Jun 2018 Chandrakant Bothe, Sven Magg, Cornelius Weber, Stefan Wermter

Spoken language understanding is one of the key factors in a dialogue system, and the context of a conversation plays an important role in understanding the current utterance.

Spoken Language Understanding

Lifelong Learning of Spatiotemporal Representations with Dual-Memory Recurrent Self-Organization

1 code implementation 28 May 2018 German I. Parisi, Jun Tani, Cornelius Weber, Stefan Wermter

Both growing networks can expand in response to novel sensory experience: the episodic memory learns fine-grained spatiotemporal representations of object instances in an unsupervised fashion while the semantic memory uses task-relevant signals to regulate structural plasticity levels and develop more compact representations from episodic experience.

Active Learning Continuous Object Recognition +1

On the Robustness of Speech Emotion Recognition for Human-Robot Interaction with Deep Neural Networks

no code implementations 6 Apr 2018 Egor Lakomkin, Mohammad Ali Zamani, Cornelius Weber, Sven Magg, Stefan Wermter

Speech emotion recognition (SER) is an important aspect of effective human-robot collaboration and has received a lot of attention from the research community.

Data Augmentation Speech Emotion Recognition

Reusing Neural Speech Representations for Auditory Emotion Recognition

no code implementations IJCNLP 2017 Egor Lakomkin, Cornelius Weber, Sven Magg, Stefan Wermter

Acoustic emotion recognition aims to categorize the affective state of the speaker and is still a difficult task for machine learning models.

Emotion Recognition General Classification +1

Spectral Ergodicity in Deep Learning Architectures via Surrogate Random Matrices

1 code implementation 25 Apr 2017 Mehmet Süzen, Cornelius Weber, Joan J. Cerdà

It is observed that as the matrix size increases, the level of spectral ergodicity of the ensemble rises, i.e., the eigenvalue spectrum obtained for a single realisation drawn at random from the ensemble is closer to the spectrum obtained by averaging over the whole ensemble.
