In general, the direct Speech-to-text translation (ST) is jointly trained with Automatic Speech Recognition (ASR), and Machine Translation (MT) tasks.
Ranked #1 on Speech-to-Text Translation on MuST-C EN->DE (using extra training data)
Inspired by these learning patterns in humans, we suggest a simple yet generic task aware framework to incorporate into existing joint learning strategies.
The current re-translation approaches are based on autoregressive sequence generation models (ReTA), which generate tar-get tokens in the (partial) translation sequentially.
This paper describes our submission to the TL;DR challenge.
Experimental results using chitchat data reveal that (1) near human-like dialogue policies can be induced, (2) generalisation to unseen data is a difficult problem, and (3) training an ensemble of chatbot agents is essential for improved performance over using a single agent.
Training chatbots using the reinforcement learning paradigm is challenging due to high-dimensional states, infinite action spaces and the difficulty in specifying the reward function.
In this paper, we propose a self-learning architecture for generating natural language templates for conversational assistants.