Recent advancements in machine reading and listening comprehension involve the annotation of long texts.
In Natural Language Understanding, the task of response generation typically focuses on short texts, such as tweets or a single turn in a dialog.
To this end, we collected a large dataset of $400$ speeches in English discussing $200$ controversial topics, mined claims for each topic, and asked annotators to identify the mined claims mentioned in each speech.
We applied baseline methods to this task, which can serve as a benchmark for future work on this dataset.
We describe a large, high-quality benchmark for the evaluation of Mention Detection tools.
This paper describes an English audio and textual dataset of debating speeches, a unique resource for the growing research field of computational argumentation and debating technologies.
The stream of words produced by Automatic Speech Recognition (ASR) systems is typically devoid of punctuation and formatting.
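Restoring punctuation to raw ASR output is often framed as per-token classification: a model predicts, for each word, which punctuation mark (if any) follows it. The sketch below is only illustrative and not the method used in this work; the label set and the `restore` helper are hypothetical, and the classifier itself is assumed to exist upstream.

```python
# Toy sketch of the post-processing step in punctuation restoration,
# assuming an upstream classifier has already produced per-token labels.
# Labels (hypothetical): "O" = no punctuation, "," or "." = mark to append.

def restore(tokens, labels):
    """Reattach predicted punctuation and recase sentence-initial words."""
    out = []
    start_of_sentence = True
    for tok, lab in zip(tokens, labels):
        word = tok.capitalize() if start_of_sentence else tok
        out.append(word + (lab if lab != "O" else ""))
        # A predicted period marks the start of a new sentence.
        start_of_sentence = lab == "."
    return " ".join(out)

tokens = ["hello", "world", "this", "is", "asr", "output"]
labels = ["O", ".", "O", "O", "O", "."]
print(restore(tokens, labels))
```

In practice the per-token labels would come from a sequence model trained on punctuated text; this snippet only shows how such predictions map back onto a readable transcript.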