Interoperability of Dialogue Corpora through ISO 24617-2-based Querying

LREC 2014 · Volha Petukhova, Andrei Malchanau, Harry Bunt

This paper explores one way of achieving interoperability among dialogue corpora: a query format for accessing existing annotated corpora, in which query expressions use DiAML, the annotation language defined by the ISO 24617-2 standard. Interpreting a query expression implements a mapping from ISO 24617-2 concepts to those of the annotation scheme used in the corpus. We discuss two possible ways to query existing annotated corpora using DiAML. The first is to transform the corpora into a DiAML-compliant format and then query the resulting data using XQuery or XPath. The second is to define a DiAML query that can be used directly to retrieve the requested information from the annotated data. Both approaches are valid. The first is a standard way of querying XML data; the second is a DiAML-oriented way of querying dialogue-act-annotated data, for which we designed an interface. The proposed approach is tested on two important types of existing dialogue corpora: spoken two-person dialogues collected and annotated within the HCRC Map Task paradigm, and multiparty face-to-face dialogues from the AMI corpus. We present the results and evaluate their accuracy and completeness through statistical comparisons between retrieved annotations and manually constructed reference annotations.
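The two querying routes described in the abstract can be illustrated with a minimal sketch. It is not the authors' interface or implementation. The XML fragment is a simplified, namespace-free stand-in for DiAML, the corpus records are invented for illustration, and the mapping table `ISO_TO_MAPTASK` and the functions `query_diaml` and `query_native` are hypothetical names introduced here.

```python
import xml.etree.ElementTree as ET

# Simplified DiAML-like fragment (assumption: namespaces, functional
# segments, and most attributes of real DiAML are omitted for brevity).
DIAML = """
<diaml>
  <dialogueAct id="da1" sender="#g" addressee="#f"
               communicativeFunction="instruct" dimension="task"/>
  <dialogueAct id="da2" sender="#f" addressee="#g"
               communicativeFunction="propositionalQuestion" dimension="task"/>
  <dialogueAct id="da3" sender="#g" addressee="#f"
               communicativeFunction="confirm" dimension="task"/>
</diaml>
"""

def query_diaml(xml_text, function):
    """Route 1: the corpus is already converted to DiAML-like XML,
    so a plain XPath expression selects the matching dialogue acts."""
    root = ET.fromstring(xml_text)
    path = f".//dialogueAct[@communicativeFunction='{function}']"
    return [el.get("id") for el in root.findall(path)]

# Hypothetical, incomplete mapping from ISO 24617-2 communicative
# functions to Map Task-style move labels (illustration only).
ISO_TO_MAPTASK = {
    "propositionalQuestion": {"query-yn"},
    "setQuestion": {"query-w"},
    "instruct": {"instruct"},
}

# Invented corpus records in the native scheme: (move id, speaker, label).
CORPUS = [
    ("m1", "g", "instruct"),
    ("m2", "f", "query-yn"),
    ("m3", "g", "reply-y"),
]

def query_native(moves, function, mapping):
    """Route 2: keep the corpus in its own scheme and interpret the
    ISO-level query through the concept mapping."""
    tags = mapping.get(function, set())
    return [move_id for move_id, _speaker, tag in moves if tag in tags]

if __name__ == "__main__":
    print(query_diaml(DIAML, "propositionalQuestion"))
    print(query_native(CORPUS, "propositionalQuestion", ISO_TO_MAPTASK))
```

In the first route the mapping work happens once, at conversion time. In the second it happens at query time, so the corpus files stay untouched but every query depends on the quality of the mapping table.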
