For goal-oriented document-grounded dialogs, it often involves complex contexts for identifying the most relevant information, which requires better understanding of the inter-relations between conversations and documents. Meanwhile, many online user-oriented documents use both semi-structured and unstructured contents for guiding users to access information of different contexts. Thus, we create a new goal-oriented document-grounded dialogue dataset that captures more diverse scenarios derived from various document contents from multiple domains such ssa.gov and studentaid.gov. For data collection, we propose a novel pipeline approach for dialogue data construction, which has been adapted and evaluated for several domains.
34 PAPERS • NO BENCHMARKS YET
Most existing dialogue systems fail to respond properly to potentially unsafe user utterances by either ignoring or passively agreeing with them.
13 PAPERS • 1 BENCHMARK
In MutualFriends, two agents, A and B, each have a private knowledge base, which contains a list of friends with multiple attributes (e.g., name, school, major, etc.). The agents must chat with each other to find their unique mutual friend.
7 PAPERS • NO BENCHMARKS YET
DiaASQ is a fine-grained Aspect-based Sentiment Analysis (ABSA) benchmark under the conversation scenario. It challenges existing ABSA methods by 1) extracting quadruple of target-aspect-opinion-sentiment in a dialogue, and 2) modeling the dialogue discourse structures. The dataset is constructed by systematically crawling tweets from digital bloggers, followed by a series of preprocessing steps including filtering, normalizing, pruning, and annotating the collected dialogues, resulting in a final corpus of 1,000 dialogues. To enhance the multilingual usability, DiaASQ has both the English and Chinese versions of languages.
3 PAPERS • 2 BENCHMARKS
Emotional Dialogue Acts data contains dialogue act labels for existing emotion multi-modal conversational datasets. We chose two popular multimodal emotion datasets: Multimodal EmotionLines Dataset (MELD) and Interactive Emotional dyadic MOtion CAPture database (IEMOCAP). EDAs reveal associations between dialogue acts and emotional states in a natural-conversational language such as Accept/Agree dialogue acts often occur with the Joy emotion, Apology with Sadness, and Thanking with Joy.
3 PAPERS • NO BENCHMARKS YET
The Mafia Dataset was created to model the behavior of deceptive actors in the context of the Mafia game, as described in the paper “Putting the Con in Context: Identifying Deceptive Actors in the Game of Mafia”. We hope that this dataset will be of use to others studying the effects of deception on language use.
1 PAPER • NO BENCHMARKS YET