no code implementations • NAACL (CLPsych) 2021 • Zixiu Wu, Rim Helaoui, Diego Reforgiato Recupero, Daniele Riboni
Gauging therapist empathy in counselling is an important component of understanding counselling quality.
1 code implementation • IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022 • Zixiu Wu, Simone Balloccu, Vivek Kumar, Rim Helaoui, Ehud Reiter, Diego Reforgiato Recupero, Daniele Riboni
Research on natural language processing for counselling dialogue analysis has seen substantial development in recent years, but access to this area remains extremely limited due to the lack of publicly available expert-annotated therapy conversations.
no code implementations • 20 May 2021 • Zixiu Wu, Rim Helaoui, Vivek Kumar, Diego Reforgiato Recupero, Daniele Riboni
Empathetic response from the therapist is key to the success of clinical psychotherapy, especially motivational interviewing.
no code implementations • EMNLP (IWSLT) 2019 • Zixiu Wu, Ozan Caglayan, Julia Ive, Josiah Wang, Lucia Specia
Upon conducting extensive experiments, we found that (i) the explored visual integration schemes often harm the translation performance for the transformer and additive deliberation, but considerably improve the cascade deliberation; (ii) the transformer and cascade deliberation integrate the visual modality better than the additive deliberation, as shown by the incongruence analysis.
Automatic Speech Recognition (ASR) +3
no code implementations • 16 Oct 2019 • Ozan Caglayan, Zixiu Wu, Pranava Madhyastha, Josiah Wang, Lucia Specia
This paper describes the Imperial College London team's submission to the 2019 VATEX video captioning challenge, where we first explore two sequence-to-sequence models, namely a recurrent (GRU) model and a transformer model, which generate captions from the I3D action features.
no code implementations • 5 Aug 2019 • Zixiu Wu, Julia Ive, Josiah Wang, Pranava Madhyastha, Lucia Specia
The question we ask ourselves is whether visual features can support the translation process. In particular, given that this dataset is extracted from videos, we focus on the translation of actions, which we believe are poorly captured in the static image-text datasets currently used for multimodal translation.