Search Results for author: Dong Won Lee

Found 9 papers, 4 papers with code

Improving Dialogue Agents by Decomposing One Global Explicit Annotation with Local Implicit Multimodal Feedback

no code implementations • 17 Mar 2024 • Dong Won Lee, Hae Won Park, Yoon Kim, Cynthia Breazeal, Louis-Philippe Morency

We describe an approach for aligning an LLM-based dialogue agent based on global (i.e., dialogue-level) rewards, while also taking into account naturally occurring multimodal signals.

HIINT: Historical, Intra- and Inter-personal Dynamics Modeling with Cross-person Memory Transformer

no code implementations • 21 May 2023 • Yubin Kim, Dong Won Lee, Paul Pu Liang, Sharifa Alghowinem, Cynthia Breazeal, Hae Won Park

Accurately modeling affect dynamics, which refers to the changes and fluctuations in emotions and affective displays during human conversations, is crucial for understanding human interactions.

Tasks: Language Modelling, Large Language Model

Multipar-T: Multiparty-Transformer for Capturing Contingent Behaviors in Group Conversations

no code implementations • 19 Apr 2023 • Dong Won Lee, Yubin Kim, Rosalind Picard, Cynthia Breazeal, Hae Won Park

As we move closer to real-world AI systems, AI agents must be able to deal with multiparty (group) conversations.

Lecture Presentations Multimodal Dataset: Towards Understanding Multimodality in Educational Videos

no code implementations • ICCV 2023 • Dong Won Lee, Chaitanya Ahuja, Paul Pu Liang, Sanika Natu, Louis-Philippe Morency

We introduce three research tasks: (1) figure-to-text retrieval, (2) text-to-figure retrieval, and (3) generation of slide explanations, which are grounded in multimedia learning and psychology principles to test a vision-language model's understanding of multimodal content.

Tasks: Attribute, Retrieval (+1 more)

Multimodal Lecture Presentations Dataset: Understanding Multimodality in Educational Slides

2 code implementations • 17 Aug 2022 • Dong Won Lee, Chaitanya Ahuja, Paul Pu Liang, Sanika Natu, Louis-Philippe Morency

As a step toward developing AI to aid in student learning as intelligent teacher assistants, we introduce the Multimodal Lecture Presentations dataset as a large-scale benchmark testing the capabilities of machine learning models in multimodal understanding of educational content.

Tasks: Attribute

Low-Resource Adaptation for Personalized Co-Speech Gesture Generation

no code implementations • CVPR 2022 • Chaitanya Ahuja, Dong Won Lee, Louis-Philippe Morency

Personalizing an avatar for co-speech gesture generation from spoken language requires learning the idiosyncrasies of a person's gesture style from a small amount of data.

Tasks: Gesture Generation

Crossmodal clustered contrastive learning: Grounding of spoken language to gesture

1 code implementation • ACM ICMI Workshop GENEA 2021 • Dong Won Lee, Chaitanya Ahuja, Louis-Philippe Morency

Crossmodal grounding is a key challenge for the task of generating relevant and well-timed gestures from just spoken language as an input.

Tasks: Contrastive Learning

Style Transfer for Co-Speech Gesture Animation: A Multi-Speaker Conditional-Mixture Approach

1 code implementation • ECCV 2020 • Chaitanya Ahuja, Dong Won Lee, Yukiko I. Nakano, Louis-Philippe Morency

A key challenge, called gesture style transfer, is to learn a model that generates these gestures for a speaking agent 'A' in the gesturing style of a target speaker 'B'.

Tasks: Gesture Generation, Style Transfer
