Efficient methods to help users explore gestures are lacking; such exploration is challenging due to the inherently temporal evolution of gestures and their complex correlation with speech content.
Despite being a critical communication skill, humor is challenging to master -- successful use of humor requires a mixture of engaging content build-up and appropriate vocal delivery (e.g., pauses).
The modulation of voice properties, such as pitch, volume, and speed, is crucial for delivering a successful public speech.
Our visualization system features a channel coherence view and a sentence clustering view that together enable users to obtain a quick overview of emotion coherence and its temporal evolution.
It is often difficult to explore the relationships between the learned parameters and model performance due to the large number of parameters and the variability introduced by different random initializations.