Search Results for author: Liyun Zhang

Found 5 papers, 1 paper with code

Push the Limit of Multi-modal Emotion Recognition by Prompting LLMs with Receptive-Field-Aware Attention Weighting

no code implementations · 26 Nov 2024 · Liyun Zhang, Dian Ding, Yu Lu, Yi-Chao Chen, Guangtao Xue

In this paper, we present a framework, Lantern, that improves the performance of a given vanilla model by prompting large language models with receptive-field-aware attention weighting.
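As a rough illustration of receptive-field-aware attention weighting (a minimal sketch under assumed shapes and names, not Lantern's actual implementation): per-frame emotion scores are pooled within sliding receptive fields, and the fields are then combined with softmax attention weights.

```python
import numpy as np

def receptive_field_weighting(frame_scores, field_size=3):
    """Illustrative sketch: pool per-frame emotion scores inside each
    sliding receptive field, then softmax-weight the fields by their
    peak confidence. Function name, shapes, and weighting scheme are
    assumptions for exposition, not the paper's method."""
    n, c = frame_scores.shape
    # Mean-pool scores within each sliding receptive field
    fields = np.stack([frame_scores[i:i + field_size].mean(axis=0)
                       for i in range(n - field_size + 1)])
    # Softmax attention over fields, scored by their most confident class
    weights = np.exp(fields.max(axis=1))
    weights /= weights.sum()
    # Attention-weighted combination of field-level scores
    return weights @ fields
```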

Emotion Recognition

UniAutoML: A Human-Centered Framework for Unified Discriminative and Generative AutoML with Large Language Models

1 code implementation · 9 Oct 2024 · Jiayi Guo, Zan Chen, Yingrui Ji, Liyun Zhang, Daqin Luo, Zhigang Li, Yiqin Shen

Additionally, these frameworks lack interpretability and user engagement during the training process, primarily due to the absence of human-centered design.

AutoML Model Selection

3DFacePolicy: Speech-Driven 3D Facial Animation with Diffusion Policy

no code implementations · 17 Sep 2024 · Xuanmeng Sha, Liyun Zhang, Tomohiro Mashita, Yuki Uranishi

This method generates varied and realistic human facial movements by predicting 3D vertex trajectories on a 3D facial template with a diffusion policy, rather than generating the face frame by frame.

MicroEmo: Time-Sensitive Multimodal Emotion Recognition with Micro-Expression Dynamics in Video Dialogues

no code implementations · 23 Jul 2024 · Liyun Zhang

Multimodal Large Language Models (MLLMs) have demonstrated remarkable multimodal emotion recognition capabilities, integrating multimodal cues from visual, acoustic, and linguistic contexts in the video to recognize human emotional states.

Multimodal Emotion Recognition

Panoptic-aware Image-to-Image Translation

no code implementations · 3 Dec 2021 · Liyun Zhang, Photchara Ratsamee, Bowen Wang, Zhaojie Luo, Yuki Uranishi, Manabu Higashida, Haruo Takemura

The panoptic perception (i.e., the foreground instances and background semantics of the image scene) is extracted to align object content codes of the input domain with panoptic-level style codes sampled from the target style space; the result is then refined by a proposed feature-masking module that sharpens object boundaries.

Image-to-Image Translation · Object · +3
