Search Results for author: Haiyang Sun

Found 13 papers, 5 papers with code

TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes

1 code implementation • 28 Mar 2024 • Bu Jin, Yupeng Zheng, Pengfei Li, Weize Li, Yuhang Zheng, Sujie Hu, Xinyu Liu, Jinwei Zhu, Zhijie Yan, Haiyang Sun, Kun Zhan, Peng Jia, Xiaoxiao Long, Yilun Chen, Hao Zhao

However, the exploration of 3D dense captioning in outdoor scenes is hindered by two major challenges: 1) the \textbf{domain gap} between indoor and outdoor scenes, such as dynamics and sparse visual inputs, makes it difficult to directly adapt existing indoor methods; 2) the \textbf{lack of data} with comprehensive box-caption pair annotations specifically tailored for outdoor scenes.

3D dense captioning Dense Captioning

Paper
Code

Multimodal Fusion with Pre-Trained Model Features in Affective Behaviour Analysis In-the-wild

no code implementations • 22 Mar 2024 • Zhuofan Wen, Fengyu Zhang, Siyuan Zhang, Haiyang Sun, Mingyu Xu, Licai Sun, Zheng Lian, Bin Liu, JianHua Tao

Multimodal fusion is a significant method for most multimodal tasks.

Paper
Add Code

Can Deception Detection Go Deeper? Dataset, Evaluation, and Benchmark for Deception Reasoning

no code implementations • 18 Feb 2024 • Kang Chen, Zheng Lian, Haiyang Sun, Bin Liu, JianHua Tao

To address data scarcity, this paper proposes a new data collection pipeline.

Deception Detection

Paper
Add Code

Street Gaussians for Modeling Dynamic Urban Scenes

no code implementations • 2 Jan 2024 • Yunzhi Yan, Haotong Lin, Chenxu Zhou, Weijie Wang, Haiyang Sun, Kun Zhan, Xianpeng Lang, Xiaowei Zhou, Sida Peng

We introduce Street Gaussians, a new explicit scene representation that tackles all these limitations.

Paper
Add Code

SVFAP: Self-supervised Video Facial Affect Perceiver

1 code implementation • 31 Dec 2023 • Licai Sun, Zheng Lian, Kexin Wang, Yu He, Mingyu Xu, Haiyang Sun, Bin Liu, JianHua Tao

Video-based facial affect analysis has recently attracted increasing attention owing to its critical role in human-computer interaction.

Ranked #3 on Dynamic Facial Expression Recognition on FERV39k

Dynamic Facial Expression Recognition Emotion Recognition +2

Paper
Code

OpenSight: A Simple Open-Vocabulary Framework for LiDAR-Based Object Detection

no code implementations • 12 Dec 2023 • Hu Zhang, Jianhua Xu, Tao Tang, Haiyang Sun, Xin Yu, Zi Huang, Kaicheng Yu

OpenSight utilizes 2D-3D geometric priors for the initial discernment and localization of generic objects, followed by a more specific semantic interpretation of the detected objects.

object-detection Object Detection

Paper
Add Code

GPT-4V with Emotion: A Zero-shot Benchmark for Generalized Emotion Recognition

1 code implementation • 7 Dec 2023 • Zheng Lian, Licai Sun, Haiyang Sun, Kang Chen, Zhuofan Wen, Hao Gu, Bin Liu, JianHua Tao

To bridge this gap, we present the quantitative evaluation results of GPT-4V on 21 benchmark datasets covering 6 tasks: visual sentiment analysis, tweet sentiment analysis, micro-expression recognition, facial emotion recognition, dynamic facial emotion recognition, and multimodal emotion recognition.

Facial Emotion Recognition Micro Expression Recognition +3

Paper
Code

MFAS: Emotion Recognition through Multiple Perspectives Fusion Architecture Search Emulating Human Cognition

no code implementations • 12 Jun 2023 • Haiyang Sun, FuLin Zhang, Zheng Lian, Yingying Guo, Shilei Zhang

Additionally, considering that humans adjust their perception of emotional words in textual semantic based on certain cues present in speech, we design a novel search space and search for the optimal fusion strategy for the two types of information.

Quantization Speech Emotion Recognition

Paper
Add Code

MER 2023: Multi-label Learning, Modality Robustness, and Semi-Supervised Learning

3 code implementations • 18 Apr 2023 • Zheng Lian, Haiyang Sun, Licai Sun, Kang Chen, Mingyu Xu, Kexin Wang, Ke Xu, Yu He, Ying Li, Jinming Zhao, Ye Liu, Bin Liu, Jiangyan Yi, Meng Wang, Erik Cambria, Guoying Zhao, Björn W. Schuller, JianHua Tao

The first Multimodal Emotion Recognition Challenge (MER 2023) was successfully held at ACM Multimedia.

Multi-Label Learning Multimodal Emotion Recognition

Paper
Code

Fully Automated End-to-End Fake Audio Detection

no code implementations • 20 Aug 2022 • Chenglong Wang, Jiangyan Yi, JianHua Tao, Haiyang Sun, Xun Chen, Zhengkun Tian, Haoxin Ma, Cunhang Fan, Ruibo Fu

The existing fake audio detection systems often rely on expert experience to design the acoustic features or manually design the hyperparameters of the network structure.

Paper
Add Code

Two-Aspect Information Fusion Model For ABAW4 Multi-task Challenge

no code implementations • 23 Jul 2022 • Haiyang Sun, Zheng Lian, Bin Liu, JianHua Tao, Licai Sun, Cong Cai

In this paper, we propose the solution to the Multi-Task Learning (MTL) Challenge of the 4th Affective Behavior Analysis in-the-wild (ABAW) competition.

Multi-Task Learning Vocal Bursts Valence Prediction