Search Results for author: Haiyang Sun

Found 13 papers, 5 papers with code

TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes

1 code implementation • 28 Mar 2024 • Bu Jin, Yupeng Zheng, Pengfei Li, Weize Li, Yuhang Zheng, Sujie Hu, Xinyu Liu, Jinwei Zhu, Zhijie Yan, Haiyang Sun, Kun Zhan, Peng Jia, Xiaoxiao Long, Yilun Chen, Hao Zhao

However, the exploration of 3D dense captioning in outdoor scenes is hindered by two major challenges: 1) the domain gap between indoor and outdoor scenes, such as dynamics and sparse visual inputs, makes it difficult to directly adapt existing indoor methods; 2) the lack of data with comprehensive box-caption pair annotations specifically tailored for outdoor scenes.

3D Dense Captioning, Dense Captioning

Street Gaussians for Modeling Dynamic Urban Scenes

no code implementations • 2 Jan 2024 • Yunzhi Yan, Haotong Lin, Chenxu Zhou, Weijie Wang, Haiyang Sun, Kun Zhan, Xianpeng Lang, Xiaowei Zhou, Sida Peng

We introduce Street Gaussians, a new explicit scene representation that tackles all these limitations.

OpenSight: A Simple Open-Vocabulary Framework for LiDAR-Based Object Detection

no code implementations • 12 Dec 2023 • Hu Zhang, Jianhua Xu, Tao Tang, Haiyang Sun, Xin Yu, Zi Huang, Kaicheng Yu

OpenSight utilizes 2D-3D geometric priors for the initial discernment and localization of generic objects, followed by a more specific semantic interpretation of the detected objects.

Object Detection

GPT-4V with Emotion: A Zero-shot Benchmark for Generalized Emotion Recognition

1 code implementation • 7 Dec 2023 • Zheng Lian, Licai Sun, Haiyang Sun, Kang Chen, Zhuofan Wen, Hao Gu, Bin Liu, JianHua Tao

To bridge this gap, we present the quantitative evaluation results of GPT-4V on 21 benchmark datasets covering 6 tasks: visual sentiment analysis, tweet sentiment analysis, micro-expression recognition, facial emotion recognition, dynamic facial emotion recognition, and multimodal emotion recognition.

Facial Emotion Recognition, Micro Expression Recognition, +3

MFAS: Emotion Recognition through Multiple Perspectives Fusion Architecture Search Emulating Human Cognition

no code implementations • 12 Jun 2023 • Haiyang Sun, FuLin Zhang, Zheng Lian, Yingying Guo, Shilei Zhang

Additionally, considering that humans adjust their perception of emotional words in textual semantics based on certain cues present in speech, we design a novel search space and search for the optimal fusion strategy for the two types of information.

Quantization, Speech Emotion Recognition

Fully Automated End-to-End Fake Audio Detection

no code implementations • 20 Aug 2022 • Chenglong Wang, Jiangyan Yi, JianHua Tao, Haiyang Sun, Xun Chen, Zhengkun Tian, Haoxin Ma, Cunhang Fan, Ruibo Fu

Existing fake audio detection systems often rely on expert experience to design the acoustic features or to manually tune the hyperparameters of the network structure.

Two-Aspect Information Fusion Model For ABAW4 Multi-task Challenge

no code implementations • 23 Jul 2022 • Haiyang Sun, Zheng Lian, Bin Liu, JianHua Tao, Licai Sun, Cong Cai

In this paper, we propose a solution to the Multi-Task Learning (MTL) Challenge of the 4th Affective Behavior Analysis in-the-wild (ABAW) competition.

Multi-Task Learning, Vocal Bursts Valence Prediction
