Search Results for author: Junxiao Xue

Found 13 papers, 3 papers with code

Enhanced Multimodal RAG-LLM for Accurate Visual Question Answering

no code implementations • 30 Dec 2024 • Junxiao Xue, Quan Deng, Fei Yu, Yanhao Wang, Jun Wang, Yuehua Li

Multimodal large language models (MLLMs), such as GPT-4o, Gemini, LLaVA, and Flamingo, have made significant progress in integrating visual and textual modalities, excelling in tasks like visual question answering (VQA), image captioning, and content retrieval.

Image Captioning, Object Recognition, +4

3A-YOLO: New Real-Time Object Detectors with Triple Discriminative Awareness and Coordinated Representations

no code implementations • 10 Dec 2024 • Xuecheng Wu, Junxiao Xue, Liangyu Fu, Jiayu Nie, Danlei Huang, Xinyi Yin

Recent research on real-time object detectors (e.g., YOLO series) has demonstrated the effectiveness of attention mechanisms for elevating model performance.

Pilot-guided Multimodal Semantic Communication for Audio-Visual Event Localization

no code implementations • 9 Dec 2024 • Fei Yu, Zhe Xiang, Nan Che, Zhuoran Zhang, Yuandi Li, Junxiao Xue, Zhiguo Wan

Existing methods often focus on single modality tasks and fail to handle multimodal stream data, such as video and audio, and their corresponding tasks.

audio-visual event localization, Autonomous Driving, +1

Edge-Cloud Collaborative Satellite Image Analysis for Efficient Man-Made Structure Recognition

no code implementations • 8 Oct 2024 • Kaicheng Sheng, Junxiao Xue, Hui Zhang

The increasing availability of high-resolution satellite imagery has created immense opportunities for various applications.

Cloud Computing

Towards Emotion Analysis in Short-form Videos: A Large-Scale Dataset and Baseline

1 code implementation • 29 Nov 2023 • Xuecheng Wu, Heli Sun, Junxiao Xue, Jiayu Nie, Xiangyan Kong, Ruofan Zhai, Liang He

The prevailing use of short-form videos (SVs) to spread emotions makes it necessary to conduct video emotion analysis (VEA) on SVs.

audio-visual learning, Form, +2

Affective Video Content Analysis: Decade Review and New Perspectives

no code implementations • 26 Oct 2023 • Junxiao Xue, Jie Wang, Xuecheng Wu, Qian Zhang

In this study, we comprehensively review the development of affective video content analysis (AVCA) over the past decade, focusing on the most advanced methods adopted to address three major challenges: video feature extraction, expression subjectivity, and multimodal feature fusion.

Emotional Intelligence, Facial Expression Recognition, +1

Cross-modal information fusion for voice spoofing detection

1 code implementation • journal 2023 • Junxiao Xue, Hao Zhou, Huawei Song, Bin Wu, Lei Shi

Researchers have proposed many methods to defend against voice spoofing attacks, but existing methods focus only on speech features.

Automatic Speech Recognition, fake voice detection, +3

Physiological-Physical Feature Fusion for Automatic Voice Spoofing Detection

no code implementations • 1 Sep 2021 • Junxiao Xue, Hao Zhou, Yabo Wang

This method involves feature extraction, a densely connected convolutional neural network with squeeze-and-excitation blocks (SE-DenseNet), a multi-scale residual neural network with squeeze-and-excitation blocks (SE-Res2Net), and feature fusion strategies.

Speaker Verification, Speech Synthesis, +1

Multi-Agent Path Planning based on MPC and DDPG

no code implementations • 26 Feb 2021 • Junxiao Xue, Xiangyan Kong, Bowei Dong, Mingliang Xu

The problem of mixed static and dynamic obstacle avoidance is essential for path planning in highly dynamic environments.

Decision Making, Model Predictive Control, +1

Agent-Based Campus Novel Coronavirus Infection and Control Simulation

no code implementations • 22 Feb 2021 • Pei Lv, Quan Zhang, Boya Xu, Ran Feng, Chaochao Li, Junxiao Xue, Bing Zhou, Mingliang Xu

Coronavirus disease 2019 (COVID-19), due to its extremely high infectivity, has been spreading rapidly around the world, with a huge impact on socioeconomic development as well as people's daily lives.

Social and Information Networks, Physics and Society, Populations and Evolution

Multi-scale discriminative Region Discovery for Weakly-Supervised Object Localization

no code implementations • 24 Sep 2019 • Pei Lv, Haiyu Yu, Junxiao Xue, Junjin Cheng, Lisha Cui, Bing Zhou, Mingliang Xu, Yi Yang

On ILSVRC 2016, the proposed method yields a Top-1 localization error of 48.65%, which outperforms previous results by 2.75%.

Weakly-Supervised Object Localization
