Search Results for author: Mingcheng Li

Found 21 papers, 7 papers with code

Toward Robust Incomplete Multimodal Sentiment Analysis via Hierarchical Representation Learning

no code implementations5 Nov 2024 Mingcheng Li, Dingkang Yang, Yang Liu, Shunli Wang, Jiawei Chen, Shuaibing Wang, Jinjie Wei, Yue Jiang, Qingyao Xu, Xiaolu Hou, Mingyang Sun, Ziyun Qian, Dongliang Kou, Lihua Zhang

Specifically, we propose a fine-grained representation factorization module that sufficiently extracts valuable sentiment information by factorizing modality into sentiment-relevant and modality-specific representations through crossmodal translation and sentiment semantic reconstruction.

Multimodal Sentiment Analysis Representation Learning

HybridOcc: NeRF Enhanced Transformer-based Multi-Camera 3D Occupancy Prediction

no code implementations17 Aug 2024 Xiao Zhao, Bo Chen, Mingyang Sun, Dingkang Yang, Youxing Wang, Xukun Zhang, Mingcheng Li, Dongliang Kou, Xiaoyi Wei, Lihua Zhang

This paper proposes HybridOcc, a hybrid 3D volume query proposal method generated by Transformer framework and NeRF representation and refined in a coarse-to-fine SSC prediction framework.

3D geometry 3D Semantic Scene Completion +1

MaskBEV: Towards A Unified Framework for BEV Detection and Map Segmentation

no code implementations17 Aug 2024 Xiao Zhao, Xukun Zhang, Dingkang Yang, Mingyang Sun, Mingcheng Li, Shunli Wang, Lihua Zhang

However, current multimodal perception research follows independent paradigms designed for specific perception tasks, leading to a lack of complementary learning among tasks and decreased performance in multi-task learning (MTL) due to joint training.

3D Object Detection Autonomous Driving +5

Faster Diffusion Action Segmentation

no code implementations4 Aug 2024 Shuaibing Wang, Shunli Wang, Mingcheng Li, Dingkang Yang, Haopeng Kuang, Ziyun Qian, Lihua Zhang

However, the heavy sampling steps required by diffusion models pose a substantial computational burden, limiting their practicality in real-time applications.

Action Segmentation Computational Efficiency +2

Asynchronous Multimodal Video Sequence Fusion via Learning Modality-Exclusive and -Agnostic Representations

no code implementations6 Jul 2024 Dingkang Yang, Mingcheng Li, Linhao Qu, Kun Yang, Peng Zhai, Song Wang, Lihua Zhang

To tackle these issues, we propose a Multimodal fusion approach for learning modality-Exclusive and modality-Agnostic representations (MEA) to refine multimodal features and leverage the complementarity across distinct modalities.

CoMT: Chain-of-Medical-Thought Reduces Hallucination in Medical Report Generation

no code implementations17 Jun 2024 Yue Jiang, Jiawei Chen, Dingkang Yang, Mingcheng Li, Shunli Wang, Tong Wu, Ke Li, Lihua Zhang

Automatic medical report generation (MRG), which possesses significant research value as it can aid radiologists in clinical diagnosis and report composition, has garnered increasing attention.

Hallucination Medical Report Generation

Detecting and Evaluating Medical Hallucinations in Large Vision Language Models

no code implementations14 Jun 2024 Jiawei Chen, Dingkang Yang, Tong Wu, Yue Jiang, Xiaolu Hou, Mingcheng Li, Shunli Wang, Dongling Xiao, Ke Li, Lihua Zhang

To bridge this gap, we introduce Med-HallMark, the first benchmark specifically designed for hallucination detection and evaluation within the medical multimodal domain.

Hallucination Medical Visual Question Answering +2

PediatricsGPT: Large Language Models as Chinese Medical Assistants for Pediatric Applications

1 code implementation29 May 2024 Dingkang Yang, Jinjie Wei, Dongling Xiao, Shunli Wang, Tong Wu, Gang Li, Mingcheng Li, Shuaibing Wang, Jiawei Chen, Yue Jiang, Qingyao Xu, Ke Li, Peng Zhai, Lihua Zhang

In the parameter-efficient secondary SFT phase, a mixture of universal-specific experts strategy is presented to resolve the competency conflict between medical generalist and pediatric expertise mastery.

Domain Adaptation

SMCD: High Realism Motion Style Transfer via Mamba-based Diffusion

no code implementations5 May 2024 Ziyun Qian, Zeyu Xiao, Zhenyi Wu, Dingkang Yang, Mingcheng Li, Shunli Wang, Shuaibing Wang, Dongliang Kou, Lihua Zhang

To address these problems, we consider style motion as a condition and propose the Style Motion Conditioned Diffusion (SMCD) framework for the first time, which can more comprehensively learn the style features of motion.

Mamba Motion Style Transfer +1

Correlation-Decoupled Knowledge Distillation for Multimodal Sentiment Analysis with Incomplete Modalities

no code implementations CVPR 2024 Mingcheng Li, Dingkang Yang, Xiao Zhao, Shuaibing Wang, Yan Wang, Kun Yang, Mingyang Sun, Dongliang Kou, Ziyun Qian, Lihua Zhang

Specifically, we present a sample-level contrastive distillation mechanism that transfers comprehensive knowledge containing cross-sample correlations to reconstruct missing semantics.

Disentanglement Knowledge Distillation +1

Efficiency in Focus: LayerNorm as a Catalyst for Fine-tuning Medical Visual Language Pre-trained Models

no code implementations25 Apr 2024 Jiawei Chen, Dingkang Yang, Yue Jiang, Mingcheng Li, Jinjie Wei, Xiaolu Hou, Lihua Zhang

In the realm of Medical Visual Language Models (Med-VLMs), the quest for universal efficient fine-tuning mechanisms remains paramount, especially given researchers in interdisciplinary fields are often extremely short of training resources, yet largely unexplored.

Medical Visual Question Answering parameter-efficient fine-tuning +2

Can LLMs' Tuning Methods Work in Medical Multimodal Domain?

2 code implementations11 Mar 2024 Jiawei Chen, Yue Jiang, Dingkang Yang, Mingcheng Li, Jinjie Wei, Ziyun Qian, Lihua Zhang

In this paper, we delve into the fine-tuning methods of LLMs and conduct extensive experiments to investigate the impact of fine-tuning methods for large models on the existing multimodal model in the medical domain from the training data level and the model structure level.

Transfer Learning World Knowledge

Robust Emotion Recognition in Context Debiasing

no code implementations CVPR 2024 Dingkang Yang, Kun Yang, Mingcheng Li, Shunli Wang, Shuaibing Wang, Lihua Zhang

Following the causal graph, CLEF introduces a non-invasive context branch to capture the adverse direct effect caused by the context bias.

counterfactual Emotion Recognition in Context

Towards Multimodal Sentiment Analysis Debiasing via Bias Purification

no code implementations8 Mar 2024 Dingkang Yang, Mingcheng Li, Dongling Xiao, Yang Liu, Kun Yang, Zhaoyu Chen, Yuzheng Wang, Peng Zhai, Ke Li, Lihua Zhang

In the inference phase, given a factual multimodal input, MCIS imagines two counterfactual scenarios to purify and mitigate these biases.

counterfactual Counterfactual Inference +1

Debiased Multimodal Understanding for Human Language Sequences

no code implementations8 Mar 2024 Zhi Xu, Dingkang Yang, Mingcheng Li, Yuzheng Wang, Zhaoyu Chen, Jiawei Chen, Jinjie Wei, Lihua Zhang

Human multimodal language understanding (MLU) is an indispensable component of expression analysis (e. g., sentiment or humor) from heterogeneous modalities, including visual postures, linguistic contents, and acoustic behaviours.

Spatio-Temporal Domain Awareness for Multi-Agent Collaborative Perception

1 code implementation ICCV 2023 Kun Yang, Dingkang Yang, Jingyu Zhang, Mingcheng Li, Yang Liu, Jing Liu, Hanqi Wang, Peng Sun, Liang Song

In this paper, we propose SCOPE, a novel collaborative perception framework that aggregates the spatio-temporal awareness characteristics across on-road agents in an end-to-end manner.

3D Object Detection Autonomous Vehicles +1

Context De-confounded Emotion Recognition

1 code implementation CVPR 2023 Dingkang Yang, Zhaoyu Chen, Yuzheng Wang, Shunli Wang, Mingcheng Li, Siao Liu, Xiao Zhao, Shuai Huang, Zhiyan Dong, Peng Zhai, Lihua Zhang

However, a long-overlooked issue is that a context bias in existing datasets leads to a significantly unbalanced distribution of emotional states among different context scenarios.

Emotion Recognition

Cannot find the paper you are looking for? You can Submit a new open access paper.