Search Results for author: Mohan Li

Found 14 papers, 2 papers with code

Mitigating Hallucinations in Large Vision-Language Models by Adaptively Constraining Information Flow

1 code implementation 28 Feb 2025 Jiaqi Bai, Hongcheng Guo, Zhongyuan Peng, Jian Yang, Zhoujun Li, Mohan Li, Zhihong Tian

Furthermore, we propose an entropy-based noise-controlling strategy to enable the injected noise to be adaptively constrained regarding the smoothness of the similarity distribution.

Hallucination Object +3

Towards Robust and Secure Embodied AI: A Survey on Vulnerabilities and Attacks

no code implementations 18 Feb 2025 Wenpeng Xing, Minghao Li, Mohan Li, Meng Han

Embodied AI systems, including robots and autonomous vehicles, are increasingly integrated into real-world applications, where they encounter a range of vulnerabilities stemming from both environmental and system-level factors.

Adversarial Attack Autonomous Vehicles +4

Neural Honeytrace: A Robust Plug-and-Play Watermarking Framework against Model Extraction Attacks

1 code implementation 16 Jan 2025 Yixiao Xu, Binxing Fang, Rui Wang, Yinghai Zhou, Shouling Ji, YuAn Liu, Mohan Li, Zhihong Tian

Guided by the model, we further introduce: (1) a similarity-based training-free watermarking method for plug-and-play and flexible watermarking, and (2) a distribution-based multi-step watermark information transmission strategy for robust watermarking.

Model extraction

A Survey on Federated Learning in Human Sensing

no code implementations 7 Jan 2025 Mohan Li, Martin Gjoreski, Pietro Barbiero, Gašper Slapničar, Mitja Luštrek, Nicholas D. Lane, Marc Langheinrich

However, its reliance on detailed and often privacy-sensitive data as the basis for its machine learning (ML) models raises significant legal and ethical concerns.

Federated Learning Survey

WHISMA: A Speech-LLM to Perform Zero-shot Spoken Language Understanding

no code implementations 29 Aug 2024 Mohan Li, Cong-Thanh Do, Simon Keizer, Youmna Farag, Svetlana Stoyanchev, Rama Doddipatla

Speech large language models (speech-LLMs) integrate speech and text-based foundation models to provide a unified framework for handling a wide range of downstream tasks.

slot-filling Spoken Language Understanding +1

Prompting Whisper for QA-driven Zero-shot End-to-end Spoken Language Understanding

no code implementations 21 Jun 2024 Mohan Li, Simon Keizer, Rama Doddipatla

The system is efficiently trained with prefix-tuning, optimising a minimal set of parameters rather than the entire Whisper model.

Cross-corpus Decoder +4

DiaLoc: An Iterative Approach to Embodied Dialog Localization

no code implementations CVPR 2024 Chao Zhang, Mohan Li, Ignas Budvytis, Stephan Liwicki

However, most existing works in embodied dialog research focus on navigation and leave the localization task understudied.

Economic Forces in Stock Returns

no code implementations 6 Jan 2024 Yue Chen, Mohan Li

Driven by such motivation, we conduct an attribution analysis based on the general framework of their model to further prove the importance of the economic factors and identify the specific identity of significant factors.

Self-regularised Minimum Latency Training for Streaming Transformer-based Speech Recognition

no code implementations 24 Apr 2023 Mohan Li, Rama Doddipatla, Catalin Zorila

In previous works, latency was optimised by truncating the online attention weights based on the hard alignments obtained from conventional ASR models, without taking into account the potential loss of ASR accuracy.

Automatic Speech Recognition (ASR) +1

Non-autoregressive End-to-end Approaches for Joint Automatic Speech Recognition and Spoken Language Understanding

no code implementations 21 Apr 2023 Mohan Li, Rama Doddipatla

This paper presents the use of non-autoregressive (NAR) approaches for joint automatic speech recognition (ASR) and spoken language understanding (SLU) tasks.

Automatic Speech Recognition (ASR) +3

Multiple-hypothesis RNN-T Loss for Unsupervised Fine-tuning and Self-training of Neural Transducer

no code implementations 29 Jul 2022 Cong-Thanh Do, Mohan Li, Rama Doddipatla

The multiple-hypothesis approach yields a relative WER reduction of 3.3% on the CHiME-4 single-channel real noisy evaluation set compared with the single-hypothesis approach.

Automatic Speech Recognition (ASR) +1

Transformer-based Streaming ASR with Cumulative Attention

no code implementations 11 Mar 2022 Mohan Li, Shucong Zhang, Catalin Zorila, Rama Doddipatla

In this paper, we propose an online attention mechanism, known as cumulative attention (CA), for streaming Transformer-based automatic speech recognition (ASR).

Automatic Speech Recognition (ASR) +2

Head-synchronous Decoding for Transformer-based Streaming ASR

no code implementations 26 Apr 2021 Mohan Li, Catalin Zorila, Rama Doddipatla

Online Transformer-based automatic speech recognition (ASR) systems have been extensively studied due to the increasing demand for streaming applications.

Automatic Speech Recognition (ASR) +2

End-to-end Speech Recognition with Adaptive Computation Steps

no code implementations 30 Aug 2018 Mohan Li, Min Liu, Masanori Hattori

In this paper, we present the Adaptive Computation Steps (ACS) algorithm, which enables end-to-end speech recognition models to dynamically decide how many frames should be processed to predict a linguistic output.

Decoder speech-recognition +1
