Search Results for author: Bowen Cao

Found 11 papers, 8 papers with code

InfiniteICL: Breaking the Limit of Context Window Size via Long Short-term Memory Transformation

no code implementations • 2 Apr 2025 • Bowen Cao, Deng Cai, Wai Lam

In-context learning (ICL) is critical for large language models (LLMs), but its effectiveness is constrained by finite context windows, particularly in ultra-long contexts.

In-Context Learning

Beyond Intermediate States: Explaining Visual Redundancy through Language

1 code implementation • 26 Mar 2025 • Dingchen Yang, Bowen Cao, Anran Zhang, Weibo Gu, Winston Hu, Guang Chen

Multi-modal Large Language Models (MLLMs) often process thousands of visual tokens, which consume a significant portion of the context window and impose a substantial computational burden.

On the Transformations across Reward Model, Parameter Update, and In-Context Prompt

no code implementations • 24 Jun 2024 • Deng Cai, Huayang Li, Tingchen Fu, Siheng Li, Weiwen Xu, Shuaiyi Li, Bowen Cao, Zhisong Zhang, Xinting Huang, Leyang Cui, Yan Wang, Lemao Liu, Taro Watanabe, Shuming Shi

Despite the general capabilities of pre-trained large language models (LLMs), they still need further adaptation to better serve practical applications.

On the Worst Prompt Performance of Large Language Models

2 code implementations • 8 Jun 2024 • Bowen Cao, Deng Cai, Zhisong Zhang, Yuexian Zou, Wai Lam

To address these limitations, we introduce RobustAlpacaEval, a new benchmark that consists of semantically equivalent case-level queries and emphasizes the importance of using the worst prompt performance to gauge the lower bound of model performance.

Prompt Engineering
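
To make the metric concrete: worst-prompt performance takes, for each case, the minimum score over a set of semantically equivalent prompt paraphrases, then averages those minima. A minimal Python sketch (the `score` callable and the toy paraphrases are hypothetical stand-ins, not the RobustAlpacaEval API):

```python
from typing import Callable, List

def worst_prompt_performance(paraphrase_sets: List[List[str]],
                             score: Callable[[str], float]) -> float:
    """For each case, keep the score of the worst-performing paraphrase,
    then average across cases: a lower bound on prompt robustness."""
    worst = [min(score(p) for p in prompts) for prompts in paraphrase_sets]
    return sum(worst) / len(worst)

# Toy usage: one case with three paraphrases of the same request.
scores = {"Summarize this.": 0.9, "Give a summary.": 0.6, "TL;DR?": 0.4}
print(worst_prompt_performance([list(scores)], scores.get))  # -> 0.4
```

Taking the minimum rather than the mean is what makes the benchmark gauge the lower bound of model performance across phrasings.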

Pensieve: Retrospect-then-Compare Mitigates Visual Hallucination

1 code implementation • 21 Mar 2024 • Dingchen Yang, Bowen Cao, Guang Chen, Changjun Jiang

Multi-modal Large Language Models (MLLMs) demonstrate remarkable success across various vision-language tasks.

Hallucination • MME • +1

Retrieval is Accurate Generation

1 code implementation • 27 Feb 2024 • Bowen Cao, Deng Cai, Leyang Cui, Xuxin Cheng, Wei Bi, Yuexian Zou, Shuming Shi

To address this, we propose to initialize the training oracles using linguistic heuristics and, more importantly, bootstrap the oracles through iterative self-reinforcement.

Language Modeling • Language Modelling • +2

ML-LMCL: Mutual Learning and Large-Margin Contrastive Learning for Improving ASR Robustness in Spoken Language Understanding

no code implementations • 19 Nov 2023 • Xuxin Cheng, Bowen Cao, Qichen Ye, Zhihong Zhu, Hongxiang Li, Yuexian Zou

Specifically, in fine-tuning, we apply mutual learning and train two SLU models on the manual transcripts and the ASR transcripts, respectively, aiming to iteratively share knowledge between these two models.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +4
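
For intuition, the mutual-learning part of this training can be sketched as the generic deep-mutual-learning update below (a minimal PyTorch sketch under an assumed two-classifier setup, one per transcript view; the paper's full ML-LMCL objective also adds large-margin contrastive terms, omitted here):

```python
import torch
import torch.nn.functional as F

def mutual_learning_step(model_a, model_b, manual_batch, asr_batch, labels,
                         opt_a, opt_b, alpha=0.5):
    """One update of two SLU models that teach each other: model_a sees
    manual transcripts, model_b sees ASR transcripts of the same utterances."""
    logits_a = model_a(manual_batch)
    logits_b = model_b(asr_batch)

    # Supervised loss for each model on its own view of the data.
    ce_a = F.cross_entropy(logits_a, labels)
    ce_b = F.cross_entropy(logits_b, labels)

    # Mutual-learning terms: each model matches the other's predictive
    # distribution; the peer is detached so only the student gets gradients.
    kl_a = F.kl_div(F.log_softmax(logits_a, dim=-1),
                    F.softmax(logits_b.detach(), dim=-1), reduction="batchmean")
    kl_b = F.kl_div(F.log_softmax(logits_b, dim=-1),
                    F.softmax(logits_a.detach(), dim=-1), reduction="batchmean")

    opt_a.zero_grad(); (ce_a + alpha * kl_a).backward(); opt_a.step()
    opt_b.zero_grad(); (ce_b + alpha * kl_b).backward(); opt_b.step()

# Toy usage with linear classifiers over random features.
model_a, model_b = torch.nn.Linear(16, 4), torch.nn.Linear(16, 4)
opt_a = torch.optim.Adam(model_a.parameters())
opt_b = torch.optim.Adam(model_b.parameters())
mutual_learning_step(model_a, model_b, torch.randn(8, 16), torch.randn(8, 16),
                     torch.randint(0, 4, (8,)), opt_a, opt_b)
```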

Alleviating Over-smoothing for Unsupervised Sentence Representation

1 code implementation • 9 May 2023 • Nuo Chen, Linjun Shou, Ming Gong, Jian Pei, Bowen Cao, Jianhui Chang, Daxin Jiang, Jia Li

Learning better unsupervised sentence representations is a major pursuit in the natural language processing community.

Contrastive Learning • Semantic Textual Similarity • +1

FTM: A Frame-level Timeline Modeling Method for Temporal Graph Representation Learning

1 code implementation • 23 Feb 2023 • Bowen Cao, Qichen Ye, Weiyuan Xu, Yuexian Zou

Existing neighborhood aggregation strategies fail to capture either the short-term or the long-term features of temporal graph attributes, leading to unsatisfactory model performance and even poor robustness and domain generality of the representation learning method.

Graph Representation Learning

FiTs: Fine-grained Two-stage Training for Knowledge-aware Question Answering

1 code implementation • 23 Feb 2023 • Qichen Ye, Bowen Cao, Nuo Chen, Weiyuan Xu, Yuexian Zou

Despite the promising result of recent KAQA systems which tend to integrate linguistic knowledge from pre-trained language models (PLM) and factual knowledge from knowledge graphs (KG) to answer complex questions, a bottleneck exists in effectively fusing the representations from PLMs and KGs because of (i) the semantic and distributional gaps between them, and (ii) the difficulties in joint reasoning over the provided knowledge from both modalities.

Knowledge Graphs • MedQA • +2

Bias-based Universal Adversarial Patch Attack for Automatic Check-out

1 code implementation • ECCV 2020 • Aishan Liu, Jiakai Wang, Xianglong Liu, Bowen Cao, Chongzhi Zhang, Hang Yu

To address the problem, this paper proposes a bias-based framework to generate class-agnostic universal adversarial patches with strong generalization ability, which exploits both the perceptual and semantic bias of models.
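
For orientation, the generic universal-patch attack this builds on can be sketched as follows (a hedged sketch, not the paper's bias-based framework; `model` and `loader` are assumed stand-ins, and the patch is pasted at a fixed corner for brevity):

```python
import torch
import torch.nn.functional as F

def train_universal_patch(model, loader, patch_size=32, steps=1000, lr=0.01):
    """Optimize a single patch that degrades predictions on every input."""
    patch = torch.rand(3, patch_size, patch_size, requires_grad=True)
    opt = torch.optim.Adam([patch], lr=lr)
    for _, (images, labels) in zip(range(steps), loader):
        patched = images.clone()
        # Paste the same patch onto every image (fixed corner here; random
        # placement is common in practice for robustness).
        patched[:, :, :patch_size, :patch_size] = patch.clamp(0, 1)
        # Maximizing the loss on the true labels pushes one patch to
        # fool the model regardless of class.
        loss = -F.cross_entropy(model(patched), labels)
        opt.zero_grad(); loss.backward(); opt.step()
    return patch.detach().clamp(0, 1)
```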
