Search Results for author: Zhihao Fan

Found 31 papers, 21 papers with code

Activating Distributed Visual Region within LLMs for Efficient and Effective Vision-Language Training and Inference

no code implementations 17 Dec 2024 Siyuan Wang, Dianyi Wang, Chengxing Zhou, Zejun Li, Zhihao Fan, Xuanjing Huang, Zhongyu Wei

Large Vision-Language Models (LVLMs) typically learn visual capacity through visual instruction tuning, involving updates to both a projector and their LLM backbones.

RSL-SQL: Robust Schema Linking in Text-to-SQL Generation

1 code implementation 31 Oct 2024 Zhenbiao Cao, Yuanlei Zheng, Zhihao Fan, Xiaojin Zhang, Wei Chen, Xiang Bai

Text-to-SQL generation aims to translate natural language questions into SQL statements.

Text-To-SQL

MC-CoT: A Modular Collaborative CoT Framework for Zero-shot Medical-VQA with LLM and MLLM Integration

1 code implementation 6 Oct 2024 Lai Wei, Wenkai Wang, Xiaoyu Shen, Yu Xie, Zhihao Fan, Xiaojin Zhang, Zhongyu Wei, Wei Chen

In recent advancements, multimodal large language models (MLLMs) have been fine-tuned on specific medical image datasets to address medical visual question answering (Med-VQA) tasks.

Medical Visual Question Answering, Question Answering, +1 more

From LLMs to MLLMs: Exploring the Landscape of Multimodal Jailbreaking

no code implementations 21 Jun 2024 Siyuan Wang, Zhuohan Long, Zhihao Fan, Zhongyu Wei

The rapid development of Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) has exposed vulnerabilities to various adversarial attacks.

DELAN: Dual-Level Alignment for Vision-and-Language Navigation by Cross-Modal Contrastive Learning

1 code implementation 2 Apr 2024 Mengfei Du, Binhao Wu, Jiwen Zhang, Zhihao Fan, Zejun Li, Ruipu Luo, Xuanjing Huang, Zhongyu Wei

For task completion, the agent needs to align and integrate various navigation modalities, including instruction, observation and navigation history.

Contrastive Learning, Decision Making, +2 more

Benchmark Self-Evolving: A Multi-Agent Framework for Dynamic LLM Evaluation

1 code implementation 18 Feb 2024 Siyuan Wang, Zhuohan Long, Zhihao Fan, Zhongyu Wei, Xuanjing Huang

Towards a more scalable, robust and fine-grained evaluation, we implement six reframing operations to construct evolving instances testing LLMs against diverse queries, data noise and probing their problem-solving sub-abilities.

Model Selection

AI Hospital: Benchmarking Large Language Models in a Multi-agent Medical Interaction Simulator

1 code implementation 15 Feb 2024 Zhihao Fan, Jialong Tang, Wei Chen, Siyuan Wang, Zhongyu Wei, Jun Xi, Fei Huang, Jingren Zhou

Artificial intelligence has significantly advanced healthcare, particularly through large language models (LLMs) that excel in medical question answering benchmarks.

Benchmarking, Question Answering

Query Structure Modeling for Inductive Logical Reasoning Over Knowledge Graphs

1 code implementation 23 May 2023 Siyuan Wang, Zhongyu Wei, Meng Han, Zhihao Fan, Haijun Shan, Qi Zhang, Xuanjing Huang

The results demonstrate the effectiveness of our method on logical reasoning over KGs in both inductive and transductive settings.

Knowledge Graphs, Logical Reasoning

Unifying Structure Reasoning and Language Model Pre-training for Complex Reasoning

no code implementations 21 Jan 2023 Siyuan Wang, Zhongyu Wei, Jiarong Xu, Taishan Li, Zhihao Fan

Recent pre-trained language models (PLMs) equipped with foundation reasoning skills have shown remarkable performance on downstream complex tasks.

Language Modeling, +1 more

Text Generation with Diffusion Language Models: A Pre-training Approach with Continuous Paragraph Denoise

1 code implementation 22 Dec 2022 Zhenghao Lin, Yeyun Gong, Yelong Shen, Tong Wu, Zhihao Fan, Chen Lin, Nan Duan, Weizhu Chen

In this paper, we introduce a novel dIffusion language modEl pre-training framework for text generation, which we call GENIE.

Decoder, Denoising, +3 more

Locate Then Ask: Interpretable Stepwise Reasoning for Multi-hop Question Answering

1 code implementation COLING 2022 Siyuan Wang, Zhongyu Wei, Zhihao Fan, Qi Zhang, Xuanjing Huang

In this paper, we propose an interpretable stepwise reasoning framework to incorporate both single-hop supporting sentence identification and single-hop question generation at each intermediate step, and utilize the inference of the current hop for the next until reasoning out the final result.

Multi-hop Question Answering, Question Answering, +3 more

Negative Sample is Negative in Its Own Way: Tailoring Negative Sentences for Image-Text Retrieval

1 code implementation Findings (NAACL) 2022 Zhihao Fan, Zhongyu Wei, Zejun Li, Siyuan Wang, Jianqing Fan

We propose our TAiloring neGative Sentences with Discrimination and Correction (TAGS-DC) to generate synthetic sentences automatically as negative samples.

Image-text Retrieval, Sentence, +2 more

Constructing Phrase-level Semantic Labels to Form Multi-Grained Supervision for Image-Text Retrieval

no code implementations 12 Sep 2021 Zhihao Fan, Zhongyu Wei, Zejun Li, Siyuan Wang, Haijun Shan, Xuanjing Huang, Jianqing Fan

Existing research for image text retrieval mainly relies on sentence-level supervision to distinguish matched and mismatched sentences for a query image.

Image-text Retrieval, Representation Learning, +2 more

TCIC: Theme Concepts Learning Cross Language and Vision for Image Captioning

no code implementations 21 Jun 2021 Zhihao Fan, Zhongyu Wei, Siyuan Wang, Ruize Wang, Zejun Li, Haijun Shan, Xuanjing Huang

Considering that theme concepts can be learned from both images and captions, we propose two settings for learning their representations based on TTN.

Decoder, Image Captioning, +1 more

Bridging by Word: Image Grounded Vocabulary Construction for Visual Captioning

1 code implementation ACL 2019 Zhihao Fan, Zhongyu Wei, Siyuan Wang, Xuanjing Huang

Existing research usually employs the architecture of CNN-RNN that views the generation as a sequential decision-making process and the entire dataset vocabulary is used as decoding space.

Decision Making, Image Captioning, +1 more
