Search Results for author: Shilong Li

Found 13 papers, 6 papers with code

Bench2FreeAD: A Benchmark for Vision-based End-to-end Navigation in Unstructured Robotic Environments

1 code implementation · 15 Mar 2025 · Yuhang Peng, Sidong Wang, Jihaoyu Yang, Shilong Li, Han Wang, Jiangtao Gong

Thus, this paper presents the first dataset targeting E2E robot navigation tasks in unstructured scenarios, and provides a benchmark based on vision-based E2E autonomous driving algorithms to facilitate the development of E2E navigation technology for logistics and service robots.

Autonomous Driving, Robot Navigation

Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?

1 code implementation · 26 Feb 2025 · Yancheng He, Shilong Li, Jiaheng Liu, Weixun Wang, Xingyuan Bu, Ge Zhang, Zhongyuan Peng, Zhaoxiang Zhang, Zhicheng Zheng, Wenbo Su, Bo Zheng

In this paper, to understand the quality of these long CoTs and to measure the critique abilities of existing LLMs on them, we introduce DeltaBench, which includes long CoTs generated by different o1-like models (e.g., QwQ, DeepSeek-R1) for different reasoning tasks (e.g., Math, Code, General Reasoning), to measure the ability to detect errors in long CoT reasoning.

Math

AIR: Complex Instruction Generation via Automatic Iterative Refinement

1 code implementation · 25 Feb 2025 · Wei Liu, Yancheng He, Hui Huang, Chengwei Hu, Jiaheng Liu, Shilong Li, Wenbo Su, Bo Zheng

With the development of large language models, their ability to follow simple instructions has significantly improved.

"See the World, Discover Knowledge": A Chinese Factuality Evaluation for Large Vision Language Models

no code implementations · 17 Feb 2025 · Jihao Gu, Yingyao Wang, Pi Bu, Chen Wang, ZiMing Wang, Tengtao Song, Donglai Wei, Jiale Yuan, Yingxiu Zhao, Yancheng He, Shilong Li, Jiaheng Liu, Meng Cao, Jun Song, Yingshui Tan, Xiang Li, Wenbo Su, Zhicheng Zheng, Xiaoyong Zhu, Bo Zheng

The evaluation of factual accuracy in large vision language models (LVLMs) has lagged behind their rapid development, making it challenging to fully reflect these models' knowledge capacity and reliability.

Object Recognition, Question Answering (+1 more)

Token Preference Optimization with Self-Calibrated Visual-Anchored Rewards for Hallucination Mitigation

no code implementations · 19 Dec 2024 · Jihao Gu, Yingyao Wang, Meng Cao, Pi Bu, Jun Song, Yancheng He, Shilong Li, Bo Zheng

Direct Preference Optimization (DPO) has been demonstrated to be highly effective in mitigating hallucinations in Large Vision Language Models (LVLMs) by aligning their outputs more closely with human preferences.

Hallucination

2D-DPO: Scaling Direct Preference Optimization with 2-Dimensional Supervision

no code implementations · 25 Oct 2024 · Shilong Li, Yancheng He, Hui Huang, Xingyuan Bu, Jiaheng Liu, Hangyu Guo, Weixun Wang, Jihao Gu, Wenbo Su, Bo Zheng

Recent advancements in Direct Preference Optimization (DPO) have significantly enhanced the alignment of Large Language Models (LLMs) with human preferences, owing to its simplicity and effectiveness.

Fusion Makes Perfection: An Efficient Multi-Grained Matching Approach for Zero-Shot Relation Extraction

1 code implementation · 17 Jun 2024 · Shilong Li, Ge Bai, Zhang Zhang, Ying Liu, Chenji Lu, Daichi Guo, Ruifang Liu, Yong Sun

However, fine-grained matching often requires laborious manual annotation, and rich interactions between instances and label descriptions come with significant computational overhead.

Relation, Relation Extraction
