1 code implementation • 25 May 2025 • Wang Yang, Hongye Jin, Shaochen Zhong, Song Jiang, Qifan Wang, Vipin Chaudhary, Xiaotian Han
Long-context capability is considered one of the most important abilities of LLMs, as a truly long-context-capable LLM lets users effortlessly handle many otherwise exhausting tasks -- e.g., digesting a long-form document to find answers versus simply asking the LLM about it directly.
1 code implementation • 22 May 2025 • Phat Thanh Dang, Saahil Thoppay, Wang Yang, Qifan Wang, Vipin Chaudhary, Xiaotian Han
Large language models suffer degraded performance when operating on contexts longer than their training context length, due to the standard position encoding applied to tokens in the attention layer.
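A minimal sketch of the failure mode mentioned above, not the paper's method: standard sinusoidal position encodings are well defined for any position, but the attention layers were only ever trained on positions below the training context length, so longer inputs put them out of distribution. The train_len value and dimensions below are illustrative assumptions.

import numpy as np

def sinusoidal_pe(position, d_model=64):
    """Standard sinusoidal position encoding for a single position."""
    i = np.arange(d_model // 2)
    angles = position / np.power(10000.0, 2 * i / d_model)
    return np.concatenate([np.sin(angles), np.cos(angles)])

train_len = 4096
pe_seen = sinusoidal_pe(train_len - 1)    # position inside the training range
pe_unseen = sinusoidal_pe(4 * train_len)  # position far beyond the training range

# Both encodings are finite and well formed; the problem is that attention
# weights were never fit to positions like 4 * train_len, so behaviour there
# is untrained -- the issue the abstract refers to.
print(np.linalg.norm(pe_seen - pe_unseen))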
no code implementations • 22 May 2025 • Wang Yang, Zirui Liu, Hongye Jin, Qingyu Yin, Vipin Chaudhary, Xiaotian Han
In this work, we hypothesize that current limitations in reasoning stem, in part, from insufficient long-context capacity, motivated by empirical observations such as (1) a longer context window often leads to stronger reasoning performance, and (2) failed reasoning cases resemble failed long-context cases.
1 code implementation • 12 Apr 2025 • Wang Yang, Xiang Yue, Vipin Chaudhary, Xiaotian Han
Moreover, when applied to a non-reasoning model (Qwen-2.5-7B-Instruct), our framework boosts its accuracy from 74.0% to 81.8% on the same benchmark, an improvement of 7.8 percentage points.
1 code implementation • 17 Feb 2025 • Wang Yang, Hongye Jin, Jingfeng Yang, Vipin Chaudhary, Xiaotian Han
To further boost performance with the SFT data, we propose Thinking Preference Optimization (ThinkPO), a simple yet effective post-SFT method that enhances long CoT reasoning without requiring new long CoT responses.
no code implementations • 21 Apr 2023 • Wei Zhiwei, Xiao Yi, Tong Ying, Xu Wenjia, Wang Yang
First, we use a property graph to express the spatial relations of proximity, similarity, and linear arrangement between buildings; second, the rules of linear pattern recognition are expressed as knowledge graph reasoning rules; finally, linear building patterns are recognized by rule-based reasoning over the constructed knowledge graph.
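A hedged sketch of such a pipeline, not the paper's actual schema or rules: buildings become nodes in a property graph, spatial relations become edges, and a simple rule over aligned proximity edges flags candidate linear patterns. The attribute names and the rule itself are illustrative assumptions.

import networkx as nx

g = nx.Graph()
# Nodes are buildings with illustrative geometric attributes.
g.add_node("b1", centroid=(0.0, 0.0))
g.add_node("b2", centroid=(10.0, 0.5))
g.add_node("b3", centroid=(20.0, 1.0))

# Edges carry spatial relations (proximity / similarity / alignment).
g.add_edge("b1", "b2", relation="proximity", aligned=True)
g.add_edge("b2", "b3", relation="proximity", aligned=True)

def linear_pattern_rule(graph):
    """Toy rule: a chain of three or more buildings connected by aligned
    proximity edges is a candidate linear pattern."""
    aligned = [(u, v) for u, v, d in graph.edges(data=True)
               if d.get("relation") == "proximity" and d.get("aligned")]
    sub = graph.edge_subgraph(aligned)
    return [sorted(c) for c in nx.connected_components(sub) if len(c) >= 3]

print(linear_pattern_rule(g))  # [['b1', 'b2', 'b3']]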
1 code implementation • 6 Mar 2023 • Candi Zheng, Wang Yang, Shiyi Chen
This paper aims to stabilize MEM, making it possible to simulate very strong normal shock waves on modern GPUs at single precision.
2 code implementations • 13 Jan 2020 • Dou Goodman, Hao Xin, Wang Yang, Wu Yuesheng, Xiong Junfeng, Zhang Huan
In recent years, neural networks have been extensively deployed for computer vision tasks, particularly visual classification problems, where new algorithms are reported to achieve or even surpass human performance.