Search Results for author: Xiaoran Liu

Found 16 papers, 8 papers with code

LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs

1 code implementation • 17 Jun 2025 • Xiaoran Liu, Zhigeng Liu, Zengfeng Huang, Qipeng Guo, Ziwei He, Xipeng Qiu

Large Language Diffusion Models, or diffusion LLMs, have emerged as a significant focus in NLP research, with substantial effort directed toward understanding their scalability and downstream task performance.

Beyond Homogeneous Attention: Memory-Efficient LLMs via Fourier-Approximated KV Cache

no code implementations • 13 Jun 2025 • Xiaoran Liu, Siyang He, Qiqi Wang, Ruixiao Li, Yuerong Song, Zhigeng Liu, Linlin Li, Qun Liu, Zengfeng Huang, Qipeng Guo, Ziwei He, Xipeng Qiu

Large Language Models struggle with memory demands from the growing Key-Value (KV) cache as context lengths increase.
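
To make the memory pressure concrete, here is a back-of-the-envelope sketch (not from the paper) of how the KV cache grows with context length; the layer count, head count, head dimension, and fp16 precision are illustrative assumptions based on a Llama-2-7B-like configuration:

```python
# Back-of-the-envelope KV-cache size for an assumed Llama-2-7B-like config
# (32 layers, 32 KV heads, head_dim 128, fp16). Illustrates the memory growth
# that motivates KV-cache compression; it is not the paper's method.

def kv_cache_bytes(seq_len, n_layers=32, n_kv_heads=32, head_dim=128, dtype_bytes=2):
    # 2x for keys and values; one entry per layer, head, and position.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * dtype_bytes

for seq_len in (4_096, 32_768, 131_072):
    gib = kv_cache_bytes(seq_len) / 2**30
    print(f"{seq_len:>7} tokens -> {gib:6.1f} GiB per sequence")
# 4K tokens fit in ~2 GiB, but 128K tokens need ~64 GiB: at long context
# lengths the cache, not the model weights, dominates memory.
```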

AI Simulation by Digital Twins: Systematic Survey, Reference Framework, and Mapping to a Standardized Architecture

no code implementations • 6 Jun 2025 • Xiaoran Liu, Istvan David

Based on our findings, we derive a reference framework and provide architectural guidelines by mapping it onto the ISO 23247 reference architecture for digital twins.

Thus Spake Long-Context Large Language Model

no code implementations • 24 Feb 2025 • Xiaoran Liu, Ruixiao Li, Mianqiu Huang, Zhigeng Liu, Yuerong Song, Qipeng Guo, Siyang He, Qiqi Wang, Linlin Li, Qun Liu, Yaqian Zhou, Xuanjing Huang, Xipeng Qiu

Moreover, the research on long-context LLMs has expanded from length extrapolation to a comprehensive focus on architecture, infrastructure, training, and evaluation technologies.

Tasks: Language Modeling, Language Modelling, +4 more

Capturing Human Cognitive Styles with Language: Towards an Experimental Evaluation Paradigm

no code implementations • 18 Feb 2025 • Vasudha Varadarajan, Syeda Mahwish, Xiaoran Liu, Julia Buffolino, Christian C. Luhmann, Ryan L. Boyd, H. Andrew Schwartz

While NLP models often seek to capture cognitive states via language, the validity of predicted states is determined by comparing them to annotations created without access to the cognitive states of the authors.

Tasks: Decision Making

VideoRoPE: What Makes for Good Video Rotary Position Embedding?

1 code implementation • 7 Feb 2025 • Xilin Wei, Xiaoran Liu, Yuhang Zang, Xiaoyi Dong, Pan Zhang, Yuhang Cao, Jian Tong, Haodong Duan, Qipeng Guo, Jiaqi Wang, Xipeng Qiu, Dahua Lin

As part of our analysis, we introduce a challenging V-NIAH-D (Visual Needle-In-A-Haystack with Distractors) task, which adds periodic distractors into V-NIAH.

Tasks: Hallucination, Position, +2 more
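
As a rough illustration of the task construction sketched above, the snippet below builds a needle-in-a-haystack frame sequence with distractors inserted at a fixed period; the frame representation, period, and function name are hypothetical and not taken from the paper:

```python
# Hypothetical V-NIAH-D-style sample builder: hide one "needle" frame in a long
# "haystack" and insert look-alike distractor frames at a fixed period, so that
# retrieval cannot succeed by uniqueness alone. All values are illustrative.
import random

def build_vniah_d(haystack_frames, needle_frame, distractor_frame, period=100):
    frames = list(haystack_frames)
    # Periodic distractors, one every `period` frames.
    for i in range(period, len(frames), period):
        frames[i] = distractor_frame
    # Place the true needle at a random position not occupied by a distractor.
    needle_pos = random.choice([i for i in range(len(frames)) if i % period != 0])
    frames[needle_pos] = needle_frame
    return frames, needle_pos

frames, pos = build_vniah_d([f"frame_{i}" for i in range(1_000)],
                            "needle", "distractor", period=100)
print(pos, frames[pos])
```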

LongSafetyBench: Long-Context LLMs Struggle with Safety Issues

1 code implementation • 11 Nov 2024 • Mianqiu Huang, Xiaoran Liu, Shaojun Zhou, Mozhi Zhang, Chenkun Tan, Pengyu Wang, Qipeng Guo, Zhe Xu, Linyang Li, Zhikai Lei, Linlin Li, Qun Liu, Yaqian Zhou, Xipeng Qiu, Xuanjing Huang

With the development of large language models (LLMs), the sequence length of these models continues to increase, drawing significant attention to long-context language models.

DetectiveQA: Evaluating Long-Context Reasoning on Detective Novels

no code implementations • 4 Sep 2024 • Zhe Xu, Jiasheng Ye, Xiangyang Liu, Tianxiang Sun, Xiaoran Liu, Qipeng Guo, Linlin Li, Qun Liu, Xuanjing Huang, Xipeng Qiu

DetectiveQA focuses on evaluating the long-context reasoning ability of LLMs, which requires not only a full understanding of the context but also extracting important evidence from it and reasoning over that evidence to answer the given questions.

ReAttention: Training-Free Infinite Context with Finite Attention Scope

no code implementations • 21 Jul 2024 • Xiaoran Liu, Ruixiao Li, Qipeng Guo, Zhigeng Liu, Yuerong Song, Kai Lv, Hang Yan, Linlin Li, Qun Liu, Xipeng Qiu

The long-context capability of Large Language Models (LLMs) has seen significant breakthroughs, but the maximum supported context length remains a critical bottleneck limiting their practical applications.

Tasks: Language Modelling, Large Language Model, +1 more

InternLM2 Technical Report

3 code implementations • 26 Mar 2024 • Zheng Cai, Maosong Cao, Haojiong Chen, Kai Chen, Keyu Chen, Xin Chen, Xun Chen, Zehui Chen, Zhi Chen, Pei Chu, Xiaoyi Dong, Haodong Duan, Qi Fan, Zhaoye Fei, Yang Gao, Jiaye Ge, Chenya Gu, Yuzhe Gu, Tao Gui, Aijia Guo, Qipeng Guo, Conghui He, Yingfan Hu, Ting Huang, Tao Jiang, Penglong Jiao, Zhenjiang Jin, Zhikai Lei, Jiaxing Li, Jingwen Li, Linyang Li, Shuaibin Li, Wei Li, Yining Li, Hongwei Liu, Jiangning Liu, Jiawei Hong, Kaiwen Liu, Kuikun Liu, Xiaoran Liu, Chengqi Lv, Haijun Lv, Kai Lv, Li Ma, Runyuan Ma, Zerun Ma, Wenchang Ning, Linke Ouyang, Jiantao Qiu, Yuan Qu, FuKai Shang, Yunfan Shao, Demin Song, Zifan Song, Zhihao Sui, Peng Sun, Yu Sun, Huanze Tang, Bin Wang, Guoteng Wang, Jiaqi Wang, Jiayu Wang, Rui Wang, Yudong Wang, Ziyi Wang, Xingjian Wei, Qizhen Weng, Fan Wu, Yingtong Xiong, Chao Xu, Ruiliang Xu, Hang Yan, Yirong Yan, Xiaogui Yang, Haochen Ye, Huaiyuan Ying, JIA YU, Jing Yu, Yuhang Zang, Chuyu Zhang, Li Zhang, Pan Zhang, Peng Zhang, Ruijie Zhang, Shuo Zhang, Songyang Zhang, Wenjian Zhang, Wenwei Zhang, Xingcheng Zhang, Xinyue Zhang, Hui Zhao, Qian Zhao, Xiaomeng Zhao, Fengzhe Zhou, Zaida Zhou, Jingming Zhuo, Yicheng Zou, Xipeng Qiu, Yu Qiao, Dahua Lin

The evolution of Large Language Models (LLMs) like ChatGPT and GPT-4 has sparked discussions on the advent of Artificial General Intelligence (AGI).

Tasks: 4k, Long-Context Understanding

LongWanjuan: Towards Systematic Measurement for Long Text Quality

1 code implementation • 21 Feb 2024 • Kai Lv, Xiaoran Liu, Qipeng Guo, Hang Yan, Conghui He, Xipeng Qiu, Dahua Lin

The quality of training data is crucial for enhancing the long-text capabilities of foundation models.

Tasks: Diversity, Language Modelling

CoLLiE: Collaborative Training of Large Language Models in an Efficient Way

1 code implementation • 1 Dec 2023 • Kai Lv, Shuo Zhang, Tianle Gu, Shuhao Xing, Jiawei Hong, Keyu Chen, Xiaoran Liu, Yuqing Yang, Honglin Guo, Tengxiao Liu, Yu Sun, Qipeng Guo, Hang Yan, Xipeng Qiu

This paper introduces CoLLiE, an efficient library that facilitates collaborative training of large language models using 3D parallelism, parameter-efficient fine-tuning (PEFT) methods, and optimizers such as Lion, Adan, Sophia, LOMO and AdaLomo.

Tasks: parameter-efficient fine-tuning

Scaling Laws of RoPE-based Extrapolation

1 code implementation • 8 Oct 2023 • Xiaoran Liu, Hang Yan, Shuo Zhang, Chenxin An, Xipeng Qiu, Dahua Lin

The extrapolation capability of Large Language Models (LLMs) based on Rotary Position Embedding is currently a topic of considerable interest.

Tasks: 16k
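
For context, the sketch below is a minimal NumPy implementation of standard RoPE, the mechanism whose extrapolation behavior the paper analyzes; it does not reproduce the paper's base-scaling analysis:

```python
# Minimal NumPy sketch of standard Rotary Position Embedding (RoPE): every pair
# of dimensions is rotated by a position-dependent angle. Shown only to recall
# the base mechanism; the paper's scaling laws are not reproduced here.
import numpy as np

def rope(x, positions, base=10000.0):
    # x: (seq_len, dim) with even dim; positions: (seq_len,) integer positions.
    dim = x.shape[-1]
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)   # (dim/2,)
    angles = positions[:, None] * inv_freq[None, :]    # (seq_len, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

q_rot = rope(np.random.randn(8, 64), np.arange(8))
# Dot products between rotated queries and keys depend only on relative
# position, the property that extrapolation methods try to preserve beyond
# the training length.
```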

Transfer and Active Learning for Dissonance Detection: Addressing the Rare-Class Challenge

1 code implementation • 3 May 2023 • Vasudha Varadarajan, Swanie Juhng, Syeda Mahwish, Xiaoran Liu, Jonah Luby, Christian Luhmann, H. Andrew Schwartz

While transformer-based systems have enabled greater accuracies with fewer training examples, data acquisition obstacles still persist for rare-class tasks -- when the class label is very infrequent (e.g., < 5% of samples).

Tasks: Active Learning, Implicit Discourse Relation Classification, +1 more

Numerology Selection for OFDM Systems Based on Deep Neural Networks

no code implementations • 9 Nov 2020 • Xiaoran Liu, Jiao Zhang, Jibo Wei

In this letter, we propose a deep neural network (DNN) approach for numerology selection in OFDM systems.

Anomaly Detection in a Digital Video Broadcasting System Using Timed Automata

no code implementations • 24 May 2017 • Xiaoran Liu, Qin Lin, Sicco Verwer, Dmitri Jarnikov

This paper focuses on detecting anomalies in a digital video broadcasting (DVB) system from providers' perspective.

Tasks: Anomaly Detection, One-class classifier
