no code implementations • 18 Feb 2025 • Bingning Wang, Haizhou Zhao, Huozhi Zhou, Liang Song, Mingyu Xu, Wei Cheng, Xiangrong Zeng, Yupeng Zhang, Yuqi Huo, Zecheng Wang, Zhengyun Zhao, Da Pan, Fei Kou, Fei Li, Fuzhong Chen, Guosheng Dong, Han Liu, Hongda Zhang, Jin He, Jinjie Yang, Kangxi Wu, Kegeng Wu, Lei Su, Linlin Niu, Linzhuang Sun, Mang Wang, Pengcheng Fan, Qianli Shen, Rihui Xin, Shunya Dang, Songchi Zhou, WeiPeng Chen, Wenjing Luo, Xin Chen, Xin Men, Xionghai Lin, Xuezhen Dong, Yan Zhang, Yifei Duan, Yuyan Zhou, Zhi Ma, Zhiying Wu
The current generation of large language models (LLMs) is typically designed for broad, general-purpose applications, while domain-specific LLMs, especially in vertical fields like medicine, remain relatively scarce.
no code implementations • 11 Feb 2025 • Zican Dong, Junyi Li, Jinhao Jiang, Mingyu Xu, Wayne Xin Zhao, Bingning Wang, WeiPeng Chen
To address these challenges, we propose Long Context Pre-training with Restoration Distillation (LongReD), a novel approach designed to mitigate short-text performance degradation by minimizing the distribution discrepancy between the extended and original models.
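As a sketch of the idea (assuming HuggingFace-style causal LMs; the function name, batch layout, and weighting below are illustrative, not the paper's exact formulation), restoration distillation can be read as a standard long-context LM loss plus a KL term that pulls the extended model's short-text output distribution back toward the frozen original model's:

```python
# Hedged sketch of restoration distillation for context-extended LLMs.
# Names (extended_model, original_model, lambda_distill) are assumptions.
import torch
import torch.nn.functional as F

def longred_step(extended_model, original_model, long_batch, short_batch,
                 lambda_distill=1.0):
    # 1) Standard next-token loss on long-context data trains length extension.
    long_out = extended_model(**long_batch, labels=long_batch["input_ids"])
    lm_loss = long_out.loss

    # 2) Distillation on short texts: match the extended model's output
    #    distribution to the frozen original model's, restoring short-text skill.
    with torch.no_grad():
        teacher_logits = original_model(**short_batch).logits
    student_logits = extended_model(**short_batch).logits
    kl = F.kl_div(
        F.log_softmax(student_logits, dim=-1),
        F.softmax(teacher_logits, dim=-1),
        reduction="batchmean",
    )
    return lm_loss + lambda_distill * kl
```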
2 code implementations • 3 Jan 2025 • Yifan Du, Zikang Liu, YiFan Li, Wayne Xin Zhao, Yuqi Huo, Bingning Wang, WeiPeng Chen, Zheng Liu, Zhongyuan Wang, Ji-Rong Wen
Moreover, it seems that such textual reasoning data can be even more effective than visual reasoning data in eliciting the slow-thinking capacities of MLLMs.
1 code implementation • 29 Nov 2024 • Mingyu Xu, Wei Cheng, Bingning Wang, WeiPeng Chen
To implement the model's induction ability more efficiently, we revisit the induction heads mechanism and propose KV shifting attention.
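The excerpt names the mechanism without its form; one plausible reading (a sketch, not necessarily the paper's exact parameterization) lets keys and values be learnable mixtures of the current and previous token's projections, so a single attention layer can match a token's key against its successor's value, the core operation of an induction head:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class KVShiftAttention(nn.Module):
    """Single-head causal attention with shifted key/value mixing.
    A sketch of the KV-shifting idea; parameter names are illustrative."""
    def __init__(self, d_model):
        super().__init__()
        self.q, self.k, self.v = (nn.Linear(d_model, d_model) for _ in range(3))
        # Learnable mixing weights between each token's own K/V and its
        # predecessor's, letting one layer emulate an induction head.
        self.alpha = nn.Parameter(torch.tensor([1.0, 0.0]))  # for keys
        self.beta = nn.Parameter(torch.tensor([1.0, 0.0]))   # for values

    def forward(self, x):                      # x: (batch, seq, d_model)
        q, k, v = self.q(x), self.k(x), self.v(x)
        # shift(t)[i] = t[i-1], with zeros at position 0.
        shift = lambda t: F.pad(t, (0, 0, 1, 0))[:, :-1]
        k = self.alpha[0] * k + self.alpha[1] * shift(k)
        v = self.beta[0] * v + self.beta[1] * shift(v)
        scores = q @ k.transpose(-2, -1) / (x.size(-1) ** 0.5)
        mask = torch.triu(torch.ones_like(scores), diagonal=1).bool()
        return scores.masked_fill(mask, float("-inf")).softmax(-1) @ v
```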
no code implementations • 16 Nov 2024 • Yao Xu, Shizhu He, Zeng Xiangrong, Jiabei Chen, Guang Liu, Bingning Wang, Jun Zhao, Kang Liu
Specifically, we represent various types of structured data in a unified hypergraph format, use self-supervised learning to pretrain a hypergraph encoder, and employ a G-Former that compresses the encoded hypergraph representations with cross-attention.
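A minimal sketch of the compression step, in the spirit of Q-Former-style designs (the class name, dimensions, and head count are assumptions): a fixed set of learnable query tokens cross-attends to the hypergraph encoder's outputs, producing a short sequence a language model can consume.

```python
import torch
import torch.nn as nn

class GFormerCompressor(nn.Module):
    """Compress variable-length hypergraph node embeddings into a fixed
    number of slots via cross-attention (an illustrative sketch)."""
    def __init__(self, d_model=768, num_queries=32, num_heads=8):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(num_queries, d_model))
        self.cross_attn = nn.MultiheadAttention(d_model, num_heads,
                                                batch_first=True)

    def forward(self, hypergraph_repr):          # (B, num_nodes, d_model)
        b = hypergraph_repr.size(0)
        q = self.queries.unsqueeze(0).expand(b, -1, -1)
        compressed, _ = self.cross_attn(q, hypergraph_repr, hypergraph_repr)
        return compressed                         # (B, num_queries, d_model)
```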
1 code implementation • 21 Oct 2024 • Han Huang, Yuqi Huo, Zijia Zhao, Haoyu Lu, Shu Wu, Bingning Wang, Qiang Liu, WeiPeng Chen, Liang Wang
A critical factor in training MLLMs is the quality of image-text pairs within multimodal pretraining datasets.
1 code implementation • 17 Oct 2024 • Yifan Du, Yuqi Huo, Kun Zhou, Zijia Zhao, Haoyu Lu, Han Huang, Wayne Xin Zhao, Bingning Wang, WeiPeng Chen, Ji-Rong Wen
Then, we explore the scaling effects of frame selection and token selection respectively, and fit the corresponding scaling curves through extensive empirical experiments.
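To illustrate what fitting such a curve involves (the functional form and every number below are hypothetical placeholders, not the paper's results), one can fit a saturating power law to score-versus-frame-count measurements:

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical measurements: benchmark score vs. number of sampled frames.
frames = np.array([4, 8, 16, 32, 64, 128])
scores = np.array([41.2, 46.8, 50.9, 53.6, 55.1, 55.9])

def saturating_power_law(n, a, b, c):
    # score(n) = c - a * n^(-b): rises with n and saturates toward c.
    return c - a * n ** (-b)

params, _ = curve_fit(saturating_power_law, frames, scores, p0=[20, 0.5, 60])
a, b, c = params
print(f"fit: score(n) = {c:.1f} - {a:.1f} * n^(-{b:.2f})")
```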
no code implementations • 10 Oct 2024 • Zhipeng Chen, Liang Song, Kun Zhou, Wayne Xin Zhao, Bingning Wang, WeiPeng Chen, Ji-Rong Wen
In the extraction stage, we first locate key neurons that are highly related to specific abilities, and then employ them to extract the transferable ability-specific weights.
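A hedged sketch of the two extraction steps (the scoring criterion and all names are illustrative; the paper's exact method may differ): score neurons by how much more active they are on ability-specific data than on general data, then slice out the weights that feed and read the top-scoring neurons.

```python
import torch

def locate_key_neurons(acts_ability, acts_general, top_frac=0.05):
    """Score each hidden neuron by how much more active it is on
    ability-specific data than on general data (an illustrative criterion).
    acts_*: (num_tokens, hidden_dim) activation matrices."""
    score = acts_ability.abs().mean(0) - acts_general.abs().mean(0)
    k = max(1, int(top_frac * score.numel()))
    return score.topk(k).indices

def extract_ability_weights(ffn_up, ffn_down, neuron_idx):
    """Slice out the weight rows producing the key neurons and the columns
    consuming them, giving a small transferable 'ability-specific' piece."""
    return {
        "up": ffn_up[neuron_idx, :].clone(),     # (k, d_model)
        "down": ffn_down[:, neuron_idx].clone()  # (d_model, k)
    }
```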
no code implementations • 27 Aug 2024 • Guosheng Dong, Da Pan, Yiding Sun, Shusen Zhang, Zheng Liang, Xin Wu, Yanjun Shen, Fan Yang, Haoze Sun, Tianpeng Li, MingAn Lin, Jianhua Xu, Yufan Zhang, Xiaonan Nie, Lei Su, Bingning Wang, Wentao Zhang, Jiaxin Mao, Zenan Zhou, WeiPeng Chen
The general capabilities of Large Language Models (LLMs) rely heavily on the composition and selection of extensive pretraining datasets, which several institutions treat as commercial secrets.
1 code implementation • 20 Jun 2024 • Yifan Du, Kun Zhou, Yuqi Huo, YiFan Li, Wayne Xin Zhao, Haoyu Lu, Zijia Zhao, Bingning Wang, WeiPeng Chen, Ji-Rong Wen
Leveraging an effective instruction synthesis method and an adaptive model architecture, VIM surpasses both state-of-the-art open-source models and GPT-4V on the Event-Bench.
no code implementations • 18 Jun 2024 • Jie Chen, Yupeng Zhang, Bingning Wang, Wayne Xin Zhao, Ji-Rong Wen, WeiPeng Chen
Synthetic data has been proposed as a solution to address the issue of high-quality data scarcity in the training of large language models (LLMs).
no code implementations • 17 Jun 2024 • Han Liu, Yupeng Zhang, Bingning Wang, WeiPeng Chen, Xiaolin Hu
Deep Neural Networks (DNNs) excel in various domains but face challenges in providing accurate uncertainty estimates, which are crucial for high-stakes applications.
no code implementations • 17 Jun 2024 • Yuyan Zhou, Liang Song, Bingning Wang, WeiPeng Chen
The advent of large language models (LLMs) like GPT-4 has catalyzed the exploration of multi-task learning (MTL), in which a single model demonstrates proficiency across diverse tasks.
1 code implementation • 13 Jun 2024 • Zijia Zhao, Haoyu Lu, Yuqi Huo, Yifan Du, Tongtian Yue, Longteng Guo, Bingning Wang, WeiPeng Chen, Jing Liu
In this paper, we propose VideoNIAH (Video Needle In A Haystack), a framework for constructing benchmarks through synthetic video generation.
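One illustrative reading of the needle-in-a-haystack construction (not the paper's code; names and the insertion scheme are assumptions): splice synthetic "needle" frames into an unrelated video and keep the insertion positions as ground truth for querying.

```python
import random

def insert_needles(haystack_frames, needle_frames, num_needles=1, seed=0):
    """Splice 'needle' frames into an unrelated video at random positions
    and return the edited frame list plus ground-truth locations."""
    rng = random.Random(seed)
    frames = list(haystack_frames)
    positions = sorted(rng.sample(range(len(frames) + 1), num_needles))
    for offset, pos in enumerate(positions):
        # Each earlier insertion shifts later indices by one.
        frames.insert(pos + offset, needle_frames[offset % len(needle_frames)])
    return frames, positions
```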
no code implementations • 30 May 2024 • Han Liu, Peng Cui, Bingning Wang, Jun Zhu, Xiaolin Hu
Deep Neural Networks (DNNs) have achieved remarkable success in a variety of tasks, especially when it comes to prediction accuracy.
no code implementations • 23 May 2024 • Xin Men, Mingyu Xu, Bingning Wang, Qingyu Zhang, Hongyu Lin, Xianpei Han, WeiPeng Chen
We revisit the role of RoPE in LLMs and propose a novel long-term decay property, deriving that the base of RoPE bounds the context length: there is an absolute lower bound on the base value required to obtain a given context-length capability.
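A back-of-the-envelope version of this claim (a heuristic in the same spirit, not the paper's derived bound): RoPE's slowest rotary component has wavelength 2π·base^((d−2)/d), and requiring that wavelength to cover the target context length already yields a lower bound on the base.

```python
import math

def min_rope_base(context_len, head_dim):
    """Illustrative lower bound on the RoPE base: require the slowest
    rotary frequency's wavelength, 2*pi*base^((d-2)/d), to reach the
    target context length. The paper derives its own bound."""
    exponent = (head_dim - 2) / head_dim
    return (context_len / (2 * math.pi)) ** (1 / exponent)

# e.g. head_dim=128 at a 32k context:
print(round(min_rope_base(32_768, 128)))  # ~5975 under this heuristic
```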
no code implementations • 28 Mar 2024 • Deyuan Liu, Zecheng Wang, Bingning Wang, WeiPeng Chen, Chunshan Li, Zhiying Tu, Dianhui Chu, Bo Li, Dianbo Sui
The rapid proliferation of large language models (LLMs) such as GPT-4 and Gemini underscores the intense demand for resources during their training processes, posing significant challenges due to substantial computational and environmental costs.
1 code implementation • 6 Mar 2024 • Xin Men, Mingyu Xu, Qingyu Zhang, Bingning Wang, Hongyu Lin, Yaojie Lu, Xianpei Han, WeiPeng Chen
As Large Language Models (LLMs) continue to advance in performance, their size has escalated significantly, with current LLMs containing billions or even trillions of parameters.
2 code implementations • 19 Sep 2023 • Aiyuan Yang, Bin Xiao, Bingning Wang, Borong Zhang, Ce Bian, Chao Yin, Chenxu Lv, Da Pan, Dian Wang, Dong Yan, Fan Yang, Fei Deng, Feng Wang, Feng Liu, Guangwei Ai, Guosheng Dong, Haizhou Zhao, Hang Xu, Haoze Sun, Hongda Zhang, Hui Liu, Jiaming Ji, Jian Xie, Juntao Dai, Kun Fang, Lei Su, Liang Song, Lifeng Liu, Liyun Ru, Luyao Ma, Mang Wang, Mickel Liu, MingAn Lin, Nuolan Nie, Peidong Guo, Ruiyang Sun, Tao Zhang, Tianpeng Li, Tianyu Li, Wei Cheng, WeiPeng Chen, Xiangrong Zeng, Xiaochuan Wang, Xiaoxi Chen, Xin Men, Xin Yu, Xuehai Pan, Yanjun Shen, Yiding Wang, Yiyu Li, Youxin Jiang, Yuchen Gao, Yupeng Zhang, Zenan Zhou, Zhiying Wu
Large language models (LLMs) have demonstrated remarkable performance on a variety of natural language tasks based on just a few examples of natural language instructions, reducing the need for extensive feature engineering.
1 code implementation • 7 Apr 2023 • Xiaohui Xie, Qian Dong, Bingning Wang, Feiyang Lv, Ting Yao, Weinan Gan, Zhijing Wu, Xiangsheng Li, Haitao Li, Yiqun Liu, Jin Ma
T2Ranking comprises more than 300K queries and over 2M unique passages from real-world search engines.
1 code implementation • 5 Aug 2022 • Bingning Wang, Feiyang Lv, Ting Yao, Yiming Yuan, Jin Ma, Yu Luo, Haijin Liang
However, in most public visual question answering datasets, such as VQA and CLEVR, the questions are human-generated and specific to the given image, e.g., 'What color are her eyes?'.
no code implementations • 11 Jun 2022 • Han Liu, Bingning Wang, Ting Yao, Haijin Liang, Jianjin Xu, Xiaolin Hu
Large-scale pre-trained language models have achieved great success on natural language generation tasks.
1 code implementation • 16 Jan 2021 • Bingning Wang, Ting Yao, WeiPeng Chen, Jingfang Xu, Xiaochuan Wang
In compositional question answering, systems must assemble multiple pieces of supporting evidence from the document to generate the final answer, which is more difficult than sentence-level or phrase-level QA.
no code implementations • 12 Dec 2020 • Yichao Luo, Zhengyan Li, Bingning Wang, Xiaoyu Xing, Qi Zhang, Xuanjing Huang
Keyphrase Generation (KG) is the task of generating phrases that capture the central topics of a given document or literary work, distilling the information necessary to understand its content.
1 code implementation • EMNLP 2020 • Xiaoyu Xing, Zhijing Jin, Di Jin, Bingning Wang, Qi Zhang, Xuanjing Huang
Based on the SemEval 2014 dataset, we construct the Aspect Robustness Test Set (ARTS) as a comprehensive probe of the aspect robustness of ABSA models.
Aspect-Based Sentiment Analysis (ABSA)
1 code implementation • 28 Mar 2019 • Jindou Wu, Yunlun Yang, Chao Deng, Hongyi Tang, Bingning Wang, Haoze Sun, Ting Yao, Qi Zhang
In this paper, we present the Sogou Machine Reading Comprehension (SMRC) toolkit, which enables fast and efficient development of modern machine comprehension models, including both published models and original prototypes.