no code implementations • Findings (EMNLP) 2021 • Yaochen Liu, Yazhou Zhang, Qiuchi Li, Benyou Wang, Dawei Song
The QPM framework involves a complex-valued multi-modal representation encoder, a quantum-like fusion subnetwork and a quantum measurement mechanism.
no code implementations • 22 Jun 2025 • Junying Chen, Zhenyang Cai, Pengcheng Chen, Shunian Chen, Ke Ji, Xidong Wang, Yunjin Yang, Benyou Wang
Recent advances in multimodal generative models have unlocked photorealistic, instruction-aligned image generation, yet leading systems like GPT-4o-Image remain proprietary and inaccessible.
1 code implementation • 15 Jun 2025 • Wanlong Liu, Junxiao Xu, Fei Yu, Yukang Lin, Ke Ji, Wenyu Chen, Yan Xu, Yasheng Wang, Lifeng Shang, Benyou Wang
This approach enables the model to adaptively employ both reasoning patterns: it prioritizes the Short CoT patterns and activates the Long CoT patterns only when necessary.
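The adaptive pattern selection described above can be sketched as a toy controller (not the paper's implementation; `solve_short`, `solve_long`, and the confidence threshold are invented stand-ins) that tries the cheap Short-CoT pass first and escalates to Long CoT only on low confidence:

```python
# Toy sketch of adaptive reasoning-pattern selection (illustrative only).
# A controller attempts a cheap Short-CoT pass and falls back to an
# expensive Long-CoT pass only when confidence is low.

CONF_THRESHOLD = 0.8

def solve_short(question):
    # Stand-in for a short chain-of-thought pass: returns (answer, confidence).
    # Here confidence is faked from question length for demonstration.
    return "short-answer", (0.9 if len(question) < 40 else 0.3)

def solve_long(question):
    # Stand-in for a long chain-of-thought pass: slower, more reliable.
    return "long-answer", 0.95

def adaptive_solve(question):
    answer, conf = solve_short(question)
    if conf >= CONF_THRESHOLD:        # prioritize the Short CoT pattern
        return answer, "short"
    answer, _ = solve_long(question)  # activate Long CoT only when necessary
    return answer, "long"

print(adaptive_solve("2+2?"))  # easy question: short pattern suffices
print(adaptive_solve("Prove that the sum of two odd integers is even."))
```

The point of the sketch is only the control flow: the short pattern is the default, and the long pattern is a fallback gated by a confidence signal.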
1 code implementation • 11 Jun 2025 • Chengpeng Li, Zhengyang Tang, Ziniu Li, Mingfeng Xue, Keqin Bao, Tian Ding, Ruoyu Sun, Benyou Wang, Xiang Wang, Junyang Lin, Dayiheng Liu
Large Reasoning Models (LRMs) like o1 and DeepSeek-R1 have shown remarkable progress in natural language reasoning with long chain-of-thought (CoT), yet they remain inefficient or inaccurate when handling complex mathematical operations.
1 code implementation • 1 Jun 2025 • Shunian Chen, Xinyuan Xie, Zheshu Chen, Liyan Zhao, Owen Lee, Zhan Su, Qilin Sun, Benyou Wang
High-quality, large-scale audio captioning is crucial for advancing audio understanding, yet current automated methods often generate captions that lack fine-grained detail and contextual accuracy, primarily due to their reliance on limited unimodal or superficial multimodal information.
no code implementations • 30 May 2025 • Xu Wang, Zihao Li, Benyou Wang, Yan Hu, Difan Zou
Large language models (LLMs) store vast amounts of information, making them powerful yet raising privacy and safety concerns when selective knowledge removal is required.
no code implementations • 24 May 2025 • Xunlian Dai, Li Zhou, Benyou Wang, Haizhou Li
We extend this test into an LLM-adaptive, free-relation task to assess the alignment of large language models (LLMs) with cross-cultural cognition.
1 code implementation • 21 May 2025 • Yuhao Zhang, Xiangnan Ma, Kaiqi Kou, Peizhuo Liu, Weiqiao Shan, Benyou Wang, Tong Xiao, Yuxin Huang, Zhengtao Yu, Jingbo Zhu
We propose the unit language to overcome the two modeling challenges.
no code implementations • 16 May 2025 • Bo Yue, Shuqi Guo, Kaiyu Hu, Chujiao Wang, Benyou Wang, Kui Jia, Guiliang Liu
VERGSA establishes 1) a seamless extension from verification of mathematical reasoning into embodied learning by dynamically incorporating contextually relevant tasks into prompts and defining success metrics for both subtasks and overall tasks, and 2) an automated, scalable reward labeling scheme that synthesizes dense reward signals by iteratively finalizing the contribution of scene configuration and subtask learning to overall skill acquisition.
no code implementations • 12 May 2025 • Tongxu Luo, Wenyu Du, Jiaxi Bi, Stephen Chung, Zhengyang Tang, Hao Yang, Min Zhang, Benyou Wang
Notably, our fine-tuned LeaP-T-7B matches the performance of DeepSeek-R1-Distill-Qwen-14B on AIME 2024.
1 code implementation • 27 Mar 2025 • Kaituo Feng, Kaixiong Gong, Bohao Li, Zonghao Guo, Yibing Wang, Tianshuo Peng, Junfei Wu, Xiaoying Zhang, Benyou Wang, Xiangyu Yue
However, directly applying RL training with the GRPO algorithm to video reasoning presents two primary challenges: (i) a lack of temporal modeling for video reasoning, and (ii) the scarcity of high-quality video-reasoning data.
1 code implementation • 11 Mar 2025 • Xiaoxiao Liu, Qingying Xiao, Junying Chen, Xiangyi Feng, Xiangbo Wu, Bairui Zhang, Xiang Wan, Jian Chang, Guangjun Yu, Yan Hu, Benyou Wang
Large language models (LLMs) are increasingly applied to outpatient referral tasks across healthcare systems.
no code implementations • 7 Mar 2025 • Feng Jiang, Zhiyu Lin, Fan Bu, Yuhao Du, Benyou Wang, Haizhou Li
The rapid development of large language models (LLMs) has brought significant attention to speech models, particularly recent progress in speech2speech protocols supporting speech input and output.
1 code implementation • 4 Mar 2025 • Ke Ji, Jiahao Xu, Tian Liang, Qiuzhi Liu, Zhiwei He, Xingyu Chen, Xiaoyuan Liu, Zhijie Wang, Junying Chen, Benyou Wang, Zhaopeng Tu, Haitao Mi, Dong Yu
By training exclusively on the initial prefix substrings (as few as 8 tokens), UPFT removes the need for labeled data or exhaustive sampling.
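The prefix-only supervision idea can be illustrated with a minimal data-construction sketch (whitespace tokenization and the helper name `make_prefix_examples` are simplifying assumptions, not the UPFT code):

```python
# Minimal sketch of prefix-only supervision: keep only the first k tokens
# of each sampled response and use that prefix as the fine-tuning target,
# so no gold labels or exhaustive sampling are required.

def make_prefix_examples(prompts, sampled_responses, k=8):
    examples = []
    for prompt, response in zip(prompts, sampled_responses):
        prefix_tokens = response.split()[:k]  # first k (whitespace) tokens only
        examples.append({"input": prompt, "target": " ".join(prefix_tokens)})
    return examples

data = make_prefix_examples(
    ["What is 17 * 3?"],
    ["First, 17 * 3 = 17 * 3. We compute 10*3 + 7*3 = 30 + 21 = 51."],
)
print(data[0]["target"])  # the first 8 whitespace tokens of the response
```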
no code implementations • 19 Feb 2025 • Yiran Qin, Ao Sun, Yuze Hong, Benyou Wang, Ruimao Zhang
Navigating unfamiliar environments presents significant challenges for household robots, requiring the ability to recognize and reason about novel decoration and layout.
1 code implementation • 18 Feb 2025 • Yuhao Zhang, Zhiheng Liu, Fan Bu, Ruiyu Zhang, Benyou Wang, Haizhou Li
Existing end-to-end speech large language models (LLMs) usually rely on large-scale annotated data for training, while data-efficient training has not been discussed in depth.
no code implementations • 17 Feb 2025 • Xu Wang, Yan Hu, Wenyu Du, Reynold Cheng, Benyou Wang, Difan Zou
Fine-tuning significantly improves the performance of Large Language Models (LLMs), yet its underlying mechanisms remain poorly understood.
no code implementations • 16 Feb 2025 • Fei Yu, Yingru Li, Benyou Wang
Value model-guided search is effective in steering the generation but suffers from scaling flaws: its superiority diminishes with larger sample sizes, underperforming non-search baselines.
1 code implementation • 24 Jan 2025 • Zhengyang Tang, Ziniu Li, Zhenyang Xiao, Tian Ding, Ruoyu Sun, Benyou Wang, Dayiheng Liu, Fei Huang, Tianyu Liu, Bowen Yu, Junyang Lin
In this work, we introduce a new benchmark designed to assess the critique capabilities of LLMs.
no code implementations • 10 Jan 2025 • Zhengyang Tang, Ziniu Li, Zhenyang Xiao, Tian Ding, Ruoyu Sun, Benyou Wang, Dayiheng Liu, Fei Huang, Tianyu Liu, Bowen Yu, Junyang Lin
Despite their remarkable performance, the development of Large Language Models (LLMs) faces a critical challenge in scalable oversight: providing effective feedback for tasks where human evaluation is difficult or where LLMs outperform humans.
1 code implementation • 31 Dec 2024 • Wanlong Liu, Junying Chen, Ke Ji, Li Zhou, Wenyu Chen, Benyou Wang
(2) They suffer from limited task diversity due to the lack of a general RAG dataset.
1 code implementation • 28 Dec 2024 • Zhenyang Cai, Junying Chen, Rongsheng Wang, Weihong Wang, Yonglin Deng, Dingjie Song, Yize Chen, Zixu Zhang, Benyou Wang
Multimodal large language models (MLLMs) hold significant potential in the medical field, but their capabilities are often limited by insufficient data in certain medical domains, highlighting the need for understanding what kinds of images can be used by MLLMs for generalization.
1 code implementation • 25 Dec 2024 • Junying Chen, Zhenyang Cai, Ke Ji, Xidong Wang, Wanlong Liu, Rongsheng Wang, Jianye Hou, Benyou Wang
To address this, we propose verifiable medical problems with a medical verifier to check the correctness of model outputs.
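The verifier idea reduces to a binary correctness check on the model's output; a hedged sketch follows (the `Answer:` output format and the function names are illustrative assumptions, not the authors' verifier):

```python
# Sketch of a verifier-style reward: extract the model's final answer
# and compare it against a gold answer, emitting a binary reward.

def extract_answer(output):
    # Assumption: the model ends its response with "Answer: <value>".
    return output.rsplit("Answer:", 1)[-1].strip()

def verify(output, gold):
    # Binary reward: 1.0 if the extracted answer matches the gold answer.
    return 1.0 if extract_answer(output) == gold else 0.0

print(verify("The dose is derived as follows... Answer: 500 mg", "500 mg"))  # 1.0
print(verify("Answer: 250 mg", "500 mg"))                                    # 0.0
```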
1 code implementation • 16 Dec 2024 • Yuhao Du, Shunian Chen, Wenbo Zan, Peizhao Li, Mingxuan Wang, Dingjie Song, Bo Li, Yan Hu, Benyou Wang
In this paper, we present BlenderLLM, a novel framework for training LLMs specifically for CAD tasks leveraging a self-improvement methodology.
no code implementations • 16 Dec 2024 • Jianqing Zhu, Huang Huang, Zhihang Lin, Juhao Liang, Zhengyang Tang, Khalid Almubarak, Abdulmohsen Alharthi, Bang An, Juncai He, Xiangbo Wu, Fei Yu, Junying Chen, Zhuoheng Ma, Yuhao Du, He Zhang, Emad A. Alghamdi, Lian Zhang, Ruoyu Sun, Haizhou Li, Benyou Wang, Jinchao Xu
This paper addresses the critical need for democratizing large language models (LLMs) in the Arab world, a region that has seen slower progress in developing models comparable to state-of-the-art offerings like GPT-4 or ChatGPT 3.5, due to a predominant focus on mainstream languages (e.g., English and Chinese).
1 code implementation • 4 Dec 2024 • Juhao Liang, Zhenyang Cai, Jianqing Zhu, Huang Huang, Kewei Zong, Bang An, Mosen Alharthi, Juncai He, Lian Zhang, Haizhou Li, Benyou Wang, Jinchao Xu
The alignment of large language models (LLMs) is critical for developing effective and safe language models.
1 code implementation • 3 Dec 2024 • Kaixiong Gong, Kaituo Feng, Bohao Li, Yibing Wang, Mofan Cheng, Shijia Yang, Jiaming Han, Benyou Wang, Yutong Bai, Zhuoran Yang, Xiangyu Yue
Recently, multimodal large language models (MLLMs), such as GPT-4o, Gemini 1.5 Pro, and Reka Core, have expanded their capabilities to include vision and audio modalities.
1 code implementation • 6 Nov 2024 • Dingjie Song, Sicheng Lai, Shunian Chen, Lichao Sun, Benyou Wang
The rapid progression of multimodal large language models (MLLMs) has demonstrated superior performance on various multimodal benchmarks.
no code implementations • 18 Oct 2024 • Zihao Cheng, Li Zhou, Feng Jiang, Benyou Wang, Haizhou Li
The rapid development of large language models (LLMs), like ChatGPT, has resulted in the widespread presence of LLM-generated content on social media platforms, raising concerns about misinformation, data biases, and privacy violations, which can undermine trust in online discourse.
1 code implementation • 17 Oct 2024 • Yuzhe Yang, Yifei Zhang, Yan Hu, Yilin Guo, Ruoli Gan, Yueru He, Mingcong Lei, Xiao Zhang, Haining Wang, Qianqian Xie, Jimin Huang, Honghai Yu, Benyou Wang
Secondly, based on this feedback, we created our dataset that encompasses a wide range of user intents and interactions.
no code implementations • 17 Oct 2024 • Fan Bu, Yuhao Zhang, Xidong Wang, Benyou Wang, Qun Liu, Haizhou Li
The success of large language models (LLMs) has prompted efforts to integrate speech and audio data, aiming to create general foundation models capable of processing both textual and non-textual inputs.
1 code implementation • 14 Oct 2024 • Guorui Zheng, Xidong Wang, Juhao Liang, Nuo Chen, Yuping Zheng, Benyou Wang
In order to leverage the generalization capability of multilingual LLMs to efficiently scale to more resource-constrained languages, we explore the internal information flow of LLMs from a multilingual perspective using Mixture of Experts (MoE) modularity.
no code implementations • 12 Oct 2024 • Lei LI, Zhihui Xie, Mukai Li, Shunian Chen, Peiyi Wang, Liang Chen, Yazheng Yang, Benyou Wang, Lingpeng Kong, Qi Liu
As large vision-language models (LVLMs) evolve rapidly, the demand for high-quality and diverse data to align these models becomes increasingly crucial.
Ranked #62 on Visual Question Answering on MM-Vet
2 code implementations • 10 Oct 2024 • Bofei Gao, Feifan Song, Zhe Yang, Zefan Cai, Yibo Miao, Qingxiu Dong, Lei LI, Chenghao Ma, Liang Chen, Runxin Xu, Zhengyang Tang, Benyou Wang, Daoguang Zan, Shanghaoran Quan, Ge Zhang, Lei Sha, Yichang Zhang, Xuancheng Ren, Tianyu Liu, Baobao Chang
However, existing benchmarks like GSM8K or MATH are now being solved with high accuracy (e.g., OpenAI o1 achieves 94.8% on the MATH dataset), indicating their inadequacy for truly challenging these models.
1 code implementation • 17 Sep 2024 • Dingjie Song, Wenjun Wang, Shunian Chen, Xidong Wang, Michael Guan, Benyou Wang
The rapid advancement of Multimodal Large Language Models (MLLMs) has led to remarkable performances across various domains.
1 code implementation • 4 Sep 2024 • Xidong Wang, Dingjie Song, Shunian Chen, Chen Zhang, Benyou Wang
Expanding the long-context capabilities of Multi-modal Large Language Models~(MLLMs) is crucial for video understanding, high-resolution image understanding, and multi-modal agents.
no code implementations • 31 Aug 2024 • Ridong Han, Chaohao Yang, Tao Peng, Prayag Tiwari, Xiang Wan, Lu Liu, Benyou Wang
To demonstrate the latest representative progress in LLMs' information extraction ability, we assess the information extraction ability of GPT-4 (the latest version of GPT at the time of writing this paper) from four perspectives: Performance, Evaluation Criteria, Robustness, and Error Types.
no code implementations • 20 Aug 2024 • Jimin Huang, Mengxi Xiao, Dong Li, Zihao Jiang, Yuzhe Yang, Yifei Zhang, Lingfei Qian, Yan Wang, Xueqing Peng, Yang Ren, Ruoyu Xiang, Zhengyu Chen, Xiao Zhang, Yueru He, Weiguang Han, Shunian Chen, Lihang Shen, Daniel Kim, Yangyang Yu, Yupeng Cao, Zhiyang Deng, Haohang Li, Duanyu Feng, Yongfu Dai, VijayaSai Somasundaram, Peng Lu, Guojun Xiong, Zhiwei Liu, Zheheng Luo, Zhiyuan Yao, Ruey-Ling Weng, Meikang Qiu, Kaleb E Smith, Honghai Yu, Yanzhao Lai, Min Peng, Jian-Yun Nie, Jordan W. Suchow, Xiao-Yang Liu, Benyou Wang, Alejandro Lopez-Lira, Qianqian Xie, Sophia Ananiadou, Junichi Tsujii
Financial LLMs hold promise for advancing financial tasks and domain-specific applications.
1 code implementation • 6 Aug 2024 • Pengcheng Chen, Jin Ye, Guoan Wang, Yanjun Li, Zhongying Deng, Wei Li, Tianbin Li, Haodong Duan, Ziyan Huang, Yanzhou Su, Benyou Wang, Shaoting Zhang, Bin Fu, Jianfei Cai, Bohan Zhuang, Eric J Seibel, Junjun He, Yu Qiao
Large Vision-Language Models (LVLMs) are capable of handling diverse data types such as imaging, text, and physiological signals, and can be applied in various fields.
1 code implementation • 18 Jul 2024 • Junying Chen, Chi Gui, Anningzhe Gao, Ke Ji, Xidong Wang, Xiang Wan, Benyou Wang
This study introduces Chain-of-Diagnosis (CoD) to enhance the interpretability of LLM-based medical diagnostics.
1 code implementation • 27 Jun 2024 • Junying Chen, Chi Gui, Ruyi Ouyang, Anningzhe Gao, Shunian Chen, Guiming Hardy Chen, Xidong Wang, Ruifei Zhang, Zhenyang Cai, Ke Ji, Guangjun Yu, Xiang Wan, Benyou Wang
The rapid development of multimodal large language models (MLLMs), such as GPT-4V, has led to significant advancements.
no code implementations • 26 Jun 2024 • Wenya Xie, Qingying Xiao, Yu Zheng, Xidong Wang, Junying Chen, Ke Ji, Anningzhe Gao, Xiang Wan, Feng Jiang, Benyou Wang
Based on this, we construct a Chinese medical dataset called DoctorFLAN to support the entire workflow of doctors, which includes 92K Q&A samples from 22 tasks and 27 specialists.
no code implementations • 24 Jun 2024 • Mianxin Liu, Jinru Ding, Jie Xu, Weiguo Hu, Xiaoyang Li, Lifeng Zhu, Zhian Bai, Xiaoming Shi, Benyou Wang, Haitao Song, PengFei Liu, Xiaofan Zhang, Shanshan Wang, Kang Li, Haofen Wang, Tong Ruan, Xuanjing Huang, Xin Sun, Shaoting Zhang
In this work, we introduce "MedBench", a comprehensive, standardized, and reliable benchmarking system for Chinese medical LLMs.
1 code implementation • 2 Jun 2024 • Ke Ji, Junying Chen, Anningzhe Gao, Wenya Xie, Xiang Wan, Benyou Wang
In the quest for super-human performance, Large Language Models (LLMs) have traditionally been tethered to human-annotated datasets and predefined training objectives, a process that is both labor-intensive and inherently limited.
1 code implementation • 30 May 2024 • Ling-Hao Chen, Shunlin Lu, Ailing Zeng, Hao Zhang, Benyou Wang, Ruimao Zhang, Lei Zhang
This study delves into the realm of multi-modality (i.e., video and motion modalities) human behavior understanding by leveraging the powerful capabilities of Large Language Models (LLMs).
1 code implementation • 28 May 2024 • Yaoyao Xu, Xinjian Zhao, Xiaozhuang Song, Benyou Wang, Tianshu Yu
We introduce a pioneering methodology for boosting large language models in the domain of protein representation learning.
1 code implementation • 28 May 2024 • Chenyu Huang, Zhengyang Tang, Shixi Hu, Ruoqing Jiang, Xin Zheng, Dongdong Ge, Benyou Wang, Zizhuo Wang
This work also introduces IndustryOR, the first industrial benchmark for evaluating LLMs in solving practical OR problems.
1 code implementation • 26 May 2024 • Zhan Su, Fengran Mo, Prayag Tiwari, Benyou Wang, Jian-Yun Nie, Jakob Grue Simonsen
For the routing function, we tailor two innovative routing functions according to granularity: TensorPoly-I, which routes to each rank within the entangled tensor, and TensorPoly-II, which offers a finer-grained routing approach targeting each order of the entangled tensor.
1 code implementation • 21 May 2024 • Xuhan Huang, Qingning Shen, Yan Hu, Anningzhe Gao, Benyou Wang
By focusing on the processes LLMs undertake rather than the correctness of their final solutions, Mamo pioneers a novel evaluation paradigm.
1 code implementation • 14 May 2024 • Chenghao Zhu, Nuo Chen, Yufei Gao, Yunyi Zhang, Prayag Tiwari, Benyou Wang
The rapid advancement of Large Language Models (LLMs) highlights the urgent need for evolving evaluation methodologies that keep pace with improvements in language comprehension and information processing.
1 code implementation • 9 May 2024 • Junzhi Chen, Juhao Liang, Benyou Wang
The emergence of large language models (LLMs) has opened up unprecedented possibilities for automating complex tasks that are often comparable to human performance.
no code implementations • 29 Apr 2024 • Dingjie Song, Shunian Chen, Guiming Hardy Chen, Fei Yu, Xiang Wan, Benyou Wang
Despite the advancements and impressive performance of Multimodal Large Language Models (MLLMs) on benchmarks, their effectiveness in real-world, long-context, and multi-image tasks is unclear due to the benchmarks' limited scope.
2 code implementations • 10 Mar 2024 • Gang Hu, Ke Qin, Chenhan Yuan, Min Peng, Alejandro Lopez-Lira, Benyou Wang, Sophia Ananiadou, Jimin Huang, Qianqian Xie
While the progression of Large Language Models (LLMs) has notably propelled financial analysis, their application has largely been confined to singular language realms, leaving untapped the potential of bilingual Chinese-English capacity.
1 code implementation • 6 Mar 2024 • Xidong Wang, Nuo Chen, Junying Chen, Yidong Wang, Guorui Zheng, Chunxian Zhang, Xiangbo Wu, Yan Hu, Anningzhe Gao, Xiang Wan, Haizhou Li, Benyou Wang
Despite the vast repository of global medical knowledge predominantly being in English, local languages are crucial for delivering tailored healthcare services, particularly in areas with limited medical resources.
no code implementations • 4 Mar 2024 • Juhao Liang, Ziwei Wang, Zhuoheng Ma, Jianquan Li, Zhiyi Zhang, Xiangbo Wu, Benyou Wang
Large Language Models (LLMs) have dramatically revolutionized the field of Natural Language Processing (NLP), offering remarkable capabilities that have garnered widespread usage.
2 code implementations • 1 Mar 2024 • Xianghong Fang, Jian Li, Qiang Sun, Benyou Wang
Uniformity plays an important role in evaluating learned representations, providing insights into self-supervised learning.
no code implementations • 20 Feb 2024 • Haoran Li, Qingxiu Dong, Zhengyang Tang, Chaojun Wang, Xingxing Zhang, Haoyang Huang, Shaohan Huang, Xiaolong Huang, Zeqiang Huang, Dongdong Zhang, Yuxian Gu, Xin Cheng, Xun Wang, Si-Qing Chen, Li Dong, Wei Lu, Zhifang Sui, Benyou Wang, Wai Lam, Furu Wei
We introduce Generalized Instruction Tuning (called GLAN), a general and scalable method for instruction tuning of Large Language Models (LLMs).
2 code implementations • 20 Feb 2024 • Qianqian Xie, Weiguang Han, Zhengyu Chen, Ruoyu Xiang, Xiao Zhang, Yueru He, Mengxi Xiao, Dong Li, Yongfu Dai, Duanyu Feng, Yijing Xu, Haoqiang Kang, Ziyan Kuang, Chenhan Yuan, Kailai Yang, Zheheng Luo, Tianlin Zhang, Zhiwei Liu, Guojun Xiong, Zhiyang Deng, Yuechen Jiang, Zhiyuan Yao, Haohang Li, Yangyang Yu, Gang Hu, Jiajia Huang, Xiao-Yang Liu, Alejandro Lopez-Lira, Benyou Wang, Yanzhao Lai, Hao Wang, Min Peng, Sophia Ananiadou, Jimin Huang
Our evaluation of 15 representative LLMs, including GPT-4, ChatGPT, and the latest Gemini, reveals several key findings: While LLMs excel in IE and textual analysis, they struggle with advanced reasoning and complex tasks like text generation and forecasting.
1 code implementation • 18 Feb 2024 • Guiming Hardy Chen, Shunian Chen, Ruifei Zhang, Junying Chen, Xiangbo Wu, Zhiyi Zhang, Zhihong Chen, Jianquan Li, Xiang Wan, Benyou Wang
Large vision-language models (LVLMs) have shown promise in a broad range of vision-language tasks with their strong reasoning and generalization capabilities.
1 code implementation • 16 Feb 2024 • Guiming Hardy Chen, Shunian Chen, Ziche Liu, Feng Jiang, Benyou Wang
We further exploit these biases to conduct attacks on LLM judges.
no code implementations • 12 Feb 2024 • Yazhou Zhang, Mengyao Wang, Chenyu Ren, Qiuchi Li, Prayag Tiwari, Benyou Wang, Jing Qin
The value of text classification's future research has encountered challenges and uncertainties, due to the extraordinary efficacy demonstrated by large language models (LLMs) across numerous downstream NLP tasks.
no code implementations • 17 Dec 2023 • Lei LI, Zhihui Xie, Mukai Li, Shunian Chen, Peiyi Wang, Liang Chen, Yazheng Yang, Benyou Wang, Lingpeng Kong
This paper explores preference distillation for large vision language models (LVLMs), improving their ability to generate helpful and faithful responses anchoring the visual context.
Ranked #66 on Visual Question Answering on MM-Vet
1 code implementation • 23 Nov 2023 • Wentao Ge, Shunian Chen, Guiming Hardy Chen, Junying Chen, Zhihong Chen, Nuo Chen, Wenya Xie, Shuo Yan, Chenghao Zhu, Ziyue Lin, Song Dingjie, Xidong Wang, Anningzhe Gao, Zhang Zhiyi, Jianquan Li, Xiang Wan, Benyou Wang
To this end, in our paper, we propose a new evaluation paradigm for MLLMs: evaluating MLLMs with per-sample criteria using a potent MLLM as the judge.
1 code implementation • 16 Nov 2023 • Fei Yu, Anningzhe Gao, Benyou Wang
These findings offer a novel perspective on the role of outcome supervision in training value models for multi-step reasoning tasks and provide theoretical justification for its advantage in value estimation for guided decoding.
Ranked #47 on Arithmetic Reasoning on GSM8K
1 code implementation • 16 Nov 2023 • Junying Chen, Xidong Wang, Ke Ji, Anningzhe Gao, Feng Jiang, Shunian Chen, Hongbo Zhang, Dingjie Song, Wenya Xie, Chuyi Kong, Jianquan Li, Xiang Wan, Haizhou Li, Benyou Wang
We validate the new protocol in the domains where proprietary LLMs like ChatGPT perform relatively poorly, such as Traditional Chinese Medicine.
no code implementations • 13 Nov 2023 • Chen Zhang, Benyou Wang, Dawei Song
To this end, we propose an elastic language model (ElasticLM) that elastically adjusts the tradeoff according to the request stream.
1 code implementation • 18 Oct 2023 • Yaxin Fan, Feng Jiang, Benyou Wang, Peifeng Li, Haizhou Li
While recent studies have primarily focused on the quality of FMs as evaluated by GPT-4 or on their ability to pass medical exams, no studies have quantified the extent of self-diagnostic atomic knowledge stored in FMs' memory, which is the basis for foundation models to provide factual and reliable suggestions.
1 code implementation • 17 Oct 2023 • Yazhou Zhang, Mengyao Wang, Youxi Wu, Prayag Tiwari, Qiuchi Li, Benyou Wang, Jing Qin
Large language models (LLMs) and their variants have shown extraordinary efficacy across numerous downstream natural language processing (NLP) tasks, which has presented a new vision for the development of NLP.
1 code implementation • 21 Sep 2023 • Huang Huang, Fei Yu, Jianqing Zhu, Xuening Sun, Hao Cheng, Dingjie Song, Zhihong Chen, Abdulmohsen Alharthi, Bang An, Juncai He, Ziche Liu, Zhiyi Zhang, Junying Chen, Jianquan Li, Benyou Wang, Lian Zhang, Ruoyu Sun, Xiang Wan, Haizhou Li, Jinchao Xu
This paper is devoted to the development of a localized Large Language Model (LLM) specifically for Arabic, a language imbued with unique cultural characteristics inadequately addressed by current mainstream models.
1 code implementation • 21 Aug 2023 • Chuyi Kong, Yaxin Fan, Xiang Wan, Feng Jiang, Benyou Wang
The unparalleled performance of closed-source ChatGPT has sparked efforts towards its democratization, with notable strides made by leveraging real user and ChatGPT dialogues, as evidenced by Vicuna.
2 code implementations • 17 Aug 2023 • Xidong Wang, Guiming Hardy Chen, Dingjie Song, Zhiyi Zhang, Zhihong Chen, Qingying Xiao, Feng Jiang, Jianquan Li, Xiang Wan, Benyou Wang, Haizhou Li
We hope this benchmark provides first-hand experience with existing LLMs for medicine and also facilitates the widespread adoption and enhancement of medical LLMs within China.
1 code implementation • 6 Jun 2023 • Zhihong Chen, Guiming Hardy Chen, Shizhe Diao, Xiang Wan, Benyou Wang
Masked language modeling (MLM) has been one of the most popular pretraining recipes in natural language processing, with BERT as one of its representative models.
no code implementations • 6 Jun 2023 • Yaochen Liu, Qiuchi Li, Benyou Wang, Yazhou Zhang, Dawei Song
Quantum theory, originally proposed as a physical theory to describe the motions of microscopic particles, has been applied to various non-physics domains involving human cognition and decision-making that are inherently uncertain and exhibit certain non-classical, quantum-like characteristics.
1 code implementation • NeurIPS 2023 • Zhongwei Wan, Che Liu, Mi Zhang, Jie Fu, Benyou Wang, Sibo Cheng, Lei Ma, César Quilodrán-Casas, Rossella Arcucci
Med-UniC reaches superior performance across 5 medical image tasks and 10 datasets encompassing over 30 diseases, offering a versatile framework for unifying multi-modal medical data within diverse linguistic communities.
2 code implementations • 24 May 2023 • Hongbo Zhang, Junying Chen, Feng Jiang, Fei Yu, Zhihong Chen, Jianquan Li, Guiming Chen, Xiangbo Wu, Zhiyi Zhang, Qingying Xiao, Xiang Wan, Benyou Wang, Haizhou Li
Experimental results demonstrate that HuatuoGPT achieves state-of-the-art results in performing medical consultation among open-source LLMs in GPT-4 evaluation, human evaluation, and medical benchmark datasets.
1 code implementation • 24 May 2023 • Hongbo Zhang, Xiang Wan, Benyou Wang
This gives us a hint that relational knowledge might not be redundant to the stored knowledge of PLMs, but rather be complementary.
1 code implementation • 23 May 2023 • Ridong Han, Chaohao Yang, Tao Peng, Prayag Tiwari, Xiang Wan, Lu Liu, Benyou Wang
To demonstrate the latest representative progress in LLMs' information extraction ability, we assess the information extraction ability of GPT-4 (the latest version of GPT at the time of writing this paper) from four perspectives: Performance, Evaluation Criteria, Robustness, and Error Types.
1 code implementation • 20 May 2023 • Chen Zhang, Yang Yang, Jiahao Liu, Jingang Wang, Yunsen Xian, Benyou Wang, Dawei Song
However, when the capacity gap between the teacher and the student is large, a curse of capacity gap appears, invoking a deficiency in distilling LMs.
1 code implementation • 10 May 2023 • Zhibin Lu, Qianqian Xie, Benyou Wang, Jian-Yun Nie
An inductive Word-grounded Graph Convolutional Network (WGCN) is proposed to learn word and document representations based on WGraph in a supervised manner.
1 code implementation • 2 May 2023 • Jianquan Li, Xidong Wang, Xiangbo Wu, Zhiyi Zhang, Xiaolong Xu, Jie Fu, Prayag Tiwari, Xiang Wan, Benyou Wang
Moreover, we also experimentally show the benefit of the proposed dataset in many aspects: (i) training models for other QA datasets in a zero-shot fashion; (ii) serving as external knowledge for retrieval-augmented generation (RAG); and (iii) improving existing pre-trained language models by using the QA pairs as a pre-training corpus in a continued-training manner.
1 code implementation • 20 Apr 2023 • Xiaokang Liu, Jianquan Li, Jingjing Mu, Min Yang, Ruifeng Xu, Benyou Wang
In this paper, we introduce novel K-center contrastive learning and adjustable decision boundary learning (CLAB) to improve the effectiveness of open intent classification.
1 code implementation • 20 Apr 2023 • Zhihong Chen, Feng Jiang, Junying Chen, Tiannan Wang, Fei Yu, Guiming Chen, Hongbo Zhang, Juhao Liang, Chen Zhang, Zhiyi Zhang, Jianquan Li, Xiang Wan, Benyou Wang, Haizhou Li
This paper presents our efforts to democratize ChatGPT across languages.
no code implementations • 18 Apr 2023 • Qianqian Xie, Zheheng Luo, Benyou Wang, Sophia Ananiadou
In this paper, we present a systematic review of recent advancements in BTS, leveraging cutting-edge NLP techniques from PLMs to LLMs, to help understand the latest progress, challenges, and future directions.
1 code implementation • 26 Mar 2023 • Fei Yu, Hongbo Zhang, Prayag Tiwari, Benyou Wang
This survey paper proposes a clearer view of natural language reasoning in the field of Natural Language Processing (NLP), both conceptually and practically.
1 code implementation • 23 Mar 2023 • Juhao Liang, Chen Zhang, Zhengyang Tang, Jie Fu, Dawei Song, Benyou Wang
Built upon the paradigm, we propose a retrieval model with modular prompt tuning named REMOP.
no code implementations • 24 Feb 2023 • Qiuchi Li, Benyou Wang, Yudong Zhu, Christina Lioma, Qun Liu
The emerging classical-quantum transfer learning paradigm has brought a decent performance to quantum computational models in many tasks, such as computer vision, by enabling a combination of quantum models and classical pre-trained neural networks.
1 code implementation • ICCV 2023 • Zhihong Chen, Shizhe Diao, Benyou Wang, Guanbin Li, Xiang Wan
Medical vision-and-language pre-training (Med-VLP) has shown promising improvements on many downstream medical tasks owing to its applicability to extracting generic representations from medical images and texts.
1 code implementation • 20 Dec 2022 • Ridong Han, Tao Peng, Benyou Wang, Lu Liu, Xiang Wan
Document-level relation extraction faces two overlooked challenges: long-tail problem and multi-label problem.
1 code implementation • 27 Oct 2022 • Guobing Gan, Peng Zhang, Sunzhu Li, Xiuqing Lu, Benyou Wang
In the era of deep learning, word embeddings are essential when dealing with text tasks.
1 code implementation • 23 Sep 2022 • Zhongwei Wan, Xin Liu, Benyou Wang, Jiezhong Qiu, Boyu Li, Ting Guo, Guangyong Chen, Yang Wang
The idea is to supplement the GNN-based main supervised recommendation task with the temporal representation via an auxiliary cross-view contrastive learning mechanism.
2 code implementations • COLING 2022 • Zhengyang Tang, Benyou Wang, Ting Yao
We believe this work facilitates the industry, as it saves enormous efforts and costs of deployment and increases the utility of computing resources.
1 code implementation • 20 Jul 2022 • Yi Yang, Chen Zhang, Benyou Wang, Dawei Song
To uncover the domain-general LM, we propose to identify domain-general parameters by playing lottery tickets (dubbed doge tickets).
1 code implementation • 2 Jul 2022 • Benyou Wang, Xiangbo Wu, Xiaokang Liu, Jianquan Li, Prayag Tiwari, Qianqian Xie
However, the humor aspect of natural language is relatively under-investigated, especially in the age of pre-trained language models.
1 code implementation • ICLR 2022 • Yuxin Ren, Benyou Wang, Lifeng Shang, Xin Jiang, Qun Liu
A tiny version achieves 96.7% of BERT-base's performance with 1/48 of the encoder parameters (i.e., fewer than 2M parameters excluding the embedding layer) and 2.7× faster inference.
1 code implementation • NeurIPS 2021 • Benyou Wang, Emanuele Di Buccio, Massimo Melucci
Word meaning may change over time as a reflection of changes in human society.
1 code implementation • 11 Oct 2021 • Benyou Wang, Qianqian Xie, Jiahuan Pei, Zhihong Chen, Prayag Tiwari, Zhao Li, Jie Fu
In this paper, we summarize the recent progress of pre-trained language models in the biomedical domain and their applications in biomedical downstream tasks.
no code implementations • ICLR 2021 • Benyou Wang, Lifeng Shang, Christina Lioma, Xin Jiang, Hao Yang, Qun Liu, Jakob Grue Simonsen
Various Position Embeddings (PEs) have been proposed in Transformer-based architectures (e.g., BERT) to model word order.
no code implementations • 26 Oct 2020 • Zhenzhen Li, Jian-Yun Nie, Benyou Wang, Pan Du, Yuhan Zhang, Lixin Zou, Dongsheng Li
Distant supervision provides a means to create a large number of weakly labeled data at low cost for relation classification.
3 code implementations • Findings of the Association for Computational Linguistics 2020 • Chen Zhang, Qiuchi Li, Dawei Song, Benyou Wang
The state-of-the-art Aspect-based Sentiment Analysis (ABSA) approaches are mainly based on either detecting aspect terms and their corresponding sentiment polarities, or co-extracting aspect and opinion terms.
no code implementations • IEEE Transactions on Cybernetics 2020 • Wei Zhao, Benyou Wang, Min Yang, Jianbo Ye, Zhou Zhao, Xiaojun Chen, Ying Shen
Movie recommendation systems provide users with ranked lists of movies based on individual preferences and constraints.
1 code implementation • ICLR 2020 • Benyou Wang, Donghao Zhao, Christina Lioma, Qiuchi Li, Peng Zhang, Jakob Grue Simonsen
The benefit of continuous functions over variable positions is that word representations shift smoothly with increasing positions.
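A hedged sketch of such a continuous position function, in the spirit of complex-valued order encodings where each dimension takes the form amplitude * exp(i * (freq * pos + phase)); the specific amplitudes, frequencies, and phases below are illustrative choices, not learned parameters from the paper:

```python
# Toy sketch of a position embedding defined as a continuous function of
# position: each dimension rotates in the complex plane as pos increases,
# so representations shift smoothly with position.
import cmath

def complex_position_embedding(pos, amplitudes, freqs, phases):
    # One complex value per dimension: a * exp(i * (w * pos + p)).
    return [a * cmath.exp(1j * (w * pos + p))
            for a, w, p in zip(amplitudes, freqs, phases)]

amps, freqs, phases = [1.0, 0.5], [0.1, 0.3], [0.0, 1.0]
e2  = complex_position_embedding(2.0, amps, freqs, phases)
e21 = complex_position_embedding(2.1, amps, freqs, phases)

# Nearby positions yield nearby embeddings (smooth shift):
drift = sum(abs(a - b) for a, b in zip(e2, e21))
print(round(drift, 3))  # small drift for a small change in position
```

Because the embedding is a continuous function of `pos`, it is defined for fractional positions as well, which is the property the abstract highlights over lookup tables at discrete positions.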
no code implementations • 25 Sep 2019 • Peng Zhang, Xiaoliu Mao, Xindian Ma, Benyou Wang, Jing Zhang, Jun Wang, Dawei Song
We prove that by a mapping (via the trace operator) on the high-dimensional matching matrix, a low-dimensional attention matrix can be derived.
1 code implementation • NAACL 2019 • Qiuchi Li, Benyou Wang, Massimo Melucci
This paper seeks to model human language by the mathematical framework of quantum physics.
1 code implementation • 26 Feb 2019 • Benyou Wang, Qiuchi Li, Massimo Melucci, Dawei Song
To address this issue, we propose a new framework that models different levels of semantic units (e.g., sememe, word, sentence, and semantic abstraction) on a single Semantic Hilbert Space, which naturally admits a non-linear semantic composition by means of a complex-valued vector word representation.
1 code implementation • 28 Aug 2018 • Peng Zhang, Zhan Su, Lipeng Zhang, Benyou Wang, Dawei Song
The recently proposed quantum language model (QLM) aims at a principled approach to modeling term dependency by applying quantum probability theory.
no code implementations • WS 2018 • Qiuchi Li, Sagar Uprety, Benyou Wang, Dawei Song
A challenging task for word embeddings is to capture the emergent meaning or polarity of a combination of individual words.
no code implementations • 10 Feb 2018 • Benyou Wang, Li Wang, Qikang Wei, Lichun Liu
Text representation is a fundamental concern in Natural Language Processing, especially in text classification.
3 code implementations • 30 May 2017 • Jun Wang, Lantao Yu, Wei-Nan Zhang, Yu Gong, Yinghui Xu, Benyou Wang, Peng Zhang, Dell Zhang
This paper provides a unified account of two schools of thinking in information retrieval modelling: the generative retrieval focusing on predicting relevant documents given a query, and the discriminative retrieval focusing on predicting relevancy given a query-document pair.