no code implementations • EMNLP 2021 • Ruifeng Yuan, Zili Wang, Wenjie Li
Sentence fusion is a conditional generation task that merges several related sentences into a coherent one, which can be regarded as a summary sentence.
1 code implementation • 20 Mar 2025 • Jiaheng Liu, Dawei Zhu, Zhiqi Bai, Yancheng He, Huanxuan Liao, Haoran Que, Zekun Wang, Chenchen Zhang, Ge Zhang, Jiebin Zhang, Yuanxing Zhang, Zhuo Chen, Hangyu Guo, Shilong Li, Ziqiang Liu, Yong Shan, YiFan Song, Jiayi Tian, Wenhao Wu, Zhejian Zhou, Ruijie Zhu, Junlan Feng, Yang Gao, Shizhu He, Zhoujun Li, Tianyu Liu, Fanyu Meng, Wenbo Su, Yingshui Tan, Zili Wang, Jian Yang, Wei Ye, Bo Zheng, Wangchunshu Zhou, Wenhao Huang, Sujian Li, Zhaoxiang Zhang
With the growing number of long documents, dialogues, and other textual data, it is important to develop Long Context Language Models (LCLMs) that can process and analyze extensive inputs in an effective and efficient way.
no code implementations • 6 Mar 2025 • Houyi Li, Wenzheng Zheng, Jingcheng Hu, Qiufeng Wang, Hanshan Zhang, Zili Wang, Shijie Xuyang, Yuantao Fan, Shuigeng Zhou, Xiangyu Zhang, Daxin Jiang
Through extensive empirical studies involving grid searches across diverse configurations, we discover universal scaling laws governing these hyperparameters: optimal learning rate follows a power-law relationship with both model parameters and data sizes, while optimal batch size scales primarily with data sizes.
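A convenient way to state such laws in closed form is sketched below; the constants and exponents are placeholders for exposition, not the paper's fitted values:

```latex
\eta^{*}(N, D) = c_{\eta}\, N^{\alpha} D^{\beta}, \qquad B^{*}(D) = c_{B}\, D^{\gamma}
```

where $N$ is the number of model parameters and $D$ the data size in tokens.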
no code implementations • 20 Feb 2025 • M-A-P Team, Xinrun Du, Yifan Yao, Kaijing Ma, Bingli Wang, Tianyu Zheng, King Zhu, Minghao Liu, Yiming Liang, Xiaolong Jin, Zhenlin Wei, Chujie Zheng, Kaixin Deng, Shawn Gavin, Shian Jia, Sichao Jiang, Yiyan Liao, Rui Li, Qinrui Li, Sirun Li, Yizhi Li, Yunwen Li, David Ma, Yuansheng Ni, Haoran Que, Qiyao Wang, Zhoufutu Wen, Siwei Wu, Tyshawn Hsing, Ming Xu, Zhenzhu Yang, Zekun Moore Wang, Junting Zhou, Yuelin Bai, Xingyuan Bu, Chenglin Cai, Liang Chen, Yifan Chen, Chengtuo Cheng, Tianhao Cheng, Keyi Ding, Siming Huang, Yun Huang, Yaoru Li, Yizhe Li, Zhaoqun Li, Tianhao Liang, Chengdong Lin, Hongquan Lin, Yinghao Ma, Tianyang Pang, Zhongyuan Peng, Zifan Peng, Qige Qi, Shi Qiu, Xingwei Qu, Shanghaoran Quan, Yizhou Tan, Zili Wang, Chenqing Wang, Hao Wang, Yiya Wang, YuBo Wang, Jiajun Xu, Kexin Yang, Ruibin Yuan, Yuanhao Yue, Tianyang Zhan, Chun Zhang, Jinyang Zhang, Xiyue Zhang, Xingjian Zhang, Yue Zhang, Yongchi Zhao, Xiangyu Zheng, Chenghua Zhong, Yang Gao, Zhoujun Li, Dayiheng Liu, Qian Liu, Tianyu Liu, Shiwen Ni, Junran Peng, Yujia Qin, Wenbo Su, Guoyin Wang, Shi Wang, Jian Yang, Min Yang, Meng Cao, Xiang Yue, Zhaoxiang Zhang, Wangchunshu Zhou, Jiaheng Liu, Qunshu Lin, Wenhao Huang, Ge Zhang
To address this gap, we present SuperGPQA, a comprehensive benchmark that evaluates graduate-level knowledge and reasoning capabilities across 285 disciplines.
no code implementations • 26 Dec 2024 • Jingcheng Hu, Houyi Li, Yinmin Zhang, Zili Wang, Shuigeng Zhou, Xiangyu Zhang, Heung-Yeung Shum, Daxin Jiang
Existing variants of standard Multi-Head Attention (MHA), including SOTA methods like MLA, fail to maintain equally strong performance under stringent Key-Value cache (KV cache) constraints.
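To see why the KV cache is the binding constraint, a back-of-the-envelope sketch helps; the configuration below is hypothetical, not taken from the paper:

```python
# Rough KV cache footprint for MHA vs. a grouped-query variant.
# All numbers here are assumed for illustration, not from the paper.

def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_elem=2):
    # Factor of 2 covers keys and values; fp16 storage by default.
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem

# A hypothetical 7B-style config: 32 layers, head_dim 128, 4k context.
mha = kv_cache_bytes(layers=32, kv_heads=32, head_dim=128, seq_len=4096)
gqa = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128, seq_len=4096)
print(f"MHA: {mha / 2**30:.2f} GiB, 8-KV-head GQA: {gqa / 2**30:.2f} GiB")
# -> MHA: 2.00 GiB, 8-KV-head GQA: 0.50 GiB per sequence
```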
1 code implementation • 18 Nov 2024 • Zili Wang, Robert Zhang, Kun Ding, Qi Yang, Fei Li, Shiming Xiang
Experimental results show that our continuous speculative decoding achieves a remarkable $2.33\times$ speed-up on off-the-shelf models while maintaining the output distribution.
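The accept/reject rule at the heart of speculative decoding carries over to continuous-valued outputs by swapping discrete probabilities for densities; a minimal sketch, with Gaussian densities chosen purely for illustration:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def accept_draft(p_target, q_draft, x):
    # Speculative-sampling test: keep the draft sample x with probability
    # min(1, p(x) / q(x)); in the continuous variant p and q are densities.
    return rng.random() < min(1.0, p_target(x) / q_draft(x))

# A draft model proposing from a slightly mis-specified Gaussian.
draft, target = norm(loc=0.2, scale=1.1), norm(loc=0.0, scale=1.0)
x = draft.rvs(random_state=0)
print(accept_draft(target.pdf, draft.pdf, x))
```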
no code implementations • 7 Nov 2024 • Siming Huang, Tianhao Cheng, J. K. Liu, Jiaran Hao, Liuyihan Song, Yang Xu, J. Yang, Jiaheng Liu, Chenchen Zhang, Linzheng Chai, Ruifeng Yuan, Zhaoxiang Zhang, Jie Fu, Qian Liu, Ge Zhang, Zili Wang, Yuan Qi, Yinghui Xu, Wei Chu
To address the gap, we introduce OpenCoder, a top-tier code LLM that not only achieves performance comparable to leading models but also serves as an "open cookbook" for the research community.
1 code implementation • 25 Sep 2024 • Zeyu Huang, Zihan Qiu, Zili Wang, Edoardo M. Ponti, Ivan Titov
Central to this process is the reward model (RM), which translates human feedback into training signals for optimising LLM behaviour.
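Reward models of this kind are commonly trained with a pairwise Bradley-Terry objective over human preference data; a minimal sketch of that generic loss (not necessarily this paper's exact setup):

```python
import torch
import torch.nn.functional as F

def pairwise_rm_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    # Maximise the log-probability that the human-preferred response
    # receives the higher scalar reward (Bradley-Terry model).
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# Toy scores for a batch of two preference pairs.
loss = pairwise_rm_loss(torch.tensor([1.2, 0.3]), torch.tensor([0.4, 0.9]))
```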
no code implementations • 10 Sep 2024 • Qi Yang, Binjie Mao, Zili Wang, Xing Nie, Pengfei Gao, Ying Guo, Cheng Zhen, Pengfei Yan, Shiming Xiang
These challenges encompass maintaining the content consistency between the input video and the generated audio, as well as the alignment of temporal and loudness properties within the video.
1 code implementation • 13 Aug 2024 • Zihan Qiu, Zeyu Huang, Shuang Cheng, Yizhi Zhou, Zili Wang, Ivan Titov, Jie Fu
The scaling of large language models (LLMs) has revolutionized their capabilities in various tasks, yet this growth must be matched with efficient computational strategies.
3 code implementations • 3 Aug 2024 • Zili Wang, Qi Yang, Linsu Shi, Jiazhong Yu, Qinghua Liang, Fei Li, Shiming Xiang
By characterizing attention maps of the network, we identify two key obstacles in AVS models: 1) attention dissipation, corresponding to the over-concentrated attention weights by Softmax within restricted frames, and 2) inefficient, burdensome transformer decoder, caused by narrow focus patterns in early stages.
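The over-concentration behind attention dissipation is easy to reproduce in a toy example: when one logit inside a restricted window dominates, Softmax collapses nearly all attention mass onto it (illustrative numbers, not the paper's measurements):

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # numerical stability
    e = np.exp(z)
    return e / e.sum()

# One dominant logit inside a small window grabs ~99.8% of the mass.
logits = np.array([8.0, 1.0, 0.5, 0.2])
print(softmax(logits))       # -> [0.998, ...]: over-concentrated weights
```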
no code implementations • 15 Jul 2024 • Zheyuan Zhou, Le Wang, Naiyu Fang, Zili Wang, Lemiao Qiu, Shuyou Zhang
However, there are two major challenges to the practical application of current approaches: 1) the embedding-based models suffer prohibitive computational and storage costs due to the memory bank structure; 2) the reconstructive models based on the MAE mechanism fail to detect anomalies in the unmasked regions.
Ranked #2 on 3D Anomaly Detection on Anomaly-ShapeNet
1 code implementation • 26 Jun 2024 • Ka Man Lo, Zeyu Huang, Zihan Qiu, Zili Wang, Jie Fu
Mixture-of-experts (MoE) is gaining increasing attention due to its unique properties and remarkable performance, especially for language tasks.
1 code implementation • 18 Jun 2024 • Haoze Wu, Zihan Qiu, Zili Wang, Hang Zhao, Jie Fu
Therefore, these tokens can acquire the necessary knowledge from any expert during inference and become less sensitive to the choice of experts.
no code implementations • 11 Jun 2024 • Jun Gao, Qian Qiao, Ziqiang Cao, Zili Wang, Wenjie Li
In-context learning (ICL) enables Large Language Models (LLMs) to exhibit emergent abilities on downstream tasks without updating billions of parameters.
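Concretely, ICL conditions a frozen model on a handful of demonstrations placed directly in the prompt; a minimal sketch with invented sentiment examples:

```python
# The model's weights stay frozen; task behaviour is induced purely by
# the demonstrations below, which are invented for illustration.
demos = [
    ("The movie was fantastic.", "positive"),
    ("I regret buying this.", "negative"),
]
query = "The soundtrack was wonderful."

prompt = "\n".join(f"Review: {x}\nSentiment: {y}" for x, y in demos)
prompt += f"\nReview: {query}\nSentiment:"
print(prompt)  # fed to the LLM as-is; no parameter updates are involved
```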
no code implementations • 15 Mar 2024 • Zili Wang, Sean B. Andersson, Roberto Tron
Deep learning methods have been widely used in robotic applications, making learning-enabled control design for complex nonlinear systems a promising direction.
no code implementations • 29 Feb 2024 • Shangda Wu, Xu Tan, Zili Wang, Rui Wang, Xiaobing Li, Maosong Sun
Traditional deep learning often overlooks bytes, the basic units of the digital world, where all forms of information and operations are encoded and manipulated in binary format.
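A byte-level model sidesteps modality-specific tokenizers entirely, since any file is already a sequence over a 256-symbol vocabulary; a minimal sketch:

```python
# Text, images, audio, and executables all reduce to the same token space.
data = "hello, bytes".encode("utf-8")  # a file's raw bytes work the same way
tokens = list(data)                    # each byte becomes a token id in [0, 255]
assert all(0 <= t < 256 for t in tokens)
print(tokens[:5])                      # -> [104, 101, 108, 108, 111]
```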
1 code implementation • 26 Feb 2024 • Ka Man Lo, Yiming Liang, Wenyu Du, Yuantao Fan, Zili Wang, Wenhao Huang, Lei Ma, Jie Fu
Additionally, the V-MoE-Base model trained with m2mKD achieves 3.5% higher accuracy than end-to-end training on ImageNet-1k.
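m2mKD distills between modules of teacher and student; the loss below is the generic temperature-scaled formulation (Hinton et al.) rather than the paper's module-to-module specifics:

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, T: float = 2.0):
    # Temperature-scaled KL divergence between teacher and student
    # distributions; the T*T factor keeps gradient magnitudes comparable.
    s = F.log_softmax(student_logits / T, dim=-1)
    t = F.softmax(teacher_logits / T, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * T * T
```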
1 code implementation • 25 Feb 2024 • Ruibin Yuan, Hanfeng Lin, Yi Wang, Zeyue Tian, Shangda Wu, Tianhao Shen, Ge Zhang, Yuhang Wu, Cong Liu, Ziya Zhou, Ziyang Ma, Liumeng Xue, Ziyu Wang, Qin Liu, Tianyu Zheng, Yizhi Li, Yinghao Ma, Yiming Liang, Xiaowei Chi, Ruibo Liu, Zili Wang, Pengfei Li, Jingcheng Wu, Chenghua Lin, Qifeng Liu, Tao Jiang, Wenhao Huang, Wenhu Chen, Emmanouil Benetos, Jie Fu, Gus Xia, Roger Dannenberg, Wei Xue, Shiyin Kang, Yike Guo
It is based on continually pre-training and finetuning LLaMA2 on a text-compatible music representation, ABC notation, with music treated as a second language.
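ABC notation is plain text, which is what makes it "text-compatible" for an LLM; a minimal hand-written tune (illustrative only, not from the paper's data) fits in an ordinary string:

```python
# Header fields: X index, T title, M meter, L default note length, K key.
abc_tune = """X:1
T:Illustrative Tune
M:4/4
L:1/8
K:C
CDEF GABc | c2 G2 E2 C2 |]
"""
print(abc_tune)  # the model reads and generates this text directly
```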
1 code implementation • 20 Feb 2024 • Hao Zhao, Zihan Qiu, Huijia Wu, Zili Wang, Zhaofeng He, Jie Fu
The Mixture of Experts (MoE) for language models has been proven effective in augmenting the capacity of models by dynamically routing each input token to a specific subset of experts for processing.
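A minimal sketch of that routing step, with hypothetical sizes (16-dimensional tokens, 8 experts, top-2 routing):

```python
import torch
import torch.nn.functional as F

def topk_route(x, router, k=2):
    # Each token scores all experts, keeps its k best, and renormalises
    # the kept scores into combination weights. Illustrative sketch only.
    logits = router(x)                          # [tokens, num_experts]
    weights, experts = logits.topk(k, dim=-1)   # per-token expert choices
    return F.softmax(weights, dim=-1), experts

router = torch.nn.Linear(16, 8)                 # 8 hypothetical experts
w, e = topk_route(torch.randn(4, 16), router)   # 4 tokens, 2 experts each
```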
no code implementations • 23 Jan 2024 • Naiyu Fang, Lemiao Qiu, Shuyou Zhang, Zili Wang, Kerui Hu, Jianrong Tan
This paper proposes a novel garment transfer method supervised with knowledge distillation from virtual try-on.
no code implementations • 22 Dec 2023 • Ruifeng Yuan, Shichao Sun, Yongqi Li, Zili Wang, Ziqiang Cao, Wenjie Li
With the rapid development of large language models, AI assistants like ChatGPT have become increasingly integrated into people's work and lives, but remain limited in providing personalized services.
no code implementations • 29 Sep 2023 • Naiyu Fang, Lemiao Qiu, Shuyou Zhang, Zili Wang, Zheyuan Zhou, Kerui Hu
To address these issues, we propose the GSDC Transformer, an efficient and effective component for cue fusion in monocular multi-frame depth estimation.
1 code implementation • 1 Aug 2023 • Sirui Hong, Mingchen Zhuge, Jiaqi Chen, Xiawu Zheng, Yuheng Cheng, Ceyao Zhang, Jinlin Wang, Zili Wang, Steven Ka Shing Yau, Zijuan Lin, Liyang Zhou, Chenyu Ran, Lingfeng Xiao, Chenglin Wu, Jürgen Schmidhuber
Remarkable progress has been made on automated problem solving through societies of agents based on large language models (LLMs).
2 code implementations • 9 Jun 2023 • Zhouhong Gu, Xiaoxuan Zhu, Haoning Ye, Lin Zhang, Jianchen Wang, Yixin Zhu, Sihang Jiang, Zhuozhi Xiong, Zihan Li, Weijie Wu, Qianyu He, Rui Xu, Wenhao Huang, Jingping Liu, Zili Wang, Shusen Wang, Weiguo Zheng, Hongwei Feng, Yanghua Xiao
New Natural Language Processing (NLP) benchmarks are urgently needed to align with the rapid development of large language models (LLMs).
2 code implementations • 31 May 2023 • Yizhi Li, Ruibin Yuan, Ge Zhang, Yinghao Ma, Xingran Chen, Hanzhi Yin, Chenghao Xiao, Chenghua Lin, Anton Ragni, Emmanouil Benetos, Norbert Gyenge, Roger Dannenberg, Ruibo Liu, Wenhu Chen, Gus Xia, Yemin Shi, Wenhao Huang, Zili Wang, Yike Guo, Jie Fu
Although SSL has been proven effective in speech and audio, its application to music audio has yet to be thoroughly explored.
1 code implementation • 24 May 2023 • Dongjie Yang, Ruifeng Yuan, Yuantao Fan, Yifei Yang, Zili Wang, Shusen Wang, Hai Zhao
Therefore, we propose a method called RefGPT to generate a large volume of truthful and customized dialogues without worrying about factual errors caused by model hallucination.
no code implementations • 2 May 2023 • Gašper Beguš, Thomas Lu, Zili Wang
We introduce spontaneous concatenation: a phenomenon where convolutional neural networks (CNNs) trained on acoustic recordings of individual words start generating outputs with two or even three words concatenated without ever accessing data with multiple words in the input.
1 code implementation • 18 Apr 2023 • Naiyu Fang, Lemiao Qiu, Shuyou Zhang, Zili Wang, Kerui Hu
To address these limitations, we propose a novel virtual try-on method via progressive inference paradigm (PGVTON) that leverages a top-down inference pipeline and a general garment try-on strategy.
no code implementations • 7 Apr 2023 • Naiyu Fang, Lemiao Qiu, Shuyou Zhang, Zili Wang, Kerui Hu, Kang Wang
To save the computation increase caused by this hierarchical framework, we exploit the cross-scale Transformer to learn feature relationships in a reversed-aligning way, and leverage the residual connection of BEV features to facilitate information transmission between scales.
no code implementations • 29 Nov 2022 • Ruifeng Yuan, Zili Wang, Ziqiang Cao, Wenjie Li
Drawing inspiration from prefix-tuning, we integrate the task knowledge from text summarization and question answering into a properly designed prefix and apply the merged prefix to query-focused summarization.
no code implementations • 20 Sep 2022 • Chang Sun, Zili Wang, Shuyou Zhang, Le Wang, Jianrong Tan
In the second stage, following the physical logic, the PE-NET is assembled from the ES-NET and SP-NET and then fine-tuned with the small-sample BMT dataset and a composite loss function.
no code implementations • 3 Jul 2022 • Chang Sun, Zili Wang, Shuyou Zhang, Taotao Zhou, Jie Li, Jianrong Tan
To address this issue, a digital-twin-enhanced (DT-enhanced) metal tube bending forming real-time prediction method based on multi-source-input multi-task learning (MTL) is proposed.
no code implementations • 22 Mar 2022 • Zili Wang, Sean B. Andersson, Roberto Tron
We form and solve a nonlinear optimization problem with the sum of path lengths of the agent trajectories as the objective and subject to the original equilibria and global convergence conditions for formation control.
1 code implementation • COLING 2020 • Ruifeng Yuan, Zili Wang, Wenjie Li
We also introduce a hierarchical structure, which incorporates multiple levels of granularity of the textual information into the model.
no code implementations • 19 Oct 2020 • Yang Yang, Junmei Hao, Canjia Li, Zili Wang, Jingang Wang, Fuzheng Zhang, Rao Fu, Peixu Hou, Gong Zhang, Zhongyuan Wang
Existing work on tip generation does not take the query into consideration, which limits the impact of tips in search scenarios.
1 code implementation • WS 2019 • Zili Wang
In stage 1 of the gendered pronoun resolution task, a variant of this model, trained with the fine-tuning approach, reduced the multi-class logarithmic loss to 0.3033 in 5-fold cross-validation on the training set and 0.2795 on the test set.
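For reference, the metric quoted here is the mean negative log-probability assigned to the true class; a minimal sketch with toy numbers:

```python
import numpy as np

def multiclass_log_loss(y_true, probs, eps=1e-15):
    # Mean negative log-probability of the true class for each example.
    probs = np.clip(probs, eps, 1 - eps)
    return -np.mean(np.log(probs[np.arange(len(y_true)), y_true]))

# Toy check with three classes (A, B, NEITHER in the pronoun task).
probs = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])
print(multiclass_log_loss(np.array([0, 1]), probs))  # ~0.29
```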