1 code implementation • COLING 2022 • Yiming Ju, Weikang Wang, Yuanzhe Zhang, Suncong Zheng, Kang Liu, Jun Zhao
To bridge the gap, we propose a new task: conditional question answering with hierarchical multi-span answers, where both the hierarchical relations and the conditions need to be extracted.
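As a purely hypothetical illustration (the field names and example text below are not taken from the paper), an answer in this task could be represented as a nested structure in which each answer span carries the conditions under which it applies plus any child spans:

```python
# Hypothetical sketch of a hierarchical multi-span answer with conditions;
# field names and example spans are illustrative, not the paper's data format.
answer = {
    "span": "You may be eligible for a refund",
    "conditions": ["if the item arrives damaged"],
    "children": [
        {"span": "a full refund", "conditions": ["within 30 days of purchase"], "children": []},
        {"span": "store credit", "conditions": ["after 30 days"], "children": []},
    ],
}
```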
no code implementations • EMNLP 2021 • Yiming Ju, Yuanzhe Zhang, Zhixing Tian, Kang Liu, Xiaohuan Cao, Wenting Zhao, Jinlong Li, Jun Zhao
Multiple-choice MRC is one of the most studied tasks in MRC due to its convenient evaluation and flexible answer format.
no code implementations • 12 Nov 2024 • Yiming Ju, Huanhuan Ma
With the release of ChatGPT in 2022, large-scale language models gained widespread attention.
4 code implementations • 24 Oct 2024 • Shuhao Gu, Jialing Zhang, Siyuan Zhou, Kevin Yu, Zhaohu Xing, Liangdong Wang, Zhou Cao, Jintao Jia, Zhuoyi Zhang, YiXuan Wang, Zhenchong Hu, Bo-Wen Zhang, Jijie Li, Dong Liang, Yingli Zhao, Songjing Wang, Yulong Ao, Yiming Ju, Huanhuan Ma, Xiaotong Li, Haiwen Diao, Yufeng Cui, Xinlong Wang, Yaoqi Liu, Fangxiang Feng, Guang Liu
Despite the availability of several open-source multimodal datasets, limitations in the scale and quality of open-source instruction data hinder the performance of VLMs trained on these datasets, leading to a significant gap compared to models trained on closed-source data.
no code implementations • 1 Oct 2024 • Yiming Ju, Ziyi Ni, Xingrun Xing, Zhixiong Zeng, Hanyu Zhao, Siqi Fan, Zheng Zhang
Supervised fine-tuning (SFT) is crucial for adapting Large Language Models (LLMs) to specific tasks.
no code implementations • 11 Sep 2024 • Hanyu Zhao, Li Du, Yiming Ju, ChengWei Wu, Tengfei Pan
With the availability of various instruction datasets, a pivotal challenge is how to effectively select and integrate these instructions to fine-tune large language models (LLMs).
1 code implementation • 13 Aug 2024 • Bo-Wen Zhang, Liangdong Wang, Ye Yuan, Jijie Li, Shuhao Gu, Mengdi Zhao, Xinya Wu, Guang Liu, ChengWei Wu, Hanyu Zhao, Li Du, Yiming Ju, Quanyue Ma, Yulong Ao, Yingli Zhao, Songhe Zhu, Zhou Cao, Dong Liang, Yonghua Lin, Ming Zhang, Shunfei Wang, Yanxin Zhou, Min Ye, Xuekai Chen, Xinyang Yu, Xiangjun Huang, Jian Yang
In this paper, we present AquilaMoE, a cutting-edge bilingual 8*16B Mixture of Experts (MoE) language model that has 8 experts with 16 billion parameters each and is developed using an innovative training methodology called EfficientScale.
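As a rough, hedged sketch of the general architecture described above (a toy top-2 router over 8 experts with toy layer sizes, not the actual AquilaMoE configuration, and not the EfficientScale training recipe), a Mixture-of-Experts layer can be written as:

```python
# Minimal toy MoE layer: a learned router picks the top-2 of 8 experts per token
# and combines their outputs with softmax-normalized gate weights.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    def __init__(self, d_model=64, d_ff=256, n_experts=8, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])
        self.router = nn.Linear(d_model, n_experts)
        self.top_k = top_k

    def forward(self, x):                       # x: (n_tokens, d_model)
        gate_logits = self.router(x)            # (n_tokens, n_experts)
        weights, idx = gate_logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)    # normalize over the selected experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e        # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(10, 64)
print(ToyMoELayer()(tokens).shape)  # torch.Size([10, 64])
```

Because only the top-k experts are active for each token, the total parameter count of such a model (here 8×16B) can greatly exceed the parameters actually used per forward pass.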
1 code implementation • 5 Jun 2024 • Xingrun Xing, Zheng Zhang, Ziyi Ni, Shitao Xiao, Yiming Ju, Siqi Fan, Yequan Wang, Jiajun Zhang, Guoqi Li
We plug this elastic bi-spiking mechanism into language modeling, producing a model named SpikeLM.
2 code implementations • 28 Sep 2023 • Yiming Ju, Xingrun Xing, Zhixiong Zeng
KLoB can serve as a benchmark for evaluating existing locating methods in language models and offers a way to reassess the validity of the locality hypothesis of factual knowledge.
no code implementations • 31 Aug 2023 • Zhongtao Jiang, Yuanzhe Zhang, Yiming Ju, Kang Liu
We present a general framework for unsupervised text style transfer with deep generative models.
no code implementations • 24 Oct 2022 • Yiming Ju, Yuanzhe Zhang, Kang Liu, Jun Zhao
The opaqueness of deep NLP models has motivated the development of methods for interpreting how deep models make predictions.
no code implementations • ACL 2022 • Yiming Ju, Yuanzhe Zhang, Zhao Yang, Zhongtao Jiang, Kang Liu, Jun Zhao
Meanwhile, since the reasoning process of deep models is inaccessible, researchers have designed various evaluation methods to demonstrate their arguments.