Search Results for author: Shiwen Ni

Found 10 papers, 4 papers with code

MoZIP: A Multilingual Benchmark to Evaluate Large Language Models in Intellectual Property

1 code implementation • 26 Feb 2024 • Shiwen Ni, Minghuan Tan, Yuelin Bai, Fuqiang Niu, Min Yang, BoWen Zhang, Ruifeng Xu, Xiaojun Chen, Chengming Li, Xiping Hu, Ye Li, Jianping Fan

In this paper, we contribute a new benchmark, the first Multilingual-oriented quiZ on Intellectual Property (MoZIP), for the evaluation of LLMs in the IP domain.

Language Modelling, Large Language Model +2

Layer-wise Regularized Dropout for Neural Language Models

no code implementations • 26 Feb 2024 • Shiwen Ni, Min Yang, Ruifeng Xu, Chengming Li, Xiping Hu

To resolve the inconsistency between training and inference caused by the randomness of dropout, some studies use consistency training to regularize dropout at the output layer (this idea is sketched below).

Abstractive Text Summarization, Machine Translation +1
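The output-layer consistency training mentioned in the snippet can be illustrated as follows: run the same batch through the model twice so that independent dropout masks are sampled, and add a symmetric KL term that pulls the two predictive distributions together (in the spirit of R-Drop). This is a minimal sketch of the baseline idea the paper builds on, not the paper's layer-wise method; the `alpha` weight and the HuggingFace-style `.logits` interface are assumptions.

```python
import torch.nn.functional as F

def consistency_dropout_loss(model, inputs, labels, alpha=1.0):
    """Output-layer consistency training for dropout (R-Drop-style sketch).

    Two forward passes over the same batch see independent dropout masks;
    a symmetric KL term regularizes the resulting output distributions.
    """
    logits1 = model(**inputs).logits  # first stochastic forward pass
    logits2 = model(**inputs).logits  # second pass, new dropout mask

    # Task loss, averaged over both passes.
    ce = 0.5 * (F.cross_entropy(logits1, labels) +
                F.cross_entropy(logits2, labels))

    # Symmetric KL between the two predictive distributions.
    logp = F.log_softmax(logits1, dim=-1)
    logq = F.log_softmax(logits2, dim=-1)
    kl = 0.5 * (F.kl_div(logp, logq, log_target=True, reduction="batchmean") +
                F.kl_div(logq, logp, log_target=True, reduction="batchmean"))

    return ce + alpha * kl
```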

History, Development, and Principles of Large Language Models-An Introductory Survey

no code implementations • 10 Feb 2024 • Zhibo Chu, Shiwen Ni, Zichong Wang, Xi Feng, Chengming Li, Xiping Hu, Ruifeng Xu, Min Yang, Wenbin Zhang

Language models serve as a cornerstone of natural language processing (NLP), using mathematical methods to generalize the laws and knowledge of language for prediction and generation.

Language Modelling

E-EVAL: A Comprehensive Chinese K-12 Education Evaluation Benchmark for Large Language Models

1 code implementation • 29 Jan 2024 • Jinchang Hou, Chang Ao, Haihong Wu, Xiangtao Kong, Zhigang Zheng, Daijia Tang, Chengming Li, Xiping Hu, Ruifeng Xu, Shiwen Ni, Min Yang

The integration of LLMs and education is becoming ever closer; however, there is currently no benchmark for evaluating LLMs that focuses on the Chinese K-12 education domain.

Ethics, Multiple-choice

Forgetting before Learning: Utilizing Parametric Arithmetic for Knowledge Updating in Large Language Models

no code implementations • 14 Nov 2023 • Shiwen Ni, Dingwei Chen, Chengming Li, Xiping Hu, Ruifeng Xu, Min Yang

In this paper, we propose a new paradigm for fine-tuning called F-Learning (Forgetting before Learning), which employs parametric arithmetic to facilitate the forgetting of old knowledge and learning of new knowledge.
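A minimal sketch of what such parametric arithmetic can look like: fine-tune a copy of the model on the old knowledge, treat the resulting weight delta as a direction that encodes it, subtract a scaled version of that delta to "forget", then fine-tune on the new knowledge. The `finetune_fn` helper, the `lam` scale, and the three-step structure are illustrative assumptions, not necessarily the paper's exact recipe.

```python
import copy
import torch

def forget_then_learn(pretrained, finetune_fn, old_data, new_data, lam=0.5):
    """F-Learning-style knowledge updating via parameter arithmetic (sketch)."""
    # 1. Fine-tune a copy on the OLD knowledge; its weight delta encodes it.
    old_model = finetune_fn(copy.deepcopy(pretrained), old_data)

    # 2. Parametric arithmetic: subtract the scaled delta to forget.
    forgot = copy.deepcopy(pretrained)
    with torch.no_grad():
        for p_f, p_pre, p_old in zip(forgot.parameters(),
                                     pretrained.parameters(),
                                     old_model.parameters()):
            p_f.copy_(p_pre - lam * (p_old - p_pre))  # forget before learning

    # 3. Fine-tune the forgetting-adjusted model on the NEW knowledge.
    return finetune_fn(forgot, new_data)
```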

ELECTRA is a Zero-Shot Learner, Too

1 code implementation • 17 Jul 2022 • Shiwen Ni, Hung-Yu Kao

Numerically, compared to MLM-RoBERTa-large and MLM-BERT-large, our RTD-ELECTRA-large achieves average improvements of about 8.4% and 13.7%, respectively, across all 15 tasks (the underlying replaced-token-detection prompting is sketched below).

Language Modelling, SST-2 +1
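The replaced-token-detection (RTD) zero-shot idea implied by the title can be sketched with HuggingFace's `ElectraForPreTraining` discriminator: fill a prompt template with each candidate label word and pick the one the RTD head judges most "original". The checkpoint, template, and verbalizers below are illustrative assumptions, not the paper's exact setup.

```python
import torch
from transformers import ElectraForPreTraining, ElectraTokenizerFast

# Illustrative checkpoint; any ELECTRA discriminator works the same way.
tok = ElectraTokenizerFast.from_pretrained("google/electra-large-discriminator")
model = ElectraForPreTraining.from_pretrained("google/electra-large-discriminator")
model.eval()

def rtd_zero_shot(text, verbalizers=("great", "terrible")):
    """Score each candidate label word with the RTD head; the word the
    discriminator finds least likely to be a replacement wins.
    Assumes each verbalizer is a single wordpiece in the vocabulary."""
    scores = []
    for word in verbalizers:
        prompt = f"{text} It was {word}."   # illustrative template
        enc = tok(prompt, return_tensors="pt")
        with torch.no_grad():
            logits = model(**enc).logits[0]  # per-token "replaced" logits
        word_id = tok.convert_tokens_to_ids(word)
        pos = (enc.input_ids[0] == word_id).nonzero()[0].item()
        scores.append(logits[pos].item())    # higher = more "replaced"
    return verbalizers[int(torch.tensor(scores).argmin())]

print(rtd_zero_shot("The movie was full of charm and wit."))
```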

True or False: Does the Deep Learning Model Learn to Detect Rumors?

no code implementations • 1 Dec 2021 • Shiwen Ni, Jiawen Li, Hung-Yu Kao

It is difficult for humans to distinguish true rumors from false ones, yet current deep learning models can surpass humans and achieve excellent accuracy on many rumor datasets.

Common Sense Reasoning

HAT4RD: Hierarchical Adversarial Training for Rumor Detection on Social Media

no code implementations • 29 Aug 2021 • Shiwen Ni, Jiawen Li, Hung-Yu Kao

As such, the robustness and generalization of current rumor detection models are called into question.

DropAttack: A Masked Weight Adversarial Training Method to Improve Generalization of Neural Networks

1 code implementation • 29 Aug 2021 • Shiwen Ni, Jiawen Li, Hung-Yu Kao

We compare the proposed method with other adversarial training and regularization methods, and ours achieves state-of-the-art results on all datasets (a sketch of the masked-weight perturbation follows below).

Adversarial Attack, Adversarial Defense
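One masked-weight adversarial training step can be sketched as follows: take a clean forward/backward pass, perturb a randomly masked subset of the weights along their gradients, accumulate an adversarial loss, then restore the weights before the optimizer step. The perturbation size `eps`, `mask_prob`, and the single-perturbation structure are illustrative assumptions, not DropAttack's exact algorithm.

```python
import torch

def dropattack_step(model, loss_fn, inputs, labels, eps=0.02, mask_prob=0.5):
    """One masked-weight adversarial training step (sketch).
    Assumes `model(inputs)` returns logits directly."""
    # Clean forward/backward pass to obtain weight gradients.
    clean_loss = loss_fn(model(inputs), labels)
    clean_loss.backward()

    # Perturb a randomly masked subset of weights along the gradient.
    backups = []
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                continue
            mask = (torch.rand_like(p) < mask_prob).float()
            r_adv = eps * mask * p.grad / (p.grad.norm() + 1e-12)
            backups.append((p, p.detach().clone()))
            p.add_(r_adv)

    # Adversarial loss on the perturbed weights; gradients accumulate.
    adv_loss = loss_fn(model(inputs), labels)
    adv_loss.backward()

    # Restore original weights; optimizer.step() then applies the
    # combined clean + adversarial gradients.
    with torch.no_grad():
        for p, saved in backups:
            p.copy_(saved)

    return clean_loss.item() + adv_loss.item()
```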
