Search Results for author: Shiwen Ni

Found 10 papers, 4 papers with code

MoZIP: A Multilingual Benchmark to Evaluate Large Language Models in Intellectual Property

1 code implementation • 26 Feb 2024 • Shiwen Ni, Minghuan Tan, Yuelin Bai, Fuqiang Niu, Min Yang, BoWen Zhang, Ruifeng Xu, Xiaojun Chen, Chengming Li, Xiping Hu, Ye Li, Jianping Fan

In this paper, we contribute a new benchmark, the first Multilingual-oriented quiZ on Intellectual Property (MoZIP), for the evaluation of LLMs in the IP domain.

Language Modelling, Large Language Model +2

Layer-wise Regularized Dropout for Neural Language Models

no code implementations • 26 Feb 2024 • Shiwen Ni, Min Yang, Ruifeng Xu, Chengming Li, Xiping Hu

To resolve the inconsistency between training and inference caused by the randomness of dropout, some studies use consistency training to regularize dropout at the output layer (this idea is sketched below).

Abstractive Text Summarization, Machine Translation +1
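The output-layer consistency training mentioned in the snippet can be illustrated as follows: run the same batch through the model twice so that independent dropout masks are sampled, and add a symmetric KL term that pulls the two predictive distributions together (in the spirit of R-Drop). This is a minimal sketch of the baseline idea the paper builds on, not the paper's layer-wise method; the `alpha` weight and the HuggingFace-style `.logits` interface are assumptions.

```python
import torch.nn.functional as F

def consistency_dropout_loss(model, inputs, labels, alpha=1.0):
    """Output-layer consistency training for dropout (R-Drop-style sketch).

    Two forward passes over the same batch see independent dropout masks;
    a symmetric KL term regularizes the resulting output distributions.
    """
    logits1 = model(**inputs).logits  # first stochastic forward pass
    logits2 = model(**inputs).logits  # second pass, new dropout mask

    # Task loss, averaged over both passes.
    ce = 0.5 * (F.cross_entropy(logits1, labels) +
                F.cross_entropy(logits2, labels))

    # Symmetric KL between the two predictive distributions.
    logp = F.log_softmax(logits1, dim=-1)
    logq = F.log_softmax(logits2, dim=-1)
    kl = 0.5 * (F.kl_div(logp, logq, log_target=True, reduction="batchmean") +
                F.kl_div(logq, logp, log_target=True, reduction="batchmean"))

    return ce + alpha * kl
```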

History, Development, and Principles of Large Language Models-An Introductory Survey

no code implementations • 10 Feb 2024 • Zhibo Chu, Shiwen Ni, Zichong Wang, Xi Feng, Chengming Li, Xiping Hu, Ruifeng Xu, Min Yang, Wenbin Zhang

Language models serve as a cornerstone of natural language processing (NLP), using mathematical methods to generalize the laws and knowledge of language for prediction and generation.

Language Modelling

E-EVAL: A Comprehensive Chinese K-12 Education Evaluation Benchmark for Large Language Models

1 code implementation • 29 Jan 2024 • Jinchang Hou, Chang Ao, Haihong Wu, Xiangtao Kong, Zhigang Zheng, Daijia Tang, Chengming Li, Xiping Hu, Ruifeng Xu, Shiwen Ni, Min Yang

The integration of LLMs and education is becoming ever closer; however, there is currently no benchmark for evaluating LLMs that focuses on the Chinese K-12 education domain.

Ethics, Multiple-choice

Forgetting before Learning: Utilizing Parametric Arithmetic for Knowledge Updating in Large Language Models

no code implementations • 14 Nov 2023 • Shiwen Ni, Dingwei Chen, Chengming Li, Xiping Hu, Ruifeng Xu, Min Yang

In this paper, we propose a new paradigm for fine-tuning called F-Learning (Forgetting before Learning), which employs parametric arithmetic to facilitate the forgetting of old knowledge and learning of new knowledge.
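A minimal sketch of what such parametric arithmetic can look like: fine-tune a copy of the model on the old knowledge, treat the resulting weight delta as a direction that encodes it, subtract a scaled version of that delta to "forget", then fine-tune on the new knowledge. The `finetune_fn` helper, the `lam` scale, and the three-step structure are illustrative assumptions, not necessarily the paper's exact recipe.

```python
import copy
import torch

def forget_then_learn(pretrained, finetune_fn, old_data, new_data, lam=0.5):
    """F-Learning-style knowledge updating via parameter arithmetic (sketch)."""
    # 1. Fine-tune a copy on the OLD knowledge; its weight delta encodes it.
    old_model = finetune_fn(copy.deepcopy(pretrained), old_data)

    # 2. Parametric arithmetic: subtract the scaled delta to forget.
    forgot = copy.deepcopy(pretrained)
    with torch.no_grad():
        for p_f, p_pre, p_old in zip(forgot.parameters(),
                                     pretrained.parameters(),
                                     old_model.parameters()):
            p_f.copy_(p_pre - lam * (p_old - p_pre))  # forget before learning

    # 3. Fine-tune the forgetting-adjusted model on the NEW knowledge.
    return finetune_fn(forgot, new_data)
```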

ELECTRA is a Zero-Shot Learner, Too

1 code implementation • 17 Jul 2022 • Shiwen Ni, Hung-Yu Kao

Numerically, compared to MLM-RoBERTa-large and MLM-BERT-large, our RTD-ELECTRA-large achieves average improvements of about 8.4% and 13.7%, respectively, across all 15 tasks (the underlying replaced-token-detection prompting is sketched below).

Language Modelling, SST-2 +1
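The replaced-token-detection (RTD) zero-shot idea implied by the title can be sketched with HuggingFace's `ElectraForPreTraining` discriminator: fill a prompt template with each candidate label word and pick the one the RTD head judges most "original". The checkpoint, template, and verbalizers below are illustrative assumptions, not the paper's exact setup.

```python
import torch
from transformers import ElectraForPreTraining, ElectraTokenizerFast

# Illustrative checkpoint; any ELECTRA discriminator works the same way.
tok = ElectraTokenizerFast.from_pretrained("google/electra-large-discriminator")
model = ElectraForPreTraining.from_pretrained("google/electra-large-discriminator")
model.eval()

def rtd_zero_shot(text, verbalizers=("great", "terrible")):
    """Score each candidate label word with the RTD head; the word the
    discriminator finds least likely to be a replacement wins.
    Assumes each verbalizer is a single wordpiece in the vocabulary."""
    scores = []
    for word in verbalizers:
        prompt = f"{text} It was {word}."   # illustrative template
        enc = tok(prompt, return_tensors="pt")
        with torch.no_grad():
            logits = model(**enc).logits[0]  # per-token "replaced" logits
        word_id = tok.convert_tokens_to_ids(word)
        pos = (enc.input_ids[0] == word_id).nonzero()[0].item()
        scores.append(logits[pos].item())    # higher = more "replaced"
    return verbalizers[int(torch.tensor(scores).argmin())]

print(rtd_zero_shot("The movie was full of charm and wit."))
```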

True or False: Does the Deep Learning Model Learn to Detect Rumors?

no code implementations • 1 Dec 2021 • Shiwen Ni, Jiawen Li, Hung-Yu Kao

It is difficult for humans to distinguish true rumors from false ones, yet current deep learning models can surpass humans and achieve excellent accuracy on many rumor datasets.

Common Sense Reasoning

HAT4RD: Hierarchical Adversarial Training for Rumor Detection on Social Media

no code implementations • 29 Aug 2021 • Shiwen Ni, Jiawen Li, Hung-Yu Kao

As such, the robustness and generalization of current rumor detection models are called into question.

DropAttack: A Masked Weight Adversarial Training Method to Improve Generalization of Neural Networks

1 code implementation • 29 Aug 2021 • Shiwen Ni, Jiawen Li, Hung-Yu Kao

We compare the proposed method with other adversarial training and regularization methods, and ours achieves state-of-the-art results on all datasets (a sketch of the masked-weight perturbation follows below).

Adversarial Attack, Adversarial Defense
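One masked-weight adversarial training step can be sketched as follows: take a clean forward/backward pass, perturb a randomly masked subset of the weights along their gradients, accumulate an adversarial loss, then restore the weights before the optimizer step. The perturbation size `eps`, `mask_prob`, and the single-perturbation structure are illustrative assumptions, not DropAttack's exact algorithm.

```python
import torch

def dropattack_step(model, loss_fn, inputs, labels, eps=0.02, mask_prob=0.5):
    """One masked-weight adversarial training step (sketch).
    Assumes `model(inputs)` returns logits directly."""
    # Clean forward/backward pass to obtain weight gradients.
    clean_loss = loss_fn(model(inputs), labels)
    clean_loss.backward()

    # Perturb a randomly masked subset of weights along the gradient.
    backups = []
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                continue
            mask = (torch.rand_like(p) < mask_prob).float()
            r_adv = eps * mask * p.grad / (p.grad.norm() + 1e-12)
            backups.append((p, p.detach().clone()))
            p.add_(r_adv)

    # Adversarial loss on the perturbed weights; gradients accumulate.
    adv_loss = loss_fn(model(inputs), labels)
    adv_loss.backward()

    # Restore original weights; optimizer.step() then applies the
    # combined clean + adversarial gradients.
    with torch.no_grad():
        for p, saved in backups:
            p.copy_(saved)

    return clean_loss.item() + adv_loss.item()
```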
