Search Results for author: Zheyu Zhang

Found 7 papers, 4 papers with code

Baby's CoThought: Leveraging Large Language Models for Enhanced Reasoning in Compact Models

1 code implementation • 3 Aug 2023 • Zheyu Zhang, Han Yang, Bolei Ma, David Rügamer, Ercong Nie

Large Language Models (LLMs) demonstrate remarkable performance on a variety of Natural Language Understanding (NLU) tasks, primarily due to their in-context learning ability.

Natural Language Understanding · Question Answering

ModuleFormer: Modularity Emerges from Mixture-of-Experts

1 code implementation • 7 Jun 2023 • Yikang Shen, Zheyu Zhang, Tianyou Cao, Shawn Tan, Zhenfang Chen, Chuang Gan

In our experiments, we found that the modular architecture enables three important abilities for large pre-trained language models: 1) Efficiency: since ModuleFormer only activates a subset of its modules for each input token, it can match the performance of dense LLMs with more than twice the throughput; 2) Extendability: ModuleFormer is more immune to catastrophic forgetting than dense LLMs and can easily be extended with new modules to learn knowledge not included in the training data; 3) Specialisation: finetuning ModuleFormer can specialise a subset of modules to the finetuning task, and the task-unrelated modules can be easily pruned for lightweight deployment.

Language Modelling
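The sparse activation behind the efficiency claim can be sketched as top-k expert routing, a minimal illustration of the general mixture-of-experts idea rather than ModuleFormer's actual implementation (all weights and dimensions below are made up):

```python
import numpy as np

def moe_forward(x, gate_w, expert_ws, k=2):
    """Sparse mixture-of-experts layer: route each token to its top-k experts."""
    # Router scores: one logit per expert for each token.
    logits = x @ gate_w                        # (n_tokens, n_experts)
    topk = np.argsort(logits, axis=1)[:, -k:]  # indices of the k best experts
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        # Softmax over the selected experts only.
        sel = topk[t]
        w = np.exp(logits[t, sel] - logits[t, sel].max())
        w /= w.sum()
        # Weighted sum of the chosen experts' outputs; the rest stay inactive,
        # which is where the throughput saving comes from.
        for wi, e in zip(w, sel):
            out[t] += wi * np.tanh(x[t] @ expert_ws[e])
    return out

rng = np.random.default_rng(0)
d, n_experts = 8, 4
x = rng.normal(size=(5, d))                  # 5 toy "tokens"
gate_w = rng.normal(size=(d, n_experts))
expert_ws = rng.normal(size=(n_experts, d, d))
y = moe_forward(x, gate_w, expert_ws, k=2)
print(y.shape)  # (5, 8)
```

With k=2 of 4 experts active, each token touches only half the expert parameters; extending the model means appending new entries to `expert_ws` without disturbing the trained ones.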

mPLM-Sim: Unveiling Better Cross-Lingual Similarity and Transfer in Multilingual Pretrained Language Models

no code implementations • 23 May 2023 • Peiqin Lin, Chengzhi Hu, Zheyu Zhang, André F. T. Martins, Hinrich Schütze

Recent multilingual pretrained language models (mPLMs) have been shown to encode strong language-specific signals, which are not explicitly provided during pretraining.

Open-Ended Question Answering · Zero-Shot Cross-Lingual Transfer
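A common baseline for cross-lingual similarity in an mPLM, shown here only as a stand-in since the excerpt does not describe mPLM-Sim's actual method, is cosine similarity between mean-pooled sentence representations per language; the language names and embeddings below are entirely hypothetical:

```python
import numpy as np

def language_similarity(emb_a, emb_b):
    """Cosine similarity between two languages' mean-pooled sentence embeddings.

    emb_a, emb_b: (n_sentences, dim) arrays of sentence representations
    taken from a multilingual pretrained model (synthetic here).
    """
    mean_a, mean_b = emb_a.mean(axis=0), emb_b.mean(axis=0)
    return float(mean_a @ mean_b / (np.linalg.norm(mean_a) * np.linalg.norm(mean_b)))

rng = np.random.default_rng(0)
base = rng.normal(size=(100, 32))                 # one language's embeddings
close = base + 0.1 * rng.normal(size=(100, 32))   # a closely related language
distant = rng.normal(size=(100, 32))              # an unrelated language
print(language_similarity(base, close) > language_similarity(base, distant))
```

Ranking languages by such a score is one way to pick source languages for zero-shot cross-lingual transfer.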

Unbiased Gradient Boosting Decision Tree with Unbiased Feature Importance

1 code implementation • 18 May 2023 • Zheyu Zhang, Tianping Zhang, Jian Li

To this end, we provide a fine-grained analysis of bias in GBDT and demonstrate that the bias originates from 1) the systematic bias in the gain estimation of each split and 2) the bias in the split finding algorithm resulting from the use of the same data to evaluate the split improvement and determine the best split.

Feature Importance · Feature Selection
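The second bias source, using the same data both to pick the best split and to score it, can be illustrated with a toy sketch: on a pure-noise target, the winning candidate split looks useful on the data that selected it but not on held-out data. This uses plain variance-reduction gain as an illustration, not the paper's unbiased estimator:

```python
import numpy as np

def split_gain(x, y, thr):
    """Variance-reduction gain of splitting feature x at threshold thr."""
    left, right = y[x <= thr], y[x > thr]
    if len(left) == 0 or len(right) == 0:
        return 0.0
    total = len(y) * y.var()
    return total - len(left) * left.var() - len(right) * right.var()

rng = np.random.default_rng(0)
x = rng.uniform(size=400)
y = rng.normal(size=400)          # pure noise: no split has real value
train, hold = slice(0, 200), slice(200, 400)

# Evaluate many candidate thresholds on the training half and keep the best.
thresholds = np.quantile(x[train], np.linspace(0.1, 0.9, 30))
gains = [split_gain(x[train], y[train], t) for t in thresholds]
best = thresholds[int(np.argmax(gains))]

print(max(gains))                          # looks clearly positive
print(split_gain(x[hold], y[hold], best))  # typically far smaller on held-out data
```

Taking the maximum over many candidates inflates the in-sample gain even when every split is worthless, which is exactly the selection effect the paper's fine-grained analysis targets.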

Discovering Customer-Service Dialog System with Semi-Supervised Learning and Coarse-to-Fine Intent Detection

no code implementations • 23 Dec 2022 • Zhitong Yang, Xing Ma, Anqi Liu, Zheyu Zhang

Task-oriented dialog (TOD) aims to assist users in achieving specific goals through multi-turn conversation.

Intent Detection

Indoor Room Occupancy Counting Based on LSTM and Environmental Sensor

no code implementations • 5 Dec 2022 • Zheyu Zhang

This paper estimates classroom occupancy from CO2 sensor readings using a deep learning technique, Long Short-Term Memory (LSTM).
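A minimal sketch of the LSTM recurrence applied to a window of CO2 readings; the weights, window, and normalisation below are toy assumptions, since the excerpt does not describe the paper's actual architecture or preprocessing:

```python
import numpy as np

def lstm_step(x_t, h, c, W, U, b):
    """One LSTM step: input/forget/output gates and a candidate cell state."""
    z = W @ x_t + U @ h + b                  # stacked pre-activations, (4*hidden,)
    hd = len(h)
    i = 1 / (1 + np.exp(-z[:hd]))            # input gate
    f = 1 / (1 + np.exp(-z[hd:2*hd]))        # forget gate
    o = 1 / (1 + np.exp(-z[2*hd:3*hd]))      # output gate
    g = np.tanh(z[3*hd:])                    # candidate state
    c = f * c + i * g                        # cell state carries long-term memory
    h = o * np.tanh(c)
    return h, c

rng = np.random.default_rng(1)
hidden, n_features = 16, 1                   # one feature: the CO2 reading
W = rng.normal(scale=0.1, size=(4 * hidden, n_features))
U = rng.normal(scale=0.1, size=(4 * hidden, hidden))
b = np.zeros(4 * hidden)
w_out = rng.normal(scale=0.1, size=hidden)   # regression head (untrained)

co2 = np.array([420.0, 480.0, 560.0, 650.0, 700.0])  # a toy window of readings
h, c = np.zeros(hidden), np.zeros(hidden)
for x_t in (co2 - 500.0) / 100.0:                     # crude normalisation
    h, c = lstm_step(np.array([x_t]), h, c, W, U, b)
occupancy = float(w_out @ h)                          # occupancy estimate
print(occupancy)
```

In practice the head would be trained against ground-truth head counts, and a rising CO2 window would map to a higher occupancy estimate.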

OpenFE: Automated Feature Generation with Expert-level Performance

2 code implementations • 22 Nov 2022 • Tianping Zhang, Zheyu Zhang, Zhiyuan Fan, Haoyan Luo, Fengyuan Liu, Qian Liu, Wei Cao, Jian Li

In the two competitions, features generated by OpenFE with a simple baseline model beat 99.3% and 99.6% of data science teams, respectively.

Feature Importance
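Automated feature generation of this kind can be caricatured as expand-and-rank: enumerate candidate features by applying operators to base columns, then score each against the target. OpenFE itself uses a far more efficient boosting-based evaluation; plain correlation and the tiny operator set below are stand-ins for illustration:

```python
import numpy as np

def generate_and_rank(X, y, names):
    """Build pairwise candidate features and rank by |correlation| with y."""
    ops = {"add": np.add, "sub": np.subtract, "mul": np.multiply}
    candidates = {}
    n = X.shape[1]
    for i in range(n):
        for j in range(i + 1, n):
            for op_name, op in ops.items():
                candidates[f"{op_name}({names[i]},{names[j]})"] = op(X[:, i], X[:, j])
    # Score each candidate by absolute correlation with the target.
    return sorted(
        candidates.items(),
        key=lambda kv: abs(np.corrcoef(kv[1], y)[0, 1]),
        reverse=True,
    )

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = X[:, 0] * X[:, 1] + 0.1 * rng.normal(size=500)  # target depends on a product
ranked = generate_and_rank(X, y, ["a", "b", "c"])
print(ranked[0][0])  # the product feature mul(a,b) should rank first
```

The top-ranked generated features would then be appended to the base table before training the downstream model.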
