1 code implementation • 12 Mar 2024 • Quzhe Huang, Zhenwei An, Nan Zhuang, Mingxu Tao, Chen Zhang, Yang Jin, Kun Xu, Liwei Chen, Songfang Huang, Yansong Feng
In this paper, we introduce a novel dynamic expert selection framework for Mixture of Experts (MoE) models, aiming to enhance computational efficiency and model performance by adjusting the number of activated experts based on input difficulty.
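The following is an illustrative sketch only, not the paper's implementation: one common way to make the number of activated experts depend on input difficulty is a threshold-based (top-p style) router that keeps adding experts until enough routing probability mass is covered, so tokens with flatter (less confident) routing distributions activate more experts. The threshold, expert cap, and length distribution below are hypothetical.

```python
# Illustrative sketch (assumed top-p style dynamic routing, not the paper's exact method):
# activate a variable number of experts per token by accumulating routing probabilities
# until a confidence threshold is reached.
import torch
import torch.nn.functional as F

def dynamic_expert_selection(router_logits: torch.Tensor,
                             threshold: float = 0.5,
                             max_experts: int = 4):
    """router_logits: [num_tokens, num_experts]. Returns sorted expert ids, weights, and an active mask."""
    probs = F.softmax(router_logits, dim=-1)
    sorted_probs, sorted_idx = probs.sort(dim=-1, descending=True)
    cumulative = sorted_probs.cumsum(dim=-1)
    # Keep the smallest prefix of experts whose cumulative probability reaches the threshold;
    # the top-1 expert is always kept.
    keep = (cumulative - sorted_probs) < threshold
    keep[..., max_experts:] = False                         # cap the number of activated experts
    weights = torch.where(keep, sorted_probs, torch.zeros_like(sorted_probs))
    weights = weights / weights.sum(dim=-1, keepdim=True)   # renormalise over selected experts
    return sorted_idx, weights, keep

# Example: 3 tokens routed over 8 experts; "harder" tokens end up with more active experts.
logits = torch.randn(3, 8)
_, _, active_mask = dynamic_expert_selection(logits)
print(active_mask.sum(dim=-1))
```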
1 code implementation • 27 Feb 2024 • Mingxu Tao, Quzhe Huang, Kun Xu, Liwei Chen, Yansong Feng, Dongyan Zhao
The advancement of Multimodal Large Language Models (MLLMs) has greatly accelerated the development of applications for jointly understanding text and images.
no code implementations • 26 Feb 2024 • Mingxu Tao, Dongyan Zhao, Yansong Feng
Open-ended question answering requires models to find appropriate evidence to form well-reasoned, comprehensive, and helpful answers.
1 code implementation • 14 Nov 2023 • Chen Zhang, Mingxu Tao, Quzhe Huang, Jiuheng Lin, Zhibin Chen, Yansong Feng
However, existing LLMs exhibit limited abilities in understanding low-resource languages, including the minority languages in China, due to a lack of training data.
1 code implementation • 24 May 2023 • Quzhe Huang, Mingxu Tao, Chen Zhang, Zhenwei An, Cong Jiang, Zhibin Chen, Zirui Wu, Yansong Feng
Specifically, we inject domain knowledge during the continual training stage and teach the model to learn professional skills using properly designed supervised fine-tuning tasks.
no code implementations • 8 May 2023 • Mingxu Tao, Yansong Feng, Dongyan Zhao
Since the embeddings of rear positions are updated fewer times than those of front positions, the rear position embeddings may not be properly trained.
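A minimal sketch of the phenomenon described above, under the assumption of learned absolute position embeddings and a hypothetical length distribution where most training sequences are short: a position embedding only receives a gradient update when a sequence is long enough to reach that position, so rear positions are updated far less often.

```python
# Illustrative sketch only: count how often each absolute position embedding
# would receive a gradient update under an assumed (hypothetical) length distribution.
import random

max_positions = 512
update_counts = [0] * max_positions

random.seed(0)
for _ in range(10_000):
    # Hypothetical: sequence lengths roughly exponential with mean ~80 tokens.
    seq_len = min(max_positions, int(random.expovariate(1 / 80)) + 1)
    for pos in range(seq_len):
        update_counts[pos] += 1

print("updates to position 0:  ", update_counts[0])    # updated by every sequence
print("updates to position 128:", update_counts[128])  # updated only by longer sequences
print("updates to position 511:", update_counts[511])  # rarely updated at all
```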
1 code implementation • 2 Mar 2023 • Mingxu Tao, Yansong Feng, Dongyan Zhao
Large pre-trained language models help achieve state-of-the-art results on a variety of natural language processing (NLP) tasks; nevertheless, they still suffer from forgetting when incrementally learning a sequence of tasks.