Search Results for author: Mingxu Tao

Found 7 papers, 5 papers with code

Harder Tasks Need More Experts: Dynamic Routing in MoE Models

1 code implementation · 12 Mar 2024 · Quzhe Huang, Zhenwei An, Nan Zhuang, Mingxu Tao, Chen Zhang, Yang Jin, Kun Xu, Liwei Chen, Songfang Huang, Yansong Feng

In this paper, we introduce a novel dynamic expert selection framework for Mixture of Experts (MoE) models, aiming to enhance computational efficiency and model performance by adjusting the number of activated experts based on input difficulty.

Computational Efficiency
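
The key mechanism here is a router that activates a variable number of experts per token instead of a fixed top-k. Below is a minimal PyTorch sketch of one plausible threshold-based version, where each token keeps the smallest expert set whose cumulative gate probability reaches a threshold p; the class name, expert architecture, and exact selection rule are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicMoELayer(nn.Module):
    """MoE layer that activates a variable number of experts per token."""

    def __init__(self, d_model: int, d_ff: int, n_experts: int, p: float = 0.5):
        super().__init__()
        self.p = p  # cumulative-probability threshold; higher p activates more experts
        self.gate = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (n_tokens, d_model); each token is routed independently.
        probs = F.softmax(self.gate(x), dim=-1)                  # (T, E)
        sorted_p, order = probs.sort(dim=-1, descending=True)
        cumulative = sorted_p.cumsum(dim=-1)
        # Keep the smallest expert set whose probability mass reaches p.
        keep_sorted = (cumulative - sorted_p) < self.p           # (T, E) bool
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            # Tokens for which expert e falls inside the kept set.
            chosen = (keep_sorted & (order == e)).any(dim=-1)
            if chosen.any():
                out[chosen] += probs[chosen, e].unsqueeze(-1) * expert(x[chosen])
        return out
```

Under this rule, "easy" tokens with a peaked gate distribution activate only one or two experts, while "hard" tokens with flatter distributions activate more, trading extra compute for capacity only where it is needed.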

Probing Multimodal Large Language Models for Global and Local Semantic Representations

1 code implementation · 27 Feb 2024 · Mingxu Tao, Quzhe Huang, Kun Xu, Liwei Chen, Yansong Feng, Dongyan Zhao

The advancement of Multimodal Large Language Models (MLLMs) has greatly accelerated the development of applications in understanding integrated texts and images.

Object Detection +1

Chain-of-Discussion: A Multi-Model Framework for Complex Evidence-Based Question Answering

no code implementations · 26 Feb 2024 · Mingxu Tao, Dongyan Zhao, Yansong Feng

Open-ended question answering requires models to find appropriate evidence to form well-reasoned, comprehensive and helpful answers.

Evidence Selection · Open-Ended Question Answering +1

MC^2: A Multilingual Corpus of Minority Languages in China

1 code implementation · 14 Nov 2023 · Chen Zhang, Mingxu Tao, Quzhe Huang, Jiuheng Lin, Zhibin Chen, Yansong Feng

However, existing LLMs exhibit limited abilities in understanding low-resource languages, including the minority languages in China, due to a lack of training data.

Lawyer LLaMA Technical Report

1 code implementation · 24 May 2023 · Quzhe Huang, Mingxu Tao, Chen Zhang, Zhenwei An, Cong Jiang, Zhibin Chen, Zirui Wu, Yansong Feng

Specifically, we inject domain knowledge during the continual training stage and teach the model to learn professional skills using properly designed supervised fine-tuning tasks.

Hallucination · Retrieval
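
The excerpt outlines a two-stage recipe: continual pre-training to inject legal domain knowledge, followed by supervised fine-tuning for professional skills. The following is a minimal sketch of such a pipeline with the Hugging Face Trainer; the base checkpoint, data files, field names, and hyperparameters are placeholders rather than the report's actual configuration.

```python
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)
from datasets import load_dataset

base = "huggyllama/llama-7b"          # illustrative base checkpoint
tok = AutoTokenizer.from_pretrained(base)
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(base)
collator = DataCollatorForLanguageModeling(tokenizer=tok, mlm=False)

def tok_fn(batch):
    return tok(batch["text"], truncation=True, max_length=1024)

# Stage 1: continual pre-training on raw legal text (plain causal-LM loss)
# injects domain knowledge into the general-purpose base model.
corpus = load_dataset("json", data_files="legal_corpus.jsonl")["train"]
corpus = corpus.map(tok_fn, batched=True, remove_columns=corpus.column_names)
Trainer(model=model,
        args=TrainingArguments(output_dir="stage1", num_train_epochs=1),
        train_dataset=corpus, data_collator=collator).train()

# Stage 2: supervised fine-tuning on instruction/response pairs rendered to
# text teaches professional skills; "question"/"answer" fields are assumed.
sft = load_dataset("json", data_files="legal_sft.jsonl")["train"]
sft = sft.map(lambda b: {"text": [q + "\n" + a
                                  for q, a in zip(b["question"], b["answer"])]},
              batched=True)
sft = sft.map(tok_fn, batched=True, remove_columns=sft.column_names)
Trainer(model=model,
        args=TrainingArguments(output_dir="stage2", num_train_epochs=2),
        train_dataset=sft, data_collator=collator).train()
```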

A Frustratingly Easy Improvement for Position Embeddings via Random Padding

no code implementations · 8 May 2023 · Mingxu Tao, Yansong Feng, Dongyan Zhao

Since the embeddings of rear positions are updated fewer times than those of front positions, the rear embeddings may not be properly trained.

Extractive Question-Answering · Position +1
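
The imbalance described here arises because, with rear-only padding, content tokens always occupy the front positions, so front position embeddings receive far more gradient updates. Below is a minimal sketch of one plausible "random padding" fix: split each sequence's padding randomly between front and rear so that content tokens also land on rear positions. The pad-placement details are an illustrative assumption, not necessarily the paper's exact recipe.

```python
import random
from typing import List, Tuple

def random_pad(ids: List[int], max_len: int, pad_id: int) -> Tuple[List[int], List[int]]:
    """Pad `ids` (assumed already truncated to max_len) to length max_len,
    placing a random share of the padding in front of the sequence."""
    n_pad = max_len - len(ids)
    n_front = random.randint(0, n_pad)            # 0..n_pad pads go in front
    padded = [pad_id] * n_front + ids + [pad_id] * (n_pad - n_front)
    mask = [0] * n_front + [1] * len(ids) + [0] * (n_pad - n_front)
    return padded, mask

# Example: a 4-token input in a length-10 batch now starts at a random
# offset, so position embeddings 4..9 also get trained, not only 0..3.
ids, mask = random_pad([101, 2023, 2003, 102], max_len=10, pad_id=0)
```

Because the front pads still consume the low position ids, the real tokens are shifted toward rear positions on some examples, balancing how often each absolute position embedding is updated during fine-tuning.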

Can BERT Refrain from Forgetting on Sequential Tasks? A Probing Study

1 code implementation · 2 Mar 2023 · Mingxu Tao, Yansong Feng, Dongyan Zhao

Large pre-trained language models help to achieve state-of-the-art results on a variety of natural language processing (NLP) tasks; nevertheless, they still suffer from forgetting when incrementally learning a sequence of tasks.

Extractive Question-Answering · Incremental Learning +3
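
A probing study of this kind typically fine-tunes on tasks one after another and re-evaluates every earlier task after each stage. The schematic sketch below computes the standard forgetting measure under that protocol; train_one_task and evaluate are hypothetical helpers standing in for ordinary training and evaluation loops.

```python
def probe_forgetting(model, tasks, train_one_task, evaluate):
    history = []  # history[i][j] = score on task j after training on tasks 0..i
    for i, task in enumerate(tasks):
        model = train_one_task(model, task)
        history.append([evaluate(model, seen) for seen in tasks[: i + 1]])
    final = history[-1]
    # Forgetting on task j = best score it ever reached minus its final score.
    return [max(row[j] for row in history[j:]) - final[j]
            for j in range(len(tasks))]
```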
