Search Results for author: Fei Yuan

Found 26 papers, 18 papers with code

Could Thinking Multilingually Empower LLM Reasoning?

1 code implementation • 16 Apr 2025 • Changjiang Gao, Xu Huang, Wenhao Zhu, ShuJian Huang, Lei LI, Fei Yuan

In this paper, we explore the upper bound of harnessing multilingualism in reasoning tasks, suggesting that multilingual reasoning promises significantly higher upper bounds than English-only reasoning (by nearly 10 Acc@$k$ points) and does so robustly, tolerating variations in translation quality and language choice.

Answer Selection
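For readers unfamiliar with the metric: Acc@$k$ is typically the fraction of questions solved by at least one of $k$ candidate answers (here, plausibly, answers produced by reasoning in $k$ different languages). A minimal sketch under that reading, with invented data:

```python
# Hypothetical illustration of Acc@k: a question counts as solved if any of
# its k candidate answers matches the gold answer. Data below is made up.

def acc_at_k(candidates_per_question, gold_answers):
    """candidates_per_question: list of lists, each holding k answers."""
    solved = sum(
        any(cand == gold for cand in cands)
        for cands, gold in zip(candidates_per_question, gold_answers)
    )
    return solved / len(gold_answers)

# Example: 3 questions, k=2 candidate answers each.
preds = [["42", "41"], ["7", "7"], ["x", "y"]]
gold = ["42", "7", "z"]
print(acc_at_k(preds, gold))  # 2/3 ~= 0.667
```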

Generalizing From Short to Long: Effective Data Synthesis for Long-Context Instruction Tuning

1 code implementation • 21 Feb 2025 • Wenhao Zhu, Pinzhen Chen, Hanxu Hu, ShuJian Huang, Fei Yuan, Jiajun Chen, Alexandra Birch

The focus of research into modelling long context has been on how to model position and there has been little investigation into other important aspects of language modelling such as instruction tuning.

Language Modelling

BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models

1 code implementation • 11 Feb 2025 • Xu Huang, Wenhao Zhu, Hanxu Hu, Conghui He, Lei LI, ShuJian Huang, Fei Yuan

Previous multilingual benchmarks focus primarily on simple understanding tasks, but for large language models (LLMs), we emphasize proficiency in instruction following, reasoning, long context understanding, code generation, and so on.

Code Generation • Instruction Following • +1

WanJuanSiLu: A High-Quality Open-Source Webtext Dataset for Low-Resource Languages

1 code implementation • 24 Jan 2025 • JIA YU, Fei Yuan, Rui Min, Jing Yu, Pei Chu, Jiayang Li, Wei Li, Ruijie Zhang, Zhenxiang Li, Zhifei Ren, Dong Zheng, Wenjian Zhang, Yan Teng, Lingyu Meng, Zhenjiang Jin, Jiantao Qiu, Shasha Wang, Zhongying Tu, Dahua Lin, Yu Wang, Yu Qiao, Yanfeng Wang, Conghui He

This paper introduces the open-source dataset WanJuanSiLu, designed to provide high-quality training corpora for low-resource languages, thereby advancing the research and development of multilingual models.

Diversity

A Controlled Study on Long Context Extension and Generalization in LLMs

1 code implementation • 18 Sep 2024 • Yi Lu, Jing Nathan Yan, Songlin Yang, Justin T. Chiu, Siyu Ren, Fei Yuan, Wenting Zhao, Zhiyong Wu, Alexander M. Rush

Broad textual understanding and in-context learning require language models that utilize full document contexts.

In-Context Learning

A Chinese Continuous Sign Language Dataset Based on Complex Environments

1 code implementation • 18 Sep 2024 • Qidan Zhu, Jing Li, Fei Yuan, Jiaojiao Fan, Quan Gan

The current bottleneck in continuous sign language recognition (CSLR) research is that most publicly available datasets are limited to laboratory environments or television program recordings. This results in a single background environment with uniform lighting, which deviates significantly from the diversity and complexity of real-life scenarios.

Diversity • Sign Language Recognition

LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages

1 code implementation • 8 Jul 2024 • Yinquan Lu, Wenhao Zhu, Lei LI, Yu Qiao, Fei Yuan

Large Language Models (LLMs) demonstrate remarkable translation capabilities in high-resource language tasks, yet their performance in low-resource languages is hindered by insufficient multilingual data during pre-training.

Data Augmentation • Translation

MindMerger: Efficient Boosting LLM Reasoning in non-English Languages

1 code implementation • 27 May 2024 • Zixian Huang, Wenhao Zhu, Gong Cheng, Lei LI, Fei Yuan

To better utilize the reasoning and language understanding abilities in LLMs, we propose a new method, MindMerger, which merges LLMs with the external language understanding capabilities of multilingual models to boost multilingual reasoning performance.

A Survey of Neural Code Intelligence: Paradigms, Advances and Beyond

2 code implementations • 21 Mar 2024 • Qiushi Sun, Zhirui Chen, Fangzhi Xu, Kanzhi Cheng, Chang Ma, Zhangyue Yin, Jianing Wang, Chengcheng Han, Renyu Zhu, Shuai Yuan, Qipeng Guo, Xipeng Qiu, Pengcheng Yin, XiaoLi Li, Fei Yuan, Lingpeng Kong, Xiang Li, Zhiyong Wu

Building on our examination of the developmental trajectories, we further investigate the emerging synergies between code intelligence and broader machine intelligence, uncovering new cross-domain opportunities and illustrating the substantial influence of code intelligence across various domains.

Survey

Continuous Sign Language Recognition Based on Motor attention mechanism and frame-level Self-distillation

1 code implementation • 29 Feb 2024 • Qidan Zhu, Jing Li, Fei Yuan, Quan Gan

Changes in facial expression, head movement, body movement, and gesture are important cues in sign language recognition. Most current continuous sign language recognition (CSLR) methods, however, focus on static images in video sequences at the frame-level feature extraction stage, ignoring the dynamic changes between frames.

Sign Language Recognition

KS-Lottery: Finding Certified Lottery Tickets for Multilingual Language Models

no code implementations • 5 Feb 2024 • Fei Yuan, Chang Ma, Shuai Yuan, Qiushi Sun, Lei LI

We further prove theoretically that KS-Lottery can find certified winning tickets in the embedding layer; fine-tuning on the found parameters is guaranteed to perform as well as full fine-tuning.

Translation
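The name suggests a Kolmogorov-Smirnov test over parameter shifts between the base and fine-tuned models. The exact selection procedure is the paper's; the sketch below is only one plausible reading, with row-wise granularity and top-k selection as assumptions:

```python
import numpy as np
from scipy.stats import ks_2samp

# Hypothetical sketch of a KS-style ticket search: compare each embedding
# row's value distribution before vs. after fine-tuning and keep the rows
# that shifted the most. Row granularity and top-k are assumptions, not
# the paper's exact recipe.

def find_tickets(base_emb: np.ndarray, tuned_emb: np.ndarray, top_k: int):
    stats = [
        ks_2samp(base_row, tuned_row).statistic
        for base_row, tuned_row in zip(base_emb, tuned_emb)
    ]
    return np.argsort(stats)[::-1][:top_k]  # indices of most-shifted rows

rng = np.random.default_rng(0)
base = rng.normal(size=(1000, 64))            # toy "pre-trained" embeddings
tuned = base.copy()
tuned[:50] += rng.normal(0.5, 0.1, (50, 64))  # only 50 rows actually moved
print(find_tickets(base, tuned, top_k=50)[:10])
```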

Question Translation Training for Better Multilingual Reasoning

1 code implementation • 15 Jan 2024 • Wenhao Zhu, ShuJian Huang, Fei Yuan, Shuaijie She, Jiajun Chen, Alexandra Birch

A typical solution is to translate instruction data into all languages of interest, and then train on the resulting multilingual data, which is called translate-training.

Mathematical Reasoning • Translation
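Translate-training, as described above, is mechanically simple. A schematic of the data pipeline, where `translate` is a placeholder for any MT system and all field names are illustrative:

```python
# Schematic of translate-training: translate English instruction data into
# each target language, then fine-tune on the union. `translate` stands in
# for any MT system; field names are illustrative, not from the paper.

def translate(text: str, target_lang: str) -> str:
    raise NotImplementedError("plug in an MT model or API here")

def build_translate_training_set(english_data, target_langs):
    multilingual = list(english_data)  # keep the original English split
    for lang in target_langs:
        for ex in english_data:
            multilingual.append({
                "instruction": translate(ex["instruction"], lang),
                "response": translate(ex["response"], lang),
                "lang": lang,
            })
    return multilingual
```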

Symbol-LLM: Towards Foundational Symbol-centric Interface For Large Language Models

3 code implementations • 15 Nov 2023 • Fangzhi Xu, Zhiyong Wu, Qiushi Sun, Siyu Ren, Fei Yuan, Shuai Yuan, Qika Lin, Yu Qiao, Jun Liu

Although Large Language Models (LLMs) demonstrate remarkable ability in processing and generating human-like text, they do have limitations when it comes to comprehending and expressing world knowledge that extends beyond the boundaries of natural language (e.g., chemical molecular formulas).

World Knowledge

How Vocabulary Sharing Facilitates Multilingualism in LLaMA?

1 code implementation • 15 Nov 2023 • Fei Yuan, Shuai Yuan, Zhiyong Wu, Lei LI

Large Language Models (LLMs) often show strong performance on English tasks while exhibiting limitations in other languages.

Extrapolating Large Language Models to Non-English by Aligning Languages

2 code implementations • 9 Aug 2023 • Wenhao Zhu, Yunzhe Lv, Qingxiu Dong, Fei Yuan, Jingjing Xu, ShuJian Huang, Lingpeng Kong, Jiajun Chen, Lei LI

We start by targeting individual languages, performing cross-lingual instruction-tuning (CoIT) on LLaMA, i.e., tuning it with translation task data and cross-lingual general task data to obtain cross-lingual models (x-LLaMAs), and formulate underlying scaling laws to investigate the advantages of using scalable translation data.

Translation
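A toy formatter for the kind of CoIT mixture the abstract describes: translation-task pairs plus cross-lingual general-task examples. The prompt templates below are invented for illustration and are not the paper's:

```python
# Toy formatter for cross-lingual instruction-tuning (CoIT) data: a mixture
# of translation pairs and general-task examples. Templates are invented.

def format_translation(src: str, tgt: str, src_lang: str, tgt_lang: str) -> dict:
    return {
        "instruction": f"Translate the following {src_lang} text to {tgt_lang}: {src}",
        "response": tgt,
    }

def format_general(instruction: str, response: str) -> dict:
    return {"instruction": instruction, "response": response}

mixture = [
    format_translation("Hello, world.", "你好，世界。", "English", "Chinese"),
    format_general("法国的首都是哪里？", "法国的首都是巴黎。"),  # general task in the target language
]
print(mixture[0]["instruction"])
```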

Utility-Probability Duality of Neural Networks

no code implementations • 24 May 2023 • Huang Bojun, Fei Yuan

In this perspective, training of the neural network corresponds to a utility learning process.

Text Generation

Extrapolating Multilingual Understanding Models as Multilingual Generators

no code implementations • 22 May 2023 • Bohong Wu, Fei Yuan, Hai Zhao, Lei LI, Jingjing Xu

Considering that encoder-based models have the advantages of efficient generation and self-correction, this paper explores methods to endow multilingual understanding models with generation abilities, yielding a unified model.

Denoising • Language Modeling • +6

Continuous sign language recognition based on cross-resolution knowledge distillation

1 code implementation • 13 Mar 2023 • Qidan Zhu, Jing Li, Fei Yuan, Quan Gan

Cross-resolution knowledge distillation is then combined with traditional knowledge distillation methods to form a CSLR model based on cross-resolution knowledge distillation (CRKD).

Knowledge Distillation • Sign Language Recognition
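A generic rendering of the cross-resolution idea, with a teacher run on high-resolution frames supervising a student run on downsampled ones; the KL form and temperature are standard KD choices, not necessarily CRKD's exact losses:

```python
import torch
import torch.nn.functional as F

# Sketch of cross-resolution distillation: a teacher run on high-resolution
# frames supervises a student run on the same frames spatially downsampled.

def cross_resolution_kd_loss(student, teacher, frames_hr, temperature=2.0):
    # frames_hr: (batch, channels, time, height, width) high-resolution clip
    frames_lr = F.interpolate(frames_hr, scale_factor=(1, 0.5, 0.5))
    with torch.no_grad():
        teacher_logits = teacher(frames_hr)
    student_logits = student(frames_lr)
    return F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature**2
```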

Lego-MT: Learning Detachable Models for Massively Multilingual Machine Translation

1 code implementation • 20 Dec 2022 • Fei Yuan, Yinquan Lu, Wenhao Zhu, Lingpeng Kong, Lei LI, Yu Qiao, Jingjing Xu

To address the need for learning representations for all languages in a unified space, we propose a novel, efficient training recipe, upon which we build an effective detachable model, Lego-MT.

Machine Translation • Translation

Temporal superimposed crossover module for effective continuous sign language

1 code implementation • 7 Nov 2022 • Qidan Zhu, Jing Li, Fei Yuan, Quan Gan

The ultimate goal of continuous sign language recognition (CSLR) is to facilitate communication between deaf and hearing people, which requires a certain degree of real-time performance and deployability from the model.

Image Classification • +2

Continuous Sign Language Recognition via Temporal Super-Resolution Network

1 code implementation • 3 Jul 2022 • Qidan Zhu, Jing Li, Fei Yuan, Quan Gan

The sparse frame-level features are fused with the features obtained from the two designed branches to reconstruct a dense frame-level feature sequence, and the connectionist temporal classification (CTC) loss is used for training and optimization after the time-series feature extraction part.

Sign Language Recognition • Super-Resolution • +2
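The CTC loss mentioned above is standard. For reference, PyTorch's implementation expects per-frame log-probabilities of shape (time, batch, classes); the shapes below are illustrative only, and the paper's feature extractor is not reproduced:

```python
import torch

# Standard CTC loss as used for CSLR-style training: per-frame gloss
# log-probabilities vs. an unsegmented gloss target sequence.

T, N, C = 50, 4, 30          # frames, batch size, gloss vocabulary (incl. blank)
log_probs = torch.randn(T, N, C).log_softmax(dim=-1)
targets = torch.randint(1, C, (N, 12))   # gloss labels (0 is the blank index)
input_lengths = torch.full((N,), T)
target_lengths = torch.full((N,), 12)

ctc = torch.nn.CTCLoss(blank=0)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
print(loss.item())
```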

Multi-scale temporal network for continuous sign language recognition

no code implementations • 8 Apr 2022 • Qidan Zhu, Jing Li, Fei Yuan, Quan Gan

The time-wise feature extraction part performs temporal feature learning in two stages: it first extracts temporal receptive-field features at different scales using the proposed multi-scale temporal block (MST-block) to improve temporal modeling capability, and then further encodes these multi-scale temporal features with a transformer module to obtain more accurate temporal features.

Sign Language Recognition
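One common realization of multi-scale temporal modeling is parallel 1D convolutions with different kernel sizes over the frame axis. The sketch below illustrates that idea; it is not the paper's exact MST-block:

```python
import torch
import torch.nn as nn

# Generic multi-scale temporal block: parallel 1D convolutions with
# different kernel sizes (i.e., temporal receptive fields), concatenated.

class MultiScaleTemporalBlock(nn.Module):
    def __init__(self, channels: int, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv1d(channels, channels // len(kernel_sizes), k, padding=k // 2)
            for k in kernel_sizes
        )

    def forward(self, x):  # x: (batch, channels, time)
        return torch.cat([branch(x) for branch in self.branches], dim=1)

x = torch.randn(2, 96, 50)   # 50 frames of 96-dim features
print(MultiScaleTemporalBlock(96)(x).shape)  # torch.Size([2, 96, 50])
```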

Simpson's Bias in NLP Training

no code implementations • 13 Mar 2021 • Fei Yuan, Longtu Zhang, Huang Bojun, Yaobo Liang

In most machine learning tasks, we evaluate a model $M$ on a given data population $S$ by measuring a population-level metric $F(S;M)$.

Multi-class Classification • Sentence • +1
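The gap between a population-level metric $F(S;M)$ and the average of the same metric over sub-populations is where Simpson-style effects live. A worked example with precision, using invented numbers:

```python
# Worked example (invented numbers): the average of per-group precisions
# disagrees with the precision computed on the pooled population, which is
# the aggregation gap behind Simpson-style bias in metric optimization.

groups = [
    {"tp": 9, "fp": 1},    # group A: precision 0.90
    {"tp": 10, "fp": 90},  # group B: precision 0.10
]

macro = sum(g["tp"] / (g["tp"] + g["fp"]) for g in groups) / len(groups)
pooled_tp = sum(g["tp"] for g in groups)
pooled_fp = sum(g["fp"] for g in groups)
micro = pooled_tp / (pooled_tp + pooled_fp)

print(f"macro-averaged precision: {macro:.2f}")       # 0.50
print(f"pooled (population) precision: {micro:.2f}")  # ~0.17
```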

Reinforced Multi-Teacher Selection for Knowledge Distillation

no code implementations • 11 Dec 2020 • Fei Yuan, Linjun Shou, Jian Pei, Wutao Lin, Ming Gong, Yan Fu, Daxin Jiang

When multiple teacher models are available in distillation, state-of-the-art methods assign each teacher model a fixed weight for the whole distillation process.

GPU • Knowledge Distillation • +1
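The fixed-weight baseline criticized here fits in a few lines; the paper's reinforcement-learned, per-sample teacher selection is the contribution and is not reproduced:

```python
import torch
import torch.nn.functional as F

# Fixed-weight multi-teacher distillation (the baseline the paper improves
# on): blend teacher distributions with static weights.

def multi_teacher_kd_loss(student_logits, teacher_logits_list, weights, T=2.0):
    student_logp = F.log_softmax(student_logits / T, dim=-1)
    loss = 0.0
    for w, t_logits in zip(weights, teacher_logits_list):
        teacher_p = F.softmax(t_logits / T, dim=-1)
        loss = loss + w * F.kl_div(student_logp, teacher_p, reduction="batchmean")
    return loss * T**2

s = torch.randn(8, 10)
teachers = [torch.randn(8, 10) for _ in range(3)]
print(multi_teacher_kd_loss(s, teachers, weights=[0.5, 0.3, 0.2]).item())
```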

Enhancing Answer Boundary Detection for Multilingual Machine Reading Comprehension

no code implementations • ACL 2020 • Fei Yuan, Linjun Shou, Xuanyu Bai, Ming Gong, Yaobo Liang, Nan Duan, Yan Fu, Daxin Jiang

Multilingual pre-trained models could leverage the training data from a rich source language (such as English) to improve performance on low resource languages.

Boundary Detection • Machine Reading Comprehension • +2
