Search Results for author: Xian Li

Found 44 papers, 16 papers with code

Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM

no code implementations 12 Mar 2024 Sainbayar Sukhbaatar, Olga Golovneva, Vasu Sharma, Hu Xu, Xi Victoria Lin, Baptiste Rozière, Jacob Kahn, Daniel Li, Wen-tau Yih, Jason Weston, Xian Li

We investigate efficient methods for training Large Language Models (LLMs) to possess capabilities in multiple specialized domains, such as coding, math reasoning and world knowledge.

Arithmetic Reasoning Code Generation +6

Mel-FullSubNet: Mel-Spectrogram Enhancement for Improving Both Speech Quality and ASR

no code implementations 21 Feb 2024 Rui Zhou, Xian Li, Ying Fang, Xiaofei Li

In this work, we propose Mel-FullSubNet, a single-channel Mel-spectrogram denoising and dereverberation network for improving both speech quality and automatic speech recognition (ASR) performance.

Automatic Speech Recognition (ASR) +3

Self-Rewarding Language Models

2 code implementations 18 Jan 2024 Weizhe Yuan, Richard Yuanzhe Pang, Kyunghyun Cho, Xian Li, Sainbayar Sukhbaatar, Jing Xu, Jason Weston

We posit that to achieve superhuman agents, future models require superhuman feedback in order to provide an adequate training signal.

Instruction Following Language Modelling

Branch-Solve-Merge Improves Large Language Model Evaluation and Generation

no code implementations 23 Oct 2023 Swarnadeep Saha, Omer Levy, Asli Celikyilmaz, Mohit Bansal, Jason Weston, Xian Li

Large Language Models (LLMs) are frequently used for multi-faceted language generation and evaluation tasks that involve satisfying intricate user constraints or taking into account multiple aspects and criteria.

Language Modelling Large Language Model +1

Long Short-Term Planning for Conversational Recommendation Systems

no code implementations 23 Oct 2023 Xian Li, Hongguang Shi, Yunfei Wang, Yeqin Zhang, Xubin Li, Cam-Tu Nguyen

Specifically, the recommendation predicts the long-term recommendation target based on the conversational context and the user history.

Attribute Recommendation Systems

Chain-of-Verification Reduces Hallucination in Large Language Models

1 code implementation 20 Sep 2023 Shehzaad Dhuliawala, Mojtaba Komeili, Jing Xu, Roberta Raileanu, Xian Li, Asli Celikyilmaz, Jason Weston

Generation of plausible yet incorrect factual information, termed hallucination, is an unsolved issue in large language models.

Hallucination Text Generation

Fine-tune the pretrained ATST model for sound event detection

1 code implementation 15 Sep 2023 Nian Shao, Xian Li, Xiaofei Li

In this work, we study the fine-tuning method of the pretrained models for SED.

 Ranked #1 on Sound Event Detection on DESED (using extra training data)

Event Detection Self-Supervised Learning +1

Self-Alignment with Instruction Backtranslation

2 code implementations 11 Aug 2023 Xian Li, Ping Yu, Chunting Zhou, Timo Schick, Omer Levy, Luke Zettlemoyer, Jason Weston, Mike Lewis

We present a scalable method to build a high quality instruction following language model by automatically labelling human-written text with corresponding instructions.

Instruction Following Language Modelling

Self-supervised Audio Teacher-Student Transformer for Both Clip-level and Frame-level Tasks

2 code implementations 7 Jun 2023 Xian Li, Nian Shao, Xiaofei Li

In order to tackle both clip-level and frame-level tasks, this paper proposes Audio Teacher-Student Transformer (ATST), with a clip-level version (named ATST-Clip) and a frame-level version (named ATST-Frame), responsible for learning clip-level and frame-level representations, respectively.

Audio Classification Audio Tagging +8

PV2TEA: Patching Visual Modality to Textual-Established Information Extraction

no code implementations 1 Jun 2023 Hejie Cui, Rongmei Lin, Nasser Zalmout, Chenwei Zhang, Jingbo Shang, Carl Yang, Xian Li

Information extraction, e.g., attribute value extraction, has been extensively studied and formulated based only on text.

Attribute Attribute Value Extraction

Towards Open-World Product Attribute Mining: A Lightly-Supervised Approach

1 code implementation 26 May 2023 Liyan Xu, Chenwei Zhang, Xian Li, Jingbo Shang, Jinho D. Choi

We present a new task setting for attribute mining on e-commerce products, serving as a practical solution to extract open-world attributes without extensive human intervention.

Attribute

Towards A Unified View of Sparse Feed-Forward Network in Pretraining Large Language Model

no code implementations 23 May 2023 Zeyu Leo Liu, Tim Dettmers, Xi Victoria Lin, Veselin Stoyanov, Xian Li

Large and sparse feed-forward layers (S-FFN) such as Mixture-of-Experts (MoE) have proven effective in scaling up Transformers model size for pretraining large language models.

Avg Language Modelling +1

Large Language Model Programs

no code implementations 9 May 2023 Imanol Schlag, Sainbayar Sukhbaatar, Asli Celikyilmaz, Wen-tau Yih, Jason Weston, Jürgen Schmidhuber, Xian Li

In recent years, large pre-trained language models (LLMs) have demonstrated the ability to follow instructions and perform novel tasks from a few examples.

Language Modelling Large Language Model +1

OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization

1 code implementation 22 Dec 2022 Srinivasan Iyer, Xi Victoria Lin, Ramakanth Pasunuru, Todor Mihaylov, Daniel Simig, Ping Yu, Kurt Shuster, Tianlu Wang, Qing Liu, Punit Singh Koura, Xian Li, Brian O'Horo, Gabriel Pereyra, Jeff Wang, Christopher Dewan, Asli Celikyilmaz, Luke Zettlemoyer, Ves Stoyanov

To this end, we create OPT-IML Bench: a large benchmark for Instruction Meta-Learning (IML) of 2000 NLP tasks consolidated into task categories from 8 existing benchmarks, and prepare an evaluation framework to measure three types of model generalizations: to tasks from fully held-out categories, to held-out tasks from seen categories, and to held-out instances from seen tasks.

Language Modelling Meta-Learning +2

Clinicopathological correlation of p40/TTF1 co-expression in NSCLC and review of related literature

no code implementations 14 Sep 2022 LiAn Yang, Ming Xiao, Xian Li, Ya-lan Wang

Mutations in STK11/LKB1 and NF1 genes have been found in ADC and SQC and are often associated with drug resistance and poor prognosis, but STK11/NF1 co-mutation has not been reported and more cases are needed to reveal the association.

Specificity

Multilingual Neural Machine Translation with Deep Encoder and Multiple Shallow Decoders

no code implementations EACL 2021 Xiang Kong, Adithya Renduchintala, James Cross, Yuqing Tang, Jiatao Gu, Xian Li

Recent work in multilingual translation advances translation quality surpassing bilingual baselines using deep transformer models with increased capacity.

Machine Translation Translation

ToKen: Task Decomposition and Knowledge Infusion for Few-Shot Hate Speech Detection

no code implementations 25 May 2022 Badr AlKhamissi, Faisal Ladhak, Srini Iyer, Ves Stoyanov, Zornitsa Kozareva, Xian Li, Pascale Fung, Lambert Mathias, Asli Celikyilmaz, Mona Diab

Hate speech detection is complex; it relies on commonsense reasoning, knowledge of stereotypes, and an understanding of social nuance that differs from one culture to the next.

Cultural Vocal Bursts Intensity Prediction Few-Shot Learning +1

Lifting the Curse of Multilinguality by Pre-training Modular Transformers

no code implementations NAACL 2022 Jonas Pfeiffer, Naman Goyal, Xi Victoria Lin, Xian Li, James Cross, Sebastian Riedel, Mikel Artetxe

Multilingual pre-trained models are known to suffer from the curse of multilinguality, which causes per-language performance to drop as they cover more languages.

Named Entity Recognition +3

OA-Mine: Open-World Attribute Mining for E-Commerce Products with Weak Supervision

1 code implementation 29 Apr 2022 Xinyang Zhang, Chenwei Zhang, Xian Li, Xin Luna Dong, Jingbo Shang, Christos Faloutsos, Jiawei Han

Most prior works on this matter mine new values for a set of known attributes but cannot handle new attributes that arose from constantly changing data.

Attribute Language Modelling

ATST: Audio Representation Learning with Teacher-Student Transformer

4 code implementations 26 Apr 2022 Xian Li, Xiaofei Li

Self-supervised learning (SSL) learns knowledge from a large amount of unlabeled data, and then transfers the knowledge to a specific problem with a limited amount of labeled data.

Audio Classification Instrument Recognition +5

Efficient Large Scale Language Modeling with Mixtures of Experts

no code implementations 20 Dec 2021 Mikel Artetxe, Shruti Bhosale, Naman Goyal, Todor Mihaylov, Myle Ott, Sam Shleifer, Xi Victoria Lin, Jingfei Du, Srinivasan Iyer, Ramakanth Pasunuru, Giri Anantharaman, Xian Li, Shuohui Chen, Halil Akin, Mandeep Baines, Louis Martin, Xing Zhou, Punit Singh Koura, Brian O'Horo, Jeff Wang, Luke Zettlemoyer, Mona Diab, Zornitsa Kozareva, Ves Stoyanov

This paper presents a detailed empirical study of how autoregressive MoE language models scale in comparison with dense models in a wide range of settings: in- and out-of-domain language modeling, zero- and few-shot priming, and full-shot fine-tuning.

Language Modelling

Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs

1 code implementation 26 Nov 2021 Peter Hase, Mona Diab, Asli Celikyilmaz, Xian Li, Zornitsa Kozareva, Veselin Stoyanov, Mohit Bansal, Srinivasan Iyer

In this paper, we discuss approaches to detecting when models have beliefs about the world, and we improve on methods for updating model beliefs to be more truthful, with a focus on methods based on learned optimizers or hypernetworks.

Distributionally Robust Multilingual Machine Translation

1 code implementation EMNLP 2021 Chunting Zhou, Daniel Levy, Xian Li, Marjan Ghazvininejad, Graham Neubig

Multilingual neural machine translation (MNMT) learns to translate multiple language pairs with a single model, potentially improving both the accuracy and the memory-efficiency of deployed models.

Machine Translation Translation

Multilingual Speech Translation from Efficient Finetuning of Pretrained Models

no code implementations ACL 2021 Xian Li, Changhan Wang, Yun Tang, Chau Tran, Yuqing Tang, Juan Pino, Alexei Baevski, Alexis Conneau, Michael Auli

We present a simple yet effective approach to build multilingual speech-to-text (ST) translation through efficient transfer learning from a pretrained speech encoder and text decoder.

Text Generation Transfer Learning +1

FST: the FAIR Speech Translation System for the IWSLT21 Multilingual Shared Task

no code implementations ACL (IWSLT) 2021 Yun Tang, Hongyu Gong, Xian Li, Changhan Wang, Juan Pino, Holger Schwenk, Naman Goyal

In this paper, we describe our end-to-end multilingual speech translation system submitted to the IWSLT 2021 evaluation campaign on the Multilingual Speech Translation shared task.

Transfer Learning Translation

Unaware Fairness: Hierarchical Random Forest for Protected Classes

no code implementations 30 Jun 2021 Xian Li

Procedural fairness has been a public concern, which leads to controversy when making decisions with respect to protected classes, such as race, social status, and disability.

Fairness

Pay Better Attention to Attention: Head Selection in Multilingual and Multi-Domain Sequence Modeling

no code implementations NeurIPS 2021 Hongyu Gong, Yun Tang, Juan Pino, Xian Li

We further propose attention sharing strategies to facilitate parameter sharing and specialization in multilingual and multi-domain sequence modeling.

Speech Recognition +2

Robust Optimization for Multilingual Translation with Imbalanced Data

no code implementations NeurIPS 2021 Xian Li, Hongyu Gong

We show that the common training method, which upsamples low-resource languages, cannot robustly optimize population loss, risking either underfitting high-resource languages or overfitting low-resource ones.

Machine Translation Translation

Adaptive Sparse Transformer for Multilingual Translation

no code implementations 15 Apr 2021 Hongyu Gong, Xian Li, Dmitriy Genzel

Based on these insights, we propose an adaptive and sparse architecture for multilingual modeling, and train the model to learn shared and language-specific parameters to improve the positive transfer and mitigate the interference.

Machine Translation Transfer Learning +1

Improving Zero-Shot Translation by Disentangling Positional Information

1 code implementation ACL 2021 Danni Liu, Jan Niehues, James Cross, Francisco Guzmán, Xian Li

The difficulty of generalizing to new translation directions suggests the model representations are highly specific to those language pairs seen in training.

Machine Translation Translation

Towards Understanding the Behaviors of Optimal Deep Active Learning Algorithms

1 code implementation 29 Dec 2020 Yilun Zhou, Adithya Renduchintala, Xian Li, Sida Wang, Yashar Mehdad, Asish Ghoshal

Active learning (AL) algorithms may achieve better performance with fewer data because the model guides the data selection process.

Active Learning

Multilingual Speech Translation with Efficient Finetuning of Pretrained Models

no code implementations 24 Oct 2020 Xian Li, Changhan Wang, Yun Tang, Chau Tran, Yuqing Tang, Juan Pino, Alexei Baevski, Alexis Conneau, Michael Auli

We present a simple yet effective approach to build multilingual speech-to-text (ST) translation by efficient transfer learning from pretrained speech encoder and text decoder.

Cross-Lingual Transfer Text Generation +2

DeFuzz: Deep Learning Guided Directed Fuzzing

no code implementations 23 Oct 2020 Xiaogang Zhu, Shigang Liu, Xian Li, Sheng Wen, Jun Zhang, Camtepe Seyit, Yang Xiang

Fuzzing is one of the most effective techniques to identify potential software vulnerabilities.

Vulnerability Detection
