Search Results for author: Xian Li

Found 52 papers, 20 papers with code

Shopping MMLU: A Massive Multi-Task Online Shopping Benchmark for Large Language Models

1 code implementation28 Oct 2024 Yilun Jin, Zheng Li, Chenwei Zhang, Tianyu Cao, Yifan Gao, Pratik Jayarao, Mao Li, Xin Liu, Ritesh Sarkhel, Xianfeng Tang, Haodong Wang, Zhengyang Wang, Wenju Xu, Jingfeng Yang, Qingyu Yin, Xian Li, Priyanka Nigam, Yi Xu, Kai Chen, Qiang Yang, Meng Jiang, Bing Yin

Shopping MMLU consists of 57 tasks covering 4 major shopping skills: concept understanding, knowledge reasoning, user behavior alignment, and multi-linguality, and can thus comprehensively evaluate the abilities of LLMs as general shop assistants.

Few-Shot Learning MMLU

To the Globe (TTG): Towards Language-Driven Guaranteed Travel Planning

no code implementations21 Oct 2024 Da Ju, Song Jiang, Andrew Cohen, Aaron Foss, Sasha Mitts, Arman Zharmagambetov, Brandon Amos, Xian Li, Justine T Kao, Maryam Fazel-Zarandi, Yuandong Tian

In this paper, we propose To the Globe (TTG), a real-time demo system that takes natural language requests from users, translates them into symbolic form via a fine-tuned Large Language Model, and produces optimal travel itineraries with Mixed Integer Linear Programming solvers.
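As a rough illustration of the solver stage, here is a toy itinerary-selection MILP in Python using the `pulp` library; all activities, costs, and constraints are invented for the example and are not TTG's actual formulation.

```python
# Toy sketch of the MILP itinerary step (hypothetical data and constraints,
# not the TTG formulation). Requires: pip install pulp
import pulp

# Candidate activities parsed from the symbolic request: (name, cost, utility, hours)
activities = [("museum", 30, 8, 2), ("food_tour", 60, 9, 3),
              ("hike", 0, 7, 4), ("concert", 90, 6, 3)]
budget, max_hours = 100, 8

prob = pulp.LpProblem("toy_itinerary", pulp.LpMaximize)
pick = {name: pulp.LpVariable(f"pick_{name}", cat="Binary")
        for name, *_ in activities}

# Objective: maximize total utility of selected activities.
prob += pulp.lpSum(pick[n] * u for n, _, u, _ in activities)
# Constraints: stay within the budget and the available hours.
prob += pulp.lpSum(pick[n] * c for n, c, _, _ in activities) <= budget
prob += pulp.lpSum(pick[n] * h for n, _, _, h in activities) <= max_hours

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print([n for n in pick if pick[n].value() == 1])
```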

Language Modelling Large Language Model

RNR: Teaching Large Language Models to Follow Roles and Rules

no code implementations10 Sep 2024 Kuan Wang, Alexander Bukharin, Haoming Jiang, Qingyu Yin, Zhengyang Wang, Tuo Zhao, Jingbo Shang, Chao Zhang, Bing Yin, Xian Li, Jianshu Chen, Shiyang Li

However, existing models trained on open-source IFT datasets only have the ability to follow instructions from users, and often fail to follow the complex roles and rules specified by developers.

Instruction Following

Better Alignment with Instruction Back-and-Forth Translation

no code implementations8 Aug 2024 Thao Nguyen, Jeffrey Li, Sewoong Oh, Ludwig Schmidt, Jason Weston, Luke Zettlemoyer, Xian Li

We propose a new method, instruction back-and-forth translation, to construct high-quality synthetic data grounded in world knowledge for aligning large language models (LLMs).
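A minimal sketch of the back-and-forth idea, with a placeholder `generate` helper standing in for any LLM API (the prompts paraphrase the concept and are not the authors' pipeline):

```python
# Minimal sketch of instruction back-and-forth translation; `generate` is a
# placeholder for any LLM call, not the authors' pipeline or prompts.
def generate(prompt: str) -> str:
    return f"<llm output for: {prompt[:40]}...>"   # stub for illustration

def back_and_forth(document: str) -> dict:
    # Backward step: predict an instruction this web document could answer.
    instruction = generate(
        "Write the instruction that the following text answers:\n" + document)
    # Forward step: rewrite the document into a clean, direct response to
    # that instruction, keeping it grounded in the original text.
    response = generate(
        f"Instruction: {instruction}\nRewrite this text as a high-quality "
        f"response to the instruction:\n{document}")
    return {"instruction": instruction, "response": response}
```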

Diversity Translation +1

Self-Taught Evaluators

no code implementations5 Aug 2024 Tianlu Wang, Ilia Kulikov, Olga Golovneva, Ping Yu, Weizhe Yuan, Jane Dwivedi-Yu, Richard Yuanzhe Pang, Maryam Fazel-Zarandi, Jason Weston, Xian Li

Model-based evaluation is at the heart of successful model development -- as a reward model for training, and as a replacement for human evaluation.
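Conceptually, one self-training round looks roughly like the following; the `model.judge` / `model.finetune` interface is hypothetical, and the paper's actual prompts and filtering differ:

```python
# One self-training round for an LLM-as-a-Judge evaluator; the `model`
# interface (generate/judge/finetune) is hypothetical.
def self_taught_round(model, instructions):
    train_examples = []
    for x in instructions:
        good = model.generate(x)                                  # baseline response
        bad = model.generate(x + "\n(answer with subtle flaws)")  # worse one
        # The synthetic label says `good` should win, so keep only the
        # judgments (with reasoning traces) that agree with that label.
        verdict = model.judge(x, a=good, b=bad)
        if verdict.winner == "a":
            train_examples.append((x, good, bad, verdict.reasoning))
    model.finetune(train_examples)   # this model becomes the next judge
    return model
```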

Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM

1 code implementation12 Mar 2024 Sainbayar Sukhbaatar, Olga Golovneva, Vasu Sharma, Hu Xu, Xi Victoria Lin, Baptiste Rozière, Jacob Kahn, Daniel Li, Wen-tau Yih, Jason Weston, Xian Li

We investigate efficient methods for training Large Language Models (LLMs) to possess capabilities in multiple specialized domains, such as coding, math reasoning and world knowledge.
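A conceptual sketch of the mixing step, assuming a simplified model layout (`layers`, `ffn`, and `attn_weight` are hypothetical attribute names, not the released code):

```python
# Conceptual sketch of Branch-Train-MiX's merge: domain experts' FFN
# sublayers become MoE experts; other weights are averaged.
import torch

def mix_experts(seed_model, expert_models):
    for i, layer in enumerate(seed_model.layers):
        # The experts' feed-forward sublayers become the MoE experts...
        layer.experts = torch.nn.ModuleList(
            e.layers[i].ffn for e in expert_models)
        # ...selected per token by a freshly initialized, trainable router.
        layer.router = torch.nn.Linear(seed_model.d_model, len(expert_models))
        # Non-FFN weights (e.g. attention) are averaged across the experts.
        with torch.no_grad():
            layer.attn_weight.copy_(torch.stack(
                [e.layers[i].attn_weight for e in expert_models]).mean(0))
    return seed_model   # then finetune briefly so the router learns to route
```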

Arithmetic Reasoning Code Generation +6

Mel-FullSubNet: Mel-Spectrogram Enhancement for Improving Both Speech Quality and ASR

no code implementations21 Feb 2024 Rui Zhou, Xian Li, Ying Fang, Xiaofei Li

In this work, we propose Mel-FullSubNet, a single-channel Mel-spectrogram denoising and dereverberation network for improving both speech quality and automatic speech recognition (ASR) performance.
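For context, the log-Mel input such a network would enhance can be computed as follows (librosa-based sketch; the filename and parameter values are placeholders, and the network itself is not shown):

```python
# Compute the log-Mel spectrogram that a Mel-domain enhancement network
# takes as input (parameters are illustrative).
import numpy as np
import librosa

wav, sr = librosa.load("noisy.wav", sr=16000)
mel = librosa.feature.melspectrogram(y=wav, sr=sr, n_fft=512,
                                     hop_length=256, n_mels=80)
log_mel = np.log(mel + 1e-8)   # shape: (80, num_frames), network input
```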

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

MEMORYLLM: Towards Self-Updatable Large Language Models

1 code implementation7 Feb 2024 Yu Wang, Yifan Gao, Xiusi Chen, Haoming Jiang, Shiyang Li, Jingfeng Yang, Qingyu Yin, Zheng Li, Xian Li, Bing Yin, Jingbo Shang, Julian McAuley

We aim to build models containing a considerable portion of self-updatable parameters, enabling the model to integrate new knowledge effectively and efficiently.
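A toy sketch of a fixed-size, self-updating memory pool in PyTorch (the shapes and the random-eviction rule are illustrative assumptions, not the MEMORYLLM implementation):

```python
# Fixed-size latent memory: updating evicts old slots and writes new ones,
# so total parameter count never grows (illustrative sketch).
import torch

class MemoryPool:
    def __init__(self, num_tokens=1024, dim=512):
        self.memory = torch.randn(num_tokens, dim)

    def update(self, new_tokens: torch.Tensor):
        """Drop random old slots and write in freshly encoded knowledge."""
        k = new_tokens.shape[0]
        keep = torch.randperm(self.memory.shape[0])[:-k]  # surviving slots
        self.memory = torch.cat([self.memory[keep], new_tokens], dim=0)

pool = MemoryPool()
pool.update(torch.randn(64, 512))   # integrate 64 new memory tokens
print(pool.memory.shape)            # torch.Size([1024, 512])
```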

Model Editing

Self-Rewarding Language Models

3 code implementations18 Jan 2024 Weizhe Yuan, Richard Yuanzhe Pang, Kyunghyun Cho, Xian Li, Sainbayar Sukhbaatar, Jing Xu, Jason Weston

We posit that to achieve superhuman agents, future models require superhuman feedback in order to provide an adequate training signal.
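One iteration of that loop, sketched against a hypothetical model interface (`generate`, `judge_score`, and `dpo_train` are stand-ins, not the paper's code):

```python
# One self-rewarding iteration: the model scores its own samples via an
# LLM-as-a-Judge prompt and trains on the resulting preference pairs.
def self_rewarding_iteration(model, prompts, n_samples=4):
    pairs = []
    for p in prompts:
        candidates = [model.generate(p) for _ in range(n_samples)]
        # The same model acts as its own reward model here.
        scores = [model.judge_score(p, c) for c in candidates]
        best = candidates[scores.index(max(scores))]
        worst = candidates[scores.index(min(scores))]
        if best != worst:
            pairs.append((p, best, worst))   # (prompt, chosen, rejected)
    model.dpo_train(pairs)                   # preference optimization step
    return model
```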

Instruction Following Language Modelling

Branch-Solve-Merge Improves Large Language Model Evaluation and Generation

no code implementations23 Oct 2023 Swarnadeep Saha, Omer Levy, Asli Celikyilmaz, Mohit Bansal, Jason Weston, Xian Li

Large Language Models (LLMs) are frequently used for multi-faceted language generation and evaluation tasks that involve satisfying intricate user constraints or taking into account multiple aspects and criteria.
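A minimal sketch of the branch/solve/merge decomposition for evaluation, with a placeholder `generate` helper (the prompts paraphrase the idea, not the paper's):

```python
# Branch-Solve-Merge for evaluation: decompose into criteria, judge each
# independently, then fuse the partial judgments (illustrative sketch).
def branch_solve_merge(generate, question, response):
    # Branch: decompose the evaluation into independent criteria.
    criteria = generate(
        f"List the key criteria for judging an answer to: {question}"
    ).splitlines()
    # Solve: evaluate the response against each criterion separately.
    partials = [generate(f"Score this response on '{c}':\n{response}")
                for c in criteria]
    # Merge: combine the per-criterion judgments into a final verdict.
    return generate("Combine these judgments into one overall evaluation:\n"
                    + "\n".join(partials))
```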

Language Modelling Large Language Model +1

Long Short-Term Planning for Conversational Recommendation Systems

no code implementations23 Oct 2023 Xian Li, Hongguang Shi, Yunfei Wang, Yeqin Zhang, Xubin Li, Cam-Tu Nguyen

Specifically, the recommendation module predicts the long-term recommendation target based on the conversational context and the user history.

Attribute Conversational Recommendation +1

Chain-of-Verification Reduces Hallucination in Large Language Models

1 code implementation20 Sep 2023 Shehzaad Dhuliawala, Mojtaba Komeili, Jing Xu, Roberta Raileanu, Xian Li, Asli Celikyilmaz, Jason Weston

Generation of plausible yet incorrect factual information, termed hallucination, is an unsolved issue in large language models.
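The method's four-step loop can be sketched as follows, with a placeholder `generate` helper (the paper studies several variants of the verification step):

```python
# Chain-of-Verification: draft, plan checks, answer them independently of
# the draft, then revise (illustrative sketch).
def chain_of_verification(generate, query):
    draft = generate(query)                                  # 1. baseline draft
    questions = generate(                                    # 2. plan checks
        f"List fact-checking questions for this answer:\n{draft}").splitlines()
    answers = [generate(q) for q in questions]               # 3. answer independently
    return generate(                                         # 4. revise
        f"Original answer:\n{draft}\nVerification Q&A:\n"
        + "\n".join(f"Q: {q}\nA: {a}" for q, a in zip(questions, answers))
        + "\nRewrite the answer, fixing any inconsistencies.")
```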

Hallucination Text Generation

Fine-tune the pretrained ATST model for sound event detection

1 code implementation15 Sep 2023 Nian Shao, Xian Li, Xiaofei Li

In this work, we study how to fine-tune pretrained models for SED.

 Ranked #1 on Sound Event Detection on DESED (using extra training data)

Event Detection Self-Supervised Learning +1

Self-Alignment with Instruction Backtranslation

2 code implementations11 Aug 2023 Xian Li, Ping Yu, Chunting Zhou, Timo Schick, Omer Levy, Luke Zettlemoyer, Jason Weston, Mike Lewis

We present a scalable method to build a high-quality instruction-following language model by automatically labelling human-written text with corresponding instructions.
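Roughly, the pipeline pairs self-augmentation with self-curation; a sketch with placeholder `generate` and `score` helpers (the threshold and prompts are invented for illustration):

```python
# Instruction backtranslation: generate instructions for unlabeled web text
# (self-augmentation), then keep only pairs the model itself rates highly
# (self-curation). Helpers are hypothetical stand-ins for LLM calls.
def backtranslate(generate, score, web_documents, threshold=4.5):
    candidates = []
    for doc in web_documents:
        # Self-augmentation: predict the instruction the text would answer.
        instr = generate(f"What instruction is this text a response to?\n{doc}")
        candidates.append((instr, doc))
    # Self-curation: the model rates its own pairs on a quality scale.
    return [(i, d) for i, d in candidates if score(i, d) >= threshold]
```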

Instruction Following Language Modelling

Self-supervised Audio Teacher-Student Transformer for Both Clip-level and Frame-level Tasks

2 code implementations7 Jun 2023 Xian Li, Nian Shao, Xiaofei Li

In order to tackle both clip-level and frame-level tasks, this paper proposes Audio Teacher-Student Transformer (ATST), with a clip-level version (named ATST-Clip) and a frame-level version (named ATST-Frame), responsible for learning clip-level and frame-level representations, respectively.
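Both variants rest on a teacher-student setup in which the teacher is an exponential moving average (EMA) of the student; a generic PyTorch sketch of that update (the decay value is illustrative):

```python
# EMA teacher update used in teacher-student self-supervised learning:
# the teacher's weights slowly track the student's.
import torch

@torch.no_grad()
def ema_update(teacher: torch.nn.Module, student: torch.nn.Module,
               decay: float = 0.999):
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(decay).add_(s_param, alpha=1.0 - decay)
```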

Audio Classification Audio Tagging +8

Towards Open-World Product Attribute Mining: A Lightly-Supervised Approach

1 code implementation26 May 2023 Liyan Xu, Chenwei Zhang, Xian Li, Jingbo Shang, Jinho D. Choi

We present a new task setting for attribute mining on e-commerce products, serving as a practical solution to extract open-world attributes without extensive human intervention.

Attribute Attribute Mining

Towards A Unified View of Sparse Feed-Forward Network in Pretraining Large Language Model

no code implementations23 May 2023 Zeyu Leo Liu, Tim Dettmers, Xi Victoria Lin, Veselin Stoyanov, Xian Li

Large and sparse feed-forward layers (S-FFN) such as Mixture-of-Experts (MoE) have proven effective in scaling up Transformer model size for pretraining large language models.

Avg Language Modelling +1

Large Language Model Programs

no code implementations9 May 2023 Imanol Schlag, Sainbayar Sukhbaatar, Asli Celikyilmaz, Wen-tau Yih, Jason Weston, Jürgen Schmidhuber, Xian Li

In recent years, large pre-trained language models (LLMs) have demonstrated the ability to follow instructions and perform novel tasks from a few examples.

Language Modelling Large Language Model +1

OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization

1 code implementation22 Dec 2022 Srinivasan Iyer, Xi Victoria Lin, Ramakanth Pasunuru, Todor Mihaylov, Daniel Simig, Ping Yu, Kurt Shuster, Tianlu Wang, Qing Liu, Punit Singh Koura, Xian Li, Brian O'Horo, Gabriel Pereyra, Jeff Wang, Christopher Dewan, Asli Celikyilmaz, Luke Zettlemoyer, Ves Stoyanov

To this end, we create OPT-IML Bench: a large benchmark for Instruction Meta-Learning (IML) of 2000 NLP tasks consolidated into task categories from 8 existing benchmarks, and prepare an evaluation framework to measure three types of model generalizations: to tasks from fully held-out categories, to held-out tasks from seen categories, and to held-out instances from seen tasks.

Language Modelling Meta-Learning +2

Clinicopathological correlation of p40/TTF1 co-expression in NSCLC and review of related literature

no code implementations14 Sep 2022 LiAn Yang, Ming Xiao, Xian Li, Ya-lan Wang

Mutations in STK11/LKB1 and NF1 genes have been found in ADC and SQC and are often associated with drug resistance and poor prognosis, but STK11/NF1 co-mutation has not been reported and more cases are needed to reveal the association.

Specificity

Multilingual Neural Machine Translation with Deep Encoder and Multiple Shallow Decoders

no code implementations EACL 2021 Xiang Kong, Adithya Renduchintala, James Cross, Yuqing Tang, Jiatao Gu, Xian Li

Recent work in multilingual translation advances translation quality surpassing bilingual baselines using deep transformer models with increased capacity.

Decoder Machine Translation +1

ToKen: Task Decomposition and Knowledge Infusion for Few-Shot Hate Speech Detection

no code implementations25 May 2022 Badr AlKhamissi, Faisal Ladhak, Srini Iyer, Ves Stoyanov, Zornitsa Kozareva, Xian Li, Pascale Fung, Lambert Mathias, Asli Celikyilmaz, Mona Diab

Hate speech detection is complex; it relies on commonsense reasoning, knowledge of stereotypes, and an understanding of social nuance that differs from one culture to the next.

Cultural Vocal Bursts Intensity Prediction Few-Shot Learning +1

Lifting the Curse of Multilinguality by Pre-training Modular Transformers

no code implementations NAACL 2022 Jonas Pfeiffer, Naman Goyal, Xi Victoria Lin, Xian Li, James Cross, Sebastian Riedel, Mikel Artetxe

Multilingual pre-trained models are known to suffer from the curse of multilinguality, which causes per-language performance to drop as they cover more languages.
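A sketch of the modular idea: shared weights plus a per-language module selected at forward time (shapes and module layout are illustrative assumptions, not the released X-Mod code):

```python
# Modular transformer layer: shared parameters plus language-specific
# bottleneck modules, so capacity grows with languages without interference.
import torch

class ModularLayer(torch.nn.Module):
    def __init__(self, languages, d_model=768, d_adapter=192):
        super().__init__()
        self.shared = torch.nn.Linear(d_model, d_model)   # shared weights
        self.lang_modules = torch.nn.ModuleDict({
            lang: torch.nn.Sequential(torch.nn.Linear(d_model, d_adapter),
                                      torch.nn.ReLU(),
                                      torch.nn.Linear(d_adapter, d_model))
            for lang in languages})

    def forward(self, x, lang: str):
        h = self.shared(x)
        return h + self.lang_modules[lang](h)   # language-specific path
```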

named-entity-recognition Named Entity Recognition +3

OA-Mine: Open-World Attribute Mining for E-Commerce Products with Weak Supervision

1 code implementation29 Apr 2022 Xinyang Zhang, Chenwei Zhang, Xian Li, Xin Luna Dong, Jingbo Shang, Christos Faloutsos, Jiawei Han

Most prior works on this matter mine new values for a set of known attributes but cannot handle new attributes that arose from constantly changing data.

Attribute Attribute Mining +2

ATST: Audio Representation Learning with Teacher-Student Transformer

4 code implementations26 Apr 2022 Xian Li, Xiaofei Li

Self-supervised learning (SSL) learns knowledge from a large amount of unlabeled data, and then transfers the knowledge to a specific problem with a limited number of labeled data.

Audio Classification Instrument Recognition +5

Efficient Large Scale Language Modeling with Mixtures of Experts

no code implementations20 Dec 2021 Mikel Artetxe, Shruti Bhosale, Naman Goyal, Todor Mihaylov, Myle Ott, Sam Shleifer, Xi Victoria Lin, Jingfei Du, Srinivasan Iyer, Ramakanth Pasunuru, Giri Anantharaman, Xian Li, Shuohui Chen, Halil Akin, Mandeep Baines, Louis Martin, Xing Zhou, Punit Singh Koura, Brian O'Horo, Jeff Wang, Luke Zettlemoyer, Mona Diab, Zornitsa Kozareva, Ves Stoyanov

This paper presents a detailed empirical study of how autoregressive MoE language models scale in comparison with dense models in a wide range of settings: in- and out-of-domain language modeling, zero- and few-shot priming, and full-shot fine-tuning.
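For reference, a minimal top-2 MoE feed-forward layer of the kind such studies compare against dense layers (generic PyTorch sketch, not Meta's fairseq implementation):

```python
# Minimal top-2 mixture-of-experts FFN: a router picks 2 of N expert MLPs
# per token, so compute per token stays roughly constant as N grows.
import torch

class MoELayer(torch.nn.Module):
    def __init__(self, d_model=512, d_ff=2048, num_experts=8, top_k=2):
        super().__init__()
        self.router = torch.nn.Linear(d_model, num_experts)
        self.experts = torch.nn.ModuleList(
            torch.nn.Sequential(torch.nn.Linear(d_model, d_ff),
                                torch.nn.ReLU(),
                                torch.nn.Linear(d_ff, d_model))
            for _ in range(num_experts))
        self.top_k = top_k

    def forward(self, x):                       # x: (tokens, d_model)
        weights, idx = self.router(x).softmax(-1).topk(self.top_k, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):             # route each token to k experts
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k, None] * expert(x[mask])
        return out
```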

Language Modelling

Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs

1 code implementation26 Nov 2021 Peter Hase, Mona Diab, Asli Celikyilmaz, Xian Li, Zornitsa Kozareva, Veselin Stoyanov, Mohit Bansal, Srinivasan Iyer

In this paper, we discuss approaches to detecting when models have beliefs about the world, and we improve on methods for updating model beliefs to be more truthful, with a focus on methods based on learned optimizers or hypernetworks.

Distributionally Robust Multilingual Machine Translation

1 code implementation EMNLP 2021 Chunting Zhou, Daniel Levy, Xian Li, Marjan Ghazvininejad, Graham Neubig

Multilingual neural machine translation (MNMT) learns to translate multiple language pairs with a single model, potentially improving both the accuracy and the memory-efficiency of deployed models.
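The distributionally robust objective referred to here has the standard group-DRO form below (notation is generic, not copied from the paper, which constrains the weights to a ball around the empirical data distribution):

```latex
% Worst-case weighted loss over N language pairs, with \lambda on the simplex:
\min_{\theta} \; \max_{\lambda \in \Delta_N} \; \sum_{i=1}^{N} \lambda_i \, \mathcal{L}_i(\theta)
```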

Machine Translation Translation

Multilingual Speech Translation from Efficient Finetuning of Pretrained Models

no code implementations ACL 2021 Xian Li, Changhan Wang, Yun Tang, Chau Tran, Yuqing Tang, Juan Pino, Alexei Baevski, Alexis Conneau, Michael Auli

We present a simple yet effective approach to build multilingual speech-to-text (ST) translation through efficient transfer learning from a pretrained speech encoder and text decoder.
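The wiring of the approach can be sketched as follows; the module interfaces are hypothetical, and the paper's efficient-finetuning recipe (which parameters stay trainable) is more selective than shown here:

```python
# Conceptual coupling of a pretrained speech encoder with a pretrained text
# decoder via a small trained bridge (illustrative sketch).
import torch

class SpeechTranslator(torch.nn.Module):
    def __init__(self, speech_encoder, text_decoder, d_enc, d_dec):
        super().__init__()
        self.encoder = speech_encoder                 # e.g. a wav2vec-style model
        self.adapter = torch.nn.Linear(d_enc, d_dec)  # small trained bridge
        self.decoder = text_decoder                   # e.g. an mBART-style decoder
        for p in self.encoder.parameters():           # freeze the encoder
            p.requires_grad = False

    def forward(self, audio, prev_tokens):
        h = self.adapter(self.encoder(audio))   # map speech states to text space
        return self.decoder(prev_tokens, encoder_hidden=h)
```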

Decoder Text Generation +2

FST: the FAIR Speech Translation System for the IWSLT21 Multilingual Shared Task

no code implementations ACL (IWSLT) 2021 Yun Tang, Hongyu Gong, Xian Li, Changhan Wang, Juan Pino, Holger Schwenk, Naman Goyal

In this paper, we describe our end-to-end multilingual speech translation system submitted to the IWSLT 2021 evaluation campaign on the Multilingual Speech Translation shared task.

Transfer Learning Translation

Unaware Fairness: Hierarchical Random Forest for Protected Classes

no code implementations30 Jun 2021 Xian Li

Procedural fairness has been a public concern, which leads to controversy when making decisions with respect to protected classes, such as race, social status, and disability.

Fairness

Pay Better Attention to Attention: Head Selection in Multilingual and Multi-Domain Sequence Modeling

no code implementations NeurIPS 2021 Hongyu Gong, Yun Tang, Juan Pino, Xian Li

We further propose attention sharing strategies to facilitate parameter sharing and specialization in multilingual and multi-domain sequence modeling.

speech-recognition Speech Recognition +2

Adaptive Sparse Transformer for Multilingual Translation

no code implementations15 Apr 2021 Hongyu Gong, Xian Li, Dmitriy Genzel

Based on these insights, we propose an adaptive and sparse architecture for multilingual modeling, and train the model to learn shared and language-specific parameters to improve the positive transfer and mitigate the interference.

Machine Translation Transfer Learning +1

Robust Optimization for Multilingual Translation with Imbalanced Data

no code implementations NeurIPS 2021 Xian Li, Hongyu Gong

We show that the common training method of upsampling low-resource languages cannot robustly optimize population loss, risking either underfitting high-resource languages or overfitting low-resource ones.
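The upsampling practice being critiqued is typically temperature-based sampling, where language $i$ with $D_i$ training examples is drawn with probability:

```latex
p_i \propto \left( \frac{D_i}{\sum_j D_j} \right)^{1/T}
% T = 1 recovers proportional sampling; larger T upweights low-resource languages.
```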

Machine Translation Translation

Improving Zero-Shot Translation by Disentangling Positional Information

1 code implementation ACL 2021 Danni Liu, Jan Niehues, James Cross, Francisco Guzmán, Xian Li

The difficulty of generalizing to new translation directions suggests the model representations are highly specific to those language pairs seen in training.

Machine Translation Translation

Towards Understanding the Behaviors of Optimal Deep Active Learning Algorithms

1 code implementation29 Dec 2020 Yilun Zhou, Adithya Renduchintala, Xian Li, Sida Wang, Yashar Mehdad, Asish Ghoshal

Active learning (AL) algorithms may achieve better performance with fewer labeled examples because the model guides the data selection process.
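The generic pool-based loop that sentence describes, sketched with uncertainty sampling and a hypothetical scikit-learn-style model (this is one canonical AL strategy, not the paper's optimal-algorithm analysis):

```python
# Pool-based active learning with least-confidence sampling: the current
# model picks which unlabeled points get labeled next.
import numpy as np

def uncertainty_sampling_loop(model, X_pool, y_oracle, budget, batch=10):
    labeled_idx = list(np.random.choice(len(X_pool), batch, replace=False))
    while len(labeled_idx) < budget:
        model.fit(X_pool[labeled_idx], y_oracle[labeled_idx])
        probs = model.predict_proba(X_pool)      # model guides selection
        uncertainty = 1.0 - probs.max(axis=1)    # least-confident score
        uncertainty[labeled_idx] = -np.inf       # skip already-labeled points
        picks = np.argsort(uncertainty)[-batch:] # most uncertain points
        labeled_idx.extend(picks.tolist())
    return model
```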

Active Learning

Multilingual Speech Translation with Efficient Finetuning of Pretrained Models

no code implementations24 Oct 2020 Xian Li, Changhan Wang, Yun Tang, Chau Tran, Yuqing Tang, Juan Pino, Alexei Baevski, Alexis Conneau, Michael Auli

We present a simple yet effective approach to build multilingual speech-to-text (ST) translation by efficient transfer learning from a pretrained speech encoder and text decoder.

Cross-Lingual Transfer Decoder +3
