2 code implementations • 8 Apr 2024 • Bo Peng, Daniel Goldstein, Quentin Anthony, Alon Albalak, Eric Alcaide, Stella Biderman, Eugene Cheah, Xingjian Du, Teddy Ferdinan, Haowen Hou, Przemysław Kazienko, Kranthi Kiran GV, Jan Kocoń, Bartłomiej Koptyra, Satyapriya Krishna, Ronald McClelland Jr., Niklas Muennighoff, Fares Obeid, Atsushi Saito, Guangyu Song, Haoqin Tu, Stanisław Woźniak, Ruichong Zhang, Bingchen Zhao, Qihang Zhao, Peng Zhou, Jian Zhu, Rui-Jie Zhu
We present Eagle (RWKV-5) and Finch (RWKV-6), sequence models improving upon the RWKV (RWKV-4) architecture.
1 code implementation • 26 Feb 2024 • Alon Albalak, Yanai Elazar, Sang Michael Xie, Shayne Longpre, Nathan Lambert, Xinyi Wang, Niklas Muennighoff, Bairu Hou, Liangming Pan, Haewon Jeong, Colin Raffel, Shiyu Chang, Tatsunori Hashimoto, William Yang Wang
A major factor in the recent success of large language models is the use of enormous and ever-growing text datasets for unsupervised pre-training.
no code implementations • 5 Dec 2023 • Alon Albalak, Liangming Pan, Colin Raffel, William Yang Wang
The data used to pretrain large language models has a decisive impact on a model's downstream performance, which has led to a large body of work on data selection methods that aim to automatically determine the most suitable data to use for pretraining.
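As a toy illustration of the kind of data selection method this line refers to, the sketch below keeps only the highest-scoring fraction of candidate documents under an arbitrary quality heuristic. The function name, the `score` callback, and the top-k strategy are assumptions for illustration, not a method from the paper:

```python
def select_pretraining_data(documents, score, keep_fraction=0.5):
    """Keep the highest-scoring fraction of candidate documents.

    `score(doc)` is any quality heuristic (e.g., a classifier score or a
    negated language-model perplexity). Hypothetical sketch only; real
    data selection methods are far more involved.
    """
    ranked = sorted(documents, key=score, reverse=True)
    k = max(1, int(len(ranked) * keep_fraction))  # always keep at least one doc
    return ranked[:k]
```

With `score=len` and `keep_fraction=0.5`, the two longest documents survive; swapping in a different scorer changes the selection policy without touching the pipeline.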
5 code implementations • 22 May 2023 • Bo Peng, Eric Alcaide, Quentin Anthony, Alon Albalak, Samuel Arcadinho, Stella Biderman, Huanqi Cao, Xin Cheng, Michael Chung, Matteo Grella, Kranthi Kiran GV, Xuzheng He, Haowen Hou, Jiaju Lin, Przemysław Kazienko, Jan Kocoń, Jiaming Kong, Bartłomiej Koptyra, Hayden Lau, Krishna Sri Ipsit Mantri, Ferdinand Mom, Atsushi Saito, Guangyu Song, Xiangru Tang, Bolun Wang, Johan S. Wind, Stanisław Woźniak, Ruichong Zhang, Zhenyuan Zhang, Qihang Zhao, Peng Zhou, Qinghua Zhou, Jian Zhu, Rui-Jie Zhu
This work presents a significant step towards reconciling trade-offs between computational efficiency and model performance in sequence processing tasks.
Ranked #22 on Natural Language Inference on WNLI
1 code implementation • 20 May 2023 • Liangming Pan, Alon Albalak, Xinyi Wang, William Yang Wang
We also introduce a self-refinement module, which utilizes the symbolic solver's error messages to revise symbolic formalizations.
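The self-refinement loop described above can be sketched as the control flow below; `formalize` (an LLM call producing a symbolic program) and `solver` are hypothetical stand-ins, so this is a minimal sketch of the idea, not the paper's implementation:

```python
def self_refine(question, formalize, solver, max_rounds=3):
    """Iteratively repair a symbolic formalization using solver error messages.

    `formalize(question, feedback)` returns a symbolic program, optionally
    conditioned on the previous round's error message; `solver(program)`
    returns (answer, error) with error=None on success. Both callables are
    hypothetical placeholders for illustration.
    """
    feedback = None
    for _ in range(max_rounds):
        program = formalize(question, feedback)
        answer, error = solver(program)
        if error is None:
            return answer
        feedback = error  # feed the solver's error message back for revision
    return None  # give up after max_rounds failed attempts
```

The key design point is that the solver's error message, not just a pass/fail signal, is routed back into the next formalization attempt.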
1 code implementation • NeurIPS 2023 • Alon Albalak, Colin Raffel, William Yang Wang
In this work, we focus on Few-shot Learning with Auxiliary Data (FLAD), a training paradigm that assumes access to auxiliary data during few-shot learning in hopes of improving generalization.
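The FLAD setup can be illustrated with a batch sampler that mixes the few-shot target examples with data drawn from auxiliary datasets. The fixed mixing ratio and uniform choice of auxiliary dataset here are simplifying assumptions for illustration; they do not reproduce the paper's selection algorithm:

```python
import random

def mixed_batches(target_examples, aux_datasets, aux_ratio=0.5,
                  num_batches=4, batch_size=4, seed=0):
    """Yield batches mixing few-shot target data with auxiliary data.

    `aux_datasets` maps a dataset name to its examples. Each batch draws
    roughly `aux_ratio` of its items from one uniformly chosen auxiliary
    dataset and the rest from the target task. Hypothetical sketch of the
    FLAD training setup, not the paper's method.
    """
    rng = random.Random(seed)
    n_aux = int(batch_size * aux_ratio)
    for _ in range(num_batches):
        aux_name = rng.choice(sorted(aux_datasets))          # pick one auxiliary source
        batch = rng.sample(target_examples, batch_size - n_aux)  # few-shot target items
        batch += rng.choices(aux_datasets[aux_name], k=n_aux)    # auxiliary items
        yield batch
```

A more faithful version would adapt the choice of auxiliary dataset over training (e.g., with a bandit-style policy) rather than sampling it uniformly.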
1 code implementation • 20 Dec 2022 • Yi-Lin Tuan, Alon Albalak, Wenda Xu, Michael Saxon, Connor Pryor, Lise Getoor, William Yang Wang
Despite their widespread adoption, neural conversation models have yet to exhibit natural chat capabilities with humans.
no code implementations • 21 Oct 2022 • Josiah Ross, Luke Yoffe, Alon Albalak, William Yang Wang
Transfer learning is an exciting area of Natural Language Processing that has the potential to both improve model performance and increase data efficiency.
no code implementations • 8 Oct 2022 • Alon Albalak, Akshat Shrivastava, Chinnadhurai Sankar, Adithya Sagar, Mike Ross
Multi-task learning (MTL), instruction tuning, and prompting have recently been shown to improve the generalizability of large language models to new tasks.
no code implementations • 14 Jul 2022 • Eriq Augustine, Pegah Jandaghi, Alon Albalak, Connor Pryor, Charles Dickens, William Wang, Lise Getoor
Creating agents that can both appropriately respond to conversations and understand complex human linguistic tendencies and social cues has been a long-standing challenge in the NLP community.
no code implementations • 27 May 2022 • Connor Pryor, Charles Dickens, Eriq Augustine, Alon Albalak, William Wang, Lise Getoor
In this paper, we introduce Neural Probabilistic Soft Logic (NeuPSL), a novel neuro-symbolic (NeSy) framework that unites state-of-the-art symbolic reasoning with the low-level perception of deep neural networks.
1 code implementation • 12 May 2022 • Alon Albalak, Yi-Lin Tuan, Pegah Jandaghi, Connor Pryor, Luke Yoffe, Deepak Ramachandran, Lise Getoor, Jay Pujara, William Yang Wang
Task transfer, transferring knowledge contained in related tasks, holds the promise of reducing the quantity of labeled data required to fine-tune language models.
1 code implementation • 26 Jan 2022 • Alon Albalak, Sharon Levy, William Yang Wang
Open-retrieval question answering systems are generally trained and tested on large datasets in well-established domains.
1 code implementation • NLP4ConvAI (ACL) 2022 • Alon Albalak, Varun Embar, Yi-Lin Tuan, Lise Getoor, William Yang Wang
Existing studies of cross-sentence relation extraction in long-form multi-party conversations aim to improve relation extraction without considering the explainability of such methods.
Ranked #7 on Dialog Relation Extraction on DialogRE
1 code implementation • EMNLP 2021 • Michael Saxon, Sharon Levy, Xinyi Wang, Alon Albalak, William Yang Wang
Broader disclosive transparency (truth and clarity in communication regarding the function of AI systems) is widely considered desirable.