2 code implementations • 8 Apr 2024 • Bo Peng, Daniel Goldstein, Quentin Anthony, Alon Albalak, Eric Alcaide, Stella Biderman, Eugene Cheah, Xingjian Du, Teddy Ferdinan, Haowen Hou, Przemysław Kazienko, Kranthi Kiran GV, Jan Kocoń, Bartłomiej Koptyra, Satyapriya Krishna, Ronald McClelland Jr., Niklas Muennighoff, Fares Obeid, Atsushi Saito, Guangyu Song, Haoqin Tu, Stanisław Woźniak, Ruichong Zhang, Bingchen Zhao, Qihang Zhao, Peng Zhou, Jian Zhu, Rui-Jie Zhu
We present Eagle (RWKV-5) and Finch (RWKV-6), sequence models improving upon the RWKV (RWKV-4) architecture.
1 code implementation • 26 Feb 2024 • Alon Albalak, Yanai Elazar, Sang Michael Xie, Shayne Longpre, Nathan Lambert, Xinyi Wang, Niklas Muennighoff, Bairu Hou, Liangming Pan, Haewon Jeong, Colin Raffel, Shiyu Chang, Tatsunori Hashimoto, William Yang Wang
A major factor in the recent success of large language models is the use of enormous and ever-growing text datasets for unsupervised pre-training.
no code implementations • 5 Dec 2023 • Alon Albalak, Liangming Pan, Colin Raffel, William Yang Wang
The data used to pretrain large language models has a decisive impact on a model's downstream performance, which has led to a large body of work on data selection methods that aim to automatically determine the most suitable data to use for pretraining.
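As a toy illustration of the kind of data selection method this line refers to, the sketch below keeps only the highest-scoring fraction of candidate documents under an arbitrary quality heuristic. The function name, the `score` callback, and the top-k strategy are assumptions for illustration, not a method from the paper:

```python
def select_pretraining_data(documents, score, keep_fraction=0.5):
    """Keep the highest-scoring fraction of candidate documents.

    `score(doc)` is any quality heuristic (e.g., a classifier score or a
    negated language-model perplexity). Hypothetical sketch only; real
    data selection methods are far more involved.
    """
    ranked = sorted(documents, key=score, reverse=True)
    k = max(1, int(len(ranked) * keep_fraction))  # always keep at least one doc
    return ranked[:k]
```

With `score=len` and `keep_fraction=0.5`, the two longest documents survive; swapping in a different scorer changes the selection policy without touching the pipeline.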
5 code implementations • 22 May 2023 • Bo Peng, Eric Alcaide, Quentin Anthony, Alon Albalak, Samuel Arcadinho, Stella Biderman, Huanqi Cao, Xin Cheng, Michael Chung, Matteo Grella, Kranthi Kiran GV, Xuzheng He, Haowen Hou, Jiaju Lin, Przemysław Kazienko, Jan Kocoń, Jiaming Kong, Bartłomiej Koptyra, Hayden Lau, Krishna Sri Ipsit Mantri, Ferdinand Mom, Atsushi Saito, Guangyu Song, Xiangru Tang, Bolun Wang, Johan S. Wind, Stanisław Woźniak, Ruichong Zhang, Zhenyuan Zhang, Qihang Zhao, Peng Zhou, Qinghua Zhou, Jian Zhu, Rui-Jie Zhu
This work presents a significant step towards reconciling trade-offs between computational efficiency and model performance in sequence processing tasks.
Ranked #22 on Natural Language Inference on WNLI
1 code implementation • 20 May 2023 • Liangming Pan, Alon Albalak, Xinyi Wang, William Yang Wang
We also introduce a self-refinement module, which utilizes the symbolic solver's error messages to revise symbolic formalizations.
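The self-refinement loop described above can be sketched as the control flow below; `formalize` (an LLM call producing a symbolic program) and `solver` are hypothetical stand-ins, so this is a minimal sketch of the idea, not the paper's implementation:

```python
def self_refine(question, formalize, solver, max_rounds=3):
    """Iteratively repair a symbolic formalization using solver error messages.

    `formalize(question, feedback)` returns a symbolic program, optionally
    conditioned on the previous round's error message; `solver(program)`
    returns (answer, error) with error=None on success. Both callables are
    hypothetical placeholders for illustration.
    """
    feedback = None
    for _ in range(max_rounds):
        program = formalize(question, feedback)
        answer, error = solver(program)
        if error is None:
            return answer
        feedback = error  # feed the solver's error message back for revision
    return None  # give up after max_rounds failed attempts
```

The key design point is that the solver's error message, not just a pass/fail signal, is routed back into the next formalization attempt.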
1 code implementation • NeurIPS 2023 • Alon Albalak, Colin Raffel, William Yang Wang
In this work, we focus on Few-shot Learning with Auxiliary Data (FLAD), a training paradigm that assumes access to auxiliary data during few-shot learning in hopes of improving generalization.
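The FLAD setup can be illustrated with a batch sampler that mixes the few-shot target examples with data drawn from auxiliary datasets. The fixed mixing ratio and uniform choice of auxiliary dataset here are simplifying assumptions for illustration; they do not reproduce the paper's selection algorithm:

```python
import random

def mixed_batches(target_examples, aux_datasets, aux_ratio=0.5,
                  num_batches=4, batch_size=4, seed=0):
    """Yield batches mixing few-shot target data with auxiliary data.

    `aux_datasets` maps a dataset name to its examples. Each batch draws
    roughly `aux_ratio` of its items from one uniformly chosen auxiliary
    dataset and the rest from the target task. Hypothetical sketch of the
    FLAD training setup, not the paper's method.
    """
    rng = random.Random(seed)
    n_aux = int(batch_size * aux_ratio)
    for _ in range(num_batches):
        aux_name = rng.choice(sorted(aux_datasets))          # pick one auxiliary source
        batch = rng.sample(target_examples, batch_size - n_aux)  # few-shot target items
        batch += rng.choices(aux_datasets[aux_name], k=n_aux)    # auxiliary items
        yield batch
```

A more faithful version would adapt the choice of auxiliary dataset over training (e.g., with a bandit-style policy) rather than sampling it uniformly.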
1 code implementation • 20 Dec 2022 • Yi-Lin Tuan, Alon Albalak, Wenda Xu, Michael Saxon, Connor Pryor, Lise Getoor, William Yang Wang
Despite their widespread adoption, neural conversation models have yet to exhibit natural chat capabilities with humans.
no code implementations • 21 Oct 2022 • Josiah Ross, Luke Yoffe, Alon Albalak, William Yang Wang
Transfer learning is an exciting area of Natural Language Processing that has the potential to both improve model performance and increase data efficiency.
no code implementations • 8 Oct 2022 • Alon Albalak, Akshat Shrivastava, Chinnadhurai Sankar, Adithya Sagar, Mike Ross
Multi-task learning (MTL), instruction tuning, and prompting have recently been shown to improve the generalizability of large language models to new tasks.
no code implementations • 14 Jul 2022 • Eriq Augustine, Pegah Jandaghi, Alon Albalak, Connor Pryor, Charles Dickens, William Wang, Lise Getoor
Creating agents that can both appropriately respond to conversations and understand complex human linguistic tendencies and social cues has been a long-standing challenge in the NLP community.
no code implementations • 27 May 2022 • Connor Pryor, Charles Dickens, Eriq Augustine, Alon Albalak, William Wang, Lise Getoor
In this paper, we introduce Neural Probabilistic Soft Logic (NeuPSL), a novel neuro-symbolic (NeSy) framework that unites state-of-the-art symbolic reasoning with the low-level perception of deep neural networks.
1 code implementation • 12 May 2022 • Alon Albalak, Yi-Lin Tuan, Pegah Jandaghi, Connor Pryor, Luke Yoffe, Deepak Ramachandran, Lise Getoor, Jay Pujara, William Yang Wang
Task transfer, transferring knowledge contained in related tasks, holds the promise of reducing the quantity of labeled data required to fine-tune language models.
1 code implementation • 26 Jan 2022 • Alon Albalak, Sharon Levy, William Yang Wang
Open-retrieval question answering systems are generally trained and tested on large datasets in well-established domains.
1 code implementation • NLP4ConvAI (ACL) 2022 • Alon Albalak, Varun Embar, Yi-Lin Tuan, Lise Getoor, William Yang Wang
Existing studies of cross-sentence relation extraction in long-form multi-party conversations aim to improve relation extraction without considering the explainability of such methods.
Ranked #7 on Dialog Relation Extraction on DialogRE
1 code implementation • EMNLP 2021 • Michael Saxon, Sharon Levy, Xinyi Wang, Alon Albalak, William Yang Wang
Broader disclosive transparency (truth and clarity in communication regarding the function of AI systems) is widely considered desirable.