1 code implementation • 11 May 2022 • Haokun Liu, Derek Tam, Mohammed Muqeeth, Jay Mohta, Tenghao Huang, Mohit Bansal, Colin Raffel
Few-shot in-context learning (ICL) incurs substantial computational, memory, and storage costs because it involves processing all of the training examples every time a prediction is made. (A cost sketch follows below.)
Ranked #1 on Few-Shot Text Classification on RAFT
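The cost claim above can be made concrete with a small sketch. This is not code from the paper: the GPT-2 tokenizer, the toy sentiment examples, and the prompt template are placeholder assumptions. It only shows that the number of tokens processed per prediction grows with the number of in-context examples.

```python
# Minimal sketch (not the paper's code) of why ICL is costly: every prediction
# re-encodes all k labeled "training" examples in the prompt.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # any causal-LM tokenizer

# Hypothetical few-shot pool; in practice these are the task's labeled examples.
train_examples = [("The movie was wonderful.", "positive"),
                  ("I wasted two hours.", "negative")] * 16  # k = 32 shots

def build_icl_prompt(examples, query):
    """Concatenate every labeled example before the query, as ICL requires."""
    demos = "\n".join(f"Review: {x}\nSentiment: {y}" for x, y in examples)
    return f"{demos}\nReview: {query}\nSentiment:"

prompt = build_icl_prompt(train_examples, "A charming, clever film.")

# The whole prompt is re-processed for *each* new prediction, so compute and
# memory grow roughly linearly with the number of in-context examples.
print("tokens per prediction:", len(tokenizer(prompt)["input_ids"]))
```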
no code implementations • EMNLP (BlackboxNLP) 2021 • Jason Phang, Haokun Liu, Samuel R. Bowman
Despite the success of fine-tuning pretrained language encoders like BERT for downstream natural language understanding (NLU) tasks, it is still poorly understood how neural networks change after fine-tuning.
no code implementations • ACL 2021 • Clara Vania, Phu Mon Htut, William Huang, Dhara Mungra, Richard Yuanzhe Pang, Jason Phang, Haokun Liu, Kyunghyun Cho, Samuel R. Bowman
Recent years have seen numerous NLP datasets introduced to evaluate the performance of fine-tuned models on natural language understanding tasks.
1 code implementation • EMNLP 2020 • Alex Warstadt, Yian Zhang, Haau-Sing Li, Haokun Liu, Samuel R. Bowman
One reason pretraining on self-supervised linguistic tasks is effective is that it teaches models features that are helpful for language understanding.
1 code implementation • EMNLP (insights) 2020 • William Huang, Haokun Liu, Samuel R. Bowman
A growing body of work shows that models exploit annotation artifacts to achieve state-of-the-art performance on standard crowdsourced benchmarks---datasets collected from crowdworkers to create an evaluation task---while still failing on out-of-domain examples for the same task.
Natural Language Inference, Natural Language Understanding, +1
1 code implementation • EMNLP 2020 • Haokun Liu, William Huang, Dhara A. Mungra, Samuel R. Bowman
Performance on the Winograd Schema Challenge (WSC), a respected English commonsense reasoning benchmark, recently rocketed from chance accuracy to 89% on the SuperGLUE leaderboard, with relatively little corroborating evidence of a correspondingly large improvement in reasoning ability.
no code implementations • ACL 2020 • Yada Pruksachatkun, Jason Phang, Haokun Liu, Phu Mon Htut, Xiaoyi Zhang, Richard Yuanzhe Pang, Clara Vania, Katharina Kann, Samuel R. Bowman
However, we fail to observe more granular correlations between probing and target task performance, highlighting the need for further work on broad-coverage probing benchmarks.
no code implementations • AACL 2020 • Jason Phang, Iacer Calixto, Phu Mon Htut, Yada Pruksachatkun, Haokun Liu, Clara Vania, Katharina Kann, Samuel R. Bowman
Intermediate-task training---fine-tuning a pretrained model on an intermediate task before fine-tuning again on the target task---often improves model performance substantially on language understanding tasks in monolingual English settings. (A minimal two-stage sketch follows below.)
Ranked #15 on Zero-Shot Cross-Lingual Transfer on XTREME
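For illustration, here is a minimal two-stage fine-tuning sketch in the spirit of intermediate-task training. It is not the authors' pipeline: the model name, toy examples, and hyperparameters are placeholder assumptions, and real experiments would use full intermediate and target datasets.

```python
# Two-stage fine-tuning sketch: stage 1 on an intermediate task, stage 2 on
# the target task, reusing the same pretrained encoder weights throughout.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

def make_loader(texts, labels, batch_size=2):
    """Tokenize a toy dataset and wrap it in a DataLoader."""
    enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    dataset = torch.utils.data.TensorDataset(
        enc["input_ids"], enc["attention_mask"], torch.tensor(labels))
    return torch.utils.data.DataLoader(dataset, batch_size=batch_size)

def fine_tune(model, loader, epochs=1, lr=2e-5):
    """One standard fine-tuning stage; reused for both training phases."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for input_ids, attention_mask, labels in loader:
            loss = model(input_ids=input_ids,
                         attention_mask=attention_mask,
                         labels=labels).loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
    return model

# Toy stand-ins for the intermediate and target tasks.
intermediate_loader = make_loader(
    ["a premise entails a hypothesis", "these sentences contradict"], [1, 0])
target_loader = make_loader(["great film", "dull and slow"], [1, 0])

model = fine_tune(model, intermediate_loader)  # stage 1: intermediate task
model = fine_tune(model, target_loader)        # stage 2: target task, same weights
```

When the intermediate and target tasks have different label spaces, the classification head is typically replaced between stages, so only the encoder weights transfer.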
6 code implementations • ACL 2020 • Yada Pruksachatkun, Phil Yeres, Haokun Liu, Jason Phang, Phu Mon Htut, Alex Wang, Ian Tenney, Samuel R. Bowman
We introduce jiant, an open source toolkit for conducting multitask and transfer learning experiments on English NLU tasks.
3 code implementations • TACL 2020 • Alex Warstadt, Alicia Parrish, Haokun Liu, Anhad Mohananey, Wei Peng, Sheng-Fu Wang, Samuel R. Bowman
We introduce The Benchmark of Linguistic Minimal Pairs (shortened to BLiMP), a challenge set for evaluating what language models (LMs) know about major grammatical phenomena in English.
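Minimal pairs of this kind are typically scored by full-sentence probability comparison: the LM passes an item if it assigns higher probability to the acceptable sentence than to its minimally different counterpart. A minimal sketch, assuming GPT-2 and a hypothetical pair written in the style of a BLiMP item (not an actual BLiMP example):

```python
# Minimal-pair scoring sketch (not the BLiMP evaluation code): compare total
# sentence log-probabilities under a causal LM.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def sentence_log_prob(sentence):
    """Sum of token log-probabilities under the LM."""
    ids = tokenizer(sentence, return_tensors="pt")["input_ids"]
    with torch.no_grad():
        # labels=ids gives the mean cross-entropy over predicted tokens
        loss = model(ids, labels=ids).loss
    return -loss.item() * (ids.shape[1] - 1)  # un-average to a total log-prob

good = "The cats annoy Tim."   # hypothetical acceptable sentence
bad = "The cats annoys Tim."   # minimally different unacceptable variant
print("pass" if sentence_log_prob(good) > sentence_log_prob(bad) else "fail")
```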
1 code implementation • IJCNLP 2019 • Alex Warstadt, Yu Cao, Ioana Grosu, Wei Peng, Hagen Blix, Yining Nie, Anna Alsop, Shikha Bordia, Haokun Liu, Alicia Parrish, Sheng-Fu Wang, Jason Phang, Anhad Mohananey, Phu Mon Htut, Paloma Jeretič, Samuel R. Bowman
We conclude that a variety of methods is necessary to reveal all relevant aspects of a model's grammatical knowledge in a given domain.
no code implementations • COLING 2018 • Meng Zou, Xihan Li, Haokun Liu, Zhi-Hong Deng
Neural encoder-decoder models have been widely applied to conversational response generation, which has been an active research topic in recent years.
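As a point of reference, here is a minimal encoder-decoder sketch for response generation. It is not the model proposed in the paper: a single-layer GRU encoder and decoder with a toy vocabulary, trained by maximum likelihood with teacher forcing.

```python
# Encoder-decoder response generation sketch: the encoder summarizes the input
# utterance, the decoder generates the reply token by token.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, EMB, HID = 1000, 64, 128  # toy sizes

class Seq2Seq(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, EMB)
        self.encoder = nn.GRU(EMB, HID, batch_first=True)
        self.decoder = nn.GRU(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)

    def forward(self, src_ids, dec_in_ids):
        _, h = self.encoder(self.embed(src_ids))              # encode the utterance
        dec_out, _ = self.decoder(self.embed(dec_in_ids), h)  # condition the reply on it
        return self.out(dec_out)                              # per-step vocabulary logits

model = Seq2Seq()
src = torch.randint(0, VOCAB, (2, 10))    # batch of 2 toy input utterances
reply = torch.randint(0, VOCAB, (2, 8))   # their toy reference replies

dec_in, dec_target = reply[:, :-1], reply[:, 1:]  # teacher forcing: shift by one
logits = model(src, dec_in)
loss = F.cross_entropy(logits.reshape(-1, VOCAB), dec_target.reshape(-1))
loss.backward()  # one maximum-likelihood training step
```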