2 code implementations • 28 Feb 2024 • Nihal V. Nayak, Yiyang Nan, Avi Trost, Stephen H. Bach
Overall, we show that learning with synthetic instruction tuning datasets is an effective way to adapt language models to new domains.
1 code implementation • 20 Dec 2022 • Martha Lewis, Nihal V. Nayak, Peilin Yu, Qinan Yu, Jack Merullo, Stephen H. Bach, Ellie Pavlick
In this work, we focus on the ability of a large pretrained vision and language model (CLIP) to encode compositional concepts and to bind variables in a structure-sensitive way (e.g., differentiating "cube behind sphere" from "sphere behind cube").
no code implementations • 30 Sep 2022 • Nihal V. Nayak, Ethan R. Elenberg, Clemens Rosenbaum
We adapt existing approaches from the few-sample model evaluation literature to actively sub-sample, with a learned surrogate model, the most informative data points for annotation to estimate the evaluation metric.
1 code implementation • 7 Apr 2022 • Nihal V. Nayak, Peilin Yu, Stephen H. Bach
We perform additional experiments to show that CSP improves generalization to higher-order attribute-attribute-object compositions (e.g., old white cat) and combinations of pretrained attributes and fine-tuned objects.
1 code implementation • ACL 2022 • Stephen H. Bach, Victor Sanh, Zheng-Xin Yong, Albert Webson, Colin Raffel, Nihal V. Nayak, Abheesht Sharma, Taewoon Kim, M Saiful Bari, Thibault Fevry, Zaid Alyafeai, Manan Dey, Andrea Santilli, Zhiqing Sun, Srulik Ben-David, Canwen Xu, Gunjan Chhablani, Han Wang, Jason Alan Fries, Maged S. Al-shaibani, Shanya Sharma, Urmish Thakker, Khalid Almubarak, Xiangru Tang, Dragomir Radev, Mike Tian-Jian Jiang, Alexander M. Rush
PromptSource is a system for creating, sharing, and using natural language prompts.
2 code implementations • 8 Nov 2021 • Wasu Piriyakulkij, Cristina Menghini, Ross Briden, Nihal V. Nayak, Jeffrey Zhu, Elaheh Raisi, Stephen H. Bach
Machine learning practitioners often have access to a spectrum of data: labeled data for the target task (which is often limited), unlabeled data, and auxiliary data (the many available labeled datasets for other tasks).
3 code implementations • 18 Jun 2020 • Nihal V. Nayak, Stephen H. Bach
Zero-shot learning relies on semantic class representations such as hand-engineered attributes or learned embeddings to predict classes without any labeled examples.
Ranked #1 on Generalized Zero-Shot Learning on OntoNotes
no code implementations • RANLP 2019 • Anush Kumar, Nihal V. Nayak, Aditya Chandra, Mydhili K. Nair
Machine Translation systems have drastically improved over the years for several language pairs.
2 code implementations • WS 2018 • Nihal V. Nayak, Arjun R. Rao
Our system uses a logistic regression model to predict the likelihood of a student making a mistake while answering an exercise on Duolingo in all three language tracks: English/Spanish (en/es), Spanish/English (es/en), and French/English (fr/en).
Ranked #1 on Language Acquisition on SLAM 2018