1 code implementation • 17 Nov 2023 • Hamish Ivison, Yizhong Wang, Valentina Pyatkin, Nathan Lambert, Matthew Peters, Pradeep Dasigi, Joel Jang, David Wadden, Noah A. Smith, Iz Beltagy, Hannaneh Hajishirzi
Since the release of TÜLU [Wang et al., 2023b], open resources for instruction tuning have developed quickly, from better base models to new finetuning techniques.
Our evaluations show that the best model in any given setting reaches on average 87% of ChatGPT performance and 73% of GPT-4 performance, suggesting that further investment in better base models and instruction-tuning data is needed to close the gap.
Diffusion models have emerged as a powerful paradigm for generation, achieving strong performance across domains with continuous-valued inputs.
By converting instructions into modules, HINT models keep compute cost largely independent of the length of instructions and few-shot example inputs.
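As a rough sketch of this idea (not the paper's actual code: the class name, shapes, and single bottleneck adapter are illustrative assumptions), a hypernetwork can map a pooled instruction encoding to adapter weights once, after which each task input is processed without re-reading the instruction:

```python
import torch
import torch.nn as nn

class InstructionHypernetwork(nn.Module):
    """Illustrative hypernetwork: maps a pooled instruction encoding
    to the weights of a small bottleneck adapter."""
    def __init__(self, enc_dim: int, hidden_dim: int, adapter_dim: int):
        super().__init__()
        self.down_head = nn.Linear(enc_dim, adapter_dim * hidden_dim)
        self.up_head = nn.Linear(enc_dim, hidden_dim * adapter_dim)
        self.hidden_dim, self.adapter_dim = hidden_dim, adapter_dim

    def forward(self, instruction_enc: torch.Tensor):
        # instruction_enc: (enc_dim,) pooled encoding of the instruction text.
        down = self.down_head(instruction_enc).view(self.adapter_dim, self.hidden_dim)
        up = self.up_head(instruction_enc).view(self.hidden_dim, self.adapter_dim)
        return down, up

hyper = InstructionHypernetwork(enc_dim=768, hidden_dim=768, adapter_dim=16)
instr_enc = torch.randn(768)       # stand-in for an encoded instruction
down, up = hyper(instr_enc)        # computed once per task, not per example
h = torch.randn(4, 768)            # hidden states for a batch of task inputs
adapted = h + (h @ down.T) @ up.T  # adapter applied without re-reading the instruction
```

The point of the sketch is the amortization: the instruction is encoded a single time into module parameters, so per-example compute no longer scales with instruction or few-shot prompt length.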
We investigate input-conditioned hypernetworks for multi-task NLP: a hypernetwork conditioned on the encoder's output generates parameter-efficient adaptations for the decoder.
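A minimal sketch of this setup, under assumed simplifications (mean-pooling the encoder states and a single bottleneck adapter are illustrative choices, not the paper's exact design):

```python
import torch
import torch.nn as nn

class HyperDecoderAdapter(nn.Module):
    """Illustrative module: generates a per-input decoder adapter
    from pooled encoder states."""
    def __init__(self, d_model: int, bottleneck: int):
        super().__init__()
        self.to_down = nn.Linear(d_model, d_model * bottleneck)
        self.to_up = nn.Linear(d_model, bottleneck * d_model)
        self.d_model, self.bottleneck = d_model, bottleneck

    def forward(self, encoder_states: torch.Tensor, decoder_hidden: torch.Tensor):
        # encoder_states: (batch, src_len, d_model); decoder_hidden: (batch, tgt_len, d_model)
        pooled = encoder_states.mean(dim=1)  # simplification: mean-pool the encoder output
        down = self.to_down(pooled).view(-1, self.d_model, self.bottleneck)
        up = self.to_up(pooled).view(-1, self.bottleneck, self.d_model)
        # Each input in the batch gets its own adapter parameters.
        return decoder_hidden + torch.bmm(torch.bmm(decoder_hidden, down), up)

adapter = HyperDecoderAdapter(d_model=512, bottleneck=8)
enc = torch.randn(2, 20, 512)   # encoder output for two inputs
dec = torch.randn(2, 7, 512)    # decoder hidden states
out = adapter(enc, dec)         # (2, 7, 512), adapted per input
```

Unlike the task-level sketch above, here the adapter weights are regenerated for every input, conditioning the decoder's behavior on what the encoder saw.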
As deep learning has spread across fields over the past decade, concerns about the opaqueness of these black-box models have grown, prompting an increased focus on transparency in deep learning models.