Search Results for author: Hamish Ivison

Found 12 papers, 9 papers with code

Personalizing Reinforcement Learning from Human Feedback with Variational Preference Learning

no code implementations • 19 Aug 2024 • Sriyash Poddar, Yanming Wan, Hamish Ivison, Abhishek Gupta, Natasha Jaques

Reinforcement Learning from Human Feedback (RLHF) is a powerful paradigm for aligning foundation models to human values and preferences.

Reinforcement Learning
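The snippet only states the RLHF framing, but the title points at the mechanism: a variational model of user-specific preferences. Below is a minimal PyTorch sketch of that idea — a reward model conditioned on a latent user variable inferred from a user's comparisons, trained with a Bradley-Terry likelihood plus a KL term. All names and dimensions here are hypothetical, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentUserRewardModel(nn.Module):
    """Sketch: reward model conditioned on a latent user variable z.

    An encoder maps a user's annotated comparisons to q(z | user);
    the reward head scores a response embedding given a sample of z.
    Hypothetical shapes and names; not the authors' implementation.
    """

    def __init__(self, resp_dim: int, ctx_dim: int, z_dim: int = 16):
        super().__init__()
        self.ctx_encoder = nn.Linear(ctx_dim, 2 * z_dim)  # -> (mu, log_var)
        self.reward_head = nn.Sequential(
            nn.Linear(resp_dim + z_dim, 128), nn.ReLU(), nn.Linear(128, 1)
        )

    def forward(self, ctx, chosen, rejected):
        mu, log_var = self.ctx_encoder(ctx).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * log_var).exp()  # reparameterize
        r_c = self.reward_head(torch.cat([chosen, z], dim=-1))
        r_r = self.reward_head(torch.cat([rejected, z], dim=-1))
        # Bradley-Terry preference likelihood + KL to a standard normal prior
        nll = -F.logsigmoid(r_c - r_r).mean()
        kl = -0.5 * (1 + log_var - mu.pow(2) - log_var.exp()).sum(-1).mean()
        return nll + 0.1 * kl

model = LatentUserRewardModel(resp_dim=768, ctx_dim=768)
loss = model(torch.randn(4, 768), torch.randn(4, 768), torch.randn(4, 768))
```

Conditioning on z is what makes the reward model personalizable: different users induce different posteriors, hence different reward functions.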

Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2

3 code implementations • 17 Nov 2023 • Hamish Ivison, Yizhong Wang, Valentina Pyatkin, Nathan Lambert, Matthew Peters, Pradeep Dasigi, Joel Jang, David Wadden, Noah A. Smith, Iz Beltagy, Hannaneh Hajishirzi

Since the release of TÜLU [Wang et al., 2023b], open resources for instruction tuning have developed quickly, from better base models to new finetuning techniques.

How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources

4 code implementations • NeurIPS 2023 • Yizhong Wang, Hamish Ivison, Pradeep Dasigi, Jack Hessel, Tushar Khot, Khyathi Raghavi Chandu, David Wadden, Kelsey MacMillan, Noah A. Smith, Iz Beltagy, Hannaneh Hajishirzi

Our evaluations show that the best model in any given evaluation reaches on average 87% of ChatGPT performance, and 73% of GPT-4 performance, suggesting that further investment in building better base models and instruction-tuning data is required to close the gap.

Instruction Following

HINT: Hypernetwork Instruction Tuning for Efficient Zero- & Few-Shot Generalisation

no code implementations • 20 Dec 2022 • Hamish Ivison, Akshita Bhagia, Yizhong Wang, Hannaneh Hajishirzi, Matthew Peters

By converting instructions into modules, HINT models can effectively disregard the length of instructions and few-shot example inputs in terms of compute usage.

In-Context Learning parameter-efficient fine-tuning
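The compute claim follows from a simple decoupling: a hypernetwork encodes the instruction once into a small cached module (here, a prefix), so per-example cost depends on the module size rather than the instruction length. A rough PyTorch sketch under that reading, with made-up module names and dimensions:

```python
import torch
import torch.nn as nn

class InstructionHypernet(nn.Module):
    """Sketch: convert an instruction embedding into a cached prefix module."""

    def __init__(self, instr_dim: int, n_prefix: int, model_dim: int):
        super().__init__()
        self.n_prefix, self.model_dim = n_prefix, model_dim
        self.to_prefix = nn.Linear(instr_dim, n_prefix * model_dim)

    def forward(self, instr_emb: torch.Tensor) -> torch.Tensor:
        # One pass over the instruction, however long it was; the result
        # is reused for every input processed under that instruction.
        return self.to_prefix(instr_emb).view(-1, self.n_prefix, self.model_dim)

# usage: encode the instruction once, cache the prefix, then run many inputs
hypernet = InstructionHypernet(instr_dim=768, n_prefix=8, model_dim=512)
instr_emb = torch.randn(1, 768)      # pooled instruction encoding
prefix = hypernet(instr_emb)         # computed once, cached
for x in torch.randn(100, 5, 512):   # each input pays for 8 prefix slots,
    h = torch.cat([prefix.squeeze(0), x], dim=0)  # not for |instruction| tokens
```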

Data-Efficient Finetuning Using Cross-Task Nearest Neighbors

1 code implementation • 1 Dec 2022 • Hamish Ivison, Noah A. Smith, Hannaneh Hajishirzi, Pradeep Dasigi

Obtaining labeled data to train a model for a task of interest is often expensive.
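Per the title, the method selects finetuning data from a large multitask pool by nearest-neighbor retrieval against unlabeled target-task examples. A minimal numpy sketch of such retrieval, assuming embeddings already computed by some sentence encoder; the max-over-queries aggregation is my simplification, not necessarily the paper's:

```python
import numpy as np

def cross_task_neighbors(query_emb, pool_emb, k=500):
    """Return indices of the k pool examples closest to any query example.

    query_emb: (n_query, d) embeddings of unlabeled target-task inputs.
    pool_emb:  (n_pool, d) embeddings of labeled multitask training data.
    """
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    p = pool_emb / np.linalg.norm(pool_emb, axis=1, keepdims=True)
    sims = q @ p.T            # (n_query, n_pool) cosine similarity
    best = sims.max(axis=0)   # each pool example's best match to any query
    return np.argsort(-best)[:k]

# usage: finetune on pool[idx] instead of the full multitask mixture
idx = cross_task_neighbors(np.random.randn(32, 768), np.random.randn(10000, 768))
```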

Hyperdecoders: Instance-specific decoders for multi-task NLP

1 code implementation • 15 Mar 2022 • Hamish Ivison, Matthew E. Peters

We investigate input-conditioned hypernetworks for multi-tasking in NLP, generating parameter-efficient adaptations for a decoder using a hypernetwork conditioned on the output of an encoder.

Decoder parameter-efficient fine-tuning
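The abstract describes the architecture directly: a hypernetwork maps the encoder's output to per-input adapter weights for the decoder. A PyTorch sketch of that conditioning, with hypothetical names and dimensions:

```python
import torch
import torch.nn as nn

class HyperAdapter(nn.Module):
    """Sketch: a hypernetwork emits per-input adapter weights for the decoder."""

    def __init__(self, enc_dim: int, model_dim: int, bottleneck: int = 32):
        super().__init__()
        self.model_dim, self.bottleneck = model_dim, bottleneck
        # hypernetwork: pooled encoder state -> flattened adapter weights
        self.hyper = nn.Linear(enc_dim, 2 * model_dim * bottleneck)

    def forward(self, enc_pooled: torch.Tensor, dec_hidden: torch.Tensor):
        w = self.hyper(enc_pooled)                      # (B, 2 * d * b)
        w_down, w_up = w.chunk(2, dim=-1)
        w_down = w_down.view(-1, self.model_dim, self.bottleneck)
        w_up = w_up.view(-1, self.bottleneck, self.model_dim)
        # bottleneck adapter applied at every decoder position;
        # the weights themselves differ per input (instance-specific)
        return dec_hidden + torch.relu(dec_hidden @ w_down) @ w_up

adapter = HyperAdapter(enc_dim=768, model_dim=512)
out = adapter(torch.randn(4, 768), torch.randn(4, 10, 512))  # (4, 10, 512)
```

Generating the adapter from the encoder output is what makes the adaptation instance-specific rather than task-specific.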

Local Interpretations for Explainable Natural Language Processing: A Survey

no code implementations • 20 Mar 2021 • Siwen Luo, Hamish Ivison, Caren Han, Josiah Poon

As the use of deep learning techniques has grown across various fields over the past decade, concerns about the opaqueness of black-box models have mounted, resulting in an increased focus on transparency in deep learning models.

Deep Learning Machine Translation +3
