no code implementations • 14 Mar 2024 • Matthew Finlayson, Xiang Ren, Swabha Swayamdipta
The commercialization of large language models (LLMs) has led to the common practice of high-level API-only access to proprietary models.
1 code implementation • 2 Oct 2023 • Matthew Finlayson, John Hewitt, Alexander Koller, Swabha Swayamdipta, Ashish Sabharwal
We provide a theoretical explanation for the effectiveness of truncation sampling by proving that truncation methods which discard tokens below some probability threshold (the most common type of truncation) can guarantee that all sampled tokens have nonzero true probability.
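The threshold-based truncation described above can be sketched in a few lines: tokens whose model probability falls below a cutoff are discarded, the remaining mass is renormalized, and a token is sampled from the survivors. This is an illustrative sketch of the general technique, not the paper's implementation; the function name and default threshold are my own choices.

```python
import random

def truncate_and_sample(probs, threshold=0.1, rng=None):
    """Threshold truncation sampling (illustrative sketch):
    drop tokens with probability below `threshold`, renormalize
    the survivors, and sample a token index from them."""
    rng = rng or random.Random()
    # Keep only (index, probability) pairs at or above the cutoff.
    kept = [(i, p) for i, p in enumerate(probs) if p >= threshold]
    total = sum(p for _, p in kept)  # renormalization constant
    # Inverse-CDF sampling over the renormalized distribution.
    r = rng.random() * total
    acc = 0.0
    for i, p in kept:
        acc += p
        if acc >= r:
            return i
    return kept[-1][0]  # guard against floating-point round-off
```

Because low-probability tokens are removed before sampling, a token the model assigns (near-)zero probability can never be drawn, which is the property the paper's analysis formalizes.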
1 code implementation • 24 May 2023 • Sarah Wiegreffe, Matthew Finlayson, Oyvind Tafjord, Peter Clark, Ashish Sabharwal
For example, both normalization and prompting methods for reducing surface form competition (SFC) can be ineffective or even detrimental to task performance for some LMs.
1 code implementation • 31 Oct 2022 • Swaroop Mishra, Matthew Finlayson, Pan Lu, Leonard Tang, Sean Welleck, Chitta Baral, Tanmay Rajpurohit, Oyvind Tafjord, Ashish Sabharwal, Peter Clark, Ashwin Kalyan
Mathematical reasoning skills are essential for general-purpose intelligent systems to perform tasks from grocery shopping to climate modeling.
Ranked #1 on Mathematical Reasoning on Lila (OOD)
1 code implementation • 5 Oct 2022 • Tushar Khot, Harsh Trivedi, Matthew Finlayson, Yao Fu, Kyle Richardson, Peter Clark, Ashish Sabharwal
On symbolic reasoning tasks, we can further decompose sub-tasks that are hard for LLMs into even simpler solvable sub-tasks.
1 code implementation • 19 Apr 2022 • Matthew Finlayson, Kyle Richardson, Ashish Sabharwal, Peter Clark
We propose Hard RegSet as a challenging instruction-learning task and a controlled environment for studying instruction learning.
1 code implementation • ACL 2021 • Matthew Finlayson, Aaron Mueller, Sebastian Gehrmann, Stuart Shieber, Tal Linzen, Yonatan Belinkov
Targeted syntactic evaluations have demonstrated the ability of language models to perform subject-verb agreement given difficult contexts.