1 code implementation • 24 Apr 2024 • Nicholas Meade, Arkil Patel, Siva Reddy
On the other hand, while AFT models may appear safe on the surface, exhibiting refusals to a range of unsafe instructions, we show that they are highly susceptible to adversarial triggers.
1 code implementation • 16 Nov 2023 • Arkil Patel, Siva Reddy, Dzmitry Bahdanau, Pradeep Dasigi
Contemporary Large Language Models (LLMs) exhibit a high degree of capability in code generation and comprehension.
1 code implementation • 18 Oct 2023 • Arkil Patel, Satwik Bhattamishra, Siva Reddy, Dzmitry Bahdanau
Additionally, our analysis uncovers the semantic predispositions in LLMs and reveals the impact of recency bias for information presented in long contexts.
no code implementations • 4 Oct 2023 • Satwik Bhattamishra, Arkil Patel, Phil Blunsom, Varun Kanade
In this work, we take a step towards answering these questions by demonstrating the following: (a) On a test-bed with a variety of Boolean function classes, we find that Transformers can nearly match the optimal learning algorithm for 'simpler' tasks, while their performance deteriorates on more 'complex' tasks.
1 code implementation • 22 Nov 2022 • Satwik Bhattamishra, Arkil Patel, Varun Kanade, Phil Blunsom
(ii) When trained on Boolean functions, both Transformers and LSTMs prioritize learning functions of low sensitivity, with Transformers ultimately converging to functions of lower sensitivity.
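The notion of sensitivity referenced above has a simple formal reading: the average sensitivity of a Boolean function counts, averaged over all inputs, how many single-bit flips change its output. A minimal sketch (the `parity` and `dictator` test functions are standard illustrative examples, not taken from the paper):

```python
from itertools import product

def avg_sensitivity(f, n):
    """Average sensitivity of a Boolean function f: {0,1}^n -> {0,1}.

    For each of the 2^n inputs, count how many of the n single-bit
    flips change f's output, then average over inputs.
    """
    total = 0
    for x in product((0, 1), repeat=n):
        for i in range(n):
            y = list(x)
            y[i] ^= 1  # flip bit i
            if f(x) != f(tuple(y)):
                total += 1
    return total / 2 ** n

# Parity is maximally sensitive: every bit flip changes the output.
parity = lambda x: sum(x) % 2
# A dictator function depends on a single coordinate only.
dictator = lambda x: x[0]

print(avg_sensitivity(parity, 4))    # 4.0
print(avg_sensitivity(dictator, 4))  # 1.0
```

Under this measure, "low sensitivity" functions are those, like the dictator, whose output is robust to most single-bit perturbations.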
1 code implementation • 23 Oct 2022 • Ankur Sikarwar, Arkil Patel, Navin Goyal
On analyzing the task, we find that identifying the target location in the grid world is the main challenge for the models.
1 code implementation • ACL 2022 • Arkil Patel, Satwik Bhattamishra, Phil Blunsom, Navin Goyal
Compositional generalization is a fundamental trait in humans, allowing us to effortlessly combine known phrases to form novel sentences.
3 code implementations • NAACL 2021 • Arkil Patel, Satwik Bhattamishra, Navin Goyal
Since existing solvers achieve high performance on benchmark datasets of elementary-level MWPs containing one-unknown arithmetic word problems, such problems are often considered "solved," and the bulk of research attention has moved to more complex MWPs.
Ranked #1 on Math Word Problem Solving on MAWPS
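To make "one-unknown arithmetic word problem" concrete: such a problem reduces to mapping the text to a single arithmetic expression over its stated quantities. A minimal illustration (the example problem is invented for illustration, not drawn from MAWPS or any benchmark):

```python
# Hypothetical one-unknown arithmetic word problem:
problem = ("Jack had 8 pens. He bought 3 more packs with 4 pens each. "
           "How many pens does Jack have now?")

# A solver's task is to produce the expression over the quantities
# mentioned in the text; here the correct mapping is 8 + 3 * 4.
expression = "8 + 3 * 4"
answer = eval(expression)
print(answer)  # 20
```

The difficulty lies entirely in the text-to-expression mapping; once the expression is recovered, evaluation is trivial, which is why high benchmark scores alone can overstate solver robustness.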
1 code implementation • CoNLL 2020 • Satwik Bhattamishra, Arkil Patel, Navin Goyal
Transformers are being used extensively across several sequence modeling tasks.