Search Results for author: Satwik Bhattamishra

Found 12 papers, 8 papers with code

MAGNIFICo: Evaluating the In-Context Learning Ability of Large Language Models to Generalize to Novel Interpretations

1 code implementation 18 Oct 2023 Arkil Patel, Satwik Bhattamishra, Siva Reddy, Dzmitry Bahdanau

Additionally, our analysis uncovers the semantic predispositions in LLMs and reveals the impact of recency bias when information is presented in long contexts.

In-Context Learning, Semantic Parsing, +1

Understanding In-Context Learning in Transformers and LLMs by Learning to Learn Discrete Functions

no code implementations 4 Oct 2023 Satwik Bhattamishra, Arkil Patel, Phil Blunsom, Varun Kanade

In this work, we take a step towards answering these questions by demonstrating the following: (a) On a test-bed with a variety of Boolean function classes, we find that Transformers can nearly match the optimal learning algorithm for 'simpler' tasks, while their performance deteriorates on more 'complex' tasks.

In-Context Learning
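
As a rough illustration of the kind of discrete-function test-bed described above (the function class, prompt format, and helper names here are assumptions for exposition, not the paper's exact setup), a random conjunction over a few input bits can be sampled and serialized into a prompt of labelled in-context examples:

```python
import random

def sample_conjunction(n_bits: int, k: int):
    """Sample a random conjunction over k of n_bits variables (possibly negated)."""
    idx = random.sample(range(n_bits), k)
    signs = [random.choice([0, 1]) for _ in idx]  # 1 means the literal is negated
    def f(x):
        return int(all(x[i] != s for i, s in zip(idx, signs)))
    return f

def make_prompt(f, n_bits: int, n_examples: int):
    """Serialize labelled examples of f into a text prompt for in-context learning."""
    lines = []
    for _ in range(n_examples):
        x = [random.randint(0, 1) for _ in range(n_bits)]
        lines.append(f"input: {' '.join(map(str, x))} label: {f(x)}")
    query = [random.randint(0, 1) for _ in range(n_bits)]
    lines.append(f"input: {' '.join(map(str, query))} label:")
    return "\n".join(lines), f(query)

random.seed(0)
f = sample_conjunction(n_bits=8, k=3)
prompt, target = make_prompt(f, n_bits=8, n_examples=16)
print(prompt)
print("# expected label:", target)
```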

Structural Transfer Learning in NL-to-Bash Semantic Parsers

no code implementations 31 Jul 2023 Kyle Duffy, Satwik Bhattamishra, Phil Blunsom

Large-scale pre-training has driven progress in many fields of natural language processing, though little is understood about the design of pre-training datasets.

Machine Translation, Semantic Parsing, +2

DynaQuant: Compressing Deep Learning Training Checkpoints via Dynamic Quantization

no code implementations 20 Jun 2023 Amey Agrawal, Sameer Reddy, Satwik Bhattamishra, Venkata Prabhakara Sarath Nookala, Vidushi Vashishth, Kexin Rong, Alexey Tumanov

With the increase in the scale of Deep Learning (DL) training workloads in terms of compute resources and time consumption, the likelihood of encountering in-training failures rises substantially, leading to lost work and resource wastage.

Model Compression, Quantization, +1
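
DynaQuant's scheme adapts precision dynamically across parameters; as a much simpler sketch of the underlying idea only (uniform per-tensor int8 quantization, not the paper's actual method), a PyTorch checkpoint can be compressed and approximately restored as follows:

```python
import torch

def quantize_checkpoint(state_dict, num_bits: int = 8):
    """Uniformly quantize each float tensor in a checkpoint to `num_bits` integers,
    storing a per-tensor scale so the weights can be approximately reconstructed."""
    qmax = 2 ** (num_bits - 1) - 1
    compressed = {}
    for name, tensor in state_dict.items():
        if not tensor.is_floating_point():
            compressed[name] = (tensor, None)  # leave non-float buffers untouched
            continue
        scale = tensor.abs().max().clamp(min=1e-12) / qmax
        q = torch.round(tensor / scale).clamp(-qmax - 1, qmax).to(torch.int8)
        compressed[name] = (q, scale)
    return compressed

def dequantize_checkpoint(compressed):
    """Reconstruct an approximate float checkpoint from the quantized tensors."""
    return {name: (q if scale is None else q.float() * scale)
            for name, (q, scale) in compressed.items()}

# Example: round-trip a small model's checkpoint and measure the reconstruction error.
model = torch.nn.Linear(16, 4)
compressed = quantize_checkpoint(model.state_dict())
restored = dequantize_checkpoint(compressed)
err = max((model.state_dict()[k] - restored[k]).abs().max().item() for k in restored)
print(f"max absolute reconstruction error: {err:.6f}")
```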

Simplicity Bias in Transformers and their Ability to Learn Sparse Boolean Functions

1 code implementation 22 Nov 2022 Satwik Bhattamishra, Arkil Patel, Varun Kanade, Phil Blunsom

(ii) When trained on Boolean functions, both Transformers and LSTMs prioritize learning functions of low sensitivity, with Transformers ultimately converging to functions of lower sensitivity.
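
Sensitivity here is the standard Boolean-function measure: for an input, the number of single-bit flips that change the function's output, averaged over all inputs. A brute-force computation (illustrative only; it enumerates all 2^n inputs, so it is practical only for small n) makes the quantity concrete:

```python
from itertools import product

def average_sensitivity(f, n_bits: int) -> float:
    """Average, over all 2^n inputs, of the number of coordinates whose flip changes f."""
    total = 0
    for x in product([0, 1], repeat=n_bits):
        x = list(x)
        for i in range(n_bits):
            flipped = x.copy()
            flipped[i] ^= 1
            if f(x) != f(flipped):
                total += 1
    return total / 2 ** n_bits

# Parity changes under every bit flip; a dictator function depends on a single bit.
parity = lambda x: sum(x) % 2
dictator = lambda x: x[0]
print(average_sensitivity(parity, 6))    # 6.0 -> high sensitivity
print(average_sensitivity(dictator, 6))  # 1.0 -> low sensitivity
```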

Revisiting the Compositional Generalization Abilities of Neural Sequence Models

1 code implementation ACL 2022 Arkil Patel, Satwik Bhattamishra, Phil Blunsom, Navin Goyal

Compositional generalization is a fundamental trait in humans, allowing us to effortlessly combine known phrases to form novel sentences.

Are NLP Models really able to Solve Simple Math Word Problems?

3 code implementations NAACL 2021 Arkil Patel, Satwik Bhattamishra, Navin Goyal

Since existing solvers achieve high performance on benchmark datasets for elementary-level MWPs containing one-unknown arithmetic word problems, such problems are often considered "solved", with the bulk of research attention moving to more complex MWPs.

Math, Math Word Problem Solving, +1

On the Practical Ability of Recurrent Neural Networks to Recognize Hierarchical Languages

1 code implementation COLING 2020 Satwik Bhattamishra, Kabir Ahuja, Navin Goyal

We find that while recurrent models generalize nearly perfectly if the lengths of the training and test strings are from the same range, they perform poorly if the test strings are longer.
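
The hierarchical languages studied are Dyck-style bracket languages. As a sketch of the kind of length-split evaluation described (the generator and the particular length ranges are illustrative assumptions, not the paper's exact setup), well-nested Dyck-1 strings can be sampled and partitioned into shorter training lengths and longer test lengths:

```python
import random

def sample_dyck1(n_pairs: int) -> str:
    """Sample a random well-nested Dyck-1 string with n_pairs bracket pairs."""
    s, opens_left, depth = [], n_pairs, 0
    while opens_left > 0 or depth > 0:
        # Must open if nothing is currently open; must close if no opens remain.
        if depth == 0 or (opens_left > 0 and random.random() < 0.5):
            s.append("(")
            opens_left -= 1
            depth += 1
        else:
            s.append(")")
            depth -= 1
    return "".join(s)

random.seed(0)
train = [sample_dyck1(random.randint(1, 25)) for _ in range(1000)]  # lengths 2..50
test = [sample_dyck1(random.randint(26, 50)) for _ in range(200)]   # lengths 52..100
print(max(len(s) for s in train), min(len(s) for s in test))
```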

On the Ability and Limitations of Transformers to Recognize Formal Languages

1 code implementation EMNLP 2020 Satwik Bhattamishra, Kabir Ahuja, Navin Goyal

Our analysis also provides insights into the role of the self-attention mechanism in modeling certain behaviors and the influence of positional encoding schemes on the learning and generalization abilities of the model.
