1 code implementation • ACL 2020 • Gantavya Bhatt, Hritik Bansal, Rishubh Singh, Sumeet Agarwal
Long short-term memory (LSTM) networks and their variants are capable of encapsulating long-range dependencies, which is evident from their performance on a variety of linguistic tasks.
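As background (not specific to this paper), the long-range memory of an LSTM comes from its gated, additively updated cell state. The standard update equations are:

```latex
\begin{aligned}
f_t &= \sigma\!\left(W_f [h_{t-1}, x_t] + b_f\right) && \text{(forget gate)}\\
i_t &= \sigma\!\left(W_i [h_{t-1}, x_t] + b_i\right) && \text{(input gate)}\\
\tilde{c}_t &= \tanh\!\left(W_c [h_{t-1}, x_t] + b_c\right) && \text{(candidate cell)}\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state)}\\
o_t &= \sigma\!\left(W_o [h_{t-1}, x_t] + b_o\right) && \text{(output gate)}\\
h_t &= o_t \odot \tanh(c_t) && \text{(hidden state)}
\end{aligned}
```

The additive form of the cell-state update $c_t$ is what lets gradients flow across long spans, which is the property the linguistic tasks above probe.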
Ranked #35 on Language Modelling on WikiText-103 (Validation perplexity metric)
1 code implementation • SCiL 2021 • Hritik Bansal, Gantavya Bhatt, Sumeet Agarwal
However, we observe that several RNN types, including the ONLSTM which has a soft structural inductive bias, surprisingly fail to perform well on sentences without attractors when trained solely on sentences with attractors.
1 code implementation • 10 Feb 2021 • Hritik Bansal, Gantavya Bhatt, Pankaj Malhotra, Prathosh A. P
Systematic generalization aims to evaluate reasoning about novel combinations from known components, an intrinsic property of human cognition.
4 code implementations • 26 May 2022 • Aditya Kusupati, Gantavya Bhatt, Aniket Rege, Matthew Wallingford, Aditya Sinha, Vivek Ramanujan, William Howard-Snyder, KaiFeng Chen, Sham Kakade, Prateek Jain, Ali Farhadi
The flexibility within the learned Matryoshka Representations offers: (a) up to 14x smaller embedding size for ImageNet-1K classification at the same level of accuracy; (b) up to 14x real-world speed-ups for large-scale retrieval on ImageNet-1K and 4K; and (c) up to 2% accuracy improvements for long-tail few-shot classification, all while being as robust as the original representations.
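The core idea behind the smaller embedding sizes is that a Matryoshka-style embedding packs the most useful information into its leading coordinates, so a short prefix can be re-normalized and used directly. A minimal sketch with toy hand-made vectors (the numbers are illustrative, not from the paper):

```python
import math

def truncate(embedding, m):
    """Keep the first m coordinates of a Matryoshka-style embedding
    and re-normalize the prefix to unit length."""
    prefix = embedding[:m]
    norm = math.sqrt(sum(x * x for x in prefix)) or 1.0
    return [x / norm for x in prefix]

def cosine(u, v):
    """Cosine similarity of two unit-normalized vectors."""
    return sum(a * b for a, b in zip(u, v))

# Toy 8-d "full" embeddings whose informative mass sits in the
# leading coordinates, mimicking the nested structure MRL trains for.
a = [0.9, 0.3, 0.2, 0.1, 0.05, 0.02, 0.01, 0.01]
b = [0.8, 0.4, 0.1, 0.2, 0.03, 0.01, 0.02, 0.01]

full = cosine(truncate(a, 8), truncate(b, 8))   # full 8-d similarity
short = cosine(truncate(a, 2), truncate(b, 2))  # 4x smaller embedding
```

Here the 2-d prefix preserves most of the similarity structure of the full 8-d vectors, which is what enables the retrieval speed-ups at a matched accuracy level.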
Ranked #25 on Image Classification on ObjectNet (using extra training data)
no code implementations • 10 May 2023 • Arnav Das, Gantavya Bhatt, Megh Bhalerao, Vianne Gao, Rui Yang, Jeff Bilmes
A major problem with Active Learning (AL) is high training costs since models are typically retrained from scratch after every query round.
1 code implementation • 16 Jun 2023 • Jifan Zhang, Yifang Chen, Gregory Canal, Stephen Mussmann, Arnav M. Das, Gantavya Bhatt, Yinglun Zhu, Jeffrey Bilmes, Simon Shaolei Du, Kevin Jamieson, Robert D. Nowak
Labeled data are critical to modern machine learning applications, but obtaining labels can be expensive.
no code implementations • 25 Nov 2023 • Sahil Verma, Gantavya Bhatt, Avi Schwarzschild, Soumye Singhal, Arnav Mohanty Das, Chirag Shah, John P Dickerson, Jeff Bilmes
In this work, we demonstrate that the efficacy of CleanCLIP in mitigating backdoors is highly dependent on the particular objective used during model pre-training.
no code implementations • 12 Jan 2024 • Gantavya Bhatt, Yifang Chen, Arnav M. Das, Jifan Zhang, Sang T. Truong, Stephen Mussmann, Yinglun Zhu, Jeffrey Bilmes, Simon S. Du, Kevin Jamieson, Jordan T. Ash, Robert D. Nowak
To mitigate the annotation cost of SFT and circumvent the computational bottlenecks of active learning, we propose using experimental design.
no code implementations • 13 Mar 2024 • Gantavya Bhatt, Arnav Das, Jeff Bilmes
In this paper, we introduce deep submodular peripteral networks (DSPNs), a novel parametric family of submodular functions, and methods for their training using a contrastive-learning-inspired, GPC-ready strategy to connect and then tackle both of the above challenges.
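For context on the function class DSPNs parameterize: a classic example of a monotone submodular function is set coverage, and such functions can be approximately maximized with the simple greedy algorithm, which enjoys a (1 − 1/e) approximation guarantee. A minimal sketch (illustrative background, not the DSPN training procedure itself):

```python
def coverage(ground_sets, subset):
    """Submodular coverage objective: number of distinct elements
    covered by the union of the chosen sets."""
    covered = set()
    for i in subset:
        covered |= ground_sets[i]
    return len(covered)

def greedy_max(ground_sets, k):
    """Pick k sets greedily by marginal gain; for monotone submodular
    objectives this is a (1 - 1/e)-approximation to the optimum."""
    chosen = []
    for _ in range(k):
        best = max((i for i in range(len(ground_sets)) if i not in chosen),
                   key=lambda i: coverage(ground_sets, chosen + [i]))
        chosen.append(best)
    return chosen

sets = [{1, 2, 3}, {3, 4}, {4, 5, 6}, {1, 6}]
picked = greedy_max(sets, 2)  # greedily covers all six elements
```

The diminishing-returns property visible here (each added set helps less as coverage grows) is the structural bias that makes submodular functions useful for data-subset selection.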
1 code implementation • 31 Mar 2024 • Hritik Bansal, Ashima Suvarna, Gantavya Bhatt, Nanyun Peng, Kai-Wei Chang, Aditya Grover
A common technique for aligning large language models (LLMs) relies on acquiring human preferences by comparing multiple generations conditioned on a fixed context.
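Pairwise preferences over generations from a fixed context are commonly modeled with the Bradley-Terry model, in which the probability of preferring one response over another is a logistic function of the gap in their scalar rewards. A minimal sketch with hypothetical reward-model scores (the values are made up for illustration):

```python
import math

def bradley_terry(r_a, r_b):
    """Probability that response A is preferred over response B,
    given scalar rewards, under the Bradley-Terry model."""
    return 1.0 / (1.0 + math.exp(-(r_a - r_b)))

# Hypothetical reward-model scores for two generations sampled
# from the same prompt (i.e., a fixed context).
p = bradley_terry(1.2, 0.4)  # preference probability for the higher-scored response
```

Equal rewards yield a 50/50 preference, and the probability moves toward 1 as the reward gap grows, which is what makes pairwise human comparisons usable as a training signal.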