1 code implementation • ACL 2020 • Gantavya Bhatt, Hritik Bansal, Rishubh Singh, Sumeet Agarwal
Long short-term memory (LSTM) networks and their variants are capable of encapsulating long-range dependencies, which is evident from their performance on a variety of linguistic tasks.
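As background (not specific to this paper), the long-range memory of an LSTM comes from its gated, additively updated cell state. The standard update equations are:

```latex
\begin{aligned}
f_t &= \sigma\!\left(W_f [h_{t-1}, x_t] + b_f\right) && \text{(forget gate)}\\
i_t &= \sigma\!\left(W_i [h_{t-1}, x_t] + b_i\right) && \text{(input gate)}\\
\tilde{c}_t &= \tanh\!\left(W_c [h_{t-1}, x_t] + b_c\right) && \text{(candidate cell)}\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state)}\\
o_t &= \sigma\!\left(W_o [h_{t-1}, x_t] + b_o\right) && \text{(output gate)}\\
h_t &= o_t \odot \tanh(c_t) && \text{(hidden state)}
\end{aligned}
```

The additive form of the cell-state update $c_t$ is what lets gradients flow across long spans, which is the property the linguistic tasks above probe.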
Ranked #35 on Language Modelling on WikiText-103 (Validation perplexity metric)
1 code implementation • SCiL 2021 • Hritik Bansal, Gantavya Bhatt, Sumeet Agarwal
However, we observe that several RNN types, including the ONLSTM which has a soft structural inductive bias, surprisingly fail to perform well on sentences without attractors when trained solely on sentences with attractors.
1 code implementation • 10 Feb 2021 • Hritik Bansal, Gantavya Bhatt, Pankaj Malhotra, Prathosh A. P
Systematic generalization aims to evaluate reasoning about novel combinations from known components, an intrinsic property of human cognition.
4 code implementations • 26 May 2022 • Aditya Kusupati, Gantavya Bhatt, Aniket Rege, Matthew Wallingford, Aditya Sinha, Vivek Ramanujan, William Howard-Snyder, KaiFeng Chen, Sham Kakade, Prateek Jain, Ali Farhadi
The flexibility within the learned Matryoshka Representations offers: (a) up to 14x smaller embedding size for ImageNet-1K classification at the same level of accuracy; (b) up to 14x real-world speed-ups for large-scale retrieval on ImageNet-1K and 4K; and (c) up to 2% accuracy improvements for long-tail few-shot classification, all while being as robust as the original representations.
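The core idea behind the smaller embedding sizes is that a Matryoshka-style embedding packs the most useful information into its leading coordinates, so a short prefix can be re-normalized and used directly. A minimal sketch with toy hand-made vectors (the numbers are illustrative, not from the paper):

```python
import math

def truncate(embedding, m):
    """Keep the first m coordinates of a Matryoshka-style embedding
    and re-normalize the prefix to unit length."""
    prefix = embedding[:m]
    norm = math.sqrt(sum(x * x for x in prefix)) or 1.0
    return [x / norm for x in prefix]

def cosine(u, v):
    """Cosine similarity of two unit-normalized vectors."""
    return sum(a * b for a, b in zip(u, v))

# Toy 8-d "full" embeddings whose informative mass sits in the
# leading coordinates, mimicking the nested structure MRL trains for.
a = [0.9, 0.3, 0.2, 0.1, 0.05, 0.02, 0.01, 0.01]
b = [0.8, 0.4, 0.1, 0.2, 0.03, 0.01, 0.02, 0.01]

full = cosine(truncate(a, 8), truncate(b, 8))   # full 8-d similarity
short = cosine(truncate(a, 2), truncate(b, 2))  # 4x smaller embedding
```

Here the 2-d prefix preserves most of the similarity structure of the full 8-d vectors, which is what enables the retrieval speed-ups at a matched accuracy level.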
Ranked #25 on Image Classification on ObjectNet (using extra training data)
no code implementations • 10 May 2023 • Arnav Das, Gantavya Bhatt, Megh Bhalerao, Vianne Gao, Rui Yang, Jeff Bilmes
A major problem with Active Learning (AL) is high training costs since models are typically retrained from scratch after every query round.
1 code implementation • 16 Jun 2023 • Jifan Zhang, Yifang Chen, Gregory Canal, Stephen Mussmann, Arnav M. Das, Gantavya Bhatt, Yinglun Zhu, Jeffrey Bilmes, Simon Shaolei Du, Kevin Jamieson, Robert D. Nowak
Labeled data are critical to modern machine learning applications, but obtaining labels can be expensive.
no code implementations • 25 Nov 2023 • Sahil Verma, Gantavya Bhatt, Avi Schwarzschild, Soumye Singhal, Arnav Mohanty Das, Chirag Shah, John P Dickerson, Jeff Bilmes
In this work, we demonstrate that the efficacy of CleanCLIP in mitigating backdoors is highly dependent on the particular objective used during model pre-training.
no code implementations • 12 Jan 2024 • Gantavya Bhatt, Yifang Chen, Arnav M. Das, Jifan Zhang, Sang T. Truong, Stephen Mussmann, Yinglun Zhu, Jeffrey Bilmes, Simon S. Du, Kevin Jamieson, Jordan T. Ash, Robert D. Nowak
To mitigate the annotation cost of SFT and circumvent the computational bottlenecks of active learning, we propose using experimental design.
no code implementations • 13 Mar 2024 • Gantavya Bhatt, Arnav Das, Jeff Bilmes
In this paper, we introduce deep submodular peripteral networks (DSPNs), a novel parametric family of submodular functions, and methods for their training using a contrastive-learning-inspired, GPC-ready strategy to connect and then tackle both of the above challenges.
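For context on the function class DSPNs parameterize: a classic example of a monotone submodular function is set coverage, and such functions can be approximately maximized with the simple greedy algorithm, which enjoys a (1 − 1/e) approximation guarantee. A minimal sketch (illustrative background, not the DSPN training procedure itself):

```python
def coverage(ground_sets, subset):
    """Submodular coverage objective: number of distinct elements
    covered by the union of the chosen sets."""
    covered = set()
    for i in subset:
        covered |= ground_sets[i]
    return len(covered)

def greedy_max(ground_sets, k):
    """Pick k sets greedily by marginal gain; for monotone submodular
    objectives this is a (1 - 1/e)-approximation to the optimum."""
    chosen = []
    for _ in range(k):
        best = max((i for i in range(len(ground_sets)) if i not in chosen),
                   key=lambda i: coverage(ground_sets, chosen + [i]))
        chosen.append(best)
    return chosen

sets = [{1, 2, 3}, {3, 4}, {4, 5, 6}, {1, 6}]
picked = greedy_max(sets, 2)  # greedily covers all six elements
```

The diminishing-returns property visible here (each added set helps less as coverage grows) is the structural bias that makes submodular functions useful for data-subset selection.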
1 code implementation • 31 Mar 2024 • Hritik Bansal, Ashima Suvarna, Gantavya Bhatt, Nanyun Peng, Kai-Wei Chang, Aditya Grover
A common technique for aligning large language models (LLMs) relies on acquiring human preferences by comparing multiple generations conditioned on a fixed context.
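Pairwise preferences over generations from a fixed context are commonly modeled with the Bradley-Terry model, in which the probability of preferring one response over another is a logistic function of the gap in their scalar rewards. A minimal sketch with hypothetical reward-model scores (the values are made up for illustration):

```python
import math

def bradley_terry(r_a, r_b):
    """Probability that response A is preferred over response B,
    given scalar rewards, under the Bradley-Terry model."""
    return 1.0 / (1.0 + math.exp(-(r_a - r_b)))

# Hypothetical reward-model scores for two generations sampled
# from the same prompt (i.e., a fixed context).
p = bradley_terry(1.2, 0.4)  # preference probability for the higher-scored response
```

Equal rewards yield a 50/50 preference, and the probability moves toward 1 as the reward gap grows, which is what makes pairwise human comparisons usable as a training signal.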