Search Results for author: Kshitij Gupta

Found 17 papers, 10 papers with code

Translate and Classify: Improving Sequence Level Classification for English-Hindi Code-Mixed Data

1 code implementation • NAACL (CALCS) 2021 • Devansh Gautam, Kshitij Gupta, Manish Shrivastava

To translate English-Hindi code-mixed data to English, we use mBART, a pre-trained multilingual sequence-to-sequence model that has shown competitive performance on various low-resource machine translation pairs and gains even on languages that were not in its pre-training corpus.

Machine Translation · Natural Language Inference · +2
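
To give a concrete flavour of the translation step, here is a minimal sketch using the public mBART-50 many-to-many checkpoint from Hugging Face transformers; the checkpoint name and language codes are standard mBART-50 identifiers, not the fine-tuned model from the paper, and treating romanized code-mixed input as Hindi is a rough approximation.

```python
# Minimal sketch: translate Hindi-English code-mixed text to English with
# mBART-50. Uses the public checkpoint, not the paper's fine-tuned weights.
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

model_name = "facebook/mbart-large-50-many-to-many-mmt"
tokenizer = MBart50TokenizerFast.from_pretrained(model_name)
model = MBartForConditionalGeneration.from_pretrained(model_name)

# Treat the code-mixed source as Hindi; this is an approximation.
tokenizer.src_lang = "hi_IN"
text = "mujhe yeh movie bahut pasand aayi, totally worth watching"
inputs = tokenizer(text, return_tensors="pt")

# Force English as the generation target.
generated = model.generate(
    **inputs, forced_bos_token_id=tokenizer.lang_code_to_id["en_XX"]
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```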

CoMeT: Towards Code-Mixed Translation Using Parallel Monolingual Sentences

1 code implementation • NAACL (CALCS) 2021 • Devansh Gautam, Prashant Kodali, Kshitij Gupta, Anmol Goel, Manish Shrivastava, Ponnurangam Kumaraguru

Code-mixed languages are very popular in multilingual societies around the world, yet the resources needed to build robust systems for them lag behind.

Machine Translation · Translation

Towards Detecting Political Bias in Hindi News Articles

no code implementations • ACL 2022 • Samyak Agrawal, Kshitij Gupta, Devansh Gautam, Radhika Mamidi

In recent times, political propaganda has been amplified by news media portals through biased reporting, creating untruthful narratives on serious issues and misinforming public opinion in order to favour a particular political party.

Bias Detection · Transfer Learning

Scaling Instructable Agents Across Many Simulated Worlds

no code implementations • 13 Mar 2024 • SIMA Team, Maria Abi Raad, Arun Ahuja, Catarina Barros, Frederic Besse, Andrew Bolt, Adrian Bolton, Bethanie Brownfield, Gavin Buttimore, Max Cant, Sarah Chakera, Stephanie C. Y. Chan, Jeff Clune, Adrian Collister, Vikki Copeman, Alex Cullum, Ishita Dasgupta, Dario de Cesare, Julia Di Trapani, Yani Donchev, Emma Dunleavy, Martin Engelcke, Ryan Faulkner, Frankie Garcia, Charles Gbadamosi, Zhitao Gong, Lucy Gonzales, Kshitij Gupta, Karol Gregor, Arne Olav Hallingstad, Tim Harley, Sam Haves, Felix Hill, Ed Hirst, Drew A. Hudson, Jony Hudson, Steph Hughes-Fitt, Danilo J. Rezende, Mimi Jasarevic, Laura Kampis, Rosemary Ke, Thomas Keck, Junkyung Kim, Oscar Knagg, Kavya Kopparapu, Andrew Lampinen, Shane Legg, Alexander Lerchner, Marjorie Limont, YuLan Liu, Maria Loks-Thompson, Joseph Marino, Kathryn Martin Cussons, Loic Matthey, Siobhan Mcloughlin, Piermaria Mendolicchio, Hamza Merzic, Anna Mitenkova, Alexandre Moufarek, Valeria Oliveira, Yanko Oliveira, Hannah Openshaw, Renke Pan, Aneesh Pappu, Alex Platonov, Ollie Purkiss, David Reichert, John Reid, Pierre Harvey Richemond, Tyson Roberts, Giles Ruscoe, Jaume Sanchez Elias, Tasha Sandars, Daniel P. Sawyer, Tim Scholtes, Guy Simmons, Daniel Slater, Hubert Soyer, Heiko Strathmann, Peter Stys, Allison C. Tam, Denis Teplyashin, Tayfun Terzi, Davide Vercelli, Bojan Vujatovic, Marcus Wainwright, Jane X. Wang, Zhengdong Wang, Daan Wierstra, Duncan Williams, Nathaniel Wong, Sarah York, Nick Young

Building embodied AI systems that can follow arbitrary language instructions in any 3D environment is a key challenge for creating general AI.

Simple and Scalable Strategies to Continually Pre-train Large Language Models

1 code implementation • 13 Mar 2024 • Adam Ibrahim, Benjamin Thérien, Kshitij Gupta, Mats L. Richter, Quentin Anthony, Timothée Lesort, Eugene Belilovsky, Irina Rish

In this work, we show that a simple and scalable combination of learning rate (LR) re-warming, LR re-decaying, and replay of previous data is sufficient to match the performance of fully re-training from scratch on all available data, as measured by the final loss and the average score on several language model (LM) evaluation benchmarks.

Continual Learning · Language Modelling
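
A minimal sketch of the replay ingredient described above, under assumptions of my own: `downstream` and `upstream` are iterators over training batches, and the 5% replay fraction is illustrative rather than the paper's exact setting.

```python
import random

def mixed_batches(downstream, upstream, replay_fraction=0.05):
    """Yield training batches, replacing a fraction of downstream batches
    with replayed upstream batches (illustrative continual-learning replay)."""
    for batch in downstream:
        if random.random() < replay_fraction:
            yield next(upstream)  # replayed batch from the previous dataset
        else:
            yield batch           # fresh batch from the new dataset
```

Combined with re-warming and re-decaying the learning rate on the new data (see the schedule sketch under the next paper), this is the whole recipe at the level of pseudocode.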

Continual Pre-Training of Large Language Models: How to (re)warm your model?

2 code implementations • 8 Aug 2023 • Kshitij Gupta, Benjamin Thérien, Adam Ibrahim, Mats L. Richter, Quentin Anthony, Eugene Belilovsky, Irina Rish, Timothée Lesort

We study the warmup phase of models pre-trained on the Pile (upstream data, 300B tokens) as we continue to pre-train on SlimPajama (downstream data, 297B tokens), following a linear warmup and cosine decay schedule.

Language Modelling
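
A sketch of a linear-warmup, cosine-decay learning-rate schedule of the kind named above; the function and its default arguments are illustrative, not the paper's training code.

```python
import math

def warmup_cosine_lr(step, total_steps, max_lr, min_lr=0.0, warmup_steps=1000):
    """Linear warmup from 0 to max_lr, then cosine decay to min_lr."""
    if step < warmup_steps:
        return max_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# "Re-warming" means running this schedule again when continual pre-training
# starts on the downstream data, rather than resuming at the old, decayed LR.
```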

ARB: Advanced Reasoning Benchmark for Large Language Models

no code implementations • 25 Jul 2023 • Tomohiro Sawada, Daniel Paleka, Alexander Havrilla, Pranav Tadepalli, Paula Vidas, Alexander Kranias, John J. Nay, Kshitij Gupta, Aran Komatsuzaki

As a subset of ARB, we introduce a challenging set of math and physics problems which require advanced symbolic reasoning and domain knowledge.

Math

Broken Neural Scaling Laws

1 code implementation • 26 Oct 2022 • Ethan Caballero, Kshitij Gupta, Irina Rish, David Krueger

Moreover, this functional form accurately models and extrapolates scaling behavior that other functional forms cannot express, such as the non-monotonic transitions seen in phenomena like double descent and the delayed, sharp inflection points seen in tasks like arithmetic.

Adversarial Robustness · Continual Learning · +8
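
For orientation, a sketch of a smoothly broken power law in the spirit of the paper (my transcription of the general form; see the paper for the exact parameterization and fitting procedure):

```python
import numpy as np

def broken_scaling_law(x, a, b, c0, breaks):
    """Smoothly broken power law: y = a + b * x^(-c0), with the power-law
    slope changing by c_i around scale d_i (sharpness f_i) at each break.
    `breaks` is a list of (c_i, d_i, f_i) tuples."""
    y = b * x ** (-c0)
    for c_i, d_i, f_i in breaks:
        y *= (1.0 + (x / d_i) ** (1.0 / f_i)) ** (-c_i * f_i)
    return a + y

# One break at x = 1e6, after which the loss falls more steeply.
x = np.logspace(3, 9, 7)
print(broken_scaling_law(x, a=0.1, b=5.0, c0=0.2, breaks=[(0.3, 1e6, 0.5)]))
```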

MALM: Mixing Augmented Language Modeling for Zero-Shot Machine Translation

no code implementations • 1 Oct 2022 • Kshitij Gupta

We empirically demonstrate the effectiveness of self-supervised pre-training and data augmentation for zero-shot multi-lingual machine translation.

Data Augmentation · Language Modelling · +3

cViL: Cross-Lingual Training of Vision-Language Models using Knowledge Distillation

1 code implementation • 7 Jun 2022 • Kshitij Gupta, Devansh Gautam, Radhika Mamidi

We propose a pipeline that utilizes English-only vision-language models to train a monolingual model for a target language.

Knowledge Distillation · Question Answering · +1
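
The pipeline details are in the paper, but the knowledge-distillation ingredient can be sketched with the standard soft-target loss (a generic formulation, not necessarily the paper's exact objective):

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Standard soft-target distillation: KL divergence between the
    temperature-softened teacher and student output distributions."""
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature**2
```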

Temporal Latent Bottleneck: Synthesis of Fast and Slow Processing Mechanisms in Sequence Learning

2 code implementations • 30 May 2022 • Aniket Didolkar, Kshitij Gupta, Anirudh Goyal, Nitesh B. Gundavarapu, Alex Lamb, Nan Rosemary Ke, Yoshua Bengio

A slow stream, recurrent in nature, aims to learn a specialized and compressed representation by forcing each chunk of $K$ time steps into a single representation that is divided into multiple vectors.

Decision Making · Inductive Bias
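
A toy reading of the chunking idea, with a crude mean-pool-plus-GRU stand-in for the paper's attention-based fast/slow interaction; all names and shapes are illustrative only.

```python
import torch

class SlowStream(torch.nn.Module):
    """Toy slow stream: compress each chunk of K time steps into M vectors."""

    def __init__(self, dim, M):
        super().__init__()
        self.M = M
        self.cell = torch.nn.GRUCell(dim, dim * M)

    def forward(self, x, K):                 # x: (batch, T, dim), T % K == 0
        B, T, D = x.shape
        chunks = x.reshape(B, T // K, K, D)  # one slow update per K steps
        state = x.new_zeros(B, D * self.M)
        for n in range(T // K):
            summary = chunks[:, n].mean(dim=1)  # crude stand-in for attention
            state = self.cell(summary, state)   # recurrent slow-stream update
        return state.view(B, self.M, D)         # M compressed vectors
```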

Volta at SemEval-2021 Task 9: Statement Verification and Evidence Finding with Tables using TAPAS and Transfer Learning

1 code implementation • SEMEVAL 2021 • Devansh Gautam, Kshitij Gupta, Manish Shrivastava

We fine-tune TAPAS (a model that extends BERT's architecture to capture tabular structure) for both subtasks, as it has shown state-of-the-art performance on various table understanding tasks.

Logical Reasoning · Transfer Learning
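
As a rough sketch of statement verification with TAPAS through Hugging Face transformers; this uses the public TabFact-finetuned checkpoint rather than the authors' SemEval fine-tune, and the example table is invented.

```python
import pandas as pd
from transformers import TapasForSequenceClassification, TapasTokenizer

name = "google/tapas-base-finetuned-tabfact"  # public checkpoint, not the paper's
tokenizer = TapasTokenizer.from_pretrained(name)
model = TapasForSequenceClassification.from_pretrained(name)

# TAPAS expects the table as a DataFrame of strings.
table = pd.DataFrame({"Model": ["TAPAS", "BERT"], "Handles tables": ["yes", "no"]})
statement = "TAPAS handles tables."

inputs = tokenizer(table=table, queries=[statement], return_tensors="pt")
logits = model(**inputs).logits
# Assumes label id 1 = entailed, as in this checkpoint's config.
print("entailed" if logits.argmax(-1).item() == 1 else "refuted")
```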
