Search Results for author: Neel Jain

Found 12 papers, 8 papers with code

Democratizing AI: Open-source Scalable LLM Training on GPU-based Supercomputers

no code implementations 12 Feb 2025 Siddharth Singh, Prajwal Singhania, Aditya Ranjan, John Kirchenbauer, Jonas Geiping, Yuxin Wen, Neel Jain, Abhimanyu Hans, Manli Shu, Aditya Tomar, Tom Goldstein, Abhinav Bhatele

Training and fine-tuning large language models (LLMs) with hundreds of billions to trillions of parameters requires tens of thousands of GPUs, and a highly scalable software stack.

Blocking Memorization

Refusal Tokens: A Simple Way to Calibrate Refusals in Large Language Models

no code implementations 9 Dec 2024 Neel Jain, Aditya Shrivastava, Chenyang Zhu, Daben Liu, Alfy Samuel, Ashwinee Panda, Anoop Kumar, Micah Goldblum, Tom Goldstein

A key component of building safe and reliable language models is enabling the models to appropriately refuse to follow certain instructions or answer certain questions.

LiveBench: A Challenging, Contamination-Free LLM Benchmark

1 code implementation 27 Jun 2024 Colin White, Samuel Dooley, Manley Roberts, Arka Pal, Ben Feuer, Siddhartha Jain, Ravid Shwartz-Ziv, Neel Jain, Khalid Saifullah, Siddartha Naidu, Chinmay Hegde, Yann LeCun, Tom Goldstein, Willie Neiswanger, Micah Goldblum

In this work, we introduce a new benchmark for LLMs designed to be immune to both test set contamination and the pitfalls of LLM judging and human crowdsourcing.

Instruction Following Math

GenQA: Generating Millions of Instructions from a Handful of Prompts

no code implementations 14 Jun 2024 Jiuhai Chen, Rifaa Qadri, Yuxin Wen, Neel Jain, John Kirchenbauer, Tianyi Zhou, Tom Goldstein

Most public instruction finetuning datasets are relatively small compared to the closed source datasets used to train industry models.

Transformers Can Do Arithmetic with the Right Embeddings

1 code implementation 27 May 2024 Sean McLeish, Arpit Bansal, Alex Stein, Neel Jain, John Kirchenbauer, Brian R. Bartoldson, Bhavya Kailkhura, Abhinav Bhatele, Jonas Geiping, Avi Schwarzschild, Tom Goldstein

The poor performance of transformers on arithmetic tasks seems to stem in large part from their inability to keep track of the exact position of each digit inside of a large span of digits.

Position

Baseline Defenses for Adversarial Attacks Against Aligned Language Models

1 code implementation 1 Sep 2023 Neel Jain, Avi Schwarzschild, Yuxin Wen, Gowthami Somepalli, John Kirchenbauer, Ping-Yeh Chiang, Micah Goldblum, Aniruddha Saha, Jonas Geiping, Tom Goldstein

We find that the weakness of existing discrete optimizers for text, combined with the relatively high costs of optimization, makes standard adaptive attacks more challenging for LLMs.

Bring Your Own Data! Self-Supervised Evaluation for Large Language Models

1 code implementation 23 Jun 2023 Neel Jain, Khalid Saifullah, Yuxin Wen, John Kirchenbauer, Manli Shu, Aniruddha Saha, Micah Goldblum, Jonas Geiping, Tom Goldstein

With the rise of Large Language Models (LLMs) and their ubiquitous deployment in diverse domains, measuring language model behavior on realistic data is imperative.

Chatbot Language Modeling

Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery

2 code implementations NeurIPS 2023 Yuxin Wen, Neel Jain, John Kirchenbauer, Micah Goldblum, Jonas Geiping, Tom Goldstein

In the text-to-image setting, the method creates hard prompts for diffusion models, allowing API users to easily generate, discover, and mix and match image concepts without prior knowledge on how to prompt the model.
