no code implementations • 14 Oct 2024 • Alex Stein, Samuel Sharpe, Doron Bergman, Senthil Kumar, Bayan Bruss, John Dickerson, Tom Goldstein, Micah Goldblum
Moreover, these approaches often assume specific use cases, for example, that we know the labels of all historical events or that we only predict a pre-specified label and not the data's features themselves.
1 code implementation • 27 May 2024 • Sean McLeish, Arpit Bansal, Alex Stein, Neel Jain, John Kirchenbauer, Brian R. Bartoldson, Bhavya Kailkhura, Abhinav Bhatele, Jonas Geiping, Avi Schwarzschild, Tom Goldstein
The poor performance of transformers on arithmetic tasks seems to stem in large part from their inability to keep track of the exact position of each digit within a long span of digits.
1 code implementation • 21 Feb 2024 • Jonas Geiping, Alex Stein, Manli Shu, Khalid Saifullah, Yuxin Wen, Tom Goldstein
It has recently been shown that adversarial attacks on large language models (LLMs) can "jailbreak" the model into making harmful statements.
1 code implementation • 28 Feb 2023 • Alex Stein, Avi Schwarzschild, Michael Curry, Tom Goldstein, John Dickerson
It has been shown that neural networks can be used to approximate optimal mechanisms while satisfying the constraints that an auction be strategyproof and individually rational.